AI-DRIVEN ADAPTIVE REPLICA PLACEMENT AND REBALANCING IN DISTRIBUTED STORAGE SYSTEMS

Authors

  • Ankit Gupta, USA

Keywords:

Artificial Intelligence, Distributed Storage Systems, Machine Learning, Performance Optimization, Replica Management

Abstract

This article presents an AI-driven framework for adaptive replica placement and rebalancing in distributed storage systems. The framework addresses the challenge of maintaining performance and reliability in dynamic storage environments through a three-component architecture. It applies machine learning techniques, including Deep Q-Networks and Proximal Policy Optimization, to workload prediction, resource optimization, and automated decision-making. The framework incorporates knowledge-driven approaches for performance prediction, placement strategies for load balancing, and reinforcement learning mechanisms for continuous system improvement. In experimental validation across several deployment scenarios, the framework reduces latency and improves resource utilization and system reliability compared to traditional approaches. Because it adapts to changing conditions while maintaining strict service level agreements, the approach has implications for distributed storage management in future cloud and edge computing environments.

Published

2025-01-22

How to Cite

Ankit Gupta. (2025). AI-DRIVEN ADAPTIVE REPLICA PLACEMENT AND REBALANCING IN DISTRIBUTED STORAGE SYSTEMS. INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING AND TECHNOLOGY, 16(01), 990-1006. https://ijcet.in/index.php/ijcet/article/view/261