AI-DRIVEN ADAPTIVE REPLICA PLACEMENT AND REBALANCING IN DISTRIBUTED STORAGE SYSTEMS
Keywords:
Artificial Intelligence, Distributed Storage Systems, Machine Learning, Performance Optimization, Replica Management
Abstract
This article presents an AI-driven framework for adaptive replica placement and rebalancing in distributed storage systems. The framework addresses the challenge of maintaining performance and reliability in dynamic storage environments through a three-component architecture. Using machine learning techniques that include Deep Q-Networks and Proximal Policy Optimization, the system performs workload prediction, resource optimization, and automated decision-making. It combines knowledge-driven approaches for performance prediction, intelligent placement strategies for load balancing, and reinforcement learning mechanisms for continuous system improvement. In experimental validation across various deployment scenarios, the framework shows significant reductions in latency and significant improvements in resource utilization and system reliability compared to traditional approaches. Its ability to adapt to changing conditions while maintaining strict service level agreements has promising implications for future cloud and edge computing environments.
License
Copyright (c) 2025 Ankit Gupta (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.