Abstract
Data is generated at high volume and speed, creating both challenges and opportunities. Cloud computing has further amplified some of these challenges, particularly in terms of storage options, availability of storage locations, and data access latency. To simplify data management and avoid the complexities of migrating data between cloud providers, most cloud users prefer subscribing to a single provider for hosting their applications and datasets. However, multicloud environments mitigate vendor lock-in risks while offering cost benefits and access to specialized services. In this paper, we propose an adaptive data placement (ADAPT) framework designed to optimize data storage costs and enhance data availability in multicloud environments. We approach the selection of optimal storage locations and data availability as classification problems. To evaluate our method, we applied four popular machine learning models and assessed their performance. Our results indicate that the XGBoost model effectively improves both data file availability and cost efficiency, the latter from 6.58% to 24.26%. We then integrated XGBoost into the ADAPT framework and present prototype results demonstrating its effectiveness.
| Original language | English |
|---|---|
| Article number | 825 |
| Journal | Cluster Computing: The Journal of Networks, Software Tools and Applications |
| Volume | 28 |
| Issue number | 13 |
| Number of pages | 17 |
| ISSN | 1386-7857 |
| DOIs | |
| Publication status | Published - Nov 2025 |
Bibliographical note
Published online: 19 September 2025.
Keywords
- Availability
- Data
- Decision tree
- Multicloud
- Machine learning
- Placement
- Random forest
- SVM
- XGBoost