Umit Akgun / Projects

KML - Machine Learning for Operating Systems

KMLib

Storage systems and their OS components are designed to accommodate a wide variety of applications and dynamic workloads. Storage components inside the OS contain various heuristic algorithms to provide high performance and adaptability for different workloads. These heuristics may be tunable via parameters, and some system calls allow users to optimize their system performance. These parameters are often predetermined based on experiments with limited applications and hardware. Thus, storage systems often run with these predetermined and possibly suboptimal values. Tuning these parameters manually is impractical: one needs an adaptive, intelligent system to handle dynamic and complex workloads. Machine learning (ML) techniques are capable of recognizing patterns, abstracting them, and making predictions on new data. ML can be a key component to optimize and adapt storage systems. In this position paper, we propose KML, an ML framework for storage systems. We implemented a prototype and demonstrated its capabilities on the well-known problem of tuning optimal readahead values. Our results show that KML has a small memory footprint, introduces negligible overhead, and yet enhances throughput by as much as 2.3x.

Predicting Network Buffer Capacity for BBR Fairness
Ibrahim Umit Akgun, Santiago Vargas, Michael Arkhangelskiy, Andrew Burford, Michael McNeill, Aruna Balasubramanian, Anshul Gandhi, and Erez Zadok
NeurIPS MLSys Workshop 2022
Paper | Bibtex

Improving Storage Systems Using Machine Learning
Ibrahim Umit Akgun, Ali Selman Aydin, Andrew Burford, Michael McNeill, Michael Arkhangelskiy, Erez Zadok
ACM Transactions on Storage (TOS) 2023
Paper | Bibtex | KML Source Code

A Machine Learning Framework to Improve Storage System Performance
Ibrahim Umit Akgun, Ali Selman Aydin, Aadil Shaikh, Lukas Velikov, Erez Zadok
Proceedings of the 13th ACM Workshop on Hot Topics in Storage and File Systems (HotStorage'21)
Paper | Bibtex | Slides

KMLib: Towards Machine Learning For Operating Systems
Ibrahim Umit Akgun, Ali Selman Aydin, Erez Zadok
On-device Intelligence Workshop - Proceedings of the 3rd MLSys Conference, Austin, TX, USA
Paper | Bibtex

KMLib: Towards Machine Learning For Operating Systems Storage Components
Ibrahim Umit Akgun, Ali Selman Aydin, Erez Zadok
18th USENIX Conference on File and Storage Technologies, USENIX FAST'20 — Poster
Poster

KML Source Code

KML News Coverage

Re-Animator

Modern applications use storage systems in complex and often surprising ways. Tracing system calls is a common approach to understanding applications’ behavior, allowing offline analysis and enabling replay in other environments. But current system-call tracing tools have drawbacks: (1) they often omit some information (such as raw data buffers) needed for full analysis; (2) they have high overheads; (3) they often use non-portable trace formats; and (4) they may not offer useful and scalable analysis and replay tools.

We have developed Re-Animator, a powerful system-call tracing tool that focuses on storage-related calls and collects maximal information, capturing complete data buffers and writing all traces in the standard DataSeries format. We also created a prototype replayer that focuses on calls related to file-system state. We evaluated our system on long-running server applications such as key-value stores and databases. Our tracer has an average overhead of only 1.8–2.3×, but the overhead can be as low as 5% for I/O-bound applications. Our replayer verifies that its actions are correct, and faithfully reproduces the logical file system state generated by the original application.

Re-Animator: Versatile High-Fidelity Storage-System Tracing and Replaying
Ibrahim Umit Akgun, Geoff Kuenning, Erez Zadok
Proceedings of the 13th ACM International System and Storage Conference (SYSTOR'20)
Paper | Bibtex | Slides | Presentation | Re-Animator LTTng

Re-Animator: Versatile High-Fidelity System-Call Tracing and Replaying
Ibrahim Umit Akgun, Erez Zadok
Technical Report FSL 19-02 Research Proficiency Exam, Stony Brook University — May 2019
Report | Bibtex | Slides | Presentation
