CAPES: Unsupervised Storage Performance Tuning Using Neural Network-Based Deep Reinforcement Learning

Appeared in Supercomputing '17.

Abstract

Parameter tuning is an important part of storage performance optimization. Current practice usually involves numerous tweak-benchmark cycles that are slow and costly. To address this issue, we developed CAPES, a model-less, deep reinforcement learning-based unsupervised parameter tuning system driven by a deep neural network (DNN). It is designed to find optimal values for the tunable parameters of computer systems, from a simple client-server setup to a large data center, where human tuning can be costly and often cannot achieve optimal performance. CAPES takes periodic measurements of a target system's state and trains a DNN that uses Q-learning to suggest changes to the system's current parameter values. CAPES is minimally intrusive and can be deployed into a production system to collect training data and suggest tuning actions during the system's daily operation. Evaluation of a prototype on a Lustre file system demonstrates an increase in I/O throughput of up to 45% at the saturation point.
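To illustrate the observe-act-learn cycle the abstract describes, the following is a minimal sketch of a Q-learning-based tuning loop, not the paper's implementation: the system hooks (read_system_metrics, apply_parameter_delta), the linear Q-function standing in for the DNN, and all hyperparameters are hypothetical placeholders chosen only to make the example self-contained.

```python
# Sketch of a CAPES-style loop: observe system state, pick a parameter
# adjustment with an epsilon-greedy policy, and update a Q-value estimator
# from the observed throughput reward. Hooks and constants are assumed.
import numpy as np

STATE_DIM = 4          # e.g. throughput, queue depth, CPU, dirty pages (assumed)
ACTIONS = [-1, 0, +1]  # decrease / keep / increase a tunable parameter
ALPHA, GAMMA, EPSILON = 0.01, 0.9, 0.1

# Linear Q-function: one weight vector per action (stand-in for the paper's DNN).
weights = np.zeros((len(ACTIONS), STATE_DIM))

def q_values(state):
    return weights @ state

def read_system_metrics():
    # Placeholder: a real deployment would poll the storage system here.
    return np.random.rand(STATE_DIM)

def apply_parameter_delta(delta):
    # Placeholder: would adjust e.g. I/O concurrency or a congestion window.
    pass

state = read_system_metrics()
for step in range(1000):
    # Epsilon-greedy selection over the candidate parameter adjustments.
    if np.random.rand() < EPSILON:
        a = np.random.randint(len(ACTIONS))
    else:
        a = int(np.argmax(q_values(state)))

    apply_parameter_delta(ACTIONS[a])
    next_state = read_system_metrics()
    reward = next_state[0] - state[0]   # throughput improvement as reward

    # Q-learning update toward the bootstrapped TD target.
    target = reward + GAMMA * np.max(q_values(next_state))
    td_error = target - q_values(state)[a]
    weights[a] += ALPHA * td_error * state
    state = next_state
```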

Publication date:
November 2017

Authors:
Yan Li
Oceane Bel
Kenneth Chang
Ethan L. Miller
Darrell D. E. Long

Projects:
Scalable High-Performance QoS
Tracing and Benchmarking
Ultra-Large Scale Storage

Available media

Full paper text: PDF

Bibtex entry

@inproceedings{li-sc17,
  author       = {Yan Li and Oceane Bel and Kenneth Chang and Ethan L. Miller and Darrell D. E. Long},
  title        = {{CAPES}: Unsupervised Storage Performance Tuning Using Neural Network-Based Deep Reinforcement Learning},
  booktitle    = {Supercomputing '17},
  month        = nov,
  year         = {2017},
}