
[5-12] Is Your Storage System Reliable?

Date: 2015-05-08

Title: Is Your Storage System Reliable? 

  

Speaker: Feng Qin 

  

Time: 12 May 2015, 10:00 a.m.

  

Venue: Room 337, Level 3, Building 5, Institute of Software, CAS

  

Abstract: 

Modern storage technologies (SSDs, NoSQL databases, commoditized RAID hardware, etc.) bring new reliability challenges to the already complicated storage stack. At the higher layers of the stack, databases provide the strongest reliability guarantees, including the atomicity, consistency, isolation, and durability (ACID) properties. However, the ACID properties are far from trivial to provide, particularly when high performance must also be achieved. At the lower layers of the stack, new components such as solid-state drives (SSDs) are often ignored or under-studied under adverse conditions.

  

In this talk, I will mainly present our recent work on exposing and diagnosing violations of the ACID properties provided by databases, and on studying the behavior of SSDs, in an ostensibly easy context: power faults. More specifically, our framework for torturing databases includes workloads to exercise the ACID guarantees, a record/replay subsystem to allow the controlled injection of simulated power faults, a ranking algorithm to prioritize where to inject faults based on our experience, and a multi-layer tracer to diagnose root causes. Additionally, our framework for testing SSDs includes specially designed hardware to inject power faults directly into devices, workloads to stress storage components, and techniques to detect various types of failures. After applying our frameworks to 8 widely used databases and 15 SSDs, respectively, the results were surprising.

  

Biography:  

Feng Qin received his Ph.D. degree from the University of Illinois at Urbana-Champaign. He joined the Department of Computer Science and Engineering at Ohio State University as an Assistant Professor in 2006 and was promoted to Associate Professor with tenure in 2013. His research interests include software reliability, operating systems, high-performance computing, and security. He is particularly interested in developing system mechanisms to improve software availability and reliability at different stages of software development. He has published papers in top systems conferences over the past decade. One of his papers received a best paper award at SOSP'05. Two of his papers were selected as IEEE Micro Top Picks, in 2004 and 2007, respectively. Three of his papers were nominated for best paper at HPCA'05, SC'07, and SC'10, respectively. He received the NSF CAREER Award in 2010 and the OSU Lumley Research Award in 2015.