Performance Analysis for High Performance Computing Systems

The disparity between the speeds of CPUs and storage systems continues to widen exponentially. Data-intensive computations may require hundreds of disks per CPU to keep modern processors fully utilized. Even workloads that were previously CPU-bound are now becoming I/O-bound. We focus on finding and fixing I/O performance problems on high performance computing systems by building a suite of tools to benchmark, trace, profile, analyze, and visualize file and storage systems.

This project is a collaboration with Klaus Mueller and Ethan L. Miller.

Journal Articles:

# Title Formats Published In Date Comments
1 Is NFSv4.1 Ready for Prime Time? PDF BibTeX ;login: The USENIX Magazine Jun 2015  
2 Don't Thrash: How to Cache Your Hash on Flash PDF BibTeX The Proceedings of the VLDB Endowment (PVLDB) Aug 2012  

Conference and Workshop Papers:

# Title Formats Published In Date Comments
1 vNFS: Maximizing NFS Performance with Compounds and Vectorized I/O PDF BibTeX 15th USENIX Conference on File and Storage Technologies (FAST 2017) Feb 2017 Nominated for best paper award
2 A Long-Term User-Centric Analysis of Deduplication Patterns PDF BibTeX 32nd IEEE Conference on Mass Storage Systems and Technologies (MSST 2016) May 2016  
3 Using Hints to Improve Inline Block-Layer Deduplication PDF BibTeX 14th USENIX Conference on File and Storage Technologies (FAST 2016) Feb 2016  
4 Newer Is Sometimes Better: An Evaluation of NFSv4.1 PDF BibTeX International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 2015) Jun 2015  
5 Dmdedup: Device-mapper Deduplication Target PDF BibTeX 2014 Ottawa Linux Symposium Jul 2014  
6 Don't Thrash: How to Cache Your Hash on Flash PDF BibTeX 38th International Conference on Very Large Data Bases (VLDB '12) Aug 2012  
7 Generating Realistic Datasets for Deduplication Analysis PS PDF BibTeX 2012 USENIX Annual Technical Conference (ATC 2012) Jun 2012  
8 Extracting Flexible, Replayable Models from Large Block Traces PS PDF BibTeX Tenth USENIX Conference on File and Storage Technologies (FAST 2012) Feb 2012  
9 Don't Thrash: How to Cache your Hash on Flash PS PDF BibTeX 3rd USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 2011) Jun 2011
10 Benchmarking File System Benchmarking: It *IS* Rocket Science PS PDF BibTeX 13th USENIX Workshop on Hot Topics in Operating Systems (HotOS XIII) May 2011
11 DARC: Dynamic Analysis of Root Causes of Latency Distributions PS PDF BibTeX International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 2008) Jun 2008 Source code and benchmark information.

Technical Reports:

# Title Formats Published In Date Comments
1 Design and Implementation of an Open-Source Deduplication Platform for Research PDF BibTeX Stony Brook U. CS TechReport FSL-15-03 Dec 2015 Ph.D. Research Proficiency Exam (RPE)
2 Linux NFSv4.1 Performance Under a Microscope PDF BibTeX Stony Brook U. CS TechReport FSL-14-02 Aug 2014  
3 A Context Aware Block Layer: The Case for Block Layer Deduplication PDF BibTeX Stony Brook U. CS TechReport FSL-12-04 May 2012 M.S. Thesis

Current Students:

# Name Program Member Since
1 Zhen Cao PhD May 2014
2 Ming Chen PhD May 2012
3 Sonam Mandal PhD Jun 2013
4 Sun (Jason) Zhen PhD Nov 2014
5 Geetika Babu Bangera MS Jan 2017
6 Tushar Jain MS Jan 2017
7 Farhaan Jalia MS Jan 2017
8 Nidhi Panpalia MS Jan 2017
9 Vinothkumar Raja MS Sep 2016
10 Arun Ramachandran MS Aug 2016
11 Kunal Shah MS Jan 2017
12 Rushabh Shah MS Jan 2017
13 Sagar Shah MS Jan 2017
14 Mukul Sharma MS Aug 2016
15 Swaminathan Sivaraman MS Jan 2017
16 Sachin Tiwari MS Aug 2016
17 Bharath Kumar Reddy Vangoor MS Aug 2015
18 Henry Nelson HS Sep 2015

Past Students:

# Name Program Period Current Location
1 Nikolai Joukov PhD Jan 2004 - Dec 2006 Research Staff Member, Storage and Data Services Research group, IBM T. J. Watson Research Center (Hawthorne, NY)
2 Vasily Tarasov PhD Jan 2008 - Nov 2013 Research Staff Member, Scale-out Storage Software, IBM Research - Almaden (San Jose, USA)
3 Avishay Traeger PhD Sep 2003 - Aug 2008 R&D, Stratoscale (Herzeliya, Israel)
4 Aashray Arora MS Sep 2014 - Dec 2015 Member of Technical Staff, Core Data Path, Nutanix (San Jose, CA)
5 Akhilesh Chaganti MS Jan 2014 - May 2015 Member of Technical Staff, Disaster Recovery Group, Nutanix (Seattle, WA)
6 Arvind Chaudhary MS Sep 2014 - Dec 2015 Member of Technical Staff, CNA group, VMware Inc. (Palo Alto, CA)
7 Abhishek Gupta MS May 2014 - Dec 2015 Member of Technical Staff, vSAN, VMware Inc. (Palo Alto, CA)
8 Deepak Jain MS Sep 2012 - Dec 2013 Member of Technical Staff, Project FVP - Engineering, Pernixdata Inc. (San Jose, USA)
9 Koundinya Santhosh Kumar MS Sep 2010 - Dec 2011 Senior Development Software Engineer, Advanced Software Development and Performance, SanDisk (Milpitas, CA)
10 Amar Mudrankit MS Jan 2011 - May 2012 Software Engineer, Advanced Development Group at Fusion-IO (San Jose, CA)
11 Vithiya Muthukumar MS Jan 2016 - Dec 2016 Software Engineer, Cisco Systems (San Jose, CA)
12 Dongju Ok MS Sep 2014 - May 2016 Software Engineer, Application Team, Commvault Systems Inc. (Tinton Falls, NJ)
13 Karthikeyani Palanisami MS May 2012 - Jun 2013 Member of Technical Staff, Project MARS - Engineering, NetApp Inc (Sunnyvale, USA)
14 Deepika Peringanji MS Jan 2016 - Dec 2016 SDE 2, VSAN team, VMware Inc. (Palo Alto, CA)
15 Venkatakrishnan Rajagopalan MS Jan 2016 - Dec 2016 Member of the Technical Staff, VMware Inc. (Palo Alto, CA)
16 Hari Prasath Raman MS Jan 2016 - Dec 2016 Software Engineer, Bloomberg (New York, NY)
17 Varun Shastry MS Sep 2014 - Dec 2015 Member of Technical Staff, Disaster Recovery Team, Nutanix Inc. (San Jose, CA)
18 Gyumin Sim MS Jan 2010 - Dec 2010 Software Engineer, Data Center Power Team, Google (Mountain View, CA)
19 Kumar Sourav MS May 2014 - Dec 2015 Member of Technical Staff, UPIT (Next gen. snapshot technology) group, VMware Inc. (Palo Alto, CA)
20 Ivan Deras Tabora MS Jan 2007 - Dec 2007 Teacher, Computer Science, Universidad Tecnologica Centroamericana (San Pedro Sula, Cortes, Honduras)
21 Vivek Tiwari MS Sep 2015 - Dec 2015 Software Engineer, LinkedIn (Sunnyvale, CA)
22 Sagar Trehan MS Sep 2012 - Dec 2013 Member of Technical Staff, CASL Performance Group - Engineering, Nimble Storage Inc (San Jose, USA)

Sponsors:

# Sponsor Amount Period Type Title
1 Microsoft Corporation $20,000 2016-2017 Sole-PI Microsoft Azure Cloud Credits
2 NSF Core Techniques and Technologies for Advancing Big Data Science & Engineering (BIGDATA) $444,267 2013-2016 Lead-PI BIGDATA: Small: DCM: Collaborative Research: An efficient, versatile, scalable, and portable storage system for scientific data containers
3 NetApp Advanced Technology Group $40,000 2011 Sole-PI Dedup Workload Modeling, Synthetic Datasets, and Scalable Benchmarking
4 NSF HECURA $760,253 2006-2009 Lead-PI File System Tracing, Replaying, Profiling, and Analysis on HEC Systems


(Last updated: Sat Mar 11 12:56:09 EST 2017)