Hi, I'm Krapi Shah!

MS in CS student at Stony Brook University
Graduating December 2019


I am a Master's student in Computer Science at Stony Brook University. I have experience with Python, Flask, Jinja2, Java, and AWS.

My interests include cloud technologies; caching and storage systems; data infrastructure, management, and analysis; and the use of machine learning and artificial intelligence to develop cost- and performance-optimized systems.

Developing technical solutions for various challenges in the financial domain gave me insight into the importance of the low-latency, fast, secure systems that our world runs on. Each financial transaction generates a lot of data that needs to be secured and processed. This gives rise to the need for efficient, optimized data storage and data analysis.

This experience inspired me to join the FSL Lab at Stony Brook University, where I am currently working on cost and performance optimization for multi-tier caching systems.



Programming Languages

C++, Python, Java

Cloud Technologies

AWS, Google Cloud

Web Technologies

Bootstrap, JavaScript, Jinja2, Flask


Git, Jira, Bitbucket


2018 - Present

MS in Computer Science

Stony Brook University, New York, USA

Courses: Machine Learning, Data Science Fundamentals (CSE 519), Operating Systems (CSE 506), Cryptography (CSE 590), Analysis of Algorithms (CSE 542), Data Visualisations (CSE 564), Human Computer Interactions (CSE 591)

2013 - 2017

B.Tech in Information Technology

Veermata Jijabai Technological Institute, Mumbai, India

Courses: Data Structures and Algorithms, Artificial Intelligence, Database Fundamentals, Web Technologies, Distributed Systems, Software Engineering, Operating Systems, Data Mining, Network Security

Work Experience

FSL Stony Brook University

Graduate Research Assistant

January 2019 - Present

Currently working on optimizing cost and performance for multi-tier caching applications.

Analysing various storage workloads across different caching layers to determine the throughput obtained and the cost incurred. Experimenting with various physical devices, including DRAM, HDDs, flash drives, and SSDs, and studying the performance obtained. Since each of these devices has different cost and performance characteristics, we are trying to derive a metric that helps identify the best possible trade-off between cost incurred and performance.

Given the amount of data that we have today and that is continuously being generated, data storage and infrastructure become essential.


Technology Intern

June 2019 - Present

Developed one-click central access to environment monitoring for DEV and NON-DEV environments across Dealerweb. This tool is part of the team's DevOps efforts to improve productivity and software delivery. Designed and developed a real-time web interface for monitoring deployed applications and for starting and stopping them.
Technology Used: Python, Flask, Gunicorn, Jinja2, Bootstrap

Benchmarked AWS data integration and data streaming services such as SNS, SQS, Kinesis, and Kafka for performance and security features. Researched pricing, access controls, and compliance policies in AWS to understand how Tradeweb can leverage cloud technologies to distribute data to clients.

Citi, Pune, India

Technology Analyst

Implemented an automated file generation framework that generates thousands of files in various formats (CSV, flat files, XML) within seconds for load testing of the payment processor application.

Performed root cause analysis to identify the problems that caused updates to the security matrix for maker-checker functionality to fail, with each failure leading to a delay of 3 weeks. Developed a program that automates the update and verifies the security matrix before generating the script for the change.
Technology Used: Java, Mockito

Technology Intern

Introduced Apache Camel as a middle layer to ease interaction between the various interaction points of payment systems, such as the client portal, ERP files, and message queues.
Technology Used: Java, Apache Camel

Academic Projects

Stackable File System to Support Backups

March 2019 - April 2019

Developed a stackable file system that backs up a file whenever more than 100 bytes of data are modified.
The file system allows configuring when to back up, what to back up, how to name backup files, and how many backups to maintain, using a simple command line.

Created ioctls to support version management: delete, view, and rename backup files.
Gained knowledge of the different components of VFS and of error handling in the kernel.

Google Analytics Customer Review

November 2018 - December 2018

Analysed Google Analytics data for the Google Store to predict the revenue generated by each user, using a Random Forest Regressor and XGBoost.
Predicted which users had a higher probability of generating revenue.
Determined the impact of user location, number of visits, and page hit/page view ratio on the probability of a user generating revenue in a particular session, and hence predicted the most probable buyers.

Do Popular Songs Endure?

November 2018 - December 2018

Used the Spotify and Last.fm APIs to get the current popularity index and play count for Billboard's yearly top 100 songs.
Derived AUC as the metric for understanding a song's performance, using data from Billboard weekly hits.
Developed a prediction model to determine each song's popularity index and popularity curve over the years since its release, and performed sniff tests to confirm the performance of the model.


Solving Rubik’s Cube Using Graph Theory

June 2016 - May 2017

Developed a Rubik’s Cube solver using a bidirectional search algorithm in C++. Demonstrated the performance of traditional search algorithms such as breadth-first search, depth-first search, and bidirectional search, and proposed a new approach that finds the solution by integrating them.
Applied efficient memory management techniques, together with the cube's symmetry and antisymmetry properties, to efficiently traverse the possible search space of 43 quintillion states.
Published findings as: “Solving rubik’s cube using graph theory,” in Computational Intelligence: Theories, Applications and Future Directions - Volume I. 2019, pp. 301–317.


  • Teaching Assistant for Undergraduate Machine Learning at Stony Brook (CSE 353)
  • Member of Event Planning Committee at Citi
  • Sponsorship Manager for Technovanza, the techno-managerial event of VJTI
  • Head for Rubik's Cube Mumbai Open. Spearheaded a team of 40 to organize various Rubik's Cube events throughout the year

