SHAURYA GULATI
412-***-**** *************@***.*** LinkedIn/shauryagulati
EDUCATION
Carnegie Mellon University Pittsburgh, Pennsylvania August 2024 - December 2025
Master's in Information Systems Management (Business Intelligence and Data Analytics) CGPA: 3.49/4.0
● Coursework: Computational Data Science, Machine Learning, Operationalizing AI
● Data Analyst at Carnegie Mellon Libraries for the Research Information Management System Project
Chandigarh University Mohali, India July 2018 - June 2022
Bachelor of Engineering in Computer Science and Engineering CGPA: 3.4/4.0
Specialization in Artificial Intelligence and Machine Learning
SKILLS
Languages & Tools: Python, SQL, Git, Bash, Jupyter, VSCode, Cursor
Libraries & Frameworks: PyTorch, scikit-learn, Pandas, NumPy, OpenCV, Transformers, Matplotlib, Seaborn, NLTK, BERT
Data & ML Engineering: MLflow, Evidently, Azure Services, Docker, MongoDB, Tableau
ML Techniques: Feature Engineering, Entity Resolution, A/B Testing, Data Visualization, Model Evaluation, Model Monitoring
WORK EXPERIENCE
YMGrad June 2022 - July 2024
Software Development Engineer
● Engineered a robust API in Express and Node.js, enabling seamless data retrieval from a MySQL database, resulting in a 25% faster front-end loading time and enhanced user experience
● Utilized techniques such as indexing optimization, query refactoring, and restructuring joins for improved performance, leading to a 30% reduction in data fetching time
● Conducted data analysis on user queries and platform usage trends to support internal operations teams in optimizing customer support workflows and content strategy
The Sparks Foundation January 2021 - March 2021
Data and Business Analyst Intern
● Optimized a machine learning-based credit card fraud detection model using Python and SQL, analyzing fraud patterns to identify risks; led a 3-member team to automate detection with Python scripts, increasing detection speed by 30%
● Designed and implemented fraud prevention measures using anomaly detection, reducing fraud vulnerability by 20%
PROJECTS
BERT-based Question Answering System
● Fine-tuned a BERT-base model on the SQuAD v1.1 dataset to build a context-aware question answering system, achieving a 60% F1 score on the evaluation set
● Designed a modular pipeline for preprocessing, training, and inference using Hugging Face Transformers and PyTorch, enabling reproducible experimentation
● Deployed a fine-tuned BERT question answering model as a REST API using Azure ML and Azure Container Instances (ACI), enabling real-time inference
UCI Air Quality Prediction System
● Designed and implemented a simulated real-time air quality prediction pipeline using the UCI Air Quality dataset, forecasting pollutant concentrations from sensor readings
● Integrated Kafka for streaming data ingestion and built a robust training workflow with MLflow for experiment tracking and reproducibility; trained models, including XGBoost, reached 91% prediction accuracy
● Containerized the prediction service using Docker, and deployed an API for serving real-time predictions
● Developed a monitoring dashboard with Evidently AI to track data drift and model performance in production
Mini RAG System for YouTube Summarization
● Developed a lightweight Retrieval-Augmented Generation (RAG) pipeline to summarize YouTube videos using transcript data and Google Gemini Pro
● Leveraged YouTube Transcript API for document retrieval and applied prompt-based summarization to simulate the RAG workflow with external data
● Built an interactive Streamlit frontend with transcript extraction, thumbnail preview, and summary generation in real time
PUBLICATIONS
● "Enhancing Sentiment Analysis in Short Texts with POS-Embedded LSTM Models," IATMSI 2024
● "Paint/Writing Application through Webcam using MediaPipe and OpenCV," ICIPTM 2022