SHREY SHAH
213-***-**** | ad1fa3@r.postjobfree.com | LinkedIn: Shrey Shah | GitHub: Shrey7801
Education
University of Southern California, Los Angeles, CA August 2023 – May 2025
Master of Science in Computer Science
Coursework: Analysis of Algorithms, Database Systems
Pandit Deendayal Energy University, Gandhinagar, India August 2018 – June 2022
B.Tech in Information and Communication Technology GPA: 3.78/4
Coursework: Machine Learning, Artificial Intelligence, Data Mining, Big Data Analytics
Skills
• Languages: Python, C, HTML, CSS, JavaScript, Git, R, SQL, DAX
• Tools and Technologies: VS Code, Power BI, Tableau, RStudio, ChatGPT, Hugging Face, Office 365 (Word, Excel, PowerPoint), GitHub
• Cloud, Systems: Amazon Web Services, Linux, Ubuntu, Windows
• Libraries and Frameworks: Pandas, NumPy, Dask, Matplotlib, Scikit-learn, TensorFlow, Flask, FastAPI, LangChain, PyTorch, OpenAI, Keras, Apache Airflow, Apache Kafka
• Technical Skills: Machine Learning, AI, NLP, Large Language Models (LLMs), Data Analysis and Visualization, ETL Pipelines
• Certifications: IBM Data Science Professional Certificate, Google Data Analytics
Experience
Data Scientist - Cilans Systems, Ahmedabad, India January 2022 – July 2023
Mortality Prediction
• Conducted in-depth data analysis on the eICU Collaborative Database using machine learning techniques.
• Cleaned, merged, and aggregated data; imputed null values using column means and patient-specific averages; and used AWS Athena and Glue Crawler to streamline data processing, reducing processing time by 50%.
• Employed Random Forest and XGBoost algorithms for mortality prediction, achieving precision over 80%.
Resume Parsing, Job Description Parsing and Resume Matching with Job Description using LLMs
• Parsed information from resumes and job descriptions for a client-provided list of fields (Personal Information, Education, Employment, etc.) and presented the output in JSON format.
• Implemented a scoring system using LLMs (gpt-3.5-turbo) and the LangChain framework, combined with regular-expression matching, to evaluate resumes against parsed job descriptions.
KnitSmart - Fabric Fault Detection
• Contributed as part of a team to a machine learning implementation that detects anomalies in fabric during manufacturing.
• Fine-tuned an Xception model on the training dataset and recorded results for 3-class classification.
• Prioritized resources with the business unit for model implementation on Jetson Nano.
Business Insight 360 Dashboard (POC)
• Constructed a 6-page interactive Power BI dashboard integrating finance, sales, supply chain, and executive views for comprehensive business understanding.
• Used DAX measures to transform complex data into clear, interactive visuals that support quick trend identification and informed decision-making. [Dashboard link]
Data Science Intern - BISAG, Gandhinagar, India June – July 2021
Twitter Sentiment Analysis
• Designed and built a Flask API using the Tweepy library and integrated it with a web page built with HTML/CSS.
• Classified tweet sentiment (positive, negative, neutral) using the Flask API and NLP, and displayed results on a web page with tweet details and languages.
Data Science Intern - The Sparks Foundation, India March – April 2021
Supervised ML
• Cleaned student study-hours data, handled null values, and fit a Linear Regression model to predict percentage scores with 95% accuracy.
Projects & Publications
Deep Neural Model For Automated Sleep Staging System using Single-Channel EEG Signal January – May 2022
• Developed a CNN model to extract time-invariant features and an LSTM to learn stage transition rules for sleep stages on PhysioNet’s Sleep-EDF dataset.
• Evaluated the model with 5-fold cross-validation and achieved 91% accuracy.
UAV Assisted Communication using Machine Learning August – December 2021
• Built a dataset using a frame-extraction approach and applied a CNN model for suspicious object detection.
• Achieved 82% accuracy and created a communication channel using MATLAB.
E-Mail Spam Classifier March – May 2021
• Pre-processed the dataset by tokenizing text, lemmatizing, and removing stop words.
• Applied ML models, achieving 88% accuracy with Naive Bayes and 95% with LSTM.