RAKSHIT GROVER
213-***-**** ********@***.*** LINKEDIN
EDUCATION
UNIVERSITY OF SOUTHERN CALIFORNIA, LOS ANGELES - MS APPLIED DATA SCIENCE JANUARY 2023 - DECEMBER 2024 CURRENT GPA-3.9/4.0
SRM INSTITUTE OF SCIENCE AND TECHNOLOGY (DELHI-NCR CAMPUS, GHAZIABAD) - B.TECH (ELECTRONICS AND COMMUNICATION ENGINEERING) JUNE 2018 - MAY 2022 CGPA-9.1/10
EXPERIENCE
USC VITERBI SCHOOL OF ENGINEERING - TEACHING ASSISTANT - DSCI-551 FOUNDATIONS OF DATA MANAGEMENT, ITP-249 INTRODUCTION TO DATA ANALYTICS AUGUST 2023 - MAY 2024
● Facilitated weekly office hours, efficiently addressing 50+ student inquiries across various topics.
● Reviewed and assessed 50+ projects, exams, assignments, and labs to enhance student comprehension and performance in Python, SQL, NoSQL databases, AWS, Tableau, and related subjects.
● Implemented tailored solutions for exams, assignments, and labs, aligning with the professor's teaching objectives.
KPMG INDIA - DATA SCIENCE INTERN AUGUST 2022 - NOVEMBER 2022
● Developed dynamic web scrapers using BeautifulSoup and Selenium to speed up data extraction while maintaining accuracy.
● Achieved 90% accuracy in predicting NHAI project delays using classification models, supporting more efficient project management.
● Utilized regression models to assess the magnitude of delays, ensuring comprehensive project planning and execution.
● Derived actionable insights from IOCL data using Pandas, NumPy, Matplotlib, and Seaborn to support informed decision-making.
BHARATPE, INDIA - DATA SCIENCE INTERN JUNE 2021 - AUGUST 2021
● Gained first-hand insight into the role of data scientists in driving organizational success through data-driven strategies.
● Built proficiency in Python, SQL, and database management systems while working with large datasets.
● Worked in a cross-functional team to analyze, interpret, and clean data from diverse sources, applying machine learning algorithms to support decision-making.
PROJECTS
Developed an Advanced Virtual Teaching Assistant Platform for the University of Southern California
● Developed a web-based system, mainly using Python libraries such as LangChain, OpenAI, Streamlit, SpeechRecognition, PIL, shutil, base64, and pickle, for automated data scraping, context matching, and LLM-based answers that provide real-time responses to student inquiries.
● Enhanced user interaction by incorporating images and videos into answers, including snapshots with timestamps, links to YouTube videos, and images from lecture notes.
● Regularly updated the database and algorithms, integrating new data and refining embedding techniques to ensure ongoing performance enhancements.
Multi-Output Landmark and Category Classification using VGG16 and Transfer Learning
● Led a team project on architectural style and landmark image classification with a challenging dataset of 6 categories and 30 landmarks.
● Achieved 96% accuracy in category classification and 87% in landmark classification using transfer learning with the VGG16 model.
● Enhanced model robustness with strategic data augmentation techniques.
NLP Abstract Sentence Classification Project: SkimLit
● Engineered SkimLit, a Natural Language Processing model enabling categorization of medical abstract sentences into roles like objective, methods, and results, expediting literature review for researchers.
● Utilized the PubMed RCT20k dataset to train and evaluate SkimLit, achieving 90% precision in sentence categorization.
● Demonstrated mastery in NLP methodologies and model architecture, showcasing expertise in advanced techniques applied to medical literature analysis.
SKILLS
● Skills: Machine Learning (Supervised and Unsupervised), Regression, Clustering, Classification, Dimensionality Reduction, A/B Testing, Data Visualization, Data Manipulation, Cross Validation, Large Language Models (LLM), Natural Language Processing, Computer Vision, Time Series Analysis
● Tools: Python (Pandas, Scikit-learn, NumPy, Matplotlib, Regex, XGBoost, BeautifulSoup, Selenium, TensorFlow, LangChain, spaCy, OpenAI, Streamlit), R (dplyr, ggplot2, caret, gmodels, randomForest, olsrr), Jupyter Notebooks, SQL (MySQL), NoSQL (MongoDB, Google Firebase, DynamoDB), PySpark, Tableau, Apache Hadoop, Microsoft Excel
CERTIFICATIONS
● AI Foundations (IBM), Deep Learning Specialization (Coursera), Machine Learning (Udemy), Data Structures & Algorithms (Python), Pandas Bootcamp (Udemy)