Nanda Kishore Vuppili
Willing to Relocate *****************@*****.*** GitHub LinkedIn Portfolio
Education
Stony Brook University (State University of New York) Stony Brook, NY
M.S. in Data Science GPA: 3.89/4.0 January 2024 - December 2025
Coursework: Statistical Computing, Big Data Systems, Data Analysis, Data Structures & Algorithms, Probability
GITAM University Visakhapatnam, IND
B.Tech. in Computer Science GPA: 8.86/10.0 June 2019 - April 2023
Skills
Languages: Python, SQL, Java, R, C, HTML, CSS
Databases and Tools: Excel, Git, GitHub, Google Colab, Visual Studio, Docker, Snowflake, MySQL, PostgreSQL, Redshift, Oracle, MongoDB, NoSQL, PySpark, Apache Airflow, Apache Spark, Hadoop, Hive, dbt, Kafka
Cloud Platforms: Amazon AWS (S3, EC2, Lambda), Azure (Data Lake, Synapse), Google Cloud (BigQuery, Dataflow)
Machine Learning: Matplotlib, Scikit-learn, TensorFlow, PyTorch, Keras, XGBoost, NLP (BERT), Predictive modeling
General: ETL, Pipeline Optimization, Data Visualization, Data Warehousing, Statistics, Data Modeling, Predictive Analysis, A/B Testing, Agile Methodologies, APIs, KPI, Problem-Solving
Experience
GITAM University Visakhapatnam, IND
Research Scholar November 2022 - April 2023
●Developed and deployed a deep learning-based sentiment analysis system for YouTube videos, achieving 80% accuracy in emotion detection by integrating CNN for facial recognition and VADER for comment analysis.
●Implemented an interactive UI enabling users to upload videos or share YouTube links for real-time analysis.
●Optimized data storage and retrieval by designing a hashing-based map table, cutting load time by 30% and reducing redundant processing for faster, more reliable analysis.
●Published the research findings in "Shodhasamhita: Journal of Fundamental and Comparative Research".
Phoenix Global Trade Solutions Visakhapatnam, IND
Full Stack Engineer Intern May 2022 – August 2022
●Data Analyst Contribution: Engineered an e-commerce dashboard using JavaScript and Tableau, boosting user engagement by 20% through real-time sales tracking.
●Pipeline Optimization: Streamlined remote shopping features using Agile methodologies, enhancing product accessibility, increasing online revenue by 40%, and boosting customer retention by 25%.
Projects
R Stats: Advanced Modeling and Variable Selection
●Created an R package for advanced statistical modeling, implementing linear, logistic, ridge, LASSO, and elastic net regression, supporting datasets with up to 10,000 predictors.
●Boosted accuracy by 15% via ensemble learning with bagging (bootstrap aggregating) across 100 datasets.
●Improved variable selection efficiency by 30% by pre-screening the top 50 predictors.
Real-Time Uber Demand Analytics
●Built an automated ETL pipeline on GCP, eliminating manual effort through monthly scheduled processing.
●Achieved 85% trip duration accuracy and 0.012 MSE in demand forecasting using BERT and LightGBM.
●Developed Looker Studio dashboards for real-time insights on demand, trip duration, and revenue trends.
Tokyo Olympics Analysis
●Streamlined data for 11,000+ athletes across 743 teams using Azure Data Factory and Databricks, optimizing storage and transformation.
●Automated data pipelines for enriched datasets in Azure Data Lake, enabling advanced analytics with Synapse.
●Created Power BI dashboards to visualize key Olympic metrics, including the US winning 4.7% of all medals.
Certifications
●British Airways Data Science Job Simulation on Forage; IBM Data Analysis with Python