RITUM SINGH
716-***-**** ********@*******.*** linkedin.com/in/ritum-singh-0947b6112 github.com/ritumsingh23
EDUCATION
Master of Science: Data Science (CGPA: 3.852), University at Buffalo, The State University of New York, February 2023
Bachelor of Technology: Electrical and Electronics Engineering, SRM University, India, May 2017
SKILLS & TOOLS
Languages: Python, R, MATLAB, C
Data Management & Analytics: PostgreSQL, SQLite, MySQL, AWS Redshift, AWS Athena
Tools: Jupyter Notebook, Visual Studio, PyCharm, Tableau, AWS EC2, AWS S3, AWS Lambda, Microsoft Excel
Machine Learning: Supervised and Unsupervised Learning, Regression (Linear, Logistic, Ridge, Lasso), Decision Tree, Random Forest, Neural Networks, Convolutional Neural Network (CNN), Support Vector Machine (SVM), Bagging, XGBoost, K-Nearest Neighbors, Naïve Bayes, Principal Component Analysis (PCA), A/B Testing
WORK EXPERIENCE
1. Data Science Intern, AI-Camp Incorporated, Palo Alto, California, USA: May 31, 2022 – August 5, 2022
• Trained 19 summer campers in basic Python, machine learning, and Flask web development, delivering 3 end-to-end data science projects in 9 weeks
• Delivered a machine learning web application for “Stroke Prediction”, achieving 97% accuracy with the AdaBoost model
2. Programmer Analyst, Cognizant Technology Solutions, Kolkata, WB, India: February 2018 – July 2021
• Maintained an Enterprise Resource Planning (ERP) system with MS SQL on Oracle Database for a US-based client
• Conducted Patch Impact Analysis (PIA) on the Oracle database, and managed Incidents and Requests as an L1.5 team member
• Automated 60% of manual overhead within 6 months in collaboration with the Automation Anywhere and ServiceNow Orchestrator teams
• Implemented Tableau-based reporting dashboard for weekly review meetings (WSR)
• Introduced triggers to automate Global Business Location (GBL) ingestion into the database upon creation
PROJECTS
1. YouTube viewer data analytics dashboard: Python, PySpark, AWS S3, AWS Lambda, AWS Glue, AWS Athena, AWS QuickSight
• Built an ETL pipeline that turns raw data from the Kaggle dataset “Trending YouTube Video Statistics” into an analytical database
• Designed a BI dashboard in AWS QuickSight with 4 key visualizations
2. Prediction of O2 requirement on expeditions: Python, pandas, NumPy, scikit-learn, Plotly, Bootstrap, Flask, Git, TensorFlow
• Analyzed the dataset with 5 Plotly Express visualizations covering success rates, popularity, international participation, demographic-based O2 requirements, and elevation levels
• Evaluated 4 models on accuracy, F1-score, and recall; Logistic Regression performed best at 90.5%, 82.9%, and 18%, respectively
• Deployed a Bootstrap website on a Flask server to tell the data story and enable predictions for future expeditions
3. Online streaming platform database normalization to 4NF: Python, SQLite3, pandas, PostgreSQL, ETL, Flask
• Developed a database in fourth normal form (4NF) from merged online streaming platform data from Amazon Prime, Netflix, Hulu, and Disney Plus
• Enabled search, insert, and update operations on the database from a React-based front end via a Flask API, using stored procedures on PostgreSQL
4. Fraud Detection in Ethereum Transactions: R, SMOTE, ggplot2
• Trained a machine learning model in R to predict fraudulent Ethereum transactions using ensemble techniques
• Achieved 98% accuracy and the best generalizability with a boosting technique, assessed on sensitivity, specificity, and accuracy
5. Analysis of Wildfires in California: Python, SQLite3, pandas, Folium, Matplotlib
• Normalized a non-normalized dataset into a relational database using pandas and SQLite
• Created an interactive dashboard in Jupyter Notebook, using Matplotlib and Folium, to visualize the effects of California wildfires from 2013 to 2019 by number of incidents, annual episode frequency distribution, acres burnt, and fatalities