Siddharth Singh
551-***-**** *********.*********@*****.*** 74 Bowers Street, Jersey City, NJ 07307
EDUCATION
Stevens Institute of Technology May 2021
M.S., Engineering Management GPA 3.5/4.0
Don Bosco Institute of Technology, Mumbai June 2018
B.E., Electronics and Telecommunication GPA 3.5/4.0
TECHNICAL SKILLS
Tools and Technologies: Tableau, Power BI
Programming: Python, R, SQL
Databases and Big Data Analytics: MS SQL Server, MySQL
Key Skills: Machine Learning, Statistics, Descriptive and Predictive Analytics, Data Visualization, Business Intelligence
Predictive/Statistical Modeling: Linear Regression, Logistic Regression, Time Series
EXPERIENCE
Intelligent Experts Multitrading Pvt. Ltd. Mumbai, India
Junior Data Analyst Sept 2018 – May 2019
• Extracted data and performed data cleaning and structuring using data mining techniques (SQL queries and Python).
• Developed clear visualizations and data models in Tableau for documentation and for delivering real-time feedback.
• Managed databases for 7 textile locations across India.
• Reported on and visualized the entire system using data provided in Excel, and identified trends in Tableau.
• Retrieved data from the data warehouse and used it to create insightful reports and analyses.
ACADEMIC PROJECTS
Truck Fleet Data Analysis (Big Data Analytics) Oct 2020 – Nov 2020
• Processed truck fleet data stored in the Hadoop Distributed File System (HDFS) using Hive and provided analytical reporting in Tableau to understand trucking movement across the United States.
• Developed dashboards and data reports in Tableau to provide analytical support and reduce risk in accident-prone areas.
Twitter Sentiment Classification for Airline Tweets Apr 2020 – May 2020
• Pre-processed airline tweet data in Python using NLP-based tokenization, lemmatization, and vectorization.
• Built Naïve Bayes and XGBoost models to determine the sentiment polarity of the dataset.
Fake News Detection in Python Mar 2020 – Apr 2020
• Using sklearn, applied a TfidfVectorizer to the dataset and fit a PassiveAggressiveClassifier.
• Evaluated the model with a confusion matrix; accuracy was 92.82%.
Relational Database Management System for Bike Company Jan 2020 – Mar 2020
• Built a relational database management system for a bike company using MS Access and SQL, replacing its traditional bookkeeping system.
Housing Sale Price Prediction using Advanced Regression Technique Nov 2019 – Dec 2019
• Built a machine learning regression model using feature engineering and advanced regression techniques, mainly Random Forest.
• Reduced the log RMSE from 15.3% to 12%.
Prediction of Vehicle Loan Default Nov 2019 – Dec 2019
• Sampled the imbalanced data, successfully dealt with outliers, and handled 8% missing values in the data.
• Tuned hyperparameters to obtain a model that predicted loan defaults accurately 75% of the time.