**** – 2019 Master of Science, The University of Texas at Arlington
Concentration: Data mining and business analytics
2019 IBM certified Data Science Professional
2019 Machine learning (Andrew Ng) – Coursera
2019 MATLAB Deep learning certification, MathWorks
2017 Numerical Python certification, CETPA Info tech
EXPERIENCE:
Data Analyst, Desein Private Limited, 05/2016 – 11/2017
• Acquired KPI data for a direct-to-customer warehouse.
• Analyzed the data following the CRISP-DM approach, applying statistical techniques and data visualization to produce a report of findings that improved efficiency.
• Reduced the number of full-time employees from 45 to 36.
• Reduced the material handling equipment by 25%.
• Improved space utilization from 42% to 72% in the forward picking area and to 96% in the direct-to-customer area.
• Saved 193,000 USD in the form of reduced labor, equipment, utilization and turnover times.
Operation Assistant, The University of Texas at Arlington, Part-time, 06/2018 – 12/2019
• Awarded “Rising star leadership.”
• Applied leadership and organizational skills to supervise a crew of 10 student workers completing setups for the 8,000 events hosted per year.
• Maintained and updated, as part of a team of 5, inventory spreadsheets of the union’s equipment across 242,000 square feet of building space.
• Wrote detailed daily task reports using Event Management Software (EMS) to ensure accurate billing of up to five figures per event.
• Scheduled staff, training, and screening interviews for 50+ students to assist the professional staff.
Programming: R, SAS, Python (Pandas, NumPy, SciPy, Scikit-learn, matplotlib), Java.
Tools: MATLAB, Octave, Tableau, Excel, Power BI, Visio.
Database: SQL, Microsoft SQL Server.
Techniques: Regression, classification, resampling methods, model selection and regularization, tree-based methods, support vector machines, maximal margin classifiers, principal component analysis, hierarchical clustering.
Spot detection in X-ray medical images, Microsoft, Thales AI Challenge4Health
• Trained a deep convolutional neural network with 5 convolutional layers in MATLAB on 209,933 training and 144,994 testing X-ray images. Classified the images into one of three categories with an accuracy of 98.89%.
Aerial cactus identification, Kaggle competition
• Developed a convolutional neural network using the Keras and Tidyverse libraries in R, trained on 15,750 samples and validated on 1,750 samples over 100 epochs. Identified images of columnar cacti with an accuracy of 99.83% and a loss of 0.45%.
PROJECTS:
TMDB box office prediction challenge, Kaggle (currently working)
• Collaborating in R on a dataset obtained from The Movie Database, using 7,398 training samples to predict the worldwide revenue of 4,398 movies.
NYC squirrel type prediction in Central Park, R and R Shiny
• Predicted squirrel types and their probable locations in NYC's Central Park using general linear models. Built an R Shiny visualization for various variables in the dataset.
Horror movie rating prediction, R
• Visualized the data and used lasso regression to predict the effect of country, language, words, and cast on the IMDb ratings of over 3,000 movies.
Profit prediction for a restaurant franchise, Octave/MATLAB
• Predicted the profits of a restaurant franchise in MATLAB using multivariate linear regression. Visualized the data and implemented the cost function and gradient descent algorithm on profit and population data for different cities.
Quality assurance prediction test, MATLAB
• Developed a regularized logistic regression model to determine, from the results of two different tests, whether manufactured microchips should be accepted or rejected.
Spam classifier with SVM, MATLAB
• Implemented support vector machines with a Gaussian kernel on 4,000 training and 1,000 testing examples to perform non-linear classification of emails. Trained a spam classifier that labels emails as spam or non-spam based on a vocabulary list, with a training accuracy of 99.8% and a testing accuracy of 98.5%.
Image compression with K-means, MATLAB
• Implemented K-means to reduce a 1000-color image to a 16-color image. Trained the algorithm to select the 16 colors used to represent the compressed image in RGB space. Modeled the inner loop to alternate between finding the closest centroids and computing centroid means.
Handwritten digit recognition, MATLAB
• Implemented vectorized logistic regression, feedforward neural networks, and the backpropagation algorithm to recognize handwritten digits on 5,000 training examples in MATLAB. Used a one-vs-all logistic regression model to train a multi-class classifier. Achieved a training set accuracy of 94.9% for the logistic regression model, 97.5% for the feedforward NN model, and 95.3% for the backpropagation algorithm.
Scraping net worth of billionaires from Bloomberg, Python
• Developed a web scraper to collect the data, processed it, and visualized billionaires' net worth standings for 2 years.
ACHIEVEMENTS & EXTRACURRICULAR:
• Co-Founder and Mentor, SRM SCRO (2014)
• Gold Medal for Innovative Idea and overall performance, awarded by Mayur Ramgir, President & CEO of Zonopact InfoTech. (2017)
• Held “Domain organizer” position at Aarush national level tech. fest. (2 years)
• Achieved Merit in “Personal Flying Machine” presentation at Aarush 2013 Tech fest.