DIVEN KUMAR SAMBHWANI, MS
Languages: Python, R, SQL.
Technology and Skills: Statistical Analysis, Machine Learning (Regression, Classification, Clustering, Dimensionality Reduction, Bagging, Boosting), Deep Learning, Recommender Systems (Implicit, Collaborative Filtering), NLP, MongoDB, Apache Spark, Databricks, Amazon EC2, S3, SageMaker, Redshift, Tableau, GCP.
EXPERIENCE
Data Scientist — AISC, Toronto, Canada Sep 2019 - Present
• Built a restaurant recommender system using collaborative filtering.
• Developed a review-rating prediction model and deployed it on Kubernetes on GCP. Used transfer learning: a Wiki103 pre-trained model was used to train a language model, which in turn helped train the classifier.
• Developing a hybrid recommender system that combines multiple algorithms to recommend which article a user should read next.
• Collaborating with a Senior Data Scientist at RBC.
Machine Learning Engineer Intern — Saint Mary’s University, Halifax, Canada Apr 2019 – Aug 2019
• Collected Tweets, Google Trends, meteorological, and weather data using Python scripts that called REST APIs to automate collection and transformation every hour.
• Performed opinion mining on Tweets to extract real-time asthma-related information and gauge user sentiment.
• Used Lasso regression to identify the variables relevant to predicting asthma search intensity for each province.
Data Scientist — International Business Analytics Challenge, Montreal, Canada Dec 2018 – Mar 2019
• Created a recommender system in Python that used Apriori, AHP, association mining, and unsupervised machine learning to recommend jobs in which users are likely to succeed.
• Recommended jobs based on users' skills, background, and location, improving the platform's rate of matching job opportunities and sourcing ideal candidates.
Data Research Analyst — Saint Mary’s University, Halifax, Canada Oct 2018 – Aug 2019
• Used the Levenshtein algorithm to make fuzzy matching of mutual fund names more efficient.
• Noticed that manually downloading financial reports was costing 20 hours per week; independently researched and implemented an automated process using Selenium (Python), saving 85% of the manual work.
• Focused on deep-dive analysis projects to understand customer behavior, uncover patterns, and find valuable insights.
• The detailed analysis covered customer feedback, prospect scoring, service retention, and basket analysis. Created ad hoc reports as needed and presented them to clients.
EDUCATION
M.Sc. - Computing and Data Analytics
Saint Mary’s University, Canada — 2018 – 2019
M.Sc. - Information Technology
DAIICT, India — 2016 – 2018
AWARDS AND ACHIEVEMENTS
5th prize in International Business Analytics Challenge 2019
2nd prize in Sun Life Data Analytics Hackathon Challenge 2019
President of the Cultural Committee 2017
1st prize for a final project in the bachelor’s program 2016
DATA SCIENCE PROJECTS
Bulldozers Auction Price Prediction
Developed a price prediction model using Random Forest. Achieved a better score than the top scorer on the Kaggle leaderboard through hyperparameter tuning, feature generation, and removal of redundant features with the help of feature importance and cluster analysis.
Image Restoration using GAN
Developed an image restoration algorithm by first creating a function to degrade an image and then training a GAN in which the generator was a U-Net with a ResNet architecture and the critic was a classifier that labels an image as original or generated. The project is still in progress; currently working to implement a technique described in the paper “Perceptual Losses for Real-Time Style Transfer and Super-Resolution”.
Sentiment Analysis on Streaming Data using PySpark
Developed a model to detect hate speech (racist or sexist) in Tweets on streaming data by building a pipeline for the ML model and then setting up a Spark streaming context with a batch duration of 3 seconds, so that a user can send text and receive a prediction from the model.