Accomplished data scientist with industry experience in big data infrastructure. Proficient in scripting languages such as Python, R, and SQL, backed by 7+ years of experience in quantitative analysis and statistics. Strong experience applying machine learning algorithms to a wide range of real industrial problems, involving both structured and unstructured data, to extract business intelligence. Skilled at building reports and dashboards with industry-standard software.


• Advanced skills in updating, cleaning, and organizing raw data for exploratory data analysis using various statistical models.

• Advanced programming knowledge in Python, R, Scala, SQL (Hive, Pig, Presto).

• Created presentations, reports, and dashboards using Tableau, Matplotlib, Mode Analytics, Microsoft Power BI, and Microsoft PowerPoint.

• Developed machine learning models for regression, classification, and clustering on real-world industrial datasets, applying variable selection, feature engineering, principal component analysis (PCA), hyperparameter tuning, regularization, and cross-validation.

• Worked on Natural Language Processing (NLP) with data extracted from social media.

• Advanced knowledge of ML libraries such as scikit-learn, TensorFlow, Keras, and PyTorch.

• Worked with Git and GitHub. 7+ years of analytical experience building models, writing code, and creating visualizations using scientific and mathematical software such as Mathematica.

• Worked with time series data to predict stock market and cryptocurrency trends using ARIMA and RNN models.

• Worked on a data pipeline for data extraction (from GCP), transformation, and loading (ETL).

• Knowledge of the Kimball model of data warehousing, including dimension and fact tables and different schemas.

• Advanced knowledge of Microsoft Office tools, particularly Word, Excel, Access, and PowerPoint.

• Exposure to the AWS and Databricks cloud platforms.


• Data Scientist Intern at Shopify, Waterloo, Sep–Dec 2018

(1) Worked with the Hadoop big data infrastructure and its various abstraction layers.

(2) Worked on developing a data pipeline using Spark and Scala and deploying the code to production.

(3) Used complex SQL queries in Hive and Presto.

(4) Prepared reports and dashboards in Mode Analytics for different service and product lines, presenting important business intelligence.


(5) In-depth understanding of the data platform, starting from data acquisition through modelling, distribution, discovery, analysis, and experiments.

• Worked with Wondeur AI, Toronto on their event dataset for the capstone project -DO- as a part of the Big Data Certification from McMaster CCE.

(1) Worked on the company's actual event data: developed a machine learning model suited to the problem, then tested, verified, and applied it.

(2) Applied graph theory and network analysis.


(1) Costa Rican Household Poverty Level Prediction (prediction-with-99-recall)

(2) Predictions on Brooklyn Housing Prices (brooklyn-housing-prices)

(3) Text classification modeling using NLTK movie reviews (Academics)

(4) Sentiment analysis of real-time Tweets (Academics)

(5) Text summarization of Wikipedia Canada page (Academics)

(6) Business analytics for the NYC Yellow Taxi dataset for the first three months of 2017 using only Power BI (first-three-mukherjee/)

(7) Predicting the success of an event based on dynamic multilayer-graph / ML analysis (Industry, Wondeur AI, Toronto)


• McMaster University, Hamilton, Certificate in Big Data Analytics, Jan–Dec 2018

• Jadavpur University, Ph.D. in Theoretical Condensed Matter Physics

• University of Calcutta, Master of Science in Physics

• University of Calcutta, Bachelor of Science in Physics

ADDITIONAL EXPERIENCE

McMaster University, Canada, Research Associate in theoretical physics, 2016–2017

• Framing problems, writing publication-standard articles, programming in Mathematica, and building models

Brock University, Canada, Postdoctoral Fellow in theoretical physics, 2014–2016

• Coding, model building, and data interpretation

Memorial University, Canada, Postdoctoral Fellow in theoretical physics, 2013–2014

• Extensive programming involving linear algebra and higher-dimensional matrix equations

Asia Pacific Center for Theoretical Physics, Pohang, South Korea, Postdoctoral Researcher, 2010–2013

• Programming in Mathematica, data interpretation, and model building

