Debolin Sinha Email: *******.*****@*****.***
Phone: 614-***-****
Location: Columbus, OH
Data Scientist linkedin.com/in/debolin7
Data Scientist with a thorough understanding of current and emerging software technologies and 7+ years of experience in data mining, wrangling, and building machine learning models. Experienced in creating supervised and unsupervised learning models and delivering vital business insights to implement action-oriented solutions to complex business problems.
Databases: MS-SQL Server, AWS RedShift, AWS S3, PostgreSQL, Snowflake
Languages: SQL, Python, R, Java, C#
DevTools: Jupyter, AWS, Databricks, MS-SQL Studio, Visual Studio, R-Studio, MS Azure
Big Data Tools: Hadoop MapReduce, Hive, Spark, Scala, Kaggle, MongoDB, NoSQL
ETL: AWS Data Pipeline, AWS Glue, MS-SSIS, PySpark
Machine Learning: scikit-learn, NumPy, Pandas, Keras, TensorFlow, PyTorch, SageMaker
Reporting Tools: Power BI, Tableau, Looker, MS-SSRS, Matplotlib
Methodologies: OOP, MVC, SDLC, Agile, Scrum
EXPERIENCE
Data Scientist
Tracir Financial Services
Reynoldsburg, OH
July 2015 – Current
Built a Loan Default Predictor using an artificial neural network (ANN) that estimated loan default probabilities from a set of risk variables; the model used a GAM framework to achieve the desired accuracy. Increased the efficiency of the collections department by 34% and led to a revamp of the team and its resources.
Consumed and transformed Parquet-formatted bureau data from third-party sources such as RiskView and Equifax using AWS Redshift and Athena. Used this data to build the company's proprietary application scoring model with KNN and tuned Random Forest models to predict delinquency.
Built and deployed unsupervised learning models on EDW data using K-means clustering and association rules to find key metrics related to customer payment habits and the fee structures of car dealerships.
Supervised inventory analysis and built car sales prediction models for a top-50 US car dealership (Ricart Automotive) using AWS SageMaker and the XGBoost algorithm.
Used the AWS Glue Data Catalog to create and monitor ETL jobs. Customized AWS Glue PySpark scripts to add custom transforms and time- and event-based schedules.
Migrated petabytes of unstructured EDW data from the company's legacy infrastructure to S3 buckets for initial loads, then ran AWS Glue Crawlers to transform and centralize company data with Amazon Redshift Spectrum.
Built complex, multi-valued, parameterized visualization reports for the sales, collections, and recovery teams in Tableau, Looker, and Power BI, using DAX (Data Analysis Expressions) to provide a multi-perspective view into datasets.
Data Scientist
Infor (TDCI)
Columbus, OH
October 2014 – July 2015
Used machine learning models to predict sales of Infor’s SyteLine product with a 92% accuracy rate.
Ingested & centralized data into AWS S3 to build data lakes for analytical insight.
Worked on a dynamic pricing model to price Infor’s SalesPortal for prospective clients.
Wrote multiple complex stored procedures and triggers for ETL jobs (T-SQL and SSIS).
Designed complex SSRS reports for the Sales department.
EDUCATION
BS in Computer Science
South Dakota State University
2014