Debolin Sinha Email: *******.*****@*****.***
Phone: 614-***-****
Location: Columbus, OH
Data Scientist linkedin.com/in/debolin7
Data Scientist with a thorough understanding of current and emerging software technologies and 7+ years of experience in data mining, wrangling, and building machine learning models. Experienced in creating supervised and unsupervised learning models and delivering vital business insights to implement action-oriented solutions to complex business problems.
Databases: MS-SQL Server, AWS RedShift, AWS S3, PostgreSQL, Snowflake
Languages: SQL, Python, R, Java, C#
DevTools: Jupyter, AWS, Databricks, MS-SQL Studio, Visual Studio, R-Studio, MS Azure
Big Data Tools: Hadoop MapReduce, Hive, Spark, Scala, Kaggle, MongoDB, NoSQL
ETL: AWS Data Pipeline, AWS Glue, MS-SSIS, PySpark
Machine Learning: scikit-learn, NumPy, Pandas, Keras, TensorFlow, PyTorch, SageMaker
Reporting Tools: Power BI, Tableau, Looker, MS-SSRS, Matplotlib
Methodologies: OOP, MVC, SDLC, Agile, Scrum
EXPERIENCE
Data Scientist
Tracir Financial Services
Reynoldsburg, OH
July 2015 – Current
Built a Loan Default Predictor using an artificial neural network (ANN) that estimated loan default probabilities from a set of risk variables; the model used a GAM framework to achieve the desired accuracy. Increased the efficiency of the collections department by 34% and led to a revamp of the team and its resources.
Consumed and transformed Parquet-formatted bureau data from third-party sources such as RiskView and Equifax using AWS Redshift and Athena. Used this data to build the company's proprietary application scoring model with KNN and tuned Random Forest models to predict delinquency.
Built and deployed unsupervised learning models on EDW data using K-means clustering and association rules to find key metrics related to customer payment habits and the fee structures of car dealerships.
Supervised inventory analysis and built car sales prediction models for a top-50 US car dealership (Ricart Automotive) using AWS SageMaker and the XGBoost algorithm.
Used the AWS Glue Data Catalog to create and monitor ETL jobs. Customized AWS Glue PySpark scripts to add custom transforms and time- and event-based schedules.
Migrated petabytes of unstructured EDW data from the company's legacy infrastructure to S3 buckets for initial loads, then ran AWS Glue Crawlers to transform and centralize company data with Amazon Redshift Spectrum.
Built complex, multi-valued, parameterized visualization reports for the sales, collections, and recovery teams in Tableau, Looker, and Power BI, using DAX (Data Analysis Expressions) to provide a multi-perspective view into datasets.
Data Scientist
Infor (TDCI)
Columbus, OH
October 2014 – July 2015
Used machine learning models to predict sales of Infor’s SyteLine product with a 92% accuracy rate.
Ingested & centralized data into AWS S3 to build data lakes for analytical insight.
Worked on a dynamic pricing model to price Infor’s SalesPortal for prospective clients.
Wrote multiple complex stored procedures and triggers for ETL jobs (T-SQL and SSIS).
Designed complex SSRS reports for the Sales department.
EDUCATION
BS in Computer Science
South Dakota State University
2014