
Data Scientist

Location:
Bloomington, IN
Posted:
August 27, 2022


Resume:

Vikrant Deshpande

*******.**************@*****.*** Github LinkedIn Bloomington, Indiana

Education

Indiana University Bloomington

Master of Science, Data Science

Aug 2021 - May 2023

University of Pune

Bachelor of Engineering, Information Technology

Aug 2013 - May 2017

Experience

Indiana University

Graduate Research Assistant

Aug 2021 - Present

Bloomington, IN

● Demonstrated the statistical significance of demographic metadata on the size of Functional Tissue Units in organs, using power analysis and multiple linear regression.

● Extracted meaningful reports to compare scRNA-sequencing tools for annotating cell-type partonomy structures.

● Researched disparate data sources, and enabled researchers to map more cell-types into the HuBMAP project.

● Automated comparison reports, reducing manual checks by 40%, using GitHub Actions, R, and Unix Bash.

MasterBrand Cabinets Inc.

Data Science Intern

Jun 2022 – Aug 2022

Jasper, IN

● Architected a multiple time-series framework to forecast product demand via Seasonal-ARIMA and FB Prophet.

● Collaborated with the Finance and Marketing teams as a cross-functional intern to find causal factors for demand spikes.

● Created actionable experiment-tracking visualizations to find macroeconomic factors affecting trend changes.

● Built Azure Machine Learning pipelines for dynamic model selection and forecasting, with a minimum MAPE of 1.2%.

Deloitte USI

Consultant - Data Engineer

Jan 2018 - Jul 2021

Mumbai, India

● Specialized in Master Data Delivery and Clinical Data Management, right from data instrumentation to insight.

● Improved patient-risk monitoring using LASSO regression models in R, with a Tableau report to track variations.

● Enhanced Master-Data language detection by 40%, by using a FastText classification model.

● Developed 40+ ETL pipelines for ingestion and aggregation from a variety of sources like APIs, AWS S3 Buckets, and SharePoint Lists into a central SQLServer Data Warehouse.

● Designed effective Data Quality frameworks, and Data Model strategies for Type 1 and Type 2 SCD tables.

● Reduced reporting time by 60% across the entire system, through SQL performance tuning and automated alerts.

● Built a Flask app for parsing confidential XML/XSD data, reducing the weekly manual effort of 10 FTEs by 80%.

Projects

Distributed Weather Reporter

Python, Java, MongoDB, Docker, Kafka, Circle-CI, GitHub

Jan 2022 - May 2022

● Led a team of 3 to create a loosely coupled distributed weather-reporting system using microservices.

● Developed the data-centric Flask service for visualization, and SpringBoot service for audits and authentication.

● Used Kafka for streaming GIF image reports, and Circle-CI for CI/CD pipeline of Kubernetes pods (Link).

Masterizing Hospital Entities

Python, PySpark, R, SQL, Bash, Informatica PowerCenter

Dec 2021 - Jan 2022

● Architected an in-house tool to identify a single source of truth for better KPI reporting via master-data management.

● Achieved a 64% compression ratio on a dataset of 50k records, by building a data pipeline that recursively parses and merges subsets of data and uses the Levenshtein algorithm for fuzzy string comparisons (Link).

XML/XSD Parser for PySpark Data-Ingestion

Python, Flask, jQuery, HTML, CSS, Heroku

Jan 2020 - Apr 2020

● Created an app aimed to allow user-friendly ingestion of confidential XML/XSD data into PySpark.

● The app parses XML/XSD data, and translates the chosen tags to a relational schema format.

● Deployed it to Heroku; used by 2 teams in Deloitte-USI to make manual XML parsing 80% quicker (Link).

Skills

● Languages: Python, SQL, R, Bash, Java, JavaScript.

● Databases: SQL Server, MySQL, MongoDB, Postgres, Snowflake.

● Frameworks & packages: Flask, PySpark, Tidyverse, SpringBoot, SciKit, Airflow, TensorFlow, Keras.

● Tools: Git, Tableau, Heroku, Docker, Kafka, Kubernetes, Informatica, Jira, Bitbucket, Postman, AWS, Azure.

Achievements

● Graduate-Program Ambassador: Data Science Program Indiana University 2022-23

● Research Publications/Patent: Undergraduate thesis in IEEE ICCCA 2017: “Voice over Light-Fidelity (VoLF)”.

● Bronze Medal, Kaggle Competition: Revenue prediction on a dataset of 1.6 million customer transactions (Top 6%).

● Certifications: AWS Certified Cloud Practitioner, Deep Learning Specialization


