Vikrant Deshpande
*******.**************@*****.*** GitHub LinkedIn Bloomington, Indiana
Education
Indiana University Bloomington
Master of Science, Data Science
Aug 2021 - May 2023
University of Pune
Bachelor of Engineering, Information Technology
Aug 2013 - May 2017
Experience
Indiana University
Graduate Research Assistant
Aug 2021 - Present
Bloomington, IN
● Demonstrated the statistical significance of demographic metadata on the size of Functional Tissue Units in organs, using power analysis and multiple linear regression.
● Extracted meaningful reports to compare scRNA-sequencing tools for annotating cell-type partonomy structures.
● Researched disparate data sources, and enabled researchers to map more cell-types into the HuBMAP project.
● Automated comparison reports using GitHub Actions, R, and Bash, reducing manual checks by 40%.
MasterBrand Cabinets Inc.
Data Science Intern
Jun 2022 - Aug 2022
Jasper, IN
● Architected a multiple time-series framework to forecast product demand via Seasonal-ARIMA and FB Prophet.
● Collaborated with the Finance and Marketing teams as a cross-functional intern to find causal factors for demand spikes.
● Created actionable experiment-tracking visualizations to find macroeconomic factors affecting trend changes.
● Built Azure Machine Learning pipelines for dynamic model selection and forecasting, achieving a minimum MAPE of 1.2%.
Deloitte USI
Consultant - Data Engineer
Jan 2018 - Jul 2021
Mumbai, India
● Specialized in Master Data Delivery and Clinical Data Management, from data instrumentation through to insight.
● Improved patient-risk monitoring using LASSO regression models in R, with a Tableau report to track variations.
● Improved master-data language detection by 40% using a FastText classification model.
● Developed 40+ ETL pipelines for ingestion and aggregation from sources such as APIs, AWS S3 buckets, and SharePoint Lists into a central SQL Server data warehouse.
● Designed effective Data Quality frameworks, and Data Model strategies for Type 1 and Type 2 SCD tables.
● Reduced reporting time by 60% across the entire system, through SQL performance tuning and automated alerts.
● Built a Flask app for parsing confidential XML/XSD data, reducing the weekly manual effort of 10 FTEs by 80%.
Projects
Distributed Weather Reporter Python, Java, MongoDB, Docker, Kafka, Circle-CI, GitHub Jan 2022 - May 2022
● Led a team of 3 to create a loosely coupled distributed weather-reporting system using microservices.
● Developed the data-centric Flask service for visualization, and SpringBoot service for audits and authentication.
● Used Kafka for streaming GIF image reports, and Circle-CI for the CI/CD pipeline of Kubernetes pods (Link).
Masterizing Hospital Entities Python, PySpark, R, SQL, Bash, Informatica PowerCenter Dec 2021 - Jan 2022
● Architected an in-house tool to identify a single source of truth for better KPI reporting via master-data management.
● Achieved a 64% compression ratio on a dataset of 50k records by building a data pipeline that recursively parses and merges subsets of data and uses the Levenshtein algorithm for fuzzy string comparisons (Link).
XML/XSD Parser for PySpark Data-Ingestion Python, Flask, jQuery, HTML, CSS, Heroku Jan 2020 - Apr 2020
● Created an app for user-friendly ingestion of confidential XML/XSD data into PySpark.
● The app parses XML/XSD data and translates the chosen tags into a relational schema format.
● Deployed it to Heroku; used by 2 teams in Deloitte-USI, making manual XML parsing 80% quicker (Link).
Skills
● Languages: Python, SQL, R, Bash, Java, JavaScript.
● Databases: SQL Server, MySQL, MongoDB, Postgres, Snowflake.
● Frameworks & packages: Flask, PySpark, Tidyverse, SpringBoot, SciKit, Airflow, TensorFlow, Keras.
● Tools: Git, Tableau, Heroku, Docker, Kafka, Kubernetes, Informatica, Jira, Bitbucket, Postman, AWS, Azure.
Achievements
● Graduate-Program Ambassador: Data Science Program, Indiana University, 2022-23
● Research Publications/Patent: Undergraduate thesis in IEEE ICCCA 2017: “Voice over Light-Fidelity (VoLF)”.
● Bronze Medal, Kaggle Competition: Revenue prediction on a dataset of 1.6 million customer transactions (Top 6%).
● Certifications: AWS Certified Cloud Practitioner, Deep Learning Specialization