Data Engineering

Location:
Chicago, IL
Posted:
April 16, 2021


Resume:

RITIN VASHISHTHA

312-***-**** Chicago, IL adlqy1@r.postjobfree.com www.linkedin.com/in/ritinv

EDUCATION

Master of Science, Business Analytics, University of Illinois at Chicago, 2019-2020

Bachelor of Technology, Electronics and Telecomm Engineering, KIIT University, 2013-2017

WORK EXPERIENCE

University of Illinois at Chicago – Research Assistant Feb 2021

• Researching the effect of certain data patterns on data integrity and quality in UI Health databases.

• Building data repositories for statistical data analytics, and testing health records for quality standards.

• Cleaning and wrangling datasets, and extending current data quality characterizations to new data sources using SQL and Python.

Deloitte – Solution Delivery Analyst July 2017 – May 2019

● Worked as a Cyber Security Analyst, providing in-depth security incident reports through careful analysis of logs and data, and delivering solutions to time-sensitive security incidents, averaging 35+ incidents per day for Fortune 500 clients across industries.

● Handled multiple high/critical priority tasks and ad-hoc requests simultaneously. The role included querying and filtering in threat-handling tools to extract datasets, tables, payloads, and logs for analysis.

● Worked in sync with other engineering and analytics teams on various use cases for content testing and tuning recommendations.

PROJECTS

Capstone: Kavi Labs SEPT 2020 – DEC 2020

● Built predictive models on a healthcare dataset of 60,000 ICU admissions to predict ICU mortality, using Kavi’s no-code data engineering and data analytics platforms, Plexa and Advana, which are based on Spark machine learning.

● Captured insights about the data by exploring it in Power BI, followed by data transformations using the data engineering platform Plexa.

● Reduced the data transformation steps needed for individual use cases by creating a master repository of the common variables and features used across all the use cases.

● Trained and tuned machine learning algorithms such as decision trees, random forests, and logistic regression to optimize model performance on the data analytics platform Advana.

Tableau FEB 2020

● Presented a comprehensive study of the number of flights operating from different geographical locations in the USA, with comparisons based on measures such as on-time/late arrivals and departures, number of passengers traveling, local geography, and climate.

Finance: Portfolio Management FEB 2020 – MAR 2020

● Computed returns, deviations, and correlations for 6 different companies for SPX over a period of 6 years. Created an EWP for the same and calculated excess returns, beta coefficients, total variance, and R².

Targeted Marketing – Paralyzed Veterans of America (PVA) Fundraising SEP 2019 – OCT 2019

● Applied the Synthetic Minority Over-sampling Technique (SMOTE) to balance the imbalanced dataset.

● Used GBM and RF for feature importance, followed by PCA, to develop an RF classification model with 95.01% accuracy.

● Developed regression models to predict the donation amount using GBM, RF, and OLS; RF performed best, with a train RMSE of 7.44 and a test RMSE of 7.61.

Loan default prediction and investment strategies OCT 2019 – NOV 2019

● Used a dataset of 120k customers to develop a binary classification model that predicts whether a loan is risky.

● Conducted EDA and built decision tree (train: 86.15%, test: 83.20%) and RF (train: 100%, test: 84.93%) models.

● Developed a GBM model with the top 30 most important variables, which resulted in a test accuracy of 85.73%.

● Plotted lift and profit curves based on various threshold values of the probability of fully paying a loan.

Deloitte Project OCT 2018

● Reduced individual workload on team members by 40% by identifying key gaps in ticket assignment and time worked using JIRA and Excel pivot tables.

SKILLS

Languages and Tools: Python, R, Tableau, SQL, Power BI

Frameworks: Scikit-learn, NumPy, Pandas, PySpark, Matplotlib

Statistical: Random Forest, GBM, XGBoost, Decision Tree, SVM, Naive Bayes, KNN, Neural Networks (MLP, LSTM, GRU, CNN), Gaussian Mixture Models, K-means, PCA, LDA, Time series analysis, NLP, EDA (Exploratory Data Analysis), Regression analysis

Packages: Microsoft Office 365, Visual Studio

Incident Management: ServiceNow, JIRA, Confluence, SharePoint, Agile

RELEVANT COURSEWORK

Technical: Data Mining, Machine Learning, Advanced Database Management Systems, Deep Learning, Text Mining, Big Data analytics, Analytics Strategy & Practice

Business: Corporate Finance, Investments, Derivative markets, Accounting


