Data Analyst Python

Location:
Detroit, MI
Posted:
October 09, 2020

Resume:

Aditya Jillellamudi

adgtjj@r.postjobfree.com 419-***-**** LinkedIn Detroit, MI

SUMMARY

Experienced data science professional in the health and banking sectors. Improved marketing strategies by reassessing project management needs, increasing revenue by 14%. Provided statistical guidance for diverse operations research problems. Created a churn model for customer turnover, saving the company ~$250K. Achieved an 11% improvement in a revenue loss estimation model.

PROFESSIONAL EXPERIENCE

Data Analyst, Henry Ford Health Systems 3/2020 – Present

Pathology Informatics (turnaround time) project:

• Gathered business requirements from the pathology informatics team and worked with various cross-functional teams.

• Performed preliminary data analysis and organized the gathered data, applying data quality and integrity checks, anomaly detection, and feature engineering using PySpark.

• Designed the data architecture and built a data model for the courier tracking project, making it possible to track pending logs.

• Built real-time geospatial COVID dashboards using Power BI and APIs to analyze trends and track antibodies.

• Determined KPIs and metrics to design various business cases and created data models with data workflows.

• Identified data leakage in the existing pipelines and improved data quality standards, resulting in $10K in savings.

• Retrieved large amounts of pathology informatics data using PostgreSQL queries and generated reports with Qlik Sense and SAP Crystal Reports.

• Built a data pipeline for revenue optimization using Snowflake and Python, enabling $90K in annual cost savings.

• Evaluated the performance of laboratory equipment based on accuracy, precision, and false positives.

• Performed in-depth exploratory, statistical, and inferential analysis using R to recommend actions for lowering patient expenses.

• Decreased testing turnaround time by 50% by optimizing and segmenting test tubes using Python.

HAP (Health Alliance Plan) project:

• Analyzed data extracted from multiple sources, including claims, provider, and member data, using SQL Server to identify and assess the business impact and trends of customer growth and revenue.

• Gathered large amounts of customer insurance data to analyze personal characteristics using Hadoop and Hive.

• Prepared the Business Requirement Document (BRD) by gathering requirements through detailed discussions with business users and SMEs, defining business processes, and identifying risks.

• Provided meaningful insights on customer data by creating ETL pipelines to extract the data and performing data analysis using Python.

• Evaluated associations between personal characteristics and customer plan decisions using transactional data.

• Formulated and built performance measures and KPIs for clients using population statistics and ML methods, based on customer feedback, engagement, and comments.

• Built interactive Tableau dashboards that give insights into overall performance at different levels of HAP (Health Alliance Plan).

Data Scientist Intern, Nemo IT Solutions 10/2019 - 3/2020

Data Analyst (GRA), Bowling Green State University 08/2018 - 09/2019

• Analyzed engagement survey data for the HR team to determine the factors responsible for employee turnover.

• Built models such as Random Forest and XGBoost in Python to determine the importance of different survey questions.

• Performed opinion mining (LDA) on text survey data to identify the main reasons for employee dissatisfaction.

• Built predictive models on food sales data to forecast product demand and weekly sales.

Data Analyst II, INFOSYS 10/2017 - 08/2018

• Created business and functional requirement documents capturing requirements from the business and SMEs.

• Integrated enterprise data and delivered self-service reporting and analysis with SAS BI.

• Translated business requirements into technical requirements, then converted them into implementable SAS solutions.

• Created fully interactive KPI Tableau dashboards translating business needs into data-driven solutions.

• Achieved a 68% renewal rate by predicting customer churn and reduced revenue loss by targeting segmented users.

• Generated actionable insights by performing statistical modeling and analysis of customers’ loan behavior.

• Developed ETL procedures, combined 10+ datasets, organized millions of rows of raw data, and wrote advanced SQL queries used in metric-reporting tools. Enhanced query performance without affecting accuracy.

• Achieved a 20% lower average loan cycle time by prioritizing customers based on their importance.

• Led an intern team to mine high-volume customer transactional data to develop predictive models using Python.

Data Analyst I, INFOSYS 2/2017 - 10/2017

• Extracted required project data from multiple client databases to prepare datasets using the ETL tool Talend.

• Performed preliminary data analysis and organized the gathered data, applying data integrity checks, anomaly detection, and feature engineering.

• Demonstrated the product to end users and gathered feedback in sprint review meetings.

• Acted as a liaison, collaborating with the client, accounting, and data science teams to design KPIs.

• Created fully interactive Tableau KPI dashboards translating business needs into data-driven solutions.

• Plotted model performance across different metrics and compared results weekly and monthly in Python using Matplotlib.

RELEVANT PROJECTS

GitHub

Customer Segmentation and Churn Model for Telecom:

• Developed an ML churn model to identify and list the contracts that are likely to be canceled soon.

• Handled imbalanced data using SMOTE and predicted the risk levels of churners and the revenue loss they cause.

• Built a customer segmentation model, then compared the distributions of relevant features across clusters.

Techniques Used: K-means clustering, XGBRegressor, Gradient-Boosted Trees, Random Forest. (PySpark)

Malaria Detection using blood cell images (Image classification):

• Extracted the infected parasite area in blood cell images as features using contour detection.

• Tested machine learning techniques on the features after converting the 27,560 images into a CSV table.

• Applied a convolutional neural network to the images to improve accuracy using TensorFlow.

Techniques Used: CV2, Contour detection, Random Forest, SVC, TensorFlow, Keras. (R, Python)

Topic modeling for reviews and Predicting price for an Airbnb:

• Predicted the expected price for a new listing based on the location, amenities, and other features a host can offer on Airbnb.

• Extracted missing features through text processing. Worked on feature engineering, EDA, and hyperparameter tuning.

• Developed a sentiment classifier using VADER (Python) and applied topic modeling to discover deciding factors.

Techniques Used: NLP, XGBoost, LDA, Linear Regression (Lasso and Ridge), Random Forest, Gensim. (Python)

Time Series Analysis and Forecasting for retail sales:

• Performed time series analysis on different categories in superstore sales data (4 years) and forecasted sales.

• Obtained the optimal set of parameters for fitting the SARIMAX model to analyze trends and draw conclusions.

Techniques Used: ARIMA, Pylab, Prophet. (Python)

Amazon Reviews and Ratings Analysis:

• Applied MapReduce techniques in Python to nearly 8 million Amazon electronics reviews (over 7 GB).

• Used PySpark to refine big data and performed exploratory data analysis and data visualization using Pyplot.

• Built a recommendation engine and characterized the relationship between the number of reviews and the average review score for both customers and products. Created dashboards using SAP Predictive Analysis.

Techniques Used: K-means, Linear SVM, Random Forest, NLP. (Spark, Python, Tableau)

SKILLS

Programming Languages/tools: SQL, Python (Scikit-learn, Seaborn), Spark, R, Tableau, Qlik, Unix, Excel.

Modeling Techniques: Linear & Logistic Regression, Naïve Bayes, SVM, Random forest, DecisionTree, Clustering, Text Analytics, Cross validation, NLP, root cause analysis, Web scraping, Neural nets, Spark MLlib, ensemble models.

Certifications: Machine learning, Neural Networks and Deep Learning (Coursera, Infosys Lex)

EDUCATION

Bowling Green State University: M.S., Data Analytics 08/2018 - 08/2019

Coursework: Regression Analysis, Time series Analysis, Decision Optimization, Exploratory Data Analysis, Business Intelligence, Machine Learning, Data Mining, Big Data Analytics, Project Management.

Amrita University: B. Tech, Electronics and Communications 06/2012 - 06/2016

Coursework: Operations Research, Statistics and Numerical methods, Vector calculus, Calculus, Matrix Algebra.
