PRIYANKA PHADNIS
DATA SCIENTIST
DATA ENGINEER
SUMMARY
Data Engineer with a combination of
software and data skills and 4+ years of
professional experience. Master’s degree in
Data Science. Skilled in building and
deploying RESTful web services, Python
scripts, SQL, and Airflow DAGs. Proficient in
machine learning and data analysis. Open to
relocation anywhere in the United States.
CONTACT
***************@*****.***
Los Angeles, CA
linkedin.com/in/priyankaphadnis/
https://github.com/pinks1995
EDUCATION
2021
WPI [Worcester, MA, USA]
Master of Science in Data Science
2017
VJTI [Mumbai, MH, India]
Bachelor of Technology in Electronics
MANAGEMENT SKILLS
JIRA, Git, GitLab, MS Office, Excel, Rally,
MS PowerPoint.
TECHNICAL SKILLS
Programming languages: Java, Python,
SQL, TypeScript, MySQL, CSS
Frameworks: Spring Boot, MVC,
Hibernate, ReactJS, Hadoop, Spark,
Apache Kafka, PyTorch, TensorFlow
Data Visualization: Seaborn,
Matplotlib
Data Manipulation: Pandas, NumPy,
Scikit-learn, spaCy, OpenCV
Modelling: Scikit-learn, Keras
Cloud: Google Cloud Platform, Cloud
Composer, Airflow, Google Dataproc
Databases: Cloud SQL, PostgreSQL
Data Warehouse: Google BigQuery
WORK EXPERIENCE
Data Engineer Aetna/CVS Health, Remote, September 2022 – Present
Created Python utility packages that read Google Dataproc cluster properties from a text file, and integrated them into all Apache Airflow DAGs used to target users in CVS’s experimentation framework.
Implemented checkpointing in PySpark with Google BigQuery and Google Cloud Storage buckets to prevent data mismatches between two consecutive write operations, which had caused incorrect information to be used.
Software Developer Optum, Cypress, Orange County, July 2021 – August 2022
As part of the operations team, developed an end-to-end web service in Spring Boot and Kubernetes that merges each patient’s medical reports into a single PDF, enabling doctors to view all reports in one document.
Developed and deployed web services that enabled machine-assisted prior authorization of medical tests by doctors and nurses.
Data Science Intern Homesite Insurance, Boston, Feb 2021 – May 2021
Redesigned parts of Homesite’s homegrown insurance peril risk rating package in R to integrate H2O, reducing model-building time by 50%.
Implemented H2O’s ElasticNet regularization to improve the performance of the original generalized linear models for fire frequency and severity prediction.
Data Science Intern Eatiquette, Worcester, Aug 2020 – Dec 2020
Developed a Python script to extract ingredients from 1,000 food product descriptions. Cleaned the extracted text using custom regex rules and pandas, reducing 20,000 extracted ingredients to a list of 8,000 unique ingredients.
Software Developer CitiusTech, Mumbai, July 2017 – March 2019
Developed and deployed multiple web services in Spring Boot that retrieved patient data from a Mirth channel and helped the database team monitor and analyze the quality of HL7 data attributes in patients’ medical reports and electronic health records.
DATA SCIENCE PROJECTS
Salary prediction using Regression in Python
Developed a salary prediction model in Python with scikit-learn that helps talent acquisition and HR departments improve their compensation strategy to attract the best talent and reduce the attrition rate.
Built a Random Forest regressor with a mean squared error of 367, against a linear regression baseline of 385, when trained on a dataset of one million records.
Predicting Telecom Customer Churn using Python – Feb 2019
Developed models to predict next-month telecom customer churn using logistic regression, XGBoost, and a sequential neural network, built with Keras and scikit-learn, with accuracies of 76%, 78%, and 79% respectively.
Airline Sentiment Analysis using Text Sentiment Analysis
Performed sentiment analysis on text car reviews using text cleaning and VADER sentiment analysis, and generated tags for car reviews that help summarize a review quickly.
Implemented BM25 to search for the most relevant reviews by keyword.