Data Analyst Python

Location:

Alpharetta, GA

Posted:

September 02, 2020


Resume:

***** ********** **

Alpharetta, GA.

507-***-**** *****.******@*****.***

Divya Mereddy

https://www.linkedin.com/in/divya-mereddy

https://github.com/DivyaMereddy007

https://sites.google.com/prod/view/divyamereddy

SKILLS

Programming: Python, R, SQL Server, Spark (R, Scala, Python), Hive, NoSQL (MongoDB, SPARQL), Weka, Google Analytics, Django, AngularJS, MATLAB.

Utilities: AWS (EC2, S3, RDS, Machine Learning, AWS IoT, Redshift), Azure, Tableau, ggplot.

Machine Learning Algorithms: Supervised and Unsupervised Learning, Deep Learning, Optimization Techniques, Ensemble Learning Techniques, Association Techniques, Clustering Techniques, Survival Models.

Other Skills: Leadership skills, Passion to learn new technologies, Advanced Excel, Presentation skills, Ability to work independently and on a team, Innovative, Attention to detail, Expert in Agile.

EDUCATION

THE UNIVERSITY OF CINCINNATI, OHIO

Masters in Computer Science (GPA: 3.7)

Coursework on Advanced Algorithms, Machine Learning, Intelligent Data Analysis, Cloud Computing, AI etc.

EXPERIENCE

EMPLOYER: LUXOFT

CLIENT: AT&T, GA, USA FEB 2020 – PRESENT

Data Scientist

Skills: Python, Hadoop (Hive), Teradata, SQL, classification techniques, ensemble methods

Working end to end on a model that predicts which dispatches are actually necessary among customer-reported tickets that MLT testing marked as not necessary.

Working on data extraction and feature creation from the data lake using Hive and Teradata.

Developed and validated classification models with different types of data and with different ML techniques using SVM, Linear Logistic Regression, Random Forest, Gradient Boosting, Decision Tree.

Achieved good accuracy by applying ensemble methods and imbalanced-data handling techniques such as Balanced Random Forest, Balanced SVM, and undersampling.

MANTECH, MD, USA (UNITED HEALTH SERVICES, MEDICARE & MEDICAID) JULY 2019 – FEB 2020

Data Scientist

Skills: R, Spark (R, Scala), H2O, SystemML, DML, PhotonML, Survival Analysis (Cox Model, AFT), AWS (EC2, EMR, S3 data lake)

Involved in rewriting a legacy R Cox proportional hazards model for a distributed environment to predict the hazard rate of facility providers in the ESRD system and derive the providers' standard transfusion rates (STRR).

As part of this, developed the Cox proportional hazards model in Spark using Apache SystemML with Scala and DML.

Involved in developing the same project using SparkML AFT model.

Developed the same project in H2O with sparklyr. Introduced artificial weights into a model that does not accept weights as input, and developed a predict function from scratch to support offsets, weights, etc. in the prediction system.

Involved in analyzing and comparing the results of legacy R model with our new distributed models.

Worked on data extraction and modification using sparklyr and H2O.

Involved in EC2 and EMR setup and base Spark setup for distributed models using H2O, PhotonML, reticulate, etc.

Worked on rewriting Standard Hospitalization Rate (SHR) for providers based on Cox Model and successfully handled large data by implementing techniques like PCA and EMR configuration improvement.

Involved in rewriting a SAS GEE model in R language for VAT model and rewriting R language GLMER Standardized Re-admission Rate (SRR) model in distributed environment using photonML package.

UNITED INSTALLS, KY, USA MAY 2018 – JUN 2019

Data Scientist, Team Lead

Skills: Python, SQL Server, Unix, AWS (EC2, S3, RDS, Machine Learning, Redshift), Regression, Time series, Clustering, Tableau

Worked end to end on analytics systems, from data extraction (using APIs & SSIS) to visualization, including business requirements collection, data cleaning, filtering, and developing machine learning algorithms.

Developed, in Python, a SARMA time series order management (supply chain) system that predicts order counts and the goods to be ordered, in order to maintain a continuous workflow and decrease service delivery time.

Developed, in Python, an automated optimized scheduling system using geospatial HDBSCAN clustering to increase installer utilization and decrease service delivery time.

Led an application team of 6 members. Involved in hiring employees and consulting services, collecting requirements, documenting the project, building the application workflow, guiding the team, and tracking resource utilization.

Worked on AWS EC2, S3, Redshift, cloud RDS database.

Developed Tableau dashboard to visualize goods analysis, service analysis results, employee utilization etc.

Performed statistical tests and checks such as the Dickey–Fuller test, rolling averages, t-test, and F-test to assess the scope of the project and compare the performance of different machine learning algorithms on our data.

Developed a POC on User Interest Prediction on Home Improvement Business data in Python.

Involved in a POC on Predicting Floorplans System based on customer input like number of floors, bedrooms, bathrooms etc for creating visual building plans using Decision Trees in Python.

Developed a POC on a Price Optimization System using a regression algorithm to dynamically change our home improvement service prices based on factors like supply and demand.

Predicted the estimated task time using multivariate regression based on factors like installation type, number of people in the crew, rating, etc.

UNIVERSITY OF CINCINNATI IT, CINCINNATI, OH JAN 2018 – APR 2018

Student Worker (Data Analyst)

Skills: Python, SSIS, Azure, SQL, SPARQL, Controlled Vocabulary, Apache Solr

Extracted, transformed, and loaded data from different systems of the UC medical campus for further analysis using SQL, SSIS, etc. Worked on data validation, filtering, cleaning, and mapping using Python.

Worked on developing Solr search implementing a controlled vocabulary and on visualizing the resulting data in network diagrams using Python. Involved in Azure cloud setup.

Developed an SSIS system to load data into an EAV database from an RDF file using SPARQL.

Worked on enhancements of a Naive Bayes classifier to analyze the university financial system and enrollment rate based on the growth of different departments and the plans implemented. Provided analysis reports to managers showing which departments can help university growth and where money should be spent.

TATA CONSULTANCY SERVICES, HYD, INDIA SEP 2015 – AUG 2017

Research Engineer (Machine learning), Team lead

Skills: Python, Tableau, SQL, AWS, Association Algorithms

Developed a Content-Based and Collaborative-Based Filtering User Interest Prediction application based on the user purchase history and plausible requirements using Python.

Led a data science team of 5 people. Involved in requirements collection and client meetings.

Involved in high priority production support.

Worked on Data Visualization of Retail and Wholesale Data Analytics using Tableau.

Developed a system to predict suitable candidates for a position and filter them for the recruiting process based on resumes available on Indeed, LinkedIn, Glassdoor, etc.

Data Analyst, Head of Fun at Work, Head of a project development team

Skills: Spark, SQL, Java, ID3 Algorithm

Developed an ID3 algorithm using Hadoop frameworks (PySpark and Hive) to analyze the impact of factors such as quality of education, research, and extra activities on educational organization development.

Also experienced in presenting data patterns to clients and managers to support business decision making.

Worked as a full-stack developer in the manufacturing domain in a collaborative team, using SQL, AngularJS, and Java.

Received numerous awards from the client and TCS. Led a team of 10 people in a project development activity and achieved best team award & “Best Idea of The Project”.

M SYSTEMS, HYD, INDIA APR 2015 - SEP 2015

Data Science Developer

Skills: SQL, Python, Naive Bayes Classifier

Worked end to end on a Lyrics Mood Classifier application that classifies songs as positive or negative based on their words, using a Naive Bayes classifier in Python. This is useful for automatically playing songs in public places depending on the mood of the people there.

BHARAT HEAVY ELECTRICALS, HYD, INDIA JAN 2014 – JUNE 2014

Developer

Skills: SQL, Python, J2EE, JavaScript

Developed a J2EE and JavaScript web application to calculate power distribution.

PROJECTS

Developed an Income Prediction Naive-Bayes Tree algorithm using Weka to predict the income of an individual.

Developed a Regression tree in Python without using built-in libraries, based on the Gini index and a popularity-based splitting index, for the Breast Cancer Prognostics and Wine Quality datasets.

Developed a Decision Tree algorithm to predict whether a patient is normal or abnormal from biomechanical data.

Developed Linear SVM and non-linear SVM with RBF on Biomechanical data in Python.

Developed a K-means algorithm on BCP and Wine Quality dataset in Python.

Linear Regression with single feature and multiple features: Developed an algorithm to predict the patient illness.

Using the Hadoop streaming API, built mapper and reducer scripts to analyze vehicle accident data.

Developed a project to analyze the weather dataset using Hive.

Developed a cloud web application for user account creation and login using AWS.

Developed a project to detect anomalies in IoT data.

ACHIEVEMENTS & CERTIFICATIONS

“MSME- GOVERNMENT OF INDIA” funded projects.

Worked as “ICRE-2015” Reviewer

“IEEE ICESA-2015” Published paper

Project presentation from UCIT at CNI Spring 2018.

Oracle Certified Java SE 6 Programmer

Basic R Programming

Python for Data Science by DataCamp

Shaping up with Angular.js
