Data Analyst Python

Location:

Indianapolis, IN

Salary:

60000

Posted:

October 16, 2020

Resume:

Bhavani Prasad Rao Ejanthkar

Data Scientist | Data Engineer | Data Analyst

********@******.***

317-***-****

Indianapolis, IN

linkedin.com/in/bhavaniprasad73

SKILLS

Programming Languages: Python, R, SQL, Tableau, Power BI, Excel

Machine Learning/Data Science: Predictive Modeling, Regression, Clustering, Classification, Anomaly Detection

Libraries: Tidyverse, NumPy, Pandas, SciPy, Scikit-Learn, Matplotlib, Seaborn, Bokeh

Techniques: ETL, Data Cleaning/Quality/Extraction/Visualizations, Supervised/Unsupervised Learning

Cloud Platforms: AWS, Azure, Databricks

Distributed Computing: Hadoop, Spark, PySpark, Spark SQL, Pig, EMR

Leadership: Cross-functional teammate capable of translating business objectives into actionable insight

EXPERIENCE

Data Scientist Consultant 01/2020 – 07/2020

Solware IT Technologies (DXC Technology) Tysons, VA

Project Requirement: To build the data architecture for the ETL pipeline that automates key business decisions

Created a MySQL database that pulls data from legacy systems

Developed a connection between MySQL and Python for data cleaning and quality checks

Built a scalable ETL pipeline that automated the dashboard and reports in Power BI and Excel

Built proof-of-concept predictive models to estimate projected revenue for the next quarter and year using a Linear Regression model

Communicated actionable insights from data to inform strategy and help leadership make informed decisions

Co-founder 01/2019 – Present

Zenext LLC Indianapolis, IN

Mission: To build an AI-driven voice assistant device for law enforcement officers in the US to make public communities safer

Collaborated and performed user-centered research with Carmel PD, NYPD, CMUPD, and Chicago PD to collect the pain points and challenges police officers face in an operational environment

Partnered with Carmel Police Department to test the solution in real-time

Received a total of $112k in funding from NIST to develop the solution further

Developing a best-in-class product using cutting-edge technology to advance public safety communications and help emergency responders save lives

Database Developer and Data Analyst 12/2017 – 12/2018

IU School of Medicine Indianapolis, IN

Project Requirement: To build a database for a nursing centers web application

Designed & developed the database to store, retrieve, and update data

Performed EDA on data by integrating the database with Python

Actively collaborated with the team and medical professionals to align the project with their needs

The prototype won grant money of $30,000

Business Data Analyst 01/2015 – 12/2016

ProGen Business Solutions Hyderabad, India

Project Requirement: To provide actionable insights from customer sales transaction data

Created an ETL data pipeline that pulls data from the database into Python for downstream analysis

Performed EDA to identify patterns in the customer sales data

Prepared financial and business sales revenue reports using Tableau dashboards

Provided KPIs for improving sales revenue and actionable insights for making key business decisions

EDUCATION

MS, Computer Science, Purdue University, Indiana, USA 12/2018

BS, Computer Science, JNTU, Hyderabad, India 06/2015

PROJECTS

Bank Marketing Campaign Prediction Using PySpark and MLlib in Databricks

Goal: To build an end-to-end data science pipeline for predicting whether a client will subscribe to a term deposit

Extracted, transformed, and loaded data into a DBFS table

Created a Spark cluster and performed EDA and feature engineering in Python

Implemented Logistic Regression, Random Forest, Decision Tree, and XGBoost models to predict whether a client will subscribe to a term deposit

Achieved an optimized accuracy score of 90.8% with an XGBoost model after performing hyperparameter tuning

Heart Disease Prediction Using Spark in Databricks

Goal: To build an end-to-end data science pipeline for predicting heart disease using Spark SQL and PySpark in Databricks

Extracted, transformed, and loaded data into a DBFS table

Created a cluster and performed data cleaning, data quality checks, and EDA in Jupyter Notebook

Implemented a Decision Tree model to predict patients' heart disease

Achieved an optimized accuracy score of 98.3% with improved time and space complexity

Built a scalable workflow that is 20x faster than Hadoop for large-scale data processing

Airplane Data Analysis Using Apache Pig

Goal: To store, process, and build an ETL pipeline for analyzing 1 million rows of data in AWS using Apache Pig

Extracted, transformed, and loaded data into an S3 bucket

Ran the MapReduce job on the EMR cluster, with compute on an EC2 instance

Performed analysis on 1 million records of airplane traffic data

Obtained insights with high efficiency and in less time

Illinois 101st General Assembly Prediction

Goal: To predict whether a bill becomes a Public Act

Created an end-to-end data science pipeline

Extracted data from a website through web scraping in Python

Performed EDA, Statistical Analysis, and Data Visualization

Implemented Oversampling, Undersampling, and SMOTE on the Imbalanced Classes

Performed classification using Logistic Regression, Random Forest, XGBoost, and Decision Tree

Achieved the best Recall score of 1.0 with Logistic Regression

Apple App Store Data Analysis

Goal: To find the top 300 apps with the best rank growth over the past 365 days

Cleaned data and performed EDA and Feature Engineering in Python

Performed Statistical Analysis and Data Visualization to communicate the results

Provided actionable insights and KPIs to make key business decisions
