Data Scientist Analyst

Location:

Atlanta, GA

Posted:

December 05, 2023

Resume:

Jagadeesh Angineti Ravi

Data Analyst/Data Scientist

*****, ******* 470-***-**** *********************@*****.*** LinkedIn

PROFESSIONAL SUMMARY

Analytical and process-oriented Data Scientist with 4+ years of experience in data science, analysis, and engineering, with an in-depth understanding of database types, research methodologies, and big data capture, manipulation, analytics, curation, and visualization using Agile and Waterfall methodologies.

Expert in analyzing requirements and furnishing insights, analytics, and business intelligence to advance opportunity identification, process re-engineering, and corporate growth.

Proven expertise in collecting, interpreting, and analyzing data from multiple data streams, including Access, SQL, Tableau, Python, and Excel, to drive successful business solutions.

Excel at identifying efficiencies and problem areas within data streams and communicating requirements for smooth project execution; participated in daily scrum meetings.

Proven analytical skills in comprehending business cases and preparing functional design documents to ensure that data-driven decision-making solutions address business strategy and problems.

Expertise in using statistical software (SAS, Python) to clean, integrate, transform, reduce, and analyze large data sets.

Utilized object-oriented libraries and frameworks such as scikit-learn and TensorFlow to build and deploy machine learning models in production, optimizing the entire data science workflow.

Experienced with Python modules such as NumPy, Beautiful Soup, Matplotlib, and pandas for data preprocessing, web scraping, visualization, statistical modeling, econometric modeling, deep learning (CNNs and RNNs), and machine learning.

Experience with project management and roadmap-building tools such as JIRA and Microsoft Project.

Possess exceptional decision-making, interpersonal, analytical, problem-solving, time management, project management, organizational, and verbal and written communication skills.

KEY TECHNICAL SKILLS

Operating Systems : Windows series, Mac OS

Cloud : Azure Blob, Databricks, BigQuery, Azure Synapse, AWS S3, Redshift, EC2, Lambda

BI and Visualization : Tableau, Kibana, Power BI

Database Management Tools : SQL (Oracle, MySQL, MS SQL Server), T-SQL, Relational Database, NoSQL

Programming Languages : Python (NumPy, SciPy, Pandas, scikit-learn, Matplotlib, Plotly) and R (tidyverse, dplyr, ggplot2), Deep Learning and Machine Learning Packages, RNN, CNN, NLP, STATA, Scala, Teradata, Neural Networks, Natural Language Processing

Other Tools and Technologies : Advanced MS Excel (Pivot Tables, Vlookup, Xlookup), MS Office, Google Spreadsheets, PowerPoint, PySpark, Adobe Analytics, SSRS, SPSS, SSIS

WORK EXPERIENCE

Data Scientist – Newell Brands, Georgia July 2023 – Present

Automated the USA gross margin pivot using Python, R, Scala, and SQL code, and enhanced efficiency by freezing all previous years' finance datasets and connecting to real-time Hyperion and SAP Sales data totaling 6.7+ million records.

Applied quantitative machine learning algorithms to develop predictive models for customer segmentation and attained a 10% increase in accuracy by leveraging the Python libraries OpenCV, scikit-learn, PyTorch, and TensorFlow.

Built a predictive solution to detect at-risk orders within the supply chain and identify key drivers to minimize risk and penalties from customers (~$2MM - $3.8MM/year).

Created interactive dashboards and visualizations, allowing stakeholders to gain insights using real-time SAP data.

Generated interactive visual representations and reports to effectively communicate analytical discoveries to audiences with varying technical expertise.

Worked with the NLTK library and a Word2Vec model to build word embeddings for better performance and accuracy.

Designed CI/CD pipelines using tools like Jenkins and GitHub CI to automate model training and deployment.

Designed sales performance dashboards in Power BI for KPIs and ROI metrics to automate parts ordering from potential vendors and provide insights on Newell Brands product manufacturing data; conducted analysis of market situations and provided strategic recommendations.

Utilized SAP S/4HANA to perform advanced statistics, data analytics, and predictive modeling for product sales, enabling data-driven insights that led to more effective pricing strategies and enhanced customer satisfaction.

Created extensive data pipelines in the cloud using Azure Data Factory.

Used Azure Data Factory, a managed cloud data integration service, to compose and orchestrate Azure data services.

Utilized Azure Databricks for data transformation, analysis, and machine learning tasks, enabling data-driven decision-making in this retail project.

Applied time-series anomaly detection using PyCaret, identifying irregular patterns and outliers, which contributed to a proactive approach in addressing supply chain challenges.

Designed and implemented advanced econometric models such as VAR and ARIMA to analyze and forecast retail sales trends, providing valuable insights for optimizing inventory management and pricing strategies.

Utilized CRM data to extract actionable insights, such as customer behavior patterns, purchase history, and demographic information, enabling data-driven decision-making for marketing and sales teams.

Participated in Data Acquisition to extract historical and real-time data by using Hadoop MapReduce and HDFS.

Data Analyst – The Kroger Co, Cincinnati, OH (Remote) Jan 2022 – June 2023

Worked with cross-functional teams to gather requirements and gain insight into client-specific business challenges, and conducted data analysis using suitable statistical methods.

Integrated Git and SVN to ensure seamless collaboration and code management.

Developed and implemented predictive models using Natural Language Processing Techniques and machine learning algorithms such as linear regression, classification, multivariate regression, Naive Bayes, Random Forests, cluster analysis, SVM, KNN, PCA and regularization for data analysis.

Created financial models and conducted cost-benefit evaluations, leading to data-supported suggestions that boosted ROI by 25%.

Designed and developed ETL pipelines in AWS to migrate campaign data from external sources such as S3 and ORC/Parquet/text files into AWS Redshift. Designed CI/CD for a serverless application using AWS Lambda.

Set up storage and data analysis tools on Amazon Web Services cloud computing instances (AWS EC2) and set up storage buckets in the AWS S3 service.

Used the AWS EC2 console to launch a cloud instance, chose an Amazon Machine Image (AMI) for the virtual machine, and configured the instance.

Coordinated as an ETL developer with an understanding of data warehouse concepts, including star and snowflake schemas.

Documented reports and dashboards, including data sources, key metrics, and techniques used.

Reduced query runtime to less than 7 minutes through code review and high-performance SQL queries on large datasets; defined database performance tuning steps and formulated ad-hoc reports.

Created ad-hoc dashboards effectively using data blending, joins, actions, filters, calculations, sets, parameters, graphs, charts and maps.

Evaluated model performance using metrics such as accuracy, precision, recall, F1-score, and ROC AUC.

Data Scientist – Care Health Insurance, India May 2019 – Aug 2021

Pre-processed raw data using ETL tools and Python pandas; performed data acquisition and data cleaning, including treatment of missing data, redundant values, and inconsistent information, and outlier removal.

Utilized MS SQL, data warehousing programs, Power BI, and other dashboard and visualization libraries for data intelligence and analysis, and defined useful metrics for performance measurement.

Used STATA for advanced statistical modeling, analytical and ML techniques, and pipelines, including multivariate analysis and clustering, to uncover hidden patterns, optimize pricing strategies, enhance recommendation systems, and improve customer experience, resulting in a significant boost in sales, pricing, and customer satisfaction.

Actively involved in SQL and Azure SQL DW code development using T-SQL.

Demonstrated proficiency in hypothesis testing, utilizing a variety of statistical techniques such as t-tests, chi-square tests, ANOVA, and regression analysis to draw meaningful conclusions from data.

Applied regularization techniques such as L1 and L2 regularization, along with early stopping in boosting algorithms, to mitigate overfitting and ensure model generalization.

Successfully fine-tuned XGBoost models to achieve improved performance by adjusting hyperparameters, addressing overfitting, and enhancing model generalization.

Chose relevant AI and machine learning models and performed hyperparameter tuning to find optimal values for supervised and unsupervised algorithms, including logistic regression, decision trees, SVM, and random forest.

Analyzed results and issues of different models from A/B tests, generated assumptions, and conducted t-tests to validate those assumptions, obtaining high precision.

EDUCATION

Master of Science in Data Analytics CGPA: 3.7/4

Clark University, Massachusetts Aug 2021 – May 2023

B.E. Computer Science CGPA: 3.6/4

Anna University, India Jun 2016 – Apr 2020

CERTIFICATIONS

TensorFlow Developer Udemy, 2023

Tableau Desktop Specialist Tableau, 2022

Oracle SQL Developer Oracle, 2022
