Summary
Krupakar Pulkam
Data Scientist
OK 405-***-**** ********.******@**********.***
Around 4 years of experience as a Data Scientist with an understanding of Data Modeling, Evaluating Data Sources, Data Warehouse/Data Mart Design, and Client/Server applications.
Good knowledge of the Software Development Life Cycle (SDLC), Agile, and Waterfall methodologies.
Experienced in a wide range of machine learning algorithms, including Linear and Logistic Regression, Clustering (K-Means, Hierarchical, DBSCAN), Decision Trees, Neural Networks, Random Forest, Naive Bayes, SVM, Gradient Boosting, NLP, and Deep Learning using frameworks like Keras and PyTorch.
Working knowledge of Python and R libraries such as NumPy, Pandas, Matplotlib, SciPy, and ggplot2.
Proficient in designing visualizations in Tableau and Power BI, and in publishing and presenting dashboards and storylines on web and desktop platforms.
Knowledge of various Relational Database Management Systems (RDBMS) such as MySQL and SQL Server.
Experienced in descriptive and inferential analysis, A/B and hypothesis testing, distribution analysis, experimental design, and statistical modeling to support data-driven decisions.
Experience
HCA Healthcare, USA June 2023 - Current Data Scientist
Participate in requirements gathering, analysis, design, development, testing, and production of an application using the Agile model.
Apply unsupervised clustering techniques like DBSCAN, K-Means, and hierarchical clustering to group data points with similar characteristics.
Use packages like ggplot2 in RStudio for data visualization, generating scatter plots and high-low graphs to identify relationships between variables.
Optimize linear regression algorithms for real-time application, resulting in a 20% reduction in processing time.
Develop a manager dashboard of call center metrics and KPIs in Tableau to analyze team performance.
Lead a team of sales associates, providing guidance, setting performance goals, and conducting regular performance evaluations.
Develop KNN-based recommendation systems, increasing user engagement by 20% in personalized content delivery platforms.
Use historical data and trend analysis within Salesforce to support sales forecasting and pipeline management.
Generate reports and dashboards using SSRS (SQL Server Reporting Services), providing stakeholders with valuable data insights.
Conduct descriptive and inferential analyses, perform A/B and hypothesis testing, analyze data distributions, and design experiments while creating statistical models to facilitate data-driven decisions.
Apply data analysis skills, including data extraction from MS SQL databases, data mapping from source to target schemas, and data cleansing and preparation.
Hexaware Technologies, India Oct 2018 - Nov 2021 Data Scientist
Used the Waterfall life cycle methodology for iterative development and rapid delivery of the product.
Employed logistic regression models for customer churn prediction, reducing churn rate by 10% through targeted retention strategies.
Developed Python programs that manipulate arrays using packages such as NumPy, Pandas, and Matplotlib.
Embedded Power BI reports in the internal portal to manage role-based access to reports and data for individual users.
Employed supervised learning techniques for stock price prediction, resulting in a 12% increase in prediction accuracy.
Leveraged Data Lake Storage and GCP Cloud Storage for storing structured and unstructured data, ensuring data availability for analysis.
Generated periodic reports based on the statistical analysis of the data using SQL Server Reporting Services.
Applied Naive Bayes classifiers for email spam detection, achieving a 95% accuracy rate in identifying spam emails.
Conducted market research to identify customer preferences and adjusted product offerings accordingly.
Used Power BI to create analytical dashboards that gave business users quick insight into the data.
Skills
Methodologies:
SDLC, Agile, Waterfall
Languages:
Python, R, SQL, Spark
IDEs:
Visual Studio Code, Jupyter Notebook
Machine Learning Algorithms:
Linear Regression, Logistic Regression, Decision Trees, Supervised Learning, Unsupervised Learning, Classification, SVM, Random Forests, Naive Bayes, KNN, K-Means, CNN, Natural Language Processing (NLP)
Packages:
NumPy, Pandas, Matplotlib, SciPy, ggplot2, Seaborn, TensorFlow, Minitab
Visualization Tools:
Tableau, Power BI, Microsoft Excel
Cloud Technologies:
AWS, Azure, GCP
ETL Tools:
SSIS, SSRS, MS Office
Databases:
MySQL, SQL Server, PostgreSQL, T-SQL
Statistical Testing:
A/B testing, Hypothesis testing, Multivariate regression, Trend Analysis
Operating Systems:
Windows, Linux
Education
Master of Science in Computer Science, Oklahoma Christian University, Edmond, OK
Bachelor of Technology in Computer Science and Engineering, Jyothishmathi Institute of Technological Sciences, India