Datta Krishna Nikith Chokkakula
Data Analyst
Texas, USA 201-***-**** ********.**@*****.*** LinkedIn
SUMMARY
Data Analyst with 5+ years of experience delivering end-to-end data solutions, leveraging Python, SQL, Snowflake, and cloud platforms to
drive scalable analytics across healthcare and finance domains. Built and optimized ETL pipelines using AWS Glue, Azure Data Factory,
and dbt, improving data accuracy, governance, and operational efficiency. Applied exploratory data analysis and machine learning
techniques to uncover insights, detect fraud, and forecast trends using Pandas and Scikit-learn. Developed impactful BI dashboards in
Power BI and Tableau, enabling stakeholders to make informed, data-driven decisions and enhance business performance.
SKILLS
Methodologies: SDLC, Agile, Waterfall
Languages: Python, R, SQL
IDEs: Visual Studio Code, PyCharm, Jupyter Notebook
Packages: NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, TensorFlow, Seaborn, ggplot2
Visualization Tools: Tableau, Power BI, Advanced Excel (Pivot Tables, VLOOKUP)
Cloud Platform: Amazon Web Services (AWS), Microsoft Azure
Database: MySQL, SQL Server, PostgreSQL, MongoDB, Oracle
AI & Analytics Tools: ChatGPT, Copilot, OpenAI API, AutoML tools, Prompt Engineering, AI-assisted data cleaning
Other Technical Skills: SSIS, SSRS, Machine Learning Algorithms, ETL Tools, EDA, Ad hoc Analysis, Statistics, Snowflake, Alteryx, SAS,
MS Visio, Hypothesis Testing, Regression Analysis, Linear Algebra, Advanced Analytics, Data Mining, Data
Warehousing, Data Transformation, Data Integration, Data Interpretation, Data Pipeline, Association Rules, A/B
Testing, Forecasting & Modelling, Data Cleaning, Data Wrangling, Git, GitHub, dbt, Collibra
EXPERIENCE
Data Analyst Baylor Scott & White Health USA Aug 2024 - Present
Developed Python-based risk-scoring models, such as logistic regression and survival models, in scikit-learn to predict hospitalization and
readmission, reducing avoidable admissions by 15% and serving as a bridge between analytics and clinical teams.
Used MongoDB to store semi-structured healthcare data such as clinical notes, enabling flexible data retrieval and supporting
advanced analytics use cases, while working in an Agile environment and ensuring compliance with HIPAA regulations.
Utilized Amazon Redshift to perform high-performance querying and aggregation on large healthcare datasets, enabling 27%
faster generation of disease trend reports and improving turnaround time for clinical and operational insights.
Created interactive dashboards in Power BI to visualize disease prevalence, seasonal spikes, and regional hotspots, driving faster
clinical and policy decisions and taking on a visible stakeholder-facing role in management meetings.
Conducted statistical analysis using ANOVA (Analysis of Variance) to compare treatment effectiveness across different patient
groups, providing actionable insights for healthcare providers and supporting evidence-based medical decisions.
Implemented time-series forecasting of disease outbreaks using Prophet (Meta) over aggregated case-count data, enabling
proactive resource planning and demonstrating advanced forecasting skills.
Designed and executed A/B testing frameworks to evaluate effectiveness of treatment protocols and intervention strategies,
improving patient outcomes by 61% and enabling data-driven clinical decisions.
Acted as a liaison between analytics and clinical teams by translating complex data-driven insights into actionable clinical
recommendations, enabling improved collaboration and enhanced adoption of predictive healthcare models in treatment planning.
Data Analyst Capgemini India Apr 2019 - Aug 2023
Designed end-to-end data extraction workflows using SQL Server Integration Services (SSIS) to consolidate transactional data from
multiple banking systems, improving data availability by 20% and reducing manual reporting effort.
Performed exploratory data analysis using Pandas and leveraged NumPy for statistical computations and feature engineering to
identify unusual transaction patterns, enabling efficient handling of large datasets, improving fraud detection model performance,
and reducing financial losses through actionable data-driven insights.
Applied SAS-based regression and forecasting techniques to analyze historical fraud trends, enabling proactive risk mitigation and
supporting fraud prevention strategy planning.
Developed advanced data transformation models using dbt, integrated with MySQL, utilizing SQL constructs such as JOIN, GROUP BY,
CASE WHEN, and window functions, improving data consistency by 30% and enabling accurate downstream fraud risk scoring.
Utilized Microsoft Excel (Advanced) for ad-hoc analysis and quick data validation, supporting business users in identifying
suspicious activities and improving turnaround time for fraud investigations by 28%.
Created executive-level reporting dashboards using Tableau, enabling leadership teams to monitor fraud KPIs, track investigation
outcomes, and drive strategic initiatives for fraud prevention and operational efficiency.
Developed real-time data processing solutions using Apache Spark, enabling 33% faster detection of fraudulent transactions and
significantly reducing latency in risk scoring and alert generation systems.
Enforced secure data storage and access management using Azure Data Lake Storage (ADLS), ensuring compliance with financial
regulatory standards such as RBI guidelines, PCI-DSS, and GDPR.
EDUCATION
Master of Science in Business Analytics: East Texas A&M, Texas, USA