Data Analyst Analytics

Location:

Austin, TX

Salary:

65000

Posted:

March 17, 2025

Resume:

Sai Sumanth Kolanupaka

Data Analyst

Austin, TX 78717 • **************@*****.*** • 816-***-**** • www.linkedin.com/in/sai-sumanth-13b600210

SUMMARY

Data Analyst with 3+ years of experience delivering actionable insights and improving business efficiency through data analytics and engineering solutions, including 2+ years in healthcare data analytics, working with STARs, HEDIS, RAF, utilization metrics, and MLR/HBR. Proficient in Python, SQL, and .NET, with hands-on expertise in geodatabase design, ETL workflows, and GIS application development. Experienced in developing, deploying, and maintaining enterprise-scale GIS and asset management solutions. Skilled in building automated ETL pipelines, developing interactive dashboards, performing statistical testing, designing A/B experiments, and implementing machine learning models to streamline workflows. Demonstrated experience using SQL and DOMO to manage and visualize data effectively. Competent in deploying data-driven solutions to optimize resources, reduce costs, and improve decision-making. Strong at collaborating cross-functionally to analyze complex data, ensuring high-quality reporting and real-time analytics.

SKILLS

● Programming & Scripting: Python (Pandas, NumPy, Matplotlib), SQL, Django, R

● GIS Development: ArcGIS Pro, ArcGIS Server, ArcGIS Portal, ArcGIS Online

● Big Data & Cloud Tools: Kafka, Spark, PySpark, Alteryx, Snowflake, Azure Data Factory (ADF), MongoDB, MS Azure, AWS

● Data Analytics: Data Cleaning, ETL Pipelines, Machine Learning Algorithms, Survey/Likert Data Analysis, Alteryx

● Data Visualization: DOMO, Power BI (DAX, Data Modeling), Tableau, R Shiny

● Databases: MySQL, MongoDB, Oracle, Snowflake

● Operations & Process Analysis: Lean Six Sigma, Kaizen

● Version Control & Automation: Git, MS Office, JIRA, Jenkins, Docker, Apache Airflow

● Data Governance: Data Quality Checks, HIPAA Compliance, Data Integrity Management

PROFESSIONAL EXPERIENCE

MCKESSON, KS

Software Engineer / Data Analyst Jul 2024 – Current

● Designed and maintained complex SQL queries to support data extraction, transformation, and loading processes, improving query efficiency and reducing processing time by 30%.

● Utilized advanced Python libraries such as Pandas, NumPy, and Matplotlib for data manipulation, analysis, and visualization, enabling actionable insights for healthcare operations.

● Conducted comprehensive data cleansing workflows, including handling missing data, outlier detection, and normalization, improving dataset accuracy and reliability by 40%.

● Developed Python-based scripts for automated data preprocessing, reducing manual effort and ensuring consistency across datasets containing over 50,000 records.

● Streamlined data validation processes using SQL triggers and stored procedures, ensuring data accuracy and compliance with regulatory standards.

● Leveraged PySpark to process large-scale clinical datasets, optimizing computation times and enabling real-time analytics for critical business decisions.

● Built reusable Python modules for data cleansing tasks, including imputation and scaling techniques, enhancing team productivity and consistency.

● Integrated SQL-based analytics with Python workflows for seamless data pipeline execution and reporting, improving end-to-end processing efficiency by 25%.

● Applied statistical analysis and machine learning techniques using SciPy and Statsmodels to uncover key trends, aiding in decision-making for patient care and resource planning.

● Enhanced database indexing and optimization strategies in SQL, reducing query execution times for high-volume datasets by 50%.

● Created Python scripts for data reconciliation tasks, ensuring alignment between production and backup databases and reducing errors by 20%.

● Visualized cleansed and processed data through interactive Power BI dashboards, improving stakeholder understanding and supporting data-driven decisions.

UBER, MO

Software Developer / Data Analyst Intern Jan 2024 – May 2024

● Assisted in developing and optimizing ETL workflows using PySpark and Snowflake, processing 100,000+ records to support ride and vehicle analytics dashboards.

● Built Django-based applications for backend data ingestion and real-time processing of ride metrics, improving analysis efficiency by 25%.

● Collaborated with the team to implement Apache Kafka for streaming real-time ride and vehicle data into MongoDB, ensuring seamless and fault-tolerant data pipelines.

● Performed data preprocessing with Python libraries (Pandas, NumPy), applying imputation, scaling, and feature engineering to enhance data accuracy by 90%.

● Supported the integration of Django with big data tools like PySpark and Snowflake to extract insights on fleet optimization, contributing to 12% cost savings.

● Assisted in demand forecasting using Python-based time-series analysis, improving peak-hour fleet availability by 15%.

● Participated in creating 3 Power BI dashboards, sourcing data from Snowflake and PySpark pipelines to provide actionable insights for decision-making.

● Contributed to building real-time analytics pipelines using Kafka and PySpark, reducing latency and manual data updates by 40%.

ADANI, INDIA

Data Analyst Feb 2021 – Jun 2022

● Conducted data mining and analysis on large-scale energy datasets using SQL, improving data accuracy and optimizing resource allocation by 15%.

● Developed and optimized SQL queries for data extraction, validation, and transformation, ensuring adherence to data governance standards and reducing redundancies by 20%.

● Built and automated data pipelines to streamline reporting processes, improving data availability and operational efficiency.

● Performed anomaly detection using SQL-based analysis, identifying equipment inefficiencies and improving operational uptime by 20%.

● Designed and implemented data quality checks to ensure clean and reliable datasets for energy performance reporting.

● Leveraged SQL insights to support the development of energy demand forecasting models, improving accuracy and enabling data-driven decision-making.

● Automated the generation of performance dashboards and reports, reducing manual effort and improving real-time monitoring capabilities.

CIPLA, INDIA

Data Analyst Intern Aug 2020 – Jan 2021

● Cleaned and analyzed pharmaceutical sales data using Python (Pandas, NumPy), improving data accuracy and enabling 10% more reliable demand forecasting.

● Developed automated ETL workflows using Azure Data Factory, streamlining data integration processes and reducing manual intervention by 25%.

● Applied data mining and cleansing techniques to preprocess large datasets, ensuring high-quality data for advanced analytics and reporting.

● Created 2 interactive Power BI dashboards to track inventory trends and supply chain metrics, enhancing planning efficiency for stakeholders.

● Designed Python-based analytical reports, reducing turnaround time for supply chain analysis by 4 hours/week.

● Conceptualized a MongoDB-based approach to data storage and processing, enabling scalable solutions for clinical data management.

● Performed time-series analysis to identify demand trends, optimizing inventory management with 8% fewer stockouts.

EDUCATION

UNIVERSITY OF CENTRAL MISSOURI Missouri, USA

Master of Science in Computer Science 2022 - 2024

JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY India

Bachelor of Technology in Electronics and Communication Engineering 2017 - 2021

CERTIFICATIONS

● Internship on Internet of Things (IoT) / Adobe Red Hat

● Certification in Python Programming / Michigan State University, Coursera

● Certification in SQL Programming / W3Schools

● Certification in IEEE; also hosted the IEEE conference (2018-2019)

● Certification in NASA Space Apps Challenge 2018
