Sri Phani Bhushan Mada
Data Analyst
+1-980-***-**** | ***********@*****.*** | LinkedIn | GitHub Portfolio
Professional Summary
• Data Analyst with 3+ years of experience in extracting, analyzing, and interpreting complex datasets to drive business decisions. Specialized in predictive modeling (Random Forest, XGBoost, K-means clustering), statistical analysis (hypothesis testing, regression, time series forecasting), and data-driven strategy development across finance, HR, and supply chain domains.
• Optimized complex SQL queries (CTEs, window functions, query tuning) to reduce report generation time for financial clients. Built ETL pipelines (PySpark, Alteryx, AWS Redshift) to automate data ingestion, improving accuracy while cutting processing costs in cloud migrations.
• Developed customer churn models (Scikit-learn, SHAP analysis) evaluated by AUC, enabling targeted retention campaigns that reduced attrition. Applied time series forecasting (ARIMA, Prophet) to predict demand, cutting annual overstock costs.
• Designed interactive dashboards (Tableau, Power BI, Streamlit) to track KPIs like customer lifetime value (CLTV), inventory turnover, and HR attrition, reducing manual reporting efforts. Created real-time logistics dashboards that improved on-time deliveries.
• Migrated enterprise data from on-premise SQL Server to AWS Redshift, optimizing schemas and CDC pipelines for scalable analytics. Deployed RAG-based GPT models (AWS SageMaker) for HR automation, reducing Tier-0 support queries.
• Partnered with finance, marketing, and operations teams to define KPIs, automate reports (Power BI + DAX), and implement A/B testing frameworks (SciPy, Bayesian methods) that boosted conversion rates.
Skills
Programming Languages: Python (Pandas, NumPy, Scikit-learn), R, SQL (Advanced), JavaScript, C.
Database & ETL Tools: MySQL, PostgreSQL, Alteryx, AWS Redshift, PySpark, Apache Airflow (basic).
Data Visualization: Tableau, Power BI, Matplotlib, Seaborn, ggplot, Streamlit, Plotly.
Machine Learning & AI: Predictive Modeling, Feature Engineering, Regression/Classification, NLP, Retrieval-Augmented Generation (RAG), Time Series (ARIMA, Exponential Smoothing), Clustering (K-means).
Cloud & Big Data: AWS (S3, SageMaker, Redshift), PySpark, MapReduce.
Advanced Analytics: A/B Testing, Hypothesis Testing, Statistical Modeling, Customer Segmentation, Cohort Analysis.
Productivity Tools: Advanced Excel (Pivot Tables, VLOOKUP, Power Query), Git, JIRA.
Soft Skills: Cross-functional Collaboration, Data Storytelling, Agile Methodologies.
Operating Systems: Windows, Linux.
Education
University of North Carolina at Charlotte GPA: 4.00
Master of Science in Data Science & Business Analytics Charlotte, North Carolina
SRM University GPA: 3.78
Bachelor of Engineering in Computer Science Amaravati, Andhra Pradesh
Work Experience
CGI Inc. June 2024 – Present
Data Analyst United States
• Designed and deployed Python (Pandas, PySpark) and Alteryx workflows to ingest and clean client transaction data, reducing manual processing time by 40% and improving data accuracy for quarterly financial reports.
• Built interactive Tableau dashboards with drill-down capabilities to visualize customer churn, revenue leakage, and operational KPIs, enabling leadership to identify cost-saving opportunities within 3 months.
• Rewrote complex SQL queries (joins, CTEs, window functions) for a client’s legacy database, reducing report generation time and enabling real-time decision-making.
• Developed a Random Forest model (Python, Scikit-learn) to segment high-value customers based on transaction history and demographics, leading to an increase in targeted campaign conversions.
• Led migration of 50+ TB of on-premise SQL Server data to AWS Redshift, orchestrating schema redesign and CDC (Change Data Capture) pipelines, ensuring data integrity and reducing storage costs by 30%.
• Partnered with finance and marketing teams to define KPIs and automate monthly performance reports using Power BI + DAX, eliminating 20+ hours of manual Excel work.
Polaris May 2021 – July 2023
Data Analyst India
• Created a real-time Power BI dashboard tracking inventory turnover, supplier lead times, and delivery delays, enabling the logistics team to reduce stockouts and improve on-time deliveries.
• Developed Excel VBA macros and Power Query scripts to automate weekly sales and inventory reports, saving 15+ hours/month and reducing human errors by 35%.
• Designed and analyzed A/B tests (Python, SciPy) for dynamic pricing models, identifying optimal price points that boosted conversion rates without sacrificing margin.
• Built a Prophet time-series model to predict regional product demand, reducing annual overstock costs and improving warehouse allocation efficiency.
• Streamlined IoT sensor data from manufacturing equipment into Azure Synapse Analytics, enabling predictive maintenance alerts that reduced unplanned downtime.
• Conducted deep-dive analyses (e.g., root cause of shipping delays) using SQL + Python, presenting findings to executives in quarterly business reviews to drive process improvements.
Projects
Customer Retention Using Predictive CLTV Modeling | Python, Tableau, K-means Clustering, PCA
• Developed a customer lifetime value (CLTV) model for a telecom dataset (1,409 customers) using RFM analysis and K-means clustering to segment high-risk churn customers.
• Applied Principal Component Analysis (PCA) to reduce dimensionality and improve model efficiency ahead of training a Random Forest classifier.
• Identified 336 at-risk customers (including 135 high-value accounts), enabling targeted retention campaigns that reduced churn by 15% in a simulated business case.
• Created an interactive Tableau dashboard to visualize customer segments, CLTV trends, and churn risk factors for stakeholder presentations.
Cricket Performance Analytics Dashboard | Python (Pandas, Matplotlib), Tableau, Streamlit
• Built a comparative analytics dashboard to evaluate batting performance of cricket legends Virat Kohli vs. Sachin Tendulkar across 500+ matches.
• Scraped and cleaned data from ESPN Cricinfo, engineered features like rolling averages and strike rate trends, and performed statistical tests to identify significant differences.
• Designed dynamic visualizations (runs distribution by opponent, strike rate by innings phase) in Tableau, revealing Kohli’s dominance in run chases and Tendulkar’s consistency against top teams.
• Deployed an interactive Streamlit app allowing users to filter by match type, opponent, and era, enhancing engagement for cricket analytics communities.
HR Analytics: Employee Attrition Prediction | Python, Power BI, Logistic Regression
• Analyzed IBM HR Analytics dataset to predict attrition risk using feature engineering (scaled tenure, salary bins, job role impact).
• Trained a logistic regression model (AUC: 0.92) and an XGBoost classifier to identify key attrition drivers (e.g., low satisfaction, overtime), using SHAP values for interpretability.
• Developed a Power BI dashboard to track attrition risk scores, department-wise trends, and mitigation recommendations, reducing hypothetical turnover by 20% in scenario testing.
Achievements and Publications
• IEEE SPICES 2024 – “Optimizing Recommendation Systems: Analyzing the Impact of Imputation Techniques on Individual and Group Recommendation Systems” Published in IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems. DOI: 10.1109/SPICES62143.2024.10779628
• IEEE ICRAIE 2022 – “A Comparison of Various Class Balancing and Dimensionality Reduction Techniques on Customer Churn Prediction” Published in the 7th IEEE Conference on Recent Advances and Innovations in Engineering (ICRAIE). DOI: 10.1109/ICRAIE56454.2022.10054321
• Runner-Up, 23XI Sports Business Analytics Challenge — Presented customer segmentation strategy to NASCAR leadership