Mounika Poreddi
Data Analyst
Location: MD | Phone: 443-***-**** | Email: ******************@*****.*** | Portfolio: linkedin.com/in/mounika-poreddi
SUMMARY
● Data Analyst with around 3 years of experience in data wrangling, exploratory analysis, and model development, using Python (Pandas, NumPy, Seaborn) and SQL (MySQL, PostgreSQL) to derive actionable insights from structured and unstructured datasets.
● Proficient in building ML models using algorithms such as Linear & Logistic Regression, Decision Trees, Random Forest, SVM, KNN, and K-Means, applied to business use cases like churn prediction, segmentation, and anomaly detection.
● Hands-on experience with ETL development, leveraging SQL and Python scripting to perform data extraction, transformation, and loading, improving pipeline performance and ensuring high data quality.
● Experienced in cloud-based analytics using platforms like AWS (S3, Redshift, SageMaker) and Azure (Synapse, Blob Storage), with exposure to GCP’s BigQuery, enabling scalable and distributed data processing.
● Strong grasp of Snowflake for cloud data warehousing, performing data modeling, schema design, and performance tuning for large-scale analytics workloads.
● Developed interactive dashboards using Power BI, Tableau, and Matplotlib, translating complex metrics and KPIs into visual stories for stakeholders and executive reporting.
● Familiar with Big Data tools like Hadoop and MongoDB, and experienced in NoSQL querying, supporting hybrid data architectures for both structured and semi-structured data.
PROFESSIONAL EXPERIENCE
Blue Cross Blue Shield, MD Feb 2024 – Present
Data Science Co-op / Data Analyst
● Automated data extraction, transformation, and loading (ETL) processes using SQL, Python (Pandas, NumPy), and R, improving data accuracy and reducing processing time by 35%.
● Conducted detailed analysis of claims and treatment datasets to identify trends, anomalies, and performance metrics supporting operational improvements.
● Developed and maintained interactive Tableau dashboards and R-based visualizations to report on KPIs, risk segments, and predictive model results for both operational and clinical stakeholders.
● Performed root cause analysis on claim denials and operational bottlenecks, providing actionable recommendations to reduce errors and improve process efficiency.
● Utilized statistical methods such as regression and hypothesis testing to validate data patterns and support business decisions.
● Created data models and performed variance and trend analysis to forecast claim denials and support decision-making in claims management.
● Applied text analytics on clinical notes to extract actionable insights, enhancing reporting capabilities and reducing manual review efforts.
● Designed and executed data quality audits across multiple systems and data sources, ensuring integrity, accuracy, and compliance with internal policies and CMS regulations, and improving the reliability of analytics outputs.
● Collaborated with cross-functional teams to translate complex data findings into clear, actionable reports and presentations for clinical and actuarial units.
● Developed automated reporting workflows using Excel VBA and Python scripts, streamlining weekly and monthly business reviews.
● Ensured adherence to data governance standards, audit readiness documentation, and the secure handling of sensitive patient data.
● Supported infrastructure on AWS, including pipeline monitoring and secure data access controls aligned with HIPAA compliance standards.
Infosys (Client: Walmart – Merchandise AI Innovation), India Sep 2021 – Jan 2023
Digital Specialist Engineer / Machine Learning Engineer
● Engineered end-to-end ML pipelines for inventory forecasting and sales prediction using Python (Scikit-learn, TensorFlow, PyTorch) and SQL (Azure SQL, PostgreSQL), achieving a 16% reduction in stockouts and a 12% decrease in overstocking costs.
● Optimized and deployed BERT-based NLP models using Hugging Face Transformers and spaCy to enrich product metadata and improve product tagging accuracy, increasing catalog search precision.
● Developed clustering models using K-Means, DBSCAN, and Isolation Forest for assortment optimization and anomaly detection, helping merchandising teams identify underperforming product categories.
● Built a GPT-3-based LLM assistant using the OpenAI API for auto-generating product descriptions, reducing manual content effort by 25% and streamlining product onboarding processes.
● Utilized GANs (Generative Adversarial Networks) to create synthetic datasets addressing class imbalance issues in low-frequency purchase segments, boosting model generalization and recall performance.
● Led customer churn analysis using Random Forest, XGBoost, and Logistic Regression, supporting precision marketing efforts and improving customer retention targeting accuracy by over 20%.
● Applied Apriori and FP-Growth algorithms on 100K+ transaction logs using MLxtend and Spark MLlib for market basket analysis, driving strategic product bundling and optimizing retail shelf allocation.
● Authored 50+ optimized SQL queries and stored procedures to perform large-scale data aggregation across Azure SQL, Snowflake, and PostgreSQL, improving query performance and data availability.
● Created dynamic Power BI dashboards integrating DAX measures and drill-through features to visualize sales trends, SKU performance, and cross-sell metrics for real-time merchandising decisions.
● Conducted extensive EDA and feature engineering using R, Pandas, NumPy, Seaborn, and Matplotlib, identifying key revenue drivers and user engagement patterns that shaped ML model inputs.
● Participated in Agile Scrum ceremonies, working with Product Owners and Data Architects to convert merchandising KPIs into deployable machine learning solutions, ensuring alignment with business objectives.
● Contributed to ETL process design and warehouse schema development using Azure Data Factory and dbt, integrating data from ERP, POS systems, and third-party APIs into a unified data lake architecture.
Cognizant Technology Solutions, India Jan 2021 – Aug 2021
Programmer Analyst Trainee / Data Analyst Intern
● Contributed to the development of interactive dashboards using R, Power BI, and Excel PivotTables, enabling business stakeholders to track KPIs such as customer churn, transaction volume, and fraud alerts in real time.
● Executed complex SQL queries on relational databases (Oracle, MySQL) for data extraction, transformation, and aggregation, facilitating accurate reporting on millions of financial transaction records.
● Performed data cleaning and normalization using Python (Pandas, NumPy) and R to remove duplicates, handle missing values, and standardize data formats, improving overall data quality and readiness for analysis.
● Conducted exploratory data analysis (EDA) using Seaborn and Matplotlib to visualize trends, outliers, and patterns in user transactions and risk scores, aiding in stakeholder decision-making.
● Worked closely with data engineers to validate ETL pipelines, ensuring data consistency and integrity across staging and production layers of the data warehouse.
● Automated recurring data validation tasks with Python scripts, reducing manual effort and increasing reliability in the report generation process.
● Documented key data metrics, business logic, and report specifications, contributing to the creation of a centralized knowledge base for the analytics team.
● Supported anomaly detection initiatives by conducting basic statistical analysis and assisting in feature selection for clustering models like K-Means.
● Participated in daily Agile stand-ups and sprint retrospectives, contributing to planning, backlog grooming, and delivery of analytics tasks aligned with business goals.
● Gained exposure to cloud data platforms such as AWS S3 for data storage and observed the model lifecycle in AWS SageMaker, enhancing understanding of scalable data workflows.
EDUCATION
University of Maryland, Baltimore County (UMBC), Baltimore, MD Jan 2023 – Dec 2024
Master of Professional Studies, Data Science (GPA: 3.9)
Jawaharlal Nehru Technological University, India Jul 2017 – May 2021
Bachelor of Technology in Electrical and Electronics Engineering
TECHNICAL SKILLS
Programming Languages: R, Python, SQL, SAS
Databases: MySQL, NoSQL Database Management, Snowflake
Big Data & NoSQL Databases: Hadoop, MongoDB
Data Visualization Tools: R, Tableau, Power BI
Python Libraries: NumPy, Pandas, SciPy, Seaborn
Machine Learning: Scikit-learn, TensorFlow
Machine Learning Algorithms: Linear Regression, Logistic Regression, Decision Trees, SVM, KNN, K-Means
Data Science Tools: JupyterLab, PyCharm, Amazon SageMaker, Apache Spark
Cloud Technologies: AWS, Microsoft Azure, GCP
Containerization and Orchestration: Docker, Kubernetes
Version Control Systems: Git
Natural Language Processing (NLP): NLTK, spaCy
Statistical Analysis: Advanced Statistical Techniques, Time-Series Forecasting Methods
PROJECTS
Strategic Portfolio Optimization and Forecasting
Tech Stack: Python, ARIMA, CAPM, Matplotlib
● Designed a portfolio optimization framework with CAPM and Yahoo Finance data, and forecasted prices with ARIMA models.
● Built diversification matrices via correlation analysis and stationarity tests (ADF, ACF/PACF), and visualized performance benchmarks against the S&P 500.
● Created dynamic visualizations using Matplotlib and Seaborn.
Road Crash Analysis and Severity Prediction
Tech Stack: Python, Machine Learning
● Built predictive models (Linear Regression, Random Forest, Decision Tree) for accident severity, using Baltimore County crash data.
● Visualized crash trends and presented actionable road safety recommendations.
Taxi Availability Prediction
Tech Stack: Python, Machine Learning
● Built predictive models using historical ride data and identified ride demand patterns based on time and weather.
● Developed insights to optimize taxi fleet distribution and peak-hour management.
CERTIFICATIONS
● Databricks Lakehouse Fundamentals (Databricks) — Issued Apr 2025. Skills: Data Warehousing · Delta Lake · SQL
● Data Engineering Essentials (Coursera) — Issued Jan 2025. Skills: Data Engineering · ETL Pipelines · Cloud Storage
● Introduction to Data Engineering (IBM) — Issued Jan 2025. Skills: Big Data Fundamentals · Data Warehousing
● Supervised Machine Learning: Regression and Classification (DeepLearning.AI) — Issued Dec 2024. Skills: Machine Learning · Regression Analysis · Classification Models
● Natural Language Processing for Speech and Text (LinkedIn Learning) — Issued Jan 2025. Skills: Natural Language Processing (NLP)