Jahanavi Reddy Thallapally
Data Analyst
Colorado, USA 231-***-**** ******************@*****.*** LinkedIn
SUMMARY
• Data Analyst with 3+ years of experience in healthcare and marketing analytics, skilled in extracting, transforming, and analyzing large-scale datasets using SQL, Python, and AWS to support data-driven decision-making and strategic planning across enterprise environments.
• Defined the roles and privileges required to access different database objects, and handled virtual sizing in Snowflake for different workloads.
• Adept at building dynamic dashboards in Tableau and Power BI, implementing predictive models, and performing advanced statistical analysis to uncover actionable insights, streamline reporting workflows, and enhance cross-functional collaboration in Agile settings.
SKILLS
Methodologies: SDLC, Agile, Waterfall, Kanban, Lean Six Sigma
Languages: Python, SQL, R, SAS
IDEs: Visual Studio Code, PyCharm, Jupyter Notebook
Packages: NumPy, Pandas, Matplotlib, SciPy, ggplot2, TensorFlow, Seaborn, Scikit-learn
Visualization Tools: Tableau, Power BI, Advanced Excel (Pivot Tables, VLOOKUP)
Database: MySQL, SQL Server, PostgreSQL, Oracle
Other Skills: Amazon Web Services (AWS), Azure, Informatica Power Center, Machine Learning Algorithms, Deep Learning, NLP, Big Data Technologies, Spark, Probability Distributions, Predictive Modelling, Hypothesis Testing, Regression Analysis, Linear Algebra, Advanced Analytics, SSIS, SSRS, SSMS, Data Mining, Data Warehousing, Data Transformation, Clustering, Classification, Regression, A/B Testing, Forecasting & Modelling, Data Cleaning, Data Wrangling, Jira, Confluence, GitHub, Bitbucket
Operating System: Windows, Linux, Mac OS
EXPERIENCE
Data Analyst Optum, Colorado, USA Jan 2024 – Present
• Extracted and joined over 20 million records from pharmacy claims, medical claims, and EHR systems using SQL Server and AWS Redshift, enabling reliable patient linkage through deterministic matching with a 98% accuracy rate.
• Standardized multi-source healthcare datasets using Python (pandas, NumPy) to address missing values, normalize formats, and prepare data for longitudinal analysis, improving dataset reliability and reducing downstream processing errors by 40%.
• Constructed time-series datasets to track patient therapy adherence across multi-year periods, supporting chronic disease treatment analysis and enabling more accurate forecasting of refill behavior and discontinuation trends.
• Applied clustering techniques in Python (scikit-learn) to segment patient populations by payer type, cost utilization, and therapy patterns, enhancing targeting strategies and improving campaign efficiency by 22%.
• Extracted structured insights from 500K+ unstructured clinical notes using spaCy and AWS Comprehend Medical, enriching EHR datasets with additional indicators like diagnoses, medications, and lab values, increasing analytic completeness by 18%.
• Designed ZIP-level adherence forecasting models using Python and visualized geographic patterns in Power BI, supporting regional strategy teams with actionable insights that improved outreach efficiency by 20%.
• Developed interactive Tableau dashboards to track key metrics such as total prescriptions (TRx), new prescriptions (NRx), and therapy persistency, reducing reporting cycle time by 40% and improving visibility for brand performance teams.
• Conducted data validation using SQL to identify discrepancies, ensure accuracy, and maintain HIPAA compliance across all datasets, resulting in a 35% reduction in reporting errors and improved audit readiness.
Data Analyst Cybage Software, India Jan 2021 – Dec 2022
• Extracted and merged large-scale datasets (12M+ records) from sources such as Google Ads, Nielsen TV ratings, CRM exports, and internal sales databases using SQL (PostgreSQL), creating a unified dataset for MMM analysis.
• Used Python (pandas, NumPy) within Jupyter to standardize campaign-level data, handle missing values, correct date misalignments, and normalize spend variables, improving data consistency and reliability by 98%.
• Conducted correlation analysis, time-lag detection, and channel saturation studies to identify campaign effectiveness and optimize media strategy, revealing that 18% of total spend delivered less than 5% return.
• Developed a custom attribution schema based on flight dates, product lines, and region codes to align multi-channel media spend with sales performance, improving traceability and attribution accuracy by 35%.
• Built automated Excel-based KPI trackers using pivot tables and slicers for weekly reporting, reducing manual workload by 30% and supporting timely performance reviews across account teams.
• Built REST APIs with Spring Boot and integrated them with cloud platforms such as Azure.
• Designed Tableau dashboards to visualize channel-wise ROI, marginal return curves, and spend optimization scenarios, enabling leadership to reallocate media budgets and achieve a 23% increase in ROI.
• Collaborated with marketing, analytics, and media teams across APAC and EMEA to validate campaign metadata, refine KPI logic, and ensure reporting consistency across 150+ campaigns.
• Authored process documentation covering data workflows, variable definitions, validation steps, and dashboard logic, ensuring audit compliance and scalability for future MMM reporting cycles.
EDUCATION
Master of Science in Information Systems and Technologies – University of North Texas, Denton, Texas, United States
Bachelor of Technology in Computer Science and Engineering – Jawaharlal Nehru Technological University, Hyderabad, Telangana, India