Cherishma S
************@*****.*** +1-940-***-****
Professional Summary
• Senior Data Analyst with 5+ years of hands-on experience on real-time data analytics projects across the banking and healthcare domains, delivering actionable, KPI-driven insights that improved business decision-making and operational efficiency.
• Proficient in Python, SQL, R, and Java for data extraction, transformation, and automation, working with both structured and semi-structured data (CSV, JSON, XML, Parquet, Excel) from multiple sources to ensure accuracy, consistency, and completeness of analysis.
• Designed, developed, and deployed 30+ interactive dashboards and KPI scorecards using Power BI and Tableau, enabling leadership teams to track key metrics such as revenue growth, risk exposure, patient outcomes, and operational efficiency, improving decision accuracy by 45%.
• Built and optimized ETL/ELT pipelines and dbt data models for data integration across RDBMS (SQL Server, PostgreSQL, Oracle) and NoSQL systems (MongoDB, Cosmos DB), reducing data latency and improving KPI update frequency.
• Developed and deployed Machine Learning and Generative AI (GenAI) models using Scikit-learn, TensorFlow, and LangChain to automate KPI forecasting, anomaly detection, and trend prediction, achieving 30% higher predictive accuracy across financial and healthcare datasets.
• Contributed to banking analytics projects including fraud detection, credit scoring, and customer segmentation, and healthcare analytics projects such as claims analysis, patient recovery forecasting, and cost optimization, improving overall analytical efficiency by 40%.
• Managed cloud-based analytics environments using Azure (Data Factory, Synapse), AWS (Redshift, S3, Glue), BigQuery, and Snowflake, delivering scalable, secure, and high-performance reporting solutions for real-time KPI tracking.
• Enforced data governance and data quality standards, performing data profiling, lineage tracking, and validation checks to reduce inconsistencies and ensure compliance with HIPAA, GDPR, and Basel III standards.
• Experienced in containerization and orchestration using Docker and Kubernetes, enabling efficient deployment and scaling of data pipelines, ML models, and BI services across hybrid cloud environments.
• Implemented CI/CD pipelines using Git, GitHub, Jenkins, and Azure DevOps for automated deployment of ETL pipelines, dbt transformations, and BI dashboards, reducing deployment time by 40% and improving workflow consistency.
• Collaborated effectively with cross-functional teams including business analysts, data engineers, and compliance officers in Agile environments using Jira and Confluence, ensuring alignment between technical deliverables and business goals.
• Experienced in working across multiple operating systems, including Windows and Linux, for data processing, scripting, and deployment tasks in both on-premises and cloud-based environments.
• Partnered with stakeholders to define KPI hierarchies, governance frameworks, and data strategy roadmaps, helping organizations establish robust, data-driven decision-making cultures and improve performance visibility.
Tools and Technologies
• Programming & Data Processing: Python (Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn), R, SQL (MySQL, PostgreSQL), Java, Shell Scripting
• Data Visualization & BI Tools: Power BI (DAX, Power Query, Power BI Service), Tableau, Excel (Pivot Tables, VBA, Power Query), Google Data Studio, KPI Dashboards
• ETL, Data Integration & Automation: dbt (Data Build Tool), SSIS, Alteryx, Informatica, Apache Airflow, Azure Data Factory, AWS Glue, API Integration, Data Pipelines (Batch & Streaming), Excel VBA Automation
• Databases & Data Warehousing: SQL Server, PostgreSQL, Oracle, MongoDB, Snowflake, Redshift, Azure Synapse, GCP BigQuery, Azure Data Lake, Data Modeling (Star & Snowflake Schema).
• Machine Learning & AI: Regression, Classification, Time Series Forecasting, NLP, Generative AI (LLMs, LangChain).
• Cloud, DevOps & Big Data Technologies: Azure (Data Factory, Databricks), AWS (S3, Redshift, Glue), GCP (BigQuery), Snowflake, Apache Spark, Hadoop, Kafka, Git, GitHub, Jenkins, Azure DevOps, CI/CD Pipelines, Docker, Kubernetes
• Data Governance & Quality Management: Data Profiling, Cleansing, Validation, Data Lineage, Metadata Management, Data Cataloging, Data Compliance (HIPAA, GDPR, Basel III).
• Operating Systems & Collaboration Tools: Windows, Linux; Jira, Confluence, Notion, Power Automate, ServiceNow, Agile Methodologies, Cross-functional Team Collaboration
Work Experience
CVS Health Senior Data Analyst Irving, TX, USA Aug 2022 – Present
• Developed and automated SQL and Python pipelines integrating patient, clinical, and claims data across multiple healthcare systems, reducing manual processing time by 40% while improving reporting accuracy and timeliness for regulatory and clinical analytics.
• Designed and maintained Power BI and Tableau dashboards visualizing key patient KPIs such as readmission rates, discharge efficiency, treatment outcomes, and claim settlement times, improving executive decision-making speed.
• Built and optimized ETL workflows using dbt, Python, and SQL to automate data ingestion and transformation from diverse structured and semi-structured sources, enhancing data quality and consistency.
• Implemented predictive analytics models in Scikit-learn and TensorFlow for readmission prediction, treatment success analysis, and claim denial forecasting, increasing prediction accuracy by 30% and enabling proactive healthcare planning.
• Integrated Generative AI (GenAI) and NLP-driven automation using LangChain to summarize medical reports, discharge notes, and clinical feedback, improving documentation efficiency and reducing manual report preparation.
• Utilized Azure Data Factory, Synapse, and Data Lake for managing healthcare datasets and enabling real-time analytics across large patient populations, improving scalability and data availability for enterprise reporting.
• Ensured data governance and quality compliance by implementing validation and profiling checks aligned with HIPAA and GDPR standards, improving data integrity and audit readiness by 30%.
• Standardized reporting templates and created KPI documentation for analytics teams, improving process consistency and onboarding time for new analysts by 25%.
• Implemented CI/CD pipelines using Git, GitHub, Jenkins, and Azure DevOps to automate dashboard and data model deployments, reducing deployment time by 40% while maintaining version control and release accuracy.
• Partnered with cross-functional teams including physicians, data engineers, and operations analysts to define data requirements, refine KPIs, and ensure alignment of analytics deliverables with clinical and operational objectives.
PNC Financial Services Data Analyst Remote, USA Jan 2020 – Jul 2022
• Spearheaded the development and automation of data pipelines using SQL and Python to integrate transaction, credit, and customer data from multiple banking systems, reducing manual reporting effort by 45% and increasing the speed of regulatory reporting cycles.
• Played a key role in building real-time risk monitoring dashboards in Power BI and Tableau, delivering interactive KPI visualizations for loan performance, default probabilities, delinquency rates, and portfolio exposure, improving executive decision accuracy by 50%.
• Designed and optimized ETL pipelines using dbt, Python, and SQL to support end-to-end data integration, ensuring high-quality structured and semi-structured data ingestion for enterprise-wide reporting.
• Developed predictive analytics models leveraging Scikit-learn, linear regression, and statistical techniques to forecast credit defaults, fraud probability, and customer churn, enabling proactive portfolio management and reducing credit risk.
• Conducted in-depth ad-hoc analyses on borrower behavior, macroeconomic indicators, and loan loss metrics to deliver actionable insights that influenced risk strategy and capital adequacy planning.
• Streamlined reporting processes and standardized documentation templates across multiple analytics teams, improving consistency, scalability, and onboarding efficiency by 25%.
• Utilized Azure Synapse, AWS Redshift, Snowflake, and S3 for secure, scalable data storage and real-time analytics, supporting enterprise risk management and performance monitoring systems.
• Implemented CI/CD pipelines via Git, GitHub, Jenkins, and Azure DevOps, automating deployments of ETL jobs, dashboards, and model updates, improving reliability and reducing deployment time by 40%.
• Enforced data governance and quality control standards, performing data profiling, validation, and lineage tracking to ensure high accuracy and compliance across regulatory datasets.
• Collaborated with business stakeholders to establish standardized risk KPI frameworks, aligning analytical reporting with Basel III and CCAR compliance requirements, improving risk model transparency and audit readiness.
• Partnered with cross-functional teams including risk management, finance, data engineering, and compliance to define data requirements, ensure regulatory alignment, and drive consistent analytics delivery across global banking operations.
Education
Master's in Information Technology University of North Texas Denton, TX, USA GPA: 3.33