
Data Analyst Power BI

Location:
Jersey City, NJ
Posted:
September 14, 2025


Resume:

Yogendra Chandra Hasan Karri

+* (***) *** - **** **************@*****.*** Jersey City, NJ LinkedIn

PROFESSIONAL SUMMARY

I am a results-driven and detail-oriented Data Analyst with a strong foundation in data engineering, cloud platforms, and advanced analytics, bringing hands-on experience from impactful roles at Cigna and HDFC ERGO General Insurance. Skilled in building end-to-end ETL pipelines, real-time streaming solutions, and predictive models, I have worked with diverse tools such as Talend, Apache Kafka, Azure Data Factory, Snowflake, and Python to deliver data-driven insights that improve operational efficiency, detect fraud, and enhance decision-making. My expertise spans data governance, visualization with Tableau and Power BI, and applying machine learning techniques for actionable outcomes. With a proven track record of optimizing healthcare and insurance analytics, I combine technical proficiency with a problem-solving mindset, ensuring data solutions are accurate, secure, and aligned with business goals. Certified in Azure Data Engineering and Power BI, I bring both the technical depth and business acumen to bridge the gap between complex data systems and strategic insights.

TECHNICAL SKILLS

Technical Category: Technical Skills

Data Engineering & ETL: Talend, Apache Kafka, Azure Data Factory, SSIS, HL7 Parsing, Alteryx, REST API Integration, Apache Atlas, Collibra

Databases & Data Warehousing: SQL Server, Snowflake, Azure SQL Database, Amazon S3, Azure Blob Storage

Programming & Scripting: Python (Pandas, NumPy, Scikit-learn, NLTK), R, Jupyter Notebook, Power Query

Big Data & Cloud Computing: Apache Spark, AWS Lambda, Azure ML Studio, Amazon S3, Azure Cloud Services

Business Intelligence & Visualization: Tableau, Power BI, Excel, SSRS

Machine Learning & Analytics: Predictive Modeling, Statistical Analysis, XGBoost, CNN, LSTM, Sentiment Analysis

Data Quality & Governance: Great Expectations, Data Cataloging, HIPAA Compliance, Data Validation

Collaboration & DevOps: Git, GitHub, Jira, Confluence

EXPERIENCE

Cigna Jan 2024 – Present

Data Analyst New York City

• Designed and optimized complex SQL Server queries integrated with Talend ETL workflows to consolidate patient, hospital, and insurance claim data into Snowflake for unified analytics.

• Leveraged Python (Pandas) with Apache Spark to process millions of claim records in near real time, enabling faster fraud detection and cost analysis.

• Built interactive Tableau and Power BI dashboards connected to Snowflake, delivering executive-level KPIs on patient outcomes, hospital efficiency, and cost reduction.

• Developed HL7-compliant data ingestion pipelines using Talend and Apache Kafka, ensuring standardized and secure transfer of healthcare data across multiple hospital systems.

• Utilized Scikit-learn with R statistical models to predict patient readmission probabilities and forecast disease trends, driving proactive care management.

• Implemented AWS Lambda with Amazon S3 for serverless processing and secure storage of large-scale medical reports and lab results, improving data retrieval efficiency.

• Automated repetitive data preparation tasks in Alteryx and integrated outputs into Tableau for real-time visualization of emergency admissions and claim approvals.

• Applied Great Expectations within Talend ETL flows to validate incoming hospital data, ensuring 100% compliance with quality and format standards before loading into Snowflake.

• Integrated REST APIs with Apache Kafka to stream live patient vitals into dashboards, enabling care teams to respond to critical cases instantly.

• Created scalable big data pipelines in Apache Spark with AWS Lambda triggers to handle spikes in claim processing without impacting system performance.

• Designed and maintained a centralized data catalog in Apache Atlas, linking datasets from SQL Server, Amazon S3, and Snowflake for faster analyst self-service.

• Implemented Collibra governance policies to control user access to sensitive health data, ensuring HIPAA compliance and audit readiness.

• Developed multi-source ETL frameworks in Talend to merge historical claims data from SQL Server with live Kafka streams for comprehensive trend analysis.

• Built predictive fraud detection models in Scikit-learn and deployed outputs to Power BI dashboards for the insurance risk team.

• Collaborated via Jira and GitHub to manage code versions, dashboard iterations, and cross-team task dependencies, reducing delivery delays.

• Orchestrated end-to-end healthcare data processing combining HL7 parsing, Talend transformation, and Snowflake warehousing for consistent analytics.

• Conducted advanced statistical analysis in R and visualized findings in Tableau to help hospital administrators improve recovery times and service quality.

• Designed streaming analytics architecture using Apache Kafka, AWS Lambda, and Tableau to provide near real-time monitoring of ICU patient metrics.

HDFC ERGO General Insurance Aug 2021 – Mar 2023

Data Analyst Mumbai, India

• Developed and maintained ETL workflows using Azure Data Factory and SSIS to automate data ingestion from multiple sources into Azure SQL Database for claim analysis.

• Cleaned and transformed large insurance datasets using Python (Pandas, NumPy) and Power Query, resulting in 98% clean and structured data for downstream analytics.

• Built interactive dashboards in Power BI and Excel to visualize claim trends, fraud hotspots, and customer demographics, helping reduce reporting time by 40%.

• Analyzed claim patterns and customer behavior using Python and Jupyter Notebook, providing insights that led to the detection of unusual claim spikes across regions.

• Supported fraud detection efforts by preparing training and testing datasets for scikit-learn models in Azure ML Studio, improving model accuracy to 87%.

• Integrated structured and unstructured data sources using Azure Data Factory and stored them in Azure Blob Storage for scalable and secure access.

• Created real-time reporting dashboards in Power BI using live connections to Azure SQL Database, enabling proactive monitoring of high-value claims.

• Performed statistical analysis using Python and R (basic level) to identify correlations in policyholder behavior and claim submission frequency.

• Developed rule-based and predictive analytics using Python and scikit-learn to flag potential duplicate or suspicious claims for manual review.

• Utilized Git and GitHub for version control and collaboration, ensuring seamless teamwork on data pipelines and analytics scripts.

• Documented analytical processes, findings, and fraud detection patterns using Confluence while managing project deliverables and sprint tasks in Jira.

• Automated the extraction and transformation of policyholder and agent data using SSIS and Azure Data Factory, reducing manual processing efforts by 60%.

• Conducted deep-dive analysis on customer segmentation and claim types using Jupyter Notebook and visualized key metrics using Power BI for strategic planning.

• Collaborated with cross-functional teams to identify fraud indicators by combining insights from Python analytics and business rules defined in SQL.

• Created custom SSRS reports and Excel dashboards for claims operations teams to highlight high-risk profiles and claim frequency trends.

• Enabled large-scale data storage and access by designing Azure Blob Storage structures integrated with analytics workflows in Azure ML and Power BI.

ACHIEVEMENTS

• Predicted high-risk hospitals with 90% accuracy using machine learning models built in Python with Scikit-learn and XGBoost, enabling Stryker to proactively address equipment issues, reduce maintenance costs by $2 million, and improve client satisfaction.

• Played a pivotal role in reducing fraudulent insurance claim approvals by 35% by preparing high-quality datasets using Python and SQL, enabling the successful deployment of predictive fraud detection models in Azure ML Studio.

ACADEMIC PROJECT

Project Title: Sentiment Analysis in Customer Reviews Using Machine Learning (Tech Stack: Python, Scikit-learn, Pandas, NLTK, Jupyter Notebook, Matplotlib)

Project Description:

• Built a machine learning model to analyze sentiment in Amazon customer reviews using both supervised and unsupervised learning, enabling businesses to gain insights for improving products and services.

Project Title: Enhancing Visual Question Answering with Hybrid Deep Learning Models (Tech Stack: Python, TensorFlow, Keras, OpenCV, CNN, LSTM)

Project Description:

• Developed a Visual Question Answering system combining CNNs for image understanding and LSTMs for question processing to generate accurate, context-aware responses.

CERTIFICATIONS

• Microsoft Certified: Azure Data Engineer Associate

• Certified Power BI Data Analyst Associate (PL-300)

EDUCATION

Master of Science in Data Science, NEW JERSEY INSTITUTE OF TECHNOLOGY

Bachelor of Technology in Computer Science, SRM INSTITUTE OF TECHNOLOGY


