CHAITANYA
Data Engineer / Data Analyst / BI Engineer | Multi-Cloud (AWS, GCP, Azure) | ETL, Data Warehousing & BI Reporting
Email: ******************@*****.*** / Contact: +1-475-***-**** / LinkedIn
PROFESSIONAL SUMMARY
Data Engineer / BI Engineer with 5+ years of experience designing and implementing scalable data pipelines, cloud-based data platforms, and business intelligence solutions across financial services, healthcare, and IT domains.
Expert in multi-cloud environments (AWS, GCP, Azure) with hands-on experience in ETL development, Spark, PySpark, and modern data warehousing technologies including Snowflake, Redshift, BigQuery, and Azure Synapse.
Strong background in data analysis, visualization, and storytelling using SQL, Tableau, Power BI, and Looker to enable actionable business decisions.
Experienced in requirements gathering, stakeholder reporting, and KPI dashboarding, delivering insights across financial, healthcare, and IT operations.
Proficient in translating complex datasets into clear business insights, supporting compliance, fraud monitoring, risk assessment, and clinical reporting.
Skilled in integrating AI/ML models into production pipelines, enabling predictive analytics for fraud detection, credit risk, and anti-money laundering (AML).
Proficient in building real-time and batch pipelines, automated workflows, and operational dashboards using Airflow, Python, Tableau, Power BI, and Looker.
Adept at collaborating with cross-functional teams to deliver data-driven insights and actionable business solutions.
CORE TECHNICAL SKILLS
Programming & Scripting: Python, SQL, PySpark, Pandas, NumPy, Shell Scripting, APIs
Data Engineering & ETL: Airflow, AWS Glue, Azure Data Factory, GCP Dataflow, SSIS, Spark, Hadoop MapReduce, Databricks
Data Warehousing: Snowflake, Redshift, BigQuery, Azure Synapse, Teradata, Oracle, MySQL
Data Analysis & Visualization: Tableau, Power BI, Looker, Excel (Pivot Tables, VLOOKUP, Macros), DAX, Data Modeling, Statistical Analysis, KPI Dashboards, Ad-hoc Reporting
Business/Reporting Tools: SSAS, SSRS, Business Objects
AI/ML Integration: Operationalizing ML model outputs, predictive dashboards for risk, fraud, and clinical insights
Data Streaming & Messaging: Kafka, Pub/Sub
Cloud Platforms: AWS (S3, Redshift, EMR, Lambda, Glue, Athena), GCP (BigQuery, Dataproc, Pub/Sub, Cloud Composer), Azure (Data Lake, Synapse, Data Factory)
DevOps & Containerization: Docker, Kubernetes, Git
Other Tools: Presto, Logging & Data Quality frameworks
PROFESSIONAL EXPERIENCE
Data Engineer / Data Analyst
Wells Fargo, Texas Sep 2023 – Present
Responsibilities:
Partnered with business analysts to design KPI dashboards in Tableau and Power BI that improved risk and compliance monitoring across credit risk and AML.
Performed trend analysis and ad-hoc reporting using SQL, Presto, and BigQuery to support treasury operations and executive decision-making.
Translated business requirements into dashboards and reports, enabling stakeholders to monitor fraud detection and compliance metrics in real time.
Conducted data validation and profiling to ensure accuracy of BI insights before publishing to business teams.
Migrated legacy data to Snowflake and created staging/metric tables to support BI reporting in Tableau, Power BI, and Looker.
Designed and developed scalable ETL pipelines using AWS Glue, S3, and Redshift to ingest and transform enterprise data.
Built and orchestrated Airflow DAGs for scheduling workflows across AWS and GCP data services.
Processed large-scale datasets using Spark on AWS EMR and GCP Dataproc.
Created complex SQL queries in Presto and BigQuery for KPI dashboards and automated BI refresh pipelines.
Developed Python APIs and automation scripts for ingestion and validation from multiple data sources.
Implemented data quality validations using PySpark and SQL, ensuring accuracy in downstream BI reporting.
Integrated outputs from AI/ML models into BI dashboards for fraud monitoring, credit risk, and AML, enabling predictive insights for compliance teams.
Collaborated with data science teams to operationalize ML models within ETL workflows using Python, Spark, and Snowflake.
Optimized data pipelines to support real-time reporting for treasury and retail banking operations.
Environment: Python, SQL, AWS (S3, Glue, Redshift, EMR, Lambda, Athena), GCP (BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Composer), Airflow, Snowflake, Spark, Kafka, Tableau, Power BI, Looker, Docker, Kubernetes, Git.
Data Engineer / BI Engineer
UnitedHealth Group (Optum) – Eden Prairie, MN Feb 2022 – Aug 2023
Responsibilities:
Developed batch and streaming pipelines using GCP (BigQuery, Dataflow, Pub/Sub, Cloud Composer) and Azure Data Factory.
Built ETL frameworks to move data across AWS S3, Snowflake/Redshift, and Azure Data Lake for BI dashboards.
Automated workflows and monitoring using Airflow and Python, improving reliability and reducing manual effort.
Implemented Spark/PySpark pipelines for cleansing, aggregation, and advanced analytics across multi-cloud platforms (AWS, GCP, Azure).
Designed SQL-based data marts in Snowflake, Redshift, and Azure Synapse to support Tableau, Looker, and Power BI reports.
Built interactive dashboards in Power BI and Looker to analyze claims, provider, and patient data for regulatory and operational reporting.
Delivered executive-level KPI reports, accelerating the reporting cadence from weekly to daily for 15+ stakeholders.
Conducted SQL-based data analysis across BigQuery, Snowflake, and Redshift to identify healthcare cost trends and patient outcomes.
Partnered with business teams to define metrics, reporting requirements, and compliance-driven data models.
Created real-time operational dashboards in Looker, Tableau, and Power BI for leadership and business operations.
Optimized SQL queries and warehouse performance, improving BI refresh cycles.
Developed healthcare analytics pipelines to integrate claims, patient, and provider data for regulatory and clinical reporting.
Environment: Python, SQL, GCP (BigQuery, Dataflow, Pub/Sub, Cloud Composer, Dataproc), AWS (S3, Redshift, EMR, Lambda), Azure (Data Factory, Data Lake, Synapse), Snowflake, Spark, Airflow, Looker, Tableau, Power BI, Docker, Kubernetes, Git.
Data Engineer / BI Engineer
Cyient, India Jan 2020 – Jul 2021
Responsibilities:
Designed and developed ETL pipelines for cloud and on-premise data warehouses using Python, Airflow, Spark, and Snowflake/Redshift.
Processed large-scale structured and unstructured data using PySpark, Hadoop MapReduce, and Pandas, integrating multiple data sources.
Built and orchestrated Airflow DAGs for automated workflows across AWS, Azure, and GCP environments.
Created SQL-based data models, stored procedures, and transformations for data marts and dashboards in Snowflake, Redshift, and Azure SQL.
Developed BI reports and dashboards using Tableau, Power BI, and Looker for business insights and operational KPIs.
Implemented data quality validations and logging frameworks to ensure accuracy and reliability of ETL pipelines.
Worked on real-time and batch pipelines using AWS (S3, Lambda, Redshift), GCP (BigQuery, Dataflow), Azure Data Factory, and Spark jobs.
Collaborated with DevOps and cross-functional teams for deployment, testing, and production support of ETL and BI solutions.
Environment: Python, SQL, Airflow, Spark, Hadoop, AWS (S3, Redshift, Lambda), GCP (BigQuery, Dataflow, Cloud Composer), Azure (Data Factory, Azure SQL, Data Lake), Snowflake, Tableau, Power BI, Looker, SSIS, SSAS, SSRS, Oracle, Teradata, MySQL, Pandas, Docker, Kubernetes, Git.
PROJECTS
Real-Time Data & Predictive Analytics – Wells Fargo Sep 2023 – Present
Built and optimized ETL pipelines and data warehouses (fact/dimension tables) using Python, Spark, SQL, Airflow, AWS (S3, Redshift, Glue, Lambda), and GCP (BigQuery, Dataflow, Dataproc, Pub/Sub). Developed real-time dashboards in Tableau and Power BI to monitor credit risk, AML, and fraud detection KPIs. Integrated AI/ML model outputs into BI dashboards for predictive insights, improving risk monitoring and operational decision-making.
Data Engineering & Healthcare Analytics – UnitedHealth Group (Optum) Feb 2022 – Aug 2023
Designed batch and streaming pipelines across AWS, GCP, and Azure using BigQuery, Dataflow, Cloud Composer, Azure Data Factory, Spark, and PySpark. Built data marts and dashboards in Tableau, Power BI, and Looker for regulatory, operational, and clinical reporting. Conducted SQL-based analysis on patient, provider, and claims data to identify trends in healthcare costs and outcomes. Optimized warehouse performance and ETL workflows, reducing BI report refresh cycles and improving stakeholder reporting efficiency.
ETL & BI Solutions – Cyient, India Jan 2020 – Jul 2021
Developed ETL pipelines and automated workflows using Python, Airflow, Spark, Hadoop, Snowflake, Redshift, and Azure SQL. Processed large-scale structured and unstructured data for reporting and analytics. Built interactive dashboards in Tableau, Power BI, and Looker to monitor KPIs, operational metrics, and business insights. Implemented data quality checks and logging frameworks to ensure accuracy and reliability of BI outputs.
SOFT SKILLS
Analytical & Problem-Solving, Communication, Agile & Scrum, Stakeholder Management, Adaptability, Attention to Detail, Time Management, Mentoring, Teamwork, Confidentiality
EDUCATION
Master of Science in Computer Science – Dec 2022
Sacred Heart University, Fairfield, CT