SUMMARY
Experienced Data Engineer with * years of expertise in designing, developing, and optimizing data pipelines and ETL processes across GCP, Azure, AWS, and hybrid cloud environments. Enhanced Azure Data Lake pipelines to support real-time analytics and streamlined data accessibility. Led automation of workflows using Azure Logic Apps, improving pipeline reliability by 50%. Delivered enterprise-grade data governance frameworks to ensure accuracy and compliance. Proficient in Spark, PySpark, and Databricks for high-performance, distributed data processing. Builds scalable solutions with a strong emphasis on security, compliance, and maintainability.
TECHNICAL SKILLS
Programming & Scripting Languages: Python, Scala, Java, PowerShell, DAX, M Query, Shell Scripting
Data Integration & ETL: ADF, SSIS, Informatica, Kafka, REST, GraphQL, CDC, SCD, Parameterized ETL, Azure Functions
Big Data & Distributed Processing: Spark, PySpark, Delta Lake, Parquet, ORC, Avro, Iceberg, Hadoop, Event Hubs, Airflow
Cloud Platforms: AWS, Azure
Cloud Services & Storage Platforms: Azure Synapse, Databricks, ADLS Gen2, Cosmos DB, Snowflake, Blob Lifecycle, Cost Management
Web & API Development: Flask, FastAPI
Databases: SQL, SQL Server, PostgreSQL, MySQL, Oracle, NoSQL
Business Intelligence & Visualization Tools: Power BI, Tableau, Looker, AAS, DirectQuery, RLS, Multi-dimensional Models
Data Modeling & Warehousing: Star Schema, Snowflake Schema, OLAP Cubes, Fact Tables, Dimension Tables, Indexing, Kimball, Inmon, Materialized Views
Machine Learning & AI: Azure ML, Scikit-Learn, TensorFlow, Cognitive Services, Time Series, AutoML, MLOps
DevOps & Infrastructure as Code (IaC): Azure DevOps, Terraform, ARM, GitHub Actions, Docker, Kubernetes, Git, AKS
Security & Compliance: RBAC, Data Masking, Tokenization, Encryption, Sentinel, SIEM, SOC2, GDPR, HIPAA
Monitoring & Governance: Azure Purview, Synapse Workload Management, Auto-scaling, Log Analytics
PROFESSIONAL EXPERIENCE
Capital One October 2024 – Present Dallas, TX
ETL / Data Integration Engineer
Designed and implemented ETL workflows using Informatica PowerCenter, Azure Data Factory (ADF), and AWS Glue, enabling seamless integration from multiple source systems.
Migrated legacy on-prem ETL processes to cloud platforms (AWS + Snowflake), improving scalability and reducing infrastructure costs.
Developed Python and SQL scripts to enhance ETL jobs, automate validation, and accelerate data transformation tasks.
Integrated datasets from Teradata, SQL Server, DB2, and Oracle into Snowflake and Synapse for enterprise-wide reporting.
Built and optimized parameterized ETL pipelines with incremental loading and change data capture (CDC), reducing processing overhead.
Partnered with business analysts to gather requirements and deliver ETL solutions, aligning pipelines with BI and reporting needs.
Authored technical documentation and workflow diagrams to support ETL processes, pipelines, and data models.
Enforced data quality and cleansing rules, improving consistency and accuracy across customer and financial datasets.
Applied CI/CD pipelines via Azure DevOps for ETL deployments, improving release management and reliability.
CVS Health June 2023 – September 2024 Northbrook, IL
Data Engineer
Built scalable ETL pipelines using Informatica IICS (IDMC), AWS Glue, and Python, supporting CRM and healthcare data integration.
Migrated legacy SAS jobs into modern cloud ETL platforms, streamlining workflows and reducing runtime by 30%.
Integrated data from Oracle, Teradata, SQL Server, and ERP systems into AWS Redshift and Snowflake, enabling advanced analytics.
Developed data cleansing and validation logic within ETL jobs, improving data accuracy and reliability by 25%.
Partnered with stakeholders to define ETL requirements, mappings, and workflows, ensuring alignment with business goals.
Designed Python-based automation scripts to monitor ETL jobs, perform error handling, and notify stakeholders proactively.
Enhanced workflow orchestration with Airflow and Informatica CDI, reducing ETL failures and improving SLA adherence.
Contributed to Agile/Kanban delivery through Jira-based sprint planning, daily standups, and retrospectives.
Produced BI-ready datasets and schemas for reporting teams using Snowflake and Tableau/Power BI.
Infosys September 2020 – July 2021 Bangalore
Data Engineer
Led the design and implementation of scalable ETL pipelines using Azure Data Factory, improving data processing efficiency for large-scale datasets.
Established automated data ingestion workflows using Azure Databricks and SQL for high-performance processing.
Enhanced data transformation processes using Azure SQL Database and Azure Synapse Analytics, reducing query processing time by 35% and improving overall system performance.
Streamlined reporting and dashboard creation with Power BI, enabling real-time business insights and reducing manual reporting efforts.
Implemented robust data security measures within Azure, ensuring GDPR compliance and reducing security breaches by 20%.
Hexaware Technologies June 2019 – August 2020 Hyderabad
Data Engineer
Engineered an automated claims validation system using Azure OCR and Python AI models, reducing manual processing time and streamlining claims processing.
Created a policyholder lifetime value (LTV) prediction model with Azure ML AutoML and Gradient Boosting Trees, increasing customer retention by 15% and refining targeted marketing approaches.
Implemented a data versioning system utilizing Azure DevOps and Delta Lake, ensuring auditability and reproducibility of insurance records, and improving data management protocols.
Optimized insurance risk modeling queries in Azure SQL and Databricks Delta, slashing actuarial model execution time by 50%, enhancing the speed and precision of risk assessments.
Developed an AI-powered chatbot with Azure Bot Services and LUIS to support claims processing, reducing response times and improving customer satisfaction.
EDUCATION
Master of Science in Computer & Information Science
Southern Arkansas University, Magnolia, Arkansas January 2022 – May 2023
Bachelor of Technology in Computer Science and Engineering
Jawaharlal Nehru Technological University Hyderabad (JNTUH), India June 2016 – May 2020
BHARATH REDDY
Data Engineer
Email: ************@*****.*** LinkedIn: linkedin.com/in/Bhrath Mobile: +1-361-***-**** Place: United States