VAMSHIDHAR REDDY BOLLAMPALLY
+1-660-***-**** ***********@*****.*** LinkedIn
SUMMARY
Data Engineer with 4+ years of experience designing and optimizing scalable cloud data architectures and high-volume data pipelines across AWS and Azure. Proficient in ETL/ELT, Apache Spark, dbt, Databricks, Snowflake, and real-time streaming. Skilled in data modeling, governance, compliance, and pipeline performance optimization. Collaborative team player with a track record of mentoring peers and delivering automated, monitored, and reliable data analytics pipelines that enable faster insights and operational efficiency.
TECHNICAL SKILLS
Programming & Scripting: Python, SQL, Scala, Shell Scripting
Big Data & Streaming: Apache Spark (PySpark, Spark SQL), Hadoop, Hive, Kafka, Kinesis, Real-Time & Event-Driven Pipelines
Data Engineering & ETL: ETL/ELT Pipeline Design, dbt, Informatica, Data Modeling, Schema Evolution, Data Quality & Validation, Lakehouse / Medallion Architecture, API Integration (REST/SOAP)
Databases & Analytics: Snowflake, PostgreSQL, Power BI
Cloud Platforms: AWS (S3, Glue, Redshift, EMR, Lambda), Azure (Data Factory, Synapse Analytics, Data Lake, Event Hubs, Functions)
Orchestration & Automation: Apache Airflow, Databricks Workflows, Terraform, Jenkins, Git, GitLab CI/CD
Governance & Compliance: Azure Purview, RBAC, HIPAA/GDPR Compliance, PHI Data Handling & Healthcare Data Pipelines, Data Quality Monitoring / Observability
Engineering Practices: Infrastructure as Code (IaC), CI/CD Automation, Pipeline Monitoring & SLA Management, Cost Optimization
PROFESSIONAL EXPERIENCE
Data Engineer eviCore Healthcare, Nashville, TN Jan 2023 – Present
Engineered and optimized large-scale ETL/ELT pipelines using AWS Glue and PySpark, processing 100M+ healthcare claims monthly, accelerating ingestion and delivery for analytics and operations teams.
Migrated legacy claims datasets to Snowflake with dbt workflows, implementing a performance-optimized warehouse architecture that reduced query runtime by 35% and improved reliability of Power BI dashboards for leadership reporting.
Built Delta Lake pipelines on Databricks for batch and streaming workloads, ensuring ACID compliance, robust data lineage, and 40% faster processing for high-volume post-acute care datasets.
Developed real-time streaming pipelines with Azure Event Hubs and Databricks Structured Streaming, cutting latency from minutes to seconds, enabling actionable insights for care management and faster operational decision-making.
Mentored junior engineers on data modeling, dbt modular design, and Agile practices, promoting knowledge sharing, cross-team collaboration, and higher-quality deliverables.
Data Engineer Syscon Solutions, Hyderabad, India Apr 2021 – May 2022
Designed and automated ETL workflows using Apache Airflow and Databricks Jobs, improving SLA adherence across 100+ daily financial and operational pipelines while ensuring consistent and reliable data availability.
Built distributed data pipelines ingesting and transforming 1TB+ daily from multiple sources using PySpark and SQL, enhancing throughput and operational reporting efficiency.
Engineered real-time streaming pipelines with AWS Kinesis and Lambda, enabling sub-minute analytics for financial operations, facilitating high-volume transaction monitoring and faster decision-making for downstream teams.
Migrated 60+ legacy ETL jobs to Informatica and Delta Lake, consolidating redundant workflows, reducing maintenance effort, and improving SLA adherence and pipeline consistency across critical business processes.
Implemented Azure Purview for data lineage and metadata tracking across 5 business domains, providing enterprise-wide visibility, strengthening governance, auditability, and compliance, and supporting controlled access to sensitive datasets.
Integrated REST and SOAP APIs for secure data exchange, enabling reliable cross-system communication, enhancing operational data accuracy, and fostering better collaboration between internal and external teams.
Developed monitoring dashboards and alerting mechanisms for ETL pipelines, proactively detecting anomalies, reducing error resolution time by 50%, and improving overall data reliability, quality, and observability.
PROJECTS
Healthcare Data Migration & Real-Time Analytics
Technologies Used - Snowflake, PySpark, AWS Kinesis & EMR, ETL/ELT, Data Governance, HIPAA Compliance, Power BI
Led migration of multi-terabyte datasets from legacy on-prem databases to Snowflake, designing optimized ETL pipelines that accelerated queries and improved downstream reporting efficiency.
Built real-time streaming pipelines using AWS Kinesis, EMR, and PySpark, enabling near-instant ingestion, processing, and operational reporting for healthcare claim data.
Integrated data governance, quality monitoring, and compliance checks into pipelines, ensuring adherence to HIPAA standards and organizational SLA requirements.
Delivered interactive Power BI dashboards visualizing operational metrics for leadership, improving decision-making and enabling proactive management of healthcare workflows.
EDUCATION
University of Central Missouri
Bachelor of Science, Computer Science Warrensburg, USA
CERTIFICATIONS
AWS Certified Data Analytics – Associate
Microsoft Certified: Azure Data Engineer Associate