Venkata Gopi Ranjith Kumar Maturi
+1-469-***-**** ***************@*****.***
Profile Summary
Data Engineer with 5+ years of experience building scalable, cloud-native data pipelines across the healthcare, CPG, and telecom domains in the US and Indian markets. Skilled in Azure, AWS, Databricks, Spark, and SQL, with proven expertise in modernizing legacy systems and enabling real-time analytics. Strong track record in HIPAA-compliant healthcare solutions, IoT-driven supply chain optimization, and enterprise data governance.
Skills
• Languages: Python, Shell, SQL, HiveQL, Scala, R, COBOL
• Cloud Platforms: Azure (Data Factory, Synapse Analytics, Databricks, Stream Analytics, Event Hubs, Blob Storage, Cosmos DB, Key Vault, Logic Apps, Purview, Azure Analysis Services, Azure SQL Database), AWS (S3, EMR, Redshift, Glue, Kinesis, Athena, Lambda, ECS, DynamoDB)
• Big Data & Analytics: Hadoop, Hive, Spark (PySpark), Delta Lake, Parquet, Airflow, Kafka, Snowflake, Apache Flink, dbt
• Orchestration & Automation: Apache Airflow, Jenkins, Terraform, Helm, Power Automate
• Data Modeling & Governance: Azure Purview, Glue Data Catalog, RBAC, GDPR, HIPAA, SOX, RBI, PRA, 3GPP compliance
• Databases: SQL Server, PostgreSQL, MySQL, MongoDB, Cassandra, ClickHouse, DynamoDB, Cosmos DB, SAP ERP
• Visualization & Reporting: Power BI, Tableau, Looker, Azure Analysis Services
• DevOps & CI/CD: Docker, Kubernetes (EKS), Terraform, Jenkins, GitHub Actions
• Security: SAS Tokens, Azure AD Authentication, Azure Key Vault, Private Endpoints, PII masking
• Other Tools & Technologies: VSAM, FTP, SQL Server, Mainframe, SAP ERP, MRI & IoT Data Systems
Relevant Experience
HCA Healthcare Jun 2024 - Present
Data Engineer Dallas, TX
• Designed and deployed scalable Azure Data Lake pipelines to consolidate EHR and clinical trial data, enabling real-time patient care analytics. Implemented HIPAA-compliant encryption and role-based access controls to secure PHI.
• Architected Azure Synapse Analytics workflows to integrate MRI and lab data, enabling 30% more accurate diagnostic reports for oncology teams.
• Migrated legacy ETL pipelines from on-premises SQL Server to Azure Data Factory, automating data integration for critical care datasets and cutting manual processing time by 35%.
• Streamlined ICU sensor data ingestion via Azure Stream Analytics and Event Hubs, improving real-time monitoring accuracy by 25% for emergency response dashboards.
• Implemented Delta Lake schemas in Azure Databricks to standardize clinical trial data, reducing data preparation time by 30% and accelerating FDA submission timelines for new drug approvals.
• Secured data pipelines using Azure Key Vault and private endpoints, ensuring HIPAA and GDPR compliance for cross-border patient data.
• Collaborated with clinicians to design Azure Purview metadata frameworks, enhancing traceability and governance of chemotherapy treatment datasets.
• Optimized Azure SQL database performance through indexing and partitioning, reducing query latency for high-volume billing and claims analytics.
PepsiCo Jul 2023 - May 2024
Data Engineer Dallas, TX
• Built Azure Data Factory pipelines to ingest and transform IoT sensor data from 100+ manufacturing plants, improving supply chain demand forecasting accuracy by 25%.
• Designed Azure Databricks workflows to process terabytes of sales data, improving regional inventory replenishment accuracy by 20% through AI-driven insights.
• Migrated legacy SAP ERP datasets to Azure Synapse, cutting report generation time by 35% and improving decision-making for North American sales operations.
• Automated retail partner data onboarding using Azure Logic Apps and Power Automate, reducing manual validation effort by 40%.
• Implemented Azure Stream Analytics for real-time production line monitoring, cutting equipment downtime through predictive maintenance alerts.
• Secured Azure Blob Storage with SAS tokens and Azure AD authentication, ensuring compliance with CPG data governance standards.
• Developed Power BI dashboards with Azure Analysis Services to visualize SKU performance, supporting data-driven pricing strategy adjustments.
• Optimized Azure Cosmos DB throughput for global distributor data, improving scalability for peak-season transaction processing.
Verizon Jun 2020 - Dec 2021
Data Engineer Hyderabad, India
• Migrated Hadoop-based CDR pipelines to AWS EMR for large-scale telecom record processing, optimizing Spark jobs with Kryo serialization to reduce memory overhead and improve execution efficiency.
• Ingested telecom logs and CDRs into Amazon S3, reducing storage costs by 30% and enabling efficient downstream Spark processing on AWS EMR.
• Configured Amazon S3 lifecycle rules for log archiving and automated cleanup, reducing storage costs by 25%.
• Built Airflow DAGs to orchestrate 5G network usage workflows, reducing pipeline failures through SLA-based monitoring.
• Automated AWS EMR cluster provisioning with Terraform and Jenkins, reducing deployment time by 50% for development teams.
• Implemented Parquet partitioning for historical call datasets, reducing Athena query runtime by 40% for billing analytics.
• Built PySpark validators to detect missing tower data in real-time streams, integrating findings with NOC dashboards via Kafka, reducing data discrepancies by 30%.
• Converted COBOL batch jobs to Spark SQL for roaming charges, removing 100% of manual reconciliation steps and improving processing efficiency.
• Containerized legacy Java parsers with Docker and deployed on EKS using Helm charts, improving deployment scalability and reducing setup time by 40%.
• Collaborated with engineers to align data models with 3GPP standards, streamlining VoLTE call drop analytics.
• Documented GDPR-compliant data lineage using AWS Glue Catalog and audited PII masking in customer reports, ensuring data privacy and regulatory compliance.
Lloyds Banking Group Mar 2018 - May 2020
Junior Data Engineer Hyderabad, India
• Modernized core banking reports by migrating COBOL mainframe data to Hadoop and developing HiveQL transformations for regulatory submissions.
• Scripted Python validators to ensure accuracy of migrated transaction records, automating reconciliation of legacy VSAM datasets.
• Automated UK regulatory audit documentation with Shell utilities, collaborating with compliance teams to streamline control frameworks and improve reporting accuracy by 40%.
• Transitioned batch reporting systems to Hive for capital adequacy calculations, aligning workflows with PRA guidelines.
• Built data quality checks for loan portfolio datasets, supporting risk management teams in detecting anomalies.
• Migrated legacy flat-file archives to HDFS, reducing data retrieval time by 40% and enabling faster fraud detection analytics.
• Maintained end-to-end data lineage for SOX compliance, reducing audit review time by 25% while ensuring control validation accuracy.
• Automated file transfers from mainframe to Hadoop using FTP scripts, reducing manual intervention in daily ETL jobs.
Education
University of Texas at Dallas Jan 2022 - May 2023
Master's, Business Analytics GPA: 3.73/4
V R Siddhartha Engineering College Jun 2013 - Mar 2017
Bachelor's, Electronics and Instrumentation Engineering GPA: 7/10
Certifications
• AWS Certified Data Analytics - Specialty
• Microsoft Certified: Azure Fundamentals (AZ-900)