Gowri Shankar Chintala
Data Engineer
Location: Texas, USA | Email | Ph: 682-***-**** | LinkedIn
PROFESSIONAL SUMMARY:
• Data Engineer with 5+ years of expertise architecting and optimizing scalable, secure, and high-performance data platforms across healthcare and financial domains.
• Proven ability to design and deploy cloud-based solutions (Azure, AWS, Databricks, Hadoop) and real-time streaming pipelines (Kafka, Spark Streaming, Flink).
• Experienced in building advanced ETL workflows, optimizing SQL, and engineering data warehouses and dimensional models to support analytics and reporting.
• Strong background in implementing governance, compliance, and security frameworks including HIPAA, GDPR, RBI, RBAC, AWS KMS, and Apache Ranger.
• Skilled in ensuring seamless data interoperability using standards like FHIR, HL7, and EDI to enhance cross-platform integration.
• Adept at delivering actionable insights through dynamic dashboards and analytics solutions using Tableau and Power BI.
• Hands-on expertise in CI/CD automation with Jenkins and Docker, enabling rapid, reliable, and secure data pipeline deployments.
• Python-focused data engineer with strong problem-solving and optimization skills to deliver robust, future-ready data ecosystems.
TECHNICAL SKILLS:
• Programming: Java, Python, .NET, SQL, C++, PySpark, Databricks
• Data Warehousing: Azure Synapse Analytics, AWS Redshift, Snowflake, AWS S3, AWS Glue, Data Lake Solutions
• ETL & Data Pipelines: Apache NiFi, Apache Kafka, Apache Flink, Apache Spark, Apache Airflow, Python, SQL, Azure KMS, AWS KMS, ETL processes
• Data Modeling: ERwin Data Modeler, Star Schema, Snowflake Schema, Data Marts, Data Models, Relational Databases
• Data Governance: Apache Atlas, Collibra, Data Lineage, Metadata Management, HIPAA Compliance, RBI, GDPR Compliance
• Data Security: Role-Based Access Control (RBAC), Column-Level Encryption, Row-Level Security, Azure Key Vault, FHIR, HL7, EDI
• Real-time Processing: Apache Kafka, Spark Streaming, Apache Flink, ELK Stack (Elasticsearch, Logstash, Kibana)
• Data Visualization: Tableau, Power BI
• Cloud Services: AWS (S3, Glue, Redshift, KMS), Azure (Synapse Analytics, Key Vault), Apache Ranger, MilCloud 2.0
• CI/CD & DevOps: Jenkins, Docker, Kubernetes, Prometheus, Grafana
• Automation & Orchestration: Python (Scripting, ETL Automation), Apache NiFi, Jenkins
• Monitoring & Alerting: Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana)
• Agile & Collaboration: Scrum, Jira, Cross-Functional Team Collaboration
• Data Compliance & Standards: HIPAA, FHIR, HL7, EDI
• Other & Soft Skills: Time Management, Leadership and Management, Problem solving, Negotiation, Decision-Making, Monitor- ing, Documentation, Presentation, Verbal Communication, Interpersonal Skills
• Version Control: Git, GitHub
• Operating Systems: Linux, Windows, iOS
• Database Management: archive recovery, database architecture, metadata repository creation, data acquisitions, object
PROFESSIONAL EXPERIENCE:
Tenet Healthcare – TX January 2024 – Present
Data Engineer
• Architected and implemented enterprise-grade data integration pipelines leveraging FHIR, HL7, and EDI standards to ensure seamless interoperability across heterogeneous EHR and EMR platforms (see the HL7 parsing sketch after this section).
• Automated ingestion of high-volume clinical, claims, and laboratory datasets via Apache NiFi, achieving a 60% reduction in manual processing and improving operational efficiency.
• Developed modular, reusable Python ETL frameworks incorporating advanced validation and transformation logic, resulting in a 40% increase in pipeline throughput (a minimal pipeline sketch follows this section).
• Designed and deployed scalable healthcare data warehouses and star-schema dimensional models using ERwin Data Modeler, improving analytical query performance by 30%.
• Delivered subject-oriented data marts to enable self-service analytics, enhancing population health reporting and yielding a 20% improvement in query responsiveness.
• Strengthened data security and HIPAA compliance through the implementation of Azure Key Vault encryption and Role-Based Access Control (RBAC) across cloud-native platforms.
• Orchestrated CI/CD pipelines with Jenkins for automated deployment of Python-based ETL solutions into AWS S3, Redshift, and ELK ecosystems, accelerating release cycles and ensuring reliable data checkpointing.
• Optimized ETL workflows by integrating Apache NiFi and Kafka orchestration, delivering scalable, high-performance architectures for Python- and SQL-driven pipelines.
• Architected and optimized end-to-end healthcare data pipelines integrating FHIR, HL7, and EDI standards with Apache NiFi, Kafka, and Python, enabling secure, HIPAA-compliant, real-time analytics across EHR, EMR, and cloud platforms.
• Implemented data quality and governance frameworks with automated anomaly detection, reconciliation, and lineage tracking, ensuring trustworthy and audit-ready healthcare datasets.
• Containerized ETL workloads using Docker and Kubernetes, enabling on-demand scalability, portability, and cost optimization for large-scale healthcare data processing.
• Deployed metadata cataloging and lineage tools (Apache Atlas, Azure Purview) to enhance data discoverability, governance, and compliance oversight.
• Conducted cloud cost optimization initiatives across AWS and Azure data platforms, achieving 20% reduction in monthly spend without sacrificing performance.
• Implemented real-time alerting systems to reduce systemic downtime, and automated ETL processes to increase throughput for demanding AWS workloads.
• Established robust data lineage and governance frameworks using Apache Atlas and Collibra, increasing compliance audit readiness by 50%.
• Implemented system monitoring and alerting with Prometheus and Grafana, reducing pipeline downtime and incident response time by 25%.
• Deployed containerized applications using Docker and Kubernetes to enhance scalability in line with AWS architecture goals for relational database processing.
• Designed and optimized SQL queries against structured databases, reducing healthcare claim processing times by leveraging comprehensive database models.
• Facilitated cross-functional Scrum ceremonies to stabilize and strengthen Agile teams delivering deployable ETL schedules aligned with DoD data mesh initiatives.
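A minimal sketch of the HL7 parsing work referenced above, using only the Python standard library; the sample ADT message, segment indices, and output field names are illustrative assumptions, not the production parser.

    # Parse pipe-delimited HL7 v2 segments into a flat record (sketch).
    def parse_hl7(message: str) -> dict:
        """Split an HL7 v2 message into {segment_id: [fields]}."""
        segments = {}
        for line in message.strip().split("\n"):
            fields = line.split("|")
            segments[fields[0]] = fields
        return segments

    SAMPLE = (  # hypothetical ADT^A01 message
        "MSH|^~\\&|LAB|HOSP|EHR|HOSP|202401011200||ADT^A01|MSG001|P|2.5\n"
        "PID|1||12345^^^HOSP^MR||DOE^JANE||19800101|F"
    )

    seg = parse_hl7(SAMPLE)
    record = {
        "patient_id": seg["PID"][3].split("^")[0],   # PID-3: identifier list
        "family_name": seg["PID"][5].split("^")[0],  # PID-5: patient name
        "birth_date": seg["PID"][7],                 # PID-7: date of birth
    }
    print(record)  # {'patient_id': '12345', 'family_name': 'DOE', 'birth_date': '19800101'}

In production, records like this would be mapped onto FHIR resources (e.g., Patient) before landing in the warehouse.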
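The modular ETL framework bullet above could reduce to a composition of validators and transforms along these lines; the Pipeline class, rule lambdas, and claims fields are hypothetical stand-ins for the production logic.

    from dataclasses import dataclass, field
    from typing import Callable

    @dataclass
    class Pipeline:
        """Chain of validators and transforms applied per record."""
        validators: list[Callable[[dict], bool]] = field(default_factory=list)
        transforms: list[Callable[[dict], dict]] = field(default_factory=list)

        def run(self, records):
            for rec in records:
                if all(check(rec) for check in self.validators):
                    for fn in self.transforms:
                        rec = fn(rec)
                    yield rec  # failing records would route to a dead-letter queue

    pipe = Pipeline(
        validators=[lambda r: r.get("claim_id") is not None,
                    lambda r: float(r.get("amount", 0)) >= 0],
        transforms=[lambda r: {**r, "amount": round(float(r["amount"]), 2)}],
    )
    print(list(pipe.run([{"claim_id": "C1", "amount": "120.456"}, {"amount": "-5"}])))
    # [{'claim_id': 'C1', 'amount': 120.46}]

Keeping rules as plain callables is what makes such a framework reusable across clinical, claims, and lab feeds.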
HCLTech – India May 2019 – November 2022
Data Engineer
• Established a robust data governance framework leveraging Apache Atlas, enabling automated lineage tracking and metadata management to meet RBI audit and compliance requirements.
• Implemented advanced data security measures using AWS Key Management Service (KMS) for column-level encryption and Apache Ranger for fine-grained access control, ensuring adherence to RBI cybersecurity norms and GDPR compliance.
• Designed and optimized ETL pipelines using Apache Spark and Apache Airflow, reducing data processing latency by 40% for high-volume banking transaction systems, ensuring seamless integration with core banking platforms.
• Tuned Hadoop clusters for optimal performance, boosting system throughput by 30% through resource allocation and configuration enhancements, supporting large-scale banking data workloads.
• Developed and deployed real-time fraud detection frameworks utilizing Apache Kafka and Spark Streaming, processing millions of daily transactions to strengthen financial security and comply with RBI fraud prevention guidelines (see the streaming sketch after this section).
• Engineered a custom Python and Spark-based data profiling tool to identify and resolve data quality issues in banking datasets, improving data accuracy for regulatory reporting and analytics (see the profiling sketch after this section).
• Architected a Databricks-based self-service analytics platform tailored for financial modeling, empowering data scientists to perform predictive analytics for credit risk and customer segmentation.
• Created interactive Power BI dashboards to provide real-time insights into customer financial behavior, loan portfolio performance, and banking product metrics, driving data-informed decision-making for stakeholders.
• Automated CI/CD pipelines with Jenkins and Docker, achieving a 70% reduction in release cycles to enable rapid deployment of banking data solutions, enhancing operational efficiency.
• Deployed a 24/7 monitoring solution using the ELK Stack to ensure continuous oversight of data pipelines, enabling proactive issue resolution and maintaining operational reliability for critical banking systems.
• Carried out iterative Hadoop cluster optimizations to improve transaction-processing performance, validating changes through cross-team performance reviews.
• Developed Python data-screening scripts with configurable validation rules, backed by accompanying test modules for release pushes.
• Created real-time monitoring aggregations in the ELK Stack, giving operators clearly formatted views across linked pipeline components.
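A hedged sketch of the Kafka plus Spark Streaming fraud detection above, expressed with Structured Streaming; the broker address, topic name, schema, and the more-than-3-transactions-per-minute velocity threshold are illustrative assumptions.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StringType, DoubleType, TimestampType

    spark = SparkSession.builder.appName("fraud-velocity").getOrCreate()

    schema = (StructType()
              .add("account_id", StringType())
              .add("amount", DoubleType())
              .add("event_time", TimestampType()))

    txns = (spark.readStream.format("kafka")
            .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker
            .option("subscribe", "transactions")               # assumed topic
            .load()
            .select(F.from_json(F.col("value").cast("string"), schema).alias("t"))
            .select("t.*"))

    # Flag accounts with more than 3 transactions inside a 1-minute window.
    alerts = (txns
              .withWatermark("event_time", "2 minutes")
              .groupBy(F.window("event_time", "1 minute"), "account_id")
              .agg(F.count("*").alias("txn_count"), F.sum("amount").alias("total"))
              .where("txn_count > 3"))

    alerts.writeStream.outputMode("update").format("console").start().awaitTermination()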
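At its core, the custom profiling tool above is a column-level quality scan; a minimal PySpark sketch of a null-rate pass follows, with the input path and the 5% threshold as assumptions.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("profiler").getOrCreate()
    df = spark.read.parquet("s3://bucket/banking/txns/")  # hypothetical path

    total = df.count()
    null_rates = df.select([
        (F.sum(F.col(c).isNull().cast("int")) / total).alias(c)
        for c in df.columns
    ]).first().asDict()

    # Columns whose null rate exceeds 5% get flagged for remediation.
    print({c: rate for c, rate in null_rates.items() if rate > 0.05})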
EDUCATION:
Master of Computer Science - Texas A&M University-Commerce, Texas, USA
Bachelor of Computer Science - Jawaharlal Nehru Technological University, Kakinada, Andhra Pradesh, India
CERTIFICATIONS:
AWS Certified Data Engineer - Associate