Data Engineer Real-Time

Location:
Dallas, TX
Salary:
90,000
Posted:
April 23, 2025

Resume:

AKANKSHA CHENNA

**************@*****.*** 820-***-**** LinkedIn

SUMMARY

●Results-driven Data Engineer with 4 years of experience designing scalable data platforms and cloud-native architectures in banking and healthcare, delivering cost-efficient, secure, and ML-integrated solutions.

●Proven ability to build robust ETL/ELT pipelines and real-time data frameworks using PySpark, SQL, and orchestration tools like Airflow, Glue, and Azure Data Factory.

●Hands-on with AWS, GCP, and Azure, with experience in Snowflake, Databricks, BigQuery, and Redshift for building high-performance data infrastructure.

●Skilled in containerizing workflows with Docker, deploying via CI/CD and Terraform, and integrating GenAI/ML models into pipelines to drive intelligent insights.

●AWS Certified, Agile-savvy, and committed to delivering data solutions that optimize performance, accuracy, and decision-making.

PROFESSIONAL EXPERIENCE

Data Engineer, Citigroup Charlotte, NC Jan ’24 - Present

●Designed and automated ETL pipelines for financial transactions using PySpark and Airflow, integrating data from 5+ sources, which reduced manual data handling by 70% and improved data refresh frequency from weekly to daily.

●Built a centralized data lake using AWS S3, Lambda, Glue, and Redshift, consolidating cross-departmental data, which improved compliance reporting turnaround by 40% and enabled self-serve access for 6 business teams.

●Implemented real-time data pipelines using Kafka and Spark Streaming to monitor suspicious transaction activity, speeding up fraud detection by 40%.

●Optimized SQL queries and partitioning strategies in Redshift, cutting report generation time by 50%.

●Containerized PySpark-based ETL jobs with Docker, implemented CI/CD pipelines via Jenkins, and provisioned infrastructure with Terraform, which cut deployment time by 60% and reduced pipeline failures by 30%.

●Collaborated with ML engineers to integrate predictive models into ETL workflows and leveraged Agile practices to deliver iterative product releases on time.

●Collaborated with risk, compliance, and analytics teams to ensure pipelines align with regulatory frameworks and data governance standards.

Data Engineer, Philips Healthcare Noida, India Apr ’20 - Jul ’22

●Designed and maintained secure, HIPAA-compliant data pipelines on GCP using Dataflow, BigQuery, and Cloud Composer for ingesting patient and device telemetry data from over 1M IoT devices.

●Re-architected legacy ETL workflows using Apache Beam and GCP (Dataflow, Composer), optimizing batch and stream jobs, which reduced processing costs by 35% and improved pipeline reliability.

●Developed a Snowflake data warehouse connected to Databricks notebooks to run ML models on real-time patient data, enabling clinicians to detect health anomalies 25% earlier and intervene proactively.

●Developed and maintained NoSQL data stores using MongoDB for unstructured device logs and integrated them into centralized analytics dashboards.

●Containerized batch and stream processing applications using Docker and deployed them on GKE (Google Kubernetes Engine) to ensure scalable and fault-tolerant operations.

●Automated data quality checks and alerts using custom Python scripts and Cloud Functions, increasing data integrity and operational efficiency.

●Developed real-time Tableau dashboards for 1M+ IoT data streams, enabling clinicians to track patient vitals live and reduce critical response delays by 20%.

TECHNICAL SKILLS

●Languages & Frameworks: Python, SQL, Scala, Java, PySpark

●Big Data & Processing: Spark, Hadoop, Hive, Beam, Kafka

●Cloud & Tools: AWS (Glue, S3, Lambda, Redshift), GCP (BigQuery, Dataflow, Composer), Azure Data Factory

●Data Warehousing: Snowflake, Redshift, BigQuery, Databricks

●Databases: PostgreSQL, MySQL, MongoDB

●DevOps & Infra: Docker, Jenkins, Terraform, Kubernetes, Git, CI/CD

●ML/AI & Visualization: MLflow, Vertex AI, LangChain, Power BI, Tableau, Data Studio

CERTIFICATIONS

AWS Certified Data Engineer – Associate

EDUCATION

Master's in Computer Science - University of Texas at Arlington 3.8/4.0 Aug ’22 - May ’24

Bachelor's in Information Technology - Gokaraju Rangaraju Institute of Engineering and Technology 8.7/10.0 Jul ’18 - May ’22

Relevant Coursework: Cloud Computing & Big Data, Data Mining, Machine Learning, Database Systems, Distributed Systems, Operating Systems, Software Engineering, Web Data Management, Data Structures & Algorithms.


