AKANKSHA CHENNA
**************@*****.*** 820-***-**** LinkedIn
SUMMARY
●Results-driven Data Engineer with 4 years of experience designing scalable data platforms and cloud-native architectures in banking and healthcare, delivering cost-efficient, secure, and ML-integrated solutions.
●Proven ability to build robust ETL/ELT pipelines and real-time data frameworks using PySpark, SQL, and orchestration tools like Airflow, Glue, and Azure Data Factory.
●Hands-on with AWS, GCP, and Azure, with experience in Snowflake, Databricks, BigQuery, and Redshift for building high-performance data infrastructure.
●Skilled in containerizing workflows with Docker, deploying via CI/CD and Terraform, and integrating GenAI/ML models into pipelines to drive intelligent insights.
●AWS Certified, Agile-savvy, and committed to delivering data solutions that optimize performance, accuracy, and decision-making.
PROFESSIONAL EXPERIENCE
Data Engineer, Citigroup Charlotte, NC Jan ’24 - Present
●Designed and automated ETL pipelines for financial transactions using PySpark and Airflow, integrating data from 5+ sources, which reduced manual data handling by 70% and improved data refresh frequency from weekly to daily.
●Built a centralized data lake using AWS S3, Lambda, Glue, and Redshift, consolidating cross-departmental data, which improved compliance reporting turnaround by 40% and enabled self-serve access for 6 business teams.
●Implemented real-time data pipelines using Kafka and Spark Streaming to monitor suspicious transaction activity, speeding up fraud detection by 40%.
●Optimized SQL queries and partitioning strategies in Redshift, cutting report generation time by 50%.
●Containerized PySpark-based ETL jobs with Docker, implemented CI/CD pipelines via Jenkins, and provisioned infrastructure with Terraform, cutting deployment time by 60% and reducing pipeline failures by 30%.
●Collaborated with ML engineers to integrate predictive models into ETL workflows and leveraged Agile practices to deliver iterative product releases on time.
●Partnered with risk, compliance, and analytics teams to ensure pipelines aligned with regulatory frameworks and data governance standards.
Data Engineer, Philips Healthcare Noida, India Apr ’20 - Jul ’22
●Designed and maintained secure, HIPAA-compliant data pipelines on GCP using Dataflow, BigQuery, and Cloud Composer to ingest patient and device telemetry data from over 1M IoT devices.
●Re-architected legacy ETL workflows using Apache Beam and GCP (Dataflow, Composer), optimizing batch and stream jobs, which reduced processing costs by 35% and improved pipeline reliability.
●Developed a Snowflake data warehouse connected to Databricks notebooks to run ML models on real-time patient data, enabling clinicians to detect health anomalies 25% earlier and intervene proactively.
●Developed and maintained NoSQL data stores using MongoDB for unstructured device logs and integrated them into centralized analytics dashboards.
●Containerized batch and stream processing applications using Docker and deployed them on GKE (Google Kubernetes Engine) to ensure scalable and fault-tolerant operations.
●Automated data quality checks and alerts using custom Python scripts and Cloud Functions, increasing data integrity and operational efficiency.
●Developed real-time Tableau dashboards for 1M+ IoT data streams, enabling clinicians to track patient vitals live and reduce critical response delays by 20%.
TECHNICAL SKILLS
●Languages & Frameworks: Python, SQL, Scala, Java, PySpark
●Big Data & Processing: Spark, Hadoop, Hive, Beam, Kafka
●Cloud & Tools: AWS (Glue, S3, Lambda, Redshift), GCP (BigQuery, Dataflow, Composer), Azure Data Factory
●Data Warehousing: Snowflake, Redshift, BigQuery, Databricks
●Databases: PostgreSQL, MySQL, MongoDB
●DevOps & Infra: Docker, Jenkins, Terraform, Kubernetes, Git, CI/CD
●ML/AI & Visualization: MLflow, Vertex AI, LangChain, Power BI, Tableau, Data Studio
CERTIFICATIONS
AWS Certified Data Engineer – Associate
EDUCATION
Master's in Computer Science - University of Texas at Arlington 3.8/4.0 Aug ’22 - May ’24
Bachelor's in Information Technology - Gokaraju Rangaraju Institute of Engineering and Technology 8.7/10.0 Jul ’18 - May ’22
Relevant Coursework: Cloud Computing & Big Data, Data Mining, Machine Learning, Database Systems, Distributed Systems, Operating Systems, Software Engineering, Web Data Management, Data Structures & Algorithms.