Tharun Lingampally
Dallas, TX ********************@*****.*** +1-469-***-**** LinkedIn
Summary
Experienced Data Engineer with over 5 years of building and managing data systems on AWS, Azure, and GCP. Skilled in creating reliable data pipelines using Spark, Kafka, and Airflow. Strong background in SQL, Python, and cloud data tools like Snowflake and BigQuery. Proven ability to work with large data sets, improve performance, and support business teams with clean, well-organized data.
Skills
Programming Languages: Python, Java, Scala, SQL, Shell Scripting, PySpark
Big Data Technologies: Apache Spark, Apache Kafka, Hadoop, Hive, Amazon EMR, Apache Flink, HDFS, Sqoop, Airflow, HBase, Presto, Zookeeper
ETL & Data Warehousing: AWS Glue, Informatica, Apache NiFi, Talend, SSIS, Snowflake, Redshift
Cloud Platforms: AWS (S3, EC2, EMR, Lambda, RDS), Azure (Data Factory, Synapse), GCP (BigQuery, Dataflow)
Databases: PostgreSQL, MySQL, Oracle, MongoDB, DynamoDB, Cassandra
DevOps & CI/CD: Git, GitHub, Jenkins, Docker, Kubernetes, Terraform
Workflow Orchestration: Apache Airflow, Oozie, Luigi, Control-M
Monitoring & Logging: Grafana, Prometheus, Splunk, CloudWatch
Testing & Data Validation: Great Expectations, PyTest, Unit Testing, Postman
Data Modeling & Architecture: Star Schema, Snowflake Schema, Dimensional Modeling, Data Lakes, Data Mesh
Experience
Capital One, TX Jul 2024 - Current
Data Engineer
Delivered high-volume ETL pipelines using Apache Spark, Apache Kafka, and AWS Glue to process terabytes of transactional data daily.
Engineered robust data lakes and dimensional models with Snowflake Schema in AWS S3, improving data accessibility and reducing query latency by 30%.
Automated workflows with Apache Airflow, enabling real-time monitoring via Grafana, CloudWatch, and Prometheus, achieving 99.9% SLA compliance.
Employed Terraform, Docker, and Kubernetes for CI/CD deployment, integrating seamlessly with Jenkins and GitHub Actions.
Developed test cases using Great Expectations, PyTest, and Postman, enhancing data integrity and unit test coverage.
Championed Agile Methodology, contributing to sprint planning and cross-functional collaboration with exceptional analytical thinking, problem-solving, and attention to detail.
Fidelity Investments, TX Jan 2022 - Jul 2024
Data Engineer
Designed and maintained real-time data streaming pipelines using Apache Flink, Kafka, and HBase for financial market feeds, reducing data delivery lag by 40%.
Implemented Star Schema-based data marts in Redshift and Snowflake, optimizing reporting performance for 200+ analysts.
Built enterprise-grade ETL workflows with Apache NiFi and Informatica, increasing data ingestion reliability by 35%.
Led unit and integration testing using PyTest, Postman, and Great Expectations, ensuring 100% compliance with internal quality standards.
Collaborated with DevOps on CI/CD pipelines via Docker, GitLab, and Jenkins, deploying infrastructure through Terraform.
Maintained excellent time management, clear communication, and team collaboration while leading peer reviews and documentation efforts.
DSIG, India Jun 2021 - Dec 2021
Data Engineer
Built scalable batch data pipelines using Hadoop, Hive, and Sqoop, supporting migration from legacy ETL tools and reducing job execution time by 28%.
Automated and orchestrated data workflows using Apache Oozie, Control-M, and Azure Data Factory, enabling consistent cross-team reporting.
Designed and optimized ETL processes in Python, SQL, and Shell scripting, integrating with Azure Blob Storage, Azure Synapse, Oracle, and Cassandra.
Leveraged Azure Monitor, CloudWatch, and Splunk to track system health, proactively addressing latency and data delivery issues.
Benchmarked performance of Apache Spark and Presto jobs on Azure HDInsight, significantly reducing compute resource costs and improving scalability.
Known for delivering under pressure, collaborating across teams, and applying analytical thinking, adaptability, and clear communication to high-impact projects.
Certifications
AWS Certified Cloud Practitioner
Azure Data Engineer Associate
Education
Master of Science, Business Analytics, Dec 2023
The University of Texas, TX, USA
Bachelor's in Mechanical Engineering, May 2021
Mahindra University, India