
Data Engineer Analytics

Location:
Hyderabad, Telangana, India
Posted:
September 10, 2025


Rahul Reddy, Certified Data Engineer
New York, NY
+1-469-***-**** | ************@*****.*** | LINKEDIN | CERTIFICATION | GITHUB

SUMMARY

Data Engineer with 5+ years of experience across the data pipeline lifecycle: acquiring and validating large structured and unstructured datasets, building data models, developing reports, and creating visualizations that deliver actionable insights. Leveraged Spark SQL and the Spark APIs for large-scale analytics on distributed datasets, achieving faster processing than traditional MapReduce jobs. Proficient in deploying and monitoring data pipelines with AWS and Azure services and the Big Data ecosystem, ensuring consistent delivery and improved pipeline reliability. Automated data pipeline monitoring with Apache Airflow, reducing manual intervention and improving operational efficiency. Skilled at fostering cross-functional collaboration to drive data-driven decisions and achieve business goals efficiently.

EDUCATION

Southern Arkansas University Dec 2021 – Jan 2023

Master of Science in Computer and Information Science

SKILLS

Technical: Python, Java, SQL and NoSQL databases (MySQL, PostgreSQL, Hive, MongoDB), Spark (PySpark, Spark SQL), Flink, Databricks, HDFS, MapReduce, Kafka, REST APIs.

Cloud & DevOps: AWS (Lambda, Redshift, S3, RDS, CloudWatch, EMR, EC2, Glue, EventBridge, VPC, IAM), Azure (ADLS, Synapse Analytics, ADF, Blob Storage, Cosmos DB), GCP (BigQuery, Dataproc, Dataflow, Cloud Composer, KMS, IAM), Azure DevOps, Apache Airflow, Terraform, Kibana, Jenkins, Maven, GitHub, Linux, CI/CD.

Data Engineering & Analytics: Snowflake, Power BI, DBT, SSIS, Machine Learning, AI/ML, Data Warehousing, Tableau, ETL, Unit Testing, Data Pipelines, Data Lineage, Data Lake, Data Modeling, Data Quality & Observability, Dynatrace.

WORK EXPERIENCE

TRAVELERS INSURANCE – Sr. Data Engineer, Analytics Jan 2023 - Present

Data Acquisition & Storage Framework

● Designed and managed data pipelines using Fivetran, Databricks, Airflow, DBT, and Snowflake.

● Enabled real-time anomaly detection, improving event-driven alerting by 7 bps with AWS Lambda & Kinesis.
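The real-time anomaly detection described above can be sketched as a rolling z-score check of the kind a Lambda consumer might run per Kinesis record; the window size, threshold, and warm-up count here are illustrative assumptions, not the actual production logic:

```python
from collections import deque
from statistics import mean, stdev

def make_anomaly_detector(window=20, threshold=3.0):
    """Return a callable that flags values more than `threshold`
    standard deviations from the rolling mean (illustrative rule)."""
    history = deque(maxlen=window)

    def check(value):
        is_anomaly = False
        if len(history) >= 5:  # need a few points before judging
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                is_anomaly = True
        history.append(value)  # anomalies also enter the window
        return is_anomaly

    return check
```

In a Lambda deployment this kind of check would run inside the handler, emitting an alert event (e.g. to SNS or EventBridge) whenever it returns True.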

● Scaled system capacity, handling 4x growth with Kafka partitioning & Flink checkpointing.
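Key-based partitioning of the kind used to absorb that 4x growth can be sketched with a stand-in partitioner; Kafka's real default uses a murmur2 hash, so the CRC32 scheme below is a simplified assumption that only illustrates the deterministic key-to-partition mapping:

```python
import zlib

def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition deterministically, so all
    events for one key land on the same partition and stay ordered
    (stand-in for Kafka's murmur2-based default partitioner)."""
    return zlib.crc32(key) % num_partitions

# Growing from 8 to 32 partitions spreads the same key space across
# 4x the consumers, but it also remaps keys to new partitions, which
# is why scaling is paired with consumer-group rebalancing (and, on
# the Flink side, checkpointing to recover in-flight state).
```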

● Cut ETL execution time in half by optimizing Databricks, Python, PySpark & Spark SQL workflows.

● Developed interactive monitoring dashboards, boosting engagement by 28% with Python Dash & WebSockets.

● Minimized system downtime, automating infrastructure provisioning with IaC to improve deployment speed by 2x.

● Accelerated CI/CD deployment by 70%, reducing manual overhead with Terraform & Jenkins.

● Built and maintained data quality and governance practices to keep data accurate, secure, and well-documented.
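A minimal version of such data-quality checks, assuming tabular records as dicts (the column names and rules here are hypothetical):

```python
def run_quality_checks(rows, required=("id", "amount"), key="id"):
    """Return counts of failed checks: missing required fields,
    duplicate keys, and negative amounts (rules are illustrative)."""
    failures = {"missing_field": 0, "duplicate_key": 0, "negative_amount": 0}
    seen = set()
    for row in rows:
        if any(row.get(col) is None for col in required):
            failures["missing_field"] += 1
            continue  # skip further checks on incomplete rows
        if row[key] in seen:
            failures["duplicate_key"] += 1
        seen.add(row[key])
        if row["amount"] < 0:
            failures["negative_amount"] += 1
    return failures
```

In practice checks like these run as a pipeline stage (e.g. a DBT test or an Airflow task) that fails the run or raises an alert when any counter is nonzero.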

● Automated data workflows with Airflow, reducing manual work and improving reliability.

● Implemented tracking systems for data and metadata to help teams understand where data comes from and how it’s used.
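Dataset-level lineage tracking of that sort can be sketched as a small upstream-dependency graph; the dataset names below are hypothetical:

```python
from collections import defaultdict

class LineageRegistry:
    """Record which upstream datasets each table is derived from,
    so consumers can trace a metric back to its sources."""

    def __init__(self):
        self.upstream = defaultdict(set)

    def record(self, dataset, sources):
        """Register `sources` as direct upstreams of `dataset`."""
        self.upstream[dataset].update(sources)

    def trace(self, dataset):
        """Return all transitive upstream sources of `dataset`."""
        seen, stack = set(), [dataset]
        while stack:
            for src in self.upstream[stack.pop()]:
                if src not in seen:
                    seen.add(src)
                    stack.append(src)
        return seen
```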

● Helped reduce cloud costs by optimizing infrastructure and storage, saving over $10K per year.

● Delivered data that supported better business decisions, improving insights and increasing team efficiency.

● Reduced processing latency by 40%, optimizing Kafka, Flink & AWS Kinesis for real-time data ingestion.

Data Visualization & Storytelling with Data

● Presented insights to leadership through data-driven storytelling, translating complex dashboard analysis into action items.

● Provided data support to the Marketing, Engagement & Finance teams, building dashboards for compliance, sales, and accounting.

AMAZON - Data Engineer I Aug 2018 - Nov 2021

● Built and optimized end-to-end ETL pipelines using Scala, Spark, and AWS Glue, reducing Prime Sales Core’s data processing time by 10% and enabling faster time-to-insight.

● Leveraged AWS Glue, S3, Redshift, and Snowflake to optimize large-scale analytics workloads with significant improvements in data query performance.

● Maintained scalable data infrastructure capable of handling 3x more volume, using Hadoop, Spark, and Hive.

● Delivered near real-time streaming solutions with Spark and HBase, improving event processing latency by 7 basis points.

● Improved streaming reliability and scalability, supporting high-throughput processing with Kafka and distributed data frameworks.

● Tuned Snowflake warehouse configurations to efficiently handle complex joins and aggregations on large datasets.

● Resolved data discrepancies across sources, improving backend-to-analytics consistency and increasing predictability by 10%.
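Cross-source discrepancy resolution often reduces to reconciling aggregates between systems; a minimal sketch, with the per-day grain and tolerance chosen as assumptions:

```python
def reconcile(backend_totals, analytics_totals, tolerance=0.01):
    """Compare per-day totals from two systems and return the days
    whose relative difference exceeds `tolerance` (illustrative)."""
    mismatches = {}
    for day, expected in backend_totals.items():
        actual = analytics_totals.get(day, 0.0)
        denom = max(abs(expected), 1e-9)  # avoid divide-by-zero
        if abs(actual - expected) / denom > tolerance:
            mismatches[day] = (expected, actual)
    return mismatches
```

Days flagged here would then be investigated for the usual culprits: late-arriving events, timezone boundaries, or double counting in the transformation layer.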

● Established robust data quality checks and monitoring to ensure high integrity across all ingestion and transformation stages.

● Partnered with analytics and product teams to track and improve campaign performance, resulting in a 7% increase in conversion rates for Prime sales.

● Translated data outcomes into actionable business insights that directly influenced marketing, finance, and product strategy.

Data Engineering: Ensuring High Data Quality

● Developed the Engagement team's foundational data model and designed its transactional schema, enabling scalable and fast analysis.

● Resolved discrepancies between backend configuration data and Amplitude, increasing predictability by 10%.
