Cloud Data Engineer - AWS - PySpark ETL Specialist

Location:
London, Greater London, United Kingdom
Posted:
December 12, 2025

Resume:

Gokul Ananth Sasi Kumar

Data Engineer

+44-776******* ******************@*****.*** Dundee, Scotland linkedin.com/in/gokul-ananth/

SUMMARY:

Designed and delivered cloud-based ETL pipelines, PySpark workloads, and automated data quality checks handling multi-source datasets exceeding one million records. Skilled in AWS, SQL, and data modelling, supporting analytics teams with stable, documented, and scalable ingestion workflows aligned to business and governance requirements.

EXPERIENCE:

Data Engineer Cognizant Chennai, India Jan 2022 - Aug 2023

• Built metadata-driven ingestion pipelines in Glue and EMR that auto-handled new providers, formats, and schema changes, reducing onboarding from days to simple configuration updates.

• Developed Delta/Hudi tables with ACID and incremental MERGE operations for multi-million-row datasets, replacing full refreshes with sub-hour upserts and shrinking processing windows.

• Engineered PySpark ETL frameworks with optimized partitioning, caching, and parallel execution, shortening multi-hour pipeline runs and improving production stability.

• Designed AWS ingestion pipelines using DMS, DataSync, and Glue to migrate SQL Server, Oracle, SFTP, and OCI data into S3, boosting throughput and reducing ingestion times by hours.

Software Engineer Internship Full Creative Chennai, India Jun 2021 - Jul 2021

• Deployed Spring Boot applications on Elastic Beanstalk, managing environments and monitoring performance to ensure stable backend services for APIs and user workflows.

• Integrated AWS SES by configuring verified identities, routing rules, and automated triggers, enabling reliable notification and alerting systems across application workflows.

EDUCATION:

M.Sc. Data Science and Engineering University of Dundee Dundee, Scotland 2023 - 2024
Relevant Modules: Machine Learning, Statistical Modelling, Data Visualization, AWS, Big Data Analytics.

B.E. Computer Science Karpagam College Of Engineering Tamil Nadu, India 2017 - 2021
Relevant Modules: Algorithms, Database Management Systems, Software Engineering.

PROJECTS:

End-to-End Retail Customer Analytics Pipeline Development Airflow, dbt, PostgreSQL, Power BI:

• Architected an end-to-end data pipeline using Airflow, PostgreSQL, and dbt to ingest and transform 6+ retail datasets (10k–50k records per run) into curated analytics layers for customer lifecycle insights.

• Built analytics models for RFM segmentation, engagement trends, churn risk scoring, and support performance, producing 8 downstream tables consumed by Power BI dashboards.

End-to-End ETL and Analytics System for Web Log Data Airflow, Snowflake, Power BI:

• Built an end-to-end ETL pipeline processing 50,000+ IIS log records, adding geolocation enrichment, crawler detection, and a 7-dimension star schema to support analytics in Power BI.

• Designed a data warehouse model supporting 15+ analytical reports (traffic trends, file-type usage, errors, geolocation), significantly improving accessibility of log insights for non-technical users.

Time Series Forecasting for Resource Optimization TensorFlow, AutoARIMA:

• Developed multivariate LSTM and AutoARIMA forecasting models using 50,000+ hourly power-consumption records, incorporating lagged and exogenous features to deliver accurate short- and long-term demand predictions.

• Designed a complete forecasting workflow including feature engineering, model tuning, and visual diagnostics, enabling peak-load planning and more efficient energy resource allocation.

SKILLS & ACHIEVEMENTS:

Languages & Frameworks: Python, SQL, PySpark, Java
Cloud & Platforms: AWS (S3, Glue, EMR, Lambda, Athena, EC2, RDS), Snowflake
Data Engineering: Airflow, dbt, Docker, CI/CD, ETL/ELT pipelines, Data Modelling, Data Quality, Monitoring & Alerting
Databases: PostgreSQL, MySQL, MongoDB
Analytics & ML: Power BI, Pandas, Scikit-learn, TensorFlow
Achievements: SQL & Python Gold Badges (HackerRank); Led coding event with 50+ participants


