Senior Data Engineer with 6+ Years of Cloud Data Platforms

Location:

Atlanta, GA

Salary:

80000

Posted:

February 26, 2026


Resume:

PRATIBHA VISHWAKARMA

Data Engineer

Phone: 912-***-****

Email: ********.*****@*****.***

Linkedin.com/in/pratibhavishwakarma

SUMMARY

Senior Data Engineer with 6+ years of experience designing and scaling distributed data systems and cloud data platforms across retail and financial services. Expertise in building high-performance batch and streaming pipelines using SQL, Python, PySpark, Java, and AWS (S3, Redshift, Glue, EMR). Delivered lakehouse architectures supporting 80+ production dashboards with 99% data accuracy, and drove large-scale migrations, governance, and CI/CD automation in fast-paced, cross-functional environments.

PROFESSIONAL EXPERIENCE

Y-STEM & Chess, United States, July 2025 – Present

Senior Data Engineer

● Designed and led cloud-native data pipelines on AWS (S3, Redshift) to support analytics and user engagement insights.

● Built a lakehouse architecture using Apache Iceberg and Redshift, improving storage efficiency and reducing infrastructure costs by 20% while supporting scalable reporting.

● Implemented Kafka and Spark Structured Streaming pipelines to capture and process user activity data for analytics and reporting.

● Developed batch ingestion and transformation pipelines using PySpark and SQL, reducing manual reporting effort by 40% and improving data processing efficiency.

● Engineered small Java-based utility components to support ingestion workflows, data validations, and pipeline orchestration tasks.

● Drove data governance initiatives, including PII masking, encryption, and role-based access control (RBAC).

● Led code reviews, defining engineering standards and mentoring junior engineers to improve pipeline reliability and delivery speed.

Accenture Solutions Pvt. Ltd., Remote, Aug 2018 – Dec 2023

Data Engineering Senior Analyst

Global Retail Data Platform & Visualization (Multi-Continent Client)

● Designed and maintained 100+ end-to-end data ingestion pipelines using PySpark and SQL on AWS (S3, Redshift) to process large-scale, multi-continent datasets.

● Engineered Spark jobs in PySpark with supporting Java components for handling complex transformations and distributed processing.

● Led two major platform migrations, first from Talend to AWS and later to the client’s AWS environment, improving scalability, performance, and cost efficiency by 25%.

● Automated complex ETL workflows using AWS Step Functions and DynamoDB, ensuring SLA compliance and consistent delivery of analytics-ready data, reducing manual intervention by 40%.

● Implemented incremental loading strategies, data validation rules, and error-handling mechanisms, achieving 99% data accuracy.

● Optimized Spark transformation logic and partition strategies, reducing processing latency by 30% and improving distributed workload efficiency.

● Developed and maintained 80+ BI dashboards using Power BI, Qlik Sense, and QlikView to support sales, supply chain, and customer analytics.

● Delivered standardized KPIs and governed dashboards with role-based access control and column-level masking for secure data sharing.

● Led two Agile teams (10+ engineers), driving engineering standards, Spark optimization practices, and cross-team collaboration across multi-continent data programs.

Financial Data Platform Modernization (North America Client)

● Built scalable AWS Glue and Spark-based data pipelines using S3 and Python for high-volume financial datasets, improving ingestion efficiency.

● Assisted in developing and maintaining Java utilities for data ingestion control, logging, and exception handling in distributed pipelines.

● Designed database schemas, table structures, and partition strategies to improve ingestion efficiency and downstream query performance by 20%.

● Developed reusable shell and Python automation scripts to streamline ETL execution and reduce operational overhead.

● Collaborated with data architects, auditors, and business analysts to ensure regulatory, security, and audit compliance.

● Established CI/CD pipelines in Azure DevOps, enabling version control, automated testing, and predictable release cycles, reducing deployment issues by 30%.

● Awarded the Accenture Celebrates Excellence (ACE) Award for consistent and resilient project execution.

SKILLS

Programming & Scripting: SQL, Python, Java, Shell Scripting, Scala

Data Engineering & Architecture: ETL Pipeline Development, Data Ingestion, Data Modeling, Data Warehousing, Data Lakehouse

Cloud Platforms: AWS - S3, Redshift, Glue, EMR, Step Functions, DynamoDB

Big Data & Processing: Apache Spark, PySpark, Hadoop, Hive

BI & Visualization & Analytics: Power BI, Qlik Sense, QlikView, Tableau

Data Quality & Governance: Metadata Management, Data Lineage, Audit Controls, Role-Based Access Control (RBAC)

DevOps & Workflow: GitHub, Azure Repos, CI/CD, JIRA, ServiceNow, Agile/Scrum, SDLC

EDUCATION

M.A. in Interactive Design

Savannah College of Art and Design (SCAD), GA, USA, Jan 2024 – May 2025

Bachelor of Technology in Mechanical Engineering

Veer Surendra Sai University of Technology, India, Sept 2014 – Jun 2018

CERTIFICATION

● AWS Certified Solutions Architect – Associate

● MySQL HeatWave Implementation Certified Associate (Rel 1)

● Oracle Cloud Infrastructure 2025 Certified Foundations Associate

● Oracle Cloud Infrastructure 2025 Certified AI Foundations Associate
