PRATIBHA VISHWAKARMA
Data Engineer
Phone: 912-***-**** Email: ********.*****@*****.*** Linkedin.com/in/pratibhavishwakarma
SUMMARY
Senior Data Engineer with 6+ years of experience designing and scaling distributed data systems and cloud data platforms across retail and financial services. Expertise in building high-performance batch and streaming pipelines using SQL, Python, PySpark, Java, and AWS (S3, Redshift, Glue, EMR). Delivered lakehouse architectures supporting 80+ production dashboards with 99% data accuracy, and drove large-scale migrations, governance, and CI/CD automation in fast-paced, cross-functional environments.
PROFESSIONAL EXPERIENCE
Y-STEM & Chess, United States, July 2025 – Present
Senior Data Engineer
● Designed and led cloud-native data pipelines on AWS (S3, Redshift) to support analytics and user engagement insights.
● Built lakehouse architecture using Apache Iceberg and Redshift, improving storage efficiency and reducing infrastructure costs by 20% while supporting scalable reporting.
● Implemented Kafka and Spark Structured Streaming pipelines to capture and process user activity data for analytics and reporting.
● Developed batch ingestion and transformation pipelines using PySpark and SQL, reducing manual reporting effort by 40% and improving data processing efficiency.
● Engineered small Java-based utility components to support ingestion workflows, data validations, and pipeline orchestration tasks.
● Drove data governance initiatives, including PII masking, encryption, and role-based access control (RBAC).
● Led code reviews, defining engineering standards and mentoring junior engineers to improve pipeline reliability and delivery speed.
Accenture Solutions Pvt. Ltd., Remote, Aug 2018 – Dec 2023
Data Engineering Senior Analyst
Global Retail Data Platform & Visualization (Multi-Continent Client)
● Designed and maintained 100+ end-to-end data ingestion pipelines using PySpark and SQL on AWS (S3, Redshift) to process large-scale, multi-continent datasets.
● Engineered Spark jobs in PySpark with supporting Java components for handling complex transformations and distributed processing.
● Led two major platform migrations from Talend to AWS and later to the client’s AWS environment, improving scalability, performance, and cost efficiency by 25%.
● Automated complex ETL workflows using AWS Step Functions and DynamoDB, ensuring SLA compliance and consistent delivery of analytics-ready data, reducing manual intervention by 40%.
● Implemented incremental loading strategies, data validation rules, and error-handling mechanisms, achieving 99% data accuracy.
● Optimized Spark transformation logic and partition strategies, reducing processing latency by 30% and improving distributed workload efficiency.
● Developed and maintained 80+ BI dashboards using Power BI, Qlik Sense, and QlikView to support sales, supply chain, and customer analytics.
● Delivered standardized KPIs and governed dashboards with role-based access control and column-level masking for secure data sharing.
● Led two Agile teams (10+ engineers), driving engineering standards, Spark optimization practices, and cross-team collaboration across multi-continent data programs.
Financial Data Platform Modernization (North America Client)
● Built scalable AWS Glue and Spark-based data pipelines using S3 and Python for high-volume financial datasets, improving ingestion efficiency.
● Assisted in developing and maintaining Java utilities for data ingestion control, logging, and exception handling in distributed pipelines.
● Designed database schemas, table structures, and partition strategies to improve ingestion efficiency and downstream query performance by 20%.
● Developed reusable shell and Python automation scripts to streamline ETL execution and reduce operational overhead.
● Collaborated with data architects, auditors, and business analysts to ensure regulatory, security, and audit compliance.
● Established CI/CD pipelines in Azure DevOps, enabling version control, automated testing, and predictable release cycles, reducing deployment issues by 30%.
● Awarded the Accenture Celebrates Excellence (ACE) Award for consistent and resilient project execution.
SKILLS
Programming & Scripting: SQL, Python, Java, Shell Scripting, Scala
Data Engineering & Architecture: ETL Pipeline Development, Data Ingestion, Data Modeling, Data Warehousing, Data Lakehouse
Cloud Platforms: AWS (S3, Redshift, Glue, EMR, Step Functions, DynamoDB)
Big Data & Processing: Apache Spark, PySpark, Hadoop, Hive
BI, Visualization & Analytics: Power BI, Qlik Sense, QlikView, Tableau
Data Quality & Governance: Metadata Management, Data Lineage, Audit Controls, Role-Based Access Control (RBAC)
DevOps & Workflow: GitHub, Azure Repos, CI/CD, JIRA, ServiceNow, Agile/Scrum, SDLC
EDUCATION
M.A. in Interactive Design
Savannah College of Art and Design (SCAD), GA, USA, Jan 2024 – May 2025
Bachelor of Technology in Mechanical Engineering
Veer Surendra Sai University of Technology, India, Sept 2014 – Jun 2018
CERTIFICATION
● AWS Certified Solutions Architect – Associate
● MySQL HeatWave Implementation Certified Associate (Rel 1)
● Oracle Cloud Infrastructure 2025 Certified Foundations Associate
● Oracle Cloud Infrastructure 2025 Certified AI Foundations Associate