
Data Engineer Senior

Location:
Atlanta, GA
Posted:
May 21, 2025


Resume:

Venkatarinumounika Kandula
Senior Data Engineer
Alpharetta | *************@*****.*** | +1-551-***-****

Professional Summary

• Experienced Data Engineer with over *.* years of expertise in designing and building scalable data pipelines using Hadoop, Spark, Hive, and AWS.

• Proven ability to optimize big data workflows, implement robust ETL pipelines, and convert complex datasets into actionable business insights.

• Skilled in modern data platforms such as Snowflake, AWS Glue, and EMR, as well as orchestration tools including Apache Airflow and Oozie.

• Highly adaptable and collaborative, with a strong performance record across retail, insurance, and manufacturing domains.

Technical Skills

• Big Data Ecosystem: Hadoop, Hive, Pig, Sqoop, Oozie

• Data Processing: Apache Spark (RDD/DataFrame), MapReduce

• Cloud Technologies: AWS (EMR, S3, Glue, Lambda), Snowflake

• Workflow Orchestration: Apache Airflow, Oozie

• File Formats: ORC, Parquet, Avro, JSON, TextFile

• Databases: Snowflake, MySQL, PostgreSQL, Oracle

• Programming Languages: Python, SQL

• Tools & Scripting: Git, Jenkins, Unix Shell Scripting

• Operating Systems: Linux, Windows

• Other Skills: Hive Performance Tuning (Partitioning, Bucketing, Vectorization), Data Ingestion (HDFS, Local FS), Data Quality Validation

Work Experience

Company: Target India (Apr 2022 – Mar 2023)

Current Designation: Senior Data Engineer

Description: Target Corporation is an American retail company widely known for its owned-brand products. Worked on the eDW (Enterprise Data Warehouse) project, an enterprise data warehouse for all the digital transactions that happen through target.com. Handled projects such as Fixed Assets and Import Claims using an internal framework (Kelsa) built on Spark and Scala.

Roles & Responsibilities:

• Designed and developed data ingestion and processing pipelines on AWS, leveraging services such as AWS Glue, S3, and EMR for scalability and performance.

• Migrated and optimized critical datasets for the Item, Demand, Sales, and Inventory modules to Snowflake, resulting in a 40–60% improvement in query performance and reduced operational cost.

• Wrote complex SQL and Python scripts to perform transformations in Snowflake and handled data orchestration using Apache Airflow.

• Collaborated with Product Owners to gather and translate business requirements into scalable data solutions.

• Resolved data quality issues through validation checks and anomaly detection logic to ensure data accuracy and integrity.

• Generated daily reports in Snowflake to support business trend analysis and strategic decision-making.

• Scheduled and managed ETL workflows using Airflow, ensuring smooth execution and monitoring of data jobs (a minimal sketch of such a DAG follows this list).
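For illustration only: a minimal Airflow DAG of the kind that could schedule a daily Snowflake job like the one described above. The DAG ID, task name, and placeholder task body are assumptions for the sketch, not the actual project code.

# Minimal, hypothetical Airflow DAG sketching a daily Snowflake load.
# DAG/task names and the task body are illustrative assumptions.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def load_daily_sales(**context):
    # Placeholder body: a real task would run the day's
    # transformation SQL in Snowflake (e.g. via the
    # snowflake-connector-python library).
    ds = context["ds"]  # Airflow's built-in execution-date string
    print(f"Would run the Snowflake transformation for {ds}")

default_args = {
    "owner": "data-engineering",
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="daily_sales_to_snowflake",  # hypothetical name
    default_args=default_args,
    start_date=datetime(2022, 4, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="load_daily_sales",
        python_callable=load_daily_sales,
    )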

Company: Altimetrik India Pvt Ltd (Jun 2019 – Mar 2021)

Designation: Staff Engineer - Data Engineering and Analytics

Client: Ford

Roles & Responsibilities:

• Developed Hive queries to parse raw input data, transforming it into refined datasets ready for analysis.

• Designed and created external and managed Hive tables, utilizing partitioning for improved query performance and data management (a minimal sketch follows this list).

• Built views and ad-hoc queries to address evolving business requirements for reporting and analysis.

• Implemented country-specific access controls using Apache Ranger, ensuring secure data access based on geographical data requirements.

• Loaded both historical and incremental data into Hive, maintaining data integrity and lifecycle management across multiple data sets.

• Applied query optimization techniques like bucketing, partitioning, and vectorization to improve query performance and reduce processing times.

• Worked with various file formats such as TextFile, ORC, and JSON, enabling flexible storage and high-performance processing.

• Leveraged Sqoop to ingest data from MySQL to HDFS, streamlining the data integration process.

• Automated data loading from HDFS and local file systems into Hive, reducing manual intervention and improving operational efficiency.

• Performed unit testing and provided production support on the Hortonworks platform, ensuring smooth data operations.

• Developed shell scripts for job automation and deployments, enhancing workflow management.

• Scheduled and monitored ETL jobs using Oozie workflows, ensuring reliable job execution and error handling.

• Led code reviews, ensuring adherence to best practices and team-wide coding standards, fostering high-quality code development.
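As a rough sketch of the partitioned-table work above: a minimal PySpark job that creates an external, partitioned Hive table and appends one day of incremental data. The database, table, column, and path names are all hypothetical.

# Minimal PySpark sketch: external partitioned Hive table plus an
# incremental load. All names and paths are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hive_partitioned_load")
    .enableHiveSupport()
    .getOrCreate()
)

# Let inserts create partitions on the fly.
spark.sql("SET hive.exec.dynamic.partition=true")
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

# External table: data lives at an HDFS path managed outside Hive,
# partitioned by load date so queries can prune partitions.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS analytics.vehicle_events (
        vin        STRING,
        event_type STRING,
        payload    STRING
    )
    PARTITIONED BY (load_date STRING)
    STORED AS ORC
    LOCATION 'hdfs:///data/analytics/vehicle_events'
""")

# Incremental load: append one day's worth of staged rows; the
# trailing load_date column feeds the dynamic partition.
staged = (
    spark.table("staging.vehicle_events_raw")
         .where("load_date = '2020-06-01'")
)
staged.write.mode("append").insertInto("analytics.vehicle_events")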

Company: Infosys (Apr 2018 – May 2019)

Designation: Technology Analyst

Client: Manulife

Roles & Responsibilities:

• Collaborated with business partners to gather and understand business requirements, ensuring that the data solution aligned with their needs and objectives.

• Actively participated in HDP (Hortonworks Data Platform) version upgrades, making necessary code changes to ensure compatibility with the latest platform version.

• Engaged in performance tuning for Hive queries, including optimizing data processing using partitioning and bucketing and improving join queries for better execution times.

• Designed and created external and managed Hive tables, processing raw data into staging tables before moving it to the master tables.

• Managed historical data processing in Hive tables, especially when dealing with schema changes that impacted data structures.

• Developed and executed Hive queries for incremental and historical data processing, ensuring that both real-time and past data could be effectively handled.

• Created views in Hive and implemented Ranger scripts for access control, enabling fine-grained control over user permissions based on roles and security policies.

• Worked on data deduplication in Hive tables, identifying and removing duplicates to maintain clean and accurate datasets (a minimal sketch follows this list).

• Designed and populated lookup tables and loaded data into master tables according to specific business logic and requirements.

• Developed Pig scripts to extract data from various file formats and load it to HDFS, expanding the versatility of the data processing pipeline.

• Created test files for new projects to facilitate unit testing, ensuring the integrity and functionality of the data pipeline.

• Conducted unit testing and prepared corresponding test documentation to validate data accuracy and functionality.

• Led code reviews, ensuring that the team adhered to coding standards and best practices for consistency and maintainability.

• Prepared a release checklist document to ensure all aspects of the release process were covered, reducing the risk of errors in production.

• Developed shell scripts to automate the deployment process and monitored job execution, ensuring smooth operation of ETL jobs and data pipelines.
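A minimal PySpark sketch of the deduplication pattern mentioned above, keeping the most recent row per business key with a window function; the table and column names are assumptions, not the client's schema.

# Dedup sketch: keep the latest row per policy_id.
# Table and column names are hypothetical.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("hive_dedup")
    .enableHiveSupport()
    .getOrCreate()
)

policies = spark.table("staging.policies_raw")

# Rank duplicates by recency within each policy_id; keep rank 1.
w = Window.partitionBy("policy_id").orderBy(F.col("updated_at").desc())
deduped = (
    policies
    .withColumn("rn", F.row_number().over(w))
    .where(F.col("rn") == 1)
    .drop("rn")
)

deduped.write.mode("overwrite").saveAsTable("curated.policies")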

Company: Cognizant Technology Solutions (Nov 2013 – Mar 2018)

Designation: Associate

Client: Walmart

Roles & Responsibilities:

• Collaborated with business partners to thoroughly discuss and understand the business requirements, ensuring the technical solutions met their expectations.

• Developed COBOL, DB2, and JCL code to implement store number enhancement changes, ensuring the existing code could handle these new business requirements effectively.

• Enhanced existing code to support store number changes, ensuring smooth integration with legacy systems and consistent data flow.

• Worked on migrating data and logic from IMS to DB2, performing extensive testing to ensure data accuracy and process integrity.

• Managed the migration of code from DB2 to Teradata, ensuring smooth transitions and performing the unit and integration testing necessary to validate the migration's success.

• Participated in performance tuning, optimizing Hive queries using partitioning, bucketing, and improved join queries to boost performance in large-scale data processing jobs (a small sketch follows this list).

• Designed and created external and managed tables in Hive, systematically loading data into staging tables before moving it to the master tables.

• Conducted unit testing of new code and prepared test documentation, ensuring functionality and business-logic conformance.

• Led code reviews, providing feedback to ensure adherence to coding standards and best practices within the team.

• Collaborated on preparing CRQ (Change Request) documentation as part of release management, ensuring the proper execution of changes within production environments.

• Developed and maintained a release checklist document, streamlining the release process and minimizing risk during deployment.
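As a small sketch of the bucketing technique noted above: promoting validated staging data to a master table clustered on the join key, using Spark's native bucketing as a stand-in for Hive's CLUSTERED BY tables. All names are hypothetical.

# Staging-to-master load with bucketing on the join key.
# Database/table/column names are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("staged_master_load")
    .enableHiveSupport()
    .getOrCreate()
)

# Read the validated staging data.
staged = spark.table("retail.store_sales_staging")

# Cluster the master table on store_nbr: co-locating rows on the
# join key lets later store-level joins avoid a full shuffle.
(
    staged.write
    .bucketBy(32, "store_nbr")
    .sortBy("store_nbr")
    .mode("overwrite")
    .saveAsTable("retail.store_sales_master")
)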

Education

Discipline: B.Tech, Electronics and Communications, with 85%
University: JNTUA, 2009 – 2013

Discipline: M.S., Information Technology, with CGPA 4
University: UC, Kentucky, 2021 – 2022


