
Data Engineer with Cloud & ETL Expertise

Location: Franklin Park, NJ
Salary: 70,000
Posted: November 19, 2025


Resume:

MANIKANTA REDDY

DATA ENGINEER

*************@*****.*** +1-848-***-**** LinkedIn GitHub

PROFESSIONAL SUMMARY

Results-driven Data Engineer with 3+ years of experience in designing, developing, and deploying data pipelines, ETL workflows, and cloud-based data platforms across AWS, Azure, and GCP environments. Proficient in SQL, Python, and PySpark for large-scale data processing, data modeling, and automation. Skilled in building reliable data orchestration frameworks (Airflow, dbt) and optimizing data warehouses (Snowflake, Redshift, Synapse) to support analytics and machine learning use cases. Recognized for improving pipeline efficiency by up to 40%, ensuring data accuracy, and delivering scalable, production-ready solutions in Agile environments.

SKILLS

Programming & Scripting: Python, SQL, PySpark, Bash, Shell Scripting

ETL & Data Orchestration: Apache Airflow, dbt, Talend, Informatica Cloud, AWS Glue, Azure Data Factory

Databases & Warehousing: Snowflake, Amazon Redshift, PostgreSQL, MySQL, Google BigQuery, MS SQL Server

Cloud & DevOps Tools: AWS (S3, Lambda, Glue, Redshift, EC2, IAM), Azure (Data Factory, Synapse, Databricks), GCP (BigQuery, Dataflow), Docker, CI/CD

Data Modeling & Integration: Dimensional Modeling, Star/Snowflake Schema, Data Transformation, API Integration, JSON/Parquet/CSV Handling

Version Control & Collaboration: Git, GitHub, GitLab, Jira, Confluence, Trello, Slack

Soft Skills: Analytical Thinking, Problem-Solving, Team Collaboration, Communication, Attention to Detail, Time Management

PROFESSIONAL EXPERIENCE

DATA ENGINEER | INSIGHT ENTERPRISES | Feb 2025 – Present

Designed and automated ETL pipelines using AWS Glue and Apache Airflow, processing 2TB+ structured and semi-structured data daily, enabling high-performance analytics and reporting.
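For illustration, a minimal Airflow DAG sketch of the orchestration pattern this bullet describes; the DAG id, task names, and callables are hypothetical stand-ins, not the production pipeline:

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Hypothetical callables standing in for the real extract and validation steps.
def run_glue_extract():
    pass  # placeholder: would start an AWS Glue job and poll for completion

def validate_output():
    pass  # placeholder: would run post-load data checks

with DAG(
    dag_id="daily_etl_sketch",  # assumed name, for illustration only
    start_date=datetime(2025, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="run_glue_extract", python_callable=run_glue_extract)
    validate = PythonOperator(task_id="validate_output", python_callable=validate_output)
    extract >> validate  # validation runs only after extraction succeeds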

Engineered optimized data models in Snowflake and implemented advanced SQL transformations, reducing query execution time by 35% and improving dashboard performance.

Developed Python-based data validation scripts to ensure 99% data accuracy across ingestion and transformation stages.
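A minimal sketch of the row-level validation such scripts perform, assuming hypothetical field names and rules:

def validate_records(records):
    """Split rows into valid and rejected sets; the checks are illustrative."""
    valid, rejected = [], []
    for row in records:
        has_id = row.get("id") is not None                      # hypothetical required field
        amount_ok = isinstance(row.get("amount"), (int, float)) # hypothetical type check
        (valid if has_id and amount_ok else rejected).append(row)
    return valid, rejected

# Example: two of three hypothetical rows pass validation.
good, bad = validate_records([
    {"id": 1, "amount": 9.5},
    {"id": None, "amount": 2.0},
    {"id": 2, "amount": 3},
])
assert len(good) == 2 and len(bad) == 1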

Enhanced data orchestration within multi-layer AWS architectures, eliminating redundancies and improving task scheduling efficiency.

Collaborated with cross-functional Agile teams using Jira and Confluence for sprint planning, backlog prioritization, and technical documentation.

Created reusable dbt macros and models to standardize transformations and improve pipeline maintainability.

BIG DATA ENGINEER | MPHASIS | Jul 2021 – Jul 2023

Built and managed large-scale distributed data pipelines using Apache Spark and Talend, handling over 5TB of data daily across hybrid environments.
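A minimal PySpark sketch of the kind of distributed aggregation such pipelines run; the S3 paths and column names are assumptions for illustration only:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("spark_etl_sketch").getOrCreate()

# Hypothetical input location and columns; Parquet keeps multi-TB scans efficient.
events = spark.read.parquet("s3://example-bucket/events/")

daily_counts = (
    events
    .filter(F.col("event_date") == "2023-01-01")  # assumed partition column
    .groupBy("customer_id")
    .agg(F.count("*").alias("event_count"))
)

daily_counts.write.mode("overwrite").parquet("s3://example-bucket/aggregates/")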

Developed ETL workflows using dbt and Airflow for schema validation, data lineage, and automated dependency resolution.

Migrated on-prem workloads to Azure Synapse and Azure Data Factory, achieving zero data loss and improving data refresh time by 30%.

Implemented index optimization and query tuning in PostgreSQL and SQL Server, reducing query latency by 45% and optimizing resource utilization.

Integrated RESTful APIs and Parquet/JSON data feeds to build analytics-ready data sets with complete schema traceability.

Deployed automated monitoring and failure recovery mechanisms in Azure pipelines, reducing manual interventions by 40%.

Collaborated with data scientists and BI teams to align data structures with analytical requirements, ensuring seamless integration with Power BI and Tableau dashboards.

Utilized Git and GitHub Actions for version control, ensuring code integrity, peer review, and CI/CD deployment consistency.

DATA ENGINEER INTERN | MINDTREE | Jul 2020 – Jun 2021

Designed and optimized ETL workflows in Informatica Cloud and Google Cloud Dataflow to automate multi-source data movement with minimal errors.

Managed MySQL and Redshift databases, performing data quality audits and batch ingestion optimizations, increasing reporting accuracy by 20%.

Developed PySpark and Shell scripts that improved ETL runtime efficiency by 30% on pilot projects.

Built API integration workflows handling 500K+ daily JSON/CSV records with full validation, error logging, and automated retries.
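A minimal sketch of the fetch-validate-retry pattern this bullet names, using the requests library; the URL, schema check, and retry parameters are hypothetical:

import logging
import time

import requests

def fetch_with_retries(url, attempts=3, backoff_seconds=2.0):
    """Fetch a JSON feed with validation, error logging, and automated retries."""
    for attempt in range(1, attempts + 1):
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()
            payload = response.json()             # raises ValueError on malformed JSON
            if "records" not in payload:          # hypothetical schema check
                raise ValueError("payload missing 'records' key")
            return payload
        except (requests.RequestException, ValueError) as exc:
            logging.error("fetch attempt %d failed: %s", attempt, exc)
            if attempt == attempts:
                raise                             # surface the failure after the last retry
            time.sleep(backoff_seconds * attempt) # linear backoff between retries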

Contributed to Agile sprint cycles via Trello, Slack, and GitLab, enhancing project delivery predictability and team collaboration.

EDUCATION

New Jersey Institute of Technology, New Jersey, USA – Master's, Computer Science

Amrita Vishwa Vidyapeetham, Coimbatore, India – Bachelor's, Computer Science

CERTIFICATIONS

AWS Certified Solutions Architect – Associate (SAA-C03)

AWS Certified Cloud Practitioner (CLF-C02)

PROJECTS

ArchAI GitHub (Feb 2025 – Mar 2025): Developed an AI-driven tool that converts plain-English descriptions into AWS architecture diagrams, reducing design time by 50%, with secure pre-signed URL access and Docker-containerized Lambda functions for optimized AWS resource usage.
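The pre-signed URL access mentioned above follows the standard boto3 pattern sketched here; the bucket and key are placeholders, not the project's actual resources:

import boto3

s3 = boto3.client("s3")

# Generate a time-limited download link for a generated diagram.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-diagrams-bucket", "Key": "output/diagram.png"},
    ExpiresIn=3600,  # link expires after one hour
)
print(url)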

Paint Rental Application GitHub (Sept 2020 – Dec 2020): Built an e-commerce platform for renting and purchasing paintings using ReactJS and ExpressJS, integrated with MySQL holding 10,000+ records, improving data retrieval by 20%; a 98% API test pass rate ensured reliable client-server communication.


