Vamci Reddy
Email: **********@*****.***
Mobile: +1-213-***-****
Sr. Data Engineer
PROFESSIONAL SUMMARY:
Demonstrated expertise in analytical thinking and problem-solving, connecting the dots across applications to understand end-to-end views and presenting effectively to both technical and non-technical audiences; strong written and verbal communication skills.
Proficient with query tools and strong in PL/SQL, writing and analyzing complex queries and stored procedures with close attention to detail in data analysis and project execution.
Experienced with the Microsoft Office suite and Oracle Exadata and Oracle 10g and above; adept at identifying priorities and managing multiple projects simultaneously with minimal supervision in a team environment.
Hands-on experience in Agile/Scrum teams, prioritizing work and resource assignments; comfortable asking questions and reaching out for assistance when needed.
A team player who can influence and guide a team to success, experienced in effort and financial estimation.
Solid understanding of star and snowflake schema modeling to support analytics workloads and ensure optimal database performance in cloud data warehouses.
Experience managing version control and CI/CD for data pipelines using Azure DevOps and GitHub Actions to support seamless development and deployment.
Used Python, SQL, and Scala to build custom transformation logic for business domains including logistics, healthcare, and retail.
Developed scalable PySpark ETL jobs for batch processing of high-volume data files in distributed cloud environments.
Created reusable, modular ADF pipelines and Glue jobs, reducing development time and improving maintainability across data engineering workflows.
Implemented data quality checks and validation frameworks in SQL and Python to ensure high accuracy and reliability of ingested datasets.
Automated provisioning of data services with infrastructure as code (IaC) using Terraform and ARM templates within enterprise Azure environments.
Designed role-based access and Key Vault integration in Azure to enforce secure handling of secrets, credentials, and sensitive pipeline configurations.
Applied advanced tuning strategies to Spark clusters and SQL queries, reducing overall job execution time.
Integrated structured and unstructured data formats (CSV, Parquet, JSON, Avro) into centralized data lakes for efficient downstream processing.
Designed automated monitoring and alerting for production ETL jobs using Azure Monitor and CloudWatch to maintain high availability.
Worked extensively with Power BI and Tableau to create real-time dashboards reflecting operational KPIs and supporting strategic insights.
Experience working within Agile/Scrum frameworks; collaborated cross-functionally with product owners, data scientists, and QA on feature releases.
Developed unit and integration test cases for Spark and ADF pipelines, ensuring code quality and minimizing data integrity issues during deployments.
Participated in cloud cost optimization initiatives by right-sizing compute resources and implementing job-level metering and scheduling policies.
Created job orchestration logic using Azure Logic Apps and AWS Step Functions to streamline and modularize data workflows.
Mentored junior data engineers and led code review sessions, promoting best practices in coding standards, data governance, and pipeline development.
Built a framework to auto-load metadata into data catalogs (Purview/Glue), enabling discoverability and governance compliance in multi-tenant systems.
Leveraged Snowflake in hybrid environments to enable scalable analytics processing using its virtual warehouse architecture and zero-copy cloning.
Created secure, parameterized ADF pipelines with reusable datasets and linked services for rapid onboarding of new data sources.
Coordinated with infrastructure teams to design highly available architectures for mission-critical data workloads in Azure and AWS.
Deployed real-time analytics workloads using Spark Structured Streaming and Event Hubs to process clickstream and event-based data feeds.
Collaborated with business analysts to translate complex requirements into technical specifications for streamlined data delivery.
Contributed to cross-region data replication strategies and backup and recovery planning to ensure business continuity for cloud data platforms.
Regularly contributed to documentation repositories detailing pipeline logic, data flow architecture, error handling, and platform optimization guidelines.
TECHNICAL SKILLS:
Databases - Oracle, MySQL, Hive, SQL Server, HBase, Cassandra, MongoDB, Oracle Exadata
Big Data Technologies - HDFS, Hive, PySpark, MapReduce, Pig, YARN, Sqoop, Oozie, ZooKeeper, Flume
Programming Languages - Python, Java, SQL, R, PL/SQL, Scala, JSON, XML, C#
Cloud Services - Azure, Cosmos DB, Blob Storage, Kubernetes, Azure Synapse Analytics (DW), Azure Data Lake, Databricks, DWH, Data Factory
Techniques - Data Mining, Clustering, Data Visualization, Data Analytics
Methodologies - Agile/Scrum, UML, Design Patterns, Waterfall
Containers & CI/CD - Docker, Kubernetes, Jenkins
Tools & Utilities - JIRA, GitHub, Tableau 9.1, Power BI, Control-M, PowerShell, Microsoft Office
PROFESSIONAL EXPERIENCE:
Chipotle Mexican Grill January 2023 – Present
Azure Data Engineer
Responsibilities:
Designed robust Azure Data Factory pipelines integrating data from diverse sources into Azure Synapse Analytics, providing end-to-end data flow visibility and alignment with business needs and data governance standards.
Created modular SQL views and stored procedures in Azure SQL DB, simplifying reporting and ad hoc queries for analysts and improving data accessibility for decision-making.
Established automated data quality and validation logic using custom Python scripts embedded in Databricks notebooks, minimizing data discrepancies and improving overall data integrity.
Performed performance tuning on Spark clusters and Synapse queries, improving pipeline efficiency by over 30% in batch cycles and reducing processing costs.
Collaborated with BI teams to deliver datasets structured for Power BI dashboards, enabling real-time monitoring of business KPIs and data-driven decision-making.
Partnered with the infrastructure team to create auto-scaling configurations for Databricks jobs using cluster policies and job clusters, optimizing resource allocation and cost.
Connected the dots across applications and business units to understand the end-to-end view of data flows and dependencies, enabling proactive identification of issues and optimization opportunities.
Used Azure Monitor and Log Analytics to track pipeline metrics and continuously optimize job runtimes.
Fostered a collaborative team environment by asking questions, reaching out for assistance when needed, and sharing knowledge to resolve issues quickly.
Identified priorities and managed multiple projects simultaneously, allocating effort to meet business needs and deliver high-quality data solutions on schedule.
FedEx Dataworks March 2021 – January 2023
AWS Data Engineer II
Responsibilities:
Developed scalable AWS Glue jobs using PySpark to ingest, cleanse, and transform high-volume logistics and shipment data for downstream analytics.
Optimized Redshift data warehouse schemas with effective distribution and sort keys, improving query performance and data retrieval speeds.
Built reusable Python libraries for data cleansing, schema validation, and transformation logic shared across multiple pipeline modules, ensuring data quality and consistency.
Migrated several legacy ETL pipelines from on-prem Informatica to cloud-native Glue pipelines, reducing cost by 40%.
Collaborated with business users to deliver curated datasets and interactive dashboards using QuickSight and Athena query layers, improving business insights.
Built and maintained CI/CD pipelines using GitHub Actions to automate deployment of infrastructure and Glue job artifacts.
Connected the dots across applications and business units to understand the end-to-end view of data flows and dependencies, enabling proactive identification of issues and optimization opportunities.
Implemented alerting and monitoring for pipeline health and SLA violations using CloudWatch and SNS, ensuring timely issue resolution.
Fostered a collaborative team environment by asking questions, reaching out for assistance when needed, and sharing knowledge to resolve issues quickly.
Identified priorities and managed multiple projects simultaneously, allocating effort to meet business needs and deliver high-quality data solutions on schedule.
MacroHealth June 2019 – March 2021
Data Analyst
Responsibilities:
Developed end-to-end data pipelines using Azure Data Factory (ADF) to ingest healthcare claims data into Azure SQL and Synapse Analytics.
Tuned Synapse SQL queries for cost-efficient execution, reducing total compute billing through materialized views and partitioning.
Created custom logging and exception handling frameworks across ADF and Databricks jobs for better observability and error tracking.
Migrated legacy SSIS workflows to cloud-native ADF pipelines, reducing maintenance costs and improving performance by over 25%.
Partnered with DevOps teams to automate deployment of ADF pipelines and Databricks notebooks using Azure DevOps CI/CD pipelines.
Coordinated with business analysts to map data lineage and document complete source-to-report mappings for data cataloging, supporting data governance and compliance with industry standards.
Connected the dots across applications and business units to understand the end-to-end view of data flows and dependencies, enabling proactive identification of issues and optimization opportunities.
Used Azure Monitor dashboards to track pipeline executions, job durations, and SLA compliance in real time.
Fostered a collaborative team environment by asking questions, reaching out for assistance when needed, and sharing knowledge to resolve issues quickly.
Identified priorities and managed multiple projects simultaneously, allocating effort to meet business needs and deliver high-quality data solutions on schedule.
Educational Details:
Master's in Computer Science - University of Dayton