
Data Engineer Azure

Location:
Boston, MA, 02133
Posted:
February 27, 2025


POOJA REDDY

DATA ENGINEER

+1-857-***-**** **************@*****.*** United States linkedin.com/in/pooja-namireddy-39a506161

SUMMARY

A highly skilled Data Engineer with extensive experience in designing and maintaining data pipelines. Proficient in Python, SQL, and Java, with a deep understanding of distributed computing frameworks like Apache Spark. Skilled in data modeling, ETL processes, and optimizing performance for scalability. Experienced in cloud platforms such as AWS, Azure, and Google Cloud, leveraging services like AWS Glue and Azure Data Factory. Proficient in Snowflake for scalable data models and real-time analytics. Collaborative team player with a proven history of delivering high-quality solutions.

TECHNICAL SKILLS

Programming Languages: Python, SQL, Java, Scala

Big Data Technologies: Hadoop, Spark, Kafka

Data Visualization Tools: Tableau, Power BI

Snowflake: SnowSQL, Snowflake SQL, Snowflake Data Sharing, Snowflake Data Warehouse Administration

Data Integration: ETL/ELT processes, Data pipelines development, Integration tools (Apache Airflow, Talend, Informatica)

Data Modeling: Star Schema, Snowflake Schema, ERD design

Cloud Platforms: AWS, Azure, Google Cloud Platform

AWS Cloud: EC2, S3, Redshift, Glue, Lambda functions

Azure Cloud: Azure Data Lake (ADLS), Azure Data Factory (ADF), Data Lake Analytics

Version Control: Git, GitHub, GitLab

Databases: MySQL, PostgreSQL, Oracle

Data Integration Tools: Informatica PowerCenter, Talend Open Studio

Data Manipulation: Data cleaning, transformation, normalization

Data Security: Encryption, access control, compliance

Operating Systems: Linux, Unix, Windows

Scripting: Shell scripting, PowerShell

Data Warehousing: Data warehouse design, optimization, performance tuning

Containerization: Docker, Kubernetes

Monitoring and Logging: Prometheus, Grafana, ELK stack

Data Governance: Data quality management, metadata management

Data Storage: Object storage, file systems

Networking: TCP/IP, VPN, DNS

Performance Optimization: Indexing, query optimization, caching

PROFESSIONAL EXPERIENCE

UnitedHealth Group May 2023 – Present

Data Engineer

Orchestrated the design and execution of ETL pipelines in Snowflake, streamlining data processing for heightened operational efficiency and agility.

Implemented cutting-edge data modeling techniques within Snowflake, optimizing storage and retrieval to drive improved performance and scalability.

Collaborated cross-functionally with data analysts, scientists, and stakeholders to translate business requirements into actionable technical solutions, ensuring alignment and timely project delivery.

Safeguarded data integrity and confidentiality within Snowflake, enforcing robust security measures to mitigate risks and vulnerabilities effectively.

Automated data processing workflows with DevOps methodologies, enabling seamless integration and continuous delivery of data pipelines for accelerated insights.

Leveraged Python scripting for automating data manipulation tasks, enhancing workflow efficiency and reducing manual intervention.
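A minimal sketch of the kind of Python automation this bullet describes: a repetitive cleanup step (trimming whitespace, normalizing headers, dropping empty rows) turned into a reusable function. The column and field names here are hypothetical, not taken from the resume.

```python
import csv
import io

def normalize_rows(raw_csv: str) -> list:
    """Trim whitespace, lowercase headers, and drop empty rows --
    a typical repetitive data-manipulation task worth scripting."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    cleaned = []
    for row in reader:
        row = {k.strip().lower(): (v or "").strip() for k, v in row.items()}
        if any(row.values()):  # skip rows where every value is blank
            cleaned.append(row)
    return cleaned

sample = "Name , City\n Pooja , Boston\n , \n"
print(normalize_rows(sample))  # [{'name': 'Pooja', 'city': 'Boston'}]
```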

Executed performance tuning and SQL optimization strategies in Snowflake, refining query performance and maximizing resource utilization.

Led the seamless migration of legacy ETL processes to Snowflake, minimizing disruption and optimizing data integration capabilities.

Contributed proactively to the evaluation and adoption of emerging technologies, bolstering data engineering capabilities and staying ahead of industry trends.

Utilized AWS Glue and Spark SQL for extract, transform, and load (ETL) operations, ingesting data into AWS services such as Amazon S3, Amazon RDS, and Amazon Redshift, resulting in a 30% increase in data availability.

Developed and deployed AWS Lambda functions using the built-in AWS Lambda libraries.
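As an illustration of the Lambda work described above, here is a minimal handler sketch that processes a simulated S3 event. The event shape follows the standard S3 notification format; the bucket keys are hypothetical.

```python
import json

def lambda_handler(event, context):
    """Minimal AWS Lambda handler sketch: extracts object keys from an
    S3 event notification and returns a JSON summary. Illustrative only."""
    records = event.get("Records", [])
    keys = [r["s3"]["object"]["key"] for r in records if "s3" in r]
    return {
        "statusCode": 200,
        "body": json.dumps({"processed": len(keys), "keys": keys}),
    }

# Example invocation with a simulated S3 event
event = {"Records": [{"s3": {"object": {"key": "data/input.csv"}}}]}
print(lambda_handler(event, None))
```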

Mentored and nurtured junior team members, fostering their professional development and cultivating a collaborative, high-performing team culture.

Rabobank March 2021 – January 2022

Data Engineer

Engineered data ingestion pipelines, optimizing ETL processes for accelerated extraction, transformation, and loading of insurance data from diverse sources.

Architected and implemented a comprehensive data model tailored to the insurance industry, enhancing storage efficiency, and enabling seamless data retrieval for analysis and reporting.

Automated data quality checks to ensure the accuracy, consistency, and completeness of insurance data throughout its lifecycle, mitigating errors and bolstering data reliability.
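A sketch of the automated data-quality checks this bullet refers to: a function that flags missing required fields and non-numeric values. The insurance-style field names (`policy_id`, `premium`) are hypothetical, chosen only to fit the context.

```python
def quality_report(rows, required=("policy_id", "premium")):
    """Flag rows with missing required fields or a non-numeric premium.
    Returns a list of (row_index, issue) tuples. Field names are
    illustrative, not from an actual schema."""
    issues = []
    for i, row in enumerate(rows):
        for field in required:
            if not row.get(field):
                issues.append((i, f"missing {field}"))
        premium = str(row.get("premium", ""))
        if premium and not premium.replace(".", "", 1).isdigit():
            issues.append((i, "premium not numeric"))
    return issues

rows = [
    {"policy_id": "P-100", "premium": "250.00"},
    {"policy_id": "", "premium": "abc"},
]
print(quality_report(rows))  # [(1, 'missing policy_id'), (1, 'premium not numeric')]
```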

Scripted Python automation for repetitive data processing tasks, streamlining workflow, and liberating time for strategic initiatives.

Instituted robust data security measures, encompassing encryption and access controls, to safeguard sensitive customer information and ensure regulatory compliance.

Supported the creation of interactive Power BI dashboards and reports, integrating with Amazon QuickSight and other visualization tools.

Collaborated closely with data analysts, actuaries, and stakeholders to discern their data needs and deliver bespoke solutions, amplifying the efficacy of data-driven decision-making processes.

Operated Hadoop and Spark for efficient processing of large volumes of insurance data, heightening scalability and performance of data processing operations.

Orchestrated the migration of on-premises data infrastructure to the AWS cloud platform, slashing operational overhead and empowering greater flexibility in data management.

Spearheaded the adoption of Snowflake data warehouse technology, optimizing storage and retrieval of insurance-related data and fortifying analytical capabilities.

Guided junior team members, nurturing their professional development, and fostering a culture of continual learning and innovation within the team.

Cisco August 2018 – February 2021

Data Engineer

Designed and implemented robust data pipelines, facilitating seamless data movement from diverse sources.

Integrated GCP cloud platform for efficient data storage and processing, optimizing costs through resource utilization analysis and cost-saving strategies.

Developed Python scripts for data validation and transformation processes, enhancing data accuracy and streamlining data cleansing efforts.

Designed and implemented data warehouse structures using Snowflake, enabling storage and analysis of large datasets with enhanced scalability.

Orchestrated automated data pipelines using Apache Airflow, reducing manual intervention and ensuring timely and reliable data processing.
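Airflow pipelines express tasks and their upstream dependencies as a DAG. As a library-free illustration of that same pattern (not actual Airflow API), this sketch runs hypothetical extract/transform/load tasks in dependency order; it omits cycle detection and scheduling for brevity.

```python
def run_pipeline(tasks, deps):
    """Run tasks in dependency order -- a tiny, library-free analogue of
    an Airflow DAG. `tasks` maps name -> callable; `deps` maps name -> list
    of upstream task names. Task names are hypothetical."""
    done, order = set(), []

    def visit(name):
        if name in done:
            return
        for upstream in deps.get(name, []):  # run prerequisites first
            visit(upstream)
        tasks[name]()
        done.add(name)
        order.append(name)

    for name in tasks:
        visit(name)
    return order

log = []
tasks = {
    "load": lambda: log.append("load"),
    "extract": lambda: log.append("extract"),
    "transform": lambda: log.append("transform"),
}
deps = {"transform": ["extract"], "load": ["transform"]}
print(run_pipeline(tasks, deps))  # ['extract', 'transform', 'load']
```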

Collaborated with cross-functional teams, including data scientists and analysts, to understand data needs and deliver tailored solutions.

Documented data pipelines and processes, ensuring maintainability and knowledge transfer across teams.

Operated SQL for data manipulation and analysis, generating actionable insights and reports for stakeholders.

Conducted regular performance optimizations and troubleshooting, ensuring uninterrupted data availability.

Implemented the Git version control system for efficient code management, fostering collaboration and enabling seamless integration of new features.

EDUCATION

Master of Science in Computer Information Systems

NEW ENGLAND COLLEGE Jan 2022 – March 2024

Bachelor of Technology in Computer Science

SREYAS INSTITUTE OF ENGINEERING AND TECHNOLOGY June 2014 – May 2018
