Data Engineer - SQL Server

Location:
Wesley Chapel, FL
Posted:
January 31, 2025

Resume:

Srujana Gangavarapu

PROFESSIONAL SUMMARY

Self-motivated Data Engineer with 16+ years of ETL experience, including 6 years building big data solutions on AWS. Proficient in AWS Glue, PySpark, Python, Lambda, Redshift, and SQL for creating end-to-end ETL data pipelines in AWS. Expertise also covers other ETL tools such as Informatica PowerCenter and databases such as Teradata, Oracle, and SQL Server. Delivered multiple data warehouse projects, with a strong understanding of data modeling and Agile methodology.

CORE EXPERTISE

Big Data

ETL Technologies

Cloud Data Warehousing

Dimensional Modeling

Agile Methodology

TECHNICAL SKILLS

Languages: Java, Python

Databases: Oracle, Microsoft SQL Server, AWS RDS, Aurora, MySQL, Teradata

Cloud: AWS S3, Glue, PySpark, Athena, DMS, Lambda, Airflow

Data Warehouse: AWS Redshift, Snowflake

Other ETL Tools: Informatica

CAREER PROGRESSION

SDG&E (San Diego Gas & Electric), Remote

Senior AWS Data Engineer

July 2023 – Present

Worked with data scientists and business stakeholders to understand the requirements and design data solutions that meet their needs.

Designed and implemented efficient data pipelines to extract, transform, and load data from various sources into the data lake (S3) and data warehouse (Redshift) using AWS Glue.
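A minimal sketch of the kind of Glue job described above; the catalog database, table, connection, and bucket names are illustrative assumptions rather than details from this engagement:

import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.transforms import ApplyMapping
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job bootstrap; JOB_NAME is supplied by the Glue runtime.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw data cataloged over the S3 landing zone (assumed database/table names).
raw = glue_context.create_dynamic_frame.from_catalog(
    database="landing_db", table_name="orders_raw"
)

# Apply a simple column-level mapping as the transform step.
mapped = ApplyMapping.apply(
    frame=raw,
    mappings=[
        ("order_id", "string", "order_id", "bigint"),
        ("order_ts", "string", "order_ts", "timestamp"),
        ("amount", "string", "amount", "double"),
    ],
)

# Load the curated data into Redshift through a cataloged JDBC connection.
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=mapped,
    catalog_connection="redshift-conn",            # assumed Glue connection name
    connection_options={"dbtable": "public.orders", "database": "dw"},
    redshift_tmp_dir="s3://example-bucket/tmp/",   # staging prefix used by COPY
)

job.commit()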

Wrote complex SQL queries for data analysis and validation.

Improved data processing performance by using data partitioning and optimization techniques.

Developed and maintained documentation for data pipelines and data models so the team could understand their usage.

Deployed batch data pipelines via automated CI/CD processes.

Designed and implemented data governance policies and data quality frameworks.

Used test-driven development to design data transformations running on Apache Spark and orchestrated via Airflow.
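A minimal sketch of how such a Spark transformation can be orchestrated from Airflow, assuming a Glue job started via boto3; the DAG id and job name are illustrative:

from datetime import datetime

import boto3
from airflow import DAG
from airflow.operators.python import PythonOperator


def run_glue_job(**_):
    # Start the Glue job run; Airflow handles retries and alerting on failure.
    glue = boto3.client("glue")
    response = glue.start_job_run(JobName="orders_transform")  # assumed job name
    return response["JobRunId"]


with DAG(
    dag_id="daily_orders_pipeline",   # illustrative DAG id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
):
    PythonOperator(task_id="run_spark_transform", python_callable=run_glue_job)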

Environment: S3, Glue, PySpark, Python, Redshift, Athena, Lambda, Airflow.

GP, Remote

AWS Data Engineer

Feb 2022 – Mar 2023

Developed scalable and efficient data pipelines using AWS services such as S3, Glue, Lambda and Redshift.

Converted complex SQL Server stored procedures to PostgreSQL-compatible SQL for AWS Redshift.

Extracted data from APIs, transformed it per requirements, and loaded it into Redshift using AWS Lambda.
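A minimal sketch of an API-to-Redshift Lambda of the kind described above, assuming the Redshift Data API; the endpoint, cluster, and table names are illustrative:

import json
import urllib.request

import boto3

redshift_data = boto3.client("redshift-data")


def lambda_handler(event, context):
    # Pull one page of records from the source API (illustrative endpoint).
    with urllib.request.urlopen("https://api.example.com/v1/readings") as resp:
        records = json.loads(resp.read())

    # Insert each record into a Redshift staging table via the Data API.
    for rec in records:
        redshift_data.execute_statement(
            ClusterIdentifier="analytics-cluster",   # assumed cluster name
            Database="dw",
            DbUser="etl_user",
            Sql="INSERT INTO staging.readings (id, value, read_at) "
                "VALUES (:id, :value, :read_at)",
            Parameters=[
                {"name": "id", "value": str(rec["id"])},
                {"name": "value", "value": str(rec["value"])},
                {"name": "read_at", "value": rec["timestamp"]},
            ],
        )
    return {"loaded": len(records)}

Row-by-row inserts keep the sketch short; for larger volumes the usual pattern is to stage the payload to S3 and COPY it into Redshift.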

Troubleshot and resolved data issues, performance bottlenecks, and ETL job failures.

Wrote complex SQL queries for data analysis and validation.

Developed and maintained data processing workflows using Step Functions.
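A minimal sketch of triggering such a Step Functions workflow from Python; the state machine ARN and input payload are illustrative assumptions:

import json

import boto3

sfn = boto3.client("stepfunctions")

# Kick off one run of an existing workflow (assumed ARN and payload).
execution = sfn.start_execution(
    stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:etl-workflow",
    input=json.dumps({"run_date": "2024-01-01"}),
)
print(execution["executionArn"])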

Environment: S3, Glue, PySpark, Python, Redshift, Athena, Lambda, Step Functions

AEP, Columbus, Ohio

ETL Data Engineer

October 2018 – January 2022

Responsible for migrating existing Work and Asset Management transactional systems to the AWS Cloud and creating a central data lake and data warehousing solutions using AWS-native technologies.

Performed in-depth analysis of the current system landscape, including the Asset Management and Asset Suite Work applications and the new Maximo system, with the goal of identifying and documenting data profiles.

Migrated on-premises data from Oracle and flat files to AWS Aurora using AWS DMS and SCT (Schema Conversion Tool).

Created a central data lake on S3 with three zones: Landing, Transform, and Consume.

Converted existing Informatica ETL jobs into AWS Glue jobs.

Designed, developed, and implemented ETL pipelines using the Python API (PySpark) of Apache Spark on AWS Glue.

Significantly improved Spark job run times through in-depth tuning using partitioning, broadcasts, and salting techniques.
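A minimal PySpark sketch of the salting technique mentioned above; the paths, key column, and salt count are illustrative assumptions:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("salting-example").getOrCreate()

# Illustrative inputs: a large fact table skewed on customer_id and a small dimension.
facts = spark.read.parquet("s3://example-bucket/facts/")
dims = spark.read.parquet("s3://example-bucket/dims/")

NUM_SALTS = 16

# Add a random salt to the skewed side so a hot key spreads across many partitions.
salted_facts = facts.withColumn("salt", (F.rand() * NUM_SALTS).cast("int"))

# Replicate the small side across all salt values so every salted key still matches.
salted_dims = dims.crossJoin(spark.range(NUM_SALTS).withColumnRenamed("id", "salt"))

# Broadcast the small side to avoid shuffling the large table for the join.
joined = salted_facts.join(
    F.broadcast(salted_dims), on=["customer_id", "salt"]
).drop("salt")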

Implemented metadata repository on AWS Glue data catalog for the S3 data lake.

Designed and implemented data warehousing in AWS Redshift.

Implemented jobs to load Parquet files from the S3 data lake into Redshift using COPY commands.
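A minimal sketch of such a COPY load issued through the Redshift Data API; the bucket, table, and IAM role names are illustrative assumptions:

import boto3

redshift_data = boto3.client("redshift-data")

# COPY curated Parquet files from the S3 consume zone into a Redshift table.
copy_sql = """
    COPY dw.fact_work_orders
    FROM 's3://example-datalake/consume/work_orders/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role'
    FORMAT AS PARQUET;
"""

redshift_data.execute_statement(
    ClusterIdentifier="analytics-cluster",   # assumed cluster name
    Database="dw",
    DbUser="etl_user",
    Sql=copy_sql,
)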

Environment: AWS RDS Aurora, S3, DMS, SCT, Glue, PySpark, Python, Redshift, Athena, Informatica.

Client: Anthem, Richmond, VA

ETL Specialist

Aug 2017 – September 2018

Designed and implemented mappings to load data from heterogeneous sources into the data warehouse using Informatica PowerCenter.

Developed complex stored procedures using input/output parameters, cursors, views, and triggers, and complex queries using temp tables and joins.

Designed complex Informatica mappings to execute one-time and incremental data loads; achieved high throughput by implementing workflow partitioning.

Troubleshot long-running sessions by identifying performance bottlenecks at various levels (sources, targets, mappings, and sessions) and resolving them through performance tuning.

Wrote complex BTEQ scripts to transform and load data from the staging database into the target database.

Performed tuning and optimization of complex SQL queries using the Teradata EXPLAIN plan.

Maintained the data warehouse by loading dimensions and facts as part of the project, and worked on various enhancements to fact tables.

Documented ETL test plans, test cases, test scripts, and validations based on design specifications for unit, system, and functional testing; prepared test data and performed error handling and analysis.

Participated in weekly status meetings.

Environment: Informatica, Teradata, Control-M

Client: Nationwide Insurance, Columbus, Ohio

ETL Specialist

July 2013 – June 2017

Extracted data from various source systems such as Oracle and Teradata.

Analyzed the business requirements and functional specifications.

Developed complex mappings using Lookup (connected and unconnected), Rank, Sorter, Joiner, Aggregator, Filter, and Router transformations to transform data per target requirements.

Created workflows and used various tasks such as Email, Event-Wait, Scheduler, Control, Decision, and Session in the Workflow Manager.

Performed tuning on sources, targets, mappings, and sessions to improve system performance.

Executed performance tuning of ETL loads and SQL queries for large-volume data processing.

Created Ruby/Cucumber automated tests that ran daily to catch data quality issues.

Prepared deployment documents and assisted the deployment team with code migration.

Involved in job scheduling using Control-M and other schedulers.

Partnered with development and business teams on knowledge transfer and reviews of development details.

Environment: Informatica 9.5, Ruby, Cucumber, Teradata, ESP, Oracle 11g, Toad, Linux, Perl

Client: Ohio Department of Education, Columbus, Ohio

ETL Developer

June 2011 – June 2013

Analyzed business requirements and worked closely with the various application teams and business teams to develop ETL procedures that are consistent across all applications and systems.

Used Informatica designer for designing mappings and mapplets to extract data from various sources like Oracle and flat files.

Environment: Informatica 8.6.1, Oracle 10g

Client: Wipro Technologies, India

ETL Developer

March 2008 – November 2010

Developed standard and reusable mappings and mapplets using various transformations like Expression, Aggregator, Joiner, Router, Lookup (Connected and Unconnected) and Filter.

Made extensive use of persistent lookup cache to reduce session processing time.

Environment: Informatica, Teradata, SQL Server 2005, Erwin 4.1, Windows XP, UNIX.

*******.****@*****.***

Cell: 505-***-****


