
Data Engineer

Location:
Fremont, CA
Posted:
February 18, 2021

Lalitha Sarva

AWS ETL Data Engineer Cell: 678-***-**** E-Mail: adkalw@r.postjobfree.com

SUMMARY

11+ years of experience in the information technology industry, with a strong background in system analysis, design, and development in AWS, Big Data, and ETL as part of Decision Support Systems (DSS) initiatives.

Technical Summary

Involved in implementing large and complex projects.

Proven Expertise on Dimensional Modeling like Star schema and Snowflake schema.

5+ years implementing large-scale AWS solutions in the e-commerce, banking, and financial sectors.

Extensively used Snowflake Tasks and DBT for transformations of ININ/CISCO/Autorek data on the Snowflake database.

Extensively worked with Airflow to load data into Snowflake (a minimal DAG sketch follows this summary).

Expertise in the AWS stack: Snowflake, EC2, S3, IAM, Lambda, Data Pipeline, EMR, SNS, CloudWatch, AWS Redshift, DMS, and Athena.

Proficient in AWS and Big Data technologies including HDFS, Hive, EMR, Spark, Redshift, EC2, and Data Pipeline.

Solid experience managing and developing AbInitio applications for extraction, transformation, cleansing, and loading into data warehouses/data marts. Used AbInitio with Very Large Database Systems (VLDB) on Massively Parallel Processing (MPP) platforms.

Proven Expertise in performing analytics on AWS Redshift and Big Data using Hive.

Extensively worked on moving data from Snowflake to RDS and from Snowflake to S3.

Worked on migration from Oracle to Snowflake.

Used Spark Streaming and Spark SQL to build low-latency applications.

Strong understanding of Hadoop internals, compression codecs, and file formats such as Avro and JSON.

Expertise in troubleshooting and resolving Data Pipeline, S3, and Redshift issues.

Expertise in creating AbInitio graphs to read from and write to HDFS, generic graphs, EME, dependency analysis, Conduct>It, and continuous flows, utilizing Rollup, Join, Sort, Normalize, Scan, and Partition components to speed up the ETL process.

Worked on L1/L2/L3 Production Support Issues.

Expertise in resolving ServiceNow tasks and incidents.

Created ad-hoc reports using SAP BO Web Intelligence to meet end users' requirements.

Integrated the Business Intelligence reporting solution (Tableau) with various data sources such as RDBMS and Hadoop.

Involved in effort estimation, design, development, review, implementation and maintenance of AbInitio graphs.

Understand and contribute to projects' technical design along with the requirement specifications; involved in preparing HLDs and LLDs.

Worked with Autosys and Control-M scheduling tools.

Played a crucial lead role in handling L3 production support issues, which requires strong analytical skills and quick response.

Sound skills in Structured Query Language (Oracle SQL). Experience with all phases of the SDLC, including design, development, review, and maintenance.
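
To illustrate the Airflow-based Snowflake loads noted above, below is a minimal sketch of a daily DAG that runs a COPY INTO from an external S3 stage into a Snowflake table. The connection details, stage, and table names are hypothetical placeholders rather than actual project configuration.

```python
# Minimal sketch of an Airflow DAG that stages S3 files into Snowflake.
# Account, credentials, stage, and table names are hypothetical placeholders.
from datetime import datetime, timedelta

import snowflake.connector
from airflow import DAG
from airflow.operators.python import PythonOperator


def copy_s3_stage_into_snowflake():
    """Run a COPY INTO from an external S3 stage into a target table."""
    conn = snowflake.connector.connect(
        account="my_account",      # hypothetical account
        user="etl_user",           # hypothetical user
        password="***",            # pulled from a secrets store in practice
        warehouse="ETL_WH",
        database="ANALYTICS",
        schema="STAGING",
    )
    try:
        conn.cursor().execute(
            "COPY INTO STAGING.ORDERS FROM @S3_ORDERS_STAGE "
            "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
        )
    finally:
        conn.close()


with DAG(
    dag_id="s3_to_snowflake_incremental",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 1, "retry_delay": timedelta(minutes=5)},
) as dag:
    load_task = PythonOperator(
        task_id="copy_into_snowflake",
        python_callable=copy_s3_stage_into_snowflake,
    )
```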

Technology Expertise

AWS/Big Data: EC2, S3, IAM, Data Pipeline, EMR, RDS, DMS, Snowflake, Redshift, SQS, SNS, Lambda, Hive, HDFS, Spark, Athena

ETL: AbInitio, Informatica/Attunity

Programming Languages: Python, UNIX Shell Script, PDL

Data Transformation: DBT

BI/Analytical Tools: SAP BO, Tableau

Databases: Oracle (SQL/PL-SQL), Snowflake, RDS-MySQL, AWS Redshift, DB2, Teradata

Tools/Scheduling Tools: Airflow, ServiceNow, TOAD for Oracle, SQL Developer, SQL Workbench, Agility Workbench, Autosys, Control-M, JIRA, Confluence, ITSM, GitHub, Bitbucket

Methodologies: Waterfall & Agile (Scrum)

CI/CD: Bamboo, Jenkins

Job Roles

Sr. Cloud Data Engineer

AWS Lead Data Engineer

Sr. Data Analyst

Sr. Technology Lead

BI Developer

Industry Domains

Banking & Finance

Retail & Ecommerce

Major Clients

FIS GLOBAL

USFOODS

Levi Strauss & Co

JPMC

CITI Group

H&M

General Electric

Certifications

AWS Architect Associate: YLWVCHS11JF11S3ECSM

Education

Bachelor of Engineering (EEE)

MAJOR PROJECTS

FIS GLOBAL, SFO, CA Sep 2020 - Present

Sr. Cloud Data Engineer

Responsibilities:

CDP-RADAR is an ongoing incremental data load from various data sources.

Extensively used Snowflake Tasks to load ININ/CISCO/Autorek data into Snowflake (a brief task sketch appears at the end of this section).

Extensively used EMR and Airflow processes to load data into Snowflake.

Expertise in troubleshooting and resolving data pipeline related issues.

Used enqueue/init Lambda functions to trigger the EMR process.

EMR is used for all transformations, running the SQL and Python scripts in the ETL load process; Informatica is used to pick up source files from S3.

Worked on RADAR migration project from Oracle to Snowflake

Performed advanced procedures such as text analytics and processing, using the in-memory computing capabilities of Spark with Python on EMR.

Processed data from different sources into Snowflake and MySQL.

Implemented Spark using Python and Spark SQL for faster testing and processing of data.

Wrote and loaded data into the data lake environment (Snowflake) from AWS EMR, which was accessed by business users and data scientists using Tableau.

Copied data from S3 to Snowflake and connected with SQL Workbench for seamless importing and movement of data via S3.

Met with business/user groups to understand the business process and fixed high-priority production support issues.

The RADAR team was involved in and supported all production support activities.

Served as a Subject Matter Expert on assigned projects.

Environment: AWS, Snowflake Tasks, DBT, Airflow, EC2, S3, IAM, SQS, SNS, Snowflake, EMR-Spark, RDS, JSON, MySQL Workbench, ETL-Informatica, Oracle, Red Hat Linux, Tableau.
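
As a brief illustration of the Snowflake Task usage described in this project, the sketch below creates and resumes an hourly task through the Python connector; the account, warehouse, schema, and table names are hypothetical placeholders, not the actual RADAR objects.

```python
# Minimal sketch: creating and resuming a Snowflake Task that runs an hourly
# transformation. Database, schema, warehouse, and table names are hypothetical.
import snowflake.connector

DDL = """
CREATE OR REPLACE TASK STAGING.LOAD_ININ_CALLS
  WAREHOUSE = ETL_WH
  SCHEDULE = 'USING CRON 0 * * * * UTC'
AS
  INSERT INTO ANALYTICS.ININ_CALLS
  SELECT call_id, agent_id, started_at, duration_sec
  FROM STAGING.ININ_CALLS_RAW
  WHERE loaded_at > DATEADD('hour', -1, CURRENT_TIMESTAMP())
"""

conn = snowflake.connector.connect(
    account="my_account",   # hypothetical account
    user="etl_user",        # hypothetical user
    password="***",         # from a secrets store in practice
    warehouse="ETL_WH",
    database="ANALYTICS",
)
cur = conn.cursor()
try:
    cur.execute(DDL)
    # Tasks are created suspended; resume to start the schedule.
    cur.execute("ALTER TASK STAGING.LOAD_ININ_CALLS RESUME")
finally:
    cur.close()
    conn.close()
```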

USFOODS, Rosemont, IL July 2019- Aug 2020

Sr. AWS Data Engineer

Responsibilities:

Data Solutions is an ongoing incremental data load from various data sources.

Data Solutions feeds are transferred to Snowflake, MySQL, and S3 through the ETL EMR process.

Extensively worked on moving data Snowflake-to-Snowflake and Snowflake-to-S3 for the TMCOMP/ESD feeds.

Extensively worked on moving data Snowflake-to-Snowflake and Snowflake-to-S3 for LMA/LMA Search.

Expertise in troubleshooting and resolving data pipeline related issues.

Used enqueue/init Lambda functions to trigger the EMR process (a minimal boto3 sketch appears at the end of this section).

EMR is used for all transformations, running the SQL and Python scripts in the ETL load process; Informatica is used to pick up source files from S3.

Worked on CDMR migration project from Oracle to Snowflake

Performed advanced procedures such as text analytics and processing, using the in-memory computing capabilities of Spark with Python on EMR.

Processed data from different sources into Snowflake and MySQL.

Implemented Spark using Python and Spark SQL for faster testing and processing of data.

Wrote and loaded data into the data lake environment (Snowflake) from AWS EMR, which was accessed by business users and data scientists using Tableau/OBIEE.

Copied data from S3 to Snowflake and connected with SQL Workbench for seamless importing and movement of data via S3.

Worked on DMS to process and replicate data to SQL Server.

Met with business/user groups to understand the business process and fixed high-priority production support issues.

The Data Solutions team was involved in and supported all production support activities.

Served as a Subject Matter Expert on assigned projects.

POC for the LMA Search team's end-to-end production process.

Delivered quarterly MTTR analysis reports and weekly WSR reports.

Environment: AWS, EC2, S3, IAM, SQS, SNS, Snowflake, EMR-Spark, RDS, JSON, MySQL Workbench, ETL-Informatica, Oracle, Red Hat Linux, Tableau.
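
The enqueue/init Lambda that triggers the EMR process, referenced above, could look roughly like the sketch below: an S3-triggered handler that submits a Spark step to a running EMR cluster with boto3. The cluster ID, bucket, and script path are hypothetical placeholders.

```python
# Minimal sketch, assuming an existing EMR cluster: a Lambda handler that
# submits a Spark step when a new file lands in S3. Cluster ID, bucket,
# and script path are hypothetical placeholders.
import boto3

emr = boto3.client("emr")

CLUSTER_ID = "j-XXXXXXXXXXXXX"  # hypothetical EMR cluster id


def handler(event, context):
    """Triggered by an S3 event; kicks off the Spark transformation step."""
    record = event["Records"][0]["s3"]
    s3_path = f"s3://{record['bucket']['name']}/{record['object']['key']}"

    response = emr.add_job_flow_steps(
        JobFlowId=CLUSTER_ID,
        Steps=[
            {
                "Name": "transform-incoming-feed",
                "ActionOnFailure": "CONTINUE",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": [
                        "spark-submit",
                        "s3://my-etl-bucket/scripts/transform_feed.py",  # hypothetical
                        "--input", s3_path,
                    ],
                },
            }
        ],
    )
    return {"step_ids": response["StepIds"]}
```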

Levi Strauss & Co, SFO, CA Apr 2016 - Jul 2019

AWS Lead Data Engineer

Responsibilities:

Rivet-KnowMe is an ongoing incremental data load from various data sources.

Rivet-KnowMe feeds from three types of data sources (Hybris, Omniture, and Responsys) are transferred through the ETL process to AWS S3.

Used DMS to load source data into Type 1 and Type 2 tables in Redshift.

Extensively worked on moving data from Snowflake to RDS and from Snowflake to S3 for the ESD and CLM projects.

Expertise in troubleshooting and resolving data pipeline related issues.

Worked on L1/L2/L3 production support issues.

Expertise in resolving ServiceNow tasks and incidents.

Used Lambda to trigger file loads via Data Pipeline.

Data Pipeline is used for all transformations, running the SQL and Python scripts that load Redshift in the incremental load process; ETL-Talend is used to pick up files from S3 and deliver them to targets.

Performed advanced procedures such as text analytics and processing, using the in-memory computing capabilities of Spark with Python on EMR.

Processed data from different sources into AWS Redshift using EMR-Spark and Python programming (a minimal PySpark sketch appears at the end of this section).

Implemented Spark using Python, Scala, and Spark SQL for faster testing and processing of data.

Wrote and loaded data into the data lake environment (AWS Redshift) from AWS EMR and Data Pipeline; Redshift was accessed by business users and data scientists using Tableau.

Copied data from S3 to Redshift and connected with SQL Workbench for seamless importing and movement of data via S3.

Worked on DMS to process and replicate data to Redshift for the business/analytics team.

Used SQL Workbench with Redshift for faster access to data.

Worked in AWS environment for development and deployment of Custom Hadoop Applications.

Met with business/user groups to understand the business process and fixed high-priority production support issues.

Led the KnowMe team and was involved in and supported all production support activities.

Served as a Subject Matter Expert on assigned projects.

POC for the KnowMe/GMI team's end-to-end production process.

Delivered quarterly MTTR analysis reports and weekly WSR reports.

Environment: AWS, EC2, S3, IAM, Data Pipeline, EMR-Spark, RDS, Redshift, Snowflake, Avro, JSON, SQL Workbench, ETL-Talend, Oracle, Red Hat Linux, Tableau.
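
The PySpark sketch referenced above gives a rough idea of the EMR transformations on the clickstream feeds: read raw JSON from S3, apply a simple cleanup, and write partitioned Parquet back to S3 for a downstream Redshift load. Bucket names, paths, and column names are hypothetical placeholders.

```python
# Minimal PySpark sketch of the kind of EMR job described above: read raw
# JSON clickstream feeds from S3, apply a simple transformation, and write
# the result back to S3 for a downstream Redshift COPY. Paths and column
# names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("knowme-clickstream-transform").getOrCreate()

raw = spark.read.json("s3://my-raw-bucket/omniture/2019-07-01/")  # hypothetical path

transformed = (
    raw.withColumn("event_date", F.to_date("event_timestamp"))
       .filter(F.col("customer_id").isNotNull())
       .select("customer_id", "event_date", "page_name", "channel")
)

# Write partitioned Parquet that a Redshift COPY (or Spectrum) can pick up.
(
    transformed.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://my-curated-bucket/knowme/clickstream/")
)

spark.stop()
```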

CITIBANK, NYC July 2013 - April 2016

Sr. Technology Lead

Responsibilities:

Developed code for importing and exporting data into HDFS and Hive using Sqoop.

Responsible for writing Hive queries in HQL to analyze data in the Hive warehouse (a minimal query sketch appears at the end of this section).

Involved in defining job flows using Oozie to schedule and manage Apache Hadoop jobs as directed workflows.

Developed Hive user-defined functions in Java, compiled them into JARs, added them to HDFS, and executed them from Hive queries.

Experienced in managing and reviewing Hadoop log files; tested and reported defects from an Agile methodology perspective.

Involved in installing Hadoop ecosystem components (Hive, Sqoop, HBase, Oozie) on top of the Hadoop cluster.

Imported data from SQL databases into HDFS and Hive for analytical purposes.

Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke MapReduce jobs in the backend.

Worked on clean dependency analysis.

Created AbInitio graphs to read and write to HDFS, utilizing Rollup, Join, Sort, Replicate, Partition components to speed up the ETL process.

Understood and contributed to the project's technical design along with the requirement specifications.

Tested the solution to validate project objectives.

Managed end-to-end application delivery; owned quarterly application release development and ensured smooth User Acceptance Testing and issue resolution.

Prepared and reviewed test plans/scenarios/test cases at the development, IST, UAT, and production stages.

Close participation in all SDLC stages in the creation of AbInitio development work products that conform to the stated business requirements and high-level design documents.

Performed appropriate unit-level testing of work products and managed the review process for the AbInitio deliverables.

Tracked and reported on issues and risks, escalating as needed.

Expertly handled last-minute requests and stressful situations.

Developed test strategies based on design/architectural documents, requirements, specifications, and other documented sources.

Developed test cases and test scripts based on documented sources.

Close participation in all SDLC stages using Agile methodology.

Organized events and conducted presentations, trainings, effective meetings, and project status reporting to senior management.

Coordinated with other team members to ensure that all work products integrate as a complete solution, and took on a supporting role for other team members to resolve issues or complete tasks sooner.

Environment: Hadoop, HDFS, Sqoop, Hive, RDBMS, AbInitio ETL Tool, Oracle, Teradata, UNIX, and Autosys.
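
As an illustration of the HQL analysis mentioned above, the sketch below runs a simple aggregate against a Hive table over a HiveServer2 connection using PyHive; the host, database, table, and column names are hypothetical placeholders.

```python
# Minimal sketch: querying the Hive warehouse over HiveServer2 with PyHive.
# Host, database, table, and column names are hypothetical placeholders.
from pyhive import hive

conn = hive.Connection(
    host="hive-gateway.example.com",  # hypothetical HiveServer2 host
    port=10000,
    username="etl_user",
    database="cards_dw",
)
cursor = conn.cursor()

# Aggregate daily transaction counts per product, the kind of HQL used
# for the analyses described above.
cursor.execute(
    """
    SELECT product_code, txn_date, COUNT(*) AS txn_count
    FROM transactions
    WHERE txn_date >= '2015-01-01'
    GROUP BY product_code, txn_date
    ORDER BY txn_date
    """
)
for product_code, txn_date, txn_count in cursor.fetchall():
    print(product_code, txn_date, txn_count)

cursor.close()
conn.close()
```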

JPMC, USA June 2010 - Sep 2011

Sr. Developer

Responsibilities:

Creation of AbInitio development work products that conform to the stated business requirements and high-level design documents.

Created AbInitio graphs, utilizing Rollup, Join, Sort, Replicate, Partition components to speed up the ETL process

Coordinated with other team members to ensure that all work products integrate as a complete solution, and took on a supporting role for other team members to resolve issues or complete tasks sooner.

Worked in UAT and on L1 and L2 production support.

Worked on ServiceNow and Peregrine tickets.

Monitored daily, weekly, and monthly jobs.

Promoted code from lower environments to the production environment.

Handled database and ETL issues.

Addressed end-user queries in a timely fashion.

Engaged the relevant teams when load jobs were in a failed state.

Tracked issues and risks, escalating as needed.

Performed appropriate unit-level testing of work products and managed the review process for the AbInitio deliverables.

Involved in preparing HLDs and LLDs; extensively involved in EME concepts; prepared test data to test the developed components.

Designed and developed graphs by using AbInitio

Developed complex generic and conditionalized AbInitio graphs with emphasis on optimizing performance.

Environment: AbInitio ETL Tool, DB2, UNIX, and Control-M

CITIBANK, USA May 2009 - May 2010

Sr. Developer

Responsibilities:

Created AbInitio graphs, utilizing Rollup, Join, Sort, Replicate, and Partition components to speed up the ETL process.

Coordinated with other team members to ensure that all work products integrate as a complete solution, and took on a supporting role for other team members to resolve issues or complete tasks sooner.

Involved in preparing HLDs and LLDs; extensively involved in EME concepts; prepared test data to test the developed components.

Designed and developed graphs by using AbInitio.

Developed complex generic and conditionalized AbInitio graphs with emphasis on optimizing performance

Performed appropriate unit-level testing of work products and managed the review process for the AbInitio deliverables.

Tracked and reported on issues and risks, escalating as needed.

Expertly handled last-minute requests and stressful situations.

Environment: AbInitio ETL Tool, Oracle 9i, Unix

H&M, Sweden Oct 2008 - Mar 2009

Sr. Developer

Responsibilities:

Responsible for designing and developing various applications in the project; analyzed requirements and developed graphs and scripts to produce the appropriate results.

Tested the graphs and applications.

Involved in amendments to graphs.

Performed appropriate unit-level testing of work products and managed the review process for the AbInitio deliverables.

Tracked and reported on issues and risks, escalating as needed.

Expertly handled last-minute requests and stressful situations.

Environment: AbInitio ETL Tool, Oracle 9i, UNIX

General Electric, USA Dec 2006 - Sep 2008

IM Programmer

Responsibilities:

Creation of AbInitio development work products that conform to the stated business requirements and high-level design documents.

Created AbInitio graphs, utilizing Rollup, Join, Sort, Replicate, Partition components to speed up the ETL process

Coordinated with other team members to ensure that all work products integrate as a complete solution, and took on a supporting role for other team members to resolve issues or complete tasks sooner.

Performed appropriate unit-level testing of work products and managed the review process for the AbInitio deliverables.

Environment: AbInitio ETL Tool, Oracle 9i, UNIX


