Lalitha Sarva
AWS ETL Data Engineer Cell: 678-***-**** E-Mail: adkalw@r.postjobfree.com
SUMMARY
11+ years of experience in the information technology industry, with a strong background in system analysis, design, and development in AWS, Big Data, and ETL as part of Decision Support Systems (DSS) initiatives.
Technical Summary
Involved in implementing large and complex projects.
Proven expertise in dimensional modeling, including Star schema and Snowflake schema.
5+ years implementing large-scale AWS solutions in the Ecommerce, Banking, and Financial sectors.
Extensively used Snowflake Tasks/dbt for transformations of ININ/Cisco/AutoRek data in the Snowflake database.
Extensively worked on Airflow to load data into Snowflake.
Expertise in the AWS stack: EC2, S3, IAM, Lambda, Data Pipeline, EMR, SNS, CloudWatch, Redshift, DMS, Athena, and Snowflake.
Proficient in AWS and Big Data technologies, including HDFS, Hive, EMR, Spark, Redshift, EC2, and Data Pipeline.
Solid experience managing and developing Ab Initio applications for extraction, transformation, cleansing, and loading into data warehouses/data marts. Used Ab Initio with Very Large Database Systems (VLDB) that are Massively Parallel (MPP).
Proven expertise in performing analytics on AWS Redshift and on Big Data using Hive.
Extensively worked on moving data from Snowflake to RDS and from Snowflake to S3.
Worked on migration from Oracle to Snowflake.
Used Spark Streaming and Spark SQL to build low-latency applications.
Strong understanding of Hadoop internals and of different file formats and compressions, such as Avro and JSON.
Expertise in troubleshooting and resolving Data Pipeline, S3, and Redshift issues.
Expertise in creating Ab Initio graphs to read from and write to HDFS; generic graphs; EME; Dependency Analysis; Conduct>It; continuous flows; utilizing Rollup, Join, Sort, Normalize, Scan, and Partition components to speed up the ETL process.
Worked on L1/L2/L3 production support issues.
Expertise in resolving ServiceNow tasks and incidents.
Created ad-hoc reports using SAP BO Web Intelligence for end users' requirements.
Integrated the Business Intelligence reporting solution (Tableau) with various databases, including RDBMS and Hadoop.
Involved in effort estimation, design, development, review, implementation, and maintenance of Ab Initio graphs.
Understand and contribute to projects' technical design along with the requirement specifications. Involved in preparing HLDs and LLDs.
Worked with the Autosys and Control-M scheduling tools.
Played a crucial lead role in handling L3 production support issues, which requires strong analytical skills and quick response.
Sound skills in Structured Query Language (Oracle SQL). Experience with all phases of the SDLC, including design, development, review, and maintenance.
Technology Expertise
AWS/Big Data: EC2, S3, IAM, Data Pipeline, EMR, RDS, DMS, Snowflake, Redshift, SQS, SNS, Lambda, Hive, HDFS, Spark, Athena
ETL: Ab Initio, Informatica, Attunity, dbt
Programming Languages: Python, UNIX Shell Script, PDL
BI/Analytical Tools: SAP BO, Tableau
Databases: Oracle (SQL/PL-SQL), Snowflake, RDS-MySQL, AWS Redshift, DB2, Teradata
Tools/Scheduling Tools: Airflow, ServiceNow, TOAD for Oracle, SQL Developer, SQL Workbench, Agility Workbench, Autosys, Control-M, JIRA, Confluence, ITSM, GitHub, Bitbucket
Methodologies: Waterfall & Agile (Scrum)
CI/CD: Bamboo, Jenkins
Job Roles
Sr. Cloud Data Engineer
AWS Lead Data Engineer
Sr. Data Analyst
Sr. Technology Lead
BI Developer
Industry Domains
Banking & Finance
Retail & Ecommerce
Major Clients
FIS GLOBAL
USFOODS
Levi Strauss & Co
JPMC
CITI Group
H&M
General Electric
Certifications
AWS Architect Associate: YLWVCHS11JF11S3ECSM
Education
Bachelor of Engineering (EEE)
MAJOR PROJECTS
FIS GLOBAL, SFO, CA Sep 2020 - Present
Sr. Cloud Data Engineer
Responsibilities:
CDP-RADAR is an ongoing incremental data load from various data sources.
Extensively used Snowflake Tasks to load ININ/Cisco/AutoRek data into Snowflake.
Extensively used EMR and Airflow processes to load data into Snowflake.
Expertise in troubleshooting and resolving data pipeline related issues.
Used an enqueue/init Lambda to trigger the EMR process.
EMR is used for all SQL and Python script transformations in the ETL load process; Informatica is used to pick up source files from S3.
Worked on the RADAR migration project from Oracle to Snowflake.
Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Python on EMR.
Processed data from different sources into Snowflake and MySQL.
Implemented Spark using Python and Spark SQL for faster testing and processing of data.
Wrote and loaded data into the data lake environment (Snowflake) from AWS EMR, which was accessed by business users and data scientists using Tableau.
Copied data from S3 to Snowflake and connected with SQL Workbench for seamless importing and movement of data via S3.
Met with business/user groups to understand the business process and fixed high-priority production support issues.
Involved in and supported all production support activities as part of the RADAR team.
Served as a Subject Matter Expert on assigned projects.
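As an illustration of the enqueue/init Lambda pattern above, a minimal sketch in Python; the script path, step name, and event layout are assumptions for illustration, not the project's actual configuration. The real function would submit the returned step with boto3's EMR client (add_job_flow_steps).

```python
# Hypothetical sketch of an enqueue/init Lambda: turn an S3 put event into
# a spark-submit step definition for EMR. All names are illustrative.
RADAR_SCRIPT = "s3://radar-etl/scripts/transform.py"  # assumed script location

def handler(event, context=None):
    """Parse the first S3 record and build the EMR step that processes it."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]
    return {
        "Name": f"radar-load-{key}",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", RADAR_SCRIPT, f"s3://{bucket}/{key}"],
        },
    }
```

In production the handler would call boto3's `emr` client to attach this step to the running cluster; it returns the step here so the shape is easy to inspect.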
Environment: AWS, Snowflake Tasks, dbt, Airflow, EC2, S3, IAM, SQS, SNS, Snowflake, EMR-Spark, RDS, JSON, MySQL Workbench, ETL-Informatica, Oracle, Red Hat Linux, Tableau.
USFOODS, Rosemont, IL July 2019 - Aug 2020
Sr. AWS Data Engineer
Responsibilities:
Data Solutions is an ongoing incremental data load from various data sources.
Data Solutions feeds are transferred to Snowflake/MySQL/S3 through an ETL EMR process.
Extensively worked on moving data from Snowflake to S3 for the TMCOMP/ESD feeds.
Extensively worked on moving data from Snowflake to S3 for LMA/LMA Search.
Expertise in troubleshooting and resolving data pipeline related issues.
Used an enqueue/init Lambda to trigger the EMR process.
EMR is used for all SQL and Python script transformations in the ETL load process; Informatica is used to pick up source files from S3.
Worked on the CDMR migration project from Oracle to Snowflake.
Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Python on EMR.
Processed data from different sources into Snowflake and MySQL.
Implemented Spark using Python and Spark SQL for faster testing and processing of data.
Wrote and loaded data into the data lake environment (Snowflake) from AWS EMR, which was accessed by business users and data scientists using Tableau/OBIEE.
Copied data from S3 to Snowflake and connected with SQL Workbench for seamless importing and movement of data via S3.
Worked on DMS to move data to SQL Server.
Met with business/user groups to understand the business process and fixed high-priority production support issues.
Involved in and supported all production support activities as part of the Data Solutions team.
Served as a Subject Matter Expert on assigned projects.
POC for the LMA Search team's end-to-end production process.
Delivered MTTR analysis reports quarterly and WSR reports weekly.
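The S3-to-Snowflake loads above can be sketched with the statement they typically boil down to; the table, stage, and format names here are placeholders, not the project's actual Snowflake objects.

```python
# Hedged sketch: build a Snowflake COPY INTO statement for files staged on
# S3 via a named stage. Object names are illustrative only.
def build_copy_into(table: str, stage_path: str, file_type: str = "CSV") -> str:
    """Assemble the COPY INTO statement used to pull staged files into a table."""
    return (
        f"COPY INTO {table} "
        f"FROM {stage_path} "
        f"FILE_FORMAT = (TYPE = {file_type} SKIP_HEADER = 1) "
        f"ON_ERROR = 'ABORT_STATEMENT'"
    )
```

In practice such a statement would be executed through snowflake-connector-python or scheduled inside a Snowflake Task.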
Environment: AWS, EC2, S3, IAM, SQS, SNS, Snowflake, EMR-Spark, RDS, JSON, MySQL Workbench, ETL-Informatica, Oracle, Red Hat Linux, Tableau.
Levi Strauss & Co, SFO, CA Apr 2016 - July 2019
AWS Lead Data Engineer
Responsibilities:
Rivet-Know Me is an ongoing incremental data load from various data sources.
Rivet-Know Me feeds are transferred through the ETL process from three types of data sources (Hybris, Omniture, Responsys) to AWS S3.
Used DMS to load source data into Type 1 and Type 2 Redshift tables.
Extensively worked on moving data from Snowflake to RDS and from Snowflake to S3 for the ESD and CLM projects.
Expertise in troubleshooting and resolving Data Pipeline related issues.
Worked on L1/L2/L3 production support issues.
Expertise in resolving ServiceNow tasks and incidents.
Used Lambda to trigger Data Pipelines when files arrive.
Data Pipeline is used for all SQL and Python script transformation loads into Redshift for the incremental load process; ETL-Talend is used to pick up files from S3 to the targets.
Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Python on EMR.
Processed data from different sources into AWS Redshift using EMR-Spark and Python programming.
Implemented Spark using Python, Scala, and Spark SQL for faster testing and processing of data.
Wrote and loaded data into the data lake environment (AWS Redshift) from AWS EMR and Data Pipelines.
AWS Redshift was accessed by business users and data scientists using Tableau.
Copied data from S3 to Redshift and connected with SQL Workbench for seamless importing and movement of data via S3.
Worked on DMS to move data to Redshift for the Business/Analytics team.
Used SQL Workbench for faster access to Redshift data.
Worked in the AWS environment on development and deployment of custom Hadoop applications.
Met with business/user groups to understand the business process and fixed high-priority production support issues.
Led the KNOWME team and was involved in and supported all production support activities.
Served as a Subject Matter Expert on assigned projects.
POC for the KNOWME/GMI team's end-to-end production process.
Delivered MTTR analysis reports quarterly and WSR reports weekly.
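The incremental Redshift loads above typically follow the staged delete-then-insert upsert pattern; a hedged sketch of that statement sequence, with illustrative table and key names:

```python
# Hedged sketch of a Type 1 incremental refresh for Redshift: delete target
# rows that reappear in staging, then insert the staged extract. Names are
# placeholders, not the project's actual tables.
def incremental_load_sql(target: str, staging: str, key: str) -> list:
    """Return the statement sequence for refreshing target from a staging table."""
    return [
        "BEGIN",
        f"DELETE FROM {target} USING {staging} "
        f"WHERE {target}.{key} = {staging}.{key}",
        f"INSERT INTO {target} SELECT * FROM {staging}",
        "COMMIT",
    ]
```

Wrapping the delete and insert in one transaction keeps readers from seeing the target mid-refresh.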
Environment: AWS, EC2, S3, IAM, Data Pipeline, EMR-Spark, RDS, Redshift, Snowflake, Avro, JSON, SQL Workbench, ETL-Talend, Oracle, Red Hat Linux, Tableau.
CITIBANK, NYC July 2013 - April 2016
Sr. Technology Lead
Responsibilities:
Developed the code for Importing and exporting data into HDFS and Hive using Sqoop.
Responsible for writing Hive Queries for analyzing data in Hive warehouse using HQL.
Involved in defining job flows using Oozie to schedule and manage Apache Hadoop jobs as directed acyclic graphs.
Developed Hive User Defined Functions in Java, compiled them into JARs, added them to HDFS, and executed them from Hive queries.
Experienced in managing and reviewing Hadoop log files. Tested and reported defects in an Agile Methodology perspective.
Involved in installing Hadoop ecosystems (Hive, Sqoop, HBase, Oozie) on top of Hadoop cluster
Imported data from SQL databases to HDFS and Hive for analytical purposes.
Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the backend.
Worked on clean dependency analysis.
Created AbInitio graphs to read and write to HDFS, utilizing Rollup, Join, Sort, Replicate, Partition components to speed up the ETL process.
Understand and contribute towards Projects’ Technical Design as well, along with the requirement specifications.
Testing the solution to validate project objectives.
Managed end-to-end application delivery; owned the quarterly application release development and ensured smooth User Acceptance Testing and issue resolution.
Preparation and review of Test Plans/Scenarios/Test Cases at development, IST, UAT, prod stages.
Close participation in all stages of SDLC creation of AbInitio development work products that conform to the stated business requirements and high-level design documents.
Performed appropriate unit-level testing of work products and managed the review process for the AbInitio deliverables.
Tracks and reports on issues and risks, escalating as needed
Expertly handles last minute requests and stressful situations
Develop test strategy based on design/architectural documents, requirements, specifications and other documented sources
Develop test cases and test scripts based on documented sources
Close participation in all stages of SDLC creation using Agile methodology
Organizing events & conducting Presentations, Trainings, Effective Meetings, Project Status Reporting to Senior Management.
Coordinates with other team members to ensure that all work products integrate together as a complete solution and adopts a supporting role to any other team member to resolve issues, or to complete tasks sooner
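The Sqoop import step described above can be sketched as the command it assembles; the JDBC URL and table names are placeholders, not the engagement's actual connection details.

```python
# Hedged sketch: assemble the argv for `sqoop import` from an RDBMS into
# HDFS, also registering the data in Hive. All names are illustrative.
def sqoop_import_cmd(jdbc_url: str, table: str, target_dir: str, mappers: int = 4) -> list:
    """Build the sqoop import command line for one source table."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),   # degree of parallel extraction
        "--hive-import",                 # create/load a matching Hive table
    ]
```

The command list would normally be handed to a scheduler (Oozie/Autosys) or a shell wrapper rather than run inline.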
Environment: Hadoop, HDFS, Sqoop, Hive, RDBMS, AbInitio ETL tool, Oracle, Teradata, UNIX, and Autosys.
JPMC, USA June 2010 - Sep 2011
Sr. Developer
Responsibilities:
Creation of AbInitio development work products that conform to the stated business requirements and high-level design documents.
Created AbInitio graphs, utilizing Rollup, Join, Sort, Replicate, Partition components to speed up the ETL process
Coordinates with other team members to ensure that all work products integrate together as a complete solution and adopts a supporting role to any other team member to resolve issues, or to complete tasks sooner
Worked in UAT and on L1 and L2 production support.
Worked on ServiceNow and Peregrine tickets.
Monitored daily, weekly, and monthly jobs.
Promoted code to the production environment from lower environments.
Handled database and ETL issues.
Addressed end-user queries in a timely fashion.
Engaged different teams when load jobs were in a failed state.
Tracked issues and risks, escalating as needed.
Performed appropriate unit-level testing of work products and managed the review process for the AbInitio deliverables.
Involved in preparing HLDs and LLDs. Extensively involved in EME concepts. Prepared test data to test the developed components.
Designed and developed graphs by using AbInitio
Developed complex generic and conditionalized AbInitio graphs with emphasis on optimizing performance.
Environment: AbInitio ETL Tool, DB2, and UNIX, Control-M
CITIBANK, USA May 2009 - May 2010
Sr. Developer
Responsibilities:
Created Abinitio graphs, utilizing Rollup, Join, Sort, Replicate, Partition components to speed up the ETL process
Coordinates with other team members to ensure that all work products integrate together as a complete solution and adopts a supporting role to any other team member to resolve issues, or to complete tasks sooner
Involved in preparing HLDs and LLDs. Extensively involved in EME concepts. Prepared test data to test the developed components.
Designed and developed graphs by using AbInitio.
Developed complex generic and conditionalized AbInitio graphs with emphasis on optimizing performance
Performed appropriate unit-level testing of work products and managed the review process for the AbInitio deliverables.
Tracks and reports on issues and risks, escalating as needed
Expertly handles last minute requests and stressful situations
Environment: AbInitio ETL Tool, Oracle 9i, Unix
H&M, Sweden Oct 2008 - Mar 2009
Sr. Developer
Responsibilities:
Responsible for designing and developing various applications in the project; analyzed requirements and developed graphs and scripts to produce the appropriate results.
Testing of the graphs and application.
Involved in Amendments of Graphs.
Performed appropriate unit-level testing of work products and managed the review process for the AbInitio deliverables.
Tracks and reports on issues and risks, escalating as needed
Expertly handles last minute requests and stressful situations
Environment: AbInitio ETL Tool, Oracle 9i, UNIX
General Electric, USA Dec 2006 - Sep 2008
IM Programmer
Responsibilities:
Creation of AbInitio development work products that conform to the stated business requirements and high-level design documents.
Created AbInitio graphs, utilizing Rollup, Join, Sort, Replicate, Partition components to speed up the ETL process
Coordinates with other team members to ensure that all work products integrate together as a complete solution and adopts a supporting role to any other team member to resolve issues, or to complete tasks sooner
Performed appropriate unit-level testing of work products and managed the review process for the AbInitio deliverables.
Environment: AbInitio ETL Tool, Oracle 9i, UNIX