
Data Engineer

Location: Piscataway, NJ
Posted: August 03, 2018


Bharath Kumar Gondi

Phone: 732-***-**** Email: ************@*****.***

Professional Summary:

3+ years of experience in the IT industry involving Software Analysis, Design, Implementation, Coding, Development, Testing and Maintenance, with a focus on Data Warehousing applications using ETL tools like Talend and Informatica.

3+ years of experience using Talend Integration Suite (6.1/5.x) / Talend Open Studio (6.1/5.x)

2+ years of experience with Talend Admin Console (TAC).

Experience working with Data Warehousing concepts like Kimball/Inmon methodologies, OLAP, OLTP, Star Schema, Snowflake Schema, and Logical/Physical/Dimensional Data Modeling.

In-depth understanding of Gap Analysis, i.e., As-Is and To-Be business processes, and experience in converting these requirements into Technical Specifications and Test Plans.

Highly proficient in Agile, Test-Driven, Iterative, Scrum and Waterfall software development life cycles.

Extensively used ETL methodology for Data Profiling, Data Migration, Extraction, Transformation and Loading using Talend, and designed data conversions from a wide variety of source systems including Amazon Redshift, Hive and Amazon Aurora, as well as file-based sources like flat files, XML and Mainframe files.

Experience in analyzing data using HiveQL and Pig Latin in HDFS.

Extracted data from multiple operational sources and loaded staging areas, Data Warehouses and Data Marts using SCD (Type 1/Type 2/Type 3) loads.
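The Type 2 variant of the SCD loads above can be sketched in a few lines. This is a minimal, illustrative Python sketch with hypothetical column names (`is_current`, `effective_from`, `effective_to`), not the actual Talend jobs:

```python
from datetime import date

def scd2_merge(dimension, incoming, business_key, tracked_cols, today=None):
    """Apply a Slowly Changing Dimension Type 2 merge.

    dimension: list of dicts, each carrying is_current/effective_from/effective_to.
    incoming:  list of dicts from the staging area.
    Mutates and returns the dimension list.
    """
    today = today or date.today().isoformat()
    current = {row[business_key]: row for row in dimension if row["is_current"]}
    for rec in incoming:
        old = current.get(rec[business_key])
        if old is None:
            # New member: insert as the current version.
            dimension.append({**rec, "is_current": True,
                              "effective_from": today, "effective_to": None})
        elif any(old[c] != rec[c] for c in tracked_cols):
            # Changed member: expire the old version, insert a new one.
            old["is_current"] = False
            old["effective_to"] = today
            dimension.append({**rec, "is_current": True,
                              "effective_from": today, "effective_to": None})
        # Unchanged members are left as-is (Type 1 would overwrite in place).
    return dimension
```

A change to a tracked column therefore yields two rows for the same business key: the expired version with a closed effective window, and the new current version.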

Technical Skills:

Technologies: Amazon EC2, S3, SimpleDB, RDS, Elastic Load Balancing, SQS, Elastic IPs, EBS, Virtual Private Cloud (VPC), Elastic Beanstalk, DynamoDB, Redshift, Hadoop MapReduce, Hive, Pig, Sqoop, Spark, Java, SQL

Data Warehouse/Data Marts: ETL/ELT development in big data environments, Dimensional Modeling

Repositories: Git and SVN

Tools: Talend, Informatica BDE, Informatica PC, SQL Server, Aginity, SQL Workbench, Tableau, Eclipse IDE, Jenkins, Kanban board

Monitoring tools: Talend Administrator Console (TAC), CloudWatch, Packet Tracer

Software Development Methodologies: Agile using Scrum and Kanban

Operating Systems: Linux, Windows

Certifications/Licenses

AWS Certified Solutions Architect- Associate

Validation Number 5W3FB8X2CMQ11E5N

Validate at: http://aws.amazon.com/verification

AWS Certified Developer- Associate

Validation Number NB5EJ3D1KFF41ZCL

Validate at: http://aws.amazon.com/verification

AWS Certified Sysops Administrator- Associate

Validation Number BR2X67RKCMEEQHKC

Validate at: http://aws.amazon.com/verification

Talend 6.3 DQ Essentials (DQ-201-v63-EN)

Talend 6.3 MDM Basics (MDM-301-v63-EN)

Professional Experience:

Akorn Pharmaceuticals, Lake Forest, IL August 2017 - Present

Talend ETL Data Integration Developer/Data Engineer

Worked on the project end to end, from data staging through Data Mart loading and reporting

Extracted data from various data sources into EC2 instance, Amazon S3

Processed data with audit controls and loaded it into the Redshift database

Transformed the data into the refined schema and built a data lake with a dimensional model using Slowly Changing Dimensions (SCDs)

Created Talend ETL mappings using components such as tMap, tJoin, tReplicate, tParallelize, tConvertType, tFlowToIterate, tAggregate, tSortRow, tFlowMeter, tLogCatcher, tRowGenerator, tNormalize, tDenormalize, tSetGlobalVar, tHashInput, tHashOutput, tJava, tJavaRow, tAggregateRow, tWarn, tMysqlSCD, tFilter, tGlobalMap and tDie

Extracted data from the refined schema into the Reporting Mart used for reporting purposes

Involved in setting up the development and production instances for the Talend jobs and in deploying the jobs into the production environment

Created the jobs in the Job Conductor of TAC and created execution plans to run the jobs on cron triggers

Involved in performance analysis, monitoring and SQL query tuning using Collect Statistics

Regeneron Pharmaceuticals, Tarrytown, NY Oct 2016 - July 2017

Talend ETL Data Engineer

Involved in developing an ingestion framework for ingesting data from different data sources and formats and transforming it into a standard format

Designed a data lake (4-tier architecture) to ingest data from source to target

Involved in implementing Talend orchestration pipeline

Developed an event-driven mechanism (AWS Lambda) that invokes the Talend job to move data from one zone of the data lake to another
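The event-driven zone promotion above can be sketched as an S3-triggered Lambda handler. This is a hedged illustration: the zone prefixes (`raw`, `refined`), the helper `promote_key`, and the bucket layout are hypothetical, not taken from the actual project:

```python
def promote_key(source_key, from_zone="raw", to_zone="refined"):
    """Map an object key from one data-lake zone prefix to the next.

    e.g. 'raw/sales/2017/08/orders.csv' -> 'refined/sales/2017/08/orders.csv'
    """
    prefix = from_zone + "/"
    if not source_key.startswith(prefix):
        raise ValueError(f"key {source_key!r} is not in the {from_zone!r} zone")
    return to_zone + "/" + source_key[len(prefix):]

def handler(event, context):
    """Lambda entry point: copy each newly arrived S3 object to the next zone."""
    import boto3  # provided by the AWS Lambda runtime
    s3 = boto3.client("s3")
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        s3.copy_object(Bucket=bucket,
                       CopySource={"Bucket": bucket, "Key": key},
                       Key=promote_key(key))
```

In practice the copy step could equally invoke a Talend job (e.g. via TAC) rather than copying the object directly; only the key-mapping logic is shown concretely here.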

Performed data quality and standardization checks while moving data from one zone to another

Developed transformation jobs, placed data into refined schema using Talend components

Contributed to setting up the development and production instances in the AWS environment and to setting up GitHub repositories across the environments

Implemented parameterization such as ETL control tables (Audit, Error) and performed unit testing
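One row of the kind of Audit control table mentioned above can be sketched as follows. The table name and columns here are hypothetical stand-ins, not the project's actual schema:

```python
from datetime import datetime, timezone

def make_audit_record(job_name, rows_read, rows_written, rows_rejected, run_id):
    """Build one row for a hypothetical ETL_AUDIT control table.

    The run is flagged for review when read rows are not fully accounted
    for by written plus rejected rows.
    """
    status = "OK" if rows_read == rows_written + rows_rejected else "CHECK"
    return {
        "run_id": run_id,
        "job_name": job_name,
        "rows_read": rows_read,
        "rows_written": rows_written,
        "rows_rejected": rows_rejected,
        "status": status,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }
```

Rejected rows would go to a companion Error table; the audit row gives a per-run reconciliation check that unit tests can assert against.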

Created end-user scenarios in Tableau covering viewing the dashboard, searching for data, viewing data details, and editing metadata, comments and tags

Took initiative to create data lineage for the data catalog registration

Performed backend data validations between the Tableau reports and database schemas using SQL queries
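The report-versus-database validation above amounts to diffing two result sets on a key. A minimal sketch, assuming both sides have already been fetched into lists of dicts (the function and field names are illustrative):

```python
def compare_rowsets(report_rows, db_rows, key_cols):
    """Diff two result sets (e.g. a Tableau extract vs. a warehouse query).

    Rows are matched on key_cols. Returns keys missing from either side
    and keys whose non-key values disagree.
    """
    def keyed(rows):
        return {tuple(r[c] for c in key_cols): r for r in rows}

    left, right = keyed(report_rows), keyed(db_rows)
    return {
        "missing_in_db": sorted(k for k in left if k not in right),
        "missing_in_report": sorted(k for k in right if k not in left),
        "mismatched": sorted(k for k in left.keys() & right.keys()
                             if left[k] != right[k]),
    }
```

An empty result on all three lists is the pass condition; any populated list points at the exact keys to investigate with targeted SQL.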

Blue Cross Blue Shield Michigan Jan 2016 - September 2016

Big data Engineer

Involved in gathering business requirements and analysis of business use cases

Designed and developed Sqoop scripts for dataset transfer between Hadoop and RDBMS

Developed Pig scripts to do data cleansing, transformations, event joins, filtering and some pre-aggregations

Developed Pig Latin UDFs in Java and used various UDFs from Piggybank and other sources to customize Pig scripts

Created Hive tables, loaded them with data and wrote Hive queries to analyze the data

Worked on implementation and maintenance of Cloudera Hadoop cluster

Performed extensive data validation using Hive dynamic partitioning and bucketing for efficient data access

Produced quality technical documentation covering operations, complex configuration management, architecture changes and maintenance of Hadoop clusters

Used Jira for project tracking, Bug tracking and Project Management

Involved in Scrum calls, grooming and demo meetings

THETAKE, NY Feb 2015 - May 2015

Intern

Involved in creation of Infrastructure build and User guide documents and Knowledge transfer to new Team Members

Performed Disaster Recovery Test for applications in a seamless manner along with DC-DR failover activities

AMAZON SOFTWARE, San Francisco, CA Oct 2014 - Dec 2015

Intern

Built a mini-player for Amazon Music for PC/Mac that involved making changes to cross-platform native C++/Objective-C code that interacts with the native OS layers (Windows and Mac)

Implemented toggling between the mini player and full player using buttons, keyboard shortcuts and window dragging; implemented rich user interfaces using SASS features and Backbone models with the Require.js and Underscore libraries

Reached medal podium in Amazon Global Intern Hackathon Event out of 27 teams

Education:

New York University Polytechnic School of Engineering, Brooklyn, NY January 2014 - December 2015

Master of Science in Computer Science

Visvesvaraya Technological University, Bangalore, India September 2008 -June 2013

Bachelor of Engineering in Computer Science


