
Data Engineer

Location:
Plano, TX
Posted:
March 31, 2018


Accomplished IT professional with * years of experience in the design and implementation of data warehousing, analytics, data integration, and big data projects, including more than 4 years with big data technologies.

Strong communication and leadership skills, with sound knowledge and practical experience in data warehousing concepts. A self-motivated, hardworking team player with a short learning curve and a constant drive to learn.

Professional Summary:

Data Warehouse:

Independently implemented multiple end-to-end projects on various database applications based on business partners' requirements.

Extensive experience in data and dimensional modeling, including Ralph Kimball data warehouse models with star/snowflake schema designs, covering analysis and definition, database design, testing, implementation, and quality processes.

Leveraged multiple ETL technologies such as DataStage, Hadoop, and Teradata to develop ETL applications with maximum efficiency and optimal resource utilization.

Consulted with business partners and made recommendations to improve the effectiveness of Big Data systems, descriptive analytics systems, and prescriptive analytics systems.

Proficient in creating new data collection systems that optimize data management, capture, delivery, and quality.

Used Erwin for conceptual, logical, and physical data modeling.

Midrange ETL:

Extensive experience in ETL/ELT methodologies supporting data migration, transformation, and cleansing of structured, semi-structured, and unstructured data using ETL tools (Informatica PowerCenter, DataStage).

Big Data:

Working knowledge of big data analytics and the Cloudera Hadoop distribution and its ecosystem (Hadoop, Hive).

Experience using Hive to work with data stored in the Hadoop Distributed File System (HDFS).

Sound understanding of the Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN, and MapReduce.

Developed big data solutions using the Apache Hadoop ecosystem and tools such as Hive, Spark, Sqoop, Pig, Kafka, Python, Oozie, and Storm, including Spark on AWS.

Designed and implemented data streaming applications that produce and consume data feeds using Kafka and Spark Streaming (see the sketch following this list).

Built and migrated multiple legacy applications into the Hadoop environment with improved efficiency and performance.
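
The following is a minimal, illustrative PySpark Structured Streaming sketch of the kind of Kafka produce/consume pipeline described above; the broker address, topic name, schema fields, and HDFS paths are placeholder assumptions, not actual project objects, and the spark-sql-kafka connector is assumed to be on the classpath.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import StructType, StringType, DoubleType

    spark = SparkSession.builder.appName("feed-consumer-sketch").getOrCreate()

    # Hypothetical schema for an incoming JSON feed.
    schema = (StructType()
              .add("account_id", StringType())
              .add("event_type", StringType())
              .add("amount", DoubleType()))

    # Consume a Kafka topic and parse the JSON payload from the message value.
    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "broker1:9092")  # assumed broker
           .option("subscribe", "transactions")                # assumed topic
           .load())

    events = (raw.selectExpr("CAST(value AS STRING) AS json")
              .select(from_json(col("json"), schema).alias("e"))
              .select("e.*"))

    # Write the parsed feed to HDFS as Parquet, with checkpointing for recovery.
    query = (events.writeStream
             .format("parquet")
             .option("path", "/data/feeds/transactions")       # assumed HDFS path
             .option("checkpointLocation", "/chk/transactions")
             .start())
    query.awaitTermination()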

Domain Expertise:

Built, maintained, and enhanced various federal compliance applications in the banking sector (AML, OFAC, KYC) and a customer loyalty program in the telecom sector.

Proven experience building banking applications that ingest data from lines of business (LOBs) such as GWM, CED, and WCC into a holistic customer repository for enterprise reporting and federal compliance reporting.

Worked on other enterprise applications in the deposits, credit cards, and loans LOBs.

Built Verizon's MyRewards+ customer loyalty program, ingesting data from various transaction points to identify and award points for redemption.

Technical Skills:

Big Data Tools: Hive, Pig, Oozie, Sqoop, Impala, Kafka, Scala, Spark, NiFi, AWS, BDA, Cloudera distribution

Programming: SQL, Unix shell scripting, Python

ETL Tools: IBM DataStage, Informatica

Databases: Teradata 13.0/14.1, MySQL, MongoDB (NoSQL), Exadata

Schedulers: Autosys, Oozie, Tivoli

Version Control: GitHub, TortoiseSVN

Certifications:

Cloudera Certified Hadoop Developer (CCDH-410), License Number 100-014-320

Teradata 12 Basics

Professional Experience:

BigData Engineer

Vantiv Inc. – Contractor, Cincinnati, Ohio, USA (Jan 2018 – Present)

Project: Enterprise Data Office – Settlements

Roles & Responsibilities:

Ingest data from different sources into BDA to build the enterprise big data warehouse.

Develop and schedule Oozie (4.1) workflows for processing data.

Transform and analyze data using Spark, Hive, and Pig based on ETL mappings (see the sketch following this list).

Perform L1/L2 production support to address user tickets and carry out annual DR activities.

Use DataStage, BDA, BDM, and Teradata to perform ETL and prepare data lakes for various domains.

Extract data from Teradata to HDFS using Sqoop (1.4) for the settlement and billing domains.

Tune application performance to optimize resource usage and run time.

Design application flow and implement it end to end: gather requirements, build code, perform testing, and deploy to production.
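
A minimal, illustrative sketch of a Spark-based ETL mapping step of the kind described above; the HDFS path, column names, and target table (edw.settlement_fact) are placeholder assumptions rather than actual project objects.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, to_date, when

    spark = (SparkSession.builder
             .appName("settlement-etl-sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Source data landed on HDFS (for example, by a Sqoop extract from Teradata).
    src = spark.read.parquet("/data/raw/settlements")  # assumed path

    # Apply mapping rules: type casts, default values, and basic cleansing.
    mapped = (src
              .withColumn("settle_dt", to_date(col("settle_dt"), "yyyy-MM-dd"))
              .withColumn("status",
                          when(col("status").isNull(), "UNKNOWN")
                          .otherwise(col("status")))
              .filter(col("merchant_id").isNotNull()))

    # Load the conformed data into the warehouse layer as a Hive table.
    mapped.write.mode("overwrite").saveAsTable("edw.settlement_fact")  # assumed table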

Teradata Consultant

Verizon – Contractor, Temple Terrace, Florida, USA (May 2016 – Dec 2017)

Project: aEDW - MyRewards Social Gifting - Consumer and Business

Roles & Responsibilities:

Perform end-to-end development activities, from requirements gathering and analysis to system design, coding, and testing.

Use Teradata utilities such as MultiLoad, BTEQ, and FastExport for data transfers required by various applications.

Perform query tuning and index optimization for complex SQL queries in production.

Extract data from Teradata/Oracle to HDFS using Sqoop (1.4) for the customer journey index application, which feeds the customer churn model.

Import and export data between HDFS and Amazon Redshift (AWS) using Sqoop.

Load data files from the UNIX server to HDFS for loading into Hive.

Transform and analyze data using Spark, Hive, Pig, and NiFi on the Cloudera distribution.

Build and automate Spark applications to process data.

Tune Hive performance using techniques such as partitioning and merging many small files (see the sketch following this list).

Ingest (import/export) data between HDFS and RDBMS using Sqoop for different file formats.

Develop and schedule Oozie (4.1) workflows for processing data.
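
A minimal, illustrative sketch of the Hive tuning approach referenced above, combining date-based partitioning with coalescing to avoid many small files; the database, table, and column names are placeholder assumptions.

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive-tuning-sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Allow writing into individual partitions dynamically.
    spark.conf.set("hive.exec.dynamic.partition", "true")
    spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")

    daily = spark.table("stage.customer_events")  # assumed staging table

    # Coalesce to a small number of output tasks before writing, reducing the
    # number of small files created per date partition.
    (daily.coalesce(8)
     .write.mode("overwrite")
     .partitionBy("event_dt")  # assumed partition column
     .saveAsTable("edw.customer_events"))

    # Queries that filter on event_dt now read only the matching partitions.
    spark.sql("SELECT COUNT(*) FROM edw.customer_events "
              "WHERE event_dt = '2017-06-01'").show()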

Data Engineer

TCS – Bank of America Relations, Jacksonville, Florida, USA (Oct 2013 – Apr 2016)

Project: The W - PRDS ADS Application Development

Roles & Responsibilities:

Gathered requirements from client partners for application development and implemented end-to-end projects with zero defects.

Implemented slowly changing dimension (SCD) logic in mappings to handle change data capture, which is typical of data warehousing systems (see the sketch following this list).

Built logical and physical data models for the enterprise data warehouse for easy access and optimal storage of various domain data.

Developed an automated process to perform data quality checks between two systems and generate quality reports using DataStage and Hadoop technologies.

Leveraged multiple ETL technologies such as DataStage, Hadoop, and Teradata to develop ETL applications with maximum efficiency and optimal resource utilization.

Built data models by analyzing table structures to reduce data redundancy.

Participated in several POCs on Cloudera Hadoop, converting small, medium, and complex legacy functionality to Hadoop.

Developed and scheduled Oozie (4.1) workflows for processing data.

Loaded data files from the UNIX server to HDFS for loading into Hive.
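
A minimal, illustrative PySpark sketch of a Type 2 slowly changing dimension load of the kind referenced above; all table and column names (edw.cust_dim, stage.cust_feed, cust_id, addr, and so on) are hypothetical.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, lit, current_date

    spark = (SparkSession.builder
             .appName("scd2-sketch")
             .enableHiveSupport()
             .getOrCreate())

    dim = spark.table("edw.cust_dim").filter(col("is_current") == 1)  # current dimension rows
    stg = spark.table("stage.cust_feed")                              # today's extract

    # Rows whose tracked attribute changed since the current dimension row.
    changed = (stg.alias("s").join(dim.alias("d"), "cust_id")
               .filter(col("s.addr") != col("d.addr"))
               .select("cust_id", col("s.addr").alias("addr")))

    # Expire the old versions of the changed rows...
    expired = (dim.join(changed.select("cust_id"), "cust_id")
               .withColumn("end_dt", current_date())
               .withColumn("is_current", lit(0)))

    # ...and create the new current versions.
    new_rows = (changed
                .withColumn("start_dt", current_date())
                .withColumn("end_dt", lit(None).cast("date"))
                .withColumn("is_current", lit(1)))

    # Stage the delta; the final merge back into the dimension would follow.
    (expired.unionByName(new_rows)
     .write.mode("overwrite").saveAsTable("stage.cust_dim_delta"))  # assumed target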

ETL Midrange and Teradata Applications Developer

TCS – Bank of America Relations, Chennai, India (Sep 2010 – Sep 2013)

Project: The W - Hadoop Enhancements and Production Support

Roles & Responsibilities:

Managed around 60 applications in the Teradata data warehouse (called The W) across different LOBs under the ECIO domain in Global Technology and Operations.

Planned and executed Teradata hardware/software upgrades, disaster recovery exercises, and technology migrations over the years, and coordinated post-upgrade recovery of enterprise applications to bring the system back to BAU.

Enhanced applications running in the production environment to improve efficiency, with better run duration and resource utilization.

Performed ETL operations on raw data from various sources, including mainframes, DataStage, and Hadoop.

Addressed user tickets and queries about the data, the applications, and the business logic behind them.

Worked as a level 2 and 3 support analyst for DataStage and Informatica platform applications, providing 24/7 support and resolving abends within predefined SLAs.

Performed performance analysis on both DataStage and target Teradata systems, identifying bottlenecks and applying indexes, collected statistics, and query tuning.

Delivered various value-add automations that saved both man-hours and costs.

Performed query tuning and index optimization for complex SQL queries in production.

Education:

Bachelor of Engineering in Electronics and Communication from SRM University, Chennai, Tamil Nadu, India.


