
Data Entry Mainframe Developer

Location:
Atlanta, GA
Posted:
February 09, 2023


Resume:

Senior Data Engineer

E-mail: *********.*******@*****.***

Mobile: +1-470-***-****

PROFESSIONAL SUMMARY

Around 10 years of professional experience in Information Technology, including around 5 years of expertise in Big Data using the Hadoop framework, covering analysis, design, development, testing, documentation, deployment, and integration using SQL and Big Data technologies.

Expertise in using major Hadoop ecosystem components such as HDFS, YARN, MapReduce, Hive, Impala, Pig, Sqoop, HBase, Spark, Spark SQL, Flume, Oozie, ZooKeeper, and Hue.

Good understanding of distributed systems, HDFS architecture, and the internal workings of the MapReduce and Spark processing frameworks.

Knowledge of ETL methods for data extraction, transformation, and loading in corporate-wide ETL solutions, and of data warehouse tools for reporting and data analysis.

Deployed Big Data Hadoop applications on Microsoft Azure using Talend.

Involved in creating external Hive tables from files stored in Azure ADLS.

Good knowledge of AWS cloud services such as Amazon S3, Glue, EC2, and Redshift.

Optimized Hive tables using partitioning and bucketing to improve the execution of HiveQL queries.

Used Spark SQL to read data from Hive tables and perform data cleansing, validation, transformations, and aggregations per downstream business team requirements.
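
A minimal sketch of this pattern, assuming a Hive-enabled SparkSession; the database, table, and column names (sales.transactions, analytics.daily_customer_totals, customer_id, amount) are illustrative placeholders, not names from the projects listed below:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object HiveCleansingJob {
  def main(args: Array[String]): Unit = {
    // Hive support lets Spark SQL read metastore-managed tables directly.
    val spark = SparkSession.builder()
      .appName("hive-cleansing-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical source table; replace with the real database/table.
    val txns = spark.table("sales.transactions")

    // Cleansing and validation: drop rows missing keys, keep positive amounts.
    val cleaned = txns
      .na.drop(Seq("customer_id", "txn_date"))
      .filter(col("amount") > 0)

    // Aggregation for a downstream team: daily totals per customer.
    val daily = cleaned
      .groupBy(col("customer_id"), col("txn_date"))
      .agg(sum("amount").alias("daily_amount"), count("*").alias("txn_count"))

    daily.write.mode("overwrite").saveAsTable("analytics.daily_customer_totals")
    spark.stop()
  }
}
```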

Worked extensively with the Data Science team to help productionize machine learning models and to build feature datasets as needed for data analysis and modeling.

Hands-on experience in Scala programming for implementing Spark code.

Good working knowledge of RDDs and DataFrames.

Worked with different RDD transformations and actions to transform data.
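
An illustrative RDD pipeline showing common transformations (map, filter, reduceByKey) followed by an action (collect); the sample records are made up for the sketch:

```scala
import org.apache.spark.sql.SparkSession

object RddBasicsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("rdd-basics").getOrCreate()
    val sc = spark.sparkContext

    val lines = sc.parallelize(Seq("a,1", "b,2", "a,3", "c,4"))

    val totals = lines
      .map(_.split(","))                    // transformation: parse each record
      .filter(_.length == 2)                // transformation: drop malformed rows
      .map(parts => (parts(0), parts(1).toInt))
      .reduceByKey(_ + _)                   // transformation: sum per key

    totals.collect().foreach(println)       // action: triggers the computation
    spark.stop()
  }
}
```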

Involved in Spark SQL integration with Hive to work with Hive tables and process data efficiently in Spark.

Experience in the complete project life cycle (design, development, testing, and implementation) of client-server and web applications.

Excellent programming skills with experience in Core Java, C, SQL, and Python.

Worked across various programming languages using IDEs and tools such as Eclipse, IntelliJ, PuTTY, and Git.

Experienced in working within the SDLC using Agile and Waterfall methodologies.

EDUCATION

Master of Computer Applications (M.C.A.) from Bharath University, India, in 2010.

Bachelor of Science (Math, Physics & Computers) from Nagarjuna University, India, in 2007.

TOOLS AND TECHNOLOGIES

BigData/Hadoop Technologies

MapReduce, Spark, Spark SQL, Azure, Spark Streaming, Kafka, Pig, Hive, HBase, Flume, YARN, Oozie, ZooKeeper, Hue, Ambari Server

Languages

C, C++, Core Java, Scala, Python, Shell Scripting

NoSQL Databases

Cassandra, HBase, MongoDB

Web Design Tools

HTML, JavaScript, XML

Development Tools

Microsoft SQL Studio, IntelliJ, Azure Databricks, Eclipse.

Public Cloud

Azure ADLS, Data Factory, Databricks

Orchestration tools

Oozie, Airflow, Azkaban, Control M, DSeries

Development Methodologies

Agile/Scrum, Waterfall

Build Tools

Jenkins, SQL Loader, Talend, Maven, Control-M, Oozie, Hue

Reporting Tools

MS Office (Word/Excel/PowerPoint/Outlook)

Databases

PostgreSQL, DB2, MySQL 4.x/5.x, Oracle 11g/12c, Teradata

Operating Systems

Windows (all versions), UNIX, Linux

PROJECT #:

Project Name : Visa

Client : Visa, Austin

Environment : Apache Hadoop, Hortonworks

Tools : Hive, Sqoop, Spark, Scala, Control M

Duration : May 2022 to Present

Role : Bigdata Engineer

Description:

Visa runs a number of different applications; this project processes the data produced by those applications.

Roles and Responsibilities:

Involved in analyzing the system and business requirements.

Involved in gathering the requirements, designing, development and testing.

Worked with Sqoop import commands to ingest data from SQL Server into Hadoop.

Created Hive external tables in the publish layer and loaded data into them.

Worked with Hive optimization techniques such as partitioning and bucketing to improve query performance.
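
A hedged sketch of this idea using Spark's DataFrameWriter bucketing as a stand-in for the HiveQL DDL the bullet refers to; the staging and publish table names, columns, and bucket count are assumptions:

```scala
import org.apache.spark.sql.SparkSession

object PartitionBucketSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partition-bucket-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical staging table produced by ingestion.
    val txns = spark.table("staging.card_txns_raw")

    txns.write
      .mode("overwrite")
      .partitionBy("txn_date")        // partition pruning on the date column
      .bucketBy(32, "account_id")     // spreads rows evenly for joins/sampling
      .sortBy("account_id")
      .saveAsTable("publish.card_txns")

    spark.stop()
  }
}
```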

Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.

Analyzed the SQL scripts and designed the solution to be implemented in Scala.

Used Spark SQL to load JSON data, create schema RDDs, load them into Hive tables, and handle structured data.
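
A small sketch of that flow, assuming line-delimited JSON at a placeholder path and a hypothetical publish.events Hive table (the column names are also illustrative):

```scala
import org.apache.spark.sql.SparkSession

object JsonToHiveSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("json-to-hive-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Schema is inferred from the JSON records (one JSON object per line).
    val events = spark.read.json("/data/raw/events/*.json")
    events.printSchema()

    // Register a temporary view so the structured data can be queried with SQL.
    events.createOrReplaceTempView("events_raw")
    val valid = spark.sql(
      "SELECT event_id, user_id, event_ts FROM events_raw WHERE event_id IS NOT NULL")

    valid.write.mode("append").saveAsTable("publish.events")
    spark.stop()
  }
}
```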

Implemented Spark scripts using Scala and Spark SQL to access Hive tables from Spark for faster data processing.

Tested Apache Tez for building high-performance batch and interactive data processing applications on Pig and Hive jobs.

Environment: Hadoop (HDFS, MapReduce), Scala, YARN, Spark, Hive, Pig, Azure ADLS, ADF, PySpark, Hue, Sqoop, Oracle, PostgreSQL, NiFi, Git, Gerrit, Jenkins, Control M, DSeries, Jira

PROJECT #:

Project Name : Customer Data Insights

Client : Synchrony Financial, Chicago

Environment : Apache Hadoop, Hortonworks

Tools : Hive, Sqoop, Spark, Scala, Azkaban

Duration : April 2019 to May 2022

Role : Bigdata Engineer

Description:

The Customer Data Insights project deals with customer data related to complaints about credit card transactions. It carries information about customer bank accounts, credit card details, credits, and debits. This is a migration project from a data warehouse to a data lake: existing Ab Initio graphs are reverse engineered and the equivalent code is developed in Spark for data ingestion into the data lake.

Roles and Responsibilities:

Involved in the implementation of a generic framework for data onboarding and processing.

Responsible for onboarding data into HDFS from different source systems, including file systems and RDBMS sources, using the framework.

Responsible for maintaining and organizing data in the different layers of the centralized data lake.

Built Spark code from existing Ab Initio graphs.

Met business requirements and performed unit testing of the developed code.

Performed code integration using Git Bash and Bitbucket.

Ran builds through Jenkins.

Created various Hive external and staging tables and joined tables as per requirements. Implemented static partitioning, dynamic partitioning, and bucketing.

Worked with various HDFS file formats such as Parquet, ORC, and JSON for serialization and deserialization.
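
A brief sketch of round-tripping data through these formats from Spark; the input and output paths are placeholders:

```scala
import org.apache.spark.sql.SparkSession

object FileFormatSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("file-format-sketch").getOrCreate()

    val df = spark.read.json("/data/raw/accounts.json")     // schema inferred from JSON

    // Columnar formats keep the schema with the data and compress well.
    df.write.mode("overwrite").parquet("/data/published/accounts_parquet")
    df.write.mode("overwrite").orc("/data/published/accounts_orc")

    // Reading back deserializes into the same DataFrame structure.
    val fromParquet = spark.read.parquet("/data/published/accounts_parquet")
    val fromOrc     = spark.read.orc("/data/published/accounts_orc")
    println(s"parquet rows=${fromParquet.count()}, orc rows=${fromOrc.count()}")

    spark.stop()
  }
}
```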

Worked with Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, Azure, PySpark, pair RDDs, and Spark on YARN.
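
A small pair-RDD tuning sketch (written in Scala rather than PySpark to keep all the examples in one language), contrasting reduceByKey with groupByKey on made-up data:

```scala
import org.apache.spark.sql.SparkSession

object PairRddTuningSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("pair-rdd-tuning").getOrCreate()
    val sc = spark.sparkContext

    val pairs = sc.parallelize(1 to 1000000).map(i => (i % 100, 1L))

    // Preferred: partial sums are computed map-side before the shuffle.
    val viaReduce = pairs.reduceByKey(_ + _)

    // Works, but ships every value across the network before summing.
    val viaGroup = pairs.groupByKey().mapValues(_.sum)

    println(s"reduceByKey keys=${viaReduce.count()}, groupByKey keys=${viaGroup.count()}")
    spark.stop()
  }
}
```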

Used PySpark for interactive queries, processing of streaming data, and integration with a popular NoSQL database for large volumes of data.

Environment: Hadoop (HDFS, MapReduce), Scala, YARN, Spark, Hive, Pig, Azure ADLS, ADF, PySpark, Azure Databricks, MongoDB, Control M, HBase, Hue, Sqoop, Oracle, PostgreSQL, NiFi, Git, Gerrit, Jenkins, Tonomy, Jira

PROJECT #:

Project Name : Barclays – ABSA Power Curve

Client : ABSA, Johannesburg

Environment : Apache Hadoop, Hortonworks

Tools : Hive, Sqoop, PySpark, Scala, Azkaban

Duration : Sep 2018 to April 2019

Role : Data Engineer

Description:

Data from different external vendors flows into Power Curve, where it is stored in SQL Server (2014). We are migrating that data to a Hadoop environment, organized into layers such as Raw, Published, and Insight, and building a data lake.

Roles and Responsibilities:

Involved in analyzing the system and business requirements.

Involved in gathering the requirements, designing, development and testing.

Worked with Sqoop import commands to ingest data from SQL Server into Hadoop.

Created Hive external tables in the publish layer and loaded data into them.

Worked with Hive optimization techniques such as partitioning and bucketing to improve query performance.

Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.

Analyzed the SQL scripts and designed the solution to be implemented in Scala.

Used Spark SQL to load JSON data, create schema RDDs, load them into Hive tables, and handle structured data.

Implemented Spark scripts using Scala and Spark SQL to access Hive tables from Spark for faster data processing.

Tested Apache Tez for building high-performance batch and interactive data processing applications on Pig and Hive jobs.

Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, PostgreSQL, Scala, DataFrames, Impala, OpenShift, Talend, and pair RDDs.

Environment: Hadoop (HDFS, MapReduce), Scala, YARN, Spark, Hive, Pig, MongoDB, Control M, HBase, Hue, Sqoop, PostgreSQL, NiFi, Git, Gerrit, Jenkins, Tonomy, Jira

PROJECT #:

Project Name : Marketing Transformation Program (MTP Ref-60)

Client : KOHL’s, California

Environment : Apache Hadoop, Google Cloud Platform

Tools : Apache Pig, Hive, Sqoop, Spark, Scala, Azkaban

Duration : Dec 2016 to Aug 2018

Role : Hadoop Developer

Description:

MTP Ref-60 is a migration project from different platforms to Google Cloud Platform. The main aim of this project is to move all of the traditional retail outlet data, which is semi-structured, into the cloud platform after performing cleansing and transformations in the Hadoop environment. The data is then pushed into Google Cloud Platform, where BigQuery and Bigtable operations are performed for analytical purposes.

Roles and Responsibilities:

Worked with Hive optimization techniques such as partitioning and bucketing to improve query performance.

Wrote UDFs in Hive based on business requirements.
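
The UDFs here were written as Hive UDFs; the following is only an analogous sketch of a user-defined function registered through Spark SQL, with a made-up card-masking rule and a hypothetical publish.card_txns table:

```scala
import org.apache.spark.sql.SparkSession

object UdfSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("udf-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical rule: mask all but the last four digits of a card number.
    spark.udf.register("mask_card", (card: String) =>
      if (card == null || card.length < 4) card
      else "*" * (card.length - 4) + card.takeRight(4))

    // Once registered, the function is callable from SQL over Hive tables.
    spark.sql("SELECT mask_card(card_number) AS card_masked FROM publish.card_txns")
      .show(5)

    spark.stop()
  }
}
```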

Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.

Analyzed the SQL scripts and designed the solution to be implemented in Scala.

Used Spark SQL to load JSON data, create schema RDDs, load them into Hive tables, and handle structured data.

Implemented Spark scripts using Scala and Spark SQL to access Hive tables from Spark for faster data processing.

Tested Apache Tez for building high-performance batch and interactive data processing applications on Pig and Hive jobs.

Worked with the HCatalog Loader and Storer to bring Hive table data into Pig for processing and store the results back in Hive.

Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
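
A sketch of that comparison idea as a query issued through Spark SQL; the fresh.weekly_sales, edw.category_reference, and edw.historical_metrics tables and their columns are illustrative assumptions, not Kohl's schemas:

```scala
import org.apache.spark.sql.SparkSession

object TrendComparisonSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("trend-comparison-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Join fresh sales against EDW reference and historical tables,
    // then rank categories by lift over their historical average.
    spark.sql(
      """SELECT f.category,
        |       SUM(f.sales_amount)                           AS current_sales,
        |       AVG(h.weekly_avg_sales)                       AS historical_avg,
        |       SUM(f.sales_amount) / AVG(h.weekly_avg_sales) AS lift
        |FROM   fresh.weekly_sales f
        |JOIN   edw.category_reference r ON f.category_id = r.category_id
        |JOIN   edw.historical_metrics h ON r.category_id = h.category_id
        |GROUP  BY f.category
        |ORDER  BY lift DESC""".stripMargin).show(20)

    spark.stop()
  }
}
```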

Worked with different RDD transformations and actions to transform data.

Involved in Spark SQL integration with Hive to work with Hive tables and process data efficiently in Spark.

Completely involved in the requirement analysis phase.

Environment: Hadoop (HDFS, MapReduce), Scala, YARN, Spark, Hive, Pig, MongoDB, Control M, HBase, Hue, Sqoop, Oracle, NiFi, Git, Gerrit, Jenkins, Tonomy, Jira

PROJECT #:

Project Name : Apple

Client : Apple – California

Environment : Hadoop, Apache Pig, Hive, Sqoop, Spark, Scala, Unix, Mysql

Duration : Oct 2015 to Aug 2016

Role : Hadoop Developer

Description:

Apple account has many Hadoop projects like iTunes Store, App Store and iBook store. Our team is responsible for much of iTunes’ big data infrastructure, as well as designing and delivering key systems powering iTunes Radio, iTunes Charts and many other personalized and cloud features of the iTunes ecosystem. Our work covers the full stack from iTunes’ internet-facing services (public HTTP services), internal services used by customer features (internal RPC APIs); design and implementation of data pipelines/lifecycles; Hadoop infrastructure, strategy and implementation; distributed key-value storage and putting all this together to operate live customer-facing features.

Roles and Responsibilities:

Involved in analyzing the system and business requirements.

Involved in gathering the requirements, designing, development and testing.

Wrote Pig scripts based on the requirements.

Developed UDFs (User Defined Functions) in Hive using Java and created temporary functions to execute them.

Analyzed data using HiveQL and custom MapReduce programs in Java.

Worked with the Hive data warehouse tool, creating tables and distributing data by implementing partitioning and bucketing.

Data ingestion from MySQL to HDFS through Sqoop.

Worked with Spark RDDs, creating different transformations and actions in the Scala shell.

Environment: Hadoop (HDFS, MapReduce), Scala, YARN, Spark, Hive, Pig, MongoDB, Control M, HBase, Hue, Sqoop, Oracle, NiFi, Git, Gerrit, Jenkins, Tonomy, Jira

PROJECT #:

Project Name : Image Plus

Client : CIGNA, Bloomfield

Environment : Core Java, Servlets, DB2

Duration : Dec 2013 to Nov 2014

Role : Java Developer

Description: Image Plus is a desktop application used by the claim services operations of Cigna Healthcare for logical work distribution and inventory management (similar to a workflow engine). iTrack uses an RBAC (Role Based Access Control) mechanism to authenticate users and restrict the functions they can perform. It comprises batch and online applications.

Roles and Responsibilities:

Code review and preparation of unit test documents.

Fixed bugs and prepared and reviewed test cases.

Environment: Core Java, Mainframe, DB2, JCL, ESP

PROJECT #:

Project Name : Proclaim

Client : CIGNA, Chattanooga

Environment : COBOL, JCL, DB2, VSAM

Duration : Sep 2011 to Nov 2013

Role : Mainframe Developer

Description: The project involved a sophisticated online health claim processing system developed by CIGNA Health Care. Numerous changes and rapidly increasing costs within the health care industry demanded a claims processing system that could meet traditional requirements for accuracy, control, speed, and data collection and retention, while accommodating changing benefit designs and cost containment provisions. Proclaim was divided into five major domains: Acquisition, Pre-processing, Adjudication, Post Processing, and Inquiry.

Roles & Responsibilities:

As a Team Member, responsible for

Understanding the business requirements sent by the Client

Impact analysis and preparation of estimations

Design and UTC preparation and getting approvals

Coding as per coding standards

Build test data and execute test cases during unit testing

Execution of test cycles in different regions and resolution of abends.

Environment: Mainframe, COBOL, DB2, JCL, ESP, CICS

PROJECT #:

Project Name : DENTACOM

Client : CIGNA, Chattanooga

Environment : COBOL, JCL, DB2, VSAM

Duration : SEP 2010 to AUG 2011

Role : Mainframe Developer

Description: Dentacom is a dental claim paying system written about 20 years ago. It has an online portion that performs claim payments, data entry, inquiry, and update functions. In addition, a batch cycle runs nightly whenever the online system has been up during the day. Currently, the online system is up Monday through Saturday and every holiday except Christmas.

Roles & Responsibilities:

As a Team Member, responsible for

Monitor the cycle while the batch is running.

Fixed production abends.

Involved in application maintenance tasks, including table updates.

Involved in running test cycles in different regions and generating reports.

Environment: Core Java, Mainframe, DB2, JCL, ESP


