Suraj Reddy Janampally
Big Data Developer
West Bloomfield, MI 48322
adgwel@r.postjobfree.com
SUMMARY
●4+ years of experience in the IT industry with a strong emphasis on Hadoop ecosystem tools such as MapReduce, HDFS, YARN, Sqoop, Hive, HBase, ZooKeeper, Oozie, Spark, Impala, AWS, and Kafka.
●Worked with Spark components such as Spark Core, Spark SQL, and Spark Streaming, using RDDs, DataFrames, and Datasets in Python and Scala to run jobs on Hadoop clusters (see the first sketch after this summary).
●Experience with Hadoop distributions, primarily Hortonworks Data Platform (HDP 2.6, 2.7, and 3.1).
●Created real-time data streaming solutions using Spark Core, Spark SQL with DataFrames, and Spark Streaming (see the Kafka sketch after this summary).
●Created data pipelines in Python to transfer data from one system to another.
●Provided production support for large-scale applications, resolving late-arriving facts, latency issues, memory issues, and similar problems.
●Experience with SequenceFile, Parquet, JSON, Avro, and HAR file formats and their compression options.
●Experience importing and exporting data with Sqoop between HDFS and relational database systems.
●Experience creating MapReduce code in Java per business requirements.
●Skilled in creating Oozie workflows to schedule daily, weekly, and monthly jobs.
●Experience working with columnar NoSQL databases such as HBase and Cassandra to manage extremely large datasets.
●Implemented and used AWS services such as Redshift, EMR, EC2, and Elasticsearch alongside the Hadoop stack.
●Transferred data from Hadoop to S3 buckets for cloud storage.
●Implemented CI/CD/TA pipelines using Jenkins to automate jobs on the cluster.
●Wrote Linux shell scripts to perform monitoring operations on Hadoop clusters.
●Experience writing custom UDFs and UDAFs to extend Hive and Pig functionality.
●Hands-on experience with message brokers such as Apache Kafka.
●Experience in all stages of the SDLC (Agile and Waterfall), including writing design documents, development, testing, and implementation of enterprise-level data warehouses.
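The following is a minimal PySpark sketch of the kind of DataFrame and Spark SQL work summarized above. The app name, Hive table (raw_db.orders), columns, and output path are hypothetical, shown only for illustration:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Hypothetical session; table, column, and path names are placeholders.
    spark = (SparkSession.builder
             .appName("batch-transform")
             .enableHiveSupport()
             .getOrCreate())

    # Read a Hive table into a DataFrame and apply simple transformations.
    orders = spark.table("raw_db.orders")
    cleaned = (orders
               .filter(F.col("order_ts").isNotNull())
               .withColumn("order_date", F.to_date("order_ts"))
               .dropDuplicates(["order_id"]))

    # Register the result for Spark SQL and write it back as Parquet.
    cleaned.createOrReplaceTempView("orders_clean")
    daily = spark.sql(
        "SELECT order_date, COUNT(*) AS cnt FROM orders_clean GROUP BY order_date")
    daily.write.mode("overwrite").parquet("/data/refined/orders_daily")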
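And a sketch of a real-time ingestion job reading from Kafka with Spark Structured Streaming, assuming Spark 2.x with the spark-sql-kafka connector on the classpath; the broker address, topic, and HDFS paths are placeholders, not values from any actual project:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("kafka-stream").getOrCreate()

    # Read a stream from Kafka; broker and topic names are hypothetical.
    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker1:9092")
              .option("subscribe", "vehicle-events")
              .load())

    # Kafka delivers key/value as binary; cast the value to a string.
    parsed = events.select(F.col("value").cast("string").alias("raw"))

    # Append the stream to HDFS with checkpointing for fault tolerance.
    query = (parsed.writeStream
             .format("parquet")
             .option("path", "/data/streams/vehicle_events")
             .option("checkpointLocation", "/checkpoints/vehicle_events")
             .outputMode("append")
             .start())
    query.awaitTermination()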
EXPERIENCE
Ford Motor Company, Dearborn - Big Data Developer
Sept 2016 - July 2020
●Monitored around 470 tasks, maintaining standards that reduced data loss and downtime.
●Worked closely with Oracle and SQL Server DBA teams to resolve difficult production incidents and uphold production standards.
●Launched jobs using Attunity, Hive, and Spark to load data into HDFS and provide transformed data to downstream applications.
●Performed QC testing on multiple sources to support successful launches ahead of iteration deadlines.
●Worked with REST APIs to interact with big data services.
●Applied Hive optimization techniques such as vectorization, bucketing, and partitioning (see the table DDL sketch after this section).
●Worked closely with the Data Operations team to land additional tables and remove unwanted ones, streamlining data flows.
●Wrote UDFs in Python to remove inconsistencies in data sourced from Amazon Redshift tables (see the UDF sketch after this section).
●Used Jenkins scripts to speed up routine development and QC testing activities.
●Was part of the team that successfully transitioned from Agile to Scrumban to deliver faster.
●Worked in an Agile environment that delivered output in two-week iterations.
●Developed DTSF and GDPR JARs to help meet privacy guidelines and related process improvements.
●Performed upgrades to the team's Java framework for business requirements, Hadoop version upgrades, and similar changes.
●Resolved multiple production incidents and monitored Hadoop jobs while coordinating with multiple teams and meeting daily deadlines.
●Performed transformations with Spark DataFrames to provide refined data for ETL processing.
●Triggered ETL processes in Autosys to send data from Hadoop to the AWS cloud.
●Launched many critical sources for the project to help meet deadlines.
●Built data pipelines in Python to transfer data from RDBMS sources.
●Completed a POC on a Jenkins-based CI/CD module to integrate development with testing and improve delivery times.
●Used data transfer tools such as Sqoop and Flume to move data.
●Installed, configured, and used AWS services such as EC2, S3, EMR, Redshift, and Kinesis as part of data engineering work.
●Worked with business analysts and developers to arrive at solutions for complex requirements.
●Created CI/CD/TA pipelines in Jenkins to improve end-to-end data flow.
●Worked with both external and managed Hive tables for optimized performance.
●Performed transformations on structured and unstructured data.
●Ran Sqoop operations in GDPR jobs to pull data from GDPR SQL Server tables.
●Used Oozie and Falcon to schedule production jobs and manage data flow.
●Used version control software such as AccuRev to promote and store code in the repository.
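A minimal sketch of the Python UDF cleanup mentioned above, written here as a PySpark UDF; the cleanup rule, paths, and the status column are hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.appName("udf-cleanup").getOrCreate()

    # Hypothetical cleanup rule: normalize inconsistent status codes.
    def normalize_status(value):
        if value is None:
            return "UNKNOWN"
        return value.strip().upper().replace("-", "_")

    normalize_udf = F.udf(normalize_status, StringType())

    # Paths and column names are placeholders for illustration.
    df = spark.read.parquet("/data/landing/redshift_export")
    (df.withColumn("status", normalize_udf("status"))
       .write.mode("overwrite")
       .parquet("/data/refined/redshift_export"))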
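And a sketch of the partitioned, bucketed Hive table layout referred to above, issued through Spark SQL with Hive support. The database, table, columns, and bucket count are illustrative; vectorized execution would be enabled on the Hive side (hive.vectorized.execution.enabled) rather than in this script:

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive-ddl")
             .enableHiveSupport()
             .getOrCreate())

    # Illustrative managed Hive table: partitioned by load date and
    # bucketed by vehicle_id, stored as ORC so Hive's vectorized
    # reader and partition pruning can both apply at query time.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS refined_db.telemetry (
            vehicle_id STRING,
            metric     STRING,
            reading    DOUBLE
        )
        PARTITIONED BY (load_date STRING)
        CLUSTERED BY (vehicle_id) INTO 32 BUCKETS
        STORED AS ORC
    """)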
EDUCATION
Clemson University, Clemson, SC - Master's in Mechanical Engineering
August 2014 - May 2016
Osmania University, Hyderabad, India - Bachelor's in Mechanical Engineering
Oct 2010 - June 2014