
Sreenivas Kamireddy

Charlotte, NC, *****

E-mail: adett3@r.postjobfree.com Mobile: 813-***-****

SUMMARY:

13+ years of IT experience in Big Data, Data Science, and database technologies.

Around 5 years of hands-on experience in Hadoop administration.

Strong technical expertise in Hadoop Administration (Big Data Administration) with Hadoop 3.1 multi-node cluster setup, NameNode (HDFS) high availability, HDFS Federation, MRv2 and the YARN framework, the Apache Oozie workflow scheduler, Hadoop cluster implementation, Cloudera Manager, and Hortonworks Ambari.

Deployed big data solutions using IBM Spectrum Conductor, IBM DSX (Data Science Experience), PowerAI, RStudio, Watson ML, Snap ML, and XGBoost.

Configured and set up Spark, Kafka, Jupyter Notebook, Superset, Zeppelin, Python, shell scripts, NiFi, and RStudio on HDP/CDH.

Configured Active Directory, MIT Kerberos, and SSL on the cluster to provide authentication and authorization for multiple users.

Enabled high-availability architecture for name services (HDFS metadata), Hive services, and Resource Managers so that business runs as usual without cluster downtime.

Upgraded HDP with minimal downtime and worked with Hortonworks (HWX) to resolve bugs/fixes for HDP 3.1.0 / HDP 3.1.5.

Upgraded Ambari with minimal downtime and worked with HWX to resolve bugs/fixes for Ambari 2.7.3 / 2.7.5.

Applied Vormetric data-level encryption to protect against unauthorized access by users and processes.

Configured Ranger/Sentry as the granular, role-based authorization module for Hadoop data.

Performed Hadoop cluster tasks such as commissioning and decommissioning nodes without any effect on running jobs and data.

Work diligently with the infrastructure, network, database, application, and business intelligence teams to guarantee high data quality and availability.

Contribute to planning and implementation of new/upgraded hardware and software releases.

Research and recommend innovative and, where possible, automated approaches for cluster performance tuning and administration tasks; identify ways to improve resource utilization, provide economies of scale, and simplify support issues.

Provided weekend OS patch support and maintained Hadoop cluster high availability.

Experience in all phases of the Software Development Life Cycle (SDLC).

Strong experience in upgrade/migration/conversion projects.

Roles played: Hadoop Administrator (Big Data Admin), DB2 DBA, IMS/DB DBA, Technical Lead and Application Developer.

Willing and eager to learn new technologies based on business requirements (rapid adoption of new technology).

Worked in both highly structured and less structured IT development/maintenance environments.

Able to prepare technical designs, project plans, unit test cases, test result documentation, and status reports.

Technical Skills:

Hadoop / Big Data

HDP 3.1.5, Hortonworks Ambari, CDH 5.2, Cloudera Manager, HDFS, MS Azure, MapReduce, YARN, Hive, Pig, HBase, Sqoop, Flume, Oozie, MIT Kerberos, Spark, Scala, ZooKeeper, Ansible, Splunk, YUM, Solr, Kafka, Apache Ranger, Bigtop, BigInsights, Nagios, Ganglia, Graphite, Storm, Unix, Shell Scripting, MongoDB, CI/CD, Git, Jenkins, Kubernetes, Trifacta.

IBM Spectrum Conductor, IBM Watson ML, PowerAI, SnapML, RStudio, Jupyter Notebook, Zeppelin, Python, Autosys, IBM DSX (Data Science Experience), Shell Script.

Cloud

Azure and AWS

Data Streaming

Kafka, Confluent Kafka, HL7

Programming languages

Spark, Scala, RStudio, Python, PL/1, COBOL, JCL, SQL, REXX, ASM, PLS, CLIST, Shell Scripting

ETL

Informatica, DataStage, Cognos, Tableau, Data Lake

Databases

Oracle, PostgreSQL, DB2, IMS/DB

NoSQL

HBase, Redis, MongoDB

Operating Systems

CentOS, Red Hat Linux, UNIX, Windows, Ubuntu, MVS

Data Science

IBM DSX, IBM Spectrum, Power Artificial Intelligence

Machine Learning / AI

XGBoost, GPU, H2O, Sparkling Water, Snap ML, R, Python.

Work Experience:

Client: WELLS FARGO, Charlotte, NC. June '18 to till date

Role: Hadoop Administrator

Scope: Corporate Model Risk (CMoR): Corporate risk helps all Wells Fargo businesses identify and manage risk. We focus on three key risk areas: credit risk, operational risk and market risk. We help our management and Board of Directors identify and monitor risks that may affect multiple lines of business and take appropriate action when business activities exceed the risk tolerance of the company.

Designed the Hadoop cluster environment from scratch to support a multi-tenant architecture.

Built the Hortonworks distribution Hadoop cluster environment based on the architecture design (number of master nodes, data nodes, client nodes, and network settings).

Configured Active Directory/LDAP, MIT Kerberos, and SSL on the cluster to provide authentication and authorization for multiple users.
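
A minimal sketch of how a service account authenticates on such a Kerberized cluster; the principal, keytab path, and realm below are hypothetical placeholders:

    # obtain a Kerberos ticket from the keytab, verify it, then access HDFS
    kinit -kt /etc/security/keytabs/svc_hadoop.keytab svc_hadoop@CORP.EXAMPLE.COM
    klist
    hdfs dfs -ls /user/svc_hadoop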

Enabled high-availability architecture for name services (HDFS metadata), Hive services, and Resource Managers so that business runs as usual without cluster downtime.

Enabled multiple third-party plugins (notebooks, IDEs, etc.) for Data Science Experience (IBM DSX) and machine learning team members; monitored cluster alerts, node and data disk failures, and performed server commissioning and decommissioning.

Installed and administered RStudio, Anaconda Python, XGBoost, and IBM Spectrum for machine learning and Power Artificial Intelligence (AI).

Analyzed the Hadoop cluster and various big data analytics tools, including Hive, HDFS, HBase, Spark, and Scala.

Ingested data into the Hadoop environment from different sources.

Imported and exported data between HDFS/Hive and relational database systems (RDBMS) using Sqoop.
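
As an illustrative sketch of a typical Sqoop import/export pair; the JDBC URL, user, schema, table names, and paths are placeholders, not values from this resume:

    # RDBMS -> Hive: pull a source table into a Hive staging table
    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username etl_user -P \
      --table CUSTOMERS \
      --hive-import --hive-table staging.customers \
      --num-mappers 4

    # Hive/HDFS -> RDBMS: push an aggregated result back out
    sqoop export \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username etl_user -P \
      --table CUSTOMER_SUMMARY \
      --export-dir /apps/hive/warehouse/reporting.db/customer_summary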

Followed Agile/Scrum methodology, with daily standups to discuss story status and roadblocks; used VersionOne as the agile project management tool for managing backlogs, stories, goals, and incidents.

Deployed a cloud (Azure) cluster and protected Azure Active Directory against compromise.

Backed up cloud VM AD accounts against compromise.

Experienced in installing and administering Apache NiFi.

Designed an enterprise ingestion framework using NiFi.

Set up cron jobs via shell scripts to clear logs.
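
A minimal example of such a cron entry; the log directories and the 7-day retention are assumptions for illustration:

    # crontab entry: daily at 01:30, delete rolled-over logs older than 7 days
    30 1 * * * find /var/log/hadoop /var/log/kafka -name '*.log.*' -mtime +7 -delete >> /var/log/log_cleanup.out 2>&1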

Set up Kafka and ran consumers through PySpark jobs on edge nodes.
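
As an illustrative sketch, such a consumer would typically be launched from an edge node with spark-submit; the script name, broker list, topic, and connector version here are hypothetical:

    # run a PySpark Kafka consumer on YARN from the edge node
    spark-submit \
      --master yarn --deploy-mode client \
      --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.0 \
      kafka_consumer.py --brokers broker1:9092,broker2:9092 --topic events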

Set up Sentry on the Cloudera (CDH) environment.

Environment: Red Hat Linux, Hortonworks distribution (HDFS, YARN, Cloudera Manager, Ambari, Hive/HS2, Sqoop, Spark, NiFi, HBase, RStudio, Ranger, etc.), Kafka, Cloud, Git, Jenkins, Kubernetes, Tableau, Trifacta, PostgreSQL, MS Azure, Oracle, IBM Spectrum Conductor, RStudio, Jupyter Notebook, Zeppelin, Python, Autosys, IBM DSX (Data Science Experience), Shell Script, IBM Watson ML, PowerAI, and SnapML

Client: AT&T, Atlanta, GA. July '16 to May '18

Role: Hadoop Administrator

Scope: Global Fault Platform-SMLS (GFP-SMLS). The parent application, GFP, provides fault management capabilities for all AT&T networks and services. It is connected to the AT&T OAM networks and provides a platform for alarm management and correlation for network equipment. This includes alarm collection from the network via SNMP traps, syslogs, and vendor EMSes, and alarm processing functions, including de-duplication, filtering, aging, detection of flapping conditions, and storm controls. Topology is used for alarm correlation and impact analysis.

Experienced with the Hortonworks platform and its ecosystem; installed, configured, and used ecosystem components such as MapReduce, HDFS, Hive, Sqoop, and Flume.

Upgraded HDP/Ambari with minimal downtime and worked with HWX to resolve bugs/fixes for HDP 2.5.3 / HDP 2.6.1.
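
A minimal sketch of the Ambari portion of such an upgrade on a Red Hat/CentOS node; the repository URL and version are placeholders, not taken from this resume:

    # stop the Ambari server and agents before upgrading packages
    ambari-server stop
    ambari-agent stop
    # point yum at the new Ambari repository (placeholder URL), then upgrade
    wget -O /etc/yum.repos.d/ambari.repo https://repo.example.com/ambari/2.6.1/ambari.repo
    yum clean all && yum upgrade -y ambari-server ambari-agent
    # migrate the Ambari database schema to the new version, then restart
    ambari-server upgrade
    ambari-server start
    ambari-agent start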

Involved in Hadoop cluster tasks such as adding and removing nodes without any effect on running jobs and data.
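
As a sketch of the decommissioning step, assuming dfs.hosts.exclude and the YARN exclude path already point at the cluster's exclude files:

    # after adding the host to the HDFS and YARN exclude files:
    hdfs dfsadmin -refreshNodes     # HDFS starts re-replicating blocks off the node
    yarn rmadmin -refreshNodes      # YARN stops scheduling containers on the node
    # watch until the node reports "Decommissioned" before removing it
    hdfs dfsadmin -report | grep -B2 -A5 Decommission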

Installed and configured Hadoop ecosystem components such as HBase and Flume.

Loaded log data into HDFS using Flume; worked extensively on creating MapReduce jobs to power data for search and aggregation.

Kerberos/LDAP, Falcon, Ranger, Knox, Ambari: deep knowledge with practical experience.

Experience with Apache Knox Gateway security for Hadoop clusters.

Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing with Hive.
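
For example, submitting and checking such a workflow from the command line might look like the following; the Oozie URL and job.properties contents are placeholders:

    # submit the workflow described by job.properties (workflow.xml lives in HDFS)
    oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run
    # check the status of a submitted workflow by its job id
    oozie job -oozie http://oozie-host:11000/oozie -info <job-id>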

Involved in installing Hadoop Ecosystem components.

Experienced in securing Hadoop clusters with Kerberos and integrating with LDAP/AD at the enterprise level.

Responsible for managing data coming from different sources.

Involved in HDFS maintenance and loading of structured and unstructured data.

Built massively scalable, multi-threaded applications for bulk data processing, primarily with Apache Spark and Pig on Hadoop.

Worked along with the Hadoop Operations team in Hadoop cluster planning, installation, maintenance, monitoring and upgrades.

Took customer data from the data lake and created customer interaction tables in Hive.

Accessed data from the data lake and analyzed it to improve LTE.

Environment: HDP, Ambari, HBase, Hive, Hortonworks, Pig, Kerberos, Kafka, Storm, Sqoop, Knox, Unix, Shell Script, Linux, Apache Ranger, YARN, Apache Oozie workflow, Flume, ZooKeeper

Client: Lenovo, Morrisville, NC. Sep '15 to June '16

Role: Hadoop Administrator

Scope: Lenovo supplies replacement parts to Customer Engineers and Customers all over the world. Customer Engineers are the group of people who fix computers within the warranty period.

Responsibilities:

Deployed multi-node development, testing, and production Hadoop clusters with different Hadoop components (Impala, Pig, Sqoop, Oozie, Flume, HCatalog, ZooKeeper) using CDH (Cloudera Manager).

Configured Capacity Scheduler on the Resource Manager to provide a way to share large cluster resources.

Deployed NameNode high availability for the major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
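
A brief sketch of the automatic-failover initialization and verification steps; the NameNode service IDs nn1/nn2 are assumed names:

    # one-time: create the failover znode in ZooKeeper (run on one NameNode host)
    hdfs zkfc -formatZK
    # verify which NameNode is active and which is standby
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2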

Configured Oozie for workflow automation and coordination.

Good experience troubleshooting production-level issues in the cluster and its functionality.

Backed up data on a regular basis to a remote cluster using DistCp.
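
For example, such a backup copy can be driven by a single DistCp invocation; the remote NameNode hostname and paths below are placeholders:

    # incremental copy to the remote cluster; -update copies only changed files, -p preserves attributes
    hadoop distcp -update -p /data/warehouse hdfs://dr-nn.example.com:8020/backups/warehouse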

Regular Ad-Hoc execution of Impala and Pig queries depending upon the use cases.

Regular Commissioning and Decommissioning of nodes depending upon the amount of data.

Experience in Disaster Recovery and High Availability of Hadoop clusters/components.

Diligently teaming with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.

Diagnosed and resolved performance issues and job scheduling issues.

Configured the Fair Scheduler to share the resources of the cluster.

Experience designing data queries against data in the HDFS environment using tools such as Apache Hive.

Imported data from MySQL server to HDFS using Sqoop.

Manage the day-to-day operations of the cluster for backup and support.

Used the RegEx, JSON, and Avro SerDes packaged with Hive for serialization and deserialization to parse the contents of streamed log data.
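
An illustrative sketch of an external table over streamed logs using the RegexSerDe shipped with Hive, run through the Hive CLI from the shell; the table name, columns, regex, and location are assumptions:

    # define an external Hive table whose rows are parsed by a regular expression
    hive -e "
    CREATE EXTERNAL TABLE IF NOT EXISTS raw_logs (host STRING, ts STRING, msg STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
    WITH SERDEPROPERTIES ('input.regex' = '([^ ]*) ([^ ]*) (.*)')
    LOCATION '/data/streamed/logs';"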

Wrote shell scripts to automate administrative tasks.

Implemented custom Hive UDFs to integrate weather and geographical data with business data to achieve comprehensive data analysis.

Used Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java map-reduce, Hive and Sqoop as well as system specific jobs.

Worked along with the Hadoop Operations team in Hadoop cluster planning, installation, maintenance, monitoring and upgrades.

Environment: CDH, Cloudera Manager, HDFS, HBase, Impala, Pig, Kerberos, Kafka, Bigtop, Storm, Sqoop, Knox, Unix, Shell Script, Linux, Apache Ranger, MongoDB, Splunk, YARN, Apache Oozie workflow, Flume, ZooKeeper, RegEx, JSON.

Client: Alacara, NYC, NY. March '15 to Sep '15

Role: Hadoop Administrator

RDc (Reference Data Customer).

Scope: The RD Customer project is an umbrella for a series of projects, mainly consisting of the RD Customer application, which is currently in steady state, and various micro projects, which are development projects revolving around the Customer Database. The major RDc applications are AWF (Address Work File), SD (Strategic Direct), CT (Coverage Transformation), Transition, and the RDc web service.

Responsibilities:

Set up a multi-node cluster; planned and deployed a Hadoop cluster using Cloudera Manager.

Started, stopped, and restarted clusters and performed all administrative functions through Cloudera Manager.

Participated in developing purge/archive criteria and procedures for historical data.

Performance tuning of Hadoop clusters and Hadoop MapReduce routines.

Screened Hadoop cluster job performance and performed capacity planning.

Monitored Hadoop cluster connectivity and security; managed and reviewed Hadoop log files.

Diligently teaming with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.

Created reports for the BI team using Sqoop to export data into HDFS and Hive.

Used Sqoop to bring in raw data, populate staging tables, and store the refined data in partitioned tables.

Created Hive queries that helped market analysts spot emerging trends by comparing fresh data.

Developed Hive queries to process the data for analysis by imposing read only structure on the stream data.

Performed minor and major upgrades, commissioning and decommissioning of data nodes on Hadoop cluster.

Created Hive tables as per requirements, as internal or external tables defined with proper static and dynamic partitions for efficiency.

Used Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java map-reduce, Hive and Sqoop as well as system specific jobs.

Worked along with the Hadoop Operations team in Hadoop cluster planning, installation, maintenance, monitoring and upgrades.

Added, removed, and updated user account information, reset passwords, etc.

Installed and updated packages using HL7.

Environment: Cloudera Manager, HBase, HAWQ, Hive, Pig, Kerberos, Kafka, Sqoop, Nagios, Ganglia, Graphite, Apache Ranger, Splunk, YARN, Apache Oozie workflow scheduler, Flume, ZooKeeper, RegEx, JSON, Spark

DB2 Administrator

LMS (Lease Management System).

Ricoh, Australia. Sep '11 to Feb '15.

Scope: The LMS system handles the administration of leases written by IBM Australia Credit (IBMAC). This includes Ricoh lease entry and maintenance, cash entry and maintenance, lease payouts and partial payouts, and lease terminations, as well as accounting income calculations and tax income calculations.

Responsibilities:

Performed performance and maintenance utilities associated with each structure (REORG, LOAD, UNLOAD).

Performed DB extent additions in production and test databases.

Performed U1 refresh / created a new test environment on the first Sunday of every month.

Performed storage pool balancing on required weekends.

Worked on stored procedures, functions, and triggers.

Developed and executed appropriate Data Definition Language (DDL) to support the project.

Participated in developing purge/archive criteria and procedures for historical data.

Created JCL to take a BACKUP for the daily balancing dataset.

Worked on the REORG utility to reorganize databases, DBRMs, packages, and plans.

Created JCL for IMAGE COPY, UNLOAD, database recovery, and BACKUP.

Environment: DB2, VM, QMF, SPUFI, IMS/DB, REXX, RTC, RPM, MQ Series, and TOD.

DB2 Administrator

TXA (Transaction Accounting).

EY Global Tax, Canada. Dec '09 to Aug '11

Scope: TA extracts feeder information from the EY Settle Pool and creates accounting entries. General ledger account codes are assigned to the TA transactions in the GLIM TA account coding process. The accounting entries created by TA are fed into GLIM, which creates the interface files for CLS and CARS. GLIM formats the financial data properly and sends it to the IBM general ledger books.

Responsibilities:

Performed performance and maintenance utilities associated with each structure (REORG, LOAD, UNLOAD).

Worked on stored procedures, functions, and triggers.

Developed and executed appropriate Data Definition Language (DDL) to support the project.

Created JCL to take a BACKUP for the daily balancing dataset.

Made changes in the base code to incorporate additional functionality.

Reviewed code changes and tested updates.

Analyzed defects encountered in the project on a monthly basis and shared findings with the team.

As a team member, was involved in analyzing business requirements, preparing the technical specification document, and coding.

Took up tasks independently from onsite, analyzed them, clarified queries with the onsite team, and delivered the completed tasks.

Environment: PL/1, CICS, JCL, COBOL, DB2, IMS/DB, REXX, ASM, PLS, RMDS, SCLM, FM, RTC, RPM, MQ Series.

DB2/COBOL Developer

DSM (Distributed Security Manager)

IBM Global Services, Southbury, Connecticut. March '06 to Dec '09

Scope: DSM provides security administration in a distributed heterogeneous environment. The DSM/MVS strategy is to introduce a management layer: administrators give their instructions to DSM, and DSM, drawing on its knowledge base of information, passes the appropriate information to the products and facilities, translating the instructions into the forms required by each registration product. Communication between the server and the target systems is handled by the communication layer, the Request Distribution Manager, which ensures the delivery of messages and reactivates communication after a problem.

Responsibilities:

Gathered requirements via discussions/meetings with users and IT management.

Provided 24x7 production support, resolved night-time production job abends, and attended conference calls with business operations and system managers for any issues in the batch stream.

Developed solutions for the business under the direction of IT management.

Served on the on-call rotation, handling overnight problem calls.

Developed database changes, when applicable, under the direction of I/T management.

Analyzed and solved problems on a day-to-day basis with appropriate 360-degree user communication.

Created regression test plans and conducted unit testing.

Worked with client I/T management and staff to coordinate system and end-user testing.

Participated in client/team meetings; sent regular status updates and task assignments to team members.

Implemented application changes as required, in accordance with SDLC policy.

Assessed user inquiries and worked with I/T management to determine priorities.

Created weekly status reports and participated in team meetings.

Provided classroom training on the technology and application to the team and new joiners.

Environment: PL/1, CICS, JCL, COBOL, DB2, REXX, ASM, PLS, DEM, OSP, CADP, IBM FA, FM.

Educational Qualifications:

M.C.A from University of Madras, India.

B.Sc (Computer Science) from S.V University, India.

Certifications:

1. Scrum Fundamentals Certified

2. Certified Microsoft Azure Administrator

https://www.linkedin.com/in/sreenivas-kamireddy-a0352b119/


