Big Data

Location:
Queens, NY
Salary:
70/hr (Negotiable)
Posted:
June 16, 2020

SHM RAKIBUDDIN

Cell: 347-***-****

adduxm@r.postjobfree.com

GC (Green Card) – No Visa Sponsorship Required – Open to Relocation

Professional Summary:

6+ years of professional experience in the IT sector, including extensive exposure to Hadoop technologies and Linux administration.

Experience implementing data warehousing/ETL solutions across financial, retail, and media verticals.

Hands-on experience setting up and configuring Hadoop ecosystem components such as MapReduce, HDFS, HBase, Impala, Oozie, Hive, Sqoop, Pig, Spark, ZooKeeper, and Sentry.

Experience planning, implementing, testing, and documenting performance benchmarking of the Hadoop platform.

Contributed to the planning, development, and architecture of the Hadoop ecosystem.

Experience with both on-premises and cloud (AWS) setups.

Experience securing Hadoop clusters by installing a Kerberos KDC, enabling data-in-transit encryption with TLS, and enabling data-at-rest encryption with Cloudera Navigator Encrypt.
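
An illustrative sketch of the kind of Kerberos principal and keytab setup this involves (the host name, realm, and keytab path are placeholders, not values from this resume):

    # Create a service principal for the HDFS daemon on one node
    kadmin.local -q "addprinc -randkey hdfs/node01.example.com@EXAMPLE.COM"

    # Export its key to a keytab that the daemon uses to authenticate
    kadmin.local -q "xst -norandkey -k /etc/security/keytabs/hdfs.keytab hdfs/node01.example.com@EXAMPLE.COM"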

Experience designing, configuring, and managing backup and disaster recovery using data replication, HDFS snapshots, and Cloudera BDR utilities.
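
A minimal sketch of the snapshot and replication commands such a setup typically involves (the directory paths and NameNode addresses are placeholders):

    # Enable snapshots on a directory, then take a dated snapshot
    hdfs dfsadmin -allowSnapshot /data/warehouse
    hdfs dfs -createSnapshot /data/warehouse daily_$(date +%Y%m%d)

    # Replicate the directory to a disaster-recovery cluster with DistCp
    hadoop distcp hdfs://prod-nn.example.com:8020/data/warehouse hdfs://dr-nn.example.com:8020/backups/warehouse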

Implemented role-based authorization for HDFS, Hive, and Impala using Apache Sentry.
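
A minimal sketch of the Sentry grants this implies, run through Beeline against HiveServer2 (the JDBC URL, role, database, and group names are placeholders):

    beeline -u "jdbc:hive2://hs2.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM" \
      -e "CREATE ROLE analyst_role" \
      -e "GRANT SELECT ON DATABASE sales TO ROLE analyst_role" \
      -e "GRANT ROLE analyst_role TO GROUP analysts"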

Good knowledge of implementing and using cluster monitoring tools such as Cloudera Manager.

Experienced in implementing and supporting auditing tools like Cloudera Navigator.

Experience building real-time data pipelines using Apache Spark.

Experience implementing high availability for services such as the NameNode, Hue, and Impala.

Participated in application onboarding meetings with application owners and architects, helping them identify and review the technology stack, use cases, and resource requirement estimates.

Experience in documenting standard practices and compliance policies.

Participated in OS-level patching and version upgrades of CDH 5 and CDH 6.

Experience analyzing log files and job failures, identifying root causes, and taking or recommending corrective actions.

Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing job counters and application log files.

Experienced in performance tuning of YARN, Hive, Impala, and Spark.

Additional responsibilities included interacting with the offshore team daily, communicating requirements, delegating tasks to offshore and on-site team members, and reviewing their deliverables.

Good experience managing Linux servers.

Effective problem-solving skills and the ability to learn and use new technologies and tools quickly.

Experienced in data ingestion projects loading data into the data lake from multiple source systems using Talend Big Data and Open Studio.

Good knowledge of Bash shell scripting.

Experience working with JIRA, SUCCEED, and ServiceNow for change management and support processes.

Experience providing 24x7x365 on-call and weekend production support.

Experience:

Hadoop Administrator

OrthoNet LLC

June, 2019 – present

White Plains, NY

Designed and implemented the Hadoop cluster, including adding and removing nodes with Cloudera Manager, configuring NameNode high availability, and keeping track of all running Hadoop jobs.

Identified best-fit solutions and proofs of concept leveraging big data and advanced analytics to meet and exceed the customer's business, functional, and technical requirements.

Involved in upgrading the HDP components.

Installed and configured Hadoop Ecosystem components in Cloudera environments.

Set up NameNode HA using the Quorum Journal Manager (QJM) and ResourceManager HA for the Cloudera cluster.
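
An illustrative sketch of the commands used to verify and exercise such HA pairs (the service IDs namenode1, namenode2, and rm1 are placeholders for the configured nameservice members):

    # Check which NameNode and ResourceManager are currently active or standby
    hdfs haadmin -getServiceState namenode1
    hdfs haadmin -getServiceState namenode2
    yarn rmadmin -getServiceState rm1

    # Manually fail over the NameNode role from one member to the other
    hdfs haadmin -failover namenode1 namenode2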

Ran Spark on YARN, compared the performance results with MapReduce, and used HDFS to store the analyzed and processed data at scale.
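
A minimal sketch of submitting such a Spark job to YARN (the script path and resource sizes are placeholders):

    spark-submit --master yarn --deploy-mode cluster \
      --num-executors 10 --executor-memory 4G --executor-cores 2 \
      /path/to/etl_job.py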

Involved in the deployment of Spark Streaming for various data sources.

Responsible for cluster maintenance, commissioning and decommissioning DataNodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and Hadoop log files.

Supported teams in troubleshooting MapReduce jobs running on the cluster.

Day-to-day responsibilities included resolving developer issues, performing deployments, promoting code between environments, provisioning access for new users, providing quick fixes to limit impact, and documenting issues to prevent recurrence.

Involved in loading data from the Linux file system into HDFS.
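
A minimal sketch of such a load (the local and HDFS paths are placeholders):

    # Stage a local file into an HDFS landing directory
    hdfs dfs -mkdir -p /data/landing/sales
    hdfs dfs -put /var/exports/sales_2019.csv /data/landing/sales/
    hdfs dfs -ls -h /data/landing/sales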

Involved in creating Hive tables, loading data, and writing Hive queries based on the requirements.

Followed standard backup policies to enable the cluster's disaster recovery mechanism.

Stored unstructured data as key-value pairs in HDFS using HBase.
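
An illustrative HBase shell session of the kind this implies (the table, column family, and row key names are placeholders):

    hbase shell <<'EOF'
    create 'events', 'raw'
    put 'events', 'row-001', 'raw:payload', '{"type":"click","ts":1560000000}'
    get 'events', 'row-001'
    EOF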

Designed and configured the cluster with the required services (Sentry, HiveServer2, Kerberos, HDFS, Hue, Hive, ZooKeeper).

Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.

Performed change request and incident management processes following company standards.

Implemented partitioning, bucketing, indexing, and statistics analysis in Hive.
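
A minimal sketch of a partitioned, bucketed Hive table with statistics collection (the table and column names are placeholders):

    hive -e "
    CREATE TABLE claims_by_day (claim_id STRING, amount DOUBLE)
    PARTITIONED BY (claim_date STRING)
    CLUSTERED BY (claim_id) INTO 32 BUCKETS
    STORED AS ORC;
    ANALYZE TABLE claims_by_day PARTITION (claim_date) COMPUTE STATISTICS;
    "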

Configured MySQL as the external metadata store for various Hadoop components and tools.
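
A minimal sketch of preparing such a MySQL database for the Hive metastore (the database name, user, and password are placeholders, not real values):

    mysql -u root -p -e "
    CREATE DATABASE metastore DEFAULT CHARACTER SET utf8;
    CREATE USER 'hive'@'%' IDENTIFIED BY 'changeme';
    GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%';
    FLUSH PRIVILEGES;"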

Continuously monitored and managed the Hadoop cluster using Cloudera Manager and the Linux command line.

Demonstrated live proofs of concept to clients.

Supported different teams on call and on demand.

Environment: Cloudera CDH 5.x, HDFS, Hive, Spark, Impala, Zookeeper, Oozie, HBase, Sqoop, Python, MySQL, Talend.

Hadoop Engineer

Goldman Sachs

Jul, 2017 – May, 2019

New York City, NY

Hands-on experience designing and implementing complete end-to-end Hadoop infrastructure, including Pig, Hive, Sqoop, Oozie, and ZooKeeper.

Used Sqoop to migrate data between HDFS and MySQL or Oracle, and deployed Hive and HBase integration to perform OLAP operations on HBase data.

Designed, planned and delivered a proof of concept and business function/division based implementation of a Big Data roadmap and strategy project.

Involved in loading and transforming large sets of structured, semi-structured, and unstructured data from relational databases into HDFS using Sqoop imports.

Involved in exporting the analyzed data to databases such as Teradata, MySQL, and Oracle using Sqoop, for visualization and report generation by the BI team.
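
An illustrative pair of Sqoop commands of the kind described above (the hosts, credentials, tables, and HDFS paths are placeholders):

    # Import a source table from MySQL into HDFS
    sqoop import --connect jdbc:mysql://db.example.com/sales \
      --username etl_user -P --table orders \
      --target-dir /data/landing/orders --num-mappers 4

    # Export curated results from HDFS back to an Oracle reporting table
    sqoop export --connect jdbc:oracle:thin:@//ora.example.com:1521/REPORTS \
      --username etl_user -P --table ORDER_SUMMARY \
      --export-dir /data/curated/order_summary --num-mappers 4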

Worked on the Oozie scheduler to automate the pipeline workflow and orchestrate the Sqoop, Hive, and HBase jobs that extract the data in a timely manner.
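
A minimal sketch of launching and checking such an Oozie workflow from the command line (the Oozie URL and properties file are placeholders; WORKFLOW_ID stands for the ID printed by the -run command):

    OOZIE_URL=http://oozie.example.com:11000/oozie

    # Submit and start the workflow defined by job.properties; prints "job: <workflow-id>"
    oozie job -oozie "$OOZIE_URL" -config job.properties -run

    # Check the status of that workflow
    oozie job -oozie "$OOZIE_URL" -info "$WORKFLOW_ID"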

Created Hive tables per requirements as internal or external tables, defined with appropriate static and dynamic partitions for efficiency.

Transformed the data using Hive and Pig for the BI team to perform visual analytics, according to the client requirements.

Developed scripts to automate end-to-end data management and synchronization across all the clusters.

Implemented the Fair Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.

Environment: Cloudera CDH, HDFS, MapReduce, Hive, Oozie, Pig, Shell Scripting, MySQL.

Hadoop Consultant

Ameren Corporation

Dec, 2015-Jun, 2017

St. Louis, MO

Designed and implemented complete end-to-end Hadoop infrastructure on Hortonworks HDP.

Designed, planned and delivered a proof of concept and business function/division based implementation of a Big Data roadmap and strategy project.

Involved in loading and transforming large sets of structured, semi-structured, and unstructured data from relational databases into HDFS using Sqoop imports.

Involved in a converged data platform built around MapReduce, designed with data movement in mind.

Involved in exporting the analyzed data to databases such as MySQL and Oracle using Sqoop, for visualization and report generation by the BI team.

Worked on the Oozie scheduler to automate the pipeline workflow and orchestrate the Sqoop and Hive jobs that extract the data in a timely manner.

Exported the generated results to Tableau for testing by connecting to the corresponding Hive tables using the Hive ODBC/JDBC connector.

Created Hive tables per requirements as internal or external tables, defined with appropriate static and dynamic partitions for efficiency.

Transformed the data using Sqoop and Talend for the BI team to perform visual analytics, according to the client requirements.

Implemented the Fair Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.

Environment: Hortonworks HDP 2.x and 3.x, HDFS, MapReduce, Hive, Oozie, HBase, Shell Scripting, Oracle, Tableau, Talend.

Linux Administrator

NBC Universal

Apr, 2014 – Nov, 2015

New York City, NY

Responsible for configuring and managing Squid server in Linux and Windows.

Supporting VMware installations.

Installation, configuration, management, and maintenance of Red Hat Enterprise Linux 6 systems.

Worked with the Logical Volume Manager (LVM) to create file systems per user and database requirements.
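
A minimal sketch of the LVM workflow this refers to (the device, volume group, logical volume names, and sizes are placeholders):

    pvcreate /dev/sdb                      # initialize the physical volume
    vgcreate datavg /dev/sdb               # create a volume group on it
    lvcreate -L 200G -n applv datavg       # carve out a logical volume
    mkfs.ext4 /dev/datavg/applv            # create the file system
    mkdir -p /app && mount /dev/datavg/applv /app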

Added and deleted users and groups, changed group membership and ownership, and implemented ACLs.
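
An illustrative sketch of these user and ACL operations (the user, group, and path names are placeholders):

    useradd -m -G developers jsmith        # create a user with a secondary group
    chown root:developers /data/shared     # set group ownership on a shared directory
    setfacl -m u:jsmith:rwx /data/shared   # grant an individual user an ACL entry
    getfacl /data/shared                   # verify the resulting ACL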

Package management using YUM and RPM, including installing and updating packages; disk quota management.

Built and configured Red Hat Linux systems over the network, implemented automated tasks through crontab, and resolved tickets on a priority basis.
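
A minimal sketch of scheduling such an automated task via crontab (the script and log paths are placeholders):

    # Append a nightly 01:30 job to the current user's crontab
    (crontab -l 2>/dev/null; echo "30 1 * * * /usr/local/bin/cleanup_logs.sh >> /var/log/cleanup_logs.log 2>&1") | crontab -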

Environment: Linux, MySQL, TCP/IP, Networking.

Technical Skills:

Operating Systems/Platforms: Linux, Unix, RHEL 7, CentOS, Windows

Programming Languages: SQL, HQL, Python, Pig Latin

Cloud Computing Services: AWS services such as EC2, S3, EMR, Glue, RDS, etc.

Big Data Ecosystem: Hadoop, MapReduce, Spark, Hue, HDFS, HBase, ZooKeeper, Hive, Sqoop, Oozie, YARN, Impala, Sentry, Ranger

Management Tools: Cloudera Manager, Ambari

ETL Tool: Talend

Security: Kerberos, Sentry, Ranger, LDAP, AD, SSL/TLS, data-at-rest encryption


