
Data Developer

Location:
AP, India
Posted:
October 24, 2014


Rewis Yonan

781-***-****

Objective

Seeking a position in a highly successful company in order to utilize my knowledge and skills in Hadoop development to contribute to the further success of the company.

SUMMARY

. 5+ years of total IT experience, with strong experience in Hadoop Big Data development.
. Experience with the MapReduce programming model and the Hadoop Distributed File System (HDFS).
. Working knowledge of HDFS admin shell commands.
. Reliable worker with the ability to independently learn new concepts and skills.
. Solid technical experience and an excellent reputation as a team player and collaborator.
. Good ability to interact with and assist customers from diverse cultures.
. Strong ability to multitask and work overtime if necessary.
. Excellent organizational and communication skills.

TECHNICAL SKILLS

Big Data: Hadoop, MapReduce, high-performance computing, Picard, data mining, Pig, Hive, HDFS, ZooKeeper, Kafka, MongoDB, AWS, and Tableau

Networking: CCNA; maintaining TCP/IP networks, LANs, and WANs (including wireless); Active Directory; accounts administration; backup/recovery and data restoration

Programming: C#, Java, C++, HTML

Database: SQL Server, MS Access (installation, administration, and security)

Software: Visual Studio, OPNET Modeler, MS Excel, MS Word, and Visio

Operating Systems: Windows NT, 2000, XP, 7, and 8

Hardware: Computer hardware and software installation and maintenance

EDUCATION:

. Bachelor of Science, Computer Science, British University in Egypt (BUE)

CERTIFICATIONS:

. MapR Certified Hadoop Administration (MCA), 77%

. Leadership skills course, Oxford University, UK

. Cisco Certified Network Associate (CCNA)

WORK EXPERIENCE

Client (Confidential)

Feb 2013 - Present

PROJECT 1

Hadoop Developer

POC: Twitter Data Mining Using the Hadoop MapReduce Framework

The main aim of this project was to track trending hashtags and trending discussions from real-time streaming data. We used the popular streaming tool Kafka to load the data onto the Hadoop file system and move the same data to MongoDB, a NoSQL database. We used productive algorithms to analyze the data on HDFS with MapReduce, Hive, and Pig. We obtained geo-tagged, location-based popular tweets and trending-hashtag data and counts on a daily basis to guide customers in posting ads on Twitter. We performed real-time analytics using Hadoop and displayed the end results using Tableau.
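For context, a minimal sketch of what the producer side of this pipeline could look like, assuming the Kafka 0.8-era Java producer API and the twitter4j stream listed in the technologies below; the broker address, topic name, and tracked terms are illustrative assumptions, and Twitter credentials are assumed to live in twitter4j.properties:

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;
import twitter4j.FilterQuery;
import twitter4j.HashtagEntity;
import twitter4j.Status;
import twitter4j.StatusAdapter;
import twitter4j.TwitterStream;
import twitter4j.TwitterStreamFactory;

public class TweetsToKafka {
    public static void main(String[] args) {
        // Kafka 0.8 producer configuration; the broker address is an assumption.
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        final Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));

        // Stream statuses from Twitter; credentials come from twitter4j.properties.
        TwitterStream stream = new TwitterStreamFactory().getInstance();
        stream.addListener(new StatusAdapter() {
            @Override
            public void onStatus(Status status) {
                // Key each message by its first hashtag so downstream jobs can
                // partition by tag; "tweets" as the topic name is an assumption.
                HashtagEntity[] tags = status.getHashtagEntities();
                String key = tags.length > 0 ? tags[0].getText() : "untagged";
                producer.send(new KeyedMessage<String, String>("tweets", key, status.getText()));
            }
        });
        stream.filter(new FilterQuery().track("bigdata", "hadoop")); // sample tracked terms
    }
}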

Technologies involved in this project: kafka_2.9.1-0.8.1.1, Cloudera CDH4.7 Hadoop cluster, AWS, Java 1.7, MySQL, MongoDB (NoSQL) 2.6, Tableau 8.2, hadoop-2.0.0, hive-0.10, hue-2.5, oozie-3.3.2, pig-0.11, sqoop-1.4.3, sqoop2-1.99, zookeeper-3.4.5, Red Hat Linux, UNIX scripting, twitter4j, log4j, Scala, JUnit testing, and Maven.

. Hadoop cluster: 30 nodes
. MongoDB cluster (MongoDB with 1000 IOPS)
. m3.xlarge nodes with a 4-core CPU, 15 GB of RAM, and 1.6 TB of storage per node
. Daily intake of 1 TB of data in JSON format with popular hashtags, along with 8 billion events and trillions of real-time tweets

. Using Kafka, we stream the data with twitter4j from the source to Hadoop. From Hadoop, we move the data to MongoDB using MapReduce, Hive, and Pig scripts connected through the mongo-hadoop connectors, analyze the data on HDFS, and send the results to the MongoDB databases to update the information in the existing collections; a job-driver sketch of the Mongo-bound step follows this list.
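A hedged sketch of how the Mongo-bound step of that pipeline could be wired up with the mongo-hadoop connector; the Mongo URI, the HDFS input path, and the use of Hadoop's built-in token-counting mapper and summing reducer are illustrative assumptions rather than the project's actual code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.TokenCounterMapper;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

import com.mongodb.hadoop.MongoOutputFormat;
import com.mongodb.hadoop.util.MongoConfigUtil;

public class TokenCountsToMongo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed target: database "twitter", collection "hashtag_counts".
        MongoConfigUtil.setOutputURI(conf, "mongodb://mongo-host:27017/twitter.hashtag_counts");

        Job job = Job.getInstance(conf, "token counts to mongo");
        job.setJarByClass(TokenCountsToMongo.class);
        job.setMapperClass(TokenCounterMapper.class); // built-in: emits (token, 1)
        job.setReducerClass(IntSumReducer.class);     // built-in: sums counts per token
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // The connector's output format writes each (key, value) pair into the
        // collection, roughly as { _id: key, value: value }.
        job.setOutputFormatClass(MongoOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path("/data/tweets/text")); // assumed HDFS path
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}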

Role and Responsibilities:

. Monitor the AWS Hadoop cluster using Cloudera Manager, adding nodes, decommissioning dead nodes, and monitoring health checks.
. Configure and monitor the MongoDB cluster in AWS and establish connections for Hadoop-to-MongoDB data transfer.
. Connect Tableau from the client end using the AWS IP addresses and view the end results.
. Install Kafka on the Hadoop cluster and configure the producer and consumer code in Java to establish a connection from the Twitter source to HDFS with popular hashtags.
. Copy the data from HDFS to MongoDB using Pig, Hive, and MapReduce scripts and visualize the streaming data in Tableau dashboards.
. Perform analytics using MapReduce, Hive, and Pig on HDFS, send those results back to the MongoDB databases, and update the information in collections.
. Develop shell scripts to automate the process of converting JSON to BSON; a driver-based sketch of that conversion follows this list.
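A minimal driver-based sketch of the JSON-to-BSON step from the last bullet, using the 2.x-era MongoDB Java driver, which encodes each parsed document as BSON on insert; the host, file path, database, and collection names are assumptions:

import java.io.BufferedReader;
import java.io.FileReader;

import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;
import com.mongodb.util.JSON;

public class JsonToBson {
    public static void main(String[] args) throws Exception {
        MongoClient client = new MongoClient("mongo-host", 27017); // assumed host
        DBCollection tweets = client.getDB("twitter").getCollection("tweets");
        // Read newline-delimited JSON exported from HDFS; the path is an assumption.
        BufferedReader in = new BufferedReader(new FileReader("/data/tweets/part-00000"));
        String line;
        while ((line = in.readLine()) != null) {
            DBObject doc = (DBObject) JSON.parse(line); // JSON text -> DBObject
            tweets.insert(doc);                         // driver stores it as BSON
        }
        in.close();
        client.close();
    }
}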

Faragello, Egypt
Dec 2011 - Oct 2012

SQL Server Developer

Faragello Foods is an Egyptian multinational corporation that operates in the food industry. The company is considered one of the biggest and most diversified food companies in the Middle East. It is the largest processor and marketer of chicken, beef, and pork.

In this project I worked as a developer and was actively involved in designing the database and developing packages for extraction and loading.

Responsibilities:

. Involved in creating ER diagrams and mapping the data into database objects.
. Designed the database and the tables.
. Built table relationships and wrote stored procedures to clean the existing data (a JDBC sketch of invoking such a procedure follows this list).
. Developed SQL scripts to insert, update, and delete data in MS SQL database tables.
. Evaluated database performance and performed maintenance duties such as tuning, backup, restoration, and disaster recovery.
. Wrote complex SQL statements using joins, subqueries, and correlated subqueries.
. Involved in job scheduling and alerts.
. Generated database SQL scripts and deployed databases, including installation and configuration.
. Designed and implemented user login and security.
. Involved in requirement gathering, analysis, design, development, change management, deployment, and user training.
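For illustration, a small JDBC sketch of invoking a data-cleanup stored procedure of the kind described above; the procedure name, parameter, and connection string are hypothetical:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;

public class RunCleanupProc {
    public static void main(String[] args) throws Exception {
        // Assumed SQL Server connection string and credentials.
        Connection con = DriverManager.getConnection(
                "jdbc:sqlserver://db-host:1433;databaseName=Sales", "user", "password");
        // {call ...} is the standard JDBC escape syntax for stored procedures.
        CallableStatement cs = con.prepareCall("{call dbo.usp_CleanCustomerData(?)}");
        cs.setInt(1, 2012); // hypothetical parameter: the year to clean
        cs.execute();
        cs.close();
        con.close();
    }
}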

Environment: SQL Server 2005 Enterprise Edition, .NET Framework, T-SQL, SSIS 2005, SSRS, SSAS, Windows 2003 Advanced Server, MS Excel, MS Access, Visual Studio 2005

PROJECT 2

Client: Hill Physicians Medical Group

Hadoop Developer/Admin

Hill Physicians Medical Group was formed in 1984 and is an independent practice association serving some 400,000 health plan members in Northern California. Hill Physicians has over 3,700 physicians and is affiliated with 36 hospitals and 15 urgent care centers. The company contracts with managed care plans, including CIGNA and Health Net, to provide care to health plan members through its provider affiliates.

Responsibilities:

. Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, Sqoop, and Flume.
. Responsible for writing MapReduce programs in Java; a representative sketch follows this list.
. Performed data analysis using Hive and Pig.
. Loaded log data into HDFS using Flume.
. Developed Hadoop monitoring processes (capacity, performance, consistency) to ensure that processing issues are identified and resolved.
. Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
. Gained very good business knowledge of health insurance, claim processing, fraud suspect identification, the appeals process, etc.
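A representative sketch of the kind of Java MapReduce program mentioned above, counting claim records per member against the Hadoop 1.x API; the CSV layout and HDFS paths are assumptions:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ClaimCounts {
    // Assumed CSV layout: memberId,claimId,amount,...
    public static class ClaimMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text memberId = new Text();
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length > 0 && !fields[0].isEmpty()) {
                memberId.set(fields[0]);
                ctx.write(memberId, ONE); // one count per claim record
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum)); // total claims per member
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "claim counts"); // Hadoop 1.x-style Job
        job.setJarByClass(ClaimCounts.class);
        job.setMapperClass(ClaimMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/data/claims"));        // assumed input
        FileOutputFormat.setOutputPath(job, new Path("/out/claim-counts")); // assumed output
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}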

Environment: Hadoop 1.x, Hive, Pig, HBase, Sqoop, Flume, Oozie, R programming language, Oracle 11g, PL/SQL, SQL*Plus, UNIX shell scripting, Java

Savoy Hotel, Sharm El Sheikh, South Sinai, Egypt
Mar 2010 - Nov 2011

Network Engineer

Description: Established the networking environment by designing the system configuration; directing system installation; and defining, documenting, and enforcing system standards. Maximized network performance by monitoring performance, troubleshooting network problems and outages, scheduling upgrades, and collaborating with network architects on network optimization. Secured the network system by establishing and enforcing policies and by defining and monitoring access.

Responsibilities:

. Installed and supported LANs, WANs, network segments, Internet, and intranet systems.
. Installed and maintained network hardware and software.
. Monitored networks to ensure security.
. Evaluated and modified system performance.
. Determined network and system requirements.
. Ensured network connectivity throughout the hotel's LAN/WAN.
. Administered servers, routers, switches, and firewalls.

RESEARCH:

Power Consumption in Wireless Sensor Networks

. Studied the performance of a new protocol for wireless networks with energy-driven requirements.
. Projections for pervasive-object communications in short-range networks.

Keywords: Big Data, HDFS (Hadoop Distributed File System), MapReduce ("Map Reduce", "Map-Reduce"), Sqoop, Cassandra, Python, CloudBase ("Cloud Base", "Cloud-Base"), Elastic MapReduce (EMR), Flume, HBase ("H Base", "H-Base"), HCatalog, Hive, Hue, Ambari, Apache, Avro, BigTop (bigtop, "big top", "big-top"), Cascading, Chukwa, Cloudera, MongoDB ("Mongo DB", "Mongo-DB"), Oozie, Pig, structured data, ZooKeeper.


