Data Manager

Location: New Rochelle, NY
Posted: May 25, 2017


SAKIRALI SAIYAD

** ********* ****** **. *

New Rochelle, NY 10805

(914) 902 - 8173

ac0hrw@r.postjobfree.com

OBJECTIVE:

Motivated professional seeking a position where I can apply my education while gaining additional skills.

EDUCATION:

Monroe College, New Rochelle, NY

Master of Computer Science, December 2016

GPA: 3.83

L.D. College of Engineering, Ahmedabad, India

Bachelor of Engineering, May 2011, GPA: 3.6

PROFESSIONAL SUMMARY:

Expertise in Hadoop ecosystem components (HDFS, MapReduce, YARN, HBase, Pig, Sqoop, Spark, Spark SQL, Spark Streaming, and Hive) for scalable, distributed, high-performance computing.

Experienced in installing, configuring, and maintaining Hadoop clusters.

Strong knowledge of creating and monitoring Hadoop clusters on Amazon EC2 and VMs, with Hortonworks Data Platform 2.1 and 2.2 and CDH3/CDH4 under Cloudera Manager on Linux and Ubuntu.

Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.

Good knowledge of single-node and multi-node cluster configurations.

Expertise in the Scala programming language and Spark Core.

Worked with AWS-based data ingestion and transformations.

Experienced in job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.

Good knowledge of Amazon EMR, Amazon RDS, S3 buckets, DynamoDB, and Redshift.

Analyze data, interpret results, and convey findings in a concise and professional manner.

Partner with the Data Infrastructure team and business owners to implement new data sources and ensure consistent definitions are used in reporting and analytics.

Promote a full-cycle approach including request analysis, creating/pulling datasets, report creation and implementation, and providing final analysis to the requestor.

Very good understanding of SQL, ETL, and data warehousing technologies.

Expert in T-SQL, creating and using stored procedures, views, and user-defined functions, and implementing business intelligence solutions using SQL Server 2000/2005/2008.

Developed web services modules for integration using SOAP and REST.

Good experience with Kafka and Storm.

Knowledge of the Java Virtual Machine (JVM) and multithreaded processing.

Good exposure to IDE tools such as Eclipse, NetBeans, and iReport.

Excellent exposure to database design and modeling using E/R diagrams.

Java developer with extensive experience with various Java libraries, APIs, and frameworks.

Hands-on development experience with RDBMSs, including writing complex SQL queries, stored procedures, and triggers (a brief JDBC sketch follows this summary).

Sound knowledge of designing data warehousing applications with tools such as Teradata, Oracle, and SQL Server.

Experience using the Talend ETL tool.

Strong in databases such as Sybase, DB2, Oracle, and MS SQL, and in working with clickstream data.

Strong working experience with Snowflake.
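
To illustrate the stored-procedure work described above, here is a minimal JDBC sketch in Java; the connection URL, credentials, and the procedure name usp_GetCustomerOrders are hypothetical placeholders, not details from an actual engagement.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;

    public class StoredProcExample {
        public static void main(String[] args) throws Exception {
            // Hypothetical SQL Server connection; URL, credentials, and the
            // procedure name usp_GetCustomerOrders are placeholders.
            String url = "jdbc:sqlserver://localhost:1433;databaseName=SalesDB";
            try (Connection conn = DriverManager.getConnection(url, "user", "password");
                 // {call ...} is the standard JDBC escape syntax for procedures.
                 CallableStatement stmt = conn.prepareCall("{call usp_GetCustomerOrders(?)}")) {
                stmt.setInt(1, 42);                     // bind the customer id parameter
                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString("order_id"));
                    }
                }
            }
        }
    }

Because the {call ...} escape syntax is part of the JDBC standard, the same pattern carries over to Oracle, Teradata, or MySQL drivers.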

TECHNICAL SKILLS:

Hadoop/Big Data Technologies: HDFS, MapReduce, Sqoop, Flume, Pig, Hive, Oozie, Impala, Spark, Splunk, ZooKeeper, Kafka

NoSQL Databases: HBase, Cassandra

Monitoring and Reporting: Tableau, custom shell scripts

Hadoop Distributions: AWS, Hortonworks, Cloudera, MapR

Build Tools: SQL Developer

Programming & Scripting: Java, SQL, Shell scripting, Python, Scala

Java Technologies: Servlets, JavaBeans, JDBC, Spring, Hibernate, SOAP/REST services

Databases: Oracle, MySQL, MS SQL Server, Teradata

Web Dev. Technologies: HTML, XML, JSON, CSS, jQuery, JavaScript, AngularJS

Version Control: SVN, CVS, Git

Operating Systems: Linux, Unix, Mac OS X, CentOS, Windows 10, Windows 8, Windows 7, Windows Server 2008/2003

PROFESSIONAL EXPERIENCES:

Global System LLC, Irving, Texas, USA (April 2016 – December 2016)

Project Trainee

Responsibilities:

Implemented a nine-node CDH3 Hadoop cluster on Red Hat Linux.

Involved in loading data from an Oracle database into HDFS using Sqoop.

Implemented MapReduce programs to compute top-K results, following MapReduce design patterns (see the sketch after this list).

Installed, configured, and maintained Apache Hadoop clusters for analytics and application development, along with Hadoop tools such as Hive, HSQL, Pig, HBase, OLAP, ZooKeeper, Avro, Parquet, and Sqoop on Arch Linux.

Experienced in applying structured modeling to unstructured data.

Developed a data pipeline using Flume, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.

Involved in loading the generated HFiles into HBase for fast access to a large customer base without taking a performance hit.

Involved in collecting and aggregating large amounts of log data using Apache Flume and staging the data in HDFS for further analysis.

Worked on installing the cluster, commissioning and decommissioning DataNodes, NameNode recovery, capacity planning, and slot configuration.

Used Pig as an ETL tool to perform transformations, event joins, and some pre-aggregations before storing the data in HDFS.

Wrote test cases and analyzed and reported test results to product teams.

Worked with AWS Data Pipeline.

Managed Hadoop workflows using Oozie.

Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.

Installed the Oozie workflow engine to run multiple Hive and Pig jobs, and used Sqoop to import and export data between HDFS and RDBMSs for visualization and report generation.

Involved in migrating ETL processes from Oracle to Hive to test easier data manipulation.

Worked on functional, system, and regression testing activities under an agile methodology.

Worked on a Python plugin for MySQL Workbench to upload CSV files.

Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting.
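
A minimal Java sketch of the top-K pattern referenced above, assuming hypothetical tab-separated (id, score) records; K and the record layout are illustrative, not details from the original project. Each mapper keeps only a local top K and a single reducer merges the candidates.

    import java.io.IOException;
    import java.util.TreeMap;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Top-K pattern: each mapper keeps only its local top K records and emits
    // them in cleanup(); a single reducer merges all candidates into the final K.
    public class TopKExample {
        private static final int K = 10;   // hypothetical K

        public static class TopKMapper
                extends Mapper<LongWritable, Text, NullWritable, Text> {
            // Sorted by score; records tied on score overwrite one another,
            // which is acceptable for a sketch.
            private final TreeMap<Long, Text> topK = new TreeMap<>();

            @Override
            protected void map(LongWritable key, Text value, Context ctx) {
                // Hypothetical record layout: "<id>\t<score>".
                String[] fields = value.toString().split("\t");
                topK.put(Long.parseLong(fields[1]), new Text(value));
                if (topK.size() > K) {
                    topK.remove(topK.firstKey());   // drop the smallest score
                }
            }

            @Override
            protected void cleanup(Context ctx)
                    throws IOException, InterruptedException {
                for (Text record : topK.values()) {
                    ctx.write(NullWritable.get(), record);
                }
            }
        }

        public static class TopKReducer
                extends Reducer<NullWritable, Text, NullWritable, Text> {
            @Override
            protected void reduce(NullWritable key, Iterable<Text> values, Context ctx)
                    throws IOException, InterruptedException {
                TreeMap<Long, Text> topK = new TreeMap<>();
                for (Text value : values) {
                    String[] fields = value.toString().split("\t");
                    topK.put(Long.parseLong(fields[1]), new Text(value));
                    if (topK.size() > K) {
                        topK.remove(topK.firstKey());
                    }
                }
                // Emit highest scores first.
                for (Text record : topK.descendingMap().values()) {
                    ctx.write(NullWritable.get(), record);
                }
            }
        }
    }

The driver would set job.setNumReduceTasks(1) so that every mapper's candidates reach the same reducer.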

Relief Infraprojects Pvt. Ltd., Ahmedabad, Gujarat, India (June 2012 – December 2014)

Project Engineer

Helped improve business processes for this regional mid-level marketing and construction firm by developing, installing, and configuring Hadoop ecosystem components that moved data from individual servers to HDFS.

Installed and configured MapReduce, Hive, and HDFS; implemented a CDH3 Hadoop cluster on CentOS.

Assisted with performance tuning and monitoring.

Created HBase tables to load large sets of structured, semi-structured, and unstructured data coming from UNIX systems, NoSQL stores, and a variety of portfolios.

Supported code/design analysis, strategy development, and project planning.

Created reports for the BI team, using Sqoop to export data into HDFS and Hive.

Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch after this list).

Assisted with data capacity planning and node forecasting.

Collaborated with the infrastructure, network, database, application, and BI teams to ensure data quality and availability.

Administered Pig, Hive, and HBase, installing updates, patches, and upgrades.
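
A minimal sketch of the kind of map-only cleaning job described above; the pipe-delimited record layout and validity rules are hypothetical placeholders.

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map-only cleaning job: malformed records are counted and dropped,
    // valid ones are normalized and passed through. No reducer is needed,
    // so the driver would call job.setNumReduceTasks(0).
    public class CleaningMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        public enum Quality { MALFORMED }   // surfaced in the job counters

        private final Text out = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            // Hypothetical layout: "id|name|amount".
            String[] fields = value.toString().split("\\|");
            if (fields.length != 3 || fields[0].trim().isEmpty()) {
                ctx.getCounter(Quality.MALFORMED).increment(1);
                return;                     // drop the bad record
            }
            // Normalize: trim whitespace and lower-case the name field.
            out.set(fields[0].trim() + "|" + fields[1].trim().toLowerCase()
                    + "|" + fields[2].trim());
            ctx.write(NullWritable.get(), out);
        }
    }

With zero reduce tasks, mapper output is written straight to HDFS, which keeps cleaning passes cheap.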

Padmavati Computers, Ahmedabad, Gujarat, India (June 2011 – May 2012)

Project Trainee

The project developed a web-based application to eliminate paperwork in hospitals and laboratories, read data from different instruments, store it in a relational database, and generate business intelligence reports for management.

Designed and implemented the training and reports modules of the application using Servlets, JSP, and Ajax.

Developed custom JSP tags for the application.

Wrote queries for fetching and manipulating data using the iBatis ORM framework.

Used Quartz schedulers to run jobs sequentially at given times (see the sketch after this entry).

Implemented design patterns such as Filter, Cache Manager, and Singleton to improve the performance of the application.

Implemented the reports module of the application using Jasper Reports to display dynamically generated reports for business intelligence.

Deployed the application at the client's location on a Tomcat server.

Environment: HTML, JavaScript, Ajax, Servlets, JSP, iBatis, Tomcat, PostgreSQL, Jasper Reports.
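
A minimal sketch of the Quartz scheduling mentioned above, written against the Quartz 2.x API rather than whichever version the original project used; the job class and cron expression are hypothetical.

    import org.quartz.CronScheduleBuilder;
    import org.quartz.Job;
    import org.quartz.JobBuilder;
    import org.quartz.JobDetail;
    import org.quartz.JobExecutionContext;
    import org.quartz.Scheduler;
    import org.quartz.Trigger;
    import org.quartz.TriggerBuilder;
    import org.quartz.impl.StdSchedulerFactory;

    public class NightlyReportScheduler {

        // Hypothetical job: regenerate the BI reports.
        public static class ReportJob implements Job {
            @Override
            public void execute(JobExecutionContext ctx) {
                System.out.println("Generating reports...");
            }
        }

        public static void main(String[] args) throws Exception {
            Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
            scheduler.start();

            JobDetail job = JobBuilder.newJob(ReportJob.class)
                    .withIdentity("reportJob")
                    .build();

            // Fire every night at 2:00 AM; the cron expression is a placeholder.
            Trigger trigger = TriggerBuilder.newTrigger()
                    .withIdentity("nightlyTrigger")
                    .withSchedule(CronScheduleBuilder.cronSchedule("0 0 2 * * ?"))
                    .build();

            scheduler.scheduleJob(job, trigger);
        }
    }

Chaining further triggers or jobs in sequence works the same way: each JobDetail/Trigger pair is registered with the shared scheduler.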

AREAS OF EXPERTISE:

Big Data Management: Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Hive, Pig, Sqoop, Cassandra, Oozie, Flume, YARN

Programming Languages: Java, C/C++, Python, Scala

Scripting Languages: JSP & Servlets, PHP, JavaScript, XML, HTML, Python, Bash

Databases: NoSQL, Oracle

UNIX Tools: Apache

Tools: Eclipse, JDeveloper, MS Visual Studio

Platforms: Windows (2000/XP), Linux, Solaris

Application Servers: Apache Tomcat 5.x/6.0

Testing Tools: NetBeans, Eclipse

Methodologies: Agile, Design Patterns

LANGUAGE SKILLS:

Fluent in English, Hindi, Urdu, and Gujarati.

REFERENCES:

Available upon request.


