Data Test Cases

Location:

Wantagh, NY

Posted:

September 29, 2016


Resume:

Sindhu Y

******.********@*****.***

516-***-****

OBJECTIVE

Looking for a position to utilize and contribute my skills in an organization that offers professional growth and in-depth experience.

EDUCATION

New York Institute of Technology, NY.

Master of Computer Science, May 2015.

Major GPA (3.56)

SKILLS (Coursework)

Good knowledge of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce/YARN concepts.

Experience in installing, configuring, and administering Hadoop clusters for various distributions such as Apache, Cloudera, MapR, and Hortonworks.

Very good understanding of NoSQL databases like HBase and Cassandra.

Experience in building large-scale, highly available web applications. Working knowledge of web services and other integration patterns.

Experience in managing and reviewing Hadoop log files.

Experience in using Pig, Hive, and Sqoop.

Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.

Experience in using JBoss, IBM WebSphere, and Apache Tomcat.

Experience in using configuration tools Harvest and Microsoft Visual SourceSafe

Experience in using different IDEs like Eclipse, NetBeans, and WSAD.

Experience in querying streams of data using RxJS.

Worked on deployment, installation, configuration, and troubleshooting of application servers like Apache JBoss, JBoss Admin 4.0, Apache Tomcat, and WebSphere.

Developed projects and products following the SDLC (software development life cycle), from initiation and planning through design, development, execution, and implementation.

Generated Java APIs for retrieval and analysis on NoSQL databases such as HBase.

Knowledge of job workflow scheduling and monitoring tools like Oozie and ZooKeeper.

Knowledge of data warehousing and ETL tools like Informatica and Pentaho.

Knowledge of the Spark framework and Scala.

Knowledge of job workflow scheduling and monitoring tools like Oozie and ZooKeeper; of NoSQL databases such as HBase and Cassandra; of administrative tasks such as installing Hadoop and commissioning and decommissioning nodes; and of ecosystem components such as Flume, Oozie, Hive, and Pig.

Knowledge of the Software Development Life Cycle (Requirements Analysis, Design, Development, Testing, Deployment, and Support).

Excellent Java development skills using J2EE, Spring, J2SE, Servlets, JUnit, MRUnit, JSP, JDBC.

TECHNICAL SKILLS

Programming Languages : C, C++, Java, Shell Scripting, PL/SQL

J2EE Technologies : Spring, Servlets, JSP, JDBC, Hibernate.

Big Data Ecosystem : HDFS, HBase, MapReduce, Hive, Pig, Python, Sqoop, Impala, Kafka, Cassandra, Oozie, ZooKeeper, Flume.

DBMS : Oracle 11g, SQL Server, MySQL.

Modeling Tools : UML on Rational Rose 4.0.

Web Technologies : HTML, JavaScript, XML, jQuery, Ajax, CSS.

Web Services : WebLogic, WebSphere, Apache, Cassandra, Tomcat.

IDEs : Eclipse, NetBeans, WinSCP.

Operating Systems : Windows, Unix, Linux (Ubuntu), Solaris, CentOS.

Version and Source Control : CVS, SVN.

Servers : Apache Tomcat.

Frameworks : MVC, Struts, Log4J, JUnit, Maven, ANT, Web Services.

EXPERIENCE

Client: Amerigroup – Virginia Beach, VA July 2015 to Present

Role: Hadoop Developer

Description:

Amerigroup is a US health insurance and managed health care provider company headquartered in Virginia Beach, Virginia.

Responsibilities:

Design, deploy, and manage cluster nodes for our data platform operations (racking/stacking).

Install and configure the cluster. Set up Puppet for centralized configuration management.

Monitor the cluster using various tools to see how the nodes are performing.

Manage the received data using the Pentaho ETL tool and upload it to the database.

Expertise in cluster tasks such as adding and removing nodes without affecting running jobs or data.

Write scripts to automate application deployments and configurations. Monitor YARN applications. Troubleshoot and resolve cluster-related system problems.

Wrote MapReduce programs to clean and pre-process the data coming from different sources.

Implemented various output formats such as SequenceFile and Parquet in MapReduce programs. Also implemented multiple output formats in the same program to match the use cases.

Developed Hadoop Streaming MapReduce jobs using Python.

Performed benchmarking of the NoSQL databases Cassandra and HBase.

Hands-on experience with Lambda architectures.

Created a data model for structuring and storing the data efficiently. Implemented partitioning and bucketing of tables in Cassandra.

Implemented test scripts to support test driven development and continuous integration.

Converted text files into Avro and then to Parquet format so the files could be used with other Hadoop ecosystem tools.

Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.

Exported the analyzed data to HBase using Sqoop to generate reports for the BI team.

Analyzed large data sets to determine the optimal way to aggregate and report on them.

Participate in the requirement gathering and analysis phase of the project, documenting the business requirements by conducting workshops and meetings with various business users.

POC work is ongoing, using Spark and Kafka for real-time processing.

Developed a data pipeline using Kafka and Storm to store data into HDFS.

Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.

POC work is ongoing, comparing the Cassandra and HBase NoSQL databases.

Hands-on experience in Pentaho.

Environment: MapReduce, HDFS, Hive, Pig, Hue, Oozie, Core Java, Eclipse, HBase, Flume, Spark, Kafka, Cloudera Manager, Cassandra, Python, Greenplum DB, IDMS, VSAM, SQL*Plus, Toad, PuTTY, Windows NT, UNIX Shell Scripting, Pentaho, Talend, Big Data, YARN

Client: InfoTech Enterprises, Hyderabad, India May 2011 – July 2013

Role: Java/J2EE Developer

Responsibilities:

Implemented services using Core Java.

Developed and deployed UI layer logic of sites using JSP.

Developed the XML data object to generate the PDF documents and other reports.

Used Hibernate, DAO, and JDBC for data retrieval and modifications from the database.

Handled messaging and interaction between web services using SOAP.

Developed JUnit test cases for unit tests as well as system and user test scenarios.

Designed and developed a Struts-like MVC 2 web framework using the front-controller design pattern, which is used successfully in a number of production systems.

Spearheaded the “Quick Wins” project by working very closely with the business and end users to improve the current website’s ranking from 23rd to 6th in just 3 months.

Normalized the Oracle database, conforming to design concepts and best practices.

Resolved product complications at customer sites and funneled the insights to the development and deployment teams so they could adopt a long-term product development strategy with minimal roadblocks.

Persuaded business users and analysts to adopt alternative solutions that are more robust and simpler to implement from a technical perspective while satisfying the functional requirements from the business perspective.

Applied design patterns and OO design concepts to improve the existing Java/JEE-based code base.

Identified and fixed transactional issues due to incorrect exception handling and concurrency issues due to unsynchronized blocks of code.

Environment:

Java 1.2/1.3, Swing, Applet, Servlets, JSP, custom tags, JNDI, JDBC, XML, XSL, DTD, HTML, CSS, JavaScript, Oracle, DB2, PL/SQL, WebLogic, JUnit, Log4J and CVS.

ADDITIONAL:

Worked as a Graduate Assistant for over one year at the Admission Office at NYIT.

Team Player.

Open to learning and feedback.

Good interpersonal communication skills.
