Dhruv Amin
*****.******@*****.***
Seeking challenging opportunities in the world of Big Data/Hadoop where I can apply my experience and skills to solve complex business problems.
Skills Summary
●3+ years of experience as a Hadoop/Big Data consultant in QA and development
●Good practical exposure to big data technologies on Apache Hadoop
●Strong practical knowledge of HDFS, MapReduce, Pig, Sqoop, Hive, Spark, and Scala
●Comprehensive experience in UNIX/Linux bash shell programming
●Well versed in SQL
●Understand the complex processing needs of big data and have experience developing code and modules to address those needs
●Proficient in building Hive and Pig scripts for data analysis
●Experience in importing and exporting data between RDBMSs such as MySQL and HDFS/Hive using Sqoop
●Hands-on experience in functional, integration, system, regression, and end-to-end testing
●Experience in data analysis, data validation, data cleansing, data standardization, data verification, and identifying data mismatches
●Excellent analytical, problem-solving, communication, and interpersonal skills, with the ability to interact with individuals at all levels and to work both independently and as part of a team
●Experience in testing Data Warehouse/ETL applications developed in Informatica, Ab Initio, and DataStage using SQL Server, Oracle, DB2, and UNIX. Able to evaluate ETL/BI specifications and processes
●Quick learner with good analytical and communication skills
Technologies
●Big Data Ecosystem: Hadoop, HBase, Hive, Pig, Sqoop, Oozie, Kafka, MapReduce, Spark
●Programming Languages: C, Objective-C, Java, Shell Scripting, Scala
●Databases: Oracle, MySQL, SQL Server
●ETL Tool: Informatica PowerCenter
●Testing Technologies: Functional, regression, system, UAT, end-to-end, manual, and automation testing
Experience
Big Data/Hadoop Consultant
Client - Wind Mobile June 2015 – Present
●Processed schema-oriented and non-schema-oriented data using Spark with Scala
●Worked on a 12-node cluster to process data to meet business needs
●Designed and developed MapReduce jobs to generate customer-specific reports
●Created a design for mapping data to business objectives
●Used Hadoop's ability to store large quantities of structured, semi-structured, and unstructured data at scale to improve the accuracy of inventory stock data. Data sources included CSV files, Excel files, JSON, and similar formats
●Comprehensive knowledge of Hadoop architecture and the tools available in the Hadoop ecosystem. Experience in all phases of Hadoop projects, including tool selection, security considerations, performance, and production implementation
●Planned QA activities, and estimated and scheduled the work effort for the testing phase of projects
●Combined small files into large files for efficient processing on HDFS using MapReduce
●Extensively used Sqoop to ingest customer data
●Used HBase for real-time analysis
●Wrote Hive queries and UDFs to run analytics on datasets
●Environment: Hortonworks, Spark, Scala, HBase, Sqoop, Hive
Big Data/Hadoop Consultant
Client - National Bank Oct 2013 – May 2015
●Designed and developed Hadoop solutions using Pig, Hive, MapReduce, Spark, Sqoop, and HDFS on Cloudera Hadoop
●Involved in loading data into HBase using HBase Shell, HBase Client API, Pig and Sqoop
●Created multiple Hive tables and implemented partitioning, dynamic partitioning, and bucketing in Hive for efficient data access
●Ran Hive queries to analyze data in the Hive warehouse using Hive Query Language (HQL)
●Created Hive UDFs according to business requirements
●Worked on Oozie workflow engine for job scheduling
●Tested XML files and verified that data was parsed and loaded into staging tables
●Developed test cases based on the test matrix, including test data preparation for data completeness, data transformations, data quality, performance, and scalability
●Created detailed test cases for each test phase to ensure complete coverage, incorporating both positive and negative test conditions
Database Developer
StratIS Inc. Feb 2013 – Sept 2013
●Worked on data manipulation exercises, fetching employee-level information using SQL
●Hands-on experience with advanced SQL data extraction queries using GROUP BY, ORDER BY, joins (inner, left, right), and subqueries on employee-, member-, and payer-level data
●Created and managed client databases, as well as systems for importing client data and reporting finalized data
●Monitored database activity, integrity, and security, and troubleshot locks to maintain high availability
●Performed data cleaning as needed to ensure data was accurate
●Monitored Oracle database batch jobs daily to make sure they ran successfully
●Implemented ETL scripts to move client data from multiple sources into the SQL database and vice versa
●Analyzed execution plans and tuned queries and databases for better performance
●Modified databases and products according to client needs
Education
Bachelor's in Biomedical Engineering