
Data SQL Server

Location:
Hyderabad, Telangana, India
Posted:
April 25, 2018


Anand

Hadoop with SQL Developer

ac481k@r.postjobfree.com

770-***-****

Professional Summary:

* ***** ** ************ ********** in IT, including Big Data Hadoop Development and SQL Development, along with experience in Application Development.

Experience in installation, configuration, management and deployment of Hadoop Cluster, HDFS, Map Reduce, Pig, Hive, Sqoop, Apache Storm, Flume, Oozie, HBase and Zookeeper.

Proficient in Extracting, Transforming and Loading (ETL) data from different types of sources, such as Excel, Oracle and flat files, using SQL Server Integration Services (SSIS), moving data from OLTP to OLAP.

Good experience in processing Unstructured, Semi-structured and Structured data.

Hands-on experience creating objects such as tables, views, indexes, stored procedures, triggers, user-defined functions and data dictionaries using T-SQL scripts.

Proficient in SQL Server installation, configuration, query optimization, backup/recovery and database consistency checks.

Procedural knowledge of cleansing and analyzing data using Hive and Presto on the Hadoop platform, as well as on relational databases such as Oracle, SQL Server and Teradata, and on MongoDB.

Extensively worked on creating Teradata BTEQ scripts and used Informatica to load data into Teradata. Involved in optimizing existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames and pair RDDs.

Good knowledge of installing, configuring and using Hadoop components like MapReduce, HDFS, Hive, Sqoop, Pig, Zookeeper and Flume.

In-depth knowledge of databases such as SQL Server and MySQL, and extensive experience in writing SQL queries, stored procedures, triggers, cursors, functions and packages.

Working experience in managing, monitoring and tuning SQL Server performance using SQL Server Profiler/Monitor and Windows Performance Monitor.

Experience in scripting, analyzing and debugging both new and existing Complex Stored procedures.

Thorough understanding of HDFS and the MapReduce framework, and extensive experience in developing MapReduce jobs.

Experienced in building highly scalable big-data solutions using Hadoop across multiple distributions (Cloudera, Hortonworks) and NoSQL platforms.

Well-versed in scheduling Jobs, Alerts and SQL Mail Agent using SQL Server Agent Services.

Hands-on experience with major components in the Hadoop ecosystem, including Hive, HBase, HBase-Hive integration, Pig, Sqoop and Flume, and knowledge of the MapReduce/HDFS framework.

Involved in migration from MS SQL Server 2005 to 2008 and from DTS to SSIS, running and scheduling SSIS packages, using Lookups to clean bad data, and validating data with SSIS.

Outstanding in data migration from heterogeneous sources such as Oracle, MS Access and flat files to SQL Server using SSIS.

SAS Certified Programmer with experience in Pharmaceutical industry involving data management and statistical analysis for clinical trials.

Flexible, enthusiastic and project oriented team player with excellent written, verbal communication and leadership skills to develop creative solutions for challenging client needs.

Experience in providing 24x7 support and working an on-call rotation schedule.

Education:

Bachelor's in Computer Science

Technical Skills:

Big Data Technologies

Hadoop 1.x/2.x (YARN), HDFS, MapReduce, Pig, Hive, HBase, Cassandra, Zookeeper, Oozie, Sqoop, Flume, HCatalog, Apache Spark, Scala, Impala, Kafka, Storm, Tez, Ganglia, Nagios

Hadoop Distributions

Cloudera, Hortonworks, AWS

Operating Systems

Windows, Macintosh, Linux, Ubuntu, Unix, CentOS.

Programming Languages

C, C++, T-SQL, PL/SQL, Java, J2EE, SQL, Pig Latin, HiveQL, Scala, Python, Unix Shell Scripting

Database Tools

Enterprise Manager, Query Analyzer, SQL Profiler, Upgrade Wizard, Replication, Database Engine Tuning Advisor, Business Intelligence Development Studio (BIDS), LiteSpeed.

Databases

MS-SQL, MS-Access, NoSQL, MS SQL Server […] 2012, MySQL, Oracle.

Reporting Tools/ETL Tools

Tableau, Informatica, DataStage, Talend, Pentaho, Power View, SQL Server Integration Services (SSIS), Data Transformation Services (DTS), BCP

SAS Tools

SAS/BASE, SAS/MACROS, SAS/STAT, SAS/GRAPH, SAS/SQL, SAS/ACCESS, SAS/ODS, SAS/REPORTS.

Methodologies

Agile/Scrum, Waterfall, DevOps

Protocols

HTTP, TCP/IP, FTP

Web Technologies

Web Services, XML, HTML

Development Tools

Eclipse, NetBeans, IntelliJ, Hue, Microsoft Office Suite (Word, Excel, PowerPoint, Access)

Professional Experience:

Client: IMS Health, PA Jul 2017 – Present

Role: Hadoop Developer

Description: IMS Health and Quintiles are now IQVIA. The company is committed to providing solutions that enable healthcare companies to innovate with confidence, maximize opportunities and, ultimately, drive healthcare forward.

Responsibilities:

Responsible for installation and configuration of Hive, Pig, Sqoop, Flume and Oozie on the Hadoop Cluster.

Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig Scripts on data.

Designed workflows and coordinators in Oozie to automate and parallelize Hive and Pig jobs in Cloudera Hadoop (CDH 5.8.0).

Gained familiarity with both the Hue UI and the Hive CLI for accessing HDFS files and data.

Involved in developing Hive DDLs to create, alter and drop Hive tables, and worked with Storm and Kafka.

Developed a data pipeline using Kafka and Storm to store data into HDFS.
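
A minimal sketch of this kind of pipeline in Java follows; the topic name, ZooKeeper host, HDFS paths and class names are hypothetical, Storm package names differ between the 0.x and 1.x releases, and a simplified custom bolt writing through the Hadoop FileSystem API stands in for the storm-hdfs HdfsBolt.

import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.kafka.KafkaSpout;
import org.apache.storm.kafka.SpoutConfig;
import org.apache.storm.kafka.StringScheme;
import org.apache.storm.kafka.ZkHosts;
import org.apache.storm.spout.SchemeAsMultiScheme;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

public class KafkaToHdfsTopology {

    // Terminal bolt: appends each incoming record as a line to a per-task HDFS file.
    public static class HdfsWriterBolt extends BaseRichBolt {
        private transient FSDataOutputStream out;
        private transient OutputCollector collector;

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            try {
                FileSystem fs = FileSystem.get(new Configuration());
                out = fs.create(new Path("/data/raw/events-" + context.getThisTaskId() + ".log"));
            } catch (Exception e) {
                throw new RuntimeException("Unable to open HDFS output file", e);
            }
        }

        @Override
        public void execute(Tuple tuple) {
            try {
                out.write((tuple.getString(0) + "\n").getBytes("UTF-8"));
                collector.ack(tuple);
            } catch (Exception e) {
                collector.fail(tuple);
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // terminal bolt, no downstream streams declared
        }
    }

    public static void main(String[] args) throws Exception {
        // ZooKeeper-based KafkaSpout configuration (hypothetical host and topic).
        SpoutConfig spoutConfig =
                new SpoutConfig(new ZkHosts("zk1:2181"), "events", "/kafka-spout", "event-reader");
        spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-events", new KafkaSpout(spoutConfig), 2);
        builder.setBolt("hdfs-writer", new HdfsWriterBolt(), 2).shuffleGrouping("kafka-events");
        StormSubmitter.submitTopology("events-to-hdfs", new Config(), builder.createTopology());
    }
}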

Developed Hive UDF to parse the staged raw data to get the item details from a specific store.

Built reusable Hive UDF libraries for business requirements, enabling users to apply these UDFs in Hive queries.
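
For illustration, a minimal Hive UDF of the kind described above might look like the following Java sketch; the class name and record layout are hypothetical, and it assumes the classic org.apache.hadoop.hive.ql.exec.UDF API.

package com.example.hive.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class ExtractStoreItem extends UDF {
    // Given a raw pipe-delimited record and a store id, return the item name
    // for that store, or null when the record does not match.
    // Assumed layout: storeId|itemId|itemName|price
    public Text evaluate(Text rawRecord, Text storeId) {
        if (rawRecord == null || storeId == null) {
            return null;
        }
        String[] fields = rawRecord.toString().split("\\|");
        if (fields.length < 3 || !fields[0].equals(storeId.toString())) {
            return null;
        }
        return new Text(fields[2]);
    }
}

Once packaged, a UDF like this is typically registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL.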

Designed workflows by scheduling Hive processes for log file data streamed into HDFS using Flume.

Developed Hive (0.11.0.2) and Impala (2.1.0 and 1.3.1) queries for end-user/analyst requirements to perform ad-hoc analysis.

Involved in building the runnable JARs for the module framework through Maven clean and Maven dependencies.

Tested Apache Tez, an extensible framework for building high-performance batch and interactive data processing applications, on Pig and Hive jobs.

Wrote multiple MapReduce programs in Java for data extraction, transformation and aggregation from multiple file formats, including XML, JSON, CSV and other compressed formats.
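
As an illustrative sketch of such a job in the Java MapReduce API (the job name, input layout and column positions are hypothetical), a simple CSV aggregation might look like this:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CategoryCount {
    public static class CategoryMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text category = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] cols = value.toString().split(",");
            if (cols.length > 1) {                 // skip malformed rows
                category.set(cols[1].trim());      // assumed: column 2 holds the category
                context.write(category, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long sum = 0;
            for (LongWritable v : values) {
                sum += v.get();
            }
            context.write(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "category-count");
        job.setJarByClass(CategoryCount.class);
        job.setMapperClass(CategoryMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

A job like this would normally be packaged into a JAR and launched with the standard hadoop jar command, passing the input and output HDFS paths as arguments.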

Developed SQL scripts to compare all the records for every field and table at each phase of the data movement process from the original source system to the final target.

Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala, and gained experience in using Spark Shell and Spark Streaming.
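
A hedged Java sketch of such a conversion, assuming Spark 2.x's SparkSession with Hive support (the table and column names are hypothetical), expressing the same aggregate first through Spark SQL and then as pair-RDD transformations:

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import scala.Tuple2;

public class HiveToSparkSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("hive-to-spark-sketch")
                .enableHiveSupport()               // read existing Hive tables
                .getOrCreate();

        // The original HiveQL, run unchanged through Spark SQL.
        Dataset<Row> viaSql = spark.sql(
                "SELECT store_id, COUNT(*) AS cnt FROM sales GROUP BY store_id");

        // The same logic rewritten as pair-RDD transformations.
        JavaPairRDD<String, Long> viaRdd = spark.table("sales")
                .javaRDD()
                .mapToPair(row -> new Tuple2<>(row.getAs("store_id").toString(), 1L))
                .reduceByKey(Long::sum);

        viaSql.show();
        viaRdd.take(10).forEach(t -> System.out.println(t._1() + " -> " + t._2()));
        spark.stop();
    }
}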

Responsible for continuous monitoring and managing the Hadoop Cluster using Cloudera Manager.

Loaded and transformed large sets of structured, semi-structured and unstructured data.

Participated in regular stand-up meetings, status calls and business-owner meetings with stakeholders and risk management teams in an Agile environment.

Supported code/design analysis, strategy development and project planning.

Followed the Scrum implementation of the scaled agile methodology for the entire project.

Environment: Cloudera Hadoop Cluster, Unix Servers, Shell Scripting, Java Map Reduce, Hive, Storm, Sqoop, Flume, Oozie, Kafka, Git, Eclipse, Tableau.

Client: Speedway, Enon, Ohio Apr 2016 – Jun 2017

Role: Hadoop Developer

Description: Speedway is a chain of convenience stores and fuel stations whose rewards program lets members earn points toward free fuel, food, merchandise and gift cards with nearly every purchase.

Responsibilities:

Worked on a live 60-node Hadoop cluster running CDH 5.4.4, CDH 5.2.0 and CDH 5.2.1.

Worked on the Hadoop cluster using different big data analytic tools including Kafka, Pig, Hive and MapReduce.

Developed simple to complex MapReduce streaming jobs using Python, implemented alongside Hive and Pig.

Implemented data access jobs through Pig, Hive, HBase (0.98.0) and Storm (0.91).

Involved in loading data from LINUX file system to HDFS

Importing and exporting data into HDFS and Hive using Sqoop.

Altered existing Scala programs to enhance performance and obtain partitioned results using Spark.

Worked on processing unstructured data using Pig and Hive.

Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.

Used Impala to read, write and query the Hadoop data in HDFS or HBase.

Involved in scheduling Oozie workflow engine to run multiple Hive and Pig jobs.

Developed Pig Latin Scripts to extract data from the web server output files to load into HDFS.

Responsible for taking backups and restoring the Tableau repository.

Converted ETL operations to the Hadoop system using Pig Latin operations, transformations and functions.

Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig Scripts.

Exported the result set from Hive to MySQL using Shell Scripts.

Actively involved in code review and bug fixing for improving the performance.

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Storm, Kafka, Linux, Hortonworks distribution, Big Data, Java APIs, Java Collections, SQL, NoSQL, MongoDB.

Client: Vanguard, Malvern, PA Dec 2014- Mar 2016

Role: Hadoop Administrator/Developer

Description: The Vanguard Group is an American Investment management company which is the largest provider of mutual funds and exchange-traded funds in the world. Vanguard also provides brokerage services, asset management, educational account services and trust services.

Responsibilities:

Responsible for installation, configuration, maintenance, monitoring, performance tuning and troubleshooting Hadoop Clusters in different environments such as Development Cluster, Test Cluster and Production.

Used the JobTracker to assign MapReduce tasks to TaskTrackers in the cluster of nodes.

Handled importing of data from various data sources, performed transformations using Hive, MapReduce, loaded data into HDFS and extracted the data from MySQL into HDFS using Sqoop.

Processed HDFS data and created external tables using Hive and developed scripts to ingest and repair tables that can be reused across the project.

Implemented Kerberos security in all environments.

Defined file system layout and data set permissions.

Implemented Capacity Scheduler to share the resources of the cluster for the MapReduce jobs given by the users.

Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.

Involved in loading data from Linux and Unix file system to HDFS.

Involved in emitting processed data from Hadoop to relational databases or external file systems using Sqoop or the HDFS get/copyToLocal commands.

Involved in cluster planning and setting up the multi-node cluster.

Used Ganglia to monitor the cluster and Nagios to send alerts around the clock.

Commissioned and Decommissioned nodes from time to time.

Involved in HDFS maintenance and administering it through the HDFS Java API.
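
A small sketch of this style of maintenance through the HDFS Java API (the paths and retention period are hypothetical; it assumes fs.defaultFS is set in the client configuration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ContentSummary;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class HdfsMaintenance {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // Report space used under a data-set root.
        ContentSummary summary = fs.getContentSummary(new Path("/data/warehouse"));
        System.out.println("Bytes used: " + summary.getLength());

        // Tighten permissions on a landing directory (750).
        fs.setPermission(new Path("/data/landing"), new FsPermission((short) 0750));

        // Remove stale temporary files older than 7 days.
        long cutoff = System.currentTimeMillis() - 7L * 24 * 60 * 60 * 1000;
        for (FileStatus status : fs.listStatus(new Path("/tmp/etl"))) {
            if (status.isFile() && status.getModificationTime() < cutoff) {
                fs.delete(status.getPath(), false);   // non-recursive delete
            }
        }
        fs.close();
    }
}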

Worked with Hadoop developers and designers in troubleshooting MapReduce job failures and issues.

Environment: Hadoop 1.2.1, MapReduce, HDFS, Pig, Hive, Sqoop, Cloudera Hadoop Distribution, HBase, Windows NT, LINUX, UNIX Shell Scripting.

Client: GM/OnStar, Detroit, MI Apr 2013- Nov 2014

Role: Hadoop/ SQL Developer

Description: OnStar Corporation is a subsidiary of General Motors that provides subscription-based communications, in-vehicle security, hands-free calling, turn-by-turn navigation and remote diagnostic systems throughout the United States.

Responsibilities:

Interacted with team leaders, business users and various teams during issue handling and to gather both Functional and Technical requirements.

Good understanding and related experience with Hive, Pig and Map/Reduce.

Deep understanding of schedulers, workload management, availability, scalability and distributed data platforms.

Installed and configured Hadoop MapReduce, HDFS, developed multiple MapReduce jobs in Java for data cleaning and pre-processing.

Involved in loading data from the UNIX file system to HDFS using the HDFS API and Flume.

Wrote MapReduce jobs to discover trends in data usage by users.

Involved in managing and reviewing Hadoop log files.

Involved in running Hadoop streaming jobs to process terabytes of text data.

Loaded and transformed large sets of structured, semi-structured and unstructured data.

Wrote Pig UDFs.

Automated all jobs, from pulling data from different data sources and pushing the result datasets to HDFS to running MapReduce and Pig jobs.

Developed queries in T-SQL to extract data from a variety of SQL environments and to add data elements to the data warehouse.
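
A hedged sketch of such an extract-and-load step driven from Java over JDBC (the connection details, tables and columns are hypothetical; it uses the standard java.sql API with Microsoft's SQL Server driver on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class WarehouseLoadStep {
    public static void main(String[] args) throws Exception {
        String sourceUrl = "jdbc:sqlserver://dbhost:1433;databaseName=Staging";
        String warehouseUrl = "jdbc:sqlserver://dwhost:1433;databaseName=Warehouse";
        try (Connection source = DriverManager.getConnection(sourceUrl, args[0], args[1]);
             Connection warehouse = DriverManager.getConnection(warehouseUrl, args[0], args[1])) {

            // Extract: only rows changed since the last successful load (watermark passed in args[2]).
            PreparedStatement extract = source.prepareStatement(
                    "SELECT AccountId, Balance, ModifiedDate FROM dbo.Accounts WHERE ModifiedDate >= ?");
            extract.setString(1, args[2]);

            // Load: append into the warehouse fact table.
            PreparedStatement load = warehouse.prepareStatement(
                    "INSERT INTO dbo.FactAccountBalance (AccountId, Balance, ModifiedDate) VALUES (?, ?, ?)");

            try (ResultSet rs = extract.executeQuery()) {
                while (rs.next()) {
                    load.setLong(1, rs.getLong("AccountId"));
                    load.setBigDecimal(2, rs.getBigDecimal("Balance"));
                    load.setTimestamp(3, rs.getTimestamp("ModifiedDate"));
                    load.addBatch();
                }
            }
            load.executeBatch();                       // single round trip for the inserts
        }
    }
}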

Identified, tested and resolved database performance issues to ensure data optimization.

Created documentation for Unit Testing, Integration Testing and Test Case Preparation; wrote unit test cases using MRUnit.

Environment: Hadoop 1.2.1, MapReduce, HDFS, Pig, Hive, Flume, Sqoop, Cloudera Hadoop Distribution, HBase, SQL Server, T-SQL.

Client: ZenQ, Hyderabad, India Nov 2011- Mar 2013

Role: SQL Developer

Description: ZenQ is a leading provider of pure-play software testing services to clients across the globe. The company offers high-quality, efficient solutions that help clients build quality products.

Responsibilities:

Responsible for managing scope, planning, tracking and change control aspects of the project.

Installation and configuration of SQL Server 2008 in production and testing environments.

Involved in database design in client server by analyzing business requirements.

Communicated activities and progress to the client team, business analysts and QA team through weekly feature meetings.

Responsible for leading, guiding, mentoring ORMB offshore and onshore implementation team.

Supported the implementation team in understanding the business requirements. Worked on change requests and thoroughly tested the changed functionality with impact analysis.

Implemented customized incremental release packages.

Documented Use Cases, Functional and Technical Design Documents.

Generated T-SQL scripts to create and maintain database objects.

Created and maintained database objects, including complex stored procedures, triggers, tables, views, SQL joins and other statements for various applications.

Wrote new stored procedures, modified existing ones, and tuned them for optimal performance.

Effectively used temporary tables in stored procedures, taking into account performance issues with the front-end application.

Client: Sum Total Systems, Hyderabad, India Apr 2009- Oct 2011

Role: Application Developer

Description: Sum Total Systems, Inc. is a software company that provides human resource management software and services to private and public-sector organizations. The company delivers solutions through multiple cloud-based channels, including Software as a Service (SaaS), hosted subscription and premises-based licensing.

Responsibilities:

Streamlined records management within the enrollment management program used by 9 campuses; automated the student enrollment process and decreased the time required to create 4,000 spreadsheet records.

Designed and built a SQL Server database that supported 1,000,000+ active and alumni student records; combined the database with Excel to automate spreadsheet development.

Reduced spreadsheet creation time to 2 hours with zero errors, decreasing man-hours, improving system scalability and enabling production of additional reports.

Architected Access database, connected system to server, and integrated network with system's GUI that facilitated teamwork among agents.

Created manuals on database functionality and provided training and support to staff and student workers.

Created online surveys for students.

Worked on university websites and trained staff on updating the websites and their personal profiles.


