
Data Sql

Location:
India
Posted:
May 04, 2017


Resume:

Srikar Nelakurthi

+1-778-***-****

Email: acz4rp@r.postjobfree.com

Over 10 years and 6 months of experience on Cisco and BT projects in the analysis, design, and development of software applications in client/server environments, using relational database concepts with Oracle and data warehousing concepts, following Agile methodology.

Strong experience in creating big data solutions using Apache Spark Core, Spark SQL, DataFrames, Spark Streaming, and Kafka.

Expertise in the Hadoop framework, architecture, engine, management, and storage, along with supporting ecosystem components such as Hive, Sqoop, and Oozie.

Strong Experience in RDBMS database design and logical/physical Data Modelling.

Proficient in HiveQL, with good experience in time-based partitioning, dynamic partitioning, and bucketing to optimize Hive queries.

Experience in implementing PySpark RDD transformations and actions.

Used Hive to create tables in both delimited text and binary storage formats.

Experience developing Hive UDFs in Python to apply custom logic.
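For illustration, a minimal sketch of the kind of Python script that can carry custom row-level logic for Hive through its TRANSFORM/streaming mechanism; the column layout and the normalization rule are hypothetical:

    #!/usr/bin/env python
    # Reads tab-separated rows from stdin and writes transformed rows to stdout.
    # In HiveQL it would be referenced roughly as:
    #   ADD FILE normalize_country.py;
    #   SELECT TRANSFORM(id, country) USING 'python normalize_country.py' AS (id, country_iso) FROM src;
    import sys

    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip malformed rows
        row_id, country = fields[0], fields[1]
        # hypothetical custom logic: trim and upper-case the country code
        print("\t".join([row_id, country.strip().upper()]))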

Good working experience using Sqoop to import data from RDBMS into HDFS and vice versa, including using Sqoop direct mode with external tables to perform very fast data loads.

Knowledge of MongoDB and NoSQL databases.

Expertise in Data Modelling including Star Schema and Snowflake Schema.

In-depth experience with and understanding of database architectural design, especially for large-scale OLAP/DW implementations.

Data warehousing experience using Informatica PowerCenter and Informatica Designer for ETL and OLAP.

Dimensional data modeling experience covering star schema modeling, data marts, OLAP, fact and dimension tables, and physical and logical data modeling.

Strong knowledge of relational database concepts, entity-relationship diagrams, and normalization and denormalization concepts. Expertise in using SQL*Loader and external tables for loading data from flat files into Oracle tables.

Experience in data migration and data conversion using Informatica, working with multiple sources such as flat files, XML, Oracle, and Teradata.

Extensively used exception handling to ensure the proper flow of data into database tables.

Experience in performance tuning and optimization of PL/SQL scripts and SQL queries using Explain Plan, TKPROF, and Auto-trace.

Extensively used REF cursors, bulk binds, bulk collects, dynamic SQL, FORALL statements, and correlated sub-queries.
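As a minimal sketch of that bulk-bind pattern (table, column, and connection details are hypothetical), an anonymous PL/SQL block submitted through cx_Oracle:

    import cx_Oracle

    # Hypothetical connection; credentials and DSN would come from the environment.
    conn = cx_Oracle.connect("etl_user", "etl_pwd", "dbhost/ORCLPDB1")
    cur = conn.cursor()

    # BULK COLLECT fetches the keys in one pass; FORALL applies the update in bulk,
    # minimizing context switches between the SQL and PL/SQL engines.
    cur.execute("""
        DECLARE
            TYPE t_ids IS TABLE OF orders_stg.order_id%TYPE;
            v_ids t_ids;
        BEGIN
            SELECT order_id BULK COLLECT INTO v_ids
              FROM orders_stg
             WHERE status = 'NEW';

            FORALL i IN 1 .. v_ids.COUNT
                UPDATE orders
                   SET processed_flag = 'Y'
                 WHERE order_id = v_ids(i);

            COMMIT;
        END;""")

    conn.close()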

Worked heavily with SQL hints and indexes (B*Tree, bitmap, unique, function-based, and clustered indexes) for performance improvements.

Extensively used Oracle built-in Packages such as UTL_FILE, DBMS_JOB, DBMS_SQL, DBMS_SCHEDULER and DBMS_UTILITY.

Developed and supported extraction, transformation, and load (ETL) processes using PL/SQL to populate tables in OLTP and OLAP data warehouse environments.

Good Knowledge of complete Software Development Life Cycle, Data Acquisition, Data Integration and Data Loading.

Basic knowledge of Spark ML: creating LabeledPoints from vectors, building DataFrames, splitting the data into training and test sets, training a model, and performing predictions with it.
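A minimal PySpark ML sketch of that flow, using the DataFrame-based API (the toy feature values and the model choice are illustrative assumptions):

    from pyspark.sql import SparkSession
    from pyspark.ml.linalg import Vectors
    from pyspark.ml.classification import LogisticRegression

    spark = SparkSession.builder.appName("ml-sketch").getOrCreate()

    # Toy labelled data: (label, features) rows assembled into a DataFrame.
    data = [(1.0, Vectors.dense([0.0, 1.1, 0.1])),
            (0.0, Vectors.dense([2.0, 1.0, -1.0])),
            (0.0, Vectors.dense([2.0, 1.3, 1.0])),
            (1.0, Vectors.dense([0.0, 1.2, -0.5]))]
    df = spark.createDataFrame(data, ["label", "features"])

    # Split into training and test sets, fit a model, and predict on the held-out set.
    train, test = df.randomSplit([0.7, 0.3], seed=42)
    model = LogisticRegression(maxIter=10).fit(train)
    model.transform(test).select("label", "prediction", "probability").show()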

Familiar with source control management tools such as Git, SVN, VSS, and PVCS.

Worked with Agile and QA tools such as Rally and HP Quality Center.

Excellent analytical and communication skills, with the ability to work independently.

EDUCATION

Master of Computer Applications, India.

TECHNICAL SKILLS

Big Data: Hadoop, Hive, MapReduce, Sqoop, Pig, PySpark Core, Spark SQL, Spark Streaming, Oozie, Kafka

Databases: Oracle 12c/11g/10g/9.x, MySQL, Teradata

Tools: SQL*Loader, TOAD, Teradata SQL Assistant, Informatica PowerCenter, $Universe, Kintana, APPDB, PVCS, Rally, Jupyter Notebook, Spyder IDE

Scripting languages: UNIX Shell Scripting

Operating Systems: Windows, UNIX

Programming skills: Python, PL/SQL

WORK EXPERIENCE

Cisco, CANADA & USA

Sr. Data Engineer Oct 2013 – Present

Responsibilities:

Worked closely with the functional team to gather and understand requirements and translated them into technical design documents.

Loaded data from different sources into HDFS and processed it according to business requirements.

Used Spark SQL DataFrames and Spark SQL functions to perform analysis and aggregations before loading the final data into Hive tables.
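A minimal sketch of that pattern, aggregating with pyspark.sql.functions before the final Hive load (the source path, schema, and table names are hypothetical):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder
             .appName("daily-aggregates")
             .enableHiveSupport()
             .getOrCreate())

    # Hypothetical source: raw order events already landed in HDFS as Parquet.
    orders = spark.read.parquet("/data/raw/orders")

    # Analysis/aggregation with Spark SQL functions.
    daily = (orders
             .groupBy("order_date", "region")
             .agg(F.sum("amount").alias("total_amount"),
                  F.countDistinct("customer_id").alias("customers")))

    # Load the final result into a Hive table.
    daily.write.mode("overwrite").saveAsTable("analytics.daily_order_summary")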

Understood the different business parameters/predictors that could influence performance (complexity, data sizes).

Loaded data into Spark Core RDDs, transformed and performed actions on the RDDs, and later converted them to DataFrames for further Spark SQL processing.
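A minimal sketch of that RDD-to-DataFrame flow (the file path, delimiter, and column names are assumptions):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("rdd-to-df").getOrCreate()
    sc = spark.sparkContext

    # Load raw text into an RDD, apply transformations, and run an action.
    raw = sc.textFile("/data/raw/events.txt")
    parsed = (raw.map(lambda line: line.split(","))
                 .filter(lambda cols: len(cols) == 3)
                 .map(lambda cols: (cols[0], cols[1], float(cols[2]))))
    print(parsed.count())  # action: materializes the pipeline

    # Convert to a DataFrame for further Spark SQL processing.
    df = parsed.toDF(["event_id", "event_type", "value"])
    df.createOrReplaceTempView("events")
    spark.sql("SELECT event_type, AVG(value) AS avg_value FROM events GROUP BY event_type").show()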

Performed transformations on RDDs to support data standardization and element-level computations.

Used broadcast variables and accumulators for lookup transformations in Spark, and persist/cache to avoid recomputation.
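A minimal sketch of broadcast lookups, an accumulator, and caching (the lookup map and sample rows are hypothetical):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("broadcast-demo").getOrCreate()
    sc = spark.sparkContext

    # Small lookup table shipped once to every executor as a broadcast variable.
    region_lookup = sc.broadcast({"01": "AMER", "02": "EMEA", "03": "APJC"})

    # Accumulator counting rows whose region code is missing from the lookup.
    missing = sc.accumulator(0)

    def enrich(row):
        code, amount = row
        region = region_lookup.value.get(code)
        if region is None:
            missing.add(1)
            region = "UNKNOWN"
        return (region, amount)

    sales = sc.parallelize([("01", 100.0), ("02", 250.0), ("09", 75.0)])
    enriched = sales.map(enrich).cache()  # cache to avoid recomputation across actions

    print(enriched.reduceByKey(lambda a, b: a + b).collect())
    print("rows without a region match:", missing.value)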

Used complex data types such as map, array, and struct in Hive, and StructType in Spark, based on business needs.

Experience implementing all kinds of joins in Hive and tuning Hive performance for specific scenarios.

Designed and created data cleansing, data conversion, and validation logic using HiveQL.

Extensively used Partitioning (File, Database) to handle large datasets.

Transferred the analyzed data from HDFS to relational databases using Sqoop.

Designed appropriate Hive databases and partitioned/bucketed tables to allow faster data retrieval during analysis.
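A minimal HiveQL sketch of that partitioned/bucketed design, submitted here through PyHive (the HiveServer2 endpoint, database, and column names are hypothetical):

    from pyhive import hive  # assumption: HiveServer2 is reachable via PyHive

    cursor = hive.connect(host="hiveserver2.example.com", port=10000, username="etl").cursor()

    # Time-based partitioning plus bucketing on the join key for faster retrieval.
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS sales.orders_part (
            order_id    BIGINT,
            customer_id BIGINT,
            amount      DOUBLE
        )
        PARTITIONED BY (order_date STRING)
        CLUSTERED BY (customer_id) INTO 32 BUCKETS
        STORED AS ORC
    """)

    # Dynamic-partition load from a staging table.
    cursor.execute("SET hive.exec.dynamic.partition.mode=nonstrict")
    cursor.execute("""
        INSERT OVERWRITE TABLE sales.orders_part PARTITION (order_date)
        SELECT order_id, customer_id, amount, order_date
        FROM sales.orders_stg
    """)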

Involved in processing the data in the Hive tables using Spark-SQL for high-performance, low-latency queries.

Used Hive to do data transformations, joins and some pre-aggregations before storing the data onto HDFS.

Wrote Hive queries for data analysis to meet business requirements.

Developed Python scripts to parse the raw data, populate staging tables, and store the refined data in partitioned tables.
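A minimal sketch of that parse-and-stage flow (the raw file layout, paths, and table names are hypothetical):

    import pandas as pd
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("stage-raw").enableHiveSupport().getOrCreate()

    # Parse a raw pipe-delimited extract with pandas and apply light cleanup.
    raw = pd.read_csv("/data/incoming/usage_20170501.dat", sep="|",
                      names=["user_id", "product", "usage_qty", "load_date"])
    raw = raw.dropna(subset=["user_id"])
    raw["product"] = raw["product"].str.strip().str.upper()

    # Populate the staging table, then store the refined data in a partitioned table.
    stage_df = spark.createDataFrame(raw)
    stage_df.write.mode("overwrite").saveAsTable("staging.usage_raw")
    (stage_df.write.mode("overwrite")
             .partitionBy("load_date")
             .saveAsTable("analytics.usage_daily"))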

Used various external Python modules such as pandas, NumPy, Matplotlib, Bokeh, and BeautifulSoup (bs4).

Coordinated with analyst and data science teams.

Created workflow and coordinator XMLs to schedule Hive data loads with Oozie.

Tested and debugged all HiveQL and objects to evaluate performance and validate the business use cases.

Environment: Hadoop, HDFS, PySpark Core, Spark Streaming, Spark SQL, Sqoop, Hive, Oozie, Kafka, Oracle, Teradata, Git, Jupyter Notebook, Spyder IDE

Cisco, USA

ETL/Data Engineer & PL/SQL Developer Feb 2010 – Sep 2013

Responsibilities:

Involved in business analysis and technical design sessions with business and IT teams to develop and understand the business requirements document (BRD) and ETL specifications.

Involved in creating data marts using data modeling concepts such as star schema modeling, snowflake schema modeling, and fact and dimension tables.

Developed various mappings for extracting the data from different source systems using Informatica.

Used Transformation Developer to create Filter, Lookup, Stored Procedure, Joiner, Update Strategy, Expression, and Aggregator transformations.

Developed reusable mappings using Mapplets, parameters and Variables.

Created sessions and batches and scheduled them to extract, transform, and load data into the warehouse database as per business requirements.

Worked on performance tuning of mappings, sessions, targets, and sources.

Worked closely with functional team to understand their Development related requirements and created TDD (Technical Design Document) and Functional specification documents (FSD) for the design provided in SDS (System Design Specification).

Used partitioning and indexing on tables to improve the performance.

Involved in the creation of tables, join conditions, correlated sub-queries, nested queries, and views for various applications.

Fine-tuned procedures for maximum efficiency in various schemas across databases using Oracle hints, Explain Plan, and trace sessions.

Used Exception Handling extensively for the ease of debugging and displaying the error messages in the application.

Used TKPROF and SQL Trace to analyze and fix time-consuming SQL statements in APIs.

Extensively worked on creating and executing formal test cases, test plans for functional, system and regression testing.

Used SQL analytic functions for various reporting needs.

Implemented PL/SQL scripts using REF cursors to load data dynamically and implemented incremental loading of the data.

Used complex SQL queries, inline views, materialized views, global temporary tables, and REF cursors for reporting purposes.

Used UNIX SHELL scripts for automation of daily and weekly batch jobs.

Used scheduler tools such as $U and DBMS_SCHEDULER for scheduling the batch jobs.

Developed complex SQL queries, inline views, and global temporary tables for reporting purposes.

Used SQL queries and analytic functions to develop reports for managers for business analysis.

Used $U for automating and scheduling the jobs.

Developed SQL scripts and Procedures to load data (ETL) into data marts and database tables.

Created SQL scripts and wrote procedures in PL/SQL to load data from flat files into new tables using both UTL_FILE and SQL*Loader.

Created and managed the execution of ETL jobs using the job scheduling tool Dollar Universe.

Participated in defining and implementing technical direction and architectural solutions.

Analyzed execution plans and collected data and statistics to identify performance and data issues.

Environment: Oracle, PL/SQL, Informatica, Windows, TOAD, APPDB, UNIX, SQL*Plus, Kintana, PVCS, and SVN.

British Telecom, UK

PL/SQL Developer / Data Analyst Oct 2006 – Jan 2010

Responsibilities:

Gathered business requirements from the client and translated the business details into technical design and specifications.

Contributed to all phases of the SDLC.

Contributed to discussions on the design and development of new requirements, expressing different views.

Contributed to project and quality plans.

Contributed to design, coding, testing, debugging, and documentation.

Developed complex packages and procedures in PL/SQL.

Supported front-end Java applications by writing new stored procedures.

Created new tables and packages in the schema to support the web application.

Identified, wrote, and executed test cases.

Tuned queries and debugged stored procedures.

Extensively used SQL*Loader for loading data into staging tables from various data sources.

Fine-tuned existing procedures using SQL Trace.

Used Bulk Collections for better performance and easy retrieval of data, by reducing context switching between SQL and PL/SQL engines.

Created PL/SQL scripts to extract data from the operational database into simple flat text files using the UTL_FILE package.
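A minimal sketch of that UTL_FILE extraction (the directory object, file name, and source table are hypothetical), again as an anonymous block run through cx_Oracle:

    import cx_Oracle

    conn = cx_Oracle.connect("etl_user", "etl_pwd", "dbhost/ORCLPDB1")  # hypothetical DSN
    cur = conn.cursor()

    # Writes one pipe-delimited line per row to a server-side Oracle directory object.
    cur.execute("""
        DECLARE
            v_file UTL_FILE.FILE_TYPE;
        BEGIN
            v_file := UTL_FILE.FOPEN('EXTRACT_DIR', 'customers.dat', 'w');
            FOR rec IN (SELECT customer_id, customer_name FROM customers) LOOP
                UTL_FILE.PUT_LINE(v_file, rec.customer_id || '|' || rec.customer_name);
            END LOOP;
            UTL_FILE.FCLOSE(v_file);
        END;""")

    conn.close()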

Created database objects such as tables, views, materialized views, procedures, and packages using Oracle tools such as TOAD, PL/SQL Developer, and SQL*Plus.

Created Views and materialized views.

Used bulk collect in PL/SQL objects to load multiple records at once.

Extensively used Pragma Autonomous Transaction in logging commits and transaction rollbacks.

Made extensive use of advanced features such as dynamic SQL, PL/SQL tables, records, and object types.

Used various Exception handling cases for proper flow of data into database tables.

Used dynamic SQL to clean the staging tables for various reporting purposes.

Used UNIX scripts for FTP and Automation of batch jobs.

Involved in performance tuning of SQL queries using TKPROF and Explain Plan.

Environment: Oracle 9i, TOAD, SQL Developer, UML, VSS
