Data Engineer

Location:

Vancouver, BC, Canada

Posted:

September 01, 2019


Resume:

Siva Rajesh Annam

+1-778-***-****

ac9796@r.postjobfree.com

Vancouver, CA

Passionate, focused aspiring data engineer with a strong academic background in Computer Science. Skilled in using Hadoop and Apache Spark to analyze complex data sets, identify patterns, and extract valuable insights. Seeking a challenging work environment where I can develop my skills and apply them to both personal and organizational growth.

KEY SKILLS

• Data Processing • Big Data Analytics • Apache Spark Framework • Hadoop Framework

• Leadership & Team Management • Quick Learner

TECHNICAL SKILLS

Languages: Java, Scala, UNIX Shell Scripting, SQL, PL/SQL, Python

Operating Systems: Windows, Linux

Databases: Oracle, MySQL

Big Data Ecosystem: Hadoop, MapReduce, Sqoop, Flume, Apache Spark, Pig, Hive, HBase (NoSQL), Oozie, ZooKeeper

Spark Framework: Spark RDDs, Spark SQL, Spark Streaming, Spark MLlib, Apache Kafka & Architecture

AWS Services: EC2, S3, EMR, Athena

TRAINING & CERTIFICATIONS

Big Data Hadoop Certification Training, Edureka

Apache Spark and Scala Certification Training, Edureka

PROFESSIONAL EXPERIENCE

TATA CONSULTANCY SERVICES

ASSISTANT SYSTEM ENGINEER

Hyderabad, IN Dec '16 - May '17

TCS, an Indian multinational information technology services and consulting company, operates in 46 countries and serves customers worldwide.

Key Skills & Responsibilities:

Worked on a telecommunications project for Vodafone Germany

Monitored applications using IBM Netcool

Working knowledge of Oracle databases and SQL queries

Developed UNIX shell scripts in accordance with client requirements

Wrote SQL queries and Java programs based on client requirements

Performed regular health checks on servers and hosted applications

EDUCATION

NEW YORK INSTITUTE OF TECHNOLOGY

Master's - Information Networking and Computer Security

Vancouver, CA May '17 – Dec '18

GPA: 3.79 / 4

VELAGAPUDI RAMAKRISHNA SIDDHARTHA ENGINEERING COLLEGE

B. Tech - Computer Science

Vijayawada, IN Aug '13 - Jul '16

CGPA: 8.99 / 10

A.A.N.M & V.V.R.S.R POLYTECHNIC COLLEGE

Diploma - Computer Science

Vijayawada, IN Jul '10 - Apr '13

PERCENTAGE: 91.05 / 100

PROJECTS

PROJECT 1: Analysis of Airlines Data sets (Part of Certification Training)

Brief: Analyze a large volume of airline data sets to find insights

Environment: HDFS (for storage), Pig (for Cleaning), Hive (for Analysis), Sqoop (for exporting to RDBMS)

Addressed the challenges of storing and processing a large volume of data using the Hadoop framework.

Transferred data into HDFS and used Pig to clean the data set, converting unstructured data into structured data.

Analyzed the filtered data set with Hive and MapReduce to produce insights such as nationwide airport listings, airlines sharing airports, airlines with zero stops, and the most active airline.

Exported the output to an RDBMS via Sqoop.

PROJECT 2: Signal Drops During Roaming (Part of Certification Training)

Brief: Identify the top 10 customers facing frequent roaming call drops, based on call records, to improve connectivity

Environment: HDFS (for storage), Spark SQL (for transformation)

Used the Spark framework to perform data analytics on the available data sets and produce insights to optimize processes.

Transferred data into HDFS and performed initial processing with Spark RDDs.

Used Spark SQL to find insights in the available data set.

PROJECT 3: Analysis of US Election Data Set (Part of Certification Training)

Brief: Analyze the factors that led to the eventual outcome, based on demographic features, to plan subsequent campaigns

Environment: HDFS (for storage), Spark SQL (for transformation), Spark MLlib (for ML), Zeppelin (for visualization)

Addressed the challenges of storing and processing structured and semi-structured data using the Hadoop framework and Apache Spark.

Transferred data into HDFS and performed initial processing with Spark RDDs.

Transformed the processed data with Spark SQL.

Clustered the data using Spark MLlib (K-Means).

Identified insights using the Zeppelin and Tableau visualization tools.
