Data Software Engineer

Location:

Bentonville, AR

Posted:

February 19, 2020

Resume:

POOJA BADGUJAR

Phone: 551-***-**** Email: ***************@*****.***

https://www.linkedin.com/in/pooja-badgujar-1b58a1166/

Professional Summary:

Over 6 years of professional experience as a Hadoop Developer using the Apache Spark framework and as an Oracle Database Administrator.

Worked on Python applications that create DataFrames, save them as tables, and are submitted as Spark jobs.

Worked on Sqoop imports and exports for SQL Server, Oracle, DB2, and Teradata, configured through shell scripts to run in parallel.

Loaded large sets of structured and semi-structured data from Swift Object Store to HDFS on the edge node with the DistCp command and created staging tables.

Worked on Power BI to create reporting graphs and cards for analysis across different years and months based on KPI requirements.

Worked on the TDCH connector to export tables to Teradata and configured the job in shell scripts.

Experience creating HQL scripts and Python scripts submitted as PySpark jobs with optimized resources.

Created shell scripts to perform file transfers between a Windows server and a file server.

Extensively worked on error handling in shell scripts and in Python, depending on downstream job requirements.

Extensively worked on shell scripting to configure Spark jobs and Sqoop jobs from databases.

Experience handling huge datasets in Spark jobs by partitioning them into smaller subsets of jobs submitted to run in parallel.

Hands-on experience with the fundamental building blocks of Spark: RDDs and related manipulations, such as transformations, actions, and functions performed on RDDs, for implementing business logic.

In-depth understanding of DataFrames and Datasets in Spark SQL.

Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.

Good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.

Expert in writing business analytics scripts using Hive SQL.

Worked with IDEs such as Eclipse and IntelliJ for developing, deploying, and debugging applications.

Good knowledge of data warehousing, ETL development, distributed computing, and large-scale data processing.

Experienced in working with different file formats such as Text, Sequence, XML, and JSON.

Expertise in working with relational databases such as Oracle 10g and SQL Server 2012.

Collaborated with the infrastructure, network, database, and application teams to ensure data quality and availability.

Strong knowledge of the Software Development Life Cycle and expertise in detailed design documentation.

Excellent communication skills and the ability to perform at a high level and meet deadlines.

Technical Skills:

Big Data: HDFS, Apache Spark, Spark SQL, Spark Streaming, ZooKeeper, Hive, Sqoop, HBase, Kafka, Flume, YARN, Power BI.

Languages: Java, Scala, SQL/PLSQL, Shell Scripting, Python.

Database: MySQL, MongoDB, Cassandra, Oracle 10g/11g, Microsoft SQL Server 2014

IDE / Testing Tools: Eclipse, IntelliJ IDEA

Operating System: Windows, UNIX, Linux, macOS

Tools: SQL Developer, Maven, Hue, TOAD, Snowflake

Professional Experience:

Project 1

Client: Walmart Home Office (Mar 2019 – Present)

Project: International Pricing

Position: Software Engineer

Responsibilities:

Involved in gathering business requirements for the JDA application.