Thirupathaiah Pujari
adjjo7@r.postjobfree.com
PROFESSIONAL SUMMARY:
Energetic, self-motivated professional with 8+ years of IT experience developing Big Data and web applications.
6+ years of experience with the Apache Hadoop ecosystem, including HDFS, YARN, MapReduce, Tez, Hive, AWS S3, DynamoDB, Kinesis, Sqoop, Presto, ZooKeeper, Kafka, NiFi, Gobblin, Spark, and Scala; hands-on experience with Kubernetes and Docker images.
Good working experience with MinIO (open-source object storage).
Excellent understanding of the Hadoop architecture and underlying framework, including storage management and components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, the job scheduler, and MapReduce programming.
Managed data from different sources (AWS S3, DynamoDB, Kinesis, NiFi, Kafka, SQL, and Teradata); involved in HDFS maintenance and in loading structured and unstructured data using the Gobblin and Difa frameworks.
Capable of developing applications on AWS machine learning and artificial intelligence services.
Proficient in the DevOps model followed in the current project.
Coordinate with offshore and cross-functional teams to ensure that applications are properly tested, configured, and deployed.
Extensive knowledge of the telecom domain: provisioning, activation, billing, and order management.
Experienced with Spark DataFrames and with optimizing jobs to meet SLAs.
Experience importing and exporting data between relational database systems and HDFS using Sqoop.
Hands-on experience in Linux shell scripting; worked with the Cloudera Big Data distribution.
Exposure to developing, scheduling, and maintaining UC4 jobs and to MCC integration of all workflows.
Extensive experience in data analysis using Hive, Hive's analytical functions, and Presto.
Outstanding knowledge of building, deploying, and maintaining applications.
Creating reports using various data sources such as Excel, Oracle, SQL Server, and Hive (Hadoop).
Knowledge of database architecture for OLTP and OLAP applications, data analysis, and ETL processes.
Skilled in systems analysis, ER and dimensional modeling, database design, and implementing RDBMS-specific features.
Strong experience in data analysis, migration, cleansing, transformation, integration, import, and export using multiple ETL tools.
Experience with IDEs (Eclipse, IntelliJ), repositories (SVN, GitHub), and build tools (SBT, Maven).
Experience working in highly dynamic teams using methodologies such as Scrum, Kanban, and Waterfall, with knowledge of tools like Rally and Agility.
Managed data coming from MongoDB; involved in HDFS maintenance and in loading structured and unstructured data into HDFS using Spark.
EDUCATION:
Master of Computer Applications, National Institute of Technology, Durgapur, West Bengal, 2012.
B.Sc. Computer Science, Sri Krishnadevaraya University, Anantapur, Andhra Pradesh, 2008.
TECHNICAL SKILLS:
Big Data Technologies
HDFS, MapReduce, Tez, Hive, Sqoop, Spark Streaming, NiFi, Kafka, Gobblin, AWS S3, ZooKeeper, Kinesis, Presto, MinIO, Kubernetes, and Docker
Relational Databases
Oracle, SQL Server, MS Access, Teradata, Netezza, and Snowflake
Languages
Scala, Java
NoSQL Databases
DynamoDB and MongoDB
Mark-up/Script languages
HTML, XML, JavaScript, VBA, Shell Scripting
Services
Web Services, SOAP
Version Tools
GitHub (Git Bash), SVN
Tools
MS Office, Visual Studio, UC4, Eclipse, IntelliJ, DBeaver, MobaXterm
PROFESSIONAL EXPERIENCE:
Tata Consultancy Services, Englewood, CO May '16 - present
Client: Comcast Cable Corporation
Senior Big Data Developer
Project: Meld Migration
Description: Analyze large volumes of critical transactional data from diversified sources, migrate the data into Hadoop, and build a robust analytics solution for business reporting.
Responsibilities:
Migrated ETL processes into Hadoop using Spark and Scala.
Optimized existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, and RDDs.
Migrated workloads from Cloudera to MinIO-based storage.
Good understanding of the DAG for the entire Spark application flow via the Spark application Web UI.
Used HiveQL to migrate smaller legacy processes into Meld.
Worked on delta data and developed ingestions using Databricks.
Configured jobs using UC4 and developed workflows for daily runs in the cluster.
Created Kafka topics and consumed them as needed from various sources.
Ingested streaming data with Spark from NiFi and Kafka.
Worked in a DevOps model, involved in development, QA, and deployment.
Built the required data lake in Hadoop using Sqoop from various data sources.
Developed custom input formats and data types to parse and process unstructured and semi-structured input data, mapping records into key-value pairs to implement business logic in MapReduce.
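The key-value mapping described above can be sketched in plain Python; the pipe-delimited record layout and field names here are hypothetical illustrations, not the actual project format:

```python
from collections import defaultdict

def mapper(line):
    """Parse a semi-structured record into (key, value) pairs.
    Assumed (hypothetical) layout: 'account_id|event|amount'."""
    parts = line.strip().split("|")
    if len(parts) != 3:
        return []  # skip malformed records instead of failing the job
    account_id, event, amount = parts
    return [((account_id, event), float(amount))]

def reducer(pairs):
    """Aggregate values per key, as the reduce phase would."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

lines = ["a1|order|10.5", "a1|order|4.5", "a2|refund|3.0", "bad-record"]
pairs = [kv for line in lines for kv in mapper(line)]
print(reducer(pairs))  # {('a1', 'order'): 15.0, ('a2', 'refund'): 3.0}
```

The real implementation used Hadoop custom InputFormat classes; this only shows the parse-then-aggregate shape of the logic.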
Analyzed SQL scripts and designed solutions to implement them using Spark.
Created data models to onboard new data sources supporting various processes.
Maintained code in GitHub; well versed in GitHub and Git Bash.
Bundled and compiled code into JARs using SBT.
Experienced in analyzing data with Hive and Pig.
Experienced in querying data using Spark SQL on top of the Spark engine.
Unit tested data and improved performance before turning code over to production.
Downloaded data from AWS S3 to HDFS using Spark and Scala.
Implemented a Vault process to store secrets such as passwords.
Used AWS Athena to query tables over S3.
Good knowledge of headwaters3 (Kinesis).
Environment: Hadoop, MapReduce, HDFS, Hive, Sqoop, Kubernetes, Docker, MinIO, Kerberos, Scala, Spark SQL, Spark Streaming, Kafka, Shell Scripting, MySQL, Oracle 11g, SQL*Plus, IntelliJ, Databricks, GitHub, SBT, Rally, Python, Netezza, Snowflake.
Tata Consultancy Services, Chennai, India
Client: Comcast Cable Corporation
Role: Big Data Developer Jan '14 - Apr '16
Project: Data Insights
Description: Worked on various data issues and business problems, including Consent Decree, Theft of Service, Digital First (order management), billing, and back-office systems.
Responsibilities:
Created Spark and Scala code to bring transformed data in from various data sources.
Gathered requirements from the client and estimated timelines for developing complex Hive queries for the logistics application.
Created a schematic layer in Hadoop using Hive and Spark to provide solutions for business needs.
Improved the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
Created transformations in PySpark using Python.
Excellent understanding and knowledge of the NoSQL databases HBase and Cassandra.
Fetched bulk data from Cassandra into Hive using Presto.
Developed Type 1 and Type 2 (slowly changing dimension) tables using Hive.
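The Type 2 logic behind those tables can be sketched in plain Python (the real work was done in Hive; the row shape and column names here are illustrative assumptions):

```python
from datetime import date

def scd_type2_merge(current_rows, incoming, load_date):
    """Slowly-changing-dimension Type 2 merge: close out rows whose
    attributes changed and append a new active version for them.
    Illustrative row shape: {"id", "attrs", "start", "end"};
    end is None while the version is active."""
    incoming_by_id = {r["id"]: r for r in incoming}
    merged = []
    for row in current_rows:
        inc = incoming_by_id.get(row["id"])
        if row["end"] is None and inc and inc["attrs"] != row["attrs"]:
            merged.append({**row, "end": load_date})  # expire the old version
        else:
            merged.append(row)
    active_ids = {r["id"] for r in merged if r["end"] is None}
    for rid, inc in incoming_by_id.items():
        if rid not in active_ids:
            # new key, or a key whose active version was just expired
            merged.append({"id": rid, "attrs": inc["attrs"],
                           "start": load_date, "end": None})
    return merged
```

A Type 1 table, by contrast, would simply overwrite the attributes in place and keep no history.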
Built data lakes in Hadoop using Sqoop and Hive.
Captured real-time data using Spark Streaming and processed it in Hadoop.
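The windowed processing at the core of that streaming work can be sketched as a tumbling-window count in plain Python; the event shape and the 10-second window are illustrative, and the actual pipeline used Spark Streaming:

```python
from collections import Counter

def tumbling_window_counts(events, window_seconds):
    """Bucket timestamped events into fixed, non-overlapping windows
    and count events per window, as a streaming micro-batch would."""
    counts = Counter()
    for ts in events:
        window_start = (ts // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(counts)

print(tumbling_window_counts([1, 2, 11, 12, 13, 25], window_seconds=10))
# {0: 2, 10: 3, 20: 1}
```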
Dynamically refreshed the data behind Tableau reports to show trends for every scenario built as part of the business requirements.
Exposure to the Spark architecture and how RDDs work internally, processing data from local files, HDFS, and RDBMS sources by creating RDDs and optimizing for performance.
Wrote Sqoop scripts to import and export data between RDBMS and HDFS (including HDFS to Microsoft SQL Server) and handled incremental loading of customer and transaction data dynamically.
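The incremental-load pattern used here, keeping a high-water mark and pulling only newer rows (this is what Sqoop's `--incremental` mode automates), can be sketched as follows; the column name and sample values are hypothetical:

```python
def incremental_extract(rows, last_value, check_column="updated_at"):
    """Return only rows newer than the stored high-water mark,
    plus the new mark to persist for the next run."""
    new_rows = [r for r in rows if r[check_column] > last_value]
    new_mark = max((r[check_column] for r in new_rows), default=last_value)
    return new_rows, new_mark

source = [
    {"id": 1, "updated_at": 10},
    {"id": 2, "updated_at": 25},
    {"id": 3, "updated_at": 40},
]
batch, mark = incremental_extract(source, last_value=20)
print(len(batch), mark)  # 2 40
```

On the next run, `mark` (here 40) is passed back in as `last_value`, so already-loaded rows are skipped.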
Converted MapReduce programs into Spark transformations using Spark RDDs in Scala.
Experience creating reports with BI tools such as Tableau and Power BI.
Evaluated the use of Oozie for workflow orchestration.
Embedded the required Tableau reports into websites using the JavaScript API.
Acted as the billing-data SME for the Consent Decree program to support the reconciliation process.
Worked on the Consent Decree reconciliation process to reduce the fallout percentage of failed consents.
Good understanding of order data from billing and order management.
Created Tableau reports for the different modules and scenarios to support business needs and produced burn-down charts for the reconciliation process.
Automated the daily comparison reports using VBA to generate the reports in Tableau.
Supported the business, development, and QA teams in all phases of the project life cycle.
Developed analytics in R to identify patterns and find solutions for failed orders.
Environment: Hadoop, MapReduce, HDFS, Hive, Sqoop, Kerberos, Scala, Spark SQL, Spark Streaming, Kafka, NiFi, ETL, Oracle, SQL Server, Teradata, Word, Visual Studio, Excel, Rally, IntelliJ, GitHub, SBT.
Tata Consultancy Services, Hyderabad, India Jul '12 - Dec '13
Client: Century link
Role: Web Developer
Project: CPLUS and IOE
Description: Consulting Plus (CPLUS) is a web-based service order entry system used by Qwest customer service representatives to place orders with Qwest for services such as POTS, Internet, DirecTV, and VoIP.
It currently caters to the small business group and to consumers. Integrated Order Entry (IOE) is a web-based estimation and service order entry system that facilitates ordering of the different types of services a customer orders from CenturyLink.
The IOE application provides the functionality of BPT and CPLUS, and also lets the customer see a quote before ordering services.
Responsibilities:
Gathered requirements from the client and delivered implementations following the Waterfall process; handled all business logic and front-end design based on the blueprint.
Actively involved in requirement gathering and planning for each quarter.
Involved in UI development based on the blueprints provided by the client.
The application is based entirely on a content management system; involved in configuring components in Sitecore and coding the business layer.
Performed regression testing after every development phase before delivering to the Quality Assurance team.
Received client appreciation for on-time delivery and for the quality of test-case automation using Selenium.
Performed database testing using SQL Developer.
Experience in web service testing using SoapUI.
Working knowledge of the Waterfall development model.
Developed test automation scripts using Selenium WebDriver.
Integrated test cases with the continuous integration tool Jenkins.
Developed and maintained regression/sanity test suites in the HP ALM and CA Agile Central test management tools.
Experience building automation frameworks using Selenium.
Environment: Visual Studio, Excel, Rally, Shell Scripting, Selenium Cucumber framework, JavaScript, CSS, HTML, Java, Apache Maven, JUnit, and Eclipse.
Awards & Recognition:
On-the-Spot award in Data Insights (Victory, Data Quality) for back-office discrepancy corrections.
Awarded for the best innovation idea and the successful delivery of the Financial Management Tool.
Awarded KUDOS for the best performance on the Trouble Ticket Management Tool.
Awarded KUDOS for the best performance while migrating from Cloudera to MinIO.