
Data Manager

Atlanta, Georgia, United States
October 04, 2016



Prasad Ravinutala



Over 8 years of experience with an emphasis on Big Data and Hadoop technologies, including the design and development of Java- and Scala-based enterprise applications.

Experience working with AWS and the Cloudera, Hortonworks, and Pivotal distributions of Hadoop.

Expertise in HDFS, MapReduce, Spark, Scala, Hive, Chef, Jenkins, Impala, Pig, Sqoop, HBase, Oozie, Flume, Kafka, Storm, and various other ecosystem components.

Expertise in the Spark framework for batch and real-time data processing.

Experience working with BI teams to translate big data requirements into Hadoop-centric technologies.

Worked on credit card decisioning and marketing events across different projects.

Prior experience in analyzing and resolving data quality and integration issues.

Experience in providing and maintaining technical documentation, specifically Data Mapping and Low Level / ETL (or ELT) Design Documentation

Knowledge of Type 2 Dimension Data model and data warehouse ETL techniques for historical data

Working experience in designing and implementing complete end-to-end Hadoop infrastructure, including Pig, Hive, Sqoop, Oozie, Flume, and ZooKeeper.

Converted MapReduce applications to Spark and handled messaging services using Apache Kafka.

Used Flume to load log data from multiple sources directly into HDFS.

Worked on data migration from existing data stores and mainframe NDM (Network Data Mover) to Hadoop.

Good knowledge of NoSQL databases: Cassandra, MongoDB, and HBase.

Experience in handling multiple relational databases: MySQL, SQL Server, PostgreSQL, and Oracle.

Supported data analysis projects using Elastic MapReduce on Amazon Web Services.

Worked on MapReduce jobs for better scalability and performance.


Hadoop Ecosystem Development: HDFS, MapReduce, Spark, Hive, Pig, Flume, Oozie, ZooKeeper, HBase, Cassandra, Kafka, HCatalog, Storm, Sqoop.

Operating Systems: Linux, Windows XP, Server 2003, Server 2008.

Databases: MySQL, Oracle, MS SQL Server, PostgreSQL, MS Access.

Languages: C, Java, Python, SQL, Pig, UNIX shell scripting.


Bachelor of Engineering, Electronics and Instrumentation, BITS Pilani, India, 2008.


Capital One, McLean, Virginia Feb 2015 – Sep 2016

Big Data Developer


Imported data from Teradata systems to AWS S3 using DataTransfer and Spark with Scala on distributed systems.

Experience with Hadoop data ingestion using the ETL tools Talend and DataStage, and with Hadoop transformations (including MapReduce and Scala).

Worked on the in-house Quantum application, where we ran our workflows as Spark applications.

Experienced with the Linux operating system and shell scripting.

Supported data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud, including exporting data to and importing data from S3.

Helped set up the QA and Production environments and updated configurations for deploying scripts written in Spark with Scala.

Worked with RunCukes Cucumber test reports; good understanding of the Gherkin language for running ATDD (acceptance test-driven development).

Used Apache Spark for real-time and batch processing.

Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.

Integrated Apache Storm with Kafka to perform web analytics, and uploaded clickstream data from Kafka to HDFS, HBase, and Hive through the Storm integration.

Used Kibana and Elasticsearch to manage log messages handled by multiple systems.

Implemented Chef servers for scheduling the cron jobs for the Spark applications.

Used the Digital Jenkins server to build the Scala projects for the Spark applications, with Nexus repositories storing all build artifacts.

Converted Hive/SQL queries into Spark transformations using Spark RDDs and Scala; good experience with the Spark shell and Spark Streaming.

Implemented MongoDB and set up Mongo components to write data to MongoDB and S3 simultaneously and to read data from MongoDB.

Set up MongoDB in different environments, including Dev, QA, and Prod.

Used GitHub for version control.

Developed scalable modular software packages for various APIs and applications.

Implemented procedures for measurement and optimization of performance of new and current systems.

Developed Spark code and Spark SQL/Streaming jobs for faster testing and processing of data.

Experience in deploying data from various sources into HDFS and building reports using Tableau.

Developed a data pipeline using Kafka and Storm to store data in HDFS.

Performed real-time analysis on the incoming data.

Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.

Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.

Environment: AWS S3, EC2, Spark, Scala, Pig, Hive, Kafka, MongoDB, Sqoop, Apache Camel, Oozie, HCatalog, Chef, Jenkins, Artifactory, Avro, IBM Data Studio

Accenture, New Jersey, NJ Oct 2013 – Feb 2015

Big Data/Hadoop Developer


Worked on a live 65-node Hadoop cluster running CDH4.4.

Performed Flume and Sqoop imports of data from the data warehouse platform to HDFS and built Hive tables on top of the datasets.

Participated in building a CDH4 test cluster for implementing Kerberos authentication; installed Cloudera Manager and Hue.

Continuously monitored and managed the Hadoop cluster through Cloudera Manager.

Implemented data classification algorithms using MapReduce design patterns.

Created tasks for incremental loads into staging tables and scheduled them to run.

Expertise in the design and deployment of Hadoop clusters and different Big Data analytics tools, including Pig, Hive, HBase, Oozie, Sqoop, Flume, Spark, and Impala, with the Cloudera distribution.

Used Pig for data transformations, event joins, filters, and some pre-aggregations before storing the data in HDFS. Supported MapReduce programs running on the cluster.

Worked on debugging and performance tuning of Hive and Pig jobs.

Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.

Performed a minor upgrade of the cluster from CDH3u5 to CDH4.3.0.

Installed and configured Cloudera Manager on an existing cluster.

Worked extensively with MapReduce in Java; well versed in features such as multiple outputs in MapReduce.

Participated with developers in events such as meetups, conferences, and technology meetings.

Prepared sample code and applications for displaying various outcomes of API applications.

Supported technical teams in community discussions to educate members on the functionality of API applications.

Mentored the analyst and test teams in writing Hive queries.

Assisted in developer engagement for Phone API applications and cloud services through a social networking portal.

Worked on features such as reading a Hive table from MapReduce and making it available to all data nodes by keeping it in the distributed cache. Used both Hue and XML for Oozie.

Gained good experience with NoSQL databases and Python.

Extracted data from Teradata into HDFS using Sqoop.

Environment: Hadoop, CDH4, Hue, MapReduce, Hive, Pig, Sqoop, Oozie, Impala, NoSQL, Core Java/J2EE, JSON, Netezza, Maven, SVN, and Eclipse.

McLane, Temple, TX May 2012 – Oct 2013

Hadoop Developer


Solved performance issues in Hive and Pig scripts through an understanding of joins, grouping, and aggregation and how they translate to MapReduce jobs.

Tuned Hive and Pig scripts to improve performance.

Developed Java UDFs and Pig and Hive queries.

Extracted the data from SQL into HDFS using Sqoop.

Created a Sqoop job with incremental load to populate Hive external tables.

Developed an Oozie workflow for scheduling and orchestrating the ETL process. Responsible for building scalable distributed data solutions using Hadoop.

Implemented a nine-node CDH3 Hadoop cluster on Red Hat Linux.

Loaded data from the Linux file system into HDFS.

Developed high-quality content for blog posts and screencasts to support enhancements in applications.

Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.

Provided technical assistance during community discussions to educate developers about API applications.

Provided cluster coordination services through ZooKeeper.

Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Shell Scripting, Ubuntu, Python, Linux Red Hat.

Google, India Jan 2010 – Mar 2012

Java/J2EE Developer


Developed web components using JSP, Servlets and JDBC

Designed tables and indexes

Designed, implemented, tested, and deployed Enterprise JavaBeans (both session and entity beans) using WebLogic as the application server.

Environment: Windows NT 2000/2003, XP, and Windows 7/8; C, Java, UNIX, and SQL using TOAD; Finacle Core Banking, CRM 10209, Microsoft Office Suite, Microsoft Project

Mufindi Limited, India Aug 2008 – Jan 2010

Java Developer


Involved in requirements gathering, requirements analysis, design, development, integration, and deployment.

Involved in the Tax module and the Order Placement / Order Processing module.

Responsible for the design and development of the application framework

Environment: Core Java, J2EE 1.3, JSP 1.2, Servlets 2.3, EJB 2.0, Struts 1.1, JNDI 1.2, JDBC 2.1, Oracle 8i, UML, DAO, JMS, XML, WebLogic 7.0, MVC Design Pattern, Eclipse 2.1, Log4j and JUnit.
