
Data Aws

Lone Tree, CO
April 06, 2021



Srinivasa Rao Velaga


Philadelphia, PA


To pursue a career in Big Data development that will enable me to use my focused educational background, strong organizational skills, and interpersonal communication skills.

SUMMARY:

• 4 years of experience in Hadoop, developing end-to-end applications using most of the Hadoop ecosystem tools and migrating applications from RDBMS to Hadoop and Spark.

• Experience in designing and architecting scalable systems and providing best practices for big data applications across various business needs.

• Knowledge of and hands-on experience with Hadoop ecosystem concepts and tools such as MapReduce, Hive, Impala, Sqoop, PySpark, HBase, HDFS, Kafka, ZooKeeper, and Oozie.

• Experience with and knowledge of AWS S3, Redshift, DynamoDB, EMR, EC2, Glue, Kinesis, and Lambda.

• Experience in developing a functional programming framework to deploy data pipelines using Hive, Impala, Sqoop, Python, GoLang and shell scripting.

• Imported and exported data between HDFS and relational database systems using Sqoop, and made the jobs generic enough for most use cases.

• Deployed streaming solutions using Kafka, Spark Streaming and HBase.

• Developed, supported and maintained PySpark applications, with hands-on experience in Python 3.

• Experience with healthcare claims data (Mx, Hx, Rx), including handling PHI data and its encryption and anonymization.

• Knowledge of DW platforms and databases like Oracle and MySQL.

• Good understanding of different architectures and their components, such as the MapReduce programming paradigm and microservices architecture.

• Strong Linux shell scripting skills, developing generic scripts and functional programming using shell scripting.

• Good working experience with current version control platforms like GitHub and GitLab.

• Extensive knowledge of and experience in supporting and managing Cloudera distributions (CDH 3.x/4.x/5.x/6.x).

• Good experience with Tableau Desktop and Tableau Server across Tableau 8.x and 9.x.

• Deep understanding of schedulers like Control-M, AutoSys and cron jobs, as well as workload management, availability, scalability and distributed data platforms.

• Resourceful, self-starter, self-motivated with aptitude to self-train and adapt to new market trends, requirements and ideas.


Masters in Software Engineering, GPA: 3.65/4

San Jose State University, CA, May 2018

Relevant Coursework: Intro to Hadoop and MapReduce, Big Data and Business Intelligence, Cloud Computing, Virtualization, Enterprise Software Platforms, Mobile Application Development, Databases and Data Structures.


Technical Skills: Python, Golang, Shell Scripting, Hadoop, Spark, HiveQL, Impala, Sqoop, Java, MySQL, Oracle, MongoDB, DynamoDB, AWS, IBM Bluemix, Google Cloud Platform, Informatica PowerCenter, Syncsort DMXh, Putty, Tableau, R, JIRA, Selenium, Test Rail, JSON, XML, C, C++, GitHub and GitLab

Certification: AWS Certified Developer - Associate level, authorized by Amazon Web Services (April 2017 - 2020)

EXPERIENCE (3+ years)

Big Data Developer, Symphony Health Solutions, Blue Bell, PA Jun 2017 - Present

Developing "SHeaP (Symphony Health Pipeline)", an in-house Python application to run the jobs that previously ran on Informatica.


• Develop a new healthcare data pipeline software application.

• Create a variety of data products according to the client's requirements

• Onboard the data to Hadoop from several vendors, process the data and create Hive tables in Hadoop.

• Analyse the jobs that run on Informatica and convert them to Python scripts, which increases throughput and decreases costs.

• Process tables in Hive and Impala for improved performance.

• Create shell scripts to process the raw data and load data to AWS S3 and Redshift databases.

• Create a test suite in Impala to test the processed files at the different stages of Stage-ODS-EDW.

• Sqoop data from relational databases like Oracle to HDFS/Hive and vice versa.

• Productionize scripts so that they can be used across new environments.

• Create test suites for testing daily ingested data in target tables using Impala.

• Act as developer advocate


• The data delivered is ultimately used to create better health outcomes of the patients

• With the new application framework, the development latency was reduced considerably.

Technologies: Python, Hadoop, Hive, Impala, CDH 5.5/5.7, Ksh, DMX, Informatica, Microsoft Visio and Excel


Earthquake Prediction Engine, Final Masters Project

Detecting earthquakes and predicting their aftershocks using California data. Alerts are delivered to registered users via app notifications and SMS. The webpage contains an interactive map of live events that can be toggled to show past events.

• Built an environment for earthquake analysis and aftershock prediction, with predictions that are 60% accurate

• Wrote a bash script to filter and merge the three-dimensional raw data and output it as a SAC-format file that can be used for plotting two-dimensional graphs

• Removed noise from graphs using the ObsPy library in Python

• Formulated a mathematical model to calculate the epicentre of an earthquake

• Tested different algorithms and features for training, resulting in increased prediction accuracy

• The front-end and alerting module were developed in AngularJS.

Technologies: ObsPy library for Python, SAC, Scikit-Learn, AWS EC2, Tableau, Git and AngularJS

Philadelphia Crime Analysis, Business Intelligence Project

Providing interactive dashboards and insights on the Philadelphia 911 calls dataset for the years 2007 - 2017.

• Created dashboards to visualize crimes from 911 calls and identify patterns in the data

• Integrated a predictive analytics system using IBM SPSS and R to produce forecasts from the data

• Angular 2.0 was used to build the front-end

Technologies: Tableau 10.3, IBM SPSS integrated with RStudio packages like Sparklyr, dplyr and ggplot2, dataset from Kaggle, NodeJS, HTML and CSS

RailTrac, Cloud Computing Project

Tracking the number of available seats and the location of a transit vehicle in real time. Users can access data and add sensors through a subscription model, and can view the real-time analytics in a dashboard. The target users for this service are transit authorities, who can add and drop sensors as required through a simple pay-as-you-go billing module.

• Designed and developed an Infrastructure as a Service offering to track empty seats and the location of trains; it was ranked among the top 5 of 30 projects that semester

• Connected virtual sensors to AWS IoT through the MQTT protocol, a publish-and-subscribe messaging transport for IoT communications

• Streamlined the AWS services using AWS Lambda, writing to DynamoDB whenever a new message is published to the topic in AWS IoT

• The DynamoDB connection to the NodeJS application, the frontend, and the user and billing modules were delivered by teammates

Technologies: AWS services- EC2, IoT, Lambda and DynamoDB, Raspberry Pi, NodeJS, HTML & CSS and CronTab

Lessen (Shop & Donate), Enterprise Software Project

A platform for bidding on donated items. Users can both donate and bid. The web application has an upload feature, and the Android application has capture and upload features. An admin account is available for managing and blocking items that are unauthorized or inaccurate, along with a dashboard with log analytics to track clicks, views, etc. The net amount after shipping charges goes to charity.

• Developed an Android app with features such as capturing and uploading pictures of items for bidding, a list view of bids, and tracking of won bids.

• Constructed a RESTful API for communication between MongoDB and the applications.

Technologies: Android Studio, MongoDB, Google Firebase, Spring MVC, Tableau, REST API and AWS Elastic Beanstalk.

Groupon Application Testing, Quality Assurance & Testing Project

Groupon is an e-commerce site available worldwide. The goal of this project is to find bugs by testing on different platforms.

• Tested on iOS, Android and web platforms using manual and automated testing, and identified five UI issues and one security issue.

• Managed the test cases using Test Rail

• Logged the defects using Jira

Technologies: Appium, Selenium, Jira, Test Rail
