Data Engineer – Web Applications

Location: Phoenix, AZ

Shiva Kumar Eravelly

ad0x8z@r.postjobfree.com

901-***-****

Summary:

• OCJP-certified Sr. Data Engineer with 11 years of experience, specializing in big data technologies such as Spark (Scala & Java), Kafka, Hive, HDFS, Sqoop, and Oozie, AWS services including S3, EC2, RDS, EMR, Lambda, and Data Pipeline, along with Java/J2EE technologies.

• Helped architect a big data analytics project on the AWS platform; configured Jenkins for CI/CD and set up deployment endpoints.

• Developed web applications using J2EE technologies such as JSP, Servlets, EJB, and JDBC. Sound knowledge of building multi-tier web applications with Spring, Hibernate, and RESTful services that ensure clear separation of layers and rapid application development.

Skills:

• Languages: Core Java, Scala, SQL, HiveQL.

• Big Data & Cloud: AWS, Spark/Spark Streaming, Kafka, Hive, Oozie, HDFS, Airflow.

• Application Servers: BEA WebLogic, IBM WebSphere.

• Databases: Oracle, IBM DB2, MySQL, Redshift, RDS, Snowflake.

• Systems: Windows, Linux, Unix.

• Framework: MVC, Spring, Hibernate, JUnit.

• Scripting & Web: Shell scripting, HTML, JavaScript, jQuery, CSS, XML.

• Java/J2EE Skills: JSP, Servlets, EJB, JDBC, JMS, REST web services.

Work Experience:

Jul 2021 – Present InMarket

Sr. Data Engineer

Responsibilities:

• Took ownership of the NinthDecimal projects after NinthDecimal was acquired by InMarket.

• Achieved monthly cost savings of over $200,000 through project remodeling and redesign.

• Analyzed, redesigned, developed, and migrated the NinthDecimal legacy systems built on Luigi/PySpark/Snowflake to Airflow/Spark/BigQuery/Dataproc.

• Designed ER models for multiple projects to optimize data flow from source to destination.

• Orchestrated ETL pipelines in Airflow using services from AWS, GCP, and other APIs (a DAG sketch follows this entry).

• Optimized EMR clusters to improve Spark performance and developed jobs using the DataFrame API.

• Created IAM roles and groups on AWS and GCP.

• Collaborated with cross-functional teams to gather essential business requirements.

• Created Terraform scripts to deploy scheduled queries and Lambda functions in the cloud.

• Led a team to its goals by providing technical guidance and solutions and by creating well-defined stories and acceptance criteria gathered from stakeholders.

Environment: AWS, GCP, GCS, Dataproc, Spark, Scala, Python, S3, EMR, Lambda, ECS, Airflow, Snowflake, BQ, Redshift, GIT.
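
Illustrative sketch of the Airflow orchestration described above: a minimal DAG that submits a PySpark transform to Dataproc and then loads the result into BigQuery. This is not the production code; the project, cluster, bucket, dataset, and table names are placeholders.

from datetime import datetime
from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

PROJECT_ID = "example-project"   # hypothetical project id
REGION = "us-central1"
CLUSTER = "etl-cluster"          # hypothetical Dataproc cluster name

with DAG(
    dag_id="ninthdecimal_daily_etl",      # hypothetical DAG name
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:

    # Submit the PySpark transform job to an existing Dataproc cluster.
    transform = DataprocSubmitJobOperator(
        task_id="spark_transform",
        project_id=PROJECT_ID,
        region=REGION,
        job={
            "placement": {"cluster_name": CLUSTER},
            "pyspark_job": {"main_python_file_uri": "gs://example-bucket/jobs/transform.py"},
        },
    )

    # Load the transformed staging data into a BigQuery reporting table.
    load_bq = BigQueryInsertJobOperator(
        task_id="load_to_bigquery",
        configuration={
            "query": {
                "query": "SELECT * FROM staging.events",   # placeholder query
                "destinationTable": {
                    "projectId": PROJECT_ID,
                    "datasetId": "analytics",
                    "tableId": "events_daily",
                },
                "writeDisposition": "WRITE_TRUNCATE",
                "useLegacySql": False,
            }
        },
    )

    transform >> load_bq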

Sep 2020 – Jul 2021 Williams-Sonoma

Sr. Data Engineer

Responsibilities:

• Developed an Adobe Clickstream data feed pipeline using S3, Lambda, EMR, Spark, and Scala to provide end users with clickstream analytics.

• Developed batch jobs using Spark and Scala to transform data and ingest it into Snowflake, and improved the performance of these jobs (a sketch follows this entry).

• Built streaming applications with Apache Kafka.

• Orchestrated pipelines using Azure Data Factory.

• Used HVR to import and export data between various sources and destinations.

Environment: AWS, Azure, Spark, Scala, Python, Kafka, AWS Lambda, S3, Snowflake, EMR, Airflow, MySQL, GIT.
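
Illustrative sketch of the batch transform-and-ingest pattern above, written in PySpark for brevity (the original jobs used Scala) and using the Snowflake Spark connector. All paths, column names, and connection values are placeholders.

from pyspark.sql import SparkSession, functions as F

# Assumes the Snowflake Spark connector and JDBC driver jars are on the classpath.
spark = SparkSession.builder.appName("clickstream_batch").getOrCreate()

# Hypothetical input path and columns for the raw clickstream feed.
raw = spark.read.json("s3://example-bucket/adobe-clickstream/2021-05-01/")

daily = (
    raw.withColumn("event_date", F.to_date("event_ts"))
       .groupBy("event_date", "page_url")
       .agg(F.countDistinct("visitor_id").alias("unique_visitors"),
            F.count("*").alias("page_views"))
)

# Snowflake connection options; every value here is a placeholder.
sf_options = {
    "sfURL": "example_account.snowflakecomputing.com",
    "sfUser": "etl_user",
    "sfPassword": "********",
    "sfDatabase": "ANALYTICS",
    "sfSchema": "CLICKSTREAM",
    "sfWarehouse": "ETL_WH",
}

(daily.write
      .format("net.snowflake.spark.snowflake")
      .options(**sf_options)
      .option("dbtable", "DAILY_PAGE_METRICS")
      .mode("overwrite")
      .save())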

Aug 2019 – Jul 2020 Choice Hotels

Sr. Data Engineer

Responsibilities:

• Designed and developed a module to ingest billions of records into Redshift, improving efficiency and reducing cost. Migrated the workload from Cloudera to EMR and triggered jobs using Lambda.

• Worked extensively with Spark UDFs to transform complex JSON data and load it into Redshift (see the sketch after this entry).

• Used AWS S3 to store data in various formats with compression.

• Developed Spark Streaming application to stream data from various sources using Kafka.

• Created Airflow DAGs to schedule and trigger batch jobs. Wrote JUnit test cases and used Sonar for code review and refactoring.

Environment: CDH, Spark, Java, Scala, Python, RESTful, Kafka, Hive, Impala, AWS Lambda, S3, Redshift, EMR, Airflow, MySQL, GIT.
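
Illustrative sketch of the Spark UDF approach above, in PySpark. The nested JSON layout, column names, and S3 prefixes are hypothetical; the output is staged as Parquet on S3, from which a Redshift COPY would complete the load.

import json
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("json_flatten").getOrCreate()

# Hypothetical input: each record carries a nested JSON string in the "payload" column.
raw = spark.read.parquet("s3://example-bucket/raw/bookings/")

def extract_rate_code(payload):
    """UDF that pulls a deeply nested attribute out of the JSON payload."""
    try:
        doc = json.loads(payload)
        return doc.get("reservation", {}).get("rate", {}).get("code")
    except (TypeError, ValueError):
        return None

extract_rate_code_udf = F.udf(extract_rate_code, StringType())

flat = (
    raw.withColumn("rate_code", extract_rate_code_udf(F.col("payload")))
       .select("booking_id", "hotel_id", "rate_code")
)

# Stage as Parquet on S3; a Redshift COPY from this prefix would finish the load.
flat.write.mode("overwrite").parquet("s3://example-bucket/stage/bookings_flat/")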

Aug 2018 – Mar 2019 Caterpillar

Software Data Engineer

Responsibilities:

• Developed JSON data extracts using Spark and Scala so the API team could consume normalized data from the CDDW warehouse.

• Improved the performance of Spark jobs by tuning resource allocation, partitioning data, and writing interim files to HDFS.

• Migrated the CDDW project from an on-premises Cloudera environment to the AWS cloud.

• Developed an AWS Lambda function that captures events from API Gateway and fetches data from Snowflake (see the sketch after this entry).

• Implemented Snowflake connections with AWS Lambda. Created views in Hive to improve the performance of batch jobs.

Environment: CDH, Spark, Java, Python, Hive, AWS Lambda, S3, EC2, EMR, Cloud Watch, Impala, Snowflake, Oozie, Sqoop.
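
Illustrative sketch of the API Gateway–Lambda–Snowflake flow above: a minimal handler using the snowflake-connector-python package. The table, column, and query-string parameter names are hypothetical, and credentials are assumed to arrive via environment variables.

import json
import os
import snowflake.connector  # assumes snowflake-connector-python is packaged with the Lambda

def lambda_handler(event, context):
    # API Gateway (proxy integration) passes query string parameters here; names are hypothetical.
    params = event.get("queryStringParameters") or {}
    dealer_id = params.get("dealer_id")

    conn = snowflake.connector.connect(
        user=os.environ["SF_USER"],          # credentials supplied via environment variables
        password=os.environ["SF_PASSWORD"],
        account=os.environ["SF_ACCOUNT"],
        warehouse="ETL_WH",
        database="CDDW",
        schema="PUBLIC",
    )
    try:
        cur = conn.cursor()
        cur.execute(
            "SELECT part_number, quantity FROM dealer_inventory WHERE dealer_id = %s",
            (dealer_id,),
        )
        rows = [{"part_number": r[0], "quantity": r[1]} for r in cur.fetchall()]
    finally:
        conn.close()

    # Return the result set to API Gateway as a JSON body.
    return {"statusCode": 200, "body": json.dumps(rows)}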

Dec 2015 – Aug 2018 Department of Labor

Big Data Developer

Responsibilities:

• Designed, configured, and developed solutions on Amazon Web Services (AWS) for a multitude of applications using the AWS stack (including EC2, S3, CloudWatch, SNS, and EMR), focusing on high availability, fault tolerance, and auto-scaling.

• Developed POCs using the latest big data technologies and presented to federal stakeholders how the system would be modernized and the impact it would create in data analytics.

• Configured EMR clusters and automated the creation and termination of clusters (a sketch follows this entry).

• Developed Spark jobs using Scala and Spark SQL to perform validations and aggregations on CSV and Parquet files and load the results to S3.

• Launched the Redshift cluster, created schemas and tables, and migrated data from S3 to Redshift.

• Used Redshift as a data lake and connected it to Tableau as a data source.

• Developed Lambda functions to trigger functionality based on incoming events from S3 to SNS.

• Configured, automated, and maintained build and deployment using CI/CD tooling (Jenkins) in the AWS cloud, and orchestrated the endpoints.

Environment: AWS, EMR, Amazon S3, Amazon EC2, Amazon SNS, Amazon SWF, CloudWatch, Java, Scala, Spark, Spark Streaming, Redshift, Tableau, Jenkins.
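
Illustrative sketch of the EMR automation above using boto3: a transient cluster that runs one Spark step and terminates itself, plus an explicit termination helper. Cluster name, instance types, jar path, and IAM role names are placeholders.

import boto3

emr = boto3.client("emr", region_name="us-east-1")

def launch_cluster():
    """Create a transient EMR cluster that runs one Spark step and then shuts down."""
    response = emr.run_job_flow(
        Name="dol-batch-validation",                  # hypothetical cluster name
        ReleaseLabel="emr-5.29.0",
        Applications=[{"Name": "Spark"}],
        Instances={
            "InstanceGroups": [
                {"Name": "Master", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            "KeepJobFlowAliveWhenNoSteps": False,     # auto-terminate after the step finishes
        },
        Steps=[{
            "Name": "validate-and-aggregate",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", "--deploy-mode", "cluster",
                         "s3://example-bucket/jars/validation-job.jar"],
            },
        }],
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )
    return response["JobFlowId"]

def terminate_cluster(job_flow_id):
    """Explicitly terminate a long-running cluster by id."""
    emr.terminate_job_flows(JobFlowIds=[job_flow_id])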

Feb 2015 – Oct 2015 TCC Solutions

Programmer Analyst

Responsibilities:

• Implemented the persistence layer using the Hibernate API and integrated it with Spring components.

• Worked on batch processing and designed the ER model for the database.

• Implemented J2EE design patterns such as DAO, Value Object, and Factory for the integration of application modules.

Environment: Java, Spring Framework 3.0, Hibernate, JSP, HTML5, JavaScript, jQuery, IBM DB2, WebLogic Server.

Mar 2011 – Jan 2014 Cognizant Technology Solutions

Programmer Analyst

Responsibilities:

• Used the Spring Framework for Dependency Injection (IoC) and Model-View-Controller (MVC) to implement business-layer components and the application navigation layer.

• Handled events and runtime errors using JSF event listeners and validators.

• Developed the application layout by composing Tiles definitions, managed beans for use with JSF, transfer objects to pass data across layers, and business delegates for invoking business methods.

• Built Hibernate models and used Java design patterns to implement the DAO layer through Hibernate interfaces.

Environment: Java 1.6, Spring Framework 3.0, Hibernate, WebLogic, XML, JSP, JSF, HTML5, CSS3, JavaScript, JDBC, Oracle 10g.

Education:

• Jawaharlal Nehru University, India 2010

Bachelor of Technology

• University of Central Missouri, USA 2014

Master’s in Computer Science

Certifications:

• Certificate of Appreciation from U.S. Department of Labor – 2016

• OCJP 1.6 – Oracle Certified Java Professional – 2011


