
Software Engineer Big Data

Location:
Little Elm, TX
Posted:
June 23, 2025


Resume:

Anuhya Siliveru

Mobile: 737-***-****

Mail: ***********@*****.***

http://linkedin.com/in/anuhyaanu

TECHNICAL SUMMARY:

Languages: Python, Scala

Big Data: HDFS, Spark (PySpark), YARN

Cloud: AWS S3, Athena, Aurora, SNS, EMR, Kinesis, EC2, Glue, Redshift, Lambda, Step Functions

IDEs: Spyder, Eclipse, IntelliJ IDEA

Databases: Hive, Snowflake, SQL Server

Environment: Linux, Windows

Version Control: Git, Apache Maven, Tortoise SVN

PROFESSIONAL SUMMARY:

• 12+ Years of IT experience across various industries, including 6+ years specializing in Cloud Data Engineering using Python, AWS, and Big Data technologies.

• Extensive expertise in Python scripting for ETL, data transformation, and automation tasks.

• Proficient in developing scalable data pipelines with Python, leveraging AWS services like Redshift, Lambda, Step Functions, Glue, and EMR.

• Strong experience in data processing with PySpark and managing large datasets within AWS S3 and Redshift environments.

• Skilled in designing Python-based data ingestion frameworks for Redshift and AWS S3.

• Expertise in using Python for error handling, logging, and monitoring to ensure data pipeline robustness.

• Knowledgeable in best practices for Python code optimization, debugging, and modularization.

PROFESSIONAL EXPERIENCE

Organization: Technocorp Solutions INC (USA)

Role: Software Engineer

Duration: 02/2022 - present

Client: Elevance Health Technologies.

Project Description:

IRB is a centralized reporting platform integrating Membership, Claims, and Rebates data. It consolidates multiple data sources into a data warehouse, providing data access to developers and business analysts.

Responsibilities:

Developed PySpark scripts for business transformations, data validation, and loading, focusing on optimizing performance and accuracy.

Built automated data pipelines with Python, integrating different data sources and improving data ingestion processes.

Migrated the on-premises database structure to the Redshift warehouse.

Built various normalization jobs for data ingestion into Redshift.

Implemented Workload Management (WLM) in Redshift to prioritize basic dashboard queries over more complex or long-running ad-hoc queries, providing a more reliable and faster reporting interface with sub-second response times for basic queries.

Created Glue jobs to load the data into Redshift tables.

Optimized Redshift load performance using distribution styles and sort keys (a DDL sketch follows this section).

Tuned Redshift SQL queries for complex joins and aggregations, improving dashboard responsiveness.

Created and managed project-specific error logging and control log tables in Redshift for tracking task success and failure.

Worked closely with the Tableau team to troubleshoot Python-related data preparation issues, enhancing report generation.

Attended daily Agile SCRUM meetings to collaborate with cross-functional teams and refine Python workflows.

Environment: Python, Spark (with PySpark), AWS (Redshift, Lambda, Step Functions, S3, Glue, EMR), Snowflake, Git, JIRA, Confluence.
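
As a rough illustration of the distribution-style and sort-key tuning mentioned above, the sketch below creates a hypothetical claims_fact table from Python via psycopg2. The table name, columns, and connection details are illustrative assumptions, not the project's actual schema.

```python
import psycopg2  # assumes the psycopg2 driver; redshift_connector would work similarly

# Hypothetical connection details -- replace with the real cluster endpoint and credentials.
conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439,
    dbname="reporting",
    user="etl_user",
    password="***",
)

# Illustrative fact table: DISTKEY co-locates rows joined on member_id,
# SORTKEY lets date-filtered dashboard queries scan fewer blocks.
DDL = """
CREATE TABLE IF NOT EXISTS claims_fact (
    claim_id     BIGINT,
    member_id    BIGINT,
    service_date DATE,
    paid_amount  DECIMAL(18, 2)
)
DISTSTYLE KEY
DISTKEY (member_id)
SORTKEY (service_date);
"""

with conn, conn.cursor() as cur:
    cur.execute(DDL)
```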

Organization: Legato Health Technologies

Role: Senior Software Engineer

Duration: May 2020 – Jan 2022

Client: Anthem, Inc.

Project: IngenioRx Operational Reporting Repository (IORR)

Project Description:

IORR is a centralized platform for operational reporting within IngenioRx, primarily for member experience reporting and data tracking.

Responsibilities

Built Python ETL scripts for data extraction, transformation, and loading, integrating AWS Glue jobs and Lambda functions to automate data pipelines (see the Lambda sketch after this section).

Created custom Python modules for parsing, transforming, and loading membership data into S3 and subsequently to Redshift.

Implemented Step Functions in AWS to manage the sequencing and orchestration of Python-based ETL tasks.

Created Glue jobs to load the data into Redshift tables.

Optimized Redshift load performance using distribution styles and sort keys.

Ingested data from Hive tables into the Redshift warehouse.

Validated data integrity with Python unit tests, ensuring accuracy across data pipelines.

Leveraged AWS CloudWatch to monitor and debug Python scripts, optimizing pipeline performance and reliability.

Environment: Python, PySpark, AWS (Lambda, Step Functions, S3, Glue, Redshift), Snowflake, Git, JIRA
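
A minimal sketch of the Glue/Lambda integration described above: a Lambda handler, invoked from a Step Functions task, starts a hypothetical Glue job that loads membership data toward Redshift. The job name and event keys are assumptions for illustration only.

```python
import boto3

glue = boto3.client("glue")

def lambda_handler(event, context):
    """Step Functions invokes this Lambda; it kicks off one Glue job run.

    `event` is assumed to carry the S3 prefix of the files to process.
    """
    run = glue.start_job_run(
        JobName="membership-to-redshift",                      # hypothetical Glue job name
        Arguments={"--source_prefix": event["source_prefix"]},
    )
    # The returned run id lets a later Step Functions state poll for completion.
    return {"status": "STARTED", "job_run_id": run["JobRunId"]}
```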

Project 2: IngenioRx Eligibility Platform (IEP BCI)

The IngenioRx Eligibility Platform is an event-driven, cloud-based data lake for member and group claims data.

Responsibilities:

Developed the end-to-end flow for standalone files.

Developed Lambda functions and Step Functions in AWS.

Wrote Python code for transformations and for reading CSV files from S3 buckets (as sketched after this section).

Worked with pandas DataFrames, constructing DataFrames from dictionaries.

Involved in extraction, transformation, and loading from the PLZ zone to outbound.

Followed Agile methodology, attending stand-up calls and retrospective meetings to track and discuss open issues.

Performed data sanity checks by validating unit test cases.

Participated in design discussions and implemented the resulting designs.

Debugged ETL loads using CloudWatch and Altus cluster logs.

Implemented a CI/CD process with Git to move code to production.

Provided development support during deployment as well as post-production support.

Environment: Python, Spark, AWS (CloudFormation, Step Functions, S3, EMR, Glue, Lambda, Redshift, SNS, CloudWatch Events), Git, JIRA, Confluence
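
A minimal sketch of the CSV-from-S3 reading and dictionary-based DataFrame construction mentioned above; the bucket, key, and column names are hypothetical.

```python
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")

def read_csv_from_s3(bucket: str, key: str) -> pd.DataFrame:
    """Fetch one CSV object from S3 and load it into a pandas DataFrame."""
    obj = s3.get_object(Bucket=bucket, Key=key)
    return pd.read_csv(io.BytesIO(obj["Body"].read()))

# DataFrame built from a plain dictionary, e.g. a small reference lookup.
lookup = pd.DataFrame({"plan_code": ["A1", "B2"], "plan_name": ["Basic", "Plus"]})

# Hypothetical bucket/key; a simple transformation joins the lookup onto the file.
members = read_csv_from_s3("example-inbound-bucket", "standalone/members.csv")
members = members.merge(lookup, on="plan_code", how="left")
```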

Project 1: IFSSI (Claims Finance Data)

IFSSI is a solution for ingesting fixed-width claims finance data files received from IFSI (a parsing sketch follows this section).

Responsibilities:

Developed Lambda functions and Step Functions in AWS.

Involved in extraction, transformation, and loading from RAWZ to APPZ.

Performed data sanity checks by validating unit test cases.

Participated in design discussions and implemented the resulting designs.

Debugged ETL loads using CloudWatch and EMR cluster logs.

Implemented a CI/CD process with Git to move code to production.

Environment: Spark, AWS (CloudFormation, Step Functions, S3, EMR, Glue, Lambda, Redshift, SNS, CloudWatch Events), Git, JIRA, Confluence
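
Since the source files are fixed width, here is a sketch of how such a file could be parsed in Python with pandas.read_fwf; the column offsets and names are assumptions, not the actual IFSSI layout.

```python
import pandas as pd

# Hypothetical field layout: (start, end) byte offsets for each column.
COLSPECS = [(0, 10), (10, 18), (18, 30)]
NAMES = ["claim_id", "post_date", "paid_amount"]

# Read the fixed-width file as strings first, then cast what needs casting.
claims = pd.read_fwf("ifssi_claims.dat", colspecs=COLSPECS, names=NAMES, dtype=str)
claims["paid_amount"] = claims["paid_amount"].astype(float)
claims["post_date"] = pd.to_datetime(claims["post_date"], format="%Y%m%d")
```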

Organization: Tetra Soft India Pvt. Ltd

Role: Senior Software Engineer

Duration: 02/2020 – 05/2020

Client: Anthem, Inc.

Project 2: IngenioRx Eligibility Platform (IEP)

The IngenioRx Eligibility Platform is an event-driven, cloud-based data lake for member and group claims data.

Responsibilities:

Developed the end-to-end flow for Anthem eligibility files from 12 source systems.

Created SNS notifications to alert on data quality (DQ) failures (a sketch follows this section).

Developed Lambda functions and Step Functions in AWS.

Developed Python scripts to implement the business logic and transformations.

Performed data sanity checks by validating unit test cases.

Participated in design discussions and implemented the resulting designs.

Debugged ETL loads using CloudWatch and Altus cluster logs.

Implemented a CI/CD process with Git to move code to production.

Provided development support during deployment as well as post-production support.

Environment: AWS Lambda, Altus cluster, S3, PySpark, Aurora DB, SNS
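
A minimal sketch of publishing a data-quality alert to SNS from Python; the topic ARN, message fields, and example source system are hypothetical.

```python
import boto3

sns = boto3.client("sns")

def notify_dq_failure(source_system: str, reason: str) -> None:
    """Publish one DQ-failure alert so subscribers (email, pager) are notified."""
    sns.publish(
        TopicArn="arn:aws:sns:us-east-1:123456789012:iep-dq-alerts",  # hypothetical topic
        Subject=f"DQ failure: {source_system}",
        Message=reason,
    )

# Example: a validation step found an empty eligibility file.
notify_dq_failure("source_system_07", "Eligibility file arrived with zero records.")
```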

Duration: 08/2019 – 01/2020

Client: Anthem, Inc.

Project 1: IQVIA Data Deliverables

The IQVIA (Rx) data is an industry-standard source for measuring the volume and cost of dispensed prescriptions.

Responsibilities:

Worked as a consultant, collaborating closely with product owners and BAs across the phases of the software development cycle.

Created Hive scripts and shell scripts according to the mapping document.

Built and loaded data into APPZone Hive tables in Hadoop (a PySpark sketch follows this section).

Extensively developed SQL queries based on the mapping and approach documents.

Followed Agile methodology, attending stand-up calls and retrospective meetings to track and discuss open issues.

Worked with Git (Bitbucket), JIRA, and CTM to promote code from lower to higher environments and to track sprint stories.

Conducted code review calls with the tech lead and PO to obtain approval for code migration to higher environments.

Developed the TDD document and implemented the code.

Worked in JIRA to track sprint releases.

Worked with the PO to approve the TDD document and code.

Environment: HDFS, YARN, Hive, Spark, Python, Linux
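
A rough sketch of loading an APPZone Hive table from PySpark; the database, table, and column names are hypothetical, since the real mapping comes from the mapping and approach documents.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iqvia-appzone-load")
    .enableHiveSupport()   # lets spark.sql() read and write Hive tables
    .getOrCreate()
)

# Allow fully dynamic partition inserts for the load below.
spark.conf.set("hive.exec.dynamic.partition", "true")
spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")

# Hypothetical mapping: stage table in the raw zone -> partitioned APPZone table.
spark.sql("""
    INSERT OVERWRITE TABLE appz.rx_dispense PARTITION (load_month)
    SELECT ndc_code, dispense_count, total_cost, load_month
    FROM rawz.rx_dispense_stage
""")
```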

Organization: Wave Infosoft Pvt. Ltd, Hyderabad

Role: Software Engineer

Duration: 12/2017 – 07/2019

Client: Maersk

Project #1: IoT Data Migration to AWS

This project ports IoT sensor data to the cloud to address analytics needs. The native storage services of the AWS infrastructure (Athena, Aurora, S3) are leveraged to store this data and help the client gain insights for business decisions, particularly the logistics optimization of their carrier services.

Environment: Scala, Spark Streaming, AWS, Snowflake, Aurora, Python, S3, MySQL, SNS

Responsibilities:

Developed software modules used to migrate data from on-premises systems to cloud services.

Created EMR clusters and monitored them through the AWS console.

Developed code to send email notifications to the client using SNS (an AWS service).

Developed code to connect to Snowflake (data warehouse) and perform read and write operations against the source database (a sketch follows this section).

Automated the process of extracting and loading data between different servers.
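
Although this project's code was mainly Scala and Spark Streaming, here is a minimal Python sketch of the Snowflake read/write connectivity described above, using the snowflake-connector-python package; the account, warehouse, and table names are hypothetical.

```python
import snowflake.connector  # pip install snowflake-connector-python

# Hypothetical account and object names -- replace with the real identifiers.
conn = snowflake.connector.connect(
    user="etl_user",
    password="***",
    account="xy12345.us-east-1",
    warehouse="LOAD_WH",
    database="IOT",
    schema="SENSOR",
)

cur = conn.cursor()
try:
    # Read a sample of sensor readings ...
    cur.execute("SELECT device_id, reading_ts, temperature FROM readings LIMIT 10")
    rows = cur.fetchall()

    # ... and write a small audit record back.
    cur.execute(
        "INSERT INTO load_audit (table_name, row_count) VALUES (%s, %s)",
        ("readings", len(rows)),
    )
finally:
    cur.close()
    conn.close()
```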

Organization: Wave Infosoft Pvt. Ltd, Hyderabad

Role: Software Engineer

Duration: 06/2017 – 01/2017

Client: Confidential

Project : Global Channel Partner Program (Data Integration)

Responsibilities:

Designed and developed technical specifications using Hadoop technology.

Handled importing and exporting data to and from HDFS and Hive using Sqoop.

Created Hive tables with partitioning and bucketing, loaded them with data, and wrote Hive queries, which run internally as MapReduce jobs (a sketch follows this section).

Designed scalable data layout in Hive by choosing the right file formats.

Analyzed and transformed stored data by writing Spark jobs in Scala based on business requirements.

Created Oozie workflows for Hadoop-based jobs, including Sqoop, Hive, and Pig.

Analyzed the dependencies between jobs and scheduled them accordingly using Control-M.

Environment: Hadoop, Sqoop, Hive, Spark, Java, UNIX
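
A sketch of the partitioned, bucketed Hive layout described above, expressed through PySpark for consistency with the other examples here; the database, table, and column names are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# Hypothetical partner-orders table: partition pruning on ingest_date, and
# bucketing on partner_id to help joins and sampling; ORC keeps scans cheap.
spark.sql("""
    CREATE TABLE IF NOT EXISTS partner.orders (
        order_id   BIGINT,
        partner_id INT,
        amount     DECIMAL(12, 2)
    )
    PARTITIONED BY (ingest_date STRING)
    CLUSTERED BY (partner_id) INTO 16 BUCKETS
    STORED AS ORC
""")
```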

Organization: Arete IT Services Pvt. Ltd, Vijayawada, India

Role: Software Engineer

Duration: 02/2013 – 05/2017

Client: Better Castings

Project : ERP

The project covers the complete computerization of customer services and records. The basic modules include setup of master data, production information, billing, inventory, and employee information. The master module has various master forms for entering data pertaining to departments and their codes, investigation codes and their rates, various kinds of procedures and their charges, packing and their charges, etc.

Responsibilities:

Understood the flow specifications and was responsible for developing the application, which is based entirely on MVC architecture.

Interacted with the client, gathering requirements and identifying the use cases.

Involved in Project Designing and Development.

Involved in writing report generation.

Environment: Java, JSP, Struts, JavaScript, Ajax

EDUCATION:

Master of Science in Computer Science

P.B. Siddhartha College of Arts & Science, Nagarjuna University, Guntur, Andhra Pradesh

Graduated: June 2011.

Developed automated data load processes from S3 to Redshift using Glue and Lambda, ensuring data integrity and scheduling with Step Functions.



