Ramulu Gantela
Senior Data Engineer
Mobile: 346-***-**** Email: ****.********@*****.*** LinkedIn
Objective:
Data Engineer with over 19 years of IT experience designing, developing, and deploying data pipelines and data models. Proven expertise in the full lifecycle of Data Warehousing applications, from analysis through post-production support. Proficient in cloud data migration (AWS, Snowflake) and data integration platforms (AWS Glue). Extensive knowledge of Auto Loader, Spark Streaming, Spark SQL, and other Spark components. Experienced in Big Data ecosystem solutions, including Hadoop MapReduce and NoSQL databases. Strong background in cloud and data platform services, including AWS, Spark, Snowflake, Databricks, Kafka, Airflow, and SQL. Known for delivering innovative solutions and driving business value through modern data technologies. Ready to contribute to challenging projects and exceed expectations in dynamic environments.
Professional Summary:
Extensive experience with data warehousing, data marts, dimensions, and facts.
Proficient in Spark Architecture with Databricks, Structured Streaming, Delta Lake, and Lakehouse Architectures.
Skilled in creating data pipelines using Python, PySpark, and Spark SQL.
Experienced with multiple data formats: JSON, PARQUET, XML, AVRO, and CSV.
Expertise in developing data engineering applications on AWS and Snowflake.
Proficient in streaming and transforming data using Kafka, Qlik, Databricks, and Snowflake.
Skilled in using AWS services: SNS, EC2, SQS, Glue, Step Functions, CloudWatch, and Lambda.
Imported data from AWS S3 into Spark DataFrames, performed transformations and actions.
Expertise in developing and optimizing data pipelines and models.
Developed and maintained ETL processes for migrating data from S3 to Snowflake.
In-depth knowledge of Snowflake Database, Schema, and Table structures.
Built SnowPipes for continuous data integration from S3 to Snowflake.
Proficient in using SnowSQL for querying, loading, and exporting data.
Experienced in loading data from internal and external stages to Snowflake tables.
Skilled in bulk loading and unloading data into Snowflake tables.
Expertise in developing PySpark programs on Databricks for data transformations.
Designed and developed ETL data pipelines for various on-prem and cloud sources.
Experienced in analyzing, building, monitoring, and debugging data pipelines.
Proficient in writing SQL Queries for Source to Target Data Transformations.
Implemented DevOps practices using Git, Ansible, and Jenkins for CI/CD.
Utilized Delta Lake optimizations such as OPTIMIZE with Z-ORDER to enhance performance (a minimal PySpark sketch follows this summary).
Experienced with Conceptual, Logical, and Physical Data Models for OLAP and OLTP systems.
Worked on Agile sprint-based projects with exposure to Azure DevOps.
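The S3-to-DataFrame and Delta Lake tuning work summarized above can be illustrated with a minimal PySpark sketch. The bucket, paths, column names, and table name below are hypothetical placeholders, and the OPTIMIZE/ZORDER step assumes a Databricks or Delta Lake environment.

```python
# Minimal sketch (hypothetical names): read Parquet from S3 into a Spark
# DataFrame, apply transformations, persist as a Delta table, then compact
# and Z-Order it for faster reads.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("claims-ingest").getOrCreate()

# Read raw Parquet files landed in S3
raw_df = spark.read.parquet("s3://example-bucket/raw/claims/")

# Example transformations: rename columns, drop bad records, derive a
# partition column from the claim timestamp
clean_df = (
    raw_df
    .withColumnRenamed("CLM_ID", "claim_id")
    .filter(F.col("claim_id").isNotNull())
    .withColumn("claim_date", F.to_date("claim_ts"))
)

# Persist as a partitioned Delta table (database/table names are placeholders)
(clean_df.write
    .format("delta")
    .mode("append")
    .partitionBy("claim_date")
    .saveAsTable("analytics.claims_clean"))

# Compact small files and cluster data on a frequently filtered column
spark.sql("OPTIMIZE analytics.claims_clean ZORDER BY (claim_id)")
```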
Technical Skills:
Big Data
Hadoop, HDFS, MapReduce, PySpark, Spark-Scala, Spark-Java, Spark Streaming, Kafka, Hive, Impala, Flume, Spark SQL, YARN, HBase and Sqoop
AWS
Amazon S3, EC2, Redshift, EMR, DynamoDB, Kinesis, Glue, RDS, IAM, Lambda, Firehose, Aurora
Azure
Azure Data Lake Gen2, Azure Delta Lake, Azure Storage, Azure Synapse, Azure DataBricks, Azure SQL, Azure Cosmos DB
Snowflake
SnowSQL, Snowpipe, Streams, Tasks, Shares, Data Sharing, Zero-Copy Clone, RBAC, Materialized Views, Time Travel, Data Retention, Advanced SnowSQL, Snowflake Stored Procedures
Data Warehouse
Snowflake, Redshift
Data Modelling
Hackolade, Erwin
Data Lakehouse
Databricks
RDBMS
Oracle, MariaDB, SQL Server
Query Language
SQL, HQL, Spark SQL, PL/SQL
Programming Languages
Python, Scala, Java
Scripting
Unix shell scripting, Python, Perl
Visualization
ThoughtSpot, Tableau
NoSQL
HBase, Cassandra, MongoDB
Methodologies
Agile, Waterfall
Version Control
GitHub, SVN
Scheduler
Airflow, Control-M
DevOps Tools
Ansible, Jenkins
Data Modelling Tools
ErWin DM
Domain
Banking - Risk, Compliance, Treasury & Markets; Telecom - IP, NMS; Supply Chain Management
IDEs
Eclipse, PyCharm, Jira
Education:
Master of Technology, IIT (Indian Institute of Technology) Guwahati, Assam, 2005.
Bachelor of Engineering, Osmania University, India, 2002.
Professional Experience:
Client: The Cigna Group, USA Jun’ 2023–Current
Role: Sr Data Engineer
Responsibilities:
Developed data pipelines for the data lake cloud migration of complex health insurance claim systems to process health insurance EDI documents such as HIPAA 276, 277, 278, 270, 271, 835, and 837.
Design and optimize data pipelines for ingesting, processing, and transforming large volumes of data from a wide variety of sources.
Collaborated with cross-functional and IT teams to design conceptual/logical data models and data lineage capabilities to support business goals and objectives.
Optimize Databricks jobs, queries, and notebooks for performance, scalability, and cost efficiency, leveraging best practices and optimization techniques such as partitioning and Z-Ordering.
Designed data pipelines using SQL, PySpark, and Kafka to ingest, clean and persist claim processing data.
Designed and implemented a solution to move on-prem data to AWS cloud using Kafka, Kafka connect and Qlik Replicate.
Working on pipelines to transform data in parquet format from AWS S3 to Databricks delta lake tables.
Working on a complex set of jobs that gather data from multiple high-transaction tables, process and prepare the data, and create views for analytics-team consumption, thereby improving performance.
Develop and maintain streaming data pipelines for different HL7 claim types across formats such as JSON, XML, CSV, Avro, and Parquet, and analyze the data for insights.
Implemented structured streaming using Delta Lake tables and window functions to replace batch jobs (a simplified sketch appears at the end of this section).
Involved in the design of ThoughtSpot dashboards to provide real time insights and improve efficiency.
Good understanding of continuous integration and deployment using Jenkins, Airflow, and Terraform.
Working on implementing Delta Live Tables (DLT) to better manage the existing pipelines.
Fraudulent Insurance Claims Detection with Scalable Machine Learning: built a fraud-detection model for insurance claims using Pandas and scikit-learn (an illustrative sketch appears at the end of this section).
Orchestrated end-to-end development, featuring advanced preprocessing, feature engineering, and scalable model selection.
Achieved a balance between minimizing false positives and accurately detecting fraud in real time, showcasing expertise in data manipulation and machine learning.
Upgraded schemas and tables from the Hive metastore (HMS) to Unity Catalog (UC) external tables using the DEEP CLONE approach.
Used CI/CD pipelines to deploy code into SIT, UAT, and PROD environments.
Troubleshot and resolved data and technical production issues.
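A simplified sketch of the structured-streaming pattern referenced above: Parquet files landing in S3 are streamed into a Delta table with a windowed aggregate replacing a batch job. The paths, schema fields, and watermark/window settings are illustrative assumptions, not the production configuration.

```python
# Hedged sketch (hypothetical paths and schema): stream Parquet files from S3,
# aggregate over a tumbling window, and write the results to a Delta table.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import (StructType, StructField, StringType,
                               TimestampType, DoubleType)

spark = SparkSession.builder.appName("claims-streaming").getOrCreate()

# Streaming file sources require an explicit schema
schema = StructType([
    StructField("claim_id", StringType()),
    StructField("claim_amount", DoubleType()),
    StructField("event_ts", TimestampType()),
])

stream_df = (
    spark.readStream
    .format("parquet")
    .schema(schema)
    .load("s3://example-bucket/landing/claims/")
)

# 15-minute tumbling window with a watermark to bound late-arriving data
agg_df = (
    stream_df
    .withWatermark("event_ts", "30 minutes")
    .groupBy(F.window("event_ts", "15 minutes"))
    .agg(F.count("claim_id").alias("claim_count"),
         F.sum("claim_amount").alias("total_amount"))
)

# Checkpointed write to a Delta table path for fault-tolerant processing
query = (
    agg_df.writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/claims_agg/")
    .start("s3://example-bucket/delta/claims_agg/")
)
```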
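An illustrative sketch of the fraud-detection workflow mentioned above, using Pandas and scikit-learn. The CSV path, feature columns, and the choice of a class-weighted random forest are assumptions for demonstration rather than the production model.

```python
# Illustrative sketch (hypothetical data and columns): preprocess claim
# features and train a classifier that balances false positives against recall.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

claims = pd.read_csv("claims_sample.csv")  # hypothetical input file
X = claims.drop(columns=["is_fraud"])
y = claims["is_fraud"]

# Scale numeric features and one-hot encode categorical ones
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["claim_amount", "patient_age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["claim_type", "provider_state"]),
])

model = Pipeline([
    ("prep", preprocess),
    ("clf", RandomForestClassifier(n_estimators=200,
                                   class_weight="balanced",
                                   random_state=42)),
])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```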
Client: UCT Global, USA Jul’ 2022–May’ 2023
Role: Data Engineer
Responsibilities:
Created a centralized Common Data Model project in UCT Global IT.
Worked with stakeholders to gather requirements and define data solutions, which were planned on time and within budget.
Managed all stages of data migration, including data analysis, pre-go-live validations, user acceptance testing, user sign-off, go-live cut-over plans, and post go-live validations.
Developed data migration approaches and strategies, created data migration functional specifications, data flow, data mapping specs, data cleansing, and data transformation with SMEs.
Conducted multiple test iterations and dry runs for smooth cutover from legacy systems to new environments.
Implemented complex SQL queries and stored procedures for data extraction, transformation, and loading.
Designed and optimized data models to support analytical reporting and machine learning applications.
Designed and implemented data models, schemas, and storage structures within the data platform for efficient data organization and analytics, following best practices.
Involved in the standardization and modernization of data for analytical and operational needs to promote data re-use and provide the right data for consumption.
Ensured users adapted to new processes and systems, facilitating their transition.
Provided expertise and leadership in integration technologies and enforced company’s process standards.
Collaborated with cross-functional teams to understand business requirements and deliver data solutions that meet stakeholders' needs.
Led business requirements gathering, fit-gap analysis, solution design, and project estimations.
Trained UAT teams on change requests and ensured successful data loads into new systems.
Built and maintained a data model that ensured the accuracy and consistency of data and powered a variety of Tableau reports and dashboards.
Troubleshot and resolved data and technical issues quickly and efficiently, minimizing the impact on the business.
Client: HSBC Plc, UK Sep’ 2017–Jun’ 2022
Role: ETL Senior Developer (Tech. Lead)
Responsibilities:
Involved in business analysis and technical design sessions with business and technical staff to develop data models, requirements document, and ETL specifications.
Involved in designing the physical database system.
Involved in data quality and cleansing of data source.
Involved in creation of database schema and capacity planning of schema for the warehouse.
Performed unit testing of the mappings.
Responsible for migration of applications from dev environment to Test and finally to Production.
Developing the mapping documents including System of Record field names, Mapping Rules and BI Target field names.
Preparation of test cases to validate the output data generated using the mapping documents.
Validation of the test cases using JIRA tool.
Defect tracking through JIRA and SharePoint portal.
Developed, designed, and maintained Tableau dashboards and analytics.
Managed and utilized the Tableau platform to extract meaningful insights from data.
Created dashboards and reports using Tableau.
Demonstrated the work done on each sprint to clients.
Client: HSBC Plc, UK Jan’ 2015– Sep’ 2017
Role: ETL Senior Developer
Responsibilities:
Used SAS Data Integration Studio to develop various job processes for ETL (extract, transform, and load).
Integrated data from various source systems (flat files, CSV, text, and other formats) using SAS.
Used various transformations such as Lookup, Extract, Data Validation, and Splitter to perform data validations.
Involved in the preparation of Technical Design documents for the developed code.
Provided support on implementation day and thereafter, producing new releases as business requirements dictated.
Extracted data from operational systems, transformed it to improve its quality, and loaded it into a dedicated fraud data mart.
Profiled, cleansed, integrated, and standardized data to create consistent, reliable information with SAS DQ.
Ran all custom scenarios using FCM information, generating alerts when appropriate.
Responsible for deploying and scheduling the SAS DI jobs via LSF.
Preparation of test cases to validate the output data generated using the mapping documents.
Validation of the test cases using JIRA tool.
Defect tracking through JIRA and SharePoint portal.
Worked on JIRA tickets and tracked requests, including preparing internal performance reports and ticket analysis.
Generated disk-utilization reports and communicated them to the business.
Set up and monitored backup policies, working with enterprise backup teams.
Participated in daily business meetings for the Customer Data Platform (CDP).
Client: HSBC Plc, UK Dec’ 2013– Jan’ 2015
Role: Senior Software Developer
Responsibilities:
Designed, implemented, and delivered data warehouse ETL and reporting applications.
Worked with project managers, design lead and solution architect to achieve business and functional requirements.
Analyzed requirements and accordingly planned, designed, developed, tested, and implemented functionality and procedures.
Reviewed existing systems to determine compatibility with projected needs, selected appropriate systems, and ensured forward compatibility of existing systems.
Performed database development and ETL processing in Oracle 10g and Greenplum using PSQL, SQL, Perl, and UNIX scripting.
Worked on the Performance tuning of the programs, ETL Procedures and processes.
Worked in debugging using Log messages, Server Messages.
Prepared various technical and functional documents, including detailed design documentation and functional test specifications with use cases.
Analyzed source systems and worked with business analysts to identify, study, and understand requirements and translate them into ETL code.
Performed analysis on the quality and source of data to determine the accuracy of the information being reported.
Cooperating with stakeholders at handover, system verification, and operations on a need basis.
Client: HSBC Plc, UK May’ 2012 – Dec’ 2013
Role: Software Developer
Responsibilities:
Distributed work between onshore and offshore team members based on their level of understanding of the respective PTS feeds flowing via MQ through to the Depository Trust & Clearing Corporation (DTCC).
Validated deliverables within the team on a daily basis after peer reviews of the data and reports.
Asset class scope included Rates, Credit, Equities, GFX, and Commodities trades and transactions.
Product scope included swaps, security-based swaps, mixed swaps, security-based swap agreements, FX options, non-deliverable forwards, currency swaps, and cross-currency swaps.
Extracted data from input files received via MQ as XML messages/XSLTs, together with mapping sheets from the team space, and converted them into the format required to process the data through to reporting.
Identified key automation opportunities and provided inputs to the team to build utilities where and when necessary.
Generated output FpML files for all PTS through the DFA Transformation Factory, working with the team.
Consolidated test results on a daily basis and reported them to onshore test management.
Created test cases for the respective sprints in QC and moved them to the Test Lab for further execution cycles.
Client: HSBC Plc, UK Nov’ 2010 – May’ 2012
Role: Software Developer
Responsibilities:
Worked on TLM 2.5, a third-party (SmartStream) strategic reconciliation platform used to reconcile data for different products (e.g., IRD, EQD, and FXO) coming from various source systems.
The end-to-end reconciliation for cash recs involved analyzing requirements for a data translation layer that sources data into TLM.
Developed SQL and PL/SQL queries and stored procedures.
Wrote code for enhancements in Java and Oracle DB.
Developed Oracle PL/SQL code using packages, stored procedures, functions, triggers, cursors, views, and materialized views, per business requirements, for new enhancements or to resolve issues.
Participated in data quality assurance and testing.
Client: Fiserv, USA Dec’ 2006 – Oct’ 2010
Role: Senior Software Engineer
Responsibilities:
SME for the Financial Platform, CNOM, and Lending modules; mentored the team as POC for the Maintenance Team.
Involved in all discussions for analyzing and understanding the requirements for a specific rec, and planned project calls with the business to discuss any issues or gaps.
Monitored defect status in Quality Center, followed up with various teams on resolution, and updated the project plans and test documents accordingly.
Prepared test plans and test cases, uploaded them into Quality Center, and adhered to project delivery timelines.
Actively involved in regression testing for various existing and new recs and BAU enhancements, and produced performance reports.
Involved in EL3 support.
Defect tracking and revalidation.
Client: Bank of Utica, USA Jun’ 2005 – Dec’ 2006
Role: Software Engineer
Responsibilities:
Developed SQL and PL/SQL queries and stored procedures.
Wrote code for enhancements in Java and Oracle DB.
Developed Oracle PL/SQL code using packages, stored procedures, functions, triggers, cursors, views, and materialized views, per business requirements, for new enhancements or to resolve issues.
Participated in data quality assurance and testing.
Documented SQL and PL/SQL code, making it easier for other developers to understand and maintain.