
Data Engineer Lead

Location:
Charlotte, NC, 28269
Posted:
June 23, 2025


Rajashekar A Lead Data Engineer

**************@*****.*** 254-***-****

PROFESSIONAL SUMMARY

Lead Data Engineer with 10+ years of software engineering experience, including 8+ years in Big Data technologies. Expertise in Hadoop/Spark development, cloud engineering on AWS and Azure, automation tools, and the end-to-end software design life cycle. Outstanding communication skills; dedicated to maintaining up-to-date IT skills and industry knowledge.

Experience Summary:

Experience in developing applications that perform large-scale distributed data processing using big data ecosystem tools like HDFS, YARN, Sqoop, Flume, Kafka, MapReduce, Pig, Hive, Spark, PySpark, Spark SQL, Spark Streaming, HBase, Cassandra, MongoDB, Mahout, Oozie, and AWS.

Good functional experience in using various Hadoop distributions like Hortonworks, Cloudera, and EMR.

Developed aggregation pipelines and optimized complex MongoDB queries for business-critical analytics and data retrieval operations.

Experience in using data ingestion tools such as Kafka, Sqoop, and Flume.

Experienced in performing in-memory, real-time data processing using Apache Spark.

Good experience in developing multiple Kafka Producers and Consumers as per business requirements.

Extensively worked on Spark components like Spark SQL, MLlib, and Spark Streaming.

Designed and developed interactive Tableau dashboards for monitoring data quality KPIs across Azure Data Factory pipelines.

Configured Spark Streaming to receive real-time data from Kafka, store the stream data to HDFS, and process it using Spark and Scala.
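
As an illustration of that Kafka-to-HDFS streaming pattern, here is a minimal PySpark Structured Streaming sketch (the work described above also used Scala); the broker address, topic name, and HDFS paths are placeholder assumptions:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.appName("kafka-to-hdfs-stream").getOrCreate()

    # Read the raw event stream from Kafka.
    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker1:9092")   # assumed broker address
              .option("subscribe", "events-topic")                 # assumed topic name
              .option("startingOffsets", "latest")
              .load())

    # Kafka delivers key/value as binary; cast the payload to strings for downstream parsing.
    parsed = events.select(col("key").cast("string"), col("value").cast("string"))

    # Persist the stream to HDFS as Parquet, with checkpointing for fault tolerance.
    query = (parsed.writeStream
             .format("parquet")
             .option("path", "hdfs:///data/streams/events")             # assumed output path
             .option("checkpointLocation", "hdfs:///checkpoints/events")
             .start())

    query.awaitTermination()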

Developed analytical components using Kafka, Scala, Spark, HBase, and Spark Streaming.

Worked on developing Azure Data Factory pipelines with various integration runtimes and linked services, and multiple activities such as Copy, Dataflow, Spark, Lookup, Stored Procedure, ForEach, and While loop.

Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, and Azure Data Lake Analytics; ingested data into Azure services such as Azure Data Lake, Azure Storage, Azure SQL, and Azure DW, and processed the data in Azure Databricks.

Developed real-time data processing applications using Scala and Python and implemented Apache Spark Streaming with Azure EventHub as the streaming source.

Involved in loading data from Linux file systems, servers, and Java web services using Azure EventHub producers and partitions.

Experienced in data migration between MongoDB and Snowflake using custom ETL pipelines and third-party connectors (e.g., Kafka Connect, Spark-Connector).

Developed code to read the data stream from Azure EventHub and send it to the respective bolts through their respective streams.

Applied Azure EventHub custom encoders for custom input formats to load data into Azure EventHub partitions.

Proficient in deploying and supporting Azure Synapse (formerly Azure SQL Data Warehouse), including designing, implementing, and managing data warehouses to meet diverse business needs.

Worked on AWS services like Lambda, Glue, and EMR for ingesting data from relational and non-relational source systems to meet business functional requirements.

Used AWS EMR to transform and move large amounts of data into and out of other AWS data stores and databases, such as Amazon Simple Storage Service (Amazon S3) and Amazon DynamoDB

Used AWS CloudWatch to monitor the services within the application and to analyze real-time logs.

Stored data in Amazon Redshift for further analysis before loading it into the end database.

Extensively used Amazon S3 for storing the ingested data and also the queried data after the EMR process.

Implemented event logging through AWS EventBridge on the data stored in S3.

Developed an AWS data pipeline using AWS Glue, Airflow, PySpark, and Snowflake.
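
A hedged sketch of how a Glue-plus-Airflow-plus-Snowflake pipeline of this kind can be orchestrated; the DAG id, Glue job name, connection id, and COPY statement are illustrative assumptions, and the operator import paths depend on the installed Amazon and Snowflake provider packages:

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.amazon.aws.operators.glue import GlueJobOperator
    from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

    with DAG(
        dag_id="glue_to_snowflake_daily",        # assumed DAG name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:

        # Run a PySpark Glue job that lands curated Parquet files in S3.
        transform = GlueJobOperator(
            task_id="run_glue_transform",
            job_name="curate_events_job",        # assumed Glue job name
            region_name="us-east-1",             # assumed region
        )

        # Load the curated files into Snowflake from an external stage.
        load = SnowflakeOperator(
            task_id="copy_into_snowflake",
            snowflake_conn_id="snowflake_default",   # assumed connection id
            sql="COPY INTO analytics.events FROM @curated_stage "
                "FILE_FORMAT = (TYPE = PARQUET)",
        )

        transform >> load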

Worked with Amazon Web Services (AWS) to integrate EMR with Spark 2, S3 storage, and Snowflake.

Extensively worked on AWS services such as EC2, S3, EMR, CloudFormation, CloudWatch, and Lambda.

In-depth understanding of the strategy and practical implementation of AWS cloud-specific technologies including IAM, EC2, EMR, SNS, RDS, Redshift, Athena, DynamoDB, Lambda, CloudWatch, Auto Scaling, S3, and Route 53.

Built optimized QVD pipelines to extract, transform, and load data from Azure SQL and Data Lake into Qlik models

Extensive work experience in creating UDFs, UDAFs in Pig and Hive.

Collaborated with cross-functional teams to embed MongoDB within microservices and big data architecture using Python, Scala, and REST APIs.

Involved in deploying applications on Azure and in setting up a big data cluster using Azure Databricks.

Good experience in using Impala for data analysis.

Good experience with the Snowflake data warehouse; developed data extraction queries and automated ETL for data loading from the data lake.

Experience on NoSQL databases such as HBase, Cassandra, MongoDB, and DynamoDB.

Extended Hive and Pig core functionality by creating custom User Defined Functions (UDF), User Defined Table-Generating Functions (UDTF), and User Defined Aggregating Functions (UDAF).

Adaptability to ETL tools like AWS Glue, Spark, and PySpark, indicative of ability to work with platforms like Matillion.

Proficient in ETL processes, showcasing readiness to learn and utilize similar ETL platforms such as Matillion.

Proficient in SnapLogic integration platform for seamless data integration and workflow automation.

Implemented section access to enforce row-level security across Qlik Sense apps based on user roles and groups.

Designed and executed SnapLogic pipelines for data extraction, transformation, and loading.

Configured SnapLogic connectors for interaction with various data repositories and cloud services.

Used SnapLogic's intuitive interface for crafting data integration workflows.

Collaborated cross-functionally to integrate SnapLogic into existing data architecture.

Implemented error handling mechanisms and data quality checks within pipelines.

Optimized SnapLogic pipelines for enhanced data processing efficiency.

Provided training and support on SnapLogic best practices.

Expertise in relational databases like MySQL, SQL Server, DB2, and Oracle.

Good knowledge of Hadoop security requirements and of integrating with Kerberos authentication and authorization infrastructure.

Developed and maintained Qlik Sense dashboards to visualize key business metrics and operational KPIs.

Experience with ETL concepts and tools, including Informatica and Ab Initio, demonstrating proficiency in similar platforms like DataStage.

Experience in ETL processes, emphasizing ability to work with various ETL tools in enterprise environments.

Involved in identifying job dependencies to design workflow for Oozie & YARN resource management.

Experience working with Core Java, J2EE, JDBC, ODBC, JSP, Java Eclipse, EJB and Servlets.

Strong experience in Data Warehousing ETL concepts using Informatica and Ab Initio.

Skills

Big Data : Hadoop, HDFS, MapReduce, Pig, Hive, Spark, Kafka, Flume, Sqoop, Impala, Oozie, Zookeeper, YARN, Hue.

Hadoop Distributions : Cloudera (CDH4, CDH5), Hortonworks, EMR.

Programming Languages : Python, Java, Scala, SQL, Shell Scripting

Database : NoSQL (HBase, Cassandra, MongoDB), MySQL, Oracle, DB2, Microsoft SQL Server.

Cloud Services : AWS (Lambda, EMR, S3, Athena, Redshift, CloudWatch), Azure (ADF, Databricks, Data Lake)

Frameworks : Spring, Hibernate, Struts.

Scripting Languages : JSP, Servlets, JavaScript, XML, HTML.

Java Technologies : Servlets, JavaBeans, JSP, JDBC, EJB.

Application Servers : Apache Tomcat, Web Sphere, WebLogic, JBoss.

ETL Tools : Ab Initio, SAS, Informatica.

Reporting Tools : Power BI, Tableau, Qlik Sense

Work Experience:

SLK Software

Client: FIDELITY Info services

Role: Lead Data Engineer April 2023 – Present

Led the end-to-end design and delivery of scalable PySpark and AWS Glue-based ETL pipelines for migrating legacy eligibility processes to AWS cloud.
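
A simplified sketch of a Glue PySpark ETL job of this kind; the catalog database, table, key column, and S3 bucket are placeholders rather than actual project values:

    import sys

    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext())
    spark = glue_context.spark_session
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the legacy eligibility extract registered in the Glue Data Catalog.
    source = glue_context.create_dynamic_frame.from_catalog(
        database="legacy_db",            # assumed catalog database
        table_name="eligibility_raw",    # assumed catalog table
    )

    # Basic cleansing: drop duplicates and records missing the assumed key column.
    cleansed = source.toDF().dropDuplicates().na.drop(subset=["member_id"])

    # Write the curated output to S3 as partitioned Parquet.
    (cleansed.write.mode("overwrite")
     .partitionBy("load_date")                              # assumed partition column
     .parquet("s3://example-curated-bucket/eligibility/"))  # assumed bucket

    job.commit()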

Designed Dockerfiles and automated scripts to containerize Spark applications, enabling consistent deployment across dev, QA, and production environments.

Directed the implementation of MongoDB Atlas for document-based storage, including schema modeling, index tuning, and cloud-native provisioning.

Architected and deployed real-time data pipelines using AWS Kinesis and Spark Structured Streaming, optimizing throughput and latency for critical workloads.

Oversaw the development of dbt models to transform raw ingested data into curated Snowflake datasets used for downstream analytics and reporting.

Spearheaded the integration of Tableau and Qlik Sense dashboards with Snowflake and Redshift, enabling real-time visibility into business KPIs.

Played a key role in building CI/CD workflows using Jenkins and GitHub Actions for automating Docker image builds, versioning, and pipeline deployments.

Implemented security best practices across AWS resources, including IAM role enforcement, KMS-based encryption, and VPC subnet isolation.

Collaborated with data engineers, analysts, and DevOps teams to align infrastructure provisioning using Terraform, Docker, and Helm on Amazon EKS.

Directed efforts to build SnapLogic pipelines for orchestrating complex ETL workflows with integrated error handling and automated retries using Linux.

Designed MongoDB collections and indexing strategies to support high-volume, semi-structured JSON data ingested from Kafka and Spark.

Contributed to code reviews and architecture planning sessions, ensuring best practices in Spark transformations, MongoDB schema design, and Snowflake ELT logic.

Designed and implemented complex ETL pipelines using AWS Glue and PySpark to ingest, cleanse, and transform structured and semi-structured data from diverse source systems.

Defined and enforced standardized ETL framework patterns using modular Glue jobs, ensuring reusability, maintainability, and ease of debugging across teams.

Automated end-to-end ETL workflows with event-driven triggers using AWS Lambda, Step Functions, and S3 events for real-time and batch data processing.
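
A minimal sketch of the S3-event-to-Lambda-to-Step Functions trigger pattern described above; the state machine ARN is a placeholder:

    import json

    import boto3

    sfn = boto3.client("stepfunctions")

    # Assumed state machine ARN for the ETL workflow.
    STATE_MACHINE_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:etl-workflow"

    def handler(event, context):
        """Start one ETL workflow execution per object landed in S3."""
        records = event.get("Records", [])
        for record in records:
            payload = {
                "bucket": record["s3"]["bucket"]["name"],
                "key": record["s3"]["object"]["key"],
            }
            sfn.start_execution(
                stateMachineArn=STATE_MACHINE_ARN,
                input=json.dumps(payload),
            )
        return {"status": "started", "records": len(records)}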

Built fault-tolerant and retry-enabled ETL orchestration pipelines using Airflow DAGs, integrated with AWS EMR and Redshift Spectrum for parallel processing.

Contributed to multi-cloud strategy by maintaining backup pipelines and storage replicas across Azure and AWS to ensure fault tolerance.

Architected data lake storage strategies on Amazon S3 with lifecycle policies, intelligent tiering, and Glue Catalog integration for optimized storage and discoverability.

Led data migration efforts from on-prem databases to AWS-managed services (S3, RDS, Redshift, DynamoDB) using Glue, DMS, and custom Spark ingestion jobs.

Implemented partitioning, bucketing, and columnar formats (Parquet, ORC) within ETL jobs to boost performance for downstream analytics in Athena and Redshift.
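
For illustration, a PySpark sketch of this partitioned, columnar write pattern; bucket, table, and column names are assumptions:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("curated-write").getOrCreate()

    df = spark.read.json("s3://example-raw-bucket/transactions/")  # assumed raw input

    # Partition by date so Athena/Redshift Spectrum can prune scans; Parquet gives
    # columnar compression and predicate pushdown.
    (df.write.mode("overwrite")
     .partitionBy("txn_date")                                      # assumed date column
     .parquet("s3://example-curated-bucket/transactions/"))

    # Bucketing requires writing through the table API rather than a bare path.
    (df.write.mode("overwrite")
     .partitionBy("txn_date")
     .bucketBy(16, "customer_id")                                  # assumed bucket column
     .sortBy("customer_id")
     .format("parquet")
     .saveAsTable("curated.transactions"))                         # assumed catalog table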

Configured Azure Monitor alongside CloudWatch for unified observability and alerting across distributed data pipelines.

Developed automation scripts in Shell and Python for provisioning Kafka topics, consumer groups, ACLs, and monitoring configurations
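
A hedged Python sketch of such topic-provisioning automation using the confluent-kafka AdminClient; the broker address, topic names, and partition counts are placeholders:

    from confluent_kafka.admin import AdminClient, NewTopic

    admin = AdminClient({"bootstrap.servers": "broker1:9092"})  # assumed broker

    topics = [
        NewTopic("orders.raw", num_partitions=12, replication_factor=3),     # assumed topics
        NewTopic("orders.curated", num_partitions=6, replication_factor=3),
    ]

    # create_topics returns a dict of topic name -> future; wait on each to surface errors.
    for name, future in admin.create_topics(topics).items():
        try:
            future.result()
            print(f"created topic {name}")
        except Exception as exc:
            print(f"failed to create {name}: {exc}")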

Developed and maintained SSIS packages to perform complex ETL workflows, including incremental loads, CDC, and dynamic SQL execution.

Built custom data pipelines in Azure Data Factory, leveraging lookup, for-each, and stored procedure activities to process large volumes of transactional data.

Architected end-to-end AWS infrastructure using services such as EC2, Lambda, Glue, Redshift, EMR, S3, and Kinesis to support data lakes and streaming platforms on Linux Systems.

Familiar with Pentaho Data Integration (PDI) transformations and jobs; adaptable in using alternate ETL tools for integrating disparate data sources.

Collaborated with security teams to ensure ETL compliance with enterprise standards by integrating KMS encryption, IAM policies, and audit logging in Glue and S3.

Supported cross-cloud data ingestion by orchestrating data flows between Azure Data Factory and AWS Glue for hybrid architecture use cases.

Established standards for full-load and incremental data processing using Glue Catalog, dbt macros, and Snowflake features like stream & task.

Integrated Azure Blob Storage with AWS-based ETL workflows to archive processed data and enable cost-effective long-term storage.

Guided the team in optimizing Spark performance through broadcast joins, repartitioning, and caching, resulting in reduced job execution times.

Supported the migration of legacy SQL workflows into modern, cloud-native solutions on Snowflake, Redshift, and MongoDB.

Automated deployment processes and release pipelines with GitOps practices, enabling faster rollouts and rollback strategies.

Built Python-based web scraping scripts using BeautifulSoup and Requests libraries to extract structured and unstructured data from public APIs and websites.
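
A small illustrative example of this Requests-plus-BeautifulSoup approach; the URL and CSS selector are assumptions:

    import requests
    from bs4 import BeautifulSoup

    def fetch_titles(url):
        """Fetch a page and return the text of all article titles found on it."""
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, "html.parser")
        # The CSS selector is an assumption about the page layout.
        return [tag.get_text(strip=True) for tag in soup.select("h2.article-title")]

    if __name__ == "__main__":
        for title in fetch_titles("https://example.com/news"):   # assumed URL
            print(title)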

Developed scheduling logic and retry handlers for resilient scraping pipelines using Airflow and Python-based logging.

Leveraged Terraform Workspaces to manage isolated environments (dev, QA, staging, prod) with consistent configurations and separate state files

Participated in stakeholder discussions to translate reporting requirements into dynamic dashboards and reusable semantic models.

Mentored junior engineers on PySpark development, Snowflake architecture, and dbt model version control using Git.

Ensured compliance with security policies, role-based access control, and encryption standards across all data pipelines and platforms.

Administered and maintained Confluent Kafka clusters in production and non-prod environments, including brokers, ZooKeeper, Schema Registry, and Kafka Connect

Used Python libraries such as Pandas, Requests, and PyYAML to handle API integration, data parsing, and configuration-driven pipeline execution
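
An illustrative configuration-driven extraction step combining PyYAML, Requests, and Pandas; the config schema, endpoints, and file names are assumptions:

    import pandas as pd
    import requests
    import yaml

    def run_extract(config_path):
        """Pull each configured REST source and land it as Parquet."""
        with open(config_path) as fh:
            config = yaml.safe_load(fh)

        # Each source entry is assumed to define a REST endpoint and an output path.
        for source in config["sources"]:
            response = requests.get(source["url"], timeout=60)
            response.raise_for_status()
            df = pd.json_normalize(response.json())
            df.to_parquet(source["output_path"], index=False)

    if __name__ == "__main__":
        run_extract("pipeline_config.yml")   # assumed config file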

Deployed MongoDB on Azure Virtual Machines in a highly available architecture, leveraging zone-redundancy and disk encryption for security and resilience.

Delivered technical documentation covering pipeline architecture, data lineage, and performance optimization strategies.

Drove data quality assurance by implementing validation frameworks within ETL workflows using PySpark, dbt tests, and Snowflake constraints, ensuring accuracy and compliance.

Integrated observability tools including CloudWatch, Prometheus, and custom logging to monitor ETL jobs and track performance metrics.

Championed a culture of quality and agility through regular performance reviews, backlog grooming, and cross-functional coordination.

ACCION Labs

Client: EMERSON

Role: Sr. Data Engineer June 2022 – March 2023

Responsibilities:

Worked on AWS services like Lambda, Glue, and EMR for ingesting data from relational and non-relational source systems to meet business functional requirements.

Used AWS EMR to transform and move large amounts of data into and out of other AWS data stores and databases, such as Amazon Simple Storage Service (Amazon S3) and Amazon DynamoDB

Used AWS CloudWatch to monitor the services within the application and to analyze real-time logs.

Used Terraform modules to provision and manage Azure resources (VMs, Storage Accounts, Networking) alongside AWS infrastructure.

Stored data in Amazon Redshift for further analysis before loading it into the end database.

Extensively used Amazon S3 for storing the ingested data and also the queried data after the EMR process.

Implemented row-level security (RLS) to restrict data visibility by department and user roles in Tableau dashboards.

Implemented event logging through AWS EventBridge on the data stored in S3.

Developed an AWS data pipeline using AWS Glue, Airflow, PySpark, and Snowflake.

Worked with Amazon Web Services (AWS) to integrate EMR with Spark 2, S3 storage, and Snowflake.

Worked alongside DevOps teams to integrate MongoDB schema changes into CI/CD pipelines using GitHub Actions and Terraform.

Tuned Kafka producer/consumer configurations such as batch size, linger.ms, acks, retries, and buffer memory to match SLAs and system throughput expectations.

Configured Spark Streaming to receive real-time data from Kafka and store the stream data into AWS S3 using Scala.

Implemented automated CI/CD pipelines with Jenkins to deploy microservices to AWS ECS for streaming data, Python jobs to AWS Lambda, and containerized deployments of Java and Python applications.

Used the AWS Glue Data Catalog with crawlers to read data from S3 and perform SQL query operations.

Wrote Terraform scripts to automate AWS services with CloudFront distributions.

Converted legacy Excel-based reports into automated Tableau dashboards, reducing reporting effort by 60%.

Proficient in handling large-scale analytics and data warehousing tasks using Netezza.

Experience in optimizing queries and database performance within the Netezza environment.

Familiarity with Netezza's parallel processing architecture for efficient data processing.

Conducted unit and integration tests on MongoDB queries and data access layers using PyTest and JUnit frameworks to ensure data consistency and integrity.

Managed user access and Kafka security via Confluent RBAC, Kerberos integration, TLS encryption, and role-based ACL enforcement.

Created reusable Terraform templates to deploy Azure Data Factory pipelines and monitor integration runtimes using linked services and triggers.

Integrated Azure Resource Manager (ARM) templates with Jenkins pipelines to provision infrastructure during CI/CD cycles.

Skilled in data modeling and schema design in Netezza to ensure optimal storage and retrieval of data.

Proficient in writing SQL queries and stored procedures in Netezza for data manipulation and analysis.

Supported real-time and batch data processing needs using SnapLogic in AWS.

Configured Azure Monitor and Log Analytics to track resource utilization and ETL job health across Azure-hosted components.

Developed Tableau visualizations to analyze job execution trends across Airflow DAGs, helping reduce failures by 25%.

Performed monitoring and optimization of SnapLogic pipelines for AWS-based projects.

Collaborated cross-functionally to design scalable data solutions with SnapLogic on AWS.

Ensured data integrity and reliability through robust error handling mechanisms in SnapLogic.

Built executive dashboards in Tableau to visualize data movement from AWS S3 to Snowflake, enabling transparency in ETL job success/failure rates.

Experience in data migration and integration with Netezza from various source systems.

Adaptability to ETL tools like AWS Glue, Spark, and PySpark, indicative of ability to work with platforms like Matillion.

Configured Azure Monitor alongside CloudWatch for unified observability and alerting across distributed data pipelines.

Proficient in ETL processes, showcasing readiness to learn and utilize similar ETL platforms such as Matillion

Knowledge of PySpark; used Snowflake to analyze sensor data and cluster users based on their behavior in the events.

Executed seamless middleware integrations and platform migrations to Kafka from legacy message queues with zero-downtime rollout strategies.

Developed UDF, UDAF, and UDTF functions and implemented them in Snowflake queries.

Implemented Snowpipe, stages, and file uploads to the Snowflake database using the COPY command.
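
A hedged sketch of the stage-and-COPY load path using the Snowflake Python connector; the account, credentials, local file, and table names are placeholders:

    import snowflake.connector

    conn = snowflake.connector.connect(
        account="example_account",   # assumed account and credentials
        user="etl_user",
        password="***",
        warehouse="ETL_WH",
        database="ANALYTICS",
        schema="STAGING",
    )

    cur = conn.cursor()
    try:
        # Upload a local file to the table's internal stage, then COPY it into the table.
        cur.execute("PUT file:///tmp/events.csv @%events AUTO_COMPRESS=TRUE")
        cur.execute(
            "COPY INTO events FROM @%events "
            "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
        )
    finally:
        cur.close()
        conn.close()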

Developed scripts and batch jobs to schedule a bundle (group of coordinators) consisting of various Spark programs using Airflow.

Proficient in managing and querying Oracle databases, ensuring efficient data storage and retrieval for business operations.

Extensive experience with Microsoft SQL Server 2012+ for handling large datasets and supporting diverse business functions.

Experience in building and orchestrating ETL workflows using Matillion in cloud environments.

Proficient in designing and implementing data transformations and manipulations in Matillion.

Skilled in integrating Matillion with various data sources and target systems for seamless data movement.

Profiled slow-running queries and optimized execution using MongoDB Compass and performance profiler tools.

Familiarity with Matillion's drag-and-drop interface for rapid development and deployment of ETL processes.

Implemented observability solutions using Prometheus, Grafana, AppDynamics, JMX, and Splunk to monitor Kafka performance, lag, throughput, and consumer health.

Experience in optimizing and tuning Matillion jobs for improved performance and efficiency.

Knowledgeable in monitoring and managing Matillion instances for scalability and reliability.

Skilled in working with DB2 database platform, ensuring data integrity and efficient data management.

Familiarity with Netezza database technology, proficient in optimizing data storage and retrieval processes for enhanced performance.

Experienced in optimizing Snowflake queries, joins to handle different data sets.

Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, Pair RDDs, and Spark on YARN.

Worked with Terraform templates to automate Azure IaaS virtual machines using Terraform modules.

Created change data capture (CDC) workflows with MongoDB Change Streams and Kafka to stream delta updates to Snowflake and Redshift
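
A minimal sketch of such a Change Streams-to-Kafka CDC relay; the connection strings, database, collection, and topic names are assumptions:

    import json

    from confluent_kafka import Producer
    from pymongo import MongoClient

    mongo = MongoClient("mongodb://localhost:27017")             # assumed URI
    collection = mongo["sales"]["orders"]                        # assumed db/collection
    producer = Producer({"bootstrap.servers": "broker1:9092"})   # assumed broker

    # Watch inserts and updates only, and forward each change as a JSON message.
    pipeline = [{"$match": {"operationType": {"$in": ["insert", "update"]}}}]
    with collection.watch(pipeline, full_document="updateLookup") as stream:
        for change in stream:
            producer.produce(
                "orders.cdc",                                    # assumed topic
                key=str(change["documentKey"]["_id"]),
                value=json.dumps(change.get("fullDocument"), default=str),
            )
            producer.poll(0)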

Used Spark and Spark SQL to read Parquet data and create tables in Hive using the PySpark API.

Created a pipeline for processing structured and unstructured streaming data using Spark Streaming and stored the filtered data in S3 as Parquet files.

Proficient in designing and optimizing data models, schemas, and queries for performance and scalability in Snowflake's cloud data warehouse environment.

Experience in managing Snowflake resources, including warehouses, databases, schemas, and roles, to meet performance, security, and governance requirements.

Skilled in writing and tuning SQL queries, stored procedures, and user-defined functions (UDFs) in Snowflake for data analysis, reporting, and dashboarding purposes.

Familiarity with Snowflake's features for data sharing, data replication, and data masking to enable secure collaboration and compliance with data privacy regulations.

Integrated Kafka with downstream systems (MongoDB, Spark, Snowflake) using Kafka Connect and custom producers/consumers built in Python and Scala.

Ability to implement Snowflake's security controls, including role-based access control (RBAC), multi-factor authentication (MFA), and data encryption, to protect sensitive data assets.

Optimized MongoDB aggregation pipelines and implemented $lookup, $facet, and $group operations to support advanced analytics.
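
An illustrative aggregation of this kind using $lookup and $group via PyMongo; the collections and field names are hypothetical:

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")   # assumed URI
    db = client["sales"]                                # assumed database

    pipeline = [
        # Join each order with its customer document.
        {"$lookup": {
            "from": "customers",
            "localField": "customer_id",
            "foreignField": "_id",
            "as": "customer",
        }},
        {"$unwind": "$customer"},
        # Aggregate order totals per customer region.
        {"$group": {
            "_id": "$customer.region",
            "total_amount": {"$sum": "$amount"},
            "order_count": {"$sum": 1},
        }},
        {"$sort": {"total_amount": -1}},
    ]

    for row in db["orders"].aggregate(pipeline):
        print(row)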

Knowledge of Snowflake's integration capabilities with other cloud services, data sources, and analytics tools, enabling seamless data integration and interoperability across the cloud ecosystem.

Experience in monitoring and optimizing Snowflake performance, including query execution, resource utilization, and storage management, to ensure cost-effective and efficient data processing operations.

Proficiency in using dbt to manage and transform data in cloud data warehouses such as Snowflake, BigQuery, or Redshift, using SQL-based modeling and templating.

Experience in building modular, reusable dbt models to represent business logic and data transformations, promoting code reusability and maintainability.

Skilled in writing SQL queries, macros, and Jinja templating in dbt to define complex transformation logic and data aggregation operations.

Familiarity with dbt's testing framework for validating data quality, consistency, and accuracy across transformation pipelines, ensuring reliable insights for decision-making.

Ability to integrate dbt with version control systems like Git and continuous integration/continuous deployment (CI/CD) pipelines for automated testing and deployment of data models.

Knowledge of dbt's documentation features for documenting data lineage, business logic, and transformation rules, enhancing data governance and collaboration among data teams.

Led dbt implementation for managing data transformation workflows.

Developed and maintained dbt models for defining business logic.

Utilized dbt's version control for managing changes and ensuring reproducibility.

Integrated dbt with Git for collaborative development and version management.

Created reusable SQL code snippets and macros for efficient transformations.

Orchestrated dbt runs using automation tools for timely execution.

Conducted testing and validation of dbt models for data accuracy.

Documented dbt models and transformations for stakeholder understanding.

Built an end-to-end data platform with Snowflake, Matillion, Power BI, Qubole, Databricks, Tableau, Looker, and Python.

Used AWS Glue for data transformation, validation, and cleansing.

Wrote and tuned complex Python, Scala, Spark, and Airflow jobs.

Led the conceptualization and execution of ETL processes to fortify the Enterprise Data Warehouse and Operational Data Store infrastructures, pivotal for driving Business Intelligence initiatives forward.

Engaged in the continuous development, debugging, and maintenance of software applications, tailored to support various business functions, leveraging IBM InfoSphere DataStage and cloud-based ETL tools, and adopting a versatile mix of development platforms and technologies.

Proficiently operated within the Snowflake development environment, executing comprehensive SQL operations and embracing ELT methodologies to maximize operational efficiency.

Actively contributed to the intricate design, development, and deployment phases of complex applications, predominantly relying on IBM InfoSphere Information Server products such as DataStage and QualityStage, alongside proficient utilization of Control-M for streamlined scheduling and orchestration.

Developed multiple Spark jobs in Scala & Python for data cleaning and preprocessing.

Prepared the Technical Specification document for the ETL job development.

Client: EUROFINS IT SOLUTIONS INDIAN PVT.LTD

Role: Sr. Data Engineer Feb 2018 – Jun 2022

Responsibilities:

Worked on developing Azure Data Factory pipelines with various integration runtimes and linked services, and multiple activities such as Copy, Dataflow, Spark, Lookup, Stored Procedure, ForEach, and While loop.

Built optimized QVD pipelines to extract, transform, and load data from Azure SQL and Data Lake into Qlik models

Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, and Azure Data Lake Analytics; ingested data into Azure services such as Azure Data Lake, Azure Storage, Azure SQL, and Azure DW, and processed the data in Azure Databricks.

Integrated Terraform with GitHub Actions for infrastructure-as-code deployments, ensuring environment consistency and auditability across dev, test, and prod

Developed real-time data processing applications using Scala and Python and implemented Apache Spark Streaming with Azure EventHub as the streaming source.

Integrated MongoDB with Apache Spark and Kafka for real-time ingestion and transformation of semi-structured and JSON-based event data.

Used Tableau’s calculated fields and level-of-detail expressions to define KPIs for manufacturing, quality control, and test turnaround times.

Collaborated with full-stack developers, data engineers, and product teams to define MongoDB-backed APIs and optimize data flows.

Designed and implemented SnapLogic data pipelines on AWS Cloud.

Configured and optimized SnapLogic instances for efficient ETL.

Integrated disparate data systems using SnapLogic connectors.

Used Terraform to provision Azure Virtual Networks, Blob Storage, and Linux VMs, supporting data ingestion and compute workloads for analytics platforms.

Developed and maintained Qlik Sense dashboards to visualize operational KPIs for logistics and supply chain data.

Managed real-time and batch data processing workflows in SnapLogic.

Developed modular and reusable Terraform scripts to provision AWS infrastructure components including VPC, EC2, S3, IAM, and EKS for data engineering platforms.

Monitored and troubleshot SnapLogic pipelines in AWS Cloud.

Involved in loading data from Linux file systems, servers, and Java web services using Azure EventHub producers and partitions.

Developed code to read the data stream from Azure EventHub and send it to the respective bolts through their respective streams.

Designed and implemented SSIS packages to extract, transform, and load (ETL) data from various sources into Azure SQL Database, enabling efficient and reliable data integration for the e-commerce platform.

Developed and maintained SSIS packages for extracting, transforming, and loading data into Azure SQL Database from various operational sources.

Implemented parameterized SSIS workflows with error handling and logging for batch data movement in a secure and auditable manner.

Applied Azure EventHub custom encoders for custom input formats to load data into Azure EventHub partitions.


