Satya Srilatha N
Email:***.********@*****.***
Cell: +1-954-***-****
PROFESSIONAL SUMMARY:
● 7.5 years of experience in AWS Cloud, data engineering, Apache Spark, PySpark, Python, SQL, and DevOps tools for pipeline design, development, and maintenance.
● Databricks Certified Associate and AWS Certified Solutions Architect – Associate.
● 6+ years of relevant experience in AWS Glue, Glue Data Catalog, crawlers, EMR, Athena, Lambda, Step Functions, CloudWatch, SNS, EC2, RDS, CloudFormation, S3, Aurora, KMS, Spark Core, Spark SQL, PySpark, Python, Terraform, and Bamboo.
● Skilled in optimizing Big Data workflows with Databricks, Delta Live Tables, and Apache Spark for reliable real-time data processing.
● Knowledgeable in implementing Unity Catalog on the Databricks platform for improved metadata management and organization.
● Extensive experience in the Agile model and a key participant in all phases of the SDLC: analysis, design, development, integration, implementation, debugging, and testing.
● Strong working experience in cloud data migration using AWS and Snowflake.
● Good experience with unifying data platforms using Apache Kafka producers/consumers.
● Good knowledge of Kafka for streaming real-time feeds from external applications into Kafka topics.
● Built real-time data pipelines by developing Kafka producers and Spark Streaming applications for consumption.
● Expert in Git for version control and Jenkins and Bamboo for CI/CD, enhancing team productivity and project management in agile environments.
● Proficient with JIRA for managing data engineering projects and ensuring effective team communication.
● Deployed and managed systems on cloud computing platforms, primarily Amazon Web Services (AWS) and Azure Cloud Services.
● Strong knowledge of configuring and building infrastructure on the Amazon Web Services (AWS) cloud platform for high availability and fault tolerance using services such as EC2, S3, EBS, VPC, SNS, SES, RDS, IAM, Route 53, ELB, Auto Scaling, CodeDeploy, and Lambda.
● Strong knowledge of monitoring across heterogeneous operating systems, as well as the integration of specific processes, using tools such as Nagios and AWS CloudWatch.
● Good knowledge of other AWS services such as CloudFront, Systems Manager, and the EBS lifecycle manager.
● Managed diverse Linux and Windows systems, handling installation, troubleshooting, and required configurations.
● Enjoy new challenges and am willing to take on extra responsibilities to get work done.
● Highly adaptive to any type of work environment and technology.
● Good problem resolution and communication skills with the ability to work in a highly visible role.
EDUCATION:
Bachelor’s in Information Technology from JNTUK, 2014.
TECHNICAL SKILLS:
Data Processing and Analysis: Apache Spark, Databricks, Apache Kafka, Snowflake, Hadoop, AWS Glue, AWS EMR, AWS Lambda
Cloud Technologies: Amazon Web Services (AWS), Microsoft Azure
Data Storage and Management: Azure Data Lake Storage (ADLS), Azure Blob Storage, Azure Synapse, Amazon S3, Unity Catalog, Auto Loader
Data Warehousing: Azure Synapse Analytics, AWS Redshift
Data Orchestration: AWS Glue, Azure Data Factory, Snowpipe, Delta Live Tables pipelines, Databricks Workflows
Programming Languages: Python, PySpark, Spark SQL, SQL
NoSQL Database: DynamoDB
Databases: MySQL, SQL Server, Oracle, Snowflake
Version Control and Methodology: Git, Agile
WORK EXPERIENCE:
Project Name : Advisor Insights
Client : Ameriprise Financial.
Environment : AWS, Azure Databricks, Delta Live Tables, Auto Loader, Unity Catalog, Snowflake, DynamoDB, Glue, Lambda, CloudWatch, RDS, SNS, S3, Python, PySpark, Spark SQL, Git.
Description:
Generate and publish Advisor Insights to provide financial recommendations based on current market conditions, client goals, and priorities.
Roles & Responsibilities:
● Ingested data from AWS S3 using the Databricks Auto Loader feature for efficient incremental ingestion (see the sketch after this list).
● Structured and enriched the data with business rules to generate meaningful insights.
● Developed ETL pipelines to migrate and organize data related to client portfolios, investments, and advisors into Databricks.
● Built data pipelines using Apache Spark on Databricks, Python, PySpark, and AWS cloud services (S3, Athena, Glue Data Catalog).
● Implemented Snowflake Streams to monitor changes in customer portfolios, triggering updates that map source system data elements to the target system, and implemented data validations to ensure data integrity and data quality.
● Used Spark Streaming on Databricks for continuous updates, ensuring that insights on advisors and investments remained timely and relevant.
● Managed datasets in Snowflake with micro-partitioning and clustering, enhancing query performance for advisor insights data, with deduplication strategies to minimize storage costs.
● Configured Snowpipe to enable continuous ingestion from AWS S3 into Snowflake, providing near real-time access to updated insights data, essential for analyzing advisor effectiveness as data transitioned to Databricks.
● Established CI/CD pipelines with Azure DevOps and Git to streamline deployments of data transformations across Databricks, Snowflake, and AWS, supporting regular updates for insights data workflows.
● Enforced robust security and governance protocols using Unity Catalog in Databricks, implementing role-based access control and regular audits to safeguard sensitive client portfolio data.
● Promoted a culture of data quality and governance using Unity Catalog, ensuring data quality and integrity across all layers of the Delta Live Tables architecture.
● Generated and loaded business insights data into DynamoDB for downstream teams to use in their dashboards.
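A minimal sketch of the Auto Loader ingestion pattern referenced in the first bullet above. It is illustrative only: the S3 paths and the Unity Catalog table name are assumptions, and the spark session is the one Databricks provides at runtime.

# Minimal Auto Loader sketch (paths and table names are illustrative assumptions).
from pyspark.sql import functions as F

raw_df = (
    spark.readStream
        .format("cloudFiles")                                            # Databricks Auto Loader
        .option("cloudFiles.format", "json")                             # assumed source format
        .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/advisor")
        .load("s3://example-bucket/landing/advisor/")
)

# Light enrichment with an ingestion timestamp before writing to a Delta table.
enriched_df = raw_df.withColumn("ingest_ts", F.current_timestamp())

(
    enriched_df.writeStream
        .format("delta")
        .option("checkpointLocation", "s3://example-bucket/_checkpoints/advisor")
        .trigger(availableNow=True)                                      # incremental, batch-style run
        .toTable("main.bronze.advisor_raw")                              # assumed Unity Catalog table
)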
Project Name : Clinical Rules Engine
Client : Anthem, USA.
Environment : Glue, Lambda, Step Functions, CloudWatch, Athena, RDS, SNS, S3, IAM, Secrets Manager, Terraform, Bamboo, Bitbucket, Python, PySpark, Hue, Presto.
Role : AWS Data Engineer.
Description:
As a Data Engineer, responsible for developing and deploying defect-free ETL code on the AWS cloud platform, designing ETL pipelines, and deploying them to all higher environments using fully native AWS components. AWS Lambda and AWS Step Functions are used to schedule and orchestrate the workflow, S3 serves as the data lake, and Athena is the query layer for the current project.
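A minimal sketch of the Lambda trigger that starts a Step Functions workflow of the kind described above. The boto3 calls are standard, but the environment variable name and event shape are illustrative assumptions, not the project's actual configuration.

# Hypothetical Lambda handler that starts the ETL state machine (names are assumptions).
import json
import os

import boto3

sfn = boto3.client("stepfunctions")

def lambda_handler(event, context):
    # State machine ARN supplied via an environment variable (illustrative name).
    state_machine_arn = os.environ["ETL_STATE_MACHINE_ARN"]

    # Pass the triggering event details through as the workflow input.
    response = sfn.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps({"detail": event.get("detail", {})}),
    )
    return {"executionArn": response["executionArn"]}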
Roles and Responsibilities:
● Created Glue ETL jobs using PySpark to perform ETL on healthcare data (a skeleton appears after this list).
● Created Lambda functions for S3 data copy between accounts, data deletion, audits, file operations, and disaster recovery (DR).
● Involved in developing real-time streaming applications using PySpark, Apache Flink, Kafka, and Hadoop clusters.
● Developed ETL pipelines in and out of the data warehouse using a combination of Python and Snowflake SnowSQL, writing SQL queries against Snowflake.
● Created source and target tables in the Athena data warehouse using Glue crawlers.
● Created Step Functions state machines to execute the job flow.
● Created S3 bucket policies and KMS keys for encryption.
● Used AWS Secrets Manager to store database usernames and passwords.
● Created CloudWatch rules to schedule jobs.
● Worked on RDS Aurora and Oracle.
● Implemented Kafka custom encoders for custom input formats to load data into Kafka partitions.
● Created Terraform templates to deploy code through CI/CD.
● Automated required configurations using Terraform.
● Set up processes, services, and tooling around the cloud, leveraging appropriate AWS services.
● Validated the environment against all security and compliance controls.
● Deployed, automated, and maintained AWS cloud-based production systems to ensure the availability, performance, scalability, and security of production systems.
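A skeleton of a Glue PySpark job of the kind described in the first bullet above. The Glue Data Catalog database, table, filter column, and output bucket are assumed, illustrative names only.

# Minimal AWS Glue PySpark job skeleton (catalog, table, and bucket names are assumptions).
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a source table registered by a Glue crawler (database/table are illustrative).
source_dyf = glue_context.create_dynamic_frame.from_catalog(
    database="healthcare_raw", table_name="claims"
)

# Example transformation: keep only approved claims (column name is assumed).
claims_df = source_dyf.toDF().filter("claim_status = 'APPROVED'")

# Write curated output back to S3 as Parquet for querying via Athena.
claims_df.write.mode("overwrite").parquet("s3://example-curated-bucket/claims/approved/")

job.commit()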
Roles & Responsibilities:
The project is based on provisioning multiple applications hosted in AWS & Azure, which includes:
● Deploying Infrastructure for all applications in AWS & Azure as per the Cloud Reference Architecture.
● Provisioned infrastructure through CloudFormation and Terraform.
● Configured Application and Classic load balancers.
● Set up VPC peering between different accounts.
● Configured internet gateways, NAT gateways, route tables, and NACLs.
● Configured custom CloudWatch metrics for monitoring memory and disk utilization using the SSM Agent.
● Configured AWS Config rules to assess, audit, and evaluate configuration changes on the AWS account.
● Migrated on-premises VMs to the AWS cloud.
● Supported clusters and topics in Kafka Manager; CloudFormation scripting, security, and resource automation.
● Automated stop and start of AWS EC2 instances and Azure VMs using Lambda functions and Azure runbooks (see the sketch after this list).
● Cloned instances using the AWS Console.
● Set up Auto Scaling for web server instances using launch configurations and Auto Scaling groups.
● Scheduled daily EC2 and EBS backups and cleanup using the snapshot lifecycle manager.
● Enabled the GuardDuty service to monitor the AWS account for malicious attacks and unauthorized behavior, with CloudWatch alerts configured to trigger notifications for findings.
● Cleaned up old AMIs and EBS snapshots using bash scripting.
● Wrote PowerShell scripts to upload server log files to S3 as backups.
● Restored backup files and folders using EBS snapshots.
● Used AWS policy generator to write policies for IAM and S3.
● Prepared cost optimization reports using the AWS right-sizing feature.
● Monitored Trusted Advisor reports in AWS to reduce the cost of unwanted resources, increase performance, and improve security by optimizing the AWS environment.
● Monitored the infrastructure via Nagios and CloudWatch.
● Created and mounted EFS on multiple servers to keep data synchronized across servers.
● Deployed VMs in Azure accounts based on requests.
● Configured Application Gateway for application servers in the Azure environment.
● Set up a centralized Trend Micro antivirus service to secure instances hosted in AWS.
● Created RDS instances, took snapshots, and restored backups during major deployments.
● Performed data copy-back activities on RDS instances.
● Configured SNS and SES and used them for CloudWatch notifications and email services.
● Set up a Guacamole bastion server providing access to hosts in the VPC for both Windows desktops (RDP) and Linux terminals (SSH).
● Implemented OS patching automation using AWS Systems Manager.
● Mounted FSx file systems for Windows-based systems and used them as network drives.
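A sketch of the scheduled Lambda used to stop tagged EC2 instances, as referenced in the automation bullet above. The AutoStop tag key and the return shape are assumptions for illustration; pagination is omitted for brevity.

# Illustrative Lambda to stop tagged EC2 instances on a schedule (tag key/value are assumptions).
import boto3

ec2 = boto3.client("ec2")

def lambda_handler(event, context):
    # Find running instances carrying the (assumed) scheduling tag.
    reservations = ec2.describe_instances(
        Filters=[
            {"Name": "tag:AutoStop", "Values": ["true"]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )["Reservations"]

    instance_ids = [
        inst["InstanceId"]
        for res in reservations
        for inst in res["Instances"]
    ]

    # Stop whatever matched; the response summarizes what was acted on.
    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)
    return {"stopped": instance_ids}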
ACHIEVEMENTS:
● Wall of Fame award for successful delivery and support of MIaaS at Cognizant.
● Impact Award for the successful implementation and development of an end-to-end pipeline for Medical Claims at Legato (2021).
● Impact Award for implementing 25 parallel job runs and developing reusable code for Rule Results at Legato (2020).