Karthik Reddy Yerreddu
New Hampshire, USA ********************@*****.*** +1-303-***-****
PROFESSIONAL SUMMARY
Cloud Data Engineer with 4 years of experience architecting cloud infrastructure, developing scalable ETL pipelines, and automating operations using Python, Shell, and Terraform. Proficient in deploying high-availability systems on AWS, Azure, GCP, and Kubernetes. Skilled in implementing CI/CD pipelines to accelerate development and ensure reliable infrastructure delivery. Proven ability to optimize performance, streamline workflows, and support real-time, data-driven decision-making.
SKILLS
Cloud: AWS (EC2, S3, RDS, Redshift, Lambda, IAM, CloudFormation, Terraform, EKS, DynamoDB, QuickSight), Azure (VMs, AKS, Blob Storage, Azure Policy, Cognitive Services), GCP (compute, storage, networking, security, integration and analytics, and monitoring services), Oracle.
Programming/Scripting: Python, Golang, Java, PySpark, Shell Scripting, Spring Framework, Apache Spark, PowerShell.
Application Monitoring: Splunk, Nagios, Datadog, AWS CloudWatch.
ETL: Informatica PowerCenter 10.1.0/9.6.1/9.5.1/9.1/8.6/8.5/8.1/7.1, Snowflake, Databricks.
Databases: MySQL, DynamoDB, SQL.
Testing Tools: Quality Center, Mercury TestDirector, UAT, JUnit, SonarQube.
Incident Management Tools: ServiceNow, Jira.
Operating Systems: Linux, Windows.
Version Control: Git, GitHub, GitLab.
CI/CD & Automation: Chef, Docker, Ansible, Jenkins, Maven, Kubernetes, Helm, Terraform.
PROFESSIONAL EXPERIENCE
Cloud Data Engineer, Logan Data Inc, USA May 2023 – Present
Architected and optimized real-time big data pipelines using PySpark and Databricks, enabling millisecond-level processing of high-volume transactional data and improving business intelligence efficiency by 35%.
Designed and implemented serverless data infrastructure using AWS Lambda, Google Cloud Functions, and Terraform, automating data transformations and reducing operational costs by 25% while enhancing query performance for ad-hoc reporting.
Engineered scalable data warehouse models and schemas, standardizing metadata, data lineage, and cataloging strategies to improve data accessibility for 200+ analysts and business users across global teams.
Optimized SQL and NoSQL database operations across PostgreSQL, MongoDB, and Cassandra by implementing indexing, sharding, and query tuning strategies, improving data retrieval speed by 60% and supporting analytics for 200+ stakeholders.
Implemented CI/CD pipelines in Azure DevOps to automate build, test, and deployment processes for microservices and front-end applications.
Developed Ant scripts to automate EAR deployments and WebSphere server configuration for J2EE applications.
Automated deployments using Jenkins pipelines integrated with Ansible and Docker.
Integrated SonarQube into CI pipelines for continuous code quality analysis, enabling early detection of bugs, vulnerabilities, and technical debt.
Developed and maintained Ansible playbooks for consistent configuration management and automated deployments across Linux-based EC2 instances and container workloads.
Led the migration of legacy databases to Azure Synapse and BigQuery, reducing latency by 45% and accelerating reporting.
Implemented proactive monitoring with Prometheus and Grafana, reducing pipeline failures by 60% and ensuring 99.8% data availability.
Developed a Golden Image pipeline using GitLab CI/CD and Packer on AWS for secure AMI creation.
Implemented Terraform & AWS Auto Scaling for streamlined AMI distribution across multiple accounts.
Integrated OIDC authentication via Azure AD for secure AWS access, eliminating long-lived credentials and strengthening access security.
Integrated AI-driven anomaly detection models into financial and operational data pipelines, leveraging TensorFlow and Scikit-learn to identify fraudulent transactions with 30% higher accuracy, mitigating financial risks and improving data security.
Collaborated with development, QA, and data science teams to deliver container-ready ML workloads and microservices into AWS-managed infrastructure using GitOps principles.
Participated in a 24/7 on-call rotation for critical production environments, improving incident response time by 40%.
Cloud Support Engineer, Logan Data Inc, USA Mar 2021 – Apr 2022
Supported AWS cloud infrastructure, ensuring system reliability and proactively resolving issues to achieve a 95% SLA compliance rate.
Gained hands-on exposure to CI/CD practices and tools like Jenkins, Git, Maven, Docker, and basic scripting (Bash, Python).
Assisted in setting up and maintaining automated build and release pipelines using tools like Jenkins, Git, and Maven to streamline the software development lifecycle.
Supported source code versioning, branch management, and merge conflict resolution under the guidance of senior engineers using Git and GitHub/GitLab.
Reduced downtime by 20% through proactive monitoring and alerting with AWS CloudWatch and Splunk, increasing system availability.
Deployed and managed cloud-based applications using Terraform and CloudFormation, ensuring scalable and reliable infrastructure deployments.
Performed root cause analysis (RCA) to resolve recurring issues, reducing incidents by 30% and improving system stability.
Automated common infrastructure tasks with Python and Shell scripting, increasing operational efficiency by 25%.
Managed MySQL and DynamoDB database operations, including querying, backups, and troubleshooting, to ensure data performance and reliability.
Documented cloud infrastructure changes and troubleshooting procedures, improving team knowledge sharing and shortening resolution times.
Assisted in migrating on-premises applications to AWS, improving performance and scalability while reducing costs.
ACADEMIC PROJECTS
1. Intelligent Traffic Control System
Created a cloud-based solution with AWS IoT Core, Lambda, and DynamoDB to improve traffic flow.
Reduced manual effort by 35% by configuring AWS CloudWatch for automated alerts and real-time monitoring.
Lowered traffic wait times by 25% and processed data in real time with latency under 1.5 seconds.
2. AI Chatbot Development using AWS
Built an intelligent chatbot using Amazon Lex for NLP, AWS Lambda for backend logic, and DynamoDB for state management.
Integrated S3 and CloudFront for static asset hosting, ensuring low-latency global delivery.
Implemented CloudWatch monitoring, error handling, and security best practices for high availability and scalability.
3. Data Analysis & Visualization with AWS Redshift and QuickSight for On-Premises Integration
Project Summary:
Led a data analysis and visualization initiative for a client with extensive data residing on-premises. The objective was to migrate, secure, transform, and visualize key business data to support strategic decision-making.
Responsibilities:
On-Premises to Cloud Integration:
Established a secure connection between the client’s on-premises SQL Server and AWS Redshift using AWS Database Migration Service (DMS) and AWS VPN/VPC peering, ensuring encrypted data transfer.
ETL Pipeline Development:
Designed and implemented robust ETL workflows using AWS Glue and Python scripts to cleanse, normalize, and load data into Redshift for analytical processing.
Data Storage & Security:
Utilized Amazon S3 as a staging layer for intermediate data storage. Implemented IAM roles, bucket policies, and KMS encryption to ensure data confidentiality and compliance.
Data Modeling & Optimization:
Created optimized schema designs in Redshift (e.g., star schema) to improve query performance. Used Redshift Spectrum for querying semi-structured data directly from S3.
Dashboard & Visualization:
Built interactive and dynamic dashboards in Amazon QuickSight, integrating KPIs and business metrics across the finance, operations, and sales departments.
Business Impact:
Delivered actionable insights through dashboards that enabled the client to make data-driven decisions, leading to a ~25% increase in operational efficiency and better inventory forecasting.
Collaboration & Delivery:
Coordinated with cross-functional teams (Data Analysts, DBAs, and Business Stakeholders) to finalize reporting requirements and ensure accurate visualization delivery.
Environment: AWS Redshift, AWS QuickSight, AWS DMS, AWS Glue, S3, IAM, VPC, AWS Data Pipeline, Python.
CERTIFICATIONS
AWS CERTIFIED CLOUD PRACTITIONER
Credential ID: AWS03706201
AWS CERTIFIED DATA ENGINEER
Credential ID: 078eefdbeac8405c841a11ae87c9d4cf
AWS Certified Solutions Architect – Associate (In Progress)
SnowPro Core Certified
Snowflake Badge 1: Data Warehousing Workshop
EDUCATION
Master of Science in Information Systems 2022-2024 University of Colorado, Denver, CO.
Bachelor of Technology, Computer Science and Engineering 2018-2022 Gokaraju Rangaraju Institute of Engineering and Technology, India.