SUMMARY
• Software Engineer with *+ years of experience in building and supporting scalable cloud platforms, Linux based systems, and containerized environments with strong expertise in observability, monitoring, and system reliability engineering.
• Strong experience in Kubernetes, Docker, and Elasticsearch cluster management, including configuration, scaling, and performance tuning for high availability distributed systems.
• Skilled in implementing observability solutions using metrics, logging, monitoring, and distributed tracing tools such as Prometheus, Grafana, Kibana, and OpenTelemetry.
• Proven ability to troubleshoot complex infrastructure issues, optimize system performance, and improve operational efficiency through automation and collaboration with SRE and platform engineering teams.
TECHNICAL SKILLS
• Platforms: Linux, Kubernetes, Docker
• Observability Tools: Prometheus, Grafana, Kibana, Splunk, Dynatrace, Jaeger
• Logging and Monitoring: Elasticsearch, distributed tracing, metrics, logging
• Networking: TCP/IP, DNS, load balancing
• Messaging: Kafka
• Cloud Platforms: AWS
• DevOps: CI/CD, automation, Git
• Concepts: system reliability, performance tuning, high availability, incident management
• Methodologies: Agile, ITSM, incident management, change management
EXPERIENCE
Citizens Financial Group Dallas TX Jun 2024 - Present
Software Engineer
• Managed and supported Linux based infrastructure and Kubernetes container platforms ensuring high availability, scalability, and reliability of enterprise applications across distributed environments.
• Administered and optimized large scale Elasticsearch clusters including configuration tuning, indexing strategies, and query performance improvements to support high volume logging and monitoring systems.
• Implemented observability solutions using Prometheus, Grafana, and Kibana enabling real time monitoring, metrics visualization, and proactive system health analysis across platform services.
• Designed and integrated distributed tracing solutions using OpenTelemetry and Jaeger to improve visibility into microservices communication and reduce mean time to resolution for production issues.
• Performed deep dive troubleshooting across infrastructure layers including containers, Kubernetes clusters, networking components, and application services to identify root causes and resolve performance bottlenecks.
• Applied networking concepts including TCP/IP, DNS, and load balancing to diagnose connectivity issues and optimize service communication across distributed systems.
• Collaborated with SRE and platform engineering teams to enhance observability practices, improve monitoring coverage, and strengthen system reliability.
• Supported incident, change, and problem management processes by responding to production issues, documenting root causes, and implementing preventive improvements.
• Automated operational workflows and monitoring processes improving system efficiency, reducing manual intervention, and increasing platform stability.
• Assisted in deployment, upgrade, and scaling activities for Kubernetes based environments ensuring seamless rollout of platform updates and infrastructure changes.
• Monitored system performance and capacity metrics using observability tools to proactively identify risks and implement performance optimization strategies.
• Maintained operational governance and documentation standards ensuring compliance with enterprise reliability and system management practices.
UnitedHealth Group Dallas TX Jul 2023 - Jun 2024
Software Engineer
• Supported Linux based systems and containerized environments using Docker and Kubernetes to maintain reliable and scalable infrastructure for enterprise applications.
• Configured and maintained Elasticsearch clusters for centralized logging and monitoring ensuring efficient data indexing, search performance, and system observability.
• Implemented monitoring and alerting solutions using Prometheus and Grafana enabling real time visibility into system performance and application health.
• Assisted in troubleshooting infrastructure and application issues by analyzing logs, metrics, and traces to identify root causes and implement fixes.
• Applied networking fundamentals including DNS resolution, TCP/IP communication, and load balancing to diagnose and resolve system level issues.
• Collaborated with engineering teams to improve observability coverage, enhance monitoring dashboards, and optimize system performance.
• Participated in incident management processes by responding to alerts, investigating issues, and supporting resolution efforts to maintain system uptime.
• Supported deployment and configuration of containerized applications ensuring smooth integration with Kubernetes environments.
• Contributed to automation efforts by developing scripts and workflows to streamline monitoring, deployment, and operational tasks.
• Worked within Agile teams to deliver infrastructure improvements and ensure alignment with operational and business requirements.
TCS Chennai TN Jun 2021 - Aug 2022
Java Full Stack Developer
• Designed and developed enterprise scale Java applications using Core Java, Java 8, Spring Boot, Spring MVC, and Hibernate within multi module systems.
• Architected and implemented responsive single page applications using Angular, HTML5, CSS3, and JavaScript for enterprise user groups.
• Developed and integrated RESTful microservices using Spring Boot and JPA to support distributed system communication and business workflows.
• Optimized database performance by implementing efficient data access strategies using Hibernate, JPA, Oracle, and MySQL for high volume transactions.
• Deployed and supported production workloads on AWS using EC2, S3, RDS, IAM, and Load Balancers following security and availability best practices.
• Applied Agile Scrum practices and implemented unit testing using JUnit and Mockito, improving code quality and release stability.
Apollo Hospitals Chennai TN Jan 2020 - Jun 2021
Java Developer
• Developed Java based web modules for hospital administration systems using Core Java, Spring MVC, and Hibernate, supporting day to day clinical and operational workflows.
• Implemented user interface components using HTML5, CSS3, JavaScript, and basic React to digitize manual processes and improve usability for internal users.
• Developed backend REST APIs using Spring Boot to enable secure communication between front end applications and core hospital systems.
• Implemented database queries and persistence logic using JDBC, Hibernate, Oracle, and MySQL to manage patient, billing, and reporting data efficiently.
• Supported application deployment and environment management on AWS using EC2, S3, RDS, and IAM under senior team guidance.
• Performed unit testing and production issue resolution using JUnit and Agile practices, reducing post release defects by approximately 30 percent.
EDUCATION
Master of Science - Computer Science
Texas Tech University Aug 2022 - May 2024
CERTIFICATIONS
AWS Certified Solutions Architect - Associate