SHILPI KUMARI
Application & Infrastructure Support Engineer
Dallas–Fort Worth, TX 940-***-**** ******.******.*****@*****.*** linkedin.com/in/shilpikumari007
7+ Years Enterprise Application Support
99.9% System Uptime Maintained
40% Reduction in MTTR Achieved
300+ Incidents Managed Monthly
PROFESSIONAL SUMMARY
Results-driven Application & Infrastructure Support Engineer with 7+ years of experience ensuring high availability and reliability of enterprise-grade applications across telecom, cloud, and hybrid environments. Demonstrated expertise in L2/L3 production support, incident management, monitoring & observability, infrastructure automation, and CI/CD operations. Adept at root cause analysis, SLA-driven service delivery, and cross-functional collaboration. Recognized for reducing downtime by 30%, cutting MTTR by 40%, and delivering 99.9% system uptime for mission-critical platforms serving millions of users. Holds active U.S. work authorization; no sponsorship required.
TECHNICAL SKILLS
Application Support: L2/L3 Production Triage, Log Analysis, Performance Tuning, Troubleshooting
Monitoring & Observability: Splunk, AppDynamics, Prometheus, Dynatrace, Grafana
Cloud Platforms: Microsoft Azure, AWS (EC2, S3, CloudWatch, Lambda)
Infrastructure & OS: Linux/Unix, Windows Server, SSL/TLS, DNS, Load Balancing
Automation & IaC: Ansible, Bash/Shell Scripting, PowerShell, Terraform (basic)
CI/CD & DevOps: Jenkins, Git, GitHub Actions, Salesforce Copado CI, Docker, Release Management
Databases: SQL Server, MySQL, MongoDB, Query Optimization
ITSM & Frameworks: ServiceNow, ITIL v3/v4, Agile/Scrum, JIRA, Confluence
Networking: TCP/IP, HTTP/HTTPS, REST APIs, Firewall Rules, VPN
Soft Skills: Stakeholder Communication, RCA Leadership, Escalation Management, Mentoring
PROFESSIONAL EXPERIENCE
Application & Infrastructure Support Engineer May 2022 – Present
Infosys Limited – Client: T-Mobile Fort Worth, TX
PRODUCTION SUPPORT & INCIDENT MANAGEMENT
Provide 24/7 L2/L3 production support for 15+ mission-critical Java/Spring Boot telecom applications across Azure, AWS, and on-premises hybrid infrastructure, maintaining 99.9% system availability.
Manage and resolve 300+ incidents monthly through ServiceNow with structured triage workflows, achieving 98% SLA compliance and reducing escalation rates by 20%.
Lead root cause analysis for Severity-1/Severity-2 outages using Splunk log correlation, thread dumps, heap analysis, and database query profiling, and implement permanent fixes that have reduced recurring incidents by 25%.
MONITORING, OBSERVABILITY & RELIABILITY
Designed and tuned alerting rules in Splunk, AppDynamics, Prometheus, and Dynatrace, cutting critical incident detection time by 35% and enabling proactive issue identification before customer impact.
Built operational dashboards and health-check monitors to provide real-time visibility into application performance, API response times, error rates, and infrastructure utilization.
Reduced Mean Time to Recovery (MTTR) by 40% through improved runbooks, faster diagnostic workflows, and automated incident correlation.
AUTOMATION, DEPLOYMENTS & INFRASTRUCTURE
Automated routine operational tasks (log rotation, health checks, environment configurations) using Ansible playbooks and Bash scripts, saving 10+ hours/week of manual effort.
Supported application deployments, smoke testing, and release validation in Jenkins CI/CD pipelines, contributing to a 99% successful release rate across 50+ releases per quarter.
Managed SSL certificate lifecycle, server patching, OS upgrades, and application maintenance across Azure VMs and on-premises servers with zero unplanned downtime.
DOCUMENTATION & COLLABORATION
Authored 30+ SOPs, runbooks, and knowledge base articles in Confluence, reducing new team member onboarding time by 40% and enabling faster L1 self-resolution.
Collaborate with development, QA, and infrastructure teams in Agile sprints to prioritize technical debt, drive operational improvements, and ensure seamless handoffs during change windows.
Application Support Analyst (L2/L3) Jan 2016 – Feb 2019
Go Business India New Delhi, India
Delivered L2/L3 application support for enterprise telecommunications platforms serving 1M+ subscribers, resolving 95% of incidents within SLA targets.
Diagnosed and resolved production issues through Linux log analysis (using PuTTY, WinSCP, and Splunk), SQL query debugging, and application stack trace investigation, improving first-call resolution by 30%.
Supported pipeline-based and manual application deployments across dev, staging, and production environments, reducing deployment-related failures by 20%.
Performed regression, functional, and smoke testing for new releases and hotfixes, cutting post-release production defects by 25%.
Managed defect lifecycle, change requests, and operational documentation in ServiceNow and Confluence, ensuring full audit trail compliance.
Collaborated with cross-functional teams in Agile ceremonies (daily standups, sprint reviews, retrospectives) to improve delivery timelines and reduce inter-team dependencies.
CERTIFICATIONS & PROFESSIONAL DEVELOPMENT
ITIL v4 Foundation (In Progress)
Microsoft Azure Fundamentals – AZ-900
AWS Certified Cloud Practitioner
Splunk Core Certified User
EDUCATION
Bachelor of Technology (B.Tech) – Computer Science & Engineering
Gateway Institute of Engineering & Technology, Sonipat, India 2011 – 2015