Sr. Observability Engineer, Sr. Splunk Engineer, Monitoring Engineer

Location:

United States

Salary:

$70

Posted:

April 13, 2026

Resume:

Roja Chekuri

*************@*****.***

972-***-****

Summary

12+ years of experience in IT Operations, Observability, and Monitoring Engineering, with expertise in Splunk and Dynatrace.

Hands-on expertise with multiple monitoring and automation frameworks such as Splunk, Dynatrace, ITSI, AppDynamics, Apica, ProteanJ, Ansible, BigPanda, Grafana, and Cribl.

Extensive knowledge of tiered Splunk installations: indexers, intermediate and heavy forwarders, search heads, universal forwarders (UFs), and apps.

Responsible for designing, developing, testing, troubleshooting, deploying, and maintaining Splunk solutions, reporting, alerting and dashboards.

Developed and maintained Ansible playbooks and roles to automate management tasks.

Managed Ansible inventories and dynamic inventory sources to automate Cribl pipelines.

Experience in Splunk search construction, with the ability to create well-structured search queries that minimize performance impact.

Integrated Splunk with ServiceNow to automate incident creation.

Experience with Splunk Searching and Reporting modules, Knowledge Objects, Administration, Add-ons, Dashboards, Clustering, and Forwarder Management.

Created data retention policies and performed index administration, maintenance, and optimization for Splunk.

Accelerated reporting using report acceleration, data model acceleration, and summary indexing.

Configured and customized existing and new systems and applications on Splunk Cloud.

As a primary Splunk engineer, supported forwarders across operating systems including Linux and Windows.

Managed indexes and indexer clusters, the Splunk Web Framework, data models, and pivot tables.

Managed and supported Cribl Stream deployments with change records ensuring smooth operation and optimization of data pipelines.

Enabled logging in AWS CloudWatch and created CloudWatch dashboards and alarms.

Hands-on experience customizing Splunk apps and dashboards and building advanced visualizations, configurations, reports, and search capabilities.

Knowledge of Splunk ITSI glass tables, deep dives, KPIs, and ITSI modules.

Experience with administration, architecture, configuration and upgrades for distributed Splunk environments on Linux / Windows systems.

Responsible for translating business case analysis into functional requirements.

Worked on several styling and text customizations for different components of the Splunk platform using JavaScript and CSS.

Worked on Splunk DB Connect to import and index data from databases.

Experience developing Splunk queries and dashboards for understanding application performance, analyzing capacity, and troubleshooting performance issues with Splunk searches.

Standardized Splunk agent deployment, configuration and maintenance across a variety of UNIX and Windows platforms.

Provided 24/7 on-call production support; strong team player with strong analytical skills.

Expert in Business/Technical document preparation with MS-Office, Visio, MS-Project.

Technical Skills:

Splunk Modules: Splunk 9.x/8.x/7.x/6.x/5.x, Splunk Enterprise, Splunk IT Service Intelligence, Machine Learning Toolkit, Splunk DB Connect, Splunk Cloud, Splunk on Splunk, Splunk Web Framework.

ITIL ITSM 5.5/6.0/7.x, ITIL, ServiceNow and LDAP

Scripting Languages: JavaScript, Shell Script, Python

Web Technologies: HTML, CSS, XML, JavaScript, AngularJS, Bootstrap

Tools/IDE: Eclipse 3.0, Sublime, Textpad, Maven

App Monitoring Tools: Splunk, CA Wily, Netcool, Dynatrace, Apica, Grafana, Jenkins, SCOM, AppDynamics

SDLC Methodologies: Waterfall, Agile, Scrum

Certifications:

Splunk Admin, LICENSE #: Cert-191365

Splunk Power User, LICENSE #: Cert-148206

Splunk User, LICENSE #: Cert-171927

Splunk Power User, LICENSE #: Cert-108029

Cribl Observability Stream Admin Cert-56626301

Cribl Observability Stream User

Education:

Master’s degree in Engineering, San Francisco Bay University, California, USA

Work Experience:

Ally Financial, Detroit, MI Aug 17 – Present

Role: Observability Engineer/Monitoring Tech Lead/Splunk SME

Responsibilities:

Worked with the Monitoring Support team to implement, troubleshoot, and resolve incidents and Jira stories.

Created monitoring dashboards in Dynatrace to detect application failures and track API call traffic.

Created and implemented change requests in the psp and prod environments.

Implemented and supported GitLab CI/CD pipelines for deployment processes.

Monitored and troubleshot AWS Glue job failures by leveraging Dynatrace for root cause analysis.

Investigated AWS Glue job execution failures by correlating job logs with Dynatrace.

Created and maintained runbook documentation for application health checks, pipeline failures, etc.

Managed the migration of applications from Splunk and onboarded them into Dynatrace.

Collaborated with application stakeholders to assess Splunk knowledge objects and support migration to modern observability tools such as Dynatrace.

Facilitated technical training and created documentation on Splunk, Dynatrace, Cribl, Terraform, and new monitoring frameworks.

Developed multiple proof-of-concept dashboards for IT operations and service owners to monitor application and server performance.

Prepared best practices and created runbooks to standardize operational processes.

Expertise with Splunk Searching, Analyzing and Knowledge Objects, Administration, Add-ons, etc.

Supported onboarding of multiple enterprise applications into Splunk, ensuring accurate data ingestion and visualization.

Performed application health checks using tools such as AppDynamics, Splunk, ScienceLogic, Apica, Jenkins/ProteanJ, Bitbucket, Grafana, CloudWatch, Elasticsearch, Dynatrace, and Terraform.

Developed and maintained Ansible playbooks and roles to automate configuration management tasks.

Managed and supported Cribl Stream deployments with change records ensuring smooth operation and optimization of data pipelines.

Managed Ansible inventories and dynamic inventory sources to automate Cribl pipelines.

Utilized Ansible and developed playbooks for task automation, streamlining operational processes and ensuring consistency across deployments.

Used Terraform to build applications in AWS and manage multiple environments.

Automated infrastructure deployment and configuration tasks using Terraform modules and templates.

Integrated Splunk with ServiceNow to automate incident creation.

Integrated Splunk with CloudWatch to monitor Splunk instance health in the cloud.

Enabled logging in AWS CloudWatch and created CloudWatch dashboards and alarms.

Designed and maintained web environments on AWS using services such as EC2 and ELB.

Experience using monitoring solutions such as CloudWatch, ELK, Grafana, and Dynatrace.

Experience in Splunk search construction, with the ability to create well-structured search queries that minimize performance impact.

Managed Splunk users and created new roles and authentication configurations.

Installed and onboarded Splunk apps and add-ons and documented the processes in Confluence.

Monitored the Monitoring Console (MC) and responded to system health alerts to investigate Splunk performance and resource usage.

Troubleshot Splunk platform and forwarder issues, performed root cause analysis, and ensured platform health, availability, and capacity planning.

Guided internal teams in dashboard development, creating production-quality dashboards, documenting best practices, and maintaining the Splunk runbook documentation.

Used Splunk REST API for automation and configuration, created scripts and macros for saved searches, and contributed to the development of automated maintenance routines.

Responsible for analyzing problems and providing resolutions or fixes for production issues related to the Splunk platform within the Service Level Agreement.

Provided 24/7 on-call production support as part of the monitoring team.

Defined KPIs, glass tables, KPI alerts, and KPI base searches for Splunk ITSI.

Created and managed Splunk DB Connect identities, database connections, database inputs, outputs, lookups, and access controls.

Environment: Splunk 9.x/8.x/7.x, Splunk ITSI, AWS, ELK, Cribl, Python, Git, Jenkins, OpenShift, CSS, JavaScript, ServiceNow, Nexus, LDAP, Splunk DB Connect, AppDynamics, Apica, ScienceLogic, Shell, GitHub, Maven, SharePoint, Confluence, Bitbucket, Terraform, Grafana, Prometheus, Dynatrace.

JPMorgan Chase, Houston, TX Aug 16 – July 17

Role: Splunk SME/Engineer

Responsibilities:

Onboarded and supported various enterprise applications in Splunk Command Center for centralized monitoring and coordinated with application and system owners to ensure complete and accurate logging.

Designed and developed advanced Splunk dashboards, alerts, reports, and visualizations, including ITSI glass tables, deep dives, and service health views for proactive operations monitoring.

Expertise in Installation, Configuration, Migration, Troubleshooting and Maintenance of Splunk.

Experience in Splunk search construction, with the ability to create well-structured search queries that minimize performance impact.

Configured and managed Splunk DB Connect, setting up database connections, inputs, lookups, and monitoring database health via custom dashboards.

Troubleshot Splunk platform and forwarder issues, supported root cause analysis, and ensured platform health, availability, and capacity planning.

Assisted internal Splunk users in designing and maintaining production-quality dashboards.

Coordinated with application and system owners to onboard applications into Splunk and Dynatrace and ensure logging capabilities were functional.

Used Splunk REST API for automation and configuration, created scripts and macros for saved searches, and contributed to the development of automated maintenance routines.

Created many proof-of-concept dashboards for IT operations and service owners to monitor application and server health.

Responsible for analyzing problems and providing resolutions or fixes for production issues related to the Splunk platform within the Service Level Agreement.

Integrated Splunk with third-party tools such as AppDynamics, Wily Introscope, Dynatrace, and Netcool, and linked alerts to incident systems like PageOut, React, and HP Service Center.

Environment: Splunk 6.5.2/6.3, Machine Learning Toolkit, Splunk ITSI, CSS, JavaScript, Python scripting, Netcool, CA Wily, Dynatrace, ServiceNow, LDAP, Splunk DB Connect, Shell, SharePoint Site, MyAppProfile.

Dell Inc, Round Rock, TX Aug 15 – Aug 16

Role: Splunk Consultant/Engineer

Responsibilities:

Designed, developed, and implemented data visualization functionality for Splunk to be used in conjunction with machine data.

Created and managed Splunk DB Connect identities, database connections, database inputs, outputs, lookups, and access controls.

Designed and maintained Splunk data models and used report acceleration, data model acceleration, and summary indexing to speed up pivot tables and charts for long-running queries.

Set up dashboards for senior management and production support teams that needed to use Splunk.

Optimized searches for better performance, including choosing between search-time and index-time field extraction.

Experience with Splunk Searching and Reporting modules, Knowledge Objects, Administration, Add-ons, Dashboards, Clustering, and Forwarder Management.

Worked on several styling and text customizations for different components of the Splunk platform using JavaScript and CSS.

Customized Splunk Web to override the default behavior on user actions such as click, change, or mouse-over events, replacing it with custom JavaScript handlers.

Configured Splunk multisite indexer cluster such as Golden Gate for data replication.

Used the KV Store to perform create, read, update, and delete (CRUD) operations on individual records through the Splunk REST API, and ran lookups against the data collection using the Splunk search language.

Environment: Splunk 6.3, Splunk DB Connect, Tomcat 7.x, CSS, JavaScript, F5 BIG-IP Load Balancers, Apache HTTP server 2.4, LDAP, JDBC, JDK 1.7, J2EE, JMS, XML, MySQL, Oracle 11g, RedHat Linux 6.x, Solaris 10, GitHub.

Acute Soft Solutions Ind Pvt Ltd May 13 – April 14

Role: Software Engineer

Responsibilities:

Worked for multiple clients to support healthcare and banking projects.

Experience with Searching and Reporting modules, Knowledge Objects, Administration, Add-ons, Dashboards, Clustering, and Forwarder Management.

Designed, developed, and implemented data visualization functionality for Splunk to be used in conjunction with machine data.

Handled configuration, maintenance, and deployments supporting Fedwire and SWIFT banking applications.

Upgraded the app per application team requirements and handled performance analysis, tuning, and management.

Worked on frontend development using HTML, JavaScript, CSS, and AngularJS.

Created web pages from scratch using AngularJS.

Created dashboards for Performance Analytics users to present visualizations such as charts, lists, dials, and scorecards.

Experience in data ingestion and data enhancement in Splunk.
