Dur-e-Shahwar Muneeb
DevOps / Infrastructure Automation Engineer / Linux System Administrator
****.********@*****.*** 469-***-****
PROFILE:
Linux System Administrator with 7+ years of experience supporting enterprise Linux and Windows environments, with a strong focus on infrastructure automation, server provisioning, and configuration management. Experienced in building and maintaining automation using Ansible and Ansible Tower to standardize server builds, patching, and deployments across on-prem and AWS cloud environments. Skilled in Bash scripting for automation, with PowerShell exposure for Windows support and familiarity with Infrastructure-as-Code concepts. Strong experience collaborating in Agile environments, working on JIRA stories, participating in daily stand-ups, and delivering automation solutions in two-week sprint cycles. Proven ability to reduce manual provisioning time, improve operational efficiency, and support large-scale enterprise platforms. Strong communicator and self-motivated team player, experienced in cross-team collaboration to drive infrastructure modernization initiatives. Experienced in capacity planning, system lifecycle management, and maintaining stability in enterprise production environments. Experience supporting AWS cloud security and Identity & Access Management (IAM), including users, roles, policies, least-privilege access, and audit logging using CloudTrail.
PROFESSIONAL SKILLS:
Linux Systems (RHEL 6/7/8/9, CentOS, Ubuntu), Vulnerability Management, VMware (ESXi, vSphere, vCenter, vMotion), familiarity with hypervisors, Server Migrations, Automation (Ansible, Ansible Tower, RAAP), Disk Management (SAN, LVM, RAID, gdisk, fdisk, OLVM familiarity), Filesystems (EXT3, EXT4, XFS), Tickets and Incidents (Jira, BMC Remedy, SNOW), Performance Monitoring (Nagios, Datadog, top, free, iostat, vmstat, ifconfig, netstat, traceroute, tcpdump, Wireshark, familiarity with Splunk, Grafana, Prometheus exposure), Job Scheduling & Batch Operations (cron-based scheduling in Linux environments, exposure to enterprise batch workflows, familiarity with AutoSys concepts including job dependencies, calendars, failure handling, reruns, SLA-driven execution, and batch job failure analysis through log review and coordination with application teams), Bash Scripting, MySQL / SQL (exposure; basic queries and troubleshooting DB connections), PowerShell exposure, FTP, TFTP, NFS, Samba, SSH, SCP, rsync, Cron, Kernel Tuning, Identity Access Management (CyberArk), Network Troubleshooting (LAN, WAN, TCP/IP, UDP, DNS, DHCP, VLAN) in enterprise multi-tier environments, Version Control (Git, GitLab, Bitbucket), High-Performance Computing (HPC), GPU Server Administration, Cluster Troubleshooting, Documentation & Knowledge Management, On-Call Support, Docker (containers, images, Dockerfile, Docker Compose), Kubernetes (working knowledge), Cloud Computing - AWS (AMI, EC2, S3, EBS, EFS, ELB, Auto Scaling, CloudWatch, CloudTrail, Route 53, Security Groups, IAM users, groups, roles, policies, least-privilege access, and IAM audits), Identity & Access Management (IAM), access governance, entitlement review (operational support), exposure to Jenkins and Terraform for CI/CD and infrastructure automation, Customer & Stakeholder Communication (incident bridges, calls, email, ticketing systems)
EXPERIENCE:
DevOps Automation Engineer
IBM Global Services, NY Mar 2023 - Present
Upgrading, remediating vulnerabilities, and hardening server operating systems by applying patches in a multi-platform (Red Hat Enterprise Linux 7, 8, 9 / CentOS) environment, reducing patching time by 40% using Ansible automation and Red Hat Satellite.
Expertise in working with Ansible for automating tasks, provisioning, patching, installing applications, and configuration.
Developed and maintained Ansible playbooks to standardize configurations, reduce manual effort, and improve operational consistency.
Designed and enhanced automation workflows using Ansible and Ansible Tower to provision and configure Linux servers across on-prem and AWS environments.
Automated server build, patching, and configuration processes, reducing manual provisioning effort and improving deployment speed.
Collaborated with cross-functional teams during sprint cycles to deliver automation stories aligned with infrastructure modernization goals.
Built reusable Ansible playbooks to standardize server configurations across development, testing, and production environments.
Supported Windows server environments with PowerShell scripting for automation tasks (exposure level).
Participated in daily Agile stand-ups, sprint planning, backlog grooming, and JIRA story execution.
Ensured systems remained compliant with enterprise security standards through vulnerability remediation, OS hardening, and access control enforcement.
Led enterprise-wide RHEL migrations (7 → 8 → 9) across 500+ servers, performing dependency analysis, pre/post health checks, cutover validation, and rollback testing, ensuring smooth transitions with minimal downtime.
Supported batch-driven Linux applications by monitoring scheduled jobs (Cron-based), analyzing job failures, validating downstream dependencies, and restoring failed executions within SLA timelines.
Performed root-cause analysis for failed batch jobs by reviewing system logs, application logs, and database connectivity, coordinating fixes with application and database teams.
Managing Ansible Tower: granting users permission to run their tasks and update inventory when needed, creating projects and job templates from Git repositories, and running job templates to configure remote nodes.
Provisioning, deploying, and maintaining Linux-based servers in a virtual environment using templates in vCenter, accessed through vSphere, in a large enterprise environment.
Experience supporting SAN/NAS storage environments, including volume provisioning, snapshot handling, and performance monitoring.
Supported application teams with exposure to MySQL, assisting in running SQL queries and validating results during Linux application troubleshooting.
Assisted with basic SQL query analysis and database connectivity checks while performing root-cause analysis in production Linux environments.
Utilized SQL queries (MySQL exposure) for data validation, incident troubleshooting, and application support; familiar with adapting queries across different RDBMS platforms.
Managed snapshots and created clones and templates. Worked on application high availability, clustering, and vMotion solutions for managing large VMware infrastructure.
Experience collaborating with networking teams on circuit provisioning updates and maintaining network connectivity documentation.
Orchestrated continuous system upgrades and migrations with minimal downtime, ensuring business continuity and the smooth transition of critical applications and services.
Spinning up EC2 instances in AWS, creating S3 buckets for storage, configuring VPCs, creating policies, and managing users and groups with AWS IAM.
Monitoring infrastructure performance and security using AWS CloudWatch, CloudTrail, and Datadog; exposure to creating alerts and dashboards in Splunk and Grafana for application monitoring. Modifying kernel parameters for efficient performance, and loading and unloading kernel modules.
Supported AWS IAM configuration for Linux workloads, managing users, roles, and policies aligned to least-privilege principles.
Assisted with identifying and remediating over-privileged access by reviewing IAM policies and access usage.
Collaborated with application teams to support production environments, assisting in resolving incidents impacting end-to-end application flows in addition to infrastructure issues.
Collaborated with application, database, and networking teams during migrations to validate end-to-end application flows, acting as a bridge between technical teams and stakeholders.
Coordinated with vendors (Red Hat, hardware suppliers) and engaged data center engineers during migration cutovers and hardware replacements, ensuring minimal business disruption.
Supported Dell and HPE hardware through vendor coordination and data center engagements.
Participated in incident, problem, and change management activities aligned with ITIL best practices.
Automated routine operational tasks with Ansible and Bash and leveraged basic Python scripts for simple automation needs.
Coordinated with DevOps teams using GitLab pipelines, with exposure to Jenkins and Terraform for deployments and infrastructure automation.
Optimized memory and CPU usage by fine-tuning Linux performance parameters, improving system responsiveness and stability for resource-intensive applications.
Configured and extended Logical Volume Management (LVM) systems, dynamically resizing file systems and ensuring optimal storage management to accommodate increasing workloads; familiar with OLVM concepts.
Creating and managing file systems (EXT3, EXT4, and XFS) as required, and extending Volume Groups and, in turn, Logical Volumes.
Using channel bonding or teaming to increase network performance, bandwidth, and redundancy.
Configuring, managing, and troubleshooting NFS servers and clients to share data over the network.
Engaging field engineers in remote data centers to source and replace parts within the stakeholders' prescribed window.
Coordinating with the DevOps team on Docker images and Dockerfiles to create containerized environments.
Working knowledge of Kubernetes for creating, managing, and scaling pods, and supporting containerized application deployments.
Collaborated with application, database, and network teams to validate end-to-end application functionality during migrations and production incidents.
Strong knowledge in configuring network services like NFS, NIS, DHCP, DNS, HTTP, LDAP, VPN and Firewall.
Optimized system performance through kernel tuning, adjusting kernel parameters to meet client-specific needs and enhance system functionality.
Hands-on experience configuring Nagios and using it to monitor multi-platform servers.
Proactively identified areas of improvement and contributed to enhancing operational processes and documentation.
Tracked incidents, tasks, and change activities in Jira, preparing status reports, RCA documentation, and operational summaries using Excel and PowerPoint for stakeholders.
Limiting root access by implementing sudo throughout the production environment.
Supported third-party application deployments to ensure smooth installation and configuration.
Opened cases with vendors (Red Hat) to obtain expert analysis and solutions to apply within the environment for improved system performance and compliance.
Participated in a scheduled on-call rotation, providing support outside business hours, including evenings and weekends, to ensure continuous availability of production systems.
Linux System Administrator
General Electric International (GE) Mar 2019 - Feb 2023
Provided technical support and troubleshot user, filesystem, OS, and network issues.
Installed packages by using YUM and RPM.
Partitioned hard drives using the fdisk, gdisk, or parted utilities and used Logical Volume Manager (LVM) to manage storage dynamically.
Maintaining and managing users by creating, modifying, and deleting users.
Hardened security by setting up SETGID, the sticky bit, and ACLs.
Automated routine administrative tasks by writing Bash scripts.
Improved server performance by leveraging monitoring tools and reviewing logs to identify root causes.
Collected tcpdump packet captures for the testing team to aid in network analysis and troubleshooting.
Enabled and disabled port access and performed performance tuning for RHEL servers to optimize latency-sensitive workloads.
Provided support and promptly resolved escalations through the BMC Remedy ticketing system, email, and phone calls.
Collaborated with other teams to resolve escalated technical issues and implement new technologies.
Connected the users with remote machines through key-based SSH on all Linux production systems.
Created, configured, and managed user and group accounts, ensuring proper access control and user management across multiple systems.
Provided on-site or remote support for employees or clients to minimize downtime.
EDUCATION:
Associate of Applied Science in Cybersecurity
Collin College, Frisco, Texas