HAJRA ASLAM WARRAICH
Linux Systems Engineer
***********@*****.*** 908-***-****
SUMMARY:
Linux Systems Engineer with around 7 years of experience designing, operating, and securing enterprise computing environments across on-premises data centers and public cloud platforms. Proven track record building high-performance, highly available Linux infrastructures (RHEL, CentOS, Ubuntu) and hardening them through disciplined patching, vulnerability management, and configuration best practices. Hands-on with infrastructure as code and automation using Ansible, and with containerization and orchestration using Docker, Docker Swarm, and Kubernetes. Experienced in supporting business-critical workloads, partnering with cross-functional teams to containerize complex applications, implementing robust monitoring and incident response, and delivering scalable, resilient platforms that align with security and compliance requirements.
SKILLS:
• Operating Systems & Platforms
• Linux Flavors: RHEL, OEL, CentOS, Ubuntu
• User access, permissions, and group management
• Automation & Scripting: Bash, Python
• Ansible and Ansible Tower (Playbooks, roles, ad hoc commands)
• AWS cloud computing & services to manage cloud platforms (EC2, EBS, S3, EFS, ELB, IAM, Auto Scaling, VPC, Route 53 & SG)
• Containerization: Docker & Podman
• Kubernetes: Self-healing, Auto-scaling and Orchestration
• Image versioning and container lifecycle analysis
• Terraform (exposed to infrastructure assessment)
• Version Control: Git, GitLab, Bitbucket
• Linux hardening (SELinux, SSH, firewall rules)
• Patching and upgrading of servers
• Time synchronization with Chrony
• TCP / IP & Internet Application Protocols
• Networking: DNS, DHCP, NIS, LDAP, SSH, VPN troubleshooting
• Packet tracing with tcpdump and log-based RCA
• NFS and Autofs Configuration
• RAID Configurations
• Filesystem tuning and error review: fdisk, gdisk, LVM
• Monitoring Tools: Nagios, Prometheus, Grafana, CloudWatch
• Ticketing / Incident Management / ITSM Tools: Jira, ServiceNow
• Log and event analysis (Splunk, ELK, Datadog)
• System performance tools: top, htop, vmstat, iostat
• Documentation, RCA reporting, analytics dashboards
• Vulnerability Management using Qualys
• VMware ESXi, vSphere, vCenter (Snapshots, Clones, HA, FT, DRS)
• On-Call Support and ITSM / Incident Management
• Remote Server Access Setup (iLO, iDRAC)
PROFESSIONAL EXPERIENCE:
Microsoft - Redmond, WA Sep 2023 - Present
Linux Systems Engineer
• Managed and supported enterprise RHEL environments, including user/group administration, patching, performance tuning and troubleshooting system issues.
• Performed advanced performance tuning for CPU, memory, I/O, and kernel parameters to optimize Linux systems for latency-sensitive applications.
• Configured and optimized Linux storage using LVM, RAID, and multipathing to ensure reliable, high-performance disk operations.
• Automated routine Linux operations using Bash, Python, and Ansible to improve reliability and reduce manual effort.
• Implemented high-availability configurations for Linux workloads using load balancers, redundant nodes and automated failover mechanisms.
• Built declarative infrastructure for Linux hosts with Ansible, so complete environments (network, compute, storage, security) could be recreated from code.
• Used Ansible roles and playbooks to implement repeatable configuration patterns for Linux servers and Kubernetes nodes (packages, services, sysctl, users, monitoring agents).
• Contributed to Git-based workflows (Git/Bitbucket) for infra code, including branching strategies, pull requests, code reviews, and promotion of changes across dev / QA / prod.
• Managed AWS-hosted Linux platforms (EC2, EBS, S3, IAM, Route 53, CloudWatch), focusing on secure network layouts, instance hardening, and automated backups/snapshots for stateful services.
• Set up development environments with Linux and VMware and production environments in AWS EC2.
• Engineered and operated container-based platforms on top of Linux using Docker and Kubernetes, defining deployments, services, and ingress rules for highly available microservices.
• Created and maintained Docker images using multi-stage builds, private registries, and image-scanning tools to keep base images lightweight and secure.
• Automated deployment, scaling, and management of containerized applications using Kubernetes.
• Designed high-availability layouts using Red Hat clustering concepts, load balancers, and redundant Linux nodes to remove single points of failure for critical services.
• Participated in incident response for Linux production environments, using logs, metrics, and tracing to identify root causes and implementing automation to prevent recurrence.
• Wired Linux nodes and Kubernetes workloads into centralized observability stacks (Nagios, Splunk, ELK, CloudWatch), surfacing metrics/logs and tuning alerts to reduce noise while catching real issues.
• Worked within Agile teams using Kanban methodology.
• Tuned Linux kernel parameters and TuneD profiles for latency-sensitive or high-throughput workloads, following best practices for performance on RHEL/CentOS.
• Executed scheduled maintenance and after-hours release deployments.
JP Morgan Chase - Jersey City, NJ Aug 2021 - July 2023
Linux Admin
• Actively monitored memory, CPU, swap, and disk usage across Linux hosts, using monitoring tools and trend graphs to schedule clean-ups, tuning, or upgrades before users were impacted.
• Handled a steady stream of platform-related tickets in tools like Jira/ServiceNow, from simple access requests to multi-team performance investigations, and always closed them with clear technical notes.
• Onboarded teams onto the Linux platform by setting up accounts, groups, sudo roles, and directory structures, aligning access with security and ownership standards.
• Tracked down login and access problems by inspecting file permissions, group membership, PAM settings, and network paths, then implemented durable fixes instead of quick workarounds.
• Looked after core infrastructure services on Linux such as Apache HTTP, DNS, and SSH, adjusting configurations, reloading services, and validating that dependent apps behaved correctly.
• Analyzed and resolved user account and access issues, responding to user requests in a timely manner.
• Installed and configured Apache web server on both UNIX and Linux platforms.
• Configured RAID for performance improvement and redundancy.
• Performed server installations, configurations, and upgrades on Linux.
• Set up development environments with Linux and VMware and production environments in AWS EC2.
• Backed up all systems using NetBackup 7.5 and restored files after data loss or upon user request.
• Administered file systems and handled installation, configuration, management, and troubleshooting of Red Hat Linux.
• Organized and maintained platform storage using LVM and RAID, creating and resizing logical volumes for applications and checking for early warning signs of disk or file system issues.
• Designed and supported shared storage for clusters using NFS and technologies such as GlusterFS/Ceph, ensuring pods and applications had reliable, scalable storage backends.
• Used Ansible to push OS patches, package installs, and configuration updates to large groups of servers in one go, replacing risky manual changes with controlled automation.
• Managed and supported workloads on VMware, including building new VMs from templates, managing clones and snapshots, and escalating physical host issues to data center support teams.
• Configured and maintained iDRAC/iLO for remote management of Linux servers, allowing safe power cycles and troubleshooting even when the OS or network path was down.
• Coordinated with hardware vendors when components failed, collecting Linux-side diagnostics (logs, SMART details, error counters) and arranging repair windows with internal stakeholders.
• Handled service tickets through ServiceNow, including creation, review, and resolution.
• Wrote and updated SOPs, best-practice guides, and post-issue summaries for the platform so recurrent problems could be solved quickly and consistently by any engineer on the team.
• Worked closely with network, storage, database, and application teams during incidents and planned maintenance, representing the Linux platform and making sure tasks were executed in the right order.
IEEE - New York, NY July 2019 - June 2021
Linux Analyst
• Performed detailed analysis of system performance trends on Linux hosts, identifying early indicators of resource saturation or misconfiguration.
• Investigated recurring platform issues by correlating kernel logs, authentication events, and network traces, producing actionable recommendations to reduce incident frequency.
• Correlated system logs, audit trails, and monitoring alerts to identify behavioral patterns on Linux hosts, helping teams isolate recurring bottlenecks and misconfigurations.
• Produced in-depth technical reports helping leadership make informed infrastructure decisions.
• Managed users, groups, and roles through custom portals.
• Provided technical documentation and revised operational procedures.
• Ran, maintained, and set up scheduled jobs (crontab).
• Performed daily system monitoring, verified the integrity and availability of all hardware and server resources, and reviewed system and application logs.
• Used Linux commands to run, maintain, and schedule jobs, and to protect and rescue file systems.
• Monitored and controlled system access, changed file permissions and ownership, and monitored system processes to increase system efficiency.
EDUCATION:
Bachelor of Science 2016
University of the Punjab, Lahore, Pakistan
CERTIFICATION:
Red Hat Certified System Administrator (RHCSA)