Professional Summary
Omer Makhdoom
Bank of China NYC, NY
Site Reliability Engineer
•Installed, configured, and managed Red Hat Enterprise Linux (RHEL) 6, 7, and 8 servers using templates in vCenter, and streamlined post-installation configuration with Ansible for consistent and efficient deployments.
•Deployed and maintained Linux-based environments for various applications, ensuring optimal performance, stability, and security across Red Hat and CentOS distributions.
•Administered system-wide OS updates and patch management for Red Hat and CentOS, ensuring security vulnerabilities were addressed and system stability was maintained across platforms.
•Monitored and optimized file systems across multiple operating systems, ensuring that disk usage (including inodes) remained within limits and all mount points were properly configured and maintained.
•Configured and managed disk partitions using Logical Volume Manager (LVM) in RHEL 6, 7, and 8, ensuring scalability and efficient disk usage.
•Automated routine tasks using Ansible, creating modular playbooks to handle system configurations, updates, and software deployments in a scalable and efficient manner.
•Deployed RAID configurations (RAID 0, 1, 5, 6, 10) based on performance and redundancy needs, enhancing data reliability and system resilience.
•Provisioned and integrated new servers into networks, troubleshooting and resolving network connectivity issues to maintain reliable and secure services.
•Conducted system performance monitoring using tools like Nagios, analyzing CPU, memory, disk, and network usage, and proactively addressing performance bottlenecks.
•Managed containerized applications with Docker, deploying, maintaining, and troubleshooting containers to support development and production environments.
•Managed Docker container clusters orchestrated with Kubernetes, supporting CI/CD pipelines for application deployment, testing, and scaling within highly available environments.
Work Experience
A highly skilled Site Reliability Engineer with 6 years of experience managing server infrastructure across Red Hat Enterprise Linux (RHEL), CentOS, and other Linux platforms. Skilled at planning, deploying, configuring, and optimizing IT environments to ensure high system availability and performance. Expertise in automation using tools like Ansible, Git, Docker, and Jenkins, with extensive hands-on experience in cloud infrastructure management on AWS. Proficient in Bash scripting for automating tasks, system administration, and performance tuning. A proven problem solver and troubleshooter, comfortable leading systems, projects, and cross-functional teams in high-pressure environments. Seeking an opportunity to contribute my skills and experience to a forward-thinking organization where I can drive operational efficiency and enhance infrastructure reliability.
*************@*****.***
Site Reliability Engineer
Barnes & Noble, Monroe, NJ
Linux Administrator
•Used Git and GitHub for version control, managed continuous integration (CI) and continuous delivery (CD) pipelines with Jenkins, and resolved build and release job failures in collaboration with development teams.
•Automated deployment processes using Ansible, writing playbooks, executing ad-hoc commands, and maintaining host inventories to ensure consistency in large-scale environments.
•Deployed and managed AWS infrastructure using CloudFormation, including EC2, RDS, S3, Auto Scaling, and Elastic Load Balancers (ELBs), and implemented security controls within Virtual Private Clouds (VPCs).
•Managed AWS Identity and Access Management (IAM), creating roles and policies for users and groups, and controlling access to AWS resources to ensure a secure and compliant cloud environment.
03/2020 - 05/2023
•Handled full lifecycle management of Linux servers, from initial OS installation and user management to security enforcement and regular maintenance routines.
•Installed, configured, and maintained essential network services, including DHCP, SSH, and SCP, optimizing network functionality and secure remote access across servers.
•Led server migration initiatives, transitioning environments between physical servers and virtual machines, including full migration of user data, applications, and system configurations.
•Administered user accounts, managing user privileges, troubleshooting access issues, and ensuring security policies were strictly followed.
•Utilized Nagios and other monitoring tools to oversee system performance, proactively identifying issues and implementing solutions to minimize downtime.
•Automated Linux installation processes through PXE boot and Kickstart configurations, streamlining server provisioning and reducing manual intervention.
•Optimized system resource utilization, conducting in-depth performance analysis and tuning, resulting in improved server efficiency and reduced bottlenecks.
•Provided end-to-end system administration for Linux environments, managing system builds, updates, security patches, and troubleshooting issues to ensure smooth operations.
•Developed and implemented automated scripts using Bash to manage backups, user tasks, and routine administrative functions, improving system efficiency.
•Set up Samba servers to provide secure cross-platform file sharing, ensuring seamless access between Linux and Windows systems.
•Managed NFS and Samba services, ensuring high availability and performance for shared storage solutions across heterogeneous environments.
•Established comprehensive backup and disaster recovery procedures, safeguarding data and ensuring rapid recovery from failures or outages.
•Managed SSH key-based authentication for remote servers, enhancing system security by enabling password-less authentication and ensuring compliance with security standards.
•Performed system performance monitoring and tuning, using tools like vmstat, iostat, and top to ensure servers operated at optimal capacity, while addressing any potential issues before they escalated.
•Configured and managed user permissions through sudoers, limiting access and controlling administrative privileges, ensuring secure and compliant system management.
•Executed regular backup and restoration tasks with tools like tar and gzip, ensuring data was reliably stored and quickly recoverable in case of incidents.
•Conducted system capacity planning, analyzing performance data and trends to anticipate future resource needs and prevent potential bottlenecks.
•Maintained 24/7 system availability, collaborating closely with cross-functional teams to ensure the reliability of development tools and infrastructure, supporting continuous deployment pipelines.
Server Management: Installation, configuration, and administration of servers, RAID setup, LVM management
Virtualization: VMware, vSphere, vCenter, vMotion
Cloud (AWS Services): EC2, S3, CloudFormation, IAM, ELB, Auto Scaling, CloudWatch, Security Groups, VPC, Route 53
Containerization: Docker, Kubernetes, container management and orchestration
Version Control Tools: Git, GitLab, Bitbucket
Operating Systems: Red Hat Enterprise Linux (RHEL) 6/7/8, OEL, CentOS, Debian
Automation & Scripting: Ansible, Ansible Tower, Ansible playbooks, roles, ad-hoc commands
Scripting: Shell scripting
Monitoring & Performance: Nagios, Splunk, kernel tuning, system performance optimization
Security: SSH key management, user management, sudoers configuration, network security protocols
Backup & Recovery: Disaster recovery, backup management (tar, gzip, rsync)
Network Management: NFS, Samba, TCP/IP, DHCP, DNS, TFTP, FTP
•Managed user accounts and system security, creating and maintaining user profiles, monitoring disk space, and managing system processes to ensure optimal performance.
•Responded promptly to user support tickets, troubleshooting and resolving issues efficiently to minimize downtime and ensure seamless operations.
•Maintained critical development tools and infrastructure, ensuring continuous availability and reliability for 24/7 development environments, supporting cross-functional teams.
•Demonstrated a strong understanding of Linux operating systems and open-source technologies, with hands-on experience in popular open-source applications and tools.
•Executed hardware and software installations, upgrades, and system maintenance, including patch management, system performance tuning, security analysis, and network configuration.
•Migrated local user accounts and groups between systems using rsync, ensuring data integrity and minimizing disruption during migrations.
•Created and maintained up-to-date documentation for internal teams and helpdesk staff, detailing procedures for performing specific tasks and troubleshooting common issues.
•Enforced network and system security protocols, ensuring the privacy and security of data, networks, and computer systems in line with best practices.
09/2018 - 02/2020
Linux Technician
DTCC, New York, NY
Skills
Education
Bachelor of Science - University of Georgia
2018