Job Description
ComResource is looking for a Senior DevOps Engineer.
We are seeking a talented Senior DevOps Engineer for our engineering team with experience in designing, implementing, and scaling our technologies, approach, and platform to support our systems. You will be responsible for deploying and maintaining deliverables in all phases of the Software Development Life Cycle (SDLC) by collaborating with business, development, and systems engineering teams. Responsibilities include scope definition for infrastructure, infrastructure build-out, application environment maintenance, application deployment, application robustness testing through performance testing, and continued application support through an on-call rotation. Qualified candidates will bring the latest software development, automation, and deployment practices. You will develop systems to accelerate our deployments for a variety of software projects and use your talents and passion to build out complete enterprise systems. You will work collaboratively with developers to automate, deploy, operate, maintain, and monitor our systems in multiple data centers.
Responsibilities:
Provide hands-on leadership of architecture design, working closely with developers and supporting operations functions within an Agile/Scrum environment
Responsibility for related application, technology, and data scalability and security pertaining to application growth
Gather requirements, create a plan, create an estimated timeline, and execute the project
Lead the high-quality execution of software products against project plans and delivery commitments
Establish milestones for infrastructure buildouts, engineering, development, testing, and implementation of application environments
Responsibility for multi-channel software development lifecycle, enhancements/modifications, system configuration, migrations/upgrades, and production support
Provide support and troubleshooting for all related systems and technologies
Serve as a technical subject matter expert on cloud-related technologies
Production Operations and Support
Participate in troubleshooting efforts to find the root cause of incidents and recommend measures to prevent recurrence
Provide guidance and standards for application optimization techniques such as CDN, Cloud, and caching techniques
Influence the business strategy across Engineering by articulating key architecture, design, or technology challenges and building understanding among executive decision-makers
Resolve difficult technical issues, remove obstacles for teams, and help all projects to move forward on schedule, budget, and meeting
Build and operate a high-performance, stable, and resilient cloud platform
Champion Site Reliability Engineering
Work closely with our cross-functional IT and technology partners to ensure system interactions are top-notch and to your standard
Align our technology with business strategy, including scope definition, cost estimations, resource allocation, business requirements, process design, technical specifications, data management, compliance, and testing
Essentials:
Education: Bachelor's Degree in Information Technology, Computer Science, Engineering, Business Administration, or related field
Years of Experience: 7 - 10
5+ years of hands-on experience implementing, operating, and maintaining infrastructure for high-volume enterprise applications
5 years of experience in distributed system development (design and support of systems with scalability and disaster recovery robustness) to support compute use cases for business requirements
3+ years of implementation and operations experience with production systems in public cloud environments (GCP preferred)
Hands-on experience automating infrastructure operations and with modern best practices such as infrastructure-as-code, cross-region & multi-provider redundancy, and event correlation solutions
Proficient with containerization and cluster management technologies including Kubernetes and Docker
Deep understanding and hands-on experience with Cloud Native deployment and monitoring tools/technology with expertise in areas like Kubernetes, Helm charts, container-based deployment, Service Mesh, Prometheus, Grafana, etc.
Motivated by a DevOps culture and Site Reliability Engineering concepts
5 years of experience in operating systems (Windows, RedHat, CentOS, Amazon Linux), networking (Akamai, Nginx, Apache, AWS/GCP VPC), and/or software (Terraform, Bash, Sh) packages
5 years of experience integrating monitoring, alerting, and reporting tools (New Relic, Akamai, Grafana, Elasticsearch, Prometheus) with existing and newly developed systems
4 years of cloud engineering and development experience. Must have experience extending and supporting cloud-based systems using Terraform and GCP or AWS
3 years of experience implementing and supporting microservices architecture using containers, with tools such as Docker, AWS ECS or GCP Compute/GKE
3 years of experience working with database systems such as Cloud SQL, BigQuery, AWS RDS, Oracle SQL, MongoDB & Elasticsearch
Experience designing and building CI/CD pipelines for code deployment using Argo CD, Argo Workflows, and GitHub Actions, while also supporting legacy processes in Travis, CodeBuild, Jenkins, and Bamboo
Experience in supporting open-source Web and Application Services (Java, Ruby, PHP, Python, Perl)
Experience with Bash, Perl, or other scripting languages required
Experience with Git fundamentals required
Experience with Level 1 & 2 support, monitoring customer and business systems, and participating in 24/7 on-call support activities
Desired:
Deep working knowledge of multi-channel applications that affect eCommerce, including Order Management Systems, mobile, social, and SaaS mashups
Experience in SOX / PCI Compliance
Retail experience