
Data Center Operations Engineer

Location: Washington, DC
Posted: February 21, 2026


Resume:

Khoa Doan

202-***-**** • **********@*****.*** • Lorton, VA

Profile Summary

Program Manager for the In-Warranty Break-Fix team, leading US-East regional warranty repair operations end to end: intake/triage, dispatch, Field Engineer escorts, verification, and closure against SLAs. Authored and enforced official SOPs/MOPs and runbooks. Drove communications and escalations with overseas OEMs and local service vendors, partnering closely with Supply Chain and Sourcing on parts logistics and RMA policy. Led a team of 12+ Data Center Technicians (DCTs) dedicated to In-Warranty repair.

In parallel, served as a Data Center Operations Engineer on an internal team: owned daily operations, co-led a team of 35+ DCTs on a 24/7 schedule, maintained KPI/SLA governance, executed high-priority projects, and coordinated incident response for Large-Scale Events (LSEs).

Core Competencies

• Program/Operations Management • SLA/KPI Governance • OKR Alignment & Feedback Cadence

• Break-Fix Program Leadership • Vendor Management • OEM Escalations • QBR/Performance Reviews

• SOP/MOP Authoring • Change Management • Risk Controls • Audit-Ready Documentation

• Data Center Daily Operations • Incident Response (LSE) • Business Continuity Support

• Power & Cooling Operations (Foundational) • Escalation coordination with building provider (UPS/HVAC/CRAC impacts), event logging, and operational follow-through.

• Structured Cabling • Rack Layout/Space Planning • Hardware Lifecycle • Parts/RMA Coordination

• DCIM Program Support

• Ticketing/Workflows • Cross-Functional Partnering (Facilities, Network, Delivery, Operations Security)

Experience

ByteDance (parent company of TikTok) 05/2022 - 11/2025

• US East PM — In-Warranty Break-Fix Program

• Led regional In-Warranty break-fix operations for OEM server/storage repairs executed by Field Engineers from third-party service vendors with internal Data Center Technician support.

• Owned the end-to-end workflow: intake/triage, dispatch, access/escort coordination, parts logistics, repair execution, verification, and ticket closure.

• Authored and maintained official SOPs/MOPs/runbooks to standardize vendor engagements, onsite controls, verification steps, and documentation quality.

• Drove communications and escalations with overseas OEM manufacturers and local service vendors to remove blockers and improve time-to-repair and repeat-repair rates.

• Led a 13+ Data Center Technician team dedicated to In-Warranty repair.

• Partnered with Supply Chain, Quality, and Sourcing to manage RMA processes, parts availability, logistics coordination, and sourcing/vendor compliance.

• Defined and monitored KPIs/SLAs (response time, first-time-fix rate, repeat repairs, aging) and ran recurring vendor performance reviews.

• Partnered with Security Operations to formalize vendor security workflows, including badge access procedures for recurring onsite vendors, escort requirements and logging, loaner laptop serial tracking, and controlled USB access for troubleshooting, maintaining compliance while sustaining SLA performance.

• Conducted interviews for Data Center Technicians, assessed hands-on troubleshooting and operational readiness for shift-based coverage, and provided hiring feedback to strengthen 24/7 staffing.

• Data Center Operations Engineer (concurrent)

• Owned day-to-day data center operations across power, cooling, and data hall readiness to support stable production environments.

• Executed high-priority operational projects (capacity/space readiness, remediation initiatives, upgrades) with structured planning and stakeholder alignment.

• Served as first responder/escalation point and coordinated LSE incident response, including triage, communications, recovery execution, and post-incident follow-ups.

• Co-led a team of 35+ data center technicians; designed and maintained a 24/7 shift schedule to ensure continuous site coverage and balanced workload distribution.

• Contributed to DCIM kickoff efforts on the ground: site readiness, scoping, and obtaining installation pricing/quotes from vendors.

• Conducted interviews for program roles (engineering/lead positions), evaluated candidates against vendor management, escalation judgment, and process/SOP ownership requirements, and provided structured hiring recommendations.

Dropbox 04/2019–05/2022

• Data Center Operations

• Supported installs, rack/stack, diagnostics, and break/fix across multiple data center sites while minimizing production impact.

• Managed repair workflows and ticket execution in Jira, coordinated cross-functional dependencies, and improved operational consistency through documentation.

ZT Systems 2017–2019

• Data Center Technician

• Performed server/network incident response and component replacements at scale (HDD/SSD, DIMM, NIC, RAID/HBA, mainboard).

• Supported on-call workstreams and large deployment/maintenance activities across multiple facilities.

NCR 2008–2017

• Customer Engineer

• Field support for enterprise network/server environments (Dell/HP, Cisco, Symantec/Veritas, Juniper, Tintri, AT&T); cabling, base configs, and access enablement.

Education

A.A.S., Computer Electronic Engineering Technology, ITT Technical Institute (2009)

High School Diploma, Hayfield Secondary School (2003)

Certificates

CCNA Routing & Switching • Cisco Certified Technician (CCT) — Data Center • Cisco Data Center Unified Computing Support Specialist

OSHA Electrical Safety Training (29 CFR 1910.331–.335) — Certificate of Completion (2025)


