Hassaan Moin Khan
San Francisco, CA ******@*****.***
Summary
Senior engineering leader (DevOps/SRE/Platform) with 15+ years building and operating large-scale distributed systems and cloud infrastructure. Deep experience in Kubernetes, Docker, Terraform, GCP/AWS/Azure, Hadoop/Spark/Kafka, high-throughput services, disaster recovery, performance optimization, and automation. Strong Linux systems background, networking/security, and on-call operations. AI-literate with hands-on GenAI/LLM integrations.
Core Skills
● Cloud/Infra: GCP (preferred), AWS, Azure; Kubernetes, Docker; Terraform; CI/CD
● Big Data/Streaming: Hadoop, Spark, Kafka, Hive; Airflow; Presto; (Druid/OpenSearch familiarity)
● SRE/DevOps: Reliability engineering, DR planning, fault tolerance, scalability, incident response, on-call
● Systems: Linux/Unix internals, performance profiling, low-latency services, C/C++/Go/Java/Python, Bash
● Automation/Config: Shell scripting, Python, Ansible/Puppet/Chef (experience across environments)
● Observability: Grafana, custom metrics, alerting/on-call practices (PagerDuty-style workflows)
● Security/Networking: Network architecture, HAProxy, TLS, authN/Z, data security
● AI/ML: GenAI/LLM integration, AI-driven ops and recommendations
Experience
CTO, Quartech (Stealth) May 2024 – Present
● Designed and operated a cloud-native, horizontally scalable services architecture on GCP/AWS with Kubernetes and Docker; defined IaC with Terraform for multi-environment provisioning.
● Established SRE best practices: service SLIs/SLOs, runbooks, incident response, disaster recovery testing, and capacity planning; implemented health checks, canaries, and autoscaling.
● Built low-latency media and chat features using FFmpeg and custom messaging; performed performance benchmarking and kernel/network tuning on Linux.
● Implemented monitoring/alerting (Grafana/Prometheus stack) and on-call rotations for a globally distributed team.
● Integrated LLM tooling (LlamaIndex), Bayesian clustering, and Microsoft Copilot to deliver AI-driven recommendations and support automation.
Founder/CEO, SwiftTaxi Sep 2021 – May 2024
● Architected a reliable backend for autonomous systems with Cassandra, Kafka-style messaging, and resilient data pipelines; implemented DR/backup strategies and HA clusters.
● Led Linux-based systems engineering for edge devices; created automation scripts (Python/Bash) for telemetry ingestion, monitoring, and recovery.
● Built maps and SLAM pipelines; enforced security and networking hardening for edge-to-cloud comms; conducted performance profiling and low-level debugging.
Software Engineer, Facebook (Payments) Jan 2019 – Sep 2021
● Developed low-latency backend services in C++ with sharded MySQL for high-throughput payments; designed PSP onboarding and sandboxes.
● Worked on reliability and scale: performance measurement, resource utilization tuning, and rollout safety; integrated observability and alerting with on-call responsibilities.
● Built and maintained test environments and automation for infra and integrations.
TL, Uber Sep 2016 – Jan 2019
● Established production readiness review (PRR) across org: DR planning, failure mode analysis, fault tolerance, graceful degradation.
● Built core distributed services (distributed locking, unique ID generation, timers) in Java/Go/Python; operated on large-scale clusters with HA and low-latency constraints.
● Owned Cassandra-as-a-service tooling and maintenance; implemented automation for cluster operations, backups, and repair.
● Co-led architecture for Uber Elevate simulations on Hadoop; built regression/validation frameworks on Hadoop/Spark.
Engineering Manager, Zenefits Nov 2015 – Jul 2016
● Led infrastructure services team; created internal service framework on Jetty/Java with protobufs for consistent, reliable service deployments.
● Built Pub/Sub and async processing; created pipelines and data sync frameworks to support microservices migration; introduced DR and monitoring practices.
Lead Engineer, Apple Mar 2012 – Nov 2015
● Built a high-throughput, replicated, fault-tolerant queue on Cassandra (250k+ ops/sec on a 16-node cluster); implemented multi-partition caching and HA designs.
● Upgraded Apple Maps data pipeline (CDH5 + Cassandra) and improved publishing performance; conducted extensive performance profiling and low-level debugging.
● Implemented security infrastructure for internal/external communication; contributed to common infra components and reliability engineering.
Earlier Roles (selected)
● XA.net: Built RTB-scale user store on MongoDB with <15ms lookups; Hadoop analytics; automation and performance tuning.
● Cooliris: Designed backend ads infra (Java, protos, embedded Tomcat), integrated third-party networks, built real-time tracking and Hadoop analytics; wrote monitoring/alerting.
● Cloudmark: Multithreaded C/C++ systems on Solaris; 4x performance improvements; cross-platform C libraries; spam attack detection (real-time).
● Neato Robotics: Linux/C++ robotics stack; SLAM; embedded C/Assembly; systems performance and reliability at the edge.
Education
● MS, Electrical & Computer Engineering, Carnegie Mellon University
● BS, Electrical & Computer Engineering; Minor in Computer Science, Carnegie Mellon University
Certifications & Patents
● Patent: LiDAR for Robotics
Highlights Aligned to Role
● Cloud & IaC: Extensive GCP/AWS; Kubernetes, Docker; Terraform for repeatable provisioning and DR.
● Big Data: Hadoop/Spark at Uber/Apple; Kafka pipelines; Airflow-based workflows; Hive/Presto familiarity.
● SRE Practices: PRRs, SLIs/SLOs, chaos and DR tests, failure mode analysis, on-call rotations, runbooks, Grafana/Prometheus; PagerDuty-style escalations.
● Systems/Perf: Low-level Linux tuning, C/C++ performance optimization, high-throughput queues, low-latency services.
● Automation: Python/Bash scripting for cluster ops, CI/CD, self-healing actions, and infra reactions.
● Security/Networking: HAProxy, TLS, network segmentation, data security controls for payments and distributed systems.
● AI Literacy: Direct experience integrating LLMs (LlamaIndex, Copilot) and AI-driven features; curiosity and hands-on experimentation.