EU - Lead DBA

Company: Insight Global
Location: Novato, CA, 94949
Posted: April 23, 2025

Description:

1. Database Design, Tuning & Scaling

Architect and optimize complex PostgreSQL and MySQL database clusters to handle high-velocity OLTP workloads.

Implement horizontal scaling strategies (e.g., sharding, logical partitioning) to support increased load.

Tune vertical scaling parameters such as connection pools, caching layers, buffer sizes, and memory allocation.

Continuously improve schema design and normalization strategies for performance and flexibility.

2. Query Optimization & Workload Profiling

Analyze query patterns, slow query logs, and application metrics.

Identify long-running or inefficient queries and collaborate with developers to refactor them.

Implement advanced indexing strategies (partial indexes, BRIN, GIN, etc.) to reduce query latency.

Conduct proactive workload profiling to prevent performance degradation at scale. This includes regularly benchmarking representative workloads, identifying emerging hotspots or contention areas, analyzing I/O and CPU usage under various load patterns, and forecasting capacity requirements. The consultant will develop profiling automation scripts, simulate growth scenarios, and implement preemptive optimizations to safeguard against latency spikes and throughput bottlenecks as the player base or feature complexity grows.

3. Infrastructure & Platform Engineering

Manage and optimize databases on AWS (RDS, Aurora), Google Cloud Platform (Cloud SQL), and MySQL/PostgreSQL on data center platforms.

Design multi-region, fault-tolerant, and geo-replicated systems to support global game availability.

Support both cloud-native and hybrid infrastructure environments, including Kubernetes-based deployments.

Contribute to IaC tooling (Terraform, Ansible) to automate DB provisioning, scaling, and configuration.

4. High Availability, Replication, and DR

Design and implement high availability (HA) setups using streaming replication, GTID-based replication, failover nodes, etc.

Ensure low-latency and durable replication strategies across regions and cloud providers.

Define and test comprehensive backup/recovery and disaster recovery (DR) procedures.

Minimize RTO/RPO for game-critical workloads.

5. Monitoring, Observability & Incident Response

Build and maintain observability dashboards using Datadog, Grafana, CloudWatch, and native DB tools.

Set up and fine-tune alerts for key metrics (e.g., replication lag, query timeouts, cache hit ratios).

Participate in 24/7 on-call rotations, triaging incidents and driving resolution.

Perform root cause analysis (RCA) and contribute to post-mortems for high-impact issues.

6. Team Collaboration & CI/CD Integration

Work closely with DevOps, backend engineering, and SRE teams to ensure seamless integration into CI/CD pipelines.

Provide tooling and standards for DB migrations, version control, and rollback safety.

Mentor engineers on DB best practices, data modeling, and query tuning.

We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to .

To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: .

Required Skills & Experience

7+ years managing PostgreSQL and MySQL in production-grade environments.

Hands-on experience supporting OLTP systems with millions of concurrent requests.

Mastery of schema design, performance tuning, and deep query analysis.

Proficiency with AWS (RDS, Aurora), GCP (Cloud SQL), and hybrid deployments.

Solid scripting knowledge (Bash, Python) for DB automation and incident tooling.

Experience with observability platforms and instrumentation.

Ability to handle incidents calmly under pressure and provide rapid resolution.

Knowledge of multi-region replication, HA, and DR strategies.

Nice to Have Skills & Experience

Prior experience in gaming infrastructure or other latency-sensitive, high-throughput domains. This includes working on systems such as online multiplayer games, financial trading engines, streaming media platforms, or real-time logistics and telemetry services. Such environments require exceptional response times, resilience under concurrent load, strict consistency models, and advanced troubleshooting practices to meet demanding SLAs.

Familiarity with Kubernetes, containers, and DB orchestration in microservices environments.

Exposure to infrastructure-as-code tools (Terraform, Ansible, Pulumi, Puppet).

Understanding of security, RBAC, AD, and compliance in cloud-hosted DBs.

Benefit packages for this role will start on the 31st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401k retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.
