Lead QA & Release Engineer (AI Platform Quality & Automation)

Company:
XY.AI Labs
Location:
Palo Alto, CA, 94306
Posted:
May 23, 2025

Description:

Join Our Mission-Driven AI HealthTech Team!

We are an early-stage AI/HealthTech startup on a mission to transform healthcare operations with intelligent automation. We value high ownership, rapid iteration, and a passion for improving patient outcomes.

We are seeking a QA Lead to own the quality and reliability of our AI-driven platform from day one. This is a founding QA role where you will design and implement our testing and release infrastructure from scratch – ensuring that our healthcare AI systems are safe, robust, and ready for mission-critical use. You will combine traditional test automation with novel LLM-based evaluation techniques to validate the performance of our AI agents. In addition to hands-on testing and automation, you’ll champion best practices in CI/CD and eventually help grow a QA team. This role is perfect for a seasoned QA engineer who is excited by the challenge of testing cutting-edge AI products and wants to have broad ownership over release quality in a high-growth startup environment.

Responsibilities

Own End-to-End Quality Strategy: Define and implement a comprehensive QA strategy for our platform, covering everything from frontend and backend testing to rigorous evaluation of AI/ML components. Establish QA processes, standards, and success metrics to ensure each release meets a high quality bar for our customers.

Test Automation Infrastructure: Build and maintain automated test frameworks for both software functionality and AI behavior. Develop suites of unit, integration, and end-to-end tests that run in our CI pipeline, catching regressions early. Use tools and frameworks (e.g. PyTest, Selenium or similar for UI testing, and API testing tools) to systematically validate product features on every commit.

LLM-Based Evaluation: Innovate on testing our AI agents and LLM-driven features. Design novel test cases and prompting techniques to evaluate Large Language Model responses and decision-making. For example, create automated routines that challenge our AI with diverse scenarios and use AI or rule-based checkers to grade the outputs for correctness, coherence, and safety. Continuously refine these evaluations to improve coverage of edge cases, factual accuracy, and bias detection.

Continuous Integration & Deployment (CI/CD): Take ownership of the release pipeline. Work with DevOps to integrate automated tests and evaluation scripts into a CI system (e.g. GitHub Actions, Jenkins) so that every code change and model update is evaluated thoroughly before deployment. Implement Continuous Evaluation practices for AI (LLMOps), enabling real-time performance monitoring of our models in staging and production.

Manual Testing & QA of AI Outputs: Where automation falls short, perform hands-on testing of new features and AI outputs. Conduct thorough user-level testing of scenarios, review AI decisions or conversations for quality, and verify that fixes truly resolve issues. Develop clear bug reports and work closely with engineers to troubleshoot and resolve defects.

Release Management: Own the go/no-go quality gate for releases. Document release notes, known issues, and test results for each version. Coordinate cross-functionally to ensure that product, engineering, and management are aligned on release readiness. Over time, help implement blue/green or canary releases and other best practices to minimize risk in deploying updates.

Quality Leadership & Team Building: As the quality expert, educate and mentor developers on writing testable code and adopting QA best practices. Champion a culture of quality where testing and consideration of edge cases are part of the development process. In the future, help hire and lead additional QA engineers or SDETs, scaling the QA function while maintaining a hands-on role.
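
As a rough illustration of the LLM-Based Evaluation responsibility above, a rule-based checker that challenges an agent with scenarios and grades its outputs might look something like the sketch below. Everything here is hypothetical and for illustration only: the Scenario class, the grade and run_suite helpers, the stub_agent stand-in, and the sample prompts are assumptions, not a description of our actual platform or tooling.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scenario:
    """One hypothetical test scenario for an AI agent."""
    prompt: str
    must_include: List[str]      # phrases a correct answer should contain
    must_not_include: List[str]  # phrases that would signal an unsafe answer


def grade(response: str, scenario: Scenario) -> dict:
    """Rule-based check: required phrases for correctness, banned phrases for safety."""
    text = response.lower()
    missing = [p for p in scenario.must_include if p.lower() not in text]
    banned = [p for p in scenario.must_not_include if p.lower() in text]
    return {"passed": not missing and not banned, "missing": missing, "banned": banned}


def run_suite(agent: Callable[[str], str], scenarios: List[Scenario]) -> float:
    """Run every scenario through the agent and return the pass rate."""
    if not scenarios:
        return 0.0
    results = [grade(agent(s.prompt), s) for s in scenarios]
    return sum(r["passed"] for r in results) / len(results)


def stub_agent(prompt: str) -> str:
    # Hypothetical stand-in for a real AI agent; always gives the same reply.
    return "Please contact your care team to reschedule the appointment."


scenarios = [
    Scenario("Patient asks to reschedule a visit.", ["reschedule"], ["diagnosis"]),
    Scenario("Patient asks for a dosage change.", ["clinician"], ["increase your dose"]),
]

pass_rate = run_suite(stub_agent, scenarios)
print(f"pass rate: {pass_rate:.0%}")
```

In practice the rule-based grade function could be swapped for, or combined with, a second model acting as a grader; the pass rate could then be tracked per commit in CI.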

Qualifications

QA Expertise: 5+ years of experience in Quality Assurance or Software Test Engineering, including leadership of test planning and automation efforts. You have designed test strategies and frameworks for complex software products from scratch or in early stages. Strong knowledge of QA methodologies, release processes, and software development life cycle.

AI/ML Product Experience: Experience testing AI-powered or machine learning-based products is required. You understand the unique challenges of validating AI systems (nondeterministic outputs, model “drift”, etc.) and have applied creative techniques to test ML models or data-driven features. Familiarity with LLM evaluation metrics and frameworks is a big plus, as is knowledge of AI safety and bias considerations in testing.

Strong Automation Skills: Proficiency in programming (Python preferred) to write test scripts, automation tools, and perhaps simple utilities for data manipulation. Experience with test automation frameworks (e.g. Jest/Mocha for Node, or PyTest/unittest for Python) and CI/CD pipelines. You can set up test runners in CI, configure Docker/containerized test environments, and integrate reporting tools.

CI/CD and DevOps Knowledge: Hands-on experience with continuous integration and deployment systems. You can create and maintain pipelines that build, test, and deploy applications. Knowledge of infrastructure-as-code, containerization (Docker/Kubernetes), and managing test environments is beneficial.

Attention to Detail & Analytical Skills: Keen eye for identifying issues, patterns, and edge cases. Ability to analyze complex systems (including logs, data outputs, performance metrics) to pinpoint problems. You approach testing methodically and are obsessed with reliability and user experience.

Communication & Collaboration: Excellent written and verbal communication skills for documenting test plans, writing clear bug reports, and conveying quality status. You work closely with developers, product managers, and even beta users – able to explain technical issues and quality risks in simple terms. Prior experience coordinating releases across teams is a plus.

Education: Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience. Advanced coursework or certifications in software testing or ML engineering are nice to have.

Ideal Candidate Attributes

Innovative Tester: You love breaking things in order to make them better. When faced with a novel AI feature, you relish figuring out creative ways to test it – whether by inventing new automation or enlisting another AI to help evaluate the outputs. You stay up-to-date on the latest in AI evaluation techniques and QA tools to continuously improve our practices.

Leadership & Ownership: Ready to be the single point of accountability for product quality. You take that responsibility seriously and establish credibility through action. Even as an individual contributor, you think like a team lead – prioritizing effectively, influencing others to care about quality, and planning for the long term (scalable processes).

Startup Mentality: You thrive in a fast-paced, evolving environment. You are proactive and resourceful, capable of building QA solutions with limited resources. Ambiguity doesn’t scare you; you take initiative to define a process where there is none. You’re excited by the prospect of building the QA/release function from the ground up.

Passion for Mission: Inspired by the impact of technology in healthcare. You understand that quality can be life-critical in this domain, and you are motivated to ensure our AI performs safely and effectively to improve patient care.

Collaborative & Curious: You work well with engineers and data scientists, respecting their expertise and learning from them. At the same time, you’re not afraid to ask questions and challenge assumptions that could affect quality. You foster positive collaboration, helping the entire team raise the bar on testing practices.

Benefits

Competitive Compensation: Top-tier salary commensurate with experience, plus significant equity in the form of stock options. You’ll share in the company’s success as we grow.

Comprehensive Healthcare: Full medical, dental, and vision coverage for you (and options for your family) – we’re a health company and believe in taking care of your health first.

Remote-First Culture: Work from anywhere in the U.S. (with a preference for Pacific or Eastern time zones for team sync). We provide a home office stipend for equipment, and we have quarterly in-person meetups to build team camaraderie.

Learning & Development: Annual budget for professional development – attend that QA conference, take an AI course, or get a certification. We want you to stay at the cutting edge of QA and AI tools.

Startup Perks: Early team influence on company culture and engineering practices. Minimal bureaucracy, open communication, and a chance to shape not just the product quality but the overall direction of an AI-first healthcare startup. You’ll be working with a small, brilliant team that loves to have fun while tackling serious challenges.

Join us in our mission to revolutionize healthcare with AI. If you’re excited about this role and meet a majority of the requirements, we encourage you to apply!

We care about passion, potential, and perseverance. Help us build something amazing that makes a real difference in people’s lives.
