
AI/ML Data Scientist, RAG Engineer

Location: Boulder, CO
Posted: June 03, 2026


Resume:

Alexander Booth

Senior AI/ML Data Scientist | ML Engineer

Boulder, CO 80302 | Remote U.S. | 720-***-**** | *********.**.****@*****.*** | linkedin.com/in/alexandercbooth

PROFESSIONAL SUMMARY

Senior Machine Learning, Data Science, and AI/ML engineer with 10+ years of experience building production ML systems, LLM/RAG workflows, model-ready datasets, and large-scale data pipelines. Strong in Python, SQL, Spark, AWS/GCP, MLOps, statistical modeling, NLP, deep learning, forecasting, experimentation, and model evaluation. Experienced in turning messy source data into trusted training, evaluation, and inference workflows while partnering with product, engineering, research, and domain teams. Brings full-stack AI product experience across Flask, Django, React, and Next.js to move models from notebooks into usable, production-ready applications.

CORE AI/ML, DATA SCIENCE & ENGINEERING SKILLS

Machine Learning & AI: Supervised/unsupervised learning, deep learning, NLP, LLMs, RAG, embeddings, vector search, recommendation systems, model evaluation

Data Science & Analytics: Forecasting, statistical modeling, experimentation, A/B testing, customer analytics, risk analytics, segmentation, optimization, KPI dashboards

RAG & LLM Systems: Retrieval pipelines, semantic search, context ranking, prompt engineering, evaluation datasets, fine-tuning support, hallucination reduction

Data Engineering: Python, SQL, Spark, ETL/ELT, data modeling, schema design, feature engineering, data quality, validation, monitoring, reproducible datasets

MLOps & Cloud: AWS, GCP, Docker, Kubernetes, CI/CD, model deployment, model monitoring, scalable ML infrastructure, production reliability

Full-Stack AI Applications: Flask, Django, FastAPI, React, Next.js, REST APIs, AI dashboards, internal ML tools, data review interfaces, model workflow automation

PROFESSIONAL EXPERIENCE

GitHub - Senior Machine Learning Engineer | August 2021 - Present | United States

• Own and deliver production AI/ML systems for large-scale developer intelligence products, with emphasis on dependable data flows, high-quality model inputs, model evaluation, and infrastructure that supports millions of daily users.

• Designed and improved LLM and RAG-style workflows for code generation, semantic retrieval, context assembly, ranking, and evaluation, improving code-assistance accuracy by more than 30%.

• Built and optimized end-to-end ML and data pipelines using Python, SQL, Spark, cloud services, containers, and CI/CD, reducing model latency by 40% while strengthening production reliability.

• Designed reusable feature tables, training datasets, evaluation corpora, and schema documentation so research, data science, and ML engineering teams could work from consistent sources.

• Developed internal AI workflow tools and prototypes with Flask/Django services and React/Next.js interfaces for dataset review, prompt testing, model evaluation, and release readiness.

• Implemented monitoring and validation around model and data pipelines to catch missing fields, drift, quality regressions, latency issues, and reliability risks earlier in the launch process.

• Mentor engineers and collaborate with cross-functional teams on production ML, RAG evaluation, data quality tradeoffs, experiment design, and maintainable AI system architecture.

Specialized Bicycle Components - Senior Data Scientist | June 2020 - July 2021 | United States

• Built demand forecasting, customer analytics, and product performance models from fragmented sales, inventory, product, and customer behavior data.

• Improved forecast accuracy by 25%, helping supply chain, operations, and business teams make stronger planning decisions during changing demand cycles.

• Created model-ready feature pipelines, reusable analytical datasets, and validation routines so forecasting and segmentation models could be refreshed reliably without manual spreadsheet work.

• Applied machine learning, statistical modeling, and optimization to manufacturing, inventory, customer segmentation, and product-performance questions.

• Delivered analytics APIs, dashboards, and decision-support tools using Python, SQL, Flask/Django patterns, and React-style reporting workflows to make model outputs usable by business stakeholders.

Corvus Insurance - Machine Learning Engineer | September 2019 - May 2020 | United States

• Designed ML models for underwriting automation, risk assessment, and decision support using structured policy data, external signals, and unstructured risk information.

• Built scalable real-time analytics and feature-generation pipelines, moving raw source data through cleaning, enrichment, modeling, and operational consumption steps.

• Reduced underwriting time by 20% by improving the data, modeling, scoring, and automation workflow behind risk decisions.

• Created model-serving and experimentation workflows with Python, SQL, Flask/Django-style services, explainability checks, and monitoring for model quality and operational use.

• Worked with sensitive customer and risk data using strong validation, traceability, access-aware handling, and audit-ready data practices relevant to regulated AI environments.

University of New Hampshire - Software Engineer / Applied AI Research | September 2011 - May 2019 | United States

• Developed research-grade machine learning, physics simulation, and numerical analysis software for complex experiments where reproducibility and data quality were critical.

• Built Python/C++ simulation and analysis pipelines for micromagnetic systems and 3D skyrmion structures, converting raw simulation outputs into structured research datasets.

• Integrated neural-network methods with physics-based simulation frameworks to explore AI-driven modeling of complex scientific systems.

• Created experiment tracking, validation steps, repeatable workflows, and lightweight research tools that helped collaborators review, rerun, and extend results.

• Published peer-reviewed research in computational physics and AI applications while collaborating across technical and scientific teams.

SELECTED AI/ML, DATA SCIENCE & RAG PROJECTS

RAG Knowledge Assistant: Designed retrieval-augmented generation workflows using embeddings, vector search, semantic ranking, context construction, prompt templates, and evaluation datasets for more grounded LLM outputs.

Full-Stack ML Applications: Built AI/ML application patterns with Django, Flask, React, and Next.js to expose models through APIs, dashboards, dataset review tools, and human-in-the-loop workflows.

Model-Ready Dataset Pipelines: Built pipelines that cleaned, structured, validated, and monitored large datasets for ML training, experimentation, evaluation, and inference across production AI systems.

Predictive Risk Engine: Developed risk-scoring models and feature pipelines for underwriting automation, combining structured records, external signals, and explainable model outputs.

Scientific AI Simulation Platform: Integrated neural networks with physics-based simulations, creating reusable data-processing and analysis workflows for complex scientific datasets.

EDUCATION

Master's Degree, Analytics - University of New Hampshire

Bachelor's Degree, Mechanical Engineering - University of Maine

ATS KEYWORDS

Machine Learning, Data Scientist, AI Engineer, ML Engineer, RAG Engineer, LLM Applications, NLP, Deep Learning, Data Science, MLOps, Python, SQL, Spark, AWS, GCP, Docker, Kubernetes, Django, Flask, React, Next.js, Forecasting, Model Evaluation, Feature Engineering, Vector Search, Embeddings


