PhD Machine Learning Intern

Company:

Protagonist

Location:

Washington, DC

Posted:

August 03, 2025

Description:

Join Our Talented Team at Protagonist

We fuse rigorous, methodologically sound analysis with our cutting-edge technology platform, Narrative Analytics®. This powerful combination enables us to quantitatively analyze open-source media, deliver strategic recommendations, and craft executive-level communication strategies for clients with missions that matter.

Why Us?

Our team is a vibrant mix of communication specialists, data scientists, and subject matter experts with extensive experience across U.S. Government agencies, non-profit organizations, and Fortune 500 companies. By joining Protagonist, you'll immerse yourself in a collaborative environment where innovation thrives, and your contributions truly matter.

What We Do

Innovative Solutions: We co-develop cutting-edge solutions with our clients to address tough communication problems and capitalize on opportunities to make a tangible impact.

Data-Driven Insights: Our tools and methodologies provide actionable insights that help clients meet their communication objectives and stay ahead of global challenges.

Applied Expertise: We integrate our solutions within client organizations, leveraging our profound expertise to address critical issues and ensure sustainable success.

Be Part of Something Bigger

At Protagonist, you'll work on compelling projects that make a real difference. We seek talented individuals eager to contribute to our mission and grow alongside us. If you're passionate about communication, data analysis, and making an impact, we invite you to explore a career with Protagonist.

Explore Your Future with Us!

Ready to take the next step in your career? Join us at Protagonist and be part of a team that's making a difference.

About You

The PhD Machine Learning Intern has a passion for cutting-edge AI research and its practical applications in narrative intelligence. You will play a key role in advancing our GEN5 System through the development and optimization of state-of-the-art Retrieval-Augmented Generation (RAG) architectures. You are deeply familiar with the latest developments in large language models, vector databases, and information retrieval systems. You thrive on solving complex technical challenges at the intersection of NLP research and production systems, and you're excited about translating academic insights into real-world impact.

Primary Responsibilities

During this internship, you will focus on research, development, and implementation of advanced RAG systems for our GEN5 platform. You will work closely with our Senior Machine Learning Engineers, Data Scientists, and VP of Technology to push the boundaries of what's possible in narrative analytics through intelligent information retrieval and generation.

Specific Responsibilities

Research and implement novel RAG architectures optimized for multi-modal narrative data processing

Design and develop advanced retrieval mechanisms using dense and sparse vector representations

Experiment with hybrid search approaches combining semantic similarity and keyword-based retrieval

Optimize embedding models and vector databases for large-scale narrative content indexing

Develop and evaluate chunking strategies for complex, multi-document narrative datasets

Implement and fine-tune reranking models to improve retrieval precision

Design evaluation frameworks for RAG system performance, including relevance, faithfulness, and narrative coherence metrics

Collaborate with the Data Science team to integrate RAG capabilities into existing narrative detection pipelines

Conduct experiments on prompt engineering and context optimization for improved generation quality

Research and implement techniques for handling multi-language and cross-cultural narrative content

Contribute to research publications and technical documentation of methodologies and findings

Present research progress and findings to cross-functional teams and stakeholders

Requirements

Currently pursuing a PhD in Computer Science, Machine Learning, Natural Language Processing, or a related field

Authorized to work in the US

Must be able to work on US Government contracts that may be restricted to US persons

Strong theoretical foundation in machine learning, deep learning, and natural language processing

Hands-on experience with transformer architectures, large language models, and embedding models

Proficiency in Python and deep learning frameworks (PyTorch, TensorFlow, Hugging Face)

Experience with vector databases and similarity search systems (Pinecone, Milvus, FAISS, OpenSearch, PGVector)

Knowledge of information retrieval concepts and evaluation metrics

Experience with distributed computing and large-scale data processing

Strong research and analytical skills with the ability to work independently

Excellent communication skills and the ability to present complex technical concepts clearly

Preferred Qualifications

Published research in RAG systems, information retrieval, or related NLP areas

Experience with multi-modal learning and cross-lingual NLP

Understanding of knowledge graph construction and reasoning

Familiarity with narrative analysis or social media data processing

Experience with A/B testing and experimental design for ML systems

Background in computational social science or digital humanities

What You'll Gain

Hands-on experience applying cutting-edge AI research to real-world problems with societal impact

Opportunity to work with large-scale, diverse datasets spanning global narratives

Mentorship from experienced ML engineers and data scientists

Exposure to production ML systems serving enterprise clients

Potential for research publication and conference presentations

Experience in a fast-paced, mission-driven startup environment

The pay rate for this position is $32.00 per hour, and the expected duration is 4 months.

Protagonist is an Equal Opportunity Employer.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
