Description
Your tasks will involve writing adversarial prompts to uncover weaknesses in cutting-edge AI systems, including Large Language Models (LLMs), Text-to-Image, Text-to-Video, and multimodal models, as well as AI agents. You will also manage and analyze datasets to produce high-quality outputs and actionable insights for AI safety research.
Key Responsibilities
Design adversarial prompts to test AI systems across multiple modalities.
Identify, categorize, and document model weaknesses or unsafe outputs.
Support data annotation, curation, and quality control processes.
Summarize findings into structured reports or data templates.
Requirements
Proven hands-on experience with generative AI models is essential; however, a formal technical background is not a prerequisite.
Understanding of risk taxonomies (e.g., harm categories, policy tiers).
Near-native command of English.
Strong attention to detail and organizational skills.
Ability to manage multiple tasks simultaneously and meet deadlines.
Preferred Qualifications
Familiarity with a range of model types (e.g., Text-to-Text, Text-to-Image).
Experience with prompt injection, jailbreaking, and other red-teaming techniques.
Prior work in model evaluation, prompt engineering, or safety analysis.
Regional expertise or cultural fluency in specific geopolitical areas.