Job Summary: We are seeking a Dataset Curator to design, maintain, and optimize high-quality datasets for AI, Machine Learning (ML), and Large Language Model (LLM) projects. The Dataset Curator works closely with data scientists, AI trainers, and engineers to gather, clean, validate, and annotate datasets across multiple domains, ensuring the data supports robust AI model training and evaluation. The Dataset Curator must understand dataset diversity, bias detection, quality assessment, and metadata management. This role is critical to improving AI performance, dataset reliability, and data-driven decision-making.
Key Responsibilities
• Curate, collect, and structure datasets for AI and ML training purposes.
• Validate dataset accuracy, completeness, and consistency.
• Annotate and label datasets according to project-specific guidelines.
• Identify and correct data inconsistencies, duplicates, and anomalies.
• Maintain metadata and documentation for datasets.
• Collaborate with AI trainers and data engineers to define dataset requirements.
• Ensure datasets are ethically sourced and screened for bias.
• Continuously monitor dataset quality and propose improvements.
Job Requirements
• Bachelor’s or Master’s degree in Data Science, Computer Science, Statistics, Information Systems, or a related field.
• Proficiency in Excel, Google Sheets, SQL, and/or Python for dataset handling.
• Knowledge of data cleaning, normalization, and transformation techniques.
• Familiarity with data annotation tools and platforms.
• Understanding of structured, semi-structured, and unstructured datasets.
• Experience with database management and version control systems.
• Awareness of AI dataset ethics and bias mitigation.
• Relevant certifications are an advantage, such as the Google Data Analytics Certificate, a Data Management or Curation Certification, or AI/Data Annotation Training (optional but preferred).
• 5-8 years of proven experience in dataset curation, data analysis, or data management roles.
• Experience handling large-scale datasets for AI, ML, or analytics projects.
Salary: $1000
How to Apply: Qualified candidates should send their CVs to using the job title as the subject of the email.