Data Curator

Company:
University of Pittsburgh
Location:
Pittsburgh, PA, 15289
Posted:
September 30, 2025

Description:

The University of Pittsburgh Center for Research Computing and Data (CRCD) seeks a highly motivated and detail-oriented Data Curator to support our growing research data management and curation services and related training opportunities. This position, funded for a minimum of 3 years, will play a critical role in ensuring that our teams, particularly those from social science and related disciplines, are empowered to organize and deploy research data and data-based applications that are findable, accessible, interoperable, and reusable (FAIR).

The Data Curator will be responsible for executing and continuously improving data curation workflows, working closely with researchers, training teams, and repository staff to facilitate responsible and effective use of research data.

The ideal candidate will have demonstrable experience in managing unstructured and semi-structured data, a keen understanding of the needs of social science researchers, and excellent communication skills for collaborative work. Experience with machine learning pipelines and transformer-based text processing tools is preferred.

Key Responsibilities:

Develop, Implement, and Improve Data Curation Workflows:

Execute and refine replicable and efficient data curation workflows for data file packaging and repository ingestion, with a special focus on unstructured and semi-structured information from social science and related research projects.

Perform file format normalization, prepare documentation, and generate comprehensive metadata for datasets.

Conduct quality reviews of dataset packages to ensure completeness, accuracy, and adherence to standards.

Manage the transfer of curated dataset packages to specified institutional or external repositories.

Organize intermediate data products and documentation to enhance the efficiency and accuracy of interactive applications and tools built upon repository data.

Where applicable, document the provenance and transformations of data as it moves through model-based processing within a project.
