Born-Digital Archives in Collecting Repositories: Turning Challenges into Byte-Size Opportunities

Gretchen Gueguen, Mark A. Matienzo,

Simon Wilson, and Peter Chan

Session 502, 27 August 2011

Society of American Archivists Annual Meeting

AIMS

AIMS Project

"Born-Digital Collections: An Inter-Institutional Model for Stewardship"

Two-year project to create a framework for stewardship of born-digital archival records in collecting repositories

Funded by the Andrew W. Mellon

Foundation

AIMS

Partners

AIMS

Grant Goals

Processing of Hybrid Collections

Software Development

Community Development

Unconference (May 2011, Charlottesville, VA)

UK Symposium (June 2011, London, England)

Workshop (August 2011, Chicago, IL)

White Paper and Project Report

AIMS

Framework Development

A framework for collecting and delivering

the born-digital materials that are quickly

beginning to constitute the collections of

contemporary scholarly, literary, and political

figures and organizations.

University of Virginia

AIMS

What is Collection Development?

Actions and policies of institutions to bring in

material for end users (both current and future);

includes prioritizing, developing relationships with

creators, assessments, negotiating agreements and

preparing for accessioning.

Within the AIMS framework

A viable, practical method to capture/process born-digital material from hybrid collections requires sound work at the beginning (e.g. policies, practices, agreements with donors) to set up later work

AIMS

Elements of Collection

Development

1. Prerequisites

2. Establish relationship with donor

3. Analyze feasibility

4. Negotiate agreements

5. Prepare for accessioning

AIMS

Prerequisites

Neil Beagrie, "Plenty of Room at the Bottom? Personal Digital Libraries and Collections," D-Lib Magazine (June 2005)

Blagofaire. http://xkcd.com/239/

AIMS

Donor Relationship

AIMS

Enhanced Curation

AIMS

Analyzing Feasibility

AIMS

Negotiate Agreements

All rights reserved by Chevrolet UK

AIMS

Prepare for Accessioning...

Scope and extent determined?

Coordination with acquisition of analog material?

Method and time determined?

Pre-acquisition appraisal performed?

Enhanced curation carried out?

Test capture if needed?

Development of new methodologies undertaken as needed/possible?

AIMS

Accessioning

Mark A. Matienzo, Yale University

AIMS

What is Accessioning?

Archival institution takes physical and legal custody

of a group of records from a donor and documents

the transfer in a register or other representation of

the institution's holdings

Within AIMS Framework

Processes which establish physical, administrative

and intellectual control over transferred records;

assessment and documentation of future needs;

documentation of actions taken; beginning of safe

storage and maintenance

AIMS

Elements of Accessioning

1. Prerequisites

2. Transfer records and gain administrative control

3. Physical control and stabilization

4. Intellectual control and documentation to support further processes

5. Maintain accessioned records

AIMS

Case Study:

Re-Accessioning at Yale

Collaborative capacity building across two

repositories

Manuscripts and Archives

Beinecke Rare Book and Manuscript Library

Addressing previously received accessions containing electronic records on media

Still in testing phase, but working towards

implementing in production

AIMS

Types of Records and Media

Wide variety of records creators

Literary authors

University faculty

University offices

Architectural firms

Common types of media

Floppy disks: 5.25-inch and 3.5-inch

Optical media: CD-ROM, CD-R, DVD-R, etc.

Zip disks

USB flash drives

AIMS

Goals of Re-Accessioning

Identify, document, and register media

Mitigate risk of media deterioration and

obsolescence

Extract basic metadata from filesystems on media

and files contained on filesystems

AIMS

Re-Accessioning Workflow

[Workflow diagram. Steps as recoverable from the figure:]

1. Start accessioning process

2. Retrieve media

3. Assign identifiers to media

4. Record identifying characteristics of media in media log

5. Write-protect media

6. Create image

7. Verify image

8. Extract filesystem- and file-level metadata

9. Package images and metadata for ingest

10. Ingest transfer package

11. Document accessioning process

12. End accessioning process

Artifacts shown: media, disk images, media metadata (Media MD), filesystem/file metadata (FS/File MD), transfer package

AIMS

Disk Imaging

Using forensic (bit-level) imaging process

Ensure data on media is not manipulated using

write-protection

Uses software to acquire images

Includes hash-based verification process
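
The hash-based verification step can be illustrated with a minimal Python sketch. It is not the imaging software used in the project; the function names, the choice of SHA-256, and the chunk size are assumptions made for illustration. The idea is simply that a digest recorded when the image was acquired is recomputed later and compared.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so a large disk image need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(image_path, recorded_digest, algorithm="sha256"):
    """Return True when the image still matches the digest recorded at capture."""
    return file_digest(image_path, algorithm) == recorded_digest.lower()
```

A mismatch means the image differs from what was captured and should be investigated before any further processing.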

AIMS

Media Log

Using SharePoint list

Contains unique identifier of media

Records physical/logical characteristics of media

Documents success, failure, or status of various

processes and additional notes
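
The kind of record the media log holds can be sketched as a small data structure. The project kept its log in a SharePoint list; the Python dataclass below is only a stand-in, and every field name, status value, and identifier format in it is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MediaLogEntry:
    media_id: str                      # unique identifier of the medium (hypothetical format)
    media_type: str                    # physical characteristic, e.g. "3.5-inch floppy"
    filesystem: str = ""               # logical characteristic, filled in once known
    process_status: Dict[str, str] = field(default_factory=dict)
    notes: List[str] = field(default_factory=list)

    def record(self, process, status, note=""):
        """Document the outcome of one process, with an optional free-text note."""
        self.process_status[process] = status
        if note:
            self.notes.append(f"{process}: {note}")

    def needs_attention(self):
        """List the processes that were recorded as failures."""
        return sorted(p for p, s in self.process_status.items() if s == "failure")
```

Keeping status per process, rather than one overall flag, makes it possible to see that a disk imaged successfully but failed metadata extraction.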

AIMS

Metadata Extraction

Can be repurposed for descriptive, administrative,

and technical metadata

Uses command-line tools (Sleuthkit, fiwalk)

Outputs XML document
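
To show how such an XML document might be repurposed, here is a minimal Python sketch that pulls file names, sizes, and hashes out of Digital Forensics XML of the kind fiwalk produces. The sample document is invented and heavily simplified, and the function names are assumptions; real output carries many more elements and usually an XML namespace, which the sketch strips before comparing tag names.

```python
import xml.etree.ElementTree as ET

# Invented, simplified sample; real fiwalk output is far richer.
SAMPLE = """<dfxml>
  <volume>
    <fileobject>
      <filename>LETTERS/DRAFT1.WPD</filename>
      <filesize>20480</filesize>
      <hashdigest type='md5'>0123456789abcdef0123456789abcdef</hashdigest>
    </fileobject>
    <fileobject>
      <filename>README.TXT</filename>
      <filesize>512</filesize>
    </fileobject>
  </volume>
</dfxml>"""


def local(tag):
    """Drop any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def summarize(xml_text):
    """Return one dict per fileobject with its name, size, and recorded hashes."""
    rows = []
    for element in ET.fromstring(xml_text).iter():
        if local(element.tag) != "fileobject":
            continue
        row = {"filename": None, "filesize": None, "hashes": {}}
        for child in element:
            name = local(child.tag)
            if name == "filename":
                row["filename"] = child.text
            elif name == "filesize":
                row["filesize"] = int(child.text)
            elif name == "hashdigest":
                row["hashes"][child.get("type", "").lower()] = child.text
        rows.append(row)
    return rows
```

The resulting rows could feed a file listing for description, or a size total for administrative reporting.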

AIMS

Packaging and Transfer

Using BagIt packages/Bagger application

Packages contain disk images, extracted metadata,

imaging logs, and high-level accession information

Transfer to storage is verified by comparison

against manifest
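
Verification against a manifest can be sketched in a few lines of Python. This is not the Bagger application; it assumes a bag whose `manifest-md5.txt` lists one checksum and one relative path per line, and the function name `verify_bag` is hypothetical.

```python
import hashlib
from pathlib import Path


def verify_bag(bag_dir, algorithm="md5"):
    """Compare each payload file with its manifest entry; return a list of problems."""
    bag = Path(bag_dir)
    manifest = bag / f"manifest-{algorithm}.txt"
    problems = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, relative_path = line.split(None, 1)
        relative_path = relative_path.strip()
        payload = bag / relative_path
        if not payload.is_file():
            problems.append((relative_path, "missing"))
            continue
        # Reads the whole file at once; chunked hashing would suit large disk images better.
        actual = hashlib.new(algorithm, payload.read_bytes()).hexdigest()
        if actual != expected.lower():
            problems.append((relative_path, "checksum mismatch"))
    return problems
```

An empty list means every file named in the manifest arrived intact. The sketch does not detect extra files that are absent from the manifest.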

AIMS

Arrangement & Description

Simon Wilson

Hull University Archives

AIMS

Purpose of Arrangement & Description

The general objectives for Arrangement & Description are:

- to preserve context

- to establish intellectual control of the material

- to provide a means of discovery

SAA definition, emphasis on minimizing the amount of handling

Within the AIMS framework

Processes which establish intellectual control of the material including

implementation of policies and agreements with donors etc. to enable

subsequent discovery and access

AIMS

Elements of Arrangement

and Description

1. Prerequisites

2. Plan for processing

- gather supporting information; files captured from media

(accessioning); convert files (for viewing); appraisal strategy;

assess arrangement options; consider preservation issues

3. Processing

- implement arrangement strategy; add descriptive metadata and

wider context (eg Collection Level Description); copyright &

other legal considerations

4. Prepare for Discovery & Access

- remove restricted access to b-d material during processing

AIMS

Case Study - Stephen Gallagher

Background:

2005: 42 boxes paper archives

2010: born-digital material:

14,320 files (13.6GB) transferred

to us via external hard drive and

a box of Amstrad disks

Create integrated catalogue

to accommodate paper, born-

digital and future accruals

AIMS

Case Study - Stephen Gallagher

Approach:

- current work higher priority in filing system

- considered each work a distinct project

- structure reflects his way of working & the archival principles of control that creator, archivist & user can all understand

Series level was most logical solution

- all related files placed in the series

- reasonable return for our effort

AIMS

Case Study - Stephen Gallagher

300 files created using FinalDraft

screenwriter software

- view file (as created) to identify

appropriate format for long term

preservation

Other issues:

- copyright/third-party content

- commercial implications: access

via repository = publication?

- re-purposing of work from one

(unsuccessful) project to another

AIMS

Challenges faced

Each collection is unique, approach will vary:

- integrate born-digital material with existing material/arrangement?

- one-off collection (eg project) or likely to be subsequent accruals?

- collection type; differs for personal papers & organisational records

- same personnel work on paper and born-digital components?

- can we appraise without knowing the contents?

similar to paper material that is in a different language?

AIMS

Challenges faced

Volume of material :

- depositor perception that 'storage is cheap' - does this mean we shouldn't appraise the material we receive?

- wide range of file types encountered

- not practical to describe each and every file

- risk management - if you don't check every file for sensitive information

- we need to automate as much of the processing as possible

AIMS

Hypatia

Digital archivists identified a gap in current tools and used their experiences to define the requirements for a new tool

Key features identified:

- need an intuitive (for archivists) graphical interface

- drag'n'drop to create the intellectual arrangement

- ability to return to original order of the material

- view some file types, add descriptive metadata etc

- high level of granularity when applying rights & permissions

Technical (acquired at accessioning) and descriptive metadata feed into the Discovery & Access process

AIMS

Discovery and Access

Peter Chan

Stanford University

AIMS

What is Discovery & Access?

Discovery and Access refers to the systems

and workflows that make processed or

unprocessed material and the metadata that

support it available to users.

Discovery and Access

Arrangement and Description

Accessioning

Collection Development

AIMS

Goals of D&A

To make material available to user communities by

ensuring that they can:

find out about material

understand whether it is available for consultation and if so,

how

access material.

To apply appropriate access restrictions in order to

protect private and sensitive information as well as

intellectual property.

To provide access to material in a format and/or

environment that presents the original's significant

properties.

AIMS

Case Study - Stephen Jay Gould

Papers

Analog component: 550 linear feet of papers (789 boxes, 119 cartons, 30 flat boxes, and 14 map folders).

File size and number: 59.7 MB and 2,567 files.

Media formats: 98 3.5-inch floppy diskettes; 61 5.25-inch floppy diskettes; 4 sets of punch cards*; 3 computer tapes

File Types: Computer Programs; Data sets; Documents; Spreadsheets

File Formats: ASCII Text; WordPerfect 4.2, 5.0, 5.1, 6.0, 6.1; Microsoft

Word 2.0, 6.0, 97, 2000; Microsoft RTF; Microsoft Excel 4.0; Lotus 1-2-3

2.0, etc.

* During processing of the analog papers in 2011, another 21 sets of

punch cards and more floppy diskettes were found.

AIMS

D&A EAD

AIMS

D&A Facet Browsing

AIMS

D&A Full text search

AIMS

D&A See Contents on Web

AIMS

D&A Tag & Annotation by

Invited Persons / Public

Annotation:

AIMS

Impacts from

Collection Development

File formats: no restriction

Computer medium: no restriction (punch card, open reel tape, 5.25 inch floppy, 3.5 inch floppy)

File type: no restriction (computer program, data set, document, spreadsheet)

Agreement: permission to post contents online.

AIMS

Impacts from

Accessioning

Built 5.25 inch floppy capture station

Asked Computer History Museum to read punch cards

Open reel tapes still outstanding

AIMS

Impacts from

Processing

AccessData FTK was used to search files for restricted information, annotate files with descriptive metadata (book title, articles, etc.) and rights metadata (access restrictions), and generate technical metadata for the delivery platform to act upon.

Transit Solution was used to transform files to HTML format for display on the web.

An XSLT program was written to transform the XSL-FO output from FTK into an XML content document. A Ruby program was written to ingest the XML content document, original files, and the display derivatives into Fedora.
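
The search for restricted information was carried out in FTK. As a rough illustration of the idea only, the Python sketch below flags extracted text that matches a couple of patterns using plain regular expressions. The patterns, their labels, and the function name are assumptions; they are far cruder than a forensic tool's pattern search and will produce both false positives and misses.

```python
import re

# Illustrative patterns only; a real review would need a vetted, locale-aware list.
PATTERNS = {
    "us_ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def flag_restricted(texts):
    """Map each file name to the sorted labels of the patterns its text matches."""
    flagged = {}
    for name, text in texts.items():
        hits = sorted(label for label, pattern in PATTERNS.items() if pattern.search(text))
        if hits:
            flagged[name] = hits
    return flagged
```

Files that are flagged would still need an archivist's review before an access restriction is recorded.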

AIMS

FTK Bookmark and Label

AIMS

FTK Full Text, Pattern Search &

Fuzzy Hash

AIMS

Emulation Design Files

AIMS

Network Diagram for 50,000

Creeley Emails

AIMS

MUSE: Sentiment Analysis for

Emails

AIMS

MUSE: See Individual Email

AIMS

Want to know more?

http://born-digital-archives.blogspot.com

Gretchen Gueguen Mark Matienzo

*****@********.*** ****.********@****.***

Simon Wilson Peter Chan

*.******@****.**.** ******@********.***

AIMS
