AIMS Born-Digital Collections:

An Inter-Institutional

Model for Stewardship

January 2012

University of Hull

Stanford University

University of Virginia

Yale University

Acknowledgement

The AIMS Project is a partnership between the University of Virginia Libraries,

Stanford University Libraries and Academic Resources, the University of Hull Library,

and Yale University Library with support from the Andrew W. Mellon Foundation.

Suggested citation:

AIMS Work Group. 2012. AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship.

http://www2.lib.virginia.edu/aims/whitepaper/AIMS_final.pdf.

Table of Contents

Foreword
The AIMS Framework: The Functions of Stewardship
1. Collection Development
    Donors and Trust: University of Hull
    Enhanced Curation at the British Library
2. Accessioning
    Evolution of Accessioning at University of Hull
    Project Xanadu: Loss and Recovery
3. Arrangement and Description
    Technical Development: Functional Requirements for Arrangement and Description
    Arrangement and Description Case Study: The Papers of Stephen Gallagher
4. Discovery and Access
    Visualizing Email Access: MUSE
    Access models (Table 1)
    Publication Pathway and Discovery and Access at the Bodleian Library
    Discovery models (Table 2)
Conclusions
Appendix A: Glossary
Appendix B: Bibliography
Appendix C: Contributor Biographies
Appendix D: Institutional Summaries and Collection Descriptions
    1. The University of Hull, University Archives at Hull History Centre
    2. Stanford University, Stanford University Libraries & Academic Information Resources
    3. The University of Virginia, Albert and Shirley Small Special Collections Library
    4. Yale University
Appendix E: Sample Processing Plans
    1. University of Hull: Stephen Gallagher Processing Plan
    2. Stanford University: Gould Processing Plan
    3. University of Virginia: Cheuse Papers Processing Plan
    4. Yale University: Tobin Collection Processing Plan
Appendix F: Policies, Templates, Documents, etc.
    1. AIMS Donor Survey
    2. University of Hull Accessioning Workflows
    3. University of Hull Digital Media Photography Form
    4. University of Hull Insertion Sheet
    5. Guidelines for Creating Agreements at Stanford University
    6. Stanford University Processing Workflow
    7. Beinecke Library Born Digital Archival Acquisition Collection and Accession Guidelines
Appendix G: Technical Evaluation and Use
    1. AccessData FTK3.3
    2. AccessData FTK Imager 3.0
    3. Comparison of 5.25" Floppy Disk Drive Solutions
    4. Karen's Directory Printer (v.5.3.2)
    5. Curator's Workbench
Appendix H: Technical Development
    1. Functional Requirements for Arrangement and Description
    2. Rubymatica
    3. Hypatia
Appendix I: Digital Archivist Community
    1. Born Digital Archives Blog
    2. Digital Archivist Community Events
    3. Day of Digital Archives
    4. Presentations, Conferences, and Publications

Foreword

The AIMS project evolved around a common need among the project partners and most libraries and archives

to identify a methodology or continuous framework for stewarding born-digital archival materials. These

materials have been slowly accumulating in archival backlogs for years but are rapidly growing as more

contemporary collections are accessioned.

Alongside the many and complex technological requirements, the challenges of stewarding born-digital material

demand new strategies as well as a redefinition of archival workflows. Accordingly, this emerging challenge will affect

the skill-set needed for archivists and the working relationships among archival colleagues as well as those outside

our communities and organizations. If the archival profession aims to preserve and manage born-digital material to

standards matching those of paper-based collections, a broader and deeper understanding of these issues must be

developed, and this understanding must be incorporated into training of new archival professionals, professional

development programs, and continuing education.

In both the United Kingdom and the United States, the home countries of the AIMS partners, there is a

perception of a high bar for entry in the world of digital archives, both in terms of expertise and resources.

Therefore, many institutions are reluctant to take even initial steps.

In the US in particular, organizational cultures have made sharing best practices difficult. While the electronic

records, or e-records, community in the US has focused more on organizational records from information and

knowledge management perspectives, those working in manuscript collecting repositories have been somewhat

reluctant to enter an unfamiliar arena. Common issues in these collecting repositories (for example, legacy

material and undefined accessioning practices) made it difficult to build expertise and capacity. Moreover,

institutional practice has been focused on immediate local needs rather than developing a shared framework.

Now there is a small but emerging group of archivists working on issues related to born-digital content in personal

papers and committed to sharing best practices. In addition, there is a growing recognition between archivists and

those in the digital community that collaboration is absolutely crucial to success in this new paradigm.

The size of the archival community in the UK makes for a smaller arena within which to share ideas and solutions.

In the UK there is a more developed, even thriving, community of practitioners working on born-digital archives of

external donors/depositors as well as from their own organizations. However, there is a wide and growing gap

between institutions with established staff, equipment, and processes (mostly national institutions and some

universities) and those with no expertise or capacity whatsoever. Many smaller repositories cannot afford to

collaborate with other institutions and thus cannot share some of the developments of their better-funded

colleagues.

Despite these challenges, individual institutions and collaborative partnerships in both the UK and US are doing a

great deal of work in research, development, and practical implementation. Some of the many projects and

initiatives that influenced and informed the work of the AIMS partners are discussed in the next section. These

projects approach the issues from different archival and technical perspectives. Recently, new tools have been

developed that focus on capture, identification, or preservation. Some projects have discovered and are incorporating

tools used in other fields, particularly technology developed for forensic investigations.

Although a great deal of work has been and continues to be done in this area, there is not yet a unified approach

to address the lifecycle of stewardship in an accessible way and, most importantly, in a way that is grounded in

archival practice. There is no single model to evaluate these many different approaches to born-digital stewardship

and to unite them in a framework of objectives and options.

THE AIMS PROJECT

Into this climate, the AIMS partners proposed an inter-institutional framework for stewarding born-digital content.

The AIMS partners realized that they could not solve all problems associated with born-digital materials but

decided to focus their attention on professional practice defined by archival principles and by the current state of

collections at the partner institutions.

In developing the AIMS Framework, the project would apply a practitioner-based research approach by developing

a model based on real case studies of collections at each institution. Applying our theories would confirm or

challenge the initial framework, which could then be used as a model around which to build individual workflows

and processes within each partner's organization. This test of concept for the AIMS Framework would prove

whether it could be used within a wide range of organizations with different staffing models, archival processes,

tools and infrastructure. This practical approach imposed a discipline and a framework for investigations and

discussions; provided a variety of case studies, with different record formats, legacy issues, scale and complexity, and

donor relationships; and defined an archival context for identifying ethical issues and other challenges, clearly

demonstrating the need for workable and scalable solutions.

The AIMS project was originally tasked to make recommendations for best practice including tools and workflows

which could be applied within a variety of institutional scenarios. At a relatively early stage, however, it became clear

that the development of best practice within born-digital stewardship was not yet possible. Tools do not yet exist

for many elements of archival practice and many workflows are influenced by constantly changing institutional

factors such as staff and technological infrastructure. The AIMS Framework, therefore, was developed to define

good practice in terms of archival tasks and objectives necessary for success.

APPROACH

The AIMS project had a broad scope but a clear approach. From the outset, the partners realized that the

framework would need to acknowledge established practices and infrastructure within archive institutions for

managing paper-based collections, the existence of hybrid collections (those consisting of both digital and paper-

based materials), and the existence of legacy material transferred in the past and still stored on donors' physical

storage media.

As a multi-institutional and multi-functional partnership, the group included archivists, digital archivists, technical

developers and repository managers, and other stakeholders within each partner institution. Each of the four

institutions has different strengths and collection specializations and faces different challenges. They vary in

size, resources, and capacity, both in terms of parent organizations and archive/manuscript departments within the

larger library function. This diversity forced the teams to be flexible, to explore a variety of options, and to compare

and evaluate them. The result was a series of collective decisions and a framework that is by its nature not

institution-specific.

The project also combined two different organizational models for archive/manuscript departments. The first model

is found in larger organizations, where functions of collection development, cataloguing and provision of access are

undertaken by different members or groups of staff. This enables (and indeed requires) policies and practices within

each function to be well developed and documented. However, there is little continuity of stewardship for a single

collection across its lifecycle within the institution. The separate functions may have different priorities, objectives, or

ways of working. In the UK, this model is relatively rare outside the larger national institutions, while in the US larger

institutions (including the academic institutions among the AIMS partners) are more common and therefore the

organizational models of these larger institutions tend to dominate professional discourse.

More prevalent among smaller institutions in both countries are smaller professional staffs who undertake (to a

greater or lesser extent) all the functions of collection development, cataloguing, and access, perhaps specializing in

a particular subject area. This does give greater continuity of stewardship; however, in some cases, there are fewer

resources (and perhaps less pressure) to develop detailed processes and policies for stewardship.

The transatlantic nature of the collaboration allowed the project to work within the established and evolving digital

archive communities of both nations, broadened its perspectives as well as its potential audience, and also shaped

its methodology. One constant was the presence of legacy collections and anomalies. The collection-focused nature

of the project solidified these areas of overlap, resulting in an approachable and accessible framework. All four

partners are university libraries or archives, all linked to professional colleagues and networks in other sectors

within their national or regional context.

To further ensure its broad applicability, the partners agreed that the stewardship framework should be developed

in compliance with established standards, models, and terminology whether based on archival, technical, legal, or

ethical standards. Two standards of note are the Encoded Archival Description (EAD) and the Open Archival

Information System (OAIS).

The partners also sought to incorporate existing tools and services, such as Pronom and DROID, and, when

possible, to rely on software-agnostic or open-source solutions. The University of Virginia, Stanford University, and

the University of Hull's collaboration on the Hydra Project[1] prompted a natural choice to use the Fedora-based

repository environment. Nonetheless, the Framework does not rely on any particular system; in fact, project

partners developed functional requirements for new tools to fulfill the archival functions of arrangement and

description.

So that born-digital stewardship could be completely integrated with systems and processes for its paper-based

predecessors, the partners sought to recognize established archiving tools, such as Archivists' Toolkit (AT) in the US

and Axiell CALM in the UK. These tools are not explicitly referred to in the Framework, but information detailing

the use of these tools by individual AIMS partners can be found throughout the text.

In addition to the development of tools, the project sought to draw upon the significant body of developing

initiatives focused on the stewardship of born-digital archives including the following:

Paradigm

http://www.paradigm.ac.uk/

The Personal Archives Accessible in Digital Media or PARADIGM project (2005-2007) was a collaboration

between the research libraries of the Universities of Oxford and Manchester to explore the issues involved in

preserving digital private papers through gaining practical experience in accessioning and ingesting digital private

papers into digital repositories, and processing these in line with archival and digital preservation requirements.

PARADIGM created a workbook documenting their recommended best practices. The PARADIGM project's

influence is substantial, and the parallels and differences between AIMS and PARADIGM are

explored further in the Introduction to the AIMS Framework.

futureArch

http://www.bodleian.ox.ac.uk/beam/projects/futurearch

Also funded by the Andrew W. Mellon Foundation, futureArch at the Bodleian Library seeks to transform our

capacity for working with born-digital & hybrid archives. In particular, Bodleian Electronic Archives and Manuscripts

(BEAM) has been working on digital preservation infrastructure, researcher interfaces for hybrid archives and

curatorial practices.

Archivematica

http://archivematica.org/wiki

Archivematica is a comprehensive digital preservation system offered as an open-source software solution. Based

on the OAIS functional model, Archivematica uses a micro-services approach to create an integrated suite of tools

for processing digital objects from ingest to access.

[1] http://projecthydra.org

Digital Lives Research Project

http://www.bl.uk/digital-lives/

Through the Digital Lives Research Project, the British Library explored personal digital collections in the 21st

century. The project inspired a Digital Lives Research Conference and the Digital Lives blog. To date, an initial

synthesis of the research has also been published.

Approaches to Managing and Collecting Born-Digital Literary Materials for Scholarly Use

http://www.neh.gov/ODH/Default.aspx?tabid=111&id=37

This project, funded by a National Endowment for the Humanities (NEH) Start-Up grant, examined the management

of the born-digital components of three significant collections of literary material. The project whitepaper is

available online and explores issues surrounding preservation and access.

Digital Forensics and Born-Digital Content in Cultural Heritage Collections

http://www.clir.org/pubs/abstract/pub149abst.html

This report, commissioned by the Council on Library and Information Resources (CLIR), was published in

December of 2010 and explores how digital forensic techniques typically used by the law enforcement and

computer security fields can be applied in the stewardship of born-digital collections within cultural heritage

institutions.

Salman Rushdie's Digital Life

http://marbl.library.emory.edu/innovations/salman-rushdie

This hybrid digital collection at Emory University's Manuscript, Archives and Rare Books Library (MARBL) provides

a model for arrangement, description, and access to born-digital materials. While the work done on this collection

may not be practical for all institutions, the exploration of various issues has been very influential.

Practical E-Records

http://e-records.chrisprom.com/

The blog Practical E-Records was created as a result of Fulbright Scholar Chris Prom's work at the Center for

Archive and Information Studies (CAIS) at the University of Dundee. The blog aims to evaluate software and

conceptual models that archivists and records managers might use to identify, preserve, and provide access to

electronic records. Posts on specific tools and models were helpful to the digital archivists in developing processing

workflows.

PROJECT NARRATIVE

The AIMS project was initiated as an extension of the Hydra project partnership between the University of

Virginia, Stanford University, and the University of Hull. The addition of Yale University broadened the project by

adding a non-Hydra partner. The project's purpose, objectives, and methodology were refined during discussions

with the Andrew W. Mellon Foundation. The project began in October 2009 when funding was confirmed.

The first project milestone was the recruitment and hire of a Digital Archivist at each of the four institutions. All

four digital archivists were initially appointed to fixed-term contracts. However, two of the four posts have

subsequently become permanent (at Stanford and Virginia) and the other two (at Hull and Yale) were filled via a

secondment. All four institutions will retain these experienced staff members assembled for this project.

Once the digital archivists were oriented to the technical, organizational, and archival environment of their

institution, the project proceeded via two workflows.

First, the Digital Archivists and their colleagues processed the digital collections identified for the AIMS project,

many of which were hybrid collections of digital and paper-based materials. The Digital Archivists shared information

on all elements of their work: capture and handling procedures; processing methodologies and tools; ethical and

archival issues; and issues of discovery and access. Secondly, the entire project team collaboratively developed the

AIMS toolset or framework.

Both efforts were informed and influenced by each other and by the digital archive community in the US and the

UK. The collaborative work took place via face-to-face and on-line meetings and environments:

- Face-to-face meetings every six months, involving the Digital Archivists, lead archivists, repository managers, and developers within the AIMS team, and other colleagues from the host institution. These meetings were occasionally timed to coincide with Hydra development meetings (once with the full AIMS team and once with the Digital Archivists), to derive maximum benefit from travel expenditure and to enable archivists and technicians to meet face to face

- Conference calls every two weeks for the full AIMS group (as above)

- Conference calls on alternating weeks for Digital Archivists and the AIMS developer

- Regular in-house meetings at each institution

- During the later stages of the project, brief conference calls every week for Lead Archivists

- Collaborative discussions and drafting of working documents

The project team collaborated with others working in this area and with the digital archivist community through

the following means:

- A blog, with postings from the AIMS Digital Archivists and from guest bloggers (for more information, see Appendix I.1): http://born-digital-archives.blogspot.com/

- Several in-person meetings and collaborative events including:

    - An Unconference in Charlottesville in May 2011

    - A symposium in London in June 2011

    - A half-day workshop prior to and a presentation during the 2011 Society of American Archivists (SAA) Annual Meeting entitled "CREW: Collecting Repositories and E-records Workshop."

  Detailed accounts of these events are in Appendix I.2.

- The creation of the Day of Digital Archives project and blog (see Appendix I.3): http://dayofdigitalarchives.blogspot.com/

The AIMS Framework was developed progressively through each of these meetings, events, and calls, with

objectives being agreed to at each stage before building the next level of granularity. The first task was reaching

consensus on the scope, purpose, and definitions of the main archival activities as referred to in the Framework.

For each stage or activity, key objectives (described in archival terms with specific reference to born-digital material)

were identified and parsed into decision points and tasks.

With these functions more fully characterized, it was possible to investigate resources and tools. Commencing with

a review of existing options (either tools developed specifically for archival use, or those with another primary

purpose, for example forensic investigation), tools and software were then tested through real-life

implementation with a sample corpus of material from within the AIMS collections. The testing and evaluation

focused on the extent to which the tool fulfilled the defined archival requirements such as ensuring authenticity

and integrity, and/or documenting an audit trail. While several tools fulfilled some required needs, no single, open-

source solution was identified for arrangement and description. In addition, some of the commercial tools tested

were not designed for the archival market and required adaptation for archival workflows.

As a result of this unfulfilled quest, the team authored functional requirements for a tool to fill this gap in born-

digital archive stewardship. These functional requirements are described more in Chapter 3: Arrangement and Description and more

fully in Appendix H.1. In line with our general research methodology, this work translates traditional archival

principles and practices into a born-digital context.

LESSONS LEARNED

The most basic assumptions were constantly tested during the project. Three formidable challenges were the

iterative nature of the project, varying institutional perspectives, and differences in terminology for similar concepts

among project partners.

Iterative Processes

Once processing of the project collections commenced, it became apparent that the workflows would have to be

iterative both within one archival function and between functions. A closer and more granular definition of archival

activities revealed the extent to which they are carried out at different stages in the workflow, depending on

individual collections and circumstances. Some tasks must be carried out at a specific place or order in the

workflow, while others are relevant to all or can be done at different points. In some cases the deciding factor was

archival, sometimes practical or technical, sometimes ethical.

The iterative nature of archival workflows has relatively few implications for the successful preservation of paper-

based archives. A suitable physical storage environment is the single most important factor and is relatively easy to

define and monitor. With born-digital material there is a greater need to understand, analyze, and assess the

implications of decisions made at a particular stage of the workflow to avoid problems or conflicts later. The

workflow then must be seen as a whole even when embarking on first steps.

The iterative nature of processing collections at each institution also demonstrated the need for scalability. In

particular, accessioning and processing workflows need to allow for and enable digital materials to be transferred to

managed storage as soon as possible to ensure preservation of bitstreams. This requires a workflow as free as

possible of bottlenecks and labor-intensive processes that prevent this early and successful transfer.
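
As an illustration only, and not a tool or procedure used by the AIMS partners, the following Python sketch shows one way an early transfer to managed storage might be paired with a fixity check, so that the copied bitstreams can be verified without manual inspection. The function names (sha256_of, transfer_with_fixity), the directory arguments, and the manifest layout are all assumptions made for this example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_with_fixity(source_dir: Path, managed_dir: Path) -> dict[str, str]:
    """Copy every file under source_dir into managed_dir and verify each copy.

    Hypothetical helper: returns a manifest mapping relative paths to
    checksums, and raises ValueError if any copy differs from its source.
    """
    manifest: dict[str, str] = {}
    # Materialize the file list first so the copy does not affect the walk.
    sources = sorted(p for p in source_dir.rglob("*") if p.is_file())
    for source in sources:
        relative = source.relative_to(source_dir)
        target = managed_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        before = sha256_of(source)
        shutil.copy2(source, target)  # copy2 also tries to keep timestamps
        after = sha256_of(target)
        if before != after:
            raise ValueError(f"Fixity mismatch for {relative}")
        manifest[relative.as_posix()] = before
    return manifest
```

The returned manifest could be stored alongside the accession record and re-checked later; how and where it is kept would depend on local infrastructure.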

Institutional Perspectives

The second challenge to the AIMS project was the diversity of institutional perspectives. Although this diversity was

eventually perceived by the partners as a benefit in building the Framework, it also meant that no single approach,

set of assumptions, or workflow steps could be adopted by default. Each had to be defined, shared, and mapped

onto those of the other partner institutions so that generic tasks and objectives could be defined for the

Framework.

Terminology

The third challenge was language and terminology. The differences both in use and understanding of terminology

between the US and the UK as well as between the archival profession and the digital library world of both

countries prompted questions and, in many instances, prevented the acceptance of assumed definitions and

understandings. Adding to this challenge was the redefining of traditional archival terms to a born-digital context.

The partners recognized that, despite differences in terminology, the fundamental archival objectives and outcomes

required redefinition of the nature of the activities and tasks required to achieve them. To aid in disambiguating

these terms, the project partners created a glossary, included in Appendix A.

CONCLUSION

The AIMS project did not promise to solve all problems associated with born-digital stewardship. In fact, we realized

that recommendations could only be for good practice rather than best practice. This is a practical approach but

also a recognition that there is no single solution for many of the issues that institutions face when dealing with

born-digital collections. Instead, the AIMS project partners developed this framework as a further step towards best

practice for the profession.

AIMS: An Inter-Institutional Model for Stewardship

The AIMS Framework:

The Functions of Stewardship

INTRODUCTION

One of the primary research outputs of the AIMS project is the AIMS Framework: The Functions of Stewardship.

The Framework attempts to map an emerging world combining traditional archival practices with new

technologies. While traditional practices evolve (one need only witness the impact of Greene & Meissner's article

on "more product, less process" (MPLP) methodology[2] as evidence of this), the increasing sense of urgency at

institutions for a scalable methodology of acquiring and processing born-digital materials will change the traditional

paradigm even more.

As the four partner institutions worked collaboratively to design procedures for accessioning and processing born-

digital materials, we discovered these cannot be isolated activities carried out in one department, or by one staff

specialty, or apart from the rest of the archival management workflow. The Functions of Stewardship document the

entire lifecycle of born-digital material from the moment the institution becomes interested in acquiring to the

instant that a researcher accesses the material.

The Framework is divided into four main functions that should be thought of as sequential steps in a very high-level

workflow. However, it is also important to view the process as a whole. Decisions made at the beginning of the

process will have a direct impact on later outcomes. Furthermore, with growing legacy collections of data on disks

and servers already sitting in our stacks, the process at an individual institution may begin somewhere in the middle

or may require moving through the functions in an order different than what is presented here.

The AIMS partners reached consensus that the activities described in the framework are necessary for ensuring

the successful management of born-digital and hybrid collections. As described in the Foreword, one of the strengths

of the project is the diversity of archival environments and practices at each of the AIMS partners' institutions. Because

of this diversity, the AIMS Framework, while high level, provides a sound basis for developing more robust and

sophisticated local practice.

[2] Greene, M. A., & Meissner, D. (2005). More product, less process: Revamping traditional archival processing. American Archivist, 68(2), 208-263.

The four Functions of Stewardship outlined in the rest of this document are:

Collection Development: the actions and policies of an institution to acquire material for end-

users, as they define them, both current and future. Collection development activities form the basis

for subsequent actions and decisions undertaken by the institution as they accept stewardship for and

legal ownership of materials from a donor, creator, or seller. This is particularly important as institutions

develop their strategies for dealing with born-digital materials.

Accessioning: a core function of archives, wherein an archival institution takes physical and legal

custody of a group of records from a donor and documents the transfer in a register or other

representation of the institution's holdings.

Arrangement and Description: the processes undertaken by an institution to establish

intellectual control of the material following the physical control secured during accessioning. It also

prepares the material for discovery by preserving the context of the materials, and prepares for access

by applying appropriate restrictions.

Discovery and Access: the systems and workflows that make material, and the metadata that

support it, available to users while ensuring compliance with any access restrictions. The process

of discovery and access requires some action on the part of individual users, for example carrying

out a search or requesting an item.

Each functional area is further described in this document with necessary objectives identified for each. These

objectives are further detailed through expected outcomes, decision points, and tasks. In addition, keys to success,

or areas that should be addressed and conditions that should be put into place before beginning work in an area,

are defined for each objective.

One area intentionally not addressed in this project is digital preservation: the specific practices developed to

ensure the long-term viability and security of data. The reason was twofold. First, while an emerging discipline, digital

preservation has many well-documented best practice models and methodologies. Reiterating what others have

already determined would not be useful. Instead, the framework assumes that efforts outside of the archival

functions ensure the viability of data. This leads to the second reason digital preservation was not discussed: it is

larger than the scope of this project. Digital preservation is a major infrastructure issue for libraries, archives, and

other institutions. The only way to achieve reasonable success in digital preservation is in economies of scale

wherein the nuts and bolts of preservation (storage space, repository infrastructure, refreshing of media, etc.) are

carried out in the same way for all digital content. In this way, the specific archival activities that are explored here

do not overlap with preservation activities.

Appraisal is also not defined as a specific, separate function. Rather, appraisal activities are included in any or all of

the first three functions within this framework: collection development, accessioning, and arrangement and

description. Principles, strategies, and tasks related to the appraisal process will appear within each function.

As a final note, the AIMS Framework bears some resemblance to the PARADIGM Workbook in scope and

content. However, PARADIGM contains much more detail about acquiring collections and collection development.

AIMS: An Inter-Institutional Model for Stewardship

While more detail is useful, the PARADIGM Workbook sometimes lacks the broad and holistic viewpoint that the

AIMS Framework can and will provide. The project partners hope that both PARADIGM and the AIMS Framework

can be used together by institutions working towards the establishment of practices for the stewardship of born-

digital materials.

1. Collection Development

DEFINITION AND SCOPE

Collection Development: the actions and policies of an institution to acquire material for end-users, as

they define them, both current and future. Collection development activities form the basis for

subsequent actions and decisions undertaken by the institution as they accept stewardship for and legal

ownership of materials from a donor, creator, or seller. This is particularly important as institutions develop

their strategies for dealing with born-digital materials.

PREFACE

In the initial stages of the AIMS project, the activities described in this section of the AIMS Framework were

referred to as "pre-accessioning intervention." From an archivist's perspective, this is a more accurate description

than "collection development" because it highlights attempts to determine what actions should be undertaken or

what information should be gathered in order to lay a solid foundation for the long-term stewardship of born-

digital archives in order to inform and assist curators when working with donors during this early stage. However,

since this work is undertaken within a larger framework across disciplines and with a variety of other archive or

library staff, the term "collection development" is more universally understood.

Until recently, born-digital materials were often viewed as an adjunct to the paper or analog materials in a

collection. They were seen as less important, in many cases thought to be duplicative or uninteresting, and perhaps

as items that could be discarded. Specific collecting activities related to born-digital materials within manuscript

collections were sporadic and undefined. Ensuring preservation and access usually included printing out the digital

files. While feasible when there are only a few items, this activity is neither sustainable in the long term nor

preferable. Despite their complications, born-digital materials are more flexible, enabling full-text search or other

interactivity. The loss of this flexibility downstream in a discovery environment has led to a growing effort to keep

digital files (rather than printing or discarding them), whether or not they are duplicates of analog material.

Traditionally the selection and cultivation of a collection has been the sole purview of a subject curator or archivist,

possibly working in conjunction with an acquisitions committee. While the processing team or archivists may have

been called on to assist with parts of the process, communication during this initial phase might be limited. When

dealing with born-digital materials, however, it is best to employ a more collaborative approach from the outset:

technical expertise and experience with newly designed workflows (or those just being tested) from archival or

digital staff will aid the curator in appraising materials, performing test captures, and identifying any issues related to

accessioning, processing, preservation or delivery. There are numerous examples of scenarios where technical and

legal expertise would be necessary, including: undertaking research on new capture methodologies from a media

type not previously encountered; negotiating permission to capture or extract data from a proprietary web

service; assessing the feasibility of taking material dependent on software or other programs that require significant

commitment to deliver or render; and understanding the licensing and intellectual property rights implications of

capturing or copying software as well as data. These activities are substantially different from those undertaken

when dealing with analog materials, and it is best to discuss these issues with a team from the outset.

The team approach in the collection development phase will allow all parties to:

1. be aware of broad issues as they arise in order to develop strategies for incorporating them into

current and future workflows,

2. work closely together to better understand the institution's ability or capacity to receive the digital

materials in question and to undertake long-term stewardship,

3. have a full understanding of the implications of donation, acquisition, processing and delivery.

KEYS TO SUCCESS

Collection development is the first step in the AIMS Framework for born-digital stewardship. Decisions made

regarding born-digital materials by the curator and donor will affect each additional step of the archival process and

the long-term plan for accessibility. While the objectives below differ slightly from the traditional collection

development experience, they serve many of the same functions, including establishing trust with donors and

depositors and creating comprehensive documentation for future activities.

Before embarking on these objectives, institutions should also foster discussions about the policies and procedures

highlighted below. These discussions will require consideration of future scenarios, an activity made all the more

difficult by rapidly changing technology and professional practice. The methodology or practice at each institution

will differ in the approach to developing policies and procedures. An understanding of your institution's strategic

environment and risk tolerance will be essential in successfully navigating these decisions. Some institutions will

require deliberation and the creation of policies as a first step; others are more likely to favor experimental action.

The goal is for all departments and staff involved to agree on expectations, abilities and capacity. There is no right

or wrong way to do it, but waiting for a perfect workflow or tool to evolve may mean losing those materials

through data loss or to competing organizations. Spending time at this stage to think through future activities will

result in less confusion and difficulty later.

Born-Digital Collecting Policies

The collection development policies of an institution are designed to guide the acquisition of materials according to

their mission and collecting goals in order to meet the needs of their end users. The implementation of the

collection development policy in relation to born-digital materials might involve:

• prioritizing collections based on the needs or strategies of the institution and its user communities (stated or implicit)

• developing relationships with donors

• assessing born-digital and analog materials and the relationship between them

• evaluating how the former might fit into current technological strategies or push further development at the institution

Therefore an institution's born-digital collecting policy needs to establish the institution's position (its principles

and general standpoint) on a wide range of issues which have implications for stewardship. This will ensure that it

is effective in guiding discussions and decisions relating to specific donations and individual accessions during

collection development activities. These discussions will determine how an institution's policy is applied in a specific

instance, any exceptions to the policy, as well as how options within it will be recorded in a legal agreement.

A born-digital collection development policy should supplement an institution's general collecting policy, and include

information about:

• Method(s) used for transfer and/or capture of materials

• Methods for identifying and dealing with files that contain viruses or other threats to preservation

• Options for dealing with files that are duplicates, redundant, or out-of-scope

• Criteria for capture (or acquisition by other means) of proprietary or open-source software, or of hardware

• Strategies and methods for preserving materials (what is preserved and how)

• Strategies and methods for providing access to materials (what is delivered to the user and how)

• Policies and strategies for dealing with confidential or other sensitive content

• Conditions governing access (restrictions, limits on access, users) and how they are applied and enforced

• Policies relating to intellectual property rights, including Creative Commons Licensing and copyright (the role of the institution and how it is undertaken)

• Retention (or not) of original storage media

• Categories of digital material (AV, databases, text, etc.) which the institution is able to preserve, manage, and deliver, with indicative listing of file types and formats, and limitations where applicable

• Methods for ensuring and demonstrating integrity and authenticity, with associated criteria. (As discussed in Chapter 3: Arrangement and Description, more development is needed in this area.)
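
Two of the policy items above, identifying duplicate files and demonstrating integrity, are commonly approached with checksums. The following is a minimal sketch only, not a method prescribed by the AIMS partners: it assumes SHA-256 as the checksum algorithm and a simple in-memory manifest, and the function names (`build_manifest`, `find_duplicates`, `verify`) are hypothetical names chosen for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_duplicates(manifest: dict[str, str]) -> list[list[str]]:
    """Group paths whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for rel_path, digest in manifest.items():
        by_digest[digest].append(rel_path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return paths that are missing or whose checksum no longer matches."""
    failed = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            failed.append(rel_path)
    return failed
```

A manifest recorded at transfer and re-checked later with `verify` gives the institution evidence that files have not changed since receipt. Identical checksums only show that two files have the same content; whether a duplicate is retained or discarded remains a curatorial decision under the policy.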

These issues have technical, archival, ethical, and legal elements. Many of them relate to the technical processes

required for accessioning and delivery of materials and are discussed more fully elsewhere in this document. These

processes have ethical and legal implications that the donor needs to be aware of and to understand in order to

give informed consent. For example, if the institution's default policy is for a bit-for-bit capture,3 will the donor be

asked for their preference? How would the institution handle files previously deleted by the donor/creator but

which are included in the data capture? Will there be a difference between what is captured and what is accessible

to the public (for example, when is the bit-for-bit copy retained for preservation and processing purposes only and

not for access)? Will material available online differ from what's available on-site (for example via a standalone

computer in the reading room)? Will legacy/transfer storage media be retained and what software or hardware will

be captured or acquired? Software may be needed in order to render the files with their significant properties,4

but capture of proprietary software from a donor may contravene licensing agreements.

There are several useful papers discussing the issues and ethics of working with born-digital materials.5 However,

your institution should have a written statement that the curator may use as a primary point of reference.

Technical limitations at an institution might initiate a list of preferred formats based on capacity and ability.6 This

may be driven by preservation strategies and, to a lesser degree, the current capability for delivery. However, the

acquisition of born-digital material should be based firstly on a curatorial appraisal of its fit within the collection

development policies of the institution. In addition, a feasibility study or technical appraisal should be performed by

archivists and/or technical specialists before final decisions are made. While an institution might take in a format that

is not on its preferred list, it would need to understand that it cannot guarantee the same level of stewardship;

i.e., preservation might be only at the bit level.
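
As a rough illustration of how a preferred-format list might feed a technical appraisal, the sketch below sorts incoming file names into the two levels of stewardship described above. It is an assumption-laden example: the extensions in `PREFERRED_EXTENSIONS` are hypothetical placeholders rather than any partner institution's actual list, and a file extension is only a weak proxy for format, so real workflows would normally rely on a format identification tool.

```python
from pathlib import Path

# Hypothetical preferred-format list; a real one would be taken from the
# institution's written collecting or preservation policy.
PREFERRED_EXTENSIONS = {".pdf", ".txt", ".tif", ".tiff", ".wav", ".csv", ".xml"}


def stewardship_level(filename: str) -> str:
    """Return the level of stewardship the institution could commit to."""
    suffix = Path(filename).suffix.lower()
    return "full" if suffix in PREFERRED_EXTENSIONS else "bit-level only"


def summarize(filenames: list[str]) -> dict[str, int]:
    """Count how many files fall under each stewardship level."""
    counts = {"full": 0, "bit-level only": 0}
    for name in filenames:
        counts[stewardship_level(name)] += 1
    return counts
```

A summary of this kind could be shared with the curator and donor before final decisions are made, so that everyone understands which materials can only be preserved at the bit level.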

Recognizing your institution's ability and willingness to collect born-digital material and defining the parameters of

this effort is key. Many institutions are redefining their collecting policies and overall strategies for the 21st century,

and born-digital is recognizably a huge issue for many repositories, as was demonstrated in the 2011 OCLC survey

report.7

3 As an example, see Appendix F.7: Beinecke Rare Book and Manuscript Library's Born Digital Archival Acquisition Collection & Accession

Guidelines, specifically: "In acquiring born digital materials the capture by snapshot of all working files on a specific computer, will be the

preferred method of acquisition; in most cases BRBL will wish to capture entire digital environments without any advanced collection editing

by creator or curator."

4 For example, at Stanford, when the Peter Koch computer files were acquired by a logical capture, the fonts associated with his InDesign and

Quark design files were not captured. This created an inability to render the printer's designs accurately in the virtual machine, especially as

many of the fonts were no longer available.

5 For example: Matthew Kirschenbaum, Richard Ovenden, and Gabriela Redwine, Digital Forensics and Born-Digital Content in Cultural Heritage

Collections (Washington: Council on Library and Information Resources, December 2010); and Digital Lives Research Project

(http://www.bl.uk/digital-lives/)

6 For example: Deep Blue Preservation and Format Support Policy at the University of Michigan

(http://deepblue.lib.umich.edu/about/deepbluepreservation.jsp) and Wellcome Library Digital Curation toolbox

(http://library.wellcome.ac.uk/node289.html)

7 The British Library's website discusses their collecting policies for the 21st century

(http://pressandpolicy.bl.uk/content/default.aspx?NewsAreaId=312) and the anxiety on the part of archivists for dealing with born-digital

materials is documented in Taking Our Pulse: The OCLC Research Survey of Special Collections and Archives, Jackie M. Dooley and Katherine

Luce, OCLC Research, Oct. 2010.

Donors and Trust: University of Hull

Simon Wilson

Digital Archivist, University of Hull

An organization recently contacted the Hull History Centre regarding the transfer of over 100 linear metres of their historical archives, dating back over 170 years. The organisation also expressed a willingness in principle to participate in the AIMS project.

Despite several meetings to discuss the potential type and range of born-digital material to transfer, the organization was hesitant and eventually withdrew from the project. Having recently transferred their paper archives to the Hull History Centre, there was an on-going relationship with the donor and a level of trust and understanding about the value and importance of archives. The organization recognised that the paper archives had historical value, even though they were clearly no longer using the paper-based records on a daily basis and could not justify the space those materials occupied in their office. Although Hull History Centre staff were thinking about the continuity of records from paper to born-digital, the organization regarded the records differently and had not thought about how they would continue preserving the legacy of their work in the digital age. The organization's digital material was still actively being created and used, and less likely to be perceived as being archival or having historical value in the same way that the paper archives clearly did. The born-digital files accumulating on the servers were less visible than their paper predecessors, and shortage of space was less of a concern. Data security and the possibility that sensitive material would be transferred was also a worry, although voiced somewhat vaguely, perhaps because the risk would only become relevant and specific once the principle of transferring the born-digital material was accepted.

The lesson learned from this experience: place greater emphasis in the initial discussions with potential donors on the continuation of established practices relating to material of archival value, whatever its format, rather than on the format of the material. The reluctance of this organization to discuss with Hull History Centre staff the born-digital material that they were creating hampered the Centre's ability to identify or recommend possible material for the archives, or to offer reassurance about the protection of sensitive material. This uncertainty also led to the organization's concern that the identification of material for transfer would take considerable time and effort on their part.

Also evident was the fact that, in the future, the Centre must be clearer in explaining that the nature of born-digital archives necessitates the capture of electronic records soon after creation, much sooner than is traditionally the case for paper-based materials. In developing a user interface for born-digital archives, the Centre will actively look to demonstrate to donors the ability to safely store and control user access for materials stored in the Centre's digital repository and allay any fears that may arise from phenomena such as WikiLeaks.

Interestingly, another organization which has been regularly transferring material to us since the 1960s and which was equally reluctant to include born-digital records in these transfers has recently itself raised the issue of its born-digital archives. Early reluctance was the result of concerns about access to and misuse of the material, from the viewpoint of intellectual property and reputation rather than personal confidentiality. A change of staff within the organisation, together with a greater emphasis on preserving a record of its more recent activities, has helped to overcome those initial reservations. The risks of transfer and more open access are still present, but the organization is now able to see the advantages of creating a born-digital archives, as well as the obstacles to be overcome.

Digital preservation strategy

An institution must have a good understanding of its technological capabilities and must have some sort of preservation strategy in place or in development in order to undertake stewardship of born-digital archives responsibly. This strategy must include the management of material from the moment of transfer, through processing in a virtual workspace, and finally ingestion into preservation and/or delivery repositories.

A complete understanding and description of the infrastructure will include:

• storage environment

• equipment for transfer, capture, and quarantine

• maintenance activities; personnel and skills required

• planning and communication strategies

Not all institutions need build their own digital repository. An acceptable strategy might include joining a local or

regional repository.8 Wherever the location of the specific repository, institutions need to ensure and demonstrate

that they can and will undertake responsible stewardship, or question whether they should in fact be collecting

born-digital materials.

Legal agreements

Many institutions will already have in place a template for agreements with donors (or depositors, sellers, or

vendors) which covers analog materials. Before active collection begins, the agreements should be amended to

cover born-digital materials. As with the collection development and preservation policies above, the agreement

should acknowledge and make explicit reference to the salient characteristics of born-digital materials and the

additional issues which arise with born-digital or hybrid archives. This will facilitate common understanding between

donor and institution, ensure informed consent, and record key decisions and information for future reference. At a

minimum, the agreement should include the following elements:

Scope and description of materials being transferred, both analog and digital, in

copyright infringement.48


