AIMS Born-Digital Collections:
An Inter-Institutional
Model for Stewardship
January 2012
University of Hull
Stanford University
University of Virginia
Yale University
Acknowledgement
The AIMS Project is a partnership between the University of Virginia Libraries,
Stanford University Libraries and Academic Resources, the University of Hull Library,
and Yale University Library with support from the Andrew W. Mellon Foundation.
Suggested citation:
AIMS Work Group. 2012. AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship.
http://www2.lib.virginia.edu/aims/whitepaper/AIMS_final.pdf.
Table of Contents
Foreword
The AIMS Framework: The Functions of Stewardship
1. Collection Development
Donors and Trust: University of Hull
Enhanced Curation at the British Library
2. Accessioning
Evolution of Accessioning at University of Hull
Project Xanadu: Loss and Recovery
3. Arrangement and Description
Technical Development: Functional Requirements for Arrangement and Description
Arrangement and Description Case Study: The Papers of Stephen Gallagher
4. Discovery and Access
Visualizing Email Access: MUSE
Access models (Table 1)
Publication Pathway and Discovery and Access at the Bodleian Library
Discovery models (Table 2)
Conclusions
Appendix A: Glossary
Appendix B: Bibliography
Appendix C: Contributor Biographies
Appendix D: Institutional Summaries and Collection Descriptions
1. The University of Hull, University Archives at Hull History Centre
2. Stanford University, Stanford University Libraries & Academic Information Resources
3. The University of Virginia, Albert and Shirley Small Special Collections Library
4. Yale University
Appendix E: Sample Processing Plans
1. University of Hull: Stephen Gallagher Processing Plan
2. Stanford University: Gould Processing Plan
3. University of Virginia: Cheuse Papers Processing Plan
4. Yale University: Tobin Collection Processing Plan
Appendix F: Policies, Templates, Documents, etc.
1. AIMS Donor Survey
2. University of Hull Accessioning Workflows
3. University of Hull Digital Media Photography Form
4. University of Hull Insertion Sheet
5. Guidelines for Creating Agreements at Stanford University
6. Stanford University Processing Workflow
7. Beinecke Library Born Digital Archival Acquisition Collection and Accession Guidelines
Appendix G: Technical Evaluation and Use
1. AccessData FTK3.3
2. AccessData FTK Imager 3.0
3. Comparison of 5.25 Floppy Disk Drive Solutions
4. Karen's Directory Printer (v.5.3.2)
5. Curator's Workbench
Appendix H: Technical Development
1. Functional Requirements for Arrangement and Description
2. Rubymatica
3. Hypatia
Appendix I: Digital Archivist Community
1. Born Digital Archives Blog
2. Digital Archivist Community Events
3. Day of Digital Archives
4. Presentations, Conferences, and Publications
Foreword
The AIMS project evolved around a common need among the project partners and most libraries and archives
to identify a methodology or continuous framework for stewarding born-digital archival materials. These
materials have been slowly accumulating in archival backlogs for years but are rapidly growing as more
contemporary collections are accessioned.
Alongside the many and complex technological requirements, the challenges of stewarding born-digital material
demand new strategies as well as a redefinition of archival workflows. Accordingly, this emerging challenge will affect
the skill-set needed for archivists and the working relationships among archival colleagues as well as those outside
our communities and organizations. If the archival profession aims to preserve and manage born-digital material to
standards matching those of paper-based collections, a broader and deeper understanding of these issues must be
developed, and this understanding must be incorporated into training of new archival professionals, professional
development programs, and continuing education.
In both the United Kingdom and the United States, the home countries of the AIMS partners, there is a
perception of a high bar for entry in the world of digital archives, both in terms of expertise and resources.
Therefore, many institutions are reluctant to take even initial steps.
In the US in particular, organizational cultures have made sharing best practices difficult. While the electronic
records, or e-records, community in the US has focused more on organizational records from information and
knowledge management perspectives, those working in manuscript collecting repositories have been somewhat
reluctant to enter an unfamiliar arena. Common issues in these collecting repositories (for example, legacy
material and undefined accessioning practices) made it difficult to build expertise and capacity. Moreover,
institutional practice has been focused on immediate local needs rather than developing a shared framework.
Now there is a small but emerging group of archivists working on issues related to born-digital content in personal
papers and committed to sharing best practices. In addition, there is a growing recognition between archivists and
those in the digital community that collaboration is absolutely crucial to success in this new paradigm.
The size of the archival community in the UK makes for a smaller arena within which to share ideas and solutions.
In the UK there is a more developed, even thriving, community of practitioners working on born-digital archives of
external donors/depositors as well as from their own organizations. However, there is a wide and growing gap
between institutions with established staff, equipment, and processes (mostly national institutions and some
universities) and those with no expertise or capacity whatsoever. Many smaller repositories cannot afford to
collaborate with other institutions and thus cannot share some of the developments of their better-funded
colleagues.
Despite these challenges, individual institutions and collaborative partnerships in both the UK and US are doing a
great deal of work in research, development, and practical implementation. Some of the many projects and
initiatives that influenced and informed the work of the AIMS partners are discussed in the next section. These
projects approach the issues from different archival and technical perspectives. Recently, new tools have been
developed that focus on capture, identification, or preservation. Some projects have discovered and are incorporating
tools used in other fields, particularly technology developed for forensic investigations.
Although a great deal of work has been and continues to be done in this area, there is not yet a unified approach
to address the lifecycle of stewardship in an accessible way and most importantly, in a way that is grounded in
archival practice. There is no single model to evaluate these many different approaches to born-digital stewardship
and to unite them in a framework of objectives and options.
THE AIMS PROJECT
Into this climate, the AIMS partners proposed an inter-institutional framework for stewarding born-digital content.
The AIMS partners realized that they could not solve all problems associated with born-digital materials but
decided to focus their attention on professional practice defined by archival principles and by the current state of
collections at the partner institutions.
In developing the AIMS Framework, the project would apply a practitioner-based research approach by developing
a model based on real case studies of collections at each institution. Applying our theories would confirm or
challenge the initial framework, which could then be used as a model around which to build individual workflows
and processes within each partner's organization. This test of concept for the AIMS Framework would prove
whether it could be used within a wide range of organizations with different staffing models, archival processes,
tools and infrastructure. This practical approach imposed a discipline and a framework for investigations and
discussions; provided a variety of case studies, with different record formats, legacy issues, scale and complexity, and
donor relationships; and defined an archival context for identifying ethical issues and other challenges, clearly
demonstrating the need for workable and scalable solutions.
The AIMS project was originally tasked to make recommendations for best practice including tools and workflows
which could be applied within a variety of institutional scenarios. At a relatively early stage, however, it became clear
that the development of best practice within born-digital stewardship was not yet possible. Tools do not yet exist
for many elements of archival practice and many workflows are influenced by constantly changing institutional
factors such as staff and technological infrastructure. The AIMS Framework, therefore, was developed to define
good practice in terms of archival tasks and objectives necessary for success.
APPROACH
The AIMS project had a broad scope but a clear approach. From the outset, the partners realized that the
framework would need to acknowledge established practices and infrastructure within archive institutions for
managing paper-based collections, the existence of hybrid collections (those consisting of both digital and paper-
based materials), and the existence of legacy material transferred in the past and still stored on donors' physical
storage media.
As a multi-institutional and multi-functional partnership, the group included archivists, digital archivists, technical
developers and repository managers, and other stakeholders within each partner institution. Each of the four
institutions has different strengths and collection specializations, and faces different challenges. They vary in
size, resources, and capacity, both in terms of parent organizations and archive/manuscript departments within the
larger library function. This diversity forced the teams to be flexible, to explore a variety of options, and to compare
and evaluate options. The result was a series of collective decisions and a framework that is by its nature not
institution-specific.
The project also combined two different organizational models for archive/manuscript departments. The first model
is found in larger organizations, where functions of collection development, cataloguing and provision of access are
undertaken by different members or groups of staff. This enables (and indeed requires) policies and practices within
each function to be well developed and documented. However, there is little continuity of stewardship for a single
collection across its lifecycle within the institution. The separate functions may have different priorities, objectives, or
ways of working. In the UK, this model is relatively rare outside the larger national institutions, while in the US larger
institutions (including the academic institutions among the AIMS partners) are more common and therefore the
organizational models of these larger institutions tend to dominate professional discourse.
More prevalent among smaller institutions in both countries are smaller professional staffs who undertake (to a
greater or lesser extent) all the functions of collection development, cataloguing, and access, perhaps specializing in
a particular subject area. This does give greater continuity of stewardship; however, in some cases, there are fewer
resources (and perhaps less pressure) to develop detailed processes and policies for stewardship.
The transatlantic nature of the collaboration allowed the project to work within the established and evolving digital
archive communities of both nations, broadened its perspectives as well as its potential audience, and also shaped
its methodology. One constant was the presence of legacy collections and anomalies. The collection-focused nature
of the project solidified these areas of overlap, resulting in an approachable and accessible framework. All four
partners are university libraries or archives, all linked to professional colleagues and networks in other sectors
within their national or regional context.
To further ensure its broad applicability, the partners agreed that the stewardship framework should be developed
in compliance with established standards, models, and terminology whether based on archival, technical, legal, or
ethical standards. Two standards of note are the Encoded Archival Description (EAD) and the Open Archival
Information System (OAIS).
The partners also sought to incorporate existing tools and services, such as Pronom and DROID, and, when
possible, to rely on software-agnostic or open-source solutions. The University of Virginia, Stanford University, and
the University of Hull's collaboration on the Hydra Project [1] prompted a natural choice to use the Fedora-based
repository environment. Nonetheless, the Framework does not rely on any particular system; in fact, project
partners developed functional requirements for new tools to fulfill the archival functions of arrangement and
description.
So that born-digital stewardship could be completely integrated with systems and processes for its paper-based
predecessors, the partners sought to recognize established archiving tools, such as Archivists' Toolkit (AT) in the US
and Axiell CALM in the UK. These tools are not explicitly referred to in the Framework, but information detailing
the use of these tools by individual AIMS partners can be found throughout the text.
In addition to the development of tools, the project sought to draw upon the significant body of developing
initiatives focused on the stewardship of born-digital archives including the following:
Paradigm
http://www.paradigm.ac.uk/
The Personal Archives Accessible in Digital Media or PARADIGM project (2005-2007) was a collaboration
between the research libraries of the Universities of Oxford and Manchester to explore the issues involved in
preserving digital private papers through gaining practical experience in accessioning and ingesting digital private
papers into digital repositories, and processing these in line with archival and digital preservation requirements.
PARADIGM created a workbook documenting their recommended best practices. The PARADIGM project's
influence is substantial, and the parallels and differences between AIMS and PARADIGM are
explored further in the Introduction to the AIMS Framework.
futureArch
http://www.bodleian.ox.ac.uk/beam/projects/futurearch
Also funded by the Andrew W. Mellon Foundation, futureArch at the Bodleian Library seeks to transform our
capacity for working with born-digital & hybrid archives. In particular, Bodleian Electronic Archives and Manuscripts
(BEAM) has been working on digital preservation infrastructure, researcher interfaces for hybrid archives and
curatorial practices.
Archivematica
http://archivematica.org/wiki
Archivematica is a comprehensive digital preservation system offered as an open-source software solution. Based
on the OAIS functional model, Archivematica uses a micro-services approach to create an integrated suite of tools
for processing digital objects from ingest to access.
[1] http://projecthydra.org
Digital Lives Research Project
http://www.bl.uk/digital-lives/
Through the Digital Lives Research Project, the British Library explored personal digital collections in the 21st
century. The project inspired a Digital Lives Research Conference and the Digital Lives blog. To date, an initial
synthesis of the research has also been published.
Approaches to Managing and Collecting Born-Digital Literary Materials for Scholarly Use
http://www.neh.gov/ODH/Default.aspx?tabid=111&id=37
This project, funded by a National Endowment for the Humanities (NEH) Start-Up grant, examined the management
of the born-digital components of three significant collections of literary material. The project whitepaper is
available online and explores issues surrounding preservation and access.
Digital Forensics and Born-Digital Content in Cultural Heritage Collections
http://www.clir.org/pubs/abstract/pub149abst.html
This report, commissioned by the Council on Library and Information Resources (CLIR), was published in
December of 2010 and explores how digital forensic techniques typically used by the law enforcement and
computer security fields can be applied in the stewardship of born-digital collections within cultural heritage
institutions.
Salman Rushdie's Digital Life
http://marbl.library.emory.edu/innovations/salman-rushdie
This hybrid digital collection at Emory University's Manuscript, Archives and Rare Books Library (MARBL) provides
a model for arrangement, description, and access to born-digital materials. While the work done on this collection
may not be practical for all institutions, the exploration of various issues has been very influential.
Practical E-Records
http://e-records.chrisprom.com/
The blog Practical E-Records was created as a result of Fulbright Scholar Chris Prom's work at the Center for
Archive and Information Studies (CAIS) at the University of Dundee. The blog aims to evaluate software and
conceptual models that archivists and records managers might use to identify, preserve, and provide access to
electronic records. Posts on specific tools and models were helpful to the digital archivists in developing processing
workflows.
PROJECT NARRATIVE
The AIMS project was initiated as an extension of the Hydra project partnership between the University of
Virginia, Stanford University, and the University of Hull. The addition of Yale University broadened the project by
adding a non-Hydra partner. The project's purpose, objectives, and methodology were refined during discussions
with the Andrew W. Mellon Foundation. The project began in October 2009 when funding was confirmed.
The first project milestone was the recruitment and hire of a Digital Archivist at each of the four institutions. All
four digital archivists were initially appointed to fixed-term contracts. However, two of the four posts have
subsequently become permanent (at Stanford and Virginia) and the other two (at Hull and Yale) were filled via a
secondment. All four institutions will retain these experienced staff members assembled for this project.
Once the digital archivists were oriented to the technical, organizational, and archival environment of their
institution, the project proceeded via two workflows.
First, the Digital Archivists and their colleagues processed the digital collections identified for the AIMS project,
many of which were hybrid collections of digital and paper-based materials. The Digital Archivists shared information
on all elements of their work: capture and handling procedures; processing methodologies and tools; ethical and
archival issues; and issues of discovery and access. Secondly, the entire project team collaboratively developed the
AIMS toolset or framework.
Both efforts were informed and influenced by each other and by the digital archive community in the US and the
UK. The collaborative work took place via face-to-face and on-line meetings and environments:
• Face-to-face meetings every six months, involving the Digital Archivists, lead archivists, repository
  managers, and developers within the AIMS team, and other colleagues from the host institution. These
  meetings were occasionally timed to coincide with Hydra development meetings (once with the full
  AIMS team and once with the Digital Archivists), to derive maximum benefit from travel expenditure
  and to enable archivists and technicians to meet face to face
• Conference calls every two weeks for the full AIMS group (as above)
• Conference calls on alternating weeks for Digital Archivists and the AIMS developer
• Regular in-house meetings at each institution
• During the later stages of the project, brief conference calls every week for Lead Archivists
• Collaborative discussions and drafting of working documents
The project team collaborated with others working in this area and with the digital archivist community through
the following means:
• A blog, with postings from the AIMS Digital Archivists and from guest bloggers (for more information,
  see Appendix I.1):
  http://born-digital-archives.blogspot.com/
• Several in-person meetings and collaborative events, including:
  - An Unconference in Charlottesville in May 2011
  - A symposium in London in June 2011
  - A half-day workshop prior to and a presentation during the 2011 Society of American Archivists
    (SAA) Annual Meeting entitled "CREW: Collecting Repositories and E-records Workshop."
  Detailed accounts of these events are in Appendix I.2.
• The creation of the Day of Digital Archives project and blog (see Appendix I.3):
  http://dayofdigitalarchives.blogspot.com/
The AIMS Framework was developed progressively through each of these meetings, events, and calls, with
objectives being agreed to at each stage before building the next level of granularity. The first task was reaching
consensus on the scope, purpose, and definitions of the main archival activities as referred to in the Framework.
For each stage or activity, key objectives (described in archival terms with specific reference to born-digital material)
were identified and parsed into decision points and tasks.
With these functions more fully characterized, it was possible to investigate resources and tools. Commencing with
a review of existing options (either tools developed specifically for archival use, or those with another primary
purpose, for example forensic investigation), tools and software were then tested through real-life
implementation with a sample corpus of material from within the AIMS collections. The testing and evaluation
focused on the extent to which the tool fulfilled the defined archival requirements such as ensuring authenticity
and integrity, and/or documenting an audit trail. While several tools fulfilled some required needs, no single, open-
source solution was identified for arrangement and description. In addition, some of the commercial tools tested
were not designed for the archival market and required adaptation for archival workflows.
As a result of this unfulfilled quest, the team authored functional requirements for a tool to fill this gap in born-
digital archive stewardship. These functional requirements are described in Chapter 3: Arrangement and Description
and more fully in Appendix H.1. In line with our general research methodology, this work translates traditional archival
principles and practices into a born-digital context.
LESSONS LEARNED
The most basic assumptions were constantly tested during the project. Three formidable challenges were the
iterative nature of the project, varying institutional perspectives, and differences in terminology for similar concepts
among project partners.
Iterative Processes
Once processing of the project collections commenced, it became apparent that the workflows would have to be
iterative both within one archival function and between functions. A closer and more granular definition of archival
activities revealed the extent to which they are carried out at different stages in the workflow, depending on
individual collections and circumstances. Some tasks must be carried out at a specific place or order in the
workflow, while others are relevant to all or can be done at different points. In some cases the deciding factor was
archival, sometimes practical or technical, sometimes ethical.
The iterative nature of archival workflows has relatively few implications for the successful preservation of paper-
based archives. A suitable physical storage environment is the single most important factor and is relatively easy to
define and monitor. With born-digital material there is a greater need to understand, analyze, and assess the
implications of decisions made at a particular stage of the workflow to avoid problems or conflicts later. The
workflow then must be seen as a whole even when embarking on first steps.
The iterative nature of processing collections at each institution also demonstrated the need for scalability. In
particular, accessioning and processing workflows need to allow for and enable digital materials to be transferred to
managed storage as soon as possible to ensure preservation of bitstreams. This requires a workflow as free as
possible of bottlenecks and labor-intensive processes that prevent this early and successful transfer.
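As an illustration only, an early transfer to managed storage might be automated along the following lines. This is a minimal sketch, not a tool used or endorsed by the AIMS partners: the function names (sha256_of, transfer_with_fixity), the choice of SHA-256, and the directory layout are all assumptions made for the example. It copies every file from a source directory into a managed directory, verifies each copy against a checksum taken before the copy, and returns a manifest that could feed an accession record.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_with_fixity(source_dir: Path, managed_dir: Path) -> dict[str, str]:
    """Copy every file under source_dir into managed_dir and verify each copy.

    Hypothetical helper: returns a manifest mapping relative paths to
    SHA-256 digests and raises ValueError if a copy does not match its source.
    """
    manifest: dict[str, str] = {}
    # Collect the file list first so the copy loop never walks its own output.
    sources = sorted(p for p in source_dir.rglob("*") if p.is_file())
    for source in sources:
        relative = source.relative_to(source_dir)
        target = managed_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        before = sha256_of(source)
        shutil.copy2(source, target)  # copy2 also tries to keep timestamps
        if sha256_of(target) != before:
            raise ValueError(f"Fixity check failed for {relative}")
        manifest[relative.as_posix()] = before
    return manifest
```

Because the sketch needs no per-file human decision, it avoids the labor-intensive steps that delay transfer; appraisal and arrangement can then work from the verified copy rather than the original media.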
Institutional Perspectives
The second challenge to the AIMS project was the diversity of institutional perspectives. Although this diversity was
eventually perceived by the partners as a benefit in building the Framework, it also meant that no single approach,
set of assumptions, or workflow steps could be adopted by default. Each had to be defined, shared, and mapped
onto those of the other partner institutions so that generic tasks and objectives could be defined for the
Framework.
Terminology
The third challenge was language and terminology. The differences both in use and understanding of terminology
between the US and the UK as well as between the archival profession and the digital library world of both
countries prompted questions and, in many instances, prevented the acceptance of assumed definitions and
understandings. Adding to this challenge was the redefining of traditional archival terms to a born-digital context.
The partners recognized that, despite differences in terminology, the fundamental archival objectives and outcomes
required redefinition of the nature of the activities and tasks needed to achieve them. To aid in disambiguating
these terms, the project partners created a glossary, included in Appendix A.
CONCLUSION
The AIMS project did not promise to solve all problems associated with born-digital stewardship. In fact, we realized
that recommendations could only be for good practice rather than best practice. This is a practical approach but
also a recognition that there is no single solution for many of the issues that institutions face when dealing with
born-digital collections. Instead, the AIMS project partners developed this framework as a further step towards best
practice for the profession.
AIMS: An Inter-Institutional Model for Stewardship
The AIMS Framework:
The Functions of Stewardship
INTRODUCTION
One of the primary research outputs of the AIMS project is the AIMS Framework: The Functions of Stewardship.
The Framework attempts to map an emerging world combining traditional archival practices with new
technologies. While traditional practices evolve (one need only witness the impact of Greene & Meissner's article
on "more product, less process" (MPLP) methodology [2] as evidence of this), the increasing sense of urgency at
institutions for a scalable methodology of acquiring and processing born-digital materials will change the traditional
paradigm even more.
As the four partner institutions worked collaboratively to design procedures for accessioning and processing born-
digital materials, we discovered these cannot be isolated activities carried out in one department, or by one staff
specialty, or apart from the rest of the archival management workflow. The Functions of Stewardship document the
entire lifecycle of born-digital material from the moment the institution becomes interested in acquiring to the
instant that a researcher accesses the material.
The Framework is divided into four main functions that should be thought of as sequential steps in a very high-level
work ow. However, it is also important to view the process as a whole. Decisions made at the beginning of the
process will have a direct impact on later outcomes. Furthermore, with growing legacy collections of data on disks
and servers already sitting in our stacks, the process at an individual institution may begin somewhere in the middle
or may require moving through the functions in an order different than what is presented here.
The AIMS partners reached consensus that the activities described in the framework are necessary for ensuring
the successful management of born-digital and hybrid collections. As described in the Foreword, one of the strengths
of the project is the diversity of archival environments and practices at each of the AIMS partners' institutions. Because
of this diversity, the AIMS Framework, while high level, provides a sound basis for developing more robust and
sophisticated local practice.
[2] Greene, M. A., & Meissner, D. (2005). More product, less process: Revamping traditional archival processing. American Archivist, 68(2), 208-263.
The four Functions of Stewardship outlined in the rest of this document are:
• Collection Development: the actions and policies of an institution to acquire material for end-
  users as they define them, both current and future. Collection development activities form the basis
  for subsequent actions and decisions undertaken by the institution as they accept stewardship for and
  legal ownership of materials from a donor, creator, or seller. This is particularly important as institutions
  develop their strategies for dealing with born-digital materials.
• Accessioning: a core function of archives, wherein an archival institution takes physical and legal
  custody of a group of records from a donor and documents the transfer in a register or other
  representation of the institution's holdings.
• Arrangement and Description: the processes undertaken by an institution to establish
  intellectual control of the material following the physical control secured during accessioning. It also
  prepares the material for discovery by preserving the context of the materials, and prepares for access
  by applying appropriate restrictions.
• Discovery and Access: the systems and workflows that make material, and the metadata that
  support it, available to users while ensuring compliance with any access restrictions. The process
  of discovery and access requires some action on the part of individual users, for example carrying
  out a search or requesting an item.
Each functional area is further described in this document with necessary objectives identified for each. These
objectives are further detailed through expected outcomes, decision points, and tasks. In addition, keys to success,
or areas that should be addressed and conditions that should be put into place before beginning work in an area,
are defined for each objective.
One area intentionally not addressed in this project is digital preservation: the specific practices developed to
ensure the long-term viability and security of data. The reason was twofold. First, while an emerging discipline, digital
preservation has many well-documented best practice models and methodologies. Reiterating what others have
already determined would not be useful. Instead, the framework assumes that efforts outside of the archival
functions ensure the viability of data. This leads to the second reason digital preservation was not discussed: it is
larger than the scope of this project. Digital preservation is a major infrastructure issue for libraries, archives, and
other institutions. The only way to achieve reasonable success in digital preservation is in economies of scale
wherein the nuts and bolts of preservation (storage space, repository infrastructure, refreshing of media, etc.) are
carried out in the same way for all digital content. In this way, the specific archival activities that are explored here
do not overlap with preservation activities.
Appraisal is also not defined as a specific, separate function. Rather, appraisal activities are included in any or all of
the first three functions within this framework: collection development, accessioning, and arrangement and
description. Principles, strategies, and tasks related to the appraisal process will appear within each function.
As a final note, the AIMS Framework bears some resemblance to the PARADIGM Workbook in scope and
content. However, PARADIGM contains much more detail about acquiring collections and collection development.
While more detail is useful, the PARADIGM Workbook sometimes lacks the broad and holistic viewpoint that the
AIMS Framework can and will provide. The project partners hope that both PARADIGM and the AIMS Framework
can be used together by institutions working towards the establishment of practices for the stewardship of born-
digital materials.
1. Collection Development
DEFINITION AND SCOPE
Collection Development: the actions and policies of an institution to acquire material for end-users as
they define them, both current and future. Collection development activities form the basis for
subsequent actions and decisions undertaken by the institution as they accept stewardship for and legal
ownership of materials from a donor, creator, or seller. This is particularly important as institutions develop
their strategies for dealing with born-digital materials.
PREFACE
In the initial stages of the AIMS project, the activities described in this section of the AIMS Framework were
referred to as "pre-accessioning intervention." From an archivist's perspective, this is a more accurate description
than "collection development" because it highlights attempts to determine what actions should be undertaken or
what information should be gathered to lay a solid foundation for the long-term stewardship of born-digital
archives, and to inform and assist curators working with donors during this early stage. However,
since this work is undertaken within a larger framework across disciplines and with a variety of other archive or
library staff, the term "collection development" is more universally understood.
Until recently, born-digital materials were often viewed as an adjunct to the paper or analog materials in a
collection. They were seen as less important, in many cases thought to be duplicative or uninteresting, and perhaps
as items that could be discarded. Specific collecting activities related to born-digital materials within manuscript
collections were sporadic and undefined. Ensuring preservation and access usually included printing out the digital
files. While feasible when there are only a few items, this activity is neither sustainable in the long term nor
preferable. Despite their complications, born-digital materials are more flexible, enabling full-text search or other
interactivity. The loss of this flexibility downstream in a discovery environment has led to a growing effort to keep
digital files (rather than printing or discarding them), whether or not they are duplicates of analog material.
Traditionally the selection and cultivation of a collection has been the sole purview of a subject curator or archivist,
possibly working in conjunction with an acquisitions committee. While the processing team or archivists may have
been called on to assist with parts of the process, communication during this initial phase might be limited. When
dealing with born-digital materials, however, it is best to employ a more collaborative approach from the outset:
technical expertise and experience with newly designed workflows (or those just being tested) from archival or
digital staff will aid the curator in appraising materials, performing test captures, and identifying any issues related to
accessioning, processing, preservation or delivery. There are numerous examples of scenarios where technical and
legal expertise would be necessary, including: undertaking research on new capture methodologies from a media
type not previously encountered; negotiating permission to capture or extract data from a proprietary web
service; assessing the feasibility of taking material dependent on software or other programs that require significant
commitment to deliver or render; and understanding the licensing and intellectual property rights implications of
capturing or copying software as well as data. These activities are substantially different from those undertaken
when dealing with analog materials, and it is best to discuss these issues with a team from the outset.
The team approach in the collection development phase will allow all parties to:
1. be aware of broad issues as they arise in order to develop strategies for incorporating them into
current and future workflows,
2. work closely together to better understand the institution's ability or capacity to receive the digital
materials in question and to undertake long-term stewardship, and
3. have a full understanding of the implications of donation, acquisition, processing and delivery.
KEYS TO SUCCESS
Collection development is the first step in the AIMS Framework for born-digital stewardship. Decisions made
regarding born-digital materials by the curator and donor will affect each additional step of the archival process and
the long-term plan for accessibility. While the objectives below differ slightly from the traditional collection
development experience, they serve many of the same functions, including establishing trust with donors and
depositors and creating comprehensive documentation for future activities.
Before embarking on these objectives, institutions should also foster discussions about the policies and procedures
highlighted below. These discussions will require consideration of future scenarios, an activity made all the more
difficult by rapidly changing technology and professional practice. The methodology or practice at each institution
will differ in the approach to developing policies and procedures. An understanding of your institution's strategic
environment and risk tolerance will be essential in successfully navigating these decisions. Some institutions will
require deliberation and the creation of policies as a first step; others are more likely to favor experimental action.
The goal is for all departments and staff involved to agree on expectations, abilities and capacity. There is no right
or wrong way to do it, but waiting for a perfect workflow or tool to evolve may mean losing those materials
through data loss or to competing organizations. Spending time at this stage to think through future activities will
result in less confusion and difficulty later.
Born-Digital Collecting Policies
The collection development policies of an institution are designed to guide the acquisition of materials according to
its mission and collecting goals in order to meet the needs of its end users. The implementation of the
collection development policy in relation to born-digital materials might involve:
• prioritizing collections based on the needs or strategies of the institution and its user communities (stated
  or implicit)
• developing relationships with donors
• assessing born-digital and analog materials and the relationship between them
• evaluating how born-digital materials might fit into current technological strategies or push further
  development at the institution
Therefore an institution's born-digital collecting policy needs to establish the institution's position (its principles
and general standpoint) on a wide range of issues which have implications for stewardship. This will ensure that it
is effective in guiding discussions and decisions relating to specific donations and individual accessions during
collection development activities. These discussions will determine how an institution's policy is applied in a specific
instance, any exceptions to the policy, as well as how options within it will be recorded in a legal agreement.
A born-digital collection development policy should supplement an institution's general collecting policy, and include
information about:
• Method(s) used for transfer and/or capture of materials
• Methods for identifying and dealing with files that contain viruses or other threats to preservation
• Options for dealing with files that are duplicates, redundant, or out-of-scope
• Criteria for capture (or acquisition by other means) of proprietary or open-source software, or of hardware
• Strategies and methods for preserving materials (what is preserved and how)
• Strategies and methods for providing access to materials (what is delivered to the user and how)
• Policies and strategies for dealing with confidential or other sensitive content
• Conditions governing access (restrictions, limits on access, users) and how they are applied and enforced
• Policies relating to intellectual property rights, including Creative Commons Licensing and copyright
  (the role of the institution and how it is undertaken)
• Retention (or not) of original storage media
• Categories of digital material (AV, databases, text, etc.) which the institution is able to preserve,
  manage, and deliver, with indicative listing of file types and formats, and limitations where applicable
• Methods for ensuring and demonstrating integrity and authenticity, with associated criteria. (As
  discussed in Chapter 3: Arrangement and Description, more development is needed in this area.)
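Two of the policy items above, handling duplicate files and demonstrating integrity, are commonly supported by recording a checksum for every file at the point of transfer. The following is a minimal sketch of that idea in Python and is not drawn from the AIMS partners' own tooling. The function names, the choice of SHA-256, and the manifest layout are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum.

    Hypothetical manifest layout: {relative path: hex digest}.
    """
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_duplicates(manifest: dict[str, str]) -> list[list[str]]:
    """Group paths whose content is byte-identical (same checksum)."""
    by_digest = defaultdict(list)
    for rel_path, digest in manifest.items():
        by_digest[digest].append(rel_path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return manifest paths that are now missing or whose checksum changed."""
    current = build_manifest(root)
    return sorted(p for p, d in manifest.items() if current.get(p) != d)
```

A manifest built at transfer and re-checked later with `verify` gives one simple, documentable criterion for integrity. Whether byte-identical duplicates are then discarded or retained remains a policy decision for the curator and donor, not something the checksum can settle.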
These issues have technical, archival, ethical, and legal elements. Many of them relate to the technical processes
required for accessioning and delivery of materials and are discussed more fully elsewhere in this document. These
processes have ethical and legal implications that the donor needs to be aware of and to understand in order to
give informed consent. For example, if the institution's default policy is for a bit-for-bit capture,3 will the donor be
asked for their preference? How would the institution handle files previously deleted by the donor/creator but
which are included in the data capture? Will there be a difference between what is captured and what is accessible
to the public (for example, when the bit-for-bit copy is retained for preservation and processing purposes only and
not for access)? Will material available online differ from what's available on-site (for example via a standalone
computer in the reading room)? Will legacy/transfer storage media be retained and what software or hardware will
be captured or acquired? Software may be needed in order to render the files with their significant properties,4
but capture of proprietary software from a donor may contravene licensing agreements.
There are several useful papers discussing the issues and ethics of working with born-digital materials.5 However,
your institution should have a written statement that the curator may use as a primary point of reference.
Technical limitations at an institution might initiate a list of preferred formats based on capacity and ability.6 This
may be driven by preservation strategies and, to a lesser degree, the current capability for delivery. However, the
acquisition of born-digital material should be based firstly on a curatorial appraisal of its fit within the collection
development policies of the institution. In addition, a feasibility study or technical appraisal should be performed by
archivists and/or technical specialists before final decisions are made. While an institution might take in a format that
is not on its preferred list, it would need to understand that it cannot guarantee the same level of stewardship;
i.e., preservation might be only at the bit level.
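As a rough illustration of how a preferred-format list might feed a technical appraisal, the sketch below tallies the files in a proposed transfer by the level of stewardship the institution could offer. It is a hypothetical example only: the extension list is invented, the two level names are assumptions, and a file extension is a weak proxy for format (a real appraisal would more likely rely on a dedicated format-identification tool).

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical preferred-format list; each institution would define its own.
PREFERRED_EXTENSIONS = {".txt", ".pdf", ".tif", ".wav"}


def stewardship_level(filename: str) -> str:
    """Classify a file by extension (assumed labels: 'full' / 'bit-level only')."""
    ext = PurePath(filename).suffix.lower()
    return "full" if ext in PREFERRED_EXTENSIONS else "bit-level only"


def appraise(filenames) -> Counter:
    """Count how many files fall under each stewardship level."""
    return Counter(stewardship_level(name) for name in filenames)
```

A summary of this kind could be shared with the curator and donor so that any material likely to receive only bit-level preservation is identified before final decisions are made.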
Recognizing your institution's ability and willingness to collect born-digital material and defining the parameters of
this effort is key. Many institutions are redefining their collecting policies and overall strategies for the 21st century,
and born-digital is recognizably a huge issue for many repositories, as was demonstrated in the 2011 OCLC survey
report.7
3 As an example, see Appendix F.7: Beinecke Rare Book and Manuscript Library's Born Digital Archival Acquisition Collection & Accession
Guidelines, specifically: "In acquiring born digital materials the capture by snapshot of all working files on a specific computer, will be the
preferred method of acquisition; in most cases BRBL will wish to capture entire digital environments without any advanced collection editing
by creator or curator."
4 For example, at Stanford, when the Peter Koch computer files were acquired by a logical capture, the fonts associated with his InDesign and
Quark design files were not captured. This created an inability to render the printer's designs accurately in the virtual machine, especially as
many of the fonts were no longer available.
5 For example: Matthew Kirschenbaum, Richard Ovenden, and Gabriela Redwine, Digital Forensics and Born-Digital Content in Cultural Heritage
Collections (Washington: Council on Library and Information Resources, December 2010); and Digital Lives Research Project
(http://www.bl.uk/digital-lives/)
6 For example: Deep Blue Preservation and Format Support Policy at the University of Michigan
(http://deepblue.lib.umich.edu/about/deepbluepreservation.jsp) and Wellcome Library Digital Curation toolbox
(http://library.wellcome.ac.uk/node289.html)
7 The British Library's website discusses their collecting policies for the 21st century
(http://pressandpolicy.bl.uk/content/default.aspx?NewsAreaId=312) and the anxiety on the part of archivists for dealing with born-digital
materials is documented in Taking Our Pulse: The OCLC Research Survey of Special Collections and Archives, Jackie M. Dooley and Katherine
Luce, OCLC Research, Oct. 2010.
Donors and Trust: University of Hull
Simon Wilson
Digital Archivist, University of Hull
An organization recently contacted the Hull History Centre regarding the transfer of over 100 linear metres of
their historical archives, dating back over 170 years. The organisation also expressed a willingness in principle
to participate in the AIMS project.
Despite several meetings to discuss the potential type and range of born-digital material to transfer, the
organization was hesitant and eventually withdrew from the project. Because the organization had recently
transferred its paper archives to the Hull History Centre, there was an on-going relationship with the donor and a
level of trust and understanding about the value and importance of archives. The organization recognised that the
paper archives had historical value, even though they were clearly no longer using the paper-based records on a
daily basis and could not justify the space those materials occupied in their office. Although Hull History Centre
staff were thinking about the continuity of records from paper to born-digital, the organization regarded the
records differently and had not thought about how they would continue preserving the legacy of their work in the
digital age. The organization's digital material was still actively being created and used, and less likely to be
perceived as being archival or having historical value in the same way that the paper archives clearly did. The
born-digital files accumulating on the servers were less visible than their paper predecessors, and shortage of
space was less of a concern. Data security and the possibility that sensitive material would be transferred was
also a worry, although voiced somewhat vaguely, perhaps because the risk would only become relevant and
specific once the principle of transferring the born-digital material was accepted.
The lesson learned from this experience: place greater emphasis in the initial discussions with potential donors
on the continuation of established practices relating to material of archival value, whatever its format, rather
than on the format of the material. The reluctance of this organization to discuss with Hull History Centre staff
the born-digital material that they were creating hampered the Centre's ability to identify or recommend possible
material for the archives, or to offer reassurance about the protection of sensitive material. This uncertainty
also led to the organization's concern that the identification of material for transfer would take considerable
time and effort on their part.
Also evident was the fact that, in the future, the Centre must be clearer in explaining that the nature of
born-digital archives necessitates the capture of electronic records soon after creation, much sooner than is
traditionally the case for paper-based materials. In developing a user interface for born-digital archives, the
Centre will actively look to demonstrate to donors the ability to safely store and control user access for
materials stored in the Centre's digital repository and allay any fears that may arise from phenomena such as
WikiLeaks.
Interestingly, another organization which has been regularly transferring material to us since the 1960s and
which was equally reluctant to include born digital records in these transfers has recently itself raised the issue
of its born-digital archives. Early reluctance was the result of concerns about access to and misuse of the
material, from the viewpoint of intellectual property and reputation rather than personal confidentiality. A
change of staff within the organisation, together with a greater emphasis on preserving a record of its more
recent activities, has helped to overcome those initial reservations. The risks of transfer and more open access
are still present, but the organization is now able to see the advantages of creating a born-digital archives, as
well as the obstacles to be overcome.
Digital preservation strategy
An institution must have a good understanding of its technological capabilities and must have some sort of
preservation strategy in place or in development in order to undertake stewardship of born-digital archives
responsibly. This strategy must include the management of material from the moment of transfer, through
processing in a virtual workspace, and finally ingestion into preservation and/or delivery repositories.
A complete understanding and description of the infrastructure will include:
• storage environment
• equipment for transfer, capture, and quarantine
• maintenance activities; personnel and skills required
• planning and communication strategies
Not all institutions need build their own digital repository. An acceptable strategy might include joining a local or
regional repository.8 Wherever the location of the specific repository, institutions need to ensure and demonstrate
that they can and will undertake responsible stewardship, or question whether they should in fact be collecting
born-digital materials.
Legal agreements
Many institutions will already have in place a template for agreements with donors (or depositors, sellers, or
vendors) which covers analog materials. Before active collection begins, the agreements should be amended to
cover born-digital materials. As with the collection development and preservation policies above, the agreement
should acknowledge and make explicit reference to the salient characteristics of born-digital materials and the
additional issues which arise with born-digital or hybrid archives. This will facilitate common understanding between
donor and institution, ensure informed consent, and record key decisions and information for future reference. At a
minimum, the agreement should include the following elements:
Scope and description of materials being transferred, both analog and digital, in
copyright infringement.48