© 2010 Robert McKercher. All rights reserved.
Why the Impulse to Digitize Everything Is Misguided
Problems with the book scanning carried out by mass digitization projects, such as those described by Charles Henry and Kathlin Smith in a section of the Council on Library and Information Resources (CLIR) 2010 report The Idea of Order: Transforming Research Collections for 21st Century Scholarship, have led me to consider the possibility that cultural heritage institutions’ impulse to digitize their entire legacy collections may be misguided.
Anxiety about the daunting task of converting millions of legacy items has spurred a rush among cultural heritage institutions to digitize as much as possible as quickly as possible. That anxiety often has led to hurried, ill-considered, and poorly executed mass scanning efforts. Perhaps worse, it has led many institutions to abdicate their professional responsibility to provide free, equitable information access by handing over the keys to Google Books and other private enterprises. Anxiety and hurry are bound to create more problems than careful, considered action.
In their study of mass digitization projects, Henry and Smith found that rushed digitization had created obstacles to the usefulness of the newly digitized information in the institutions’ catalogs. Metadata deficiencies (misspellings, truncated titles, missing information, inaccurate tags, and incorrect page links) undermined the search and retrieval advantages that digital resources should have over analog ones. Similarly, poor scan quality, including blurred text, obscured text, and missing pages, limited the resources’ usefulness. In addition, for many works still under copyright protection, the institutions did not provide full-text versions to users, even though the entire book had been digitized; as a result, user access to the resources’ informational content was limited in a way that it was not for their analog counterparts.
Despite these problems, I would in no way suggest that digitization should be avoided. Research has demonstrated the necessity of participation in the digital information economy, and in a case study of an indigenous digital library in Africa, Elizabeth Greyling and Sipho Zulu made the case that community participation in building the digital collection both reconnected the community with its cultural heritage and helped to bridge the digital divide. The digital divide is the inequality between groups in access to digital resources and in the skills necessary to take advantage of them.
However, I would argue that the African program’s successes were partly due to its focus on newly captured content: digital recordings of interviews and oral histories rather than conversions of extant analog recordings. This focus allowed the program to devote the majority of its limited resources to infrastructure and skills development.
Even well-established cultural heritage institutions in the developed world have finite resources. Administrators must decide how to allocate money, time, and effort. Budgets tend to be focused on maintaining existing services, leaving generally modest resources for upgrades or innovation, and for many institutions those existing services are based on a pre-digital paradigm. For this reason, I believe that unique items, to which mass digitization projects have no access or whose formats are ill-suited to automated scanning, should be targeted for digitization rather than commercially published materials. Eventually the market will incentivize commercial publishers to create malleable, fully searchable digital files of copyrighted works, as though they were born digital. By focusing on unique items, and by carefully and aggressively developing the systems and infrastructure needed to build strong collections of born-digital materials, cultural heritage institutions can make the most of their limited time and resources.
- Greyling, Elizabeth and Zulu, Sipho. (2010). Content Development in an Indigenous Digital Library: A Case Study in Community Participation. IFLA Journal 36(1), 30-39.
- Henry, Charles and Smith, Kathlin. (2010). Ghostlier Demarcations: Large-Scale Text Digitization Projects and Their Utility for Contemporary Humanities Scholarship. The Idea of Order: Transforming Research Collections for 21st Century Scholarship, pp. 106-115. www.clir.org/pubs/reports/pub147/pub147.pdf