Bruce R. Kingma
School of Information Science and Policy, Department of Economics, and
Interdisciplinary Ph.D. Program in Information Science
University at Albany, State University of New York
Digital texts in a networked environment hold the promise of lower cost access to information by a greater number of users than can be provided by printed texts. Projects such as The Making of America, Project MUSE, JSTOR, and the Early Canadiana Online project investigated in this study offer access to digital texts over the Internet to millions of potential users. These digital projects also offer the promise of lower costs by avoiding the cost of printing and shipping multiple copies of a text for patrons. In theory, once the fixed costs of digitization are incurred there is a zero marginal cost of providing an additional copy.
The potential benefits of digital access are considerable. Patrons who previously traveled to a repository of rare books or a microfiche room at a research library can instead access historical information from their desktops. This dramatically decreases the time and effort patrons spend traveling to the source of the information. It also dramatically increases the potential benefits to new patrons, who can now access historical texts that previously were available only at sites too distant for them to consider. The economic question is whether the cost of digitization is low enough to justify this future stream of benefits.
This study examines the economics of digital, microfiche, and print access for the Early Canadiana Online (ECO) project. ECO is a digital library of selected books and pamphlets dating from before 1900. The ECO project includes over 3,000 titles and 500,000 images. This paper reports the cost estimates for digital, microfiche, and print access. The complete report, available at <http://www.albany.edu/~bk797/research.htm>, also includes an examination of the economic theory of digital information and pricing alternatives for the ECO project.
Previous studies by Anne Kenney at Cornell University and Paul Conway at Yale University investigated the costs of digitization. Both studies measured the marginal costs per image of primarily in-house scanning. These studies are valuable in providing accurate estimates of local scanning projects and the additional costs of increasing existing digitization activities. By comparison, this study includes all costs, including management and overhead, construction, utilities, and all other costs associated with the production, cataloging, and sales of texts in microfiche or digital format. The cost estimates in this study are considerably higher than the marginal cost estimates in previous studies but are a more accurate estimate of the full costs of the production of microfiche or digital projects from start to finish. Costs per image or book in this study are average costs.
This study also investigates the benefits of digitization. The primary benefit of these digital projects is the return to patrons from accessing these materials. Once digitized, stored, and made accessible over a campus network or the Internet, the materials are more easily accessible to more patrons. Patrons who previously had to travel to a library with the original or microfiche copies of materials can now view them online from their home or office.
Cost Estimates of Early Canadiana Online
Estimating the costs of digital projects is necessary to determine efficient investments in digitization of print or microfiched information products. The primary goals of this project are to estimate and compare the costs of three methods of information delivery: print, microfiche, and digital. Cost data was collected from the ECO project partners: The University of Toronto, Laval University, and the Canadian Institute for Historical Microreproductions (CIHM). The cost of construction of a new electronic library at the University at Albany was also used.
The Cost of Print
Table 1 shows the cost estimates for book storage and access. These costs are based on the cost of the Thomas Fisher Rare Book Library at the University of Toronto. Construction costs are based on the 1999 library construction project at the University at Albany. Special environmental controls used in a rare book library imply that the construction costs in Table 1 may underestimate the actual construction costs. All costs in this report are shown in Canadian dollars. The exchange rate used is $1.5257 Canadian dollars (CD) per US dollar.
The cost per volume of construction, utilities, and maintenance is comparable to an estimate of $4.68CD from Bowen. However, the cost per use of $134.72 is significantly higher than the $1.50CD cost of retrieval reported by Bowen, the $3CD for the New York Public Library and $6CD for the Harvard Depository Library reported by Lesk, or the $9CD maximum retrieval cost estimated by Getz. In Table 1 the cost per use is derived by dividing the cost by the number of requests for the year. This inflates the cost per retrieval by adding the costs of storage into the equation. However, it is important to note that the "service" of a library is the use of its materials. All costs, when divided by the use of those materials, give an average cost for service that will be higher than an estimate that separates out only the part of these costs attributable to retrieval.
For a comparison with the estimates cited by Bowen, Lesk, and Getz, an estimated 80% of salaries at the Thomas Fisher Rare Book Library is for access. Taking 80% of salary costs yields an estimate of $40 per transaction for labor, still significantly higher than other estimates. However, a rare book library has concerns of preservation that require additional staff care and monitoring for patron access. In addition, this estimate includes the total cost of administration, vacations, and benefits for employees rather than the marginal cost of retrieval based on a staff member's time spent multiplied by his salary.
Table 1 does not include the cost of purchasing a book. This is important although it will be a small percentage of total costs once the purchase price is amortized over the expected life of storage and use of the book. For example a rare book that costs $500 but is expected to last 100 years in storage has an annual cost, when amortized, of $4.80. Table 1 also does not include the value of the land. This can be significant but is different depending on the location of the library.
The Cost of Microfiche
The annual costs of microfiche storage and access at the University of Toronto are shown in Table 2. Cost per volume is based on a 216 page text, the average size of a text digitized in the Yale Open Book Project. As with Table 1, these costs represent the average cost per unit for storage or access. Just as the cost of purchasing a book is not included in Table 1, the cost of purchasing the microfiche is not included in Table 2.
Both the cost of storage per volume and the cost per use are significantly lower for microfiche than for rare books. This is not surprising since microfiche is intended to provide access to and storage of information at a lower cost than print.
The cost per use is derived by dividing the total costs of microfiche storage and access by total use. As with Table 1, this assumes that the value of microfiche storage is for access to patrons. If salaries and equipment are the only costs for access, and 80% of salaries are for access, then the cost per transaction can be estimated as $3.75, which is comparable to estimates of the costs of book retrieval. Both retrieval functions are similar in that staff must locate, check out, and reshelve the requested materials.
Table 2 does not include the subscription price of the microfiche to the library. These costs are part of the economic cost of producing microfiche and are shown in Table 3. To avoid double-counting, these costs are not shown in Table 2. The costs of microfiche production are shared costs. Library subscription fees, grants and donations are used to jointly finance the production of the microfiche as a public good.
Table 3 includes all economic costs of microfiche production including the value of space CIHM uses at the National Library of Canada. While this space is donated to CIHM, it still represents an economic cost of producing microfiche. As with previous tables, the average cost of production is derived by dividing total costs by number of units.
The first four rows of Table 3 show the cost of producing master copies of microfiche. The cost of producing master copies of microfiche is $114 per fiche, $1.54 per image, or $333.11 per 216 page volume. This is the cost of producing a set of master copies that are then used to produce additional microfiche copies for distribution to subscribing libraries. The cost of the master copies is a shared cost for all subscribing libraries.
If we compare the cost per volume of creating and storing a master microfiche copy relative to creating and storing a print copy, microfiche is expensive to create but has significant savings in storage ($0.16 per volume per year) relative to print ($5.89). However, at annual savings of $5.73 per year, it would take over 50 years to cover the cost of creation ($333.11) if the master copies were created solely for the use of one library.
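The payback comparison above is simple arithmetic, and a minimal sketch may make it easier to check. The figures are those quoted in the text; the function name is illustrative only, and the calculation ignores discounting, which is an assumption of this sketch rather than a statement about the method used in the study.

```python
def payback_years(creation_cost: float, annual_saving: float) -> float:
    """Years of savings needed to recover a one-time cost (no discounting)."""
    if annual_saving <= 0:
        raise ValueError("annual_saving must be positive")
    return creation_cost / annual_saving


# Figures quoted in the text, in Canadian dollars per 216 page volume.
MASTER_FICHE_COST = 333.11   # one-time cost of creating the master copy
PRINT_STORAGE = 5.89         # annual storage cost, print
FICHE_STORAGE = 0.16         # annual storage cost, microfiche

annual_saving = PRINT_STORAGE - FICHE_STORAGE
years = payback_years(MASTER_FICHE_COST, annual_saving)
print(f"annual saving ${annual_saving:.2f}, payback {years:.1f} years")
```

With these inputs the payback period comes out at roughly 58 years, which is consistent with the "over 50 years" stated above.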
Microfiche is produced by CIHM, not to have a single copy, but to provide multiple copies to libraries that would not otherwise have access to early Canadian literature. With a limited number of print copies, microfiche becomes a cost-effective alternative for providing access. CIHM produces several copies of each microfiche to sell as subscriptions for libraries throughout Canada, the United States, and the rest of the world. By purchasing a subscription, these libraries share the costs of the original microfiche production.
CIHM produces about 30 copies each year for library subscriptions and additional copies of individual microfiche at an additional cost of $236,092. The last two rows in Table 3 show how these costs can be shared among the subscribing libraries. If the full cost of microfiche production is averaged over the 30 copies, the cost of annual production is $43,399 per library. This includes the shared costs of production plus the costs of making copies. If an additional 12 copies of each fiche, on average, are sold the average cost is $30,999 per library.
The average costs per fiche, per image, and per volume for 30-42 copies are shown in the final three columns of Table 3. The sharing of the full costs of production among subscribing libraries reduces the cost to $0.04-$0.06 per image or $9.69-$13.56 per volume. This compares favorably to the cost of each library acquiring a printed manuscript. At an annual savings of $5.73 per volume for each library, it takes 1.7-2.4 years for the microfiche to cover the costs of creation ($9.69-$13.56).
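The cost-sharing logic of the last two paragraphs can be sketched as follows. The per-library and per-volume figures come from the text; the total annual cost is not reported directly, so the sketch infers it from the 30-copy figure, which is an assumption. Function names are illustrative.

```python
def share_per_library(total_cost: float, subscribers: int) -> float:
    """Average cost per library when a shared cost is split evenly."""
    return total_cost / subscribers


# Reported annual cost per library when 30 copies are sold (Canadian dollars).
COST_PER_LIBRARY_AT_30 = 43_399

# ASSUMPTION: the implied total is inferred here, not reported in the text.
implied_total = COST_PER_LIBRARY_AT_30 * 30

# Spreading the same total over 42 copies (30 subscriptions plus 12 extra).
cost_at_42 = share_per_library(implied_total, 42)

# Payback period per volume at the reported annual storage saving of $5.73.
ANNUAL_SAVING = 5.73
paybacks = [round(cost / ANNUAL_SAVING, 1) for cost in (9.69, 13.56)]

print(f"cost per library at 42 copies: ${cost_at_42:,.0f}")
print(f"payback range in years: {paybacks[0]} to {paybacks[1]}")
```

The inferred share at 42 copies agrees with the $30,999 reported above, and the payback range matches the 1.7-2.4 years in the text.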
Once produced, it is anticipated that a microfiche copy of a text will last for 100 years. The purchase of microfiche is an investment in an archival copy of materials that is expected to provide access for patrons to the information for many years. If the cost of the microfiche is spread out or amortized over a 100-year period, then the annual cost of microfiche production is only $0.65-$0.46 per 216 page volume per year. When this is added to the cost of storage from Table 2, the annual cost comes to $0.81-$0.62 per volume per year for producing, storing, and providing access to a text in microfiche format.
These costs indicate that when microfiche is produced in large numbers to accommodate several libraries, it costs significantly less to produce, store, and provide access to microfiche than to books. This shared cost per library declines further if the number of libraries acquiring subscriptions increases. In addition, the CIHM microfiche subscription provides access to a larger collection of texts than is likely to exist in any single library of rare books. These cost estimates show that microfiche is the more cost-effective alternative to library storage of print to provide patron access to out-of-print texts.
The Cost of Digital
Digitization of texts may be able to provide even greater savings relative to microfiche and print. Unlike print and microfiche, which must be produced and delivered to a library, digital texts have the advantage of being stored remotely but accessed globally via the Internet. The cost of reproduction and distribution of digital information in a networked environment is zero. The only costs are the one-time fixed costs of producing and the annual fixed costs of storing the data. These fixed costs can be shared by the subscribing libraries. In theory this could drive the cost per library to a significantly lower level than with microfiche.
In the Early Canadiana Online Project, microfiche was converted to digital format. Microfiche was sent to Preservation Resources for scanning and the University of Michigan for optical character recognition. Cost estimates shown in Table 4 are based on contractual costs for scanning and OCR.
The total costs for production are $236.08 per title or $1.20 per image. Costs in the second and future years for digital storage and access are $35.76 per title or $0.18 per image. This includes the cost of salaries for maintaining the ECO Project database and annual costs of hardware storage. Although the cost of producing digital copy from fiche is less than the cost of producing microfiche, the cost of storage and access for digital, in this project, is more expensive. This is because the digital costs are averaged over a comparatively small number of images, which yields a higher average cost than the cost per fiche in a university microtext room that contains hundreds of thousands of microfiche.
There are two factors that significantly lower the average cost per image of digital production and storage: the number of libraries subscribing to the database and the number of images stored. The production costs of the digital images are fixed costs that are constant regardless of the number of libraries that subscribe to the database. If there are 30 library subscriptions to the database the cost per library would be $8.63 per volume. An increase in the number of libraries or other organizations that subscribe to the database will decrease the "cost-share" for each organization. In addition, the annual cost of storage and access to the database is also a "shared" cost. If this cost is shared among 30 libraries it decreases to $1.31 per volume per library per year.
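A short sketch of how the per-library share falls as subscriptions grow is given below. It starts from the rounded per-image costs quoted earlier and the 216 page volume size, so its results differ by a cent or two from the $8.63 and $1.31 reported above, which were presumably computed from unrounded data. The subscriber counts other than 30 are hypothetical.

```python
IMAGES_PER_VOLUME = 216            # volume size used throughout the study
PRODUCTION_COST_PER_IMAGE = 1.20   # one-time cost, Canadian dollars (rounded)
ANNUAL_COST_PER_IMAGE = 0.18       # storage and access per year (rounded)


def share_per_volume(cost_per_image: float, subscribers: int) -> float:
    """Per-library share of a fixed per-volume cost split among subscribers."""
    return cost_per_image * IMAGES_PER_VOLUME / subscribers


# 30 is the case discussed in the text; 10 and 100 are HYPOTHETICAL counts.
for n in (10, 30, 100):
    production = share_per_volume(PRODUCTION_COST_PER_IMAGE, n)
    annual = share_per_volume(ANNUAL_COST_PER_IMAGE, n)
    print(f"{n:>3} libraries: ${production:6.2f} production, "
          f"${annual:5.2f} per year, per volume per library")
```

Because both costs are fixed with respect to the number of subscribers, each library's share is inversely proportional to the number of subscribing organizations.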
As the number of images available in the ECO Project increases, the cost per volume will also decline. Space costs (utilities, construction, etc.) and salaries for maintaining and updating the database and server constitute 97 percent of the costs of storage and access. These costs are incurred regardless of the number of images. Storage costs per volume are $0.90 of annual costs. As the number of images in the database increases, total storage costs will increase, but the average cost will continue to decline.
The cost estimates from Table 4 can be compared to similar recent studies estimating the cost of digital production. Estimates from studies at Cornell University and Yale University are shown in Table 5. (Cost estimates from Cornell and Yale are shown in Canadian dollars for comparison. Cost per volume is based on a 216 page text.)
These earlier studies show a significantly lower cost of digitization. The Cornell study created digital copies from paper while the study at Yale created digital copy from microfiche. The major difference between the Early Canadiana Online Project and these earlier studies is the method used for estimating costs. Both the Yale and Cornell studies estimated costs by timing staff scanning pages of print or microfiche. These studies are based on the marginal cost of scanning images and producing digital copy. The cost estimates for the ECO project are average costs based on dividing total project costs by the number of images, titles, or volumes. The ECO Project cost analysis includes the full cost of producing digital copies and mounting the database on a server for access over the Internet. The ECO project is larger in scope, number of titles, and number of images. ECO costs include all salaries, space costs, and outsourcing of digitization and OCR. Therefore, this cost analysis should be viewed as a liberal cost estimate of a large digitization project with Internet access to the database.
Diseconomies of Scale
The ECO project scanned a larger number of titles and images than the projects at Yale University and Cornell University. The project scanned 3308 titles compared to the 1270 titles scanned at Cornell or the 2000 titles scanned at Yale. Table 6 compares fixed, variable, and total cost estimates for the three projects.
The variable cost estimates in Table 6 for the ECO project include only the cost of scanning the images. OCR, space, and other salary costs contribute to total costs. For comparison with the Yale and Cornell studies, however, the vendor's cost of providing digital access may be more relevant. If more texts are digitized without OCR, then the additional cost would be $0.674 per page. The relative costs and size of the three projects are shown in Table 7.
The increase in cost per image and cost per title between the three projects may be the result of diseconomies of scale, i.e., an increasing average cost as output increases. Larger projects may require more staff or have a greater complexity of task that results in higher costs per unit. However, much of the difference shown may simply be the result of different methods of estimating costs.
Cost of Access to Digital Information
The cost of access to digital information is difficult to quantify. Access to digital information includes the personal computer, network connection, and space used by the patron. Since these are all fixed costs of access that a patron or library must incur regardless of what information is accessed, the marginal cost of accessing any image or database is zero.
We can attempt to quantify the average cost per use to the library of providing access to digital information. This is shown in Table 8.
Table 8 includes the cost of computers within the library, staff to maintain the server and network, and the cost of space for each computer. Cost per use is shown in terms of internal use and all uses of library databases regardless of the source. Internal use is defined as the number of unique and significant hits to the library server which originate from within the library (0.3 million per week). Use is the number of hits regardless of source (1.2 million per week). Regardless of which definition of use is applied, access to digital documents comes at a very low average cost per use. This is significantly lower than the average cost per use for microfiche or rare books.
Table 8 also illustrates the importance of understanding the difference between total, average, and marginal costs. Table 8, like previous tables, shows the total and average costs per use. The total cost of providing electronic access within a university library is significant, but the high level of use of terminals within the library results in a very low average cost per use. The marginal or additional cost for each patron's use is zero. All costs in Table 8 are fixed costs, incurred regardless of whether a patron uses a terminal or not. Investments in information technology within university libraries can be expensive although digital documents in a networked environment come at a zero marginal cost of distribution.
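The distinction between total, average, and marginal cost can be illustrated with a minimal sketch. The weekly hit counts are those quoted above; the annual fixed cost is a purely hypothetical placeholder, not a Table 8 figure, and the conversion of weekly hits to annual use by multiplying by 52 is likewise an assumption.

```python
WEEKS_PER_YEAR = 52

# Weekly hit counts quoted in the text.
INTERNAL_HITS_PER_WEEK = 300_000
ALL_HITS_PER_WEEK = 1_200_000

# HYPOTHETICAL annual fixed cost, chosen only for illustration.
HYPOTHETICAL_ANNUAL_FIXED_COST = 500_000.0


def total_cost(uses: int, fixed_cost: float) -> float:
    """All costs are fixed, so total cost does not depend on the use count."""
    return fixed_cost


def average_cost(uses: int, fixed_cost: float) -> float:
    """Total cost divided by the number of uses."""
    return total_cost(uses, fixed_cost) / uses


def marginal_cost(uses: int, fixed_cost: float) -> float:
    """Extra cost of serving one more use."""
    return total_cost(uses + 1, fixed_cost) - total_cost(uses, fixed_cost)


internal_uses = INTERNAL_HITS_PER_WEEK * WEEKS_PER_YEAR
all_uses = ALL_HITS_PER_WEEK * WEEKS_PER_YEAR

avg_internal = average_cost(internal_uses, HYPOTHETICAL_ANNUAL_FIXED_COST)
avg_all = average_cost(all_uses, HYPOTHETICAL_ANNUAL_FIXED_COST)
extra = marginal_cost(all_uses, HYPOTHETICAL_ANNUAL_FIXED_COST)

print(f"average cost per internal use: ${avg_internal:.4f}")
print(f"average cost per use, all sources: ${avg_all:.4f}")
print(f"marginal cost of one more use: ${extra:.2f}")
```

Whatever the true total, counting all uses rather than internal uses alone divides the average by four, while the marginal cost stays at zero.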
User Costs of Access
The final economic cost of access is the cost to the user. With print and microfiche the user must travel to the library to use the information. Any library will have only a limited collection of print titles. To read other titles in print from the collection, a patron may have to travel to another research library. With the CIHM microfiche collection, a research library can offer patrons access to a greater number of titles than are typically available in print, although the patron must still travel to the library to access the microfiche.
Digital copies are accessible to all patrons of subscribing libraries with a network connection. This increased accessibility of the collection to patrons may result in a greater number of subscribing libraries and greater access to the CIHM collection of materials.
The cost to patrons of using information is the opportunity cost of their time spent in acquiring and consuming it. The value of access to information by patrons is reflected in the demand for using the database. The demand for use of Early Canadiana Online is illustrated in Figure 1.
In theory, if the user has a cost of time of $10 per use of a manuscript in a rare books library, he may only use the manuscript 5 times a month. If the patron's opportunity cost of time spent consuming the information decreases, then use will increase.
Microfiche is easier and takes less effort to use than books in a rare book library. Microfiche delivery by library staff takes less time than retrieval of a rare book. Once a patron understands how to use a microfiche reader, he can view several books with relative ease. In addition, patrons do not have to travel to another library to view early Canadiana texts if their library holds the entire CIHM collection on microfiche. If we assume that the cost to a patron of accessing an Early Canadiana text on microfiche is $5, then patron use of the microfiche will increase to 30 times a month.
Finally, digital access lowers the opportunity cost of access to the information even further. Digital access enables patrons to view the information from their personal computer in their home or office, or from a computer terminal in the library. Instant access to a large collection of images from the CIHM collection means faster, searchable access to the images.
To patrons, the time savings from digital access has two parts. First, there is the value to patrons of lower cost access to images they would have traveled to the library to view on microfiche. If a patron would have used microfiche 30 times a month at a cost of $5 per use, and this cost declines to $2 per use in digital form, then this patron has a $3 lower cost of access for 30 uses, or has decreased his cost by $90 a month. Second, there are additional uses of digital access that provide additional benefits to patrons. These additional uses can be assigned an average value of $1.50 each, or one-half of the value of lower cost access to the first 30 uses a month. If use increases to 50, the additional 20 uses per month would provide a benefit to this patron of roughly $30. The total value to this patron would be $120; the $90 in lower costs plus the additional $30 in benefit from an increase in access.
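The two-part benefit described above can be written out as a small calculation. Valuing each additional use at one-half of the per-use saving follows the text; it corresponds to the triangle under a straight-line demand curve, which is an interpretation of this sketch and not stated in the source. The function name is illustrative.

```python
def patron_benefit(old_cost: float, new_cost: float,
                   old_uses: int, new_uses: int) -> tuple[float, float]:
    """Return (saving on existing uses, benefit from additional uses)."""
    saving_per_use = old_cost - new_cost
    # Existing uses each become cheaper by the full saving.
    saving_on_existing = saving_per_use * old_uses
    # Additional uses are valued at half the saving, as in the text.
    benefit_from_new = 0.5 * saving_per_use * (new_uses - old_uses)
    return saving_on_existing, benefit_from_new


# Example from the text: $5 per use on microfiche, $2 per use in digital
# form, with monthly use rising from 30 to 50.
existing, additional = patron_benefit(5.0, 2.0, 30, 50)
total = existing + additional
print(f"lower cost on existing uses: ${existing:.0f}")
print(f"benefit from additional uses: ${additional:.0f}")
print(f"total monthly value: ${total:.0f}")
```

The result reproduces the $90, $30, and $120 figures in the example.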
During this study, patron use of the print, fiche, and digital collection was observed. Patrons were also asked questions about their use and travel time to the library. Annual use of the collection at the University of Toronto and Laval University increased from 2984 for print and microfiche to an estimated 7030 uses of the digital texts. Travel time to the library for print and microfiche patrons varied from less than 30 minutes to more than one day, with 90 percent of patrons needing one hour or less. The results of this survey are discussed in more detail in a separate report by Joan Cherry and Wendy Duff.
If we assume that digital access saves print and microfiche patrons 30 minutes of travel time and that the value of this time is $10 per hour, then the annual economic savings of increasing uses from 2984 to 7030 equals $25,035. This represents a lower-end estimate of the savings from accessing the CIHM collection online versus traveling to the library to use the microfiche or print. Some patrons are likely to save more than 30 minutes of travel time. Other patrons are likely to have an opportunity cost of time greater than $10 per hour. Most significantly, use of the Early Canadiana Online collection is likely to increase as more scholars and students are made aware of it.
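The $25,035 estimate can be reproduced from the stated assumptions, following the formula given in the notes: the full saving applies to the original uses and half the saving to the additional uses. The sketch below uses only figures from the text; the function and parameter names are illustrative.

```python
def annual_travel_savings(old_uses: int, new_uses: int,
                          minutes_saved: float, value_per_hour: float) -> float:
    """Annual value of travel time saved, in Canadian dollars."""
    saving_per_use = (minutes_saved / 60.0) * value_per_hour
    # Original uses receive the full saving per use.
    on_existing = old_uses * saving_per_use
    # Additional uses are valued at half the saving per use.
    on_additional = (new_uses - old_uses) * saving_per_use / 2.0
    return on_existing + on_additional


savings = annual_travel_savings(
    old_uses=2984,        # annual print and microfiche uses
    new_uses=7030,        # estimated annual digital uses
    minutes_saved=30,     # assumed travel time saved per use
    value_per_hour=10.0,  # assumed value of patron time
)
print(f"estimated annual savings: ${savings:,.0f}")
```

Raising either the minutes saved or the hourly value of time scales the estimate proportionally, which is why the figure is described as a lower-end estimate.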
The economic paradox of digital information is finding the correct financial strategy to collect sufficient revenues to pay for the benefits of digitization. Digital information provides greater access to information at a lower cost. However, funding the production, archiving, and access to the information requires creative financing, including value based pricing of information as well as the solicitation of grants and donations.
Information production and access comes at a cost. An accurate measurement of the full economic costs of different methods of information delivery is essential in determining the most cost-effective method. This study has shown the costs of three methods of access: print, microfiche, and digitization of microfiche. The cost of digital information is lower on a cost per library or per patron basis so long as a sufficient number of libraries are interested in subscribing to the database.
In general, the lower cost of digital production will continue to result in more information products appearing in digital format on the Internet. The increase in the number of digital products will further contribute to the information overload of patrons and librarians. Information consumers are confronted with too many journals, databases, and research sources for the limited amount of time and attention they can give to any one source. Given a limited amount of time for information consumption, patrons will search for information of higher quality for use of their time. Any new digital product must have an assurance of quality in order to convince patrons and librarians that there is value in spending time consuming it. Manuscripts of historical significance, such as the ECO Project, produced by trusted organizations, such as CIHM, provide libraries and patrons with an assurance of quality.
Notes and References
 Anne R. Kenney, "Digital to Microfilm Conversion: A Demonstration Project 1994-1996" <http://www.library.cornell.edu/preservation/pub.htm>.
Paul Conway, "Yale University Library's Project Open Book," D-Lib Magazine, <http://www.dlib.org/dlib/february96/yale/02conway.html> February 1999.
 Table 1 costs of utilities, maintenance, salaries, equipment and supplies are based on University of Toronto cost estimates. Construction costs are based on new library construction at the University at Albany. All costs are in Canadian dollars. Exchange rate used is 1.5257. Costs are amortized using 5% rate of interest and life spans as follows: construction 25 years, equipment 5 years, and computers 3 years. Cost per volume based on library capacity of 500,000 volumes. Cost per transaction based on 21,874 transactions.
 William Bowen, "JSTOR and the Economics of Scholarly Communication," The Economics of Information in the Networked Environment, Meredith Butler and Bruce Kingma, eds., Haworth Press 1998.
 Michael Lesk, "Substituting Images for Books: The Economics for Libraries," <http://www.lesk.com/mlesk/unlv/unlv.html> April 4, 1998.
 Malcolm Getz, "Evaluating Digital Strategies for Storing and Retrieving Scholarly Information," Economics of Digital Information: Collection, Storage and Delivery, Sul H. Lee, ed., Haworth Press 1997.
 All costs are amortized in Table 2 with the exception that amortization for microfiche readers used a 15 year life span. Volumes are considered to contain 216 images. Images are page images. Each microfiche image has two page images. University of Toronto microtext use was 62,856 in 1997 for 3,387,777 units stored in a room of 810 square meters.
 Joan Cherry and Wendy Duff, "Improving Access to Early Canadiana," unpublished manuscript, Faculty of Information Studies, University of Toronto, 1999.
 The calculation of the economic value to patrons is (2984 uses)($5 saved per use) plus (7030-2984 additional uses)($2.50 average savings per use) equals $25,035.
I would like to thank The Andrew W. Mellon Foundation, the Canadian Institute for Historical Microreproductions, and the University of Toronto for their support. I would like to thank Pam Bjornson, Meredith Butler, Marshall Clinton, Malcolm Getz, Tim Neff, Guy Teasdale, and Karen Turko for their assistance. The author is solely responsible for any errors in this report.
Copyright © 2000 Bruce R. Kingma