D-Lib Magazine
February 1999
Volume 5 Number 2
ISSN 1082-9873
The D-Lib Test Suite
Testbeds for Digital Libraries Research
William Y. Arms
Corporation for National Research Initiatives
warms@cnri.reston.va.us
Greg Janée
University of California, Santa Barbara
gjanee@alexandria.ucsb.edu
Carl Lagoze
Cornell University
lagoze@cs.cornell.edu
William H. Mischo
University of Illinois, Urbana-Champaign
w-mischo@uiuc.edu
Ginger Ogle
University of California, Berkeley
ginger@cs.berkeley.edu
Scott Stevens
Carnegie Mellon University
sms@cs.cmu.edu
Abstract
The D-Lib Test Suite is a set of digital library collections that are available as testbeds for research. Collectively, they provide a tremendous resource for everybody who is interested in digital libraries and related fields. This project briefing describes the five testbeds and highlights the opportunities that they provide.
The five testbeds have been created by the following universities. Follow the hyperlink for an introduction to each collection.
- Carnegie Mellon University
- Cornell University
- University of California, Berkeley
- University of California, Santa Barbara
- University of Illinois
Further information about the D-Lib Test Suite can be found at:
Purpose
At the beginning of this year, DARPA introduced a new resource for digital library researchers, the D-Lib Test Suite. The Test Suite is a group of university testbeds, coordinated by CNRI, which are made available over the Internet for research in digital libraries and related disciplines. The testbeds include large collections of images, video segments, maps, journal articles with SGML markup, and more.
The overall aim of the Test Suite is to accelerate and enhance research into digital libraries. Such research needs large testbeds to evaluate and demonstrate new concepts. In recent years, several excellent collections have been created in conjunction with federally funded research projects. To maximize the benefit from this previous work, some are now being made available to other researchers. There are three major reasons for this initiative:
- Efficiency of research
- Early digital library research, such as the Digital Libraries Initiative, required each project to build its own testbeds. Many of these testbeds are valuable collections, but creating them is time-consuming and expensive. Moreover, research on a collection has to wait while the testbed is being developed. The Test Suite provides all researchers with resources that they can use at once.
- Quantitative research
- Research results are most valuable when they can be compared with the results of other approaches and validated against many sets of data. Because suitable testbeds have been lacking, most digital library projects test their research against only their own testbeds, which makes methods difficult to compare. The Test Suite provides collections for comparative and quantitative experiments.
- Interoperability and distributed systems
- The Test Suite provides a platform for experiments in interoperability and distributed systems. Some of the Test Suite partners are already engaging in interoperability experiments, such as tests of protocols, metadata standards, type schemes, and markup languages. Other researchers are invited to join them.
Testbeds
The initial testbeds and the testbed partners are as follows. Follow the links for more details about each collection.
- Informedia Digital Video and Spoken Language Document Testbed. Carnegie Mellon University's digital library, with more than 2,000 hours of video segments from news broadcasts and scientific education.
- Networked Computer Science Technical Reference Library. A distributed library of computer science research materials (including D-Lib Magazine) managed by Cornell University.
- The UC Berkeley Environmental Digital Library. An extensive collection of materials about the California environment, including several large collections of images, provided by the University of California, Berkeley.
- The Alexandria Digital Library. The collections of maps and other geospatial data created by the University of California, Santa Barbara.
- DeLIver: Desktop link to Engineering Resources. Articles from current scientific journals, with SGML markup, provided by the Grainger Engineering Library of the University of Illinois at Urbana-Champaign.
Research Program
The intent of the D-Lib Test Suite is to stimulate research. We have no preconceived ideas about who will use the testbeds or what research will be carried out. However, we ourselves are researchers; D-Lib and the partners in the Test Suite will be coordinating a number of research initiatives that use the testbeds. Initially, three categories of research are envisaged:
- Individual research projects
- The test suite and the individual testbeds are available for a wide range of research projects in digital libraries. The aim is to encourage wide use of the testbeds, but there may be some limitations because of resource constraints.
- D-Lib Metrics Working Group
- Measurement, evaluation, and quantitative comparison are themes that should run throughout digital libraries research. Over the past year, the D-Lib Metrics Working Group has begun developing a number of metrics that can be used for quantitative research in digital libraries. It is anticipated that this effort will lead to controlled experiments using several of the testbeds.
- Interoperability
- By its very nature, interoperability research requires diverse, independent testbeds. The members of the test suite have begun experiments in interoperability amongst the testbeds and with other digital libraries.
How to Use the Test Suite
The Test Suite is there to be used. It is a general resource for researchers in digital libraries and related fields. Its success will be measured by the quality of research that it enables. Subject to DARPA approval, it is available to all U.S. government-funded researchers and other not-for-profit groups in the U.S. Non-U.S. and for-profit researchers will be accommodated whenever possible. However, resources are limited. Research projects that will require significant support should contact the Test Suite team. Some of the materials in the collections are licensed to the testbeds; researchers may be required to sign licenses before using these materials.
Researchers in digital libraries who wish to use the Test Suite should contact either William Arms at CNRI or, to use one of the individual testbeds, the named contact person at that testbed.
Acknowledgements
The D-Lib Test Suite is coordinated by the Corporation for National Research Initiatives with support from DARPA, grant N66001-98-1-8908. The development of the testbeds has been supported in part by the NSF/DARPA/NASA Digital Libraries Initiative, and by grants from DARPA, the National Science Foundation, and others.
Copyright © 1999 The Corporation for National Research Initiatives
DOI: 10.1045/february99-arms