 

D-Lib Magazine

May/June 2010
Volume 16, Number 5/6

 

The National Digital Library Project

Wei Dawei, Sun Yigang
National Library of China

doi:10.1045/may2010-dawei

 

Abstract

The National Digital Library Project (NDLP), launched in 2005, has attracted wide attention across Chinese society. This paper introduces the project and covers its basic characteristics, objectives, and content. It explains the basic concepts and overall structure of the project and introduces the design of its basic platform, application platform, business management system, and standardization control system. The article also presents the digital resources involved and the collection policies, such as the collection of web information and new media. As of the end of 2009, the National Library of China maintained over 320 TB of digital resources.

Keywords: Digital Library, National Digital Library of China, National Digital Library Project

 

Introduction

The National Library of China (NLC) is the largest collector of traditional Chinese resources. By the end of 2008, the collections of the NLC totaled 26,980,000 items. Its massive collections enable the NLC to provide information service to the central government and other governmental organizations, educational, scientific, and research institutions, and the general public.

Figure 1. Collections of the NLC

Since the 1980s the Internet has developed rapidly in China. When the China Internet Network Information Center (CNNIC) released the 1st Statistical Survey Report on the Internet Development in China in October 1997, the number of 'netizens' in China had reached 620 thousand, with 290 thousand computers connected to the Internet. According to the 24th Statistical Survey Report on the Internet Development in China, by the end of June 2009 the total number of netizens in China had reached 338 million, of whom 155 million accessed the Internet through their cell phones. Meanwhile, because of the shortage of Chinese digital material, only 12% of Internet resources are in Chinese.

As an important online information provider, the NLC shoulders great responsibility for the collection of Chinese digital resources and subsequent service provision. In October 2005, the NLC launched the National Digital Library Project with the support of government funding.

 

1. Targets of the National Digital Library Project

The targets of the National Digital Library Project include the following:

  • collecting, producing, and preserving Chinese digital resources to establish the largest Chinese digital information preservation database;
  • constructing the hardware and software platform to support the life cycle management of digital resources;
  • providing high-quality Chinese digital resource services both at home and abroad, through the Internet, to build the biggest Chinese digital resources service base; and
  • building a resource delivery and service system that uses the NLC as the operational core, and uses other major domestic libraries as its supplemental service locations, so as to support other major regional and special digital libraries.
 

2. Components of the National Digital Library Project

The components of the National Digital Library Project include hardware infrastructure, digital library application systems, digital library standards, and digital resources collections and services. Through the construction of the hardware and software systems, the following targets will be met:

  • establishing a platform that provides technological support to massive collecting;
  • producing, preserving and managing digital resources;
  • digitizing NLC's important special collections;
  • collecting and preserving the data to ensure long-term preservation of national academic digital resources;
  • setting up the Chinese information service platform to provide information service to the central government, other governmental organizations, key educational, scientific, and research institutions, other social organizations and the general public, as well as supporting other digital library systems;
  • building the digital library standard system and innovating the key technologies and standards for Chinese information processing.

The components of National Digital Library of China are illustrated in Figure 2 below.

Figure 2. Components of National Digital Library of China
 

2.1 Hardware Infrastructure

The hardware infrastructure includes network, storage, and cluster systems.

The National Digital Library will be connected to the major networking systems of China, such as those of the China Telecommunications Corporation, China United Network Communications Corporation Limited, and State Administration of Radio Film and Television. In order to provide better service to government, educational and research institutions, the NLC systems will interconnect at high speed with the China Research and Educational Network and the network systems of the main government organizations. As NLC Phase II is completed, it will connect with the network systems of Phase I and branch libraries. Furthermore, wireless access to the network will be available to readers in NLC Phases I and II.

The storage policy of the NLC is to combine online, near-line, and offline storage systems. The NLC will use disk, tape, and CD for online storage, FC-SATA disk for near-line storage, and CD and tape for offline storage. The capacities of online, near-line, and offline storage will grow to 150 TB, 150 TB, and 360 TB respectively.

The following are stored online:

  • real-time bibliographic records;
  • data needed in the process of digital resources distribution and service; and
  • the temporary data produced in the process of digitization.

Near-line storage is used for data with a low frequency of use, high-quality digital files for permanent preservation, and resources collected from the Internet. Offline storage is used for little-used digital resources, digital resources that need to be permanently preserved, and data backup. Cluster computing will also be employed, and a cluster management system will be built to manage the multi-cluster system.
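The tiering policy described above can be illustrated with a minimal Python sketch. The category labels, the `TIER_POLICY` table, and the `tiers_for` function are hypothetical names introduced here for illustration; only the tier assignments and the planned capacities come from the text. Permanently preserved files appear under both near-line and offline storage, as in the description above.

```python
from enum import Enum


class Tier(Enum):
    ONLINE = "online"
    NEAR_LINE = "near-line"
    OFFLINE = "offline"


# Planned capacities in TB, as stated in the text.
PLANNED_CAPACITY_TB = {Tier.ONLINE: 150, Tier.NEAR_LINE: 150, Tier.OFFLINE: 360}

# Hypothetical category labels mapped to the tiers the text assigns them to.
TIER_POLICY = {
    "bibliographic_records": {Tier.ONLINE},
    "distribution_service_data": {Tier.ONLINE},
    "digitization_temp_data": {Tier.ONLINE},
    "low_frequency_data": {Tier.NEAR_LINE},
    "harvested_web_resources": {Tier.NEAR_LINE},
    "permanent_preservation_files": {Tier.NEAR_LINE, Tier.OFFLINE},
    "little_used_resources": {Tier.OFFLINE},
    "backups": {Tier.OFFLINE},
}


def tiers_for(category: str) -> set[Tier]:
    """Return the storage tiers assigned to a data category."""
    try:
        return TIER_POLICY[category]
    except KeyError:
        raise ValueError(f"unknown data category: {category!r}") from None


# Total planned capacity across all tiers: 150 + 150 + 360 = 660 TB.
total_tb = sum(PLANNED_CAPACITY_TB.values())
```

Keeping the policy in a lookup table rather than in branching code makes the overlap between tiers explicit and easy to revise if the policy changes.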

 

2.2 Digital Library Application Systems

The main target for digital library application systems is the construction of a digital resource collection and acquisition system, a digital resources processing system, a digital resources organization and management system, and a digital resources distribution and service system, with the management of digital resources life cycle as its core.

The function of the digital resources collecting and acquiring system is to digitize print, audio and video materials; collect web resources with special topics and provide a channel for the educational, scientific and research institutions to deposit their doctoral dissertations and other e-resources. The system includes three subsystems:

  1. digitization of print resources;
  2. online resources harvesting;
  3. legal deposit of digital resources.

The major missions of the digital resources processing system are to produce and combine metadata, build databases, carry out knowledge discovery and organization, and share the metadata of printed documents and electronic resources among literature information agencies nationwide. The system includes four subsystems:

  1. metadata preparation and integration;
  2. resource base construction and integration;
  3. knowledge organization and management;
  4. the National Union Catalogue.

The function of the digital resources organization and management system is to manage the above-mentioned digital resources in an orderly way and preserve them so as to make them available decades later, or even several hundred years later. The digital resources organization and management system can also register and manage the copyright of digital resources. Thus, the digital resources distribution and service system can provide information service according to the copyright of the resources. The system includes three subsystems:

  1. digital resources organization and management;
  2. long-term preservation;
  3. digital rights management.
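As a rough illustration of how a service layer might consult registered rights before delivering content, consider the following Python sketch. Everything in it is hypothetical: the `Access` levels, the `RightsRecord` fields, and the `pages_served` function are names invented here, and the rule that intranet readers see the full resource is an assumption rather than something the project documents. The 24-page preview in the usage note mirrors the limit this article reports for dissertations and Min Guo documents.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    INTERNET_FULL = "full text on the Internet"
    INTERNET_PREVIEW = "preview pages on the Internet"
    INTRANET_ONLY = "NLC intranet only"


@dataclass(frozen=True)
class RightsRecord:
    resource_id: str
    access: Access
    preview_pages: int = 0  # only meaningful for INTERNET_PREVIEW


def pages_served(record: RightsRecord, total_pages: int, on_intranet: bool) -> range:
    """Return the 1-based page numbers a reader may view."""
    # Assumption: readers on the intranet may view the whole resource.
    if on_intranet or record.access is Access.INTERNET_FULL:
        return range(1, total_pages + 1)
    if record.access is Access.INTERNET_PREVIEW:
        return range(1, min(record.preview_pages, total_pages) + 1)
    return range(0)  # intranet-only resource requested from outside
```

For example, `pages_served(RightsRecord("thesis-1", Access.INTERNET_PREVIEW, 24), 300, on_intranet=False)` yields pages 1 through 24, while the same call with `on_intranet=True` yields all 300 pages.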

The major tasks of the digital resources distribution and service system are to package all types of digital resources, provide service according to user requirements, and manage the digital resources. The whole system can be divided into many subsystems, including but not limited to the metadata search subsystem, the virtual reference subsystem, the interlibrary loan and document delivery subsystems, the grass-roots level resource distribution subsystem, the on-demand subsystem, and the full-text search subsystem.

The National Digital Library workflow will be established on the basis of the four systems mentioned above.

First, the digital resources collection system and digital resources production system prepare the digital resources to meet the requirements of digital library management and service provision. This includes digitizing print, audio, and video materials; processing harvested web information; and performing format conversion and metadata indexing of databases, e-books, and e-periodicals.

Next, the processed digital resources enter into the digital resources management system. The management system manages the digital objects, metadata and related digital copyright, homogenizes heterogeneous resources, creates a uniform retrieval portal, and distributes the processed digital resources.

Lastly, the digital resources distribution and service system will interact with users to provide convenient service. For example, users can customize special subject information through the "My Library" system to receive information pushed from the distribution and service system; interoperate online with professional reference librarians; perform cross-platform retrieval; and accept documents from the delivery system.
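The three workflow stages above can be sketched as a small Python pipeline. This is only an illustration under stated assumptions: the `DigitalResource` class, the stage functions, and the dictionary used as a catalogue are hypothetical stand-ins, and the real systems perform far more work (format conversion, rights registration, preservation) than the simplified steps shown here.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalResource:
    title: str
    source: str  # e.g. "digitized print" or "harvested web"
    metadata: dict = field(default_factory=dict)
    history: list = field(default_factory=list)


def collect(title: str, source: str) -> DigitalResource:
    """Stage 1a: bring a resource into the workflow."""
    res = DigitalResource(title, source)
    res.history.append("collected")
    return res


def process(res: DigitalResource) -> DigitalResource:
    """Stage 1b: attach descriptive metadata (greatly simplified)."""
    res.metadata["title"] = res.title
    res.metadata["source"] = res.source
    res.history.append("processed")
    return res


def manage(res: DigitalResource, catalogue: dict) -> DigitalResource:
    """Stage 2: register the resource in one uniform catalogue."""
    catalogue[res.title] = res
    res.history.append("managed")
    return res


def distribute(catalogue: dict, query: str) -> list[str]:
    """Stage 3: answer a reader's query from the uniform catalogue."""
    q = query.lower()
    return sorted(title for title in catalogue if q in title.lower())


catalogue: dict = {}
for title, source in [
    ("Dunhuang manuscript scan", "digitized print"),
    ("Olympic Games web archive", "harvested web"),
]:
    manage(process(collect(title, source)), catalogue)

hits = distribute(catalogue, "dunhuang")
```

Because every resource passes through the same stages regardless of its origin, heterogeneous sources end up behind a single retrieval entry point, which is the property the workflow description emphasizes.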

 

2.3 Digital Library Standards

The NDL Project will formulate a series of standards regulating resource construction, description, organization, long-term preservation and service provision so as to make all the data normative and easy to deal with. The National Digital Library Standard System primarily includes standardized processing of Chinese characters, digital object identifiers, digital object management, general regulation of metadata, knowledge management, and digital resource statistics.

Figure 3. Standards of the China Digital Library Project
 

2.4 The Construction of Digital Resources and Service Provision

The digital resources of NLC come from:

  1. legal deposits;
  2. purchased databases;
  3. self-developed databases (usually from its unique collections and web-based information);
  4. the digital collections harvested from the Internet.

As of 30th June 2009, the total volume of its digitized resources had already exceeded 250 TB.

2.4.1 Deposit of Digital Resources

According to the nation's regulations, the National Library receives the legal deposit of digital publications, which includes audio tapes, video tapes, laser disks, VCDs, DVDs, electronic newspapers and other electronic publications. The NLC currently has 1,620,000 pieces of all types of electronic publications.

2.4.2 Purchased Digital Resources

By the end of 2008, the number of purchased databases had reached 136, of which 59 are Chinese and 77 are foreign language. The content of these databases includes periodicals in Chinese and foreign languages, newspapers, books, dissertations, conference papers, etc. Depending on the copyright terms of each database, some are available on the Internet, while others are only available through the NLC's Intranet.

2.4.3 Digitized Special Collections

The NLC began the digitization of its collections in 2000. By the end of 2008, the total volume of its digitized special collections exceeded 180 TB, which includes electronic books, dissertations, Min Guo documents, on-line lectures, oracle bones, Dunhuang materials, rubbings, digital chronicles, New Year pictures, etc. Some of the NLC's special collections are described below.

 

Digital Chronicles: To date, the NLC has digitized 6,868 chronicles. To provide readers with access to this information, the NLC is converting the digital chronicles into full-text data; 2,841 have been completed. The first 24 pages of each chronicle are available on the Internet. Upon completion of the hardware platform of the National Digital Library Project, all of the digital chronicles will be available online.

 

Digital Oracle Bone Inscriptions and Stone Rubbings: The NLC holds about 6,000 oracle bone inscriptions, all of which have been digitized, and 30,000 stone rubbings, of which 24,000 have been digitized. Users can access all of these resources on the Internet.

 

International Dunhuang Project: The NLC is one of the partners of the International Dunhuang Project (IDP), along with the British Library and other institutions. The IDP project website was launched in 2002. Currently, 18,000 images of Dunhuang manuscripts are available on the Internet. (See http://idp.nlc.gov.cn/.)

 

Xixia Dynasty Collection: Most of the Xixia materials collected by the NLC are the only existing copies and are quite valuable to scholars. The NLC began digitizing these materials in 2002. For the convenience of researchers, the NLC also digitized research papers on the Xixia Dynasty. At present, 14,000 pages of papers have been digitized and are available on the Internet.

 

New Year Pictures Database: New Year pictures are traditional Chinese drawings beloved by the Chinese people. The NLC has collected more than 4,000 of them. It has digitized 607 out of 734 pieces, and a description of every New Year picture is also available, including the content, historical background, special features, and other related information. All of these resources are available on the Internet.

 

Doctoral Dissertations: The NLC is the government-appointed institution for collecting doctoral dissertations. The total volume of the collection has reached 120,000, making NLC the largest collector of Chinese dissertations. For the purposes of preservation and usage, the NLC began digitizing dissertations in 2004, and 109,000 have been digitized to date. The NLC provides access to the first 24 pages of each dissertation on the Internet.

 

Min Guo Documents Database: The NLC's collection of Min Guo documents is in urgent need of rescue due to serious acidification. The NLC has therefore paid special attention to the digitization of these materials. It has digitized the following Min Guo documents: 8,434 Min Guo books totaling 2,150,000 pages, and 4,535 Min Guo periodicals totaling 86,000 volumes and approximately 5,040,000 pages. The NLC provides access to the first 24 pages of each document on the Internet.
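The progress figures quoted for the special collections above can be turned into completion rates with simple arithmetic. The short Python sketch below does this; the `PROGRESS` table and the `completion` helper are names introduced here for illustration, and the counts are copied from the descriptions in this section.

```python
# (completed, total) pairs taken from the collection descriptions above.
PROGRESS = {
    "chronicles converted to full text": (2_841, 6_868),
    "stone rubbings digitized": (24_000, 30_000),
    "New Year pictures digitized": (607, 734),
    "doctoral dissertations digitized": (109_000, 120_000),
}


def completion(done: int, total: int) -> float:
    """Return the completed share as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * done / total


for label, (done, total) in PROGRESS.items():
    print(f"{label}: {completion(done, total):.1f}%")
```

Rounded to one decimal place, the rates are about 41.4% for full-text chronicles, 80.0% for stone rubbings, 82.7% for New Year pictures, and 90.8% for dissertations.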

 

2.5 Collection of Web Information

Collecting the increasing amount of web-based information is an important responsibility of the NLC. The NLC started pilot work on web information collection and preservation in 2003. To date, it has collected and preserved all of the information from more than 20,000 governmental websites and 245 e-newspapers, as well as special subject information on topics such as Chinese studies and the Olympic Games.

 

3. New Media Services

New media services refer to the newly developed mobile media services aimed at smart phones, as well as digital library services based on digital television.

Mobile devices such as smart phones provide portability, real time response, interactivity, and other characteristics which reduce the limitations on time and place for public information usage, information collection, and reader interaction, thus enabling information sources to be more diverse, comprehensive and timely.

In recent years, the technology of digital television has been advancing rapidly. Digital television is not only a brand new broadcast technology, but also a new kind of life style. Its effect can spread into every part of life. It has changed television from one-way communication to interactive information dissemination, as well as greatly improving the quality of sound and image. The State Administration of Radio Film and Television has announced that digital video broadcasting would be launched in 2010 and that analog television broadcasting would cease. Digital television technology will be a key media platform for digital library services.

The NLC, having considered the impact of new media, has decided to energetically explore new methods and forms for digital library services, as illustrated in Table 1 below.

| Service Type | Service Mode | Details |
|---|---|---|
| Announcements, Promotions, and Exhibitions | SMS, MMS, WAP, Digital TV | This type of service includes the announcement of library information such as notices, notes about user accounts, book recommendations, and exhibitions. Promotions and exhibitions can be announced using SMS and MMS, as well as the WAP web and digital television service modes. The WAP web and digital television service modes are suitable for exhibitions. |
| Retrieval and Reading | WAP, Digital TV | This kind of service includes bibliographic information, electronic resources retrieval, and online reading or downloading. |
| Library Navigation | WAP, Digital TV | This kind of service provides a general introduction to the library, the readers' guide, and library navigation. |
| Interaction with Readers | SMS, WAP | This kind of service offers book recommendations, lecture reservations, reader surveys, etc. |
| Document or Book Management | SMS, MMS, Digital TV | This kind of service includes the management of reader accounts, book renewal, overdue notices, reminders, book reservations, etc. |
| Video-on-Demand | Digital TV | Currently, libraries rarely lend their collections of CDs, which contain large amounts of information, to users. These CDs can be provided to users through digital television, giving users access to the CDs at home and enabling libraries to bring their CD collections into full play without worrying about damage to the CDs. |

Table 1: Digital Library Service Based on New Media
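The mapping in Table 1 between service types and delivery modes lends itself to a simple lookup structure. The Python sketch below encodes the table and answers the reverse question of which services a given channel can carry; the `SERVICES` dictionary and the `services_for` function are hypothetical names introduced for illustration, not part of any NLC system.

```python
# Service types and their delivery modes, transcribed from Table 1.
SERVICES = {
    "Announcements, Promotions, and Exhibitions": {"SMS", "MMS", "WAP", "Digital TV"},
    "Retrieval and Reading": {"WAP", "Digital TV"},
    "Library Navigation": {"WAP", "Digital TV"},
    "Interaction with Readers": {"SMS", "WAP"},
    "Document or Book Management": {"SMS", "MMS", "Digital TV"},
    "Video-on-Demand": {"Digital TV"},
}


def services_for(mode: str) -> list[str]:
    """Return the service types that can be delivered through a mode."""
    return sorted(name for name, modes in SERVICES.items() if mode in modes)


sms_services = services_for("SMS")
tv_services = services_for("Digital TV")
```

Here `sms_services` contains the three service types that list SMS as a mode, while `tv_services` contains every service type except "Interaction with Readers".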

The market-based development of digital television and mobile media has enabled new media services to become a growth area for the National Digital Library.

 

About the Authors


Wei Dawei has served as deputy director, acting director, and now director of the Automation Department (also known as the New Technology Research Center) of the National Library of China (NLC), and as director of the NLC Computer & Network System Department (i.e., the NLC Library Technology Research Center). He also manages the System Management & Maintenance Department of the Beijing National Library Digital Technology Co., Ltd.

 

Sun Yigang received his Master's degree from Beihang University. He is a research librarian at the National Library of China, director of the Digital Library Research and Construction Professional Committee of the Chinese Society for Library Science, a member of the Internet Application and Information Service Committee of the Chinese Institute of Communications, and a member of the China Society for Scientific and Technical Information. His research focuses on digital libraries, information processing, and computer applications.

 
 