The Cross-Language Evaluation Forum (CLEF) has been running for nine years now. The results of the CLEF 2008 campaign were presented at a two-and-a-half day workshop held in Aarhus, Denmark, 17-19 September, immediately following the twelfth European Conference on Digital Libraries (ECDL 2008).
The objective of the Cross-Language Evaluation Forum is to promote research in the field of multilingual system development. This is done through the organisation of annual evaluation campaigns offering tasks designed to test different aspects of mono- and cross-language information retrieval (IR) systems. The intention is to encourage experimentation with all kinds of multilingual information access, from the development of systems for monolingual retrieval operating on many languages to the implementation of complete multilingual multimedia search services, and thereby to stimulate the development of next-generation multilingual IR systems.
This year 100 groups, mainly but not only from academia, participated in the campaign. Most of the groups were from Europe, but there was also a good contingent from North America and Asia plus a few participants from South America and Africa.
CLEF 2008 Tracks
CLEF 2008 offered seven tracks designed to evaluate the performance of systems for:
Two new tracks were offered as pilot tasks:
In addition, MorphoChallenge 2008, an activity of the EU Network of Excellence Pascal, was organized in collaboration with CLEF.
Most of the tracks adopt a corpus-based automatic scoring method for the assessment of system performance. The test collections consist of sets of statements representing information needs known as topics (queries) and collections of documents (corpora). System performance is evaluated by judging the documents retrieved in response to a topic with respect to their relevance (relevance assessments) and computing recall and precision measures.
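As a rough illustration of the recall and precision measures mentioned above, the following minimal sketch computes both for a single topic. The function name, the document identifiers and the set-based formulation are assumptions made for this example; they are not taken from the CLEF evaluation tooling, which typically reports rank-based measures as well.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one topic.

    retrieved: document ids returned by the system for the topic
    relevant:  document ids judged relevant by the assessors
    Both arguments are hypothetical inputs for illustration only.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set  # relevant documents that were retrieved

    # Precision: fraction of retrieved documents that are relevant.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Made-up example: 4 documents retrieved, 3 judged relevant, 2 in common.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d7"])
print(f"precision={p:.2f} recall={r:.2f}")
```

In this made-up example two of the four retrieved documents are relevant, and two of the three relevant documents were found.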
A number of document collections were used to build the test collections for CLEF 2008:
Diverse sets of topics or queries were prepared in many languages according to the needs of the various tracks. At the end of the campaign, the result is a number of valuable and reusable test collections.
The CLEF Workshops play an important role by providing the opportunity for all the groups that have participated in the evaluation campaign to get together to compare approaches and exchange ideas. This year the Workshop was held in Aarhus, Denmark, and was attended by 150 researchers and system developers. The schedule was divided between plenary track overviews and parallel, poster and breakout sessions. There were several invited talks. Noriko Kando, National Institute of Informatics, Tokyo, reported on the activities of NTCIR-7 (NTCIR is an evaluation initiative focussed on testing IR systems for Asian languages), while John Tait of the Information Retrieval Facility (IRF), Vienna, presented a proposal for an Intellectual Property track in CLEF 2009 that would focus on cross-language retrieval of patents.
The presentations given at the CLEF Workshops and detailed reports on the experiments of CLEF 2008 and previous years can be found on the CLEF website. The preliminary agenda for CLEF 2009 will be available from mid-November.
CLEF and TrebleCLEF
CLEF 2008 was organized under the auspices of TrebleCLEF, a Coordination Action of the Seventh Framework Programme. Over the years, CLEF has done much to promote the development of multilingual IR systems. However, the focus has been on building and testing research prototypes rather than developing fully operational systems. TrebleCLEF is building on and extending the results achieved by CLEF. The objective is to support the development and consolidation of expertise in the multidisciplinary research area of multilingual information access and to promote a dissemination action in the relevant application communities.
TrebleCLEF thus has three main goals:
The aim will be to provide applications that need multilingual search solutions with the capability of identifying the most appropriate technology. For this purpose, a series of best practice workshops has been organised:
A Summer School on Multilingual Information Access is also being organised for June 2009 in Pisa. The focus of the Summer School will be on "How to build effective MLIA systems and how to evaluate them".
More information on the activities of TrebleCLEF can be found on the website.
Copyright © 2008 Carol Peters