The Cross-Language Evaluation Forum (CLEF) aims at promoting research and development in Cross-Language Information Retrieval (CLIR).
CLEF aims at providing an infrastructure for the testing and evaluation of information
retrieval systems operating on European languages, and at
creating test suites of reusable data which can be employed by system developers
for benchmarking purposes.
The CLEF 2001 evaluation campaign attracted an increased number of
participating groups (from European countries, North America and
Asia); in total, 31 groups delivered results. Several new groups, especially
from Europe, took part. At the workshop the results of the
CLEF 2001 campaign were summarised, and the methodology and functionality
of CLIR systems and the future work of the CLEF campaign were discussed.
The results submitted by the participants show
a steady improvement in overall performance, but they also show that
considerable difficulties remain with language-specific problems such as resolving
compounds and ambiguous terms. Several groups tried automatic
translation tools such as Systran, Power Translator or Babylon, which performed
reasonably well but failed completely in some cases, mainly with proper names
that have natural-language equivalents, idiomatic phrases,
homonyms and hyphenated compounds. In the end this means that
machine translation systems alone are not sufficient; additional
components are needed to avoid these problems.
Another strategy for building CLIR systems was to combine the results
of different machine translation systems to achieve better retrieval
results; this strategy worked well. Other groups combined different
methods for solving the translation problem, and in these cases the results
improved, too. An increasing number of groups used corpus-based, statistical
In most tests the overall results are quite good (although
they do not reach the level of monolingual retrieval), but each system has
specific problems in some cases, which differ between the groups and depend
on the strategies used. The results also suffer from quality problems
(in dictionaries, corpora, language tools and machine translation systems)
for “less popular” languages or language pairs (such as German/Italian).
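The strategy of combining the results obtained with different machine translation systems can be illustrated with a small sketch. The report does not say which merging method the groups used, so the code below is only one possible reading: it assumes that each translated query yields a scored result list and fuses the lists by summing min-max-normalised scores (the technique commonly called CombSUM). The function names, document ids and scores are hypothetical.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Scale one run's scores to [0, 1] so that different runs are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally good.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(runs):
    """Fuse several runs (doc_id -> score) by summing their normalised scores."""
    fused = defaultdict(float)
    for run in runs:
        for doc, score in min_max_normalize(run).items():
            fused[doc] += score
    # Highest fused score first; ties are broken by document id.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores for one query translated by two different MT systems.
run_mt_a = {"doc1": 12.0, "doc2": 9.0, "doc3": 3.0}
run_mt_b = {"doc2": 0.8, "doc4": 0.6, "doc1": 0.2}

ranking = comb_sum([run_mt_a, run_mt_b])
for doc, score in ranking:
    print(f"{doc}\t{score:.3f}")
```

In this sketch a document retrieved through both translations is rewarded, which is one plausible reason why a combination can outperform a single translation; whether this matches the participants' actual implementations is an assumption.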
The proceedings of last year (CLEF 2000) have now been printed.
At the main ECDL 2001 conference the papers were grouped around the
following aspects of research in the digital library area: user modeling
and user communities, digitisation and multimedia, knowledge management,
information retrieval and multilinguality. In the multilinguality session
the papers dealt with CLIR in Japanese and English and with the handling
of character sets. With respect to the work of the ETB, the paper on Learning
Spaces in Digital Libraries (Coleman et al.) could be of interest; it also deals with
the application of a metadata set to geospatially referenced learning resources.
The proceedings were printed in advance of the conference.
Panos Constantopoulos, Ingeborg T. Sølvberg (eds.): Research
and Advanced Technology for Digital Libraries, 5th European Conference,
ECDL 2001, Darmstadt, Germany, September 2001, Proceedings. Berlin et
al.: Springer 2001 (= Lecture Notes in Computer Science (LNCS), 2163)
Anita S. Coleman et al.: Learning Spaces in Digital Libraries. In:
Constantopoulos/Sølvberg (eds.): Research and Advanced Technology
for Digital Libraries, 5th European Conference, ECDL 2001, Darmstadt,
Germany, September 2001, Proceedings. Berlin et al.: Springer 2001 (= LNCS
2163), p. 251-262
Carol Peters (ed.): Cross-Language Information Retrieval and Evaluation.
Workshop of the Cross-Language Evaluation Forum, CLEF 2000,
Lisbon, Portugal, September 21-22, 2000, Revised Papers. Berlin et al.:
Springer 2001 (= Lecture Notes in Computer Science (LNCS), 2069)
||Tuesday, 11 Dec 2001
||thesaurus, standardisation, multilinguality