GeoCLEF 2008

Evaluation of multilingual Geographic Information Retrieval (GIR) systems

Task Description

The 2008 GeoCLEF track will consist of four parts:

  1. A modification of the existing GeoCLEF search task (offered for the last time this year):
    • As in previous years, GeoCLEF will examine geographic search of a text corpus. How best to transform the imprecise descriptions of geographic areas found in many user queries into a machine-readable format is still an open research problem.
    • Some topics will simulate the situation of a user who poses a query while looking at a map on the screen. For these topics, the system will receive the content part and a rectangular shape which defines the geographic context.
    • 75 training topics are available from previous years; another 25 old CLEF ad-hoc topics will be assembled for additional training; and 25 test topics will be developed in at least three languages (English, German and Portuguese).
    • Spatial diversity of the results will be evaluated (for some topics)
    • Participants will be invited to suggest topics and issues which they would like to investigate. Suggestions can be made by email: mandl at uni-hildesheim de
  2. The query parsing task:
    • Due to the low number of registered participants, the task had to be cancelled.
    • Nevertheless, interested and registered participants will be granted access to the data (a query log of the MSN search engine). Please contact Xing Xie (at Microsoft dot com) after May 15th.
    • Further details of the task from last year are provided.
  3. GikiP: Wikipedia task pilot (new this year)
    • This pilot focuses on geographical information in Wikipedia. New types of topics suited to Wikipedia will be explored in a multilingual setting.
    • For now, a multilingual search task with German, Portuguese and English is offered. (Other languages may be considered depending on demand.)
    • This task will use the Wikipedia collections developed and made available within the question answering track (QA@CLEF), which are already available for download. Please contact geoclef-admin to gain access to the data.
    • Further details of the task are provided at the pilot's dedicated website.
  4. Pilot: Image search task
    • Images are a natural target for geographic search. This task will explore and evaluate systems for geographic image search.
    • Spatial diversity of the results will be evaluated.
    • This task is organized within the ImageCLEF track 
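
For the map-based topics described in part 1, where a system receives a content part plus a rectangle, the geographic constraint can be pictured roughly as a bounding-box filter. The sketch below is only an illustration: the `BoundingBox` class, the `filter_by_box` function, the input shape and the coordinates are all assumptions made for the example and are not part of the GeoCLEF topic format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Rectangle in decimal degrees (hypothetical representation)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        """Return True if the point lies inside the box (edges included)."""
        if not (self.south <= lat <= self.north):
            return False
        if self.west <= self.east:
            return self.west <= lon <= self.east
        # west > east means the box crosses the 180 degree meridian.
        return lon >= self.west or lon <= self.east


def filter_by_box(docs, box):
    """Keep documents with at least one georeference inside the box.

    `docs` is an iterable of (doc_id, [(lat, lon), ...]) pairs; this
    input shape is an assumption made for the example.
    """
    return [doc_id for doc_id, points in docs
            if any(box.contains(lat, lon) for lat, lon in points)]


# Illustrative data: a box roughly covering Germany, and three documents.
box = BoundingBox(south=47.0, west=5.0, north=55.0, east=15.0)
docs = [("d1", [(52.52, 13.40)]),   # Berlin
        ("d2", [(38.72, -9.14)]),   # Lisbon
        ("d3", [])]                 # no georeference found
print(filter_by_box(docs, box))
```

In practice the content part would still be matched by a text retrieval engine; a filter like this only narrows or re-weights the candidates, and documents without resolved coordinates need separate handling.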
Important Dates
  • Topic Release: May 6, 2008
  • Run submissions due: June 17, 2008
  • Relevance judgments released: July 17, 2008
  • GeoCLEF Notebook papers due: August 15, 2008
  • CLEF Workshop in Aarhus: September 17-19, 2008
GeoCLEF 2008 Topics


Organisers of GeoCLEF

Thomas Mandl (mandl at uni-hildesheim de) and Christa Womser-Hacker, University of Hildesheim, Germany

Diana Santos, Linguateca, SINTEF, Norway, and Paula Carvalho, Linguateca, Portugal

Mark Sanderson, Department of Information Studies, University of Sheffield, UK

Fred Gey and Ray Larson, University of California, Berkeley, USA


To conduct geo-retrieval well, you may need resources such as gazetteers or ontologies. Here is a brief list of resources that we know about. Please contact Mark Sanderson if you have other resources you would like added to this list.

Introduction and background
Geographical Information Retrieval (GIR) concerns the retrieval of information involving some kind of spatial awareness. Given that many documents contain some kind of spatial reference, geographical references (georeferences) can be important for IR, for example to retrieve, re-rank and visualise search results based on a spatial dimension (e.g. “find me news stories about riots near Dublin City”). In addition, many documents contain georeferences expressed in multiple languages, which may or may not be the same as the query language; this requires an additional translation step to enable successful retrieval.
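
As a rough illustration of re-ranking on a spatial dimension, the sketch below orders already-retrieved documents by great-circle distance from a query location. It assumes that "near" can be approximated by a fixed radius and that each result already carries a single resolved coordinate; both are simplifications, and the function names, scores and coordinates are invented for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def rerank_by_distance(results, centre, max_km):
    """Drop results farther than max_km, then sort nearest first.

    `results` is a list of (doc_id, text_score, (lat, lon)) tuples, a
    hypothetical shape. Ties in distance are broken by higher text score.
    """
    lat0, lon0 = centre
    kept = []
    for doc_id, score, (lat, lon) in results:
        dist = haversine_km(lat0, lon0, lat, lon)
        if dist <= max_km:
            kept.append((dist, -score, doc_id))
    kept.sort()
    return [doc_id for _, _, doc_id in kept]


# Illustrative query: stories near Dublin City, within 50 km.
dublin = (53.35, -6.26)
results = [("a", 0.9, (51.51, -0.13)),   # London
           ("b", 0.5, (53.29, -6.13)),   # Dun Laoghaire
           ("c", 0.7, (53.72, -6.35))]   # Drogheda
print(rerank_by_distance(results, dublin, max_km=50))
```

A real system would more likely combine the textual and spatial scores than sort purely by distance, and would first need to resolve place names to coordinates, possibly across languages.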

The aim of GeoCLEF is to provide the necessary framework in which to evaluate GIR systems for search tasks involving both spatial and multilingual aspects. GeoCLEF is the cross-language geographic retrieval track which is run as part of the Cross Language Evaluation Forum (CLEF) campaign.

Mailing list
We have set up a mailing list for general information: geoclef [AT] uni-hildesheim DOT de
Please write to geoclef-request [AT] uni-hildesheim DOT de to be added to the list.
Another mailing list for participants has also been set up:
geoclef-participants [at] uni-hildesheim DOT de

Past GeoCLEF

GeoCLEF 2005

GeoCLEF 2006

GeoCLEF 2007

Last Modified: Apr. 2008