LogCLEF 2009

LogCLEF deals with the analysis of queries as an expression of user behavior.
The goal is to analyze and classify queries in order to improve search systems.



Task Description

Log Analysis and Geographic Query Identification (LAGI)

The identification of geographic queries within a query stream and the recognition of their geographic component are key problems for geographic information retrieval (GIR). Geographic queries require specific treatment and often a geographically oriented output (e.g. a map). The task is to (1) classify geographic queries and (2) identify their geographic and non-geographic elements. The task design and evaluation measures are similar to the ones used in the track in 2007. A real search engine log file and logs from The European Library (TEL) are used for this task.
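
To make the two steps concrete, the following minimal sketch classifies a query as geographic and splits it into geographic and non-geographic terms. It is only an illustration under strong assumptions: the tiny gazetteer, the function name split_geo_query and the whitespace tokenization are all hypothetical and are not part of the task resources or of the official evaluation.

```python
# Hypothetical toy gazetteer; a real system would rely on a much larger
# resource and would have to handle multi-word and ambiguous place names.
GAZETTEER = {"lisbon", "porto", "germany", "padua"}


def split_geo_query(query, gazetteer=GAZETTEER):
    """Return (is_geographic, geo_terms, non_geo_terms) for a query string."""
    tokens = query.lower().split()
    geo_terms = [t for t in tokens if t in gazetteer]
    non_geo_terms = [t for t in tokens if t not in gazetteer]
    # A query counts as geographic here if at least one token is a known place.
    return bool(geo_terms), geo_terms, non_geo_terms


if __name__ == "__main__":
    print(split_geo_query("hotels in Lisbon"))
    print(split_geo_query("medieval manuscripts"))
```

A plain lookup like this misses place names that are ambiguous or absent from the gazetteer, which is part of what makes the task difficult on real log data.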

Task Guidelines and Examples

Resources

Click on the following links to get to the static versions of the Portuguese and the English Wikipedia.


Log Analysis for Digital Societies (LADS)

The Log Analysis for Digital Societies (LADS) task deals with logs from The European Library (TEL) and aims to analyze user behavior with a focus on multilingual search.
Potential targets are query reformulation, multilingual search behavior and community identification.
This task is open to different approaches, in particular data mining approaches that extract knowledge from the data and find interesting user patterns.

Frequently Asked Questions

Suggested sub-tasks for the analysis of the log data are:

  1. user session reconstruction; this step needs to be considered as a prerequisite to the following ones;
  2. user interaction with the portal at query time; e.g. how users interact with the search interface, what kind of search they perform (simple or advanced), how many users are satisfied/unsatisfied with the first search, and how many of them reformulate queries, browse results, or leave the portal to continue the search in a national library;
  3. multilinguality and query reformulation; e.g. which collections users select most often, how the language (country/portal interface) of the user correlates with the collections selected during the search, how users reformulate the query in the same language or in a different language;
  4. user context and user profile; e.g. how the study of the actions in the log can identify user profiles, how the implicit feedback information recorded in the logs can be exploited to create the context in which the user operates and how this context evolves.
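
As an illustration of the first sub-task, the sketch below reconstructs sessions by splitting each user's actions whenever the gap between two consecutive actions exceeds an inactivity threshold. The record layout (user_id, timestamp, action), the 30-minute threshold and the function name reconstruct_sessions are assumptions made for this example only; the actual fields of the TEL logs are given in the file description, and other session definitions are equally possible.

```python
from datetime import datetime, timedelta
from itertools import groupby
from operator import itemgetter

# Assumed inactivity threshold; not prescribed by the task.
SESSION_GAP = timedelta(minutes=30)


def reconstruct_sessions(records, gap=SESSION_GAP):
    """Group (user_id, timestamp, action) records into per-user sessions."""
    sessions = []
    # Sort by user first, then by time, so each user's actions are contiguous.
    ordered = sorted(records, key=itemgetter(0, 1))
    for _user_id, user_records in groupby(ordered, key=itemgetter(0)):
        current = []
        last_time = None
        for record in user_records:
            timestamp = record[1]
            # Start a new session after a long period of inactivity.
            if last_time is not None and timestamp - last_time > gap:
                sessions.append(current)
                current = []
            current.append(record)
            last_time = timestamp
        if current:
            sessions.append(current)
    return sessions


if __name__ == "__main__":
    log = [
        ("a", datetime(2009, 1, 1, 10, 0), "search"),
        ("b", datetime(2009, 1, 1, 10, 5), "search"),
        ("a", datetime(2009, 1, 1, 10, 10), "browse"),
        ("a", datetime(2009, 1, 1, 11, 0), "search"),
    ]
    for session in reconstruct_sessions(log):
        print([(user, action) for user, _, action in session])
```

The later sub-tasks can then work on the resulting sessions, for example by comparing consecutive queries within one session to study reformulation.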
Participants are required to

Sample

Click on the following links to download a sample of the logfiles and a description of the files.

Contact

Thomas Mandl (mandl at uni-hildesheim.de), University of Hildesheim, Germany
Giorgio Maria Di Nunzio (dinunzio at dei.unipd.it), University of Padua, Italy
Maristella Agosti (agosti at dei.unipd.it), University of Padua, Italy
Julia Maria Schulz (schulzju at uni-hildesheim.de), University of Hildesheim, Germany