LogCLEF deals with the analysis of queries as expression of user
The goal is the analysis and classification of queries in order to improve search systems.
LogCLEF will be a workshop lab at CLEF 2010.
Topic and Goal
Log data constitute a relevant aspect in the evaluation process of
the quality of a search engine and the quality of a multilingual search
service; log data can be used to study the usage of a search engine,
and to better adapt it to the objectives the users were expecting to
The goal of LogCLEF is the analysis and classification of queries in order to understand search behavior in multilingual contexts and ultimately to improve search systems.
Data Collection for LogCLEF 2010
The data collection consists of two large logfiles from information
* The European Library (TEL) logs
As in 2009, a large log of activities from The European Library is provided. This service provides access to several national libraries of Europe. Users and content span many languages.
Click on the following links to download a sample of the logfiles and a description of the files.
* Deutscher Bildungsserver (DBS) logs
The "Deutscher Bildungsserver" is a quality-controlled internet directory for educational resources. A raw server log representing three months of activity on the portal is made available. The total size of the files is 5 GB.
Click on the following links to download a sample of the logfiles, a description of the files and a file about the DBS structure.
The main question behind the task definition comes from search
service providers who wonder how they can improve their services.
Ultimately, researchers need to better understand user behavior in
order to reach that high level goal.
Two objectives of the analysis of the logs are proposed, one for each set of logs:
Deutscher Bildungsserver (DBS) logs
The objective of the analysis of the DBS logs is the exploration of the relation between query and viewed content. The analysis can explore formal issues of the query and content as well as the distribution of words within both.
- Are query terms related to the content viewed and/or paths taken within the system?
- Can query modifications be explained by the content viewed?
- Develop metrics to identify successful searches
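One simple way to approach the first question above is to measure how many query terms reappear in the content that was viewed. The following is a minimal sketch under stated assumptions, not a prescribed method: the function names `tokenize` and `query_content_overlap` are hypothetical, the query and page text are assumed to have already been extracted from the logs as plain strings, and plain word overlap stands in for whatever relatedness measure a participant might prefer.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of distinct word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def query_content_overlap(query, content):
    """Return the fraction of distinct query terms that occur in the content.

    Both arguments are plain strings. How they are extracted from the
    log files is outside this sketch (assumption).
    """
    query_terms = tokenize(query)
    if not query_terms:
        return 0.0
    return len(query_terms & tokenize(content)) / len(query_terms)


if __name__ == "__main__":
    # Illustrative strings only, not taken from the DBS logs.
    print(query_content_overlap("physik unterricht", "Mathematik Unterricht"))
```

A score near 1 means most query terms occur in the viewed page, while a score near 0 suggests the user reached content that is lexically unrelated to the query. Stemming or stop-word removal would likely be needed before such a score is meaningful for German text.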
The European Library (TEL) logs
Investigate the issue of query languages with respect to successful search. A search is considered successful if the user performs one of the following actions, which are listed in the right-hand box shown when a clicked item of the result list is displayed:
+ Services: Availability at the library, Link to other services (Amazon, etc.), Collection homepage
+ Options: Save in favorites, Send by email
- language identification for the queries
- initial language vs country IP address
- subsequent languages used on same search
- country of the library vs language of the query vs language of the interface
- Explore the relationship between language of the query, origin of the user, selected interface language and language of library viewed
- Are different languages used within subsequent searches?
- What kind of language changes occur?
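The last two questions can be explored once each query carries a language label. The sketch below assumes that language identification has already been done by some external tool and that queries have been grouped into sessions in chronological order. The function names, the `sessions` dictionary, and the language codes are all hypothetical and do not reflect the actual TEL log format.

```python
from collections import Counter


def language_switches(query_langs):
    """Return (from_lang, to_lang) pairs for consecutive queries whose language differs.

    query_langs is the ordered list of language codes for one session.
    """
    return [(a, b) for a, b in zip(query_langs, query_langs[1:]) if a != b]


def switch_counts(sessions):
    """Count each kind of language switch over all sessions.

    sessions maps a session identifier to its ordered list of language
    codes (an assumed structure, not the log file layout).
    """
    counts = Counter()
    for langs in sessions.values():
        counts.update(language_switches(langs))
    return counts


if __name__ == "__main__":
    # Illustrative sessions only, not real log data.
    example = {"s1": ["de", "de", "en"], "s2": ["fr", "en", "fr"]}
    for (src, dst), n in switch_counts(example).items():
        print(f"{src} -> {dst}: {n}")
```

Counting ordered pairs rather than just the number of switches keeps the direction of each change, which is what the question about the kind of language changes asks for.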
| January - May 2010 | Registration |
| May 2010 | Data Release |
| June 8 | Final Task Description Release |
| July 20 | Submission of Results |
| August 1 | Submission of Notebook Paper |
| August 10 | Submission of Revised Notebook Paper and Extended Abstract |
| September 22-23 | Workshop in Padua |
Maristella Agosti, University of Padua, Italy
Jim Jansen, Pennsylvania State University, USA
Jaap Kamps, University of Amsterdam, The Netherlands
Vivien Petras, Humboldt University Berlin, Germany
Johannes Leveling, Dublin City University, Ireland
Inderjeet Mani, MITRE Corp., USA