LogCLEF 2010
LogCLEF deals with the analysis of queries as an expression of user
behavior.
The goal is the analysis and classification of queries in order to improve
search systems.
LogCLEF will be a workshop lab at CLEF 2010.
Topic and Goal
Log data are an important resource for evaluating the quality of a
search engine and of a multilingual search service. They can be used to
study how a search engine is used and to better adapt it to the goals
its users expect to reach.
The goal of LogCLEF is the analysis and classification of queries in
order to understand search behavior in multilingual contexts and
ultimately to improve search systems.
Data Collection for LogCLEF 2010
The data collection consists of two large logfiles from information
providers:
* The European Library (TEL) logs
As in 2009, a large log of activities from The European Library
is provided. This service provides access to several national
libraries of Europe. Users and content span many languages.
Click on the following links to download a sample
of the logfiles and a description
of the files.
* Deutscher Bildungsserver (DBS) logs
The "Deutscher Bildungsserver"
is a quality-controlled internet directory for educational resources. A
raw server log representing three months of activities on the portal is
made available. The size of the files is 5 GB.
Click on the following links to download a
sample of the logfiles, a description
of the files and a file about the DBS
structure.
Task Definition
The main question behind the task definition comes from search
service providers who wonder how they can improve their services.
Ultimately, researchers need to better understand user behavior in
order to reach that high-level goal.
Two objectives of the analysis of the logs are proposed, one for each
set of logs:
Deutscher Bildungsserver (DBS) logs
The objective of the analysis of the DBS logs is the exploration of the relation between query and viewed content. The analysis can explore formal issues of the query and content as well as the distribution of words within both.
Potential analysis
- Are query terms related to the content viewed and/or paths taken within the system?
- Can query modifications be explained by the content viewed?
- Develop metrics to identify successful searches
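One simple way to approach the first question is to measure how many query terms reappear in the text of the viewed resource. The following is a minimal sketch, not a prescribed method. It assumes that the query string and the text of the viewed page have already been extracted from the raw DBS log; the function names `tokenize`, `query_coverage`, and `jaccard_overlap` are illustrative only.

```python
import re


def tokenize(text):
    """Lowercase the text and split it on non-word characters.

    In Python 3, \\W is Unicode-aware for str patterns, so German
    umlauts stay inside their tokens.
    """
    return {t for t in re.split(r"\W+", text.lower()) if t}


def query_coverage(query, content):
    """Fraction of distinct query terms that also occur in the content."""
    q, c = tokenize(query), tokenize(content)
    return len(q & c) / len(q) if q else 0.0


def jaccard_overlap(query, content):
    """Jaccard coefficient between the query and content term sets."""
    q, c = tokenize(query), tokenize(content)
    if not q or not c:
        return 0.0
    return len(q & c) / len(q | c)
```

For example, `query_coverage("Mathematik Grundschule", "Materialien für Mathematik in der Grundschule")` gives 1.0, because both query terms occur in the content text. Coverage ignores content length, while the Jaccard coefficient penalizes long pages, so the two may rank the same query and page pairs differently.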
The European Library (TEL) logs
Investigate the issue of query languages with respect to successful search. A successful search is defined as one in which the user clicks on an item of the result list and then performs one of the following actions, which are listed in the right-hand box:
+ Services: Availability at the library, Link to other services (Amazon, etc.), Collection homepage
+ Options: Save in favorites, Send by email
Potential analysis
- language identification for the queries
- initial language vs country IP address
- subsequent languages used on same search
- country of the library vs language of the query vs language of the interface
- Explore the relationship between language of the query, origin of the user, selected interface language and language of library viewed
- Are different languages used within subsequent searches?
- What kind of language changes occur?
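The questions about language changes and successful search can be explored with a small amount of bookkeeping once each query carries a language label. The sketch below is only an illustration under several assumptions: the record fields `session_id`, `query_lang`, and `actions` are hypothetical and do not reflect the actual TEL log schema, the language labels are assumed to come from a separate language identification step, and the action names in `SUCCESS_ACTIONS` are placeholders for the services and options listed above.

```python
from collections import defaultdict

# Placeholder labels for the actions that define a successful search.
# The real action names in the TEL logs may differ.
SUCCESS_ACTIONS = {
    "availability",
    "service_link",
    "collection_homepage",
    "save_favorite",
    "send_email",
}


def language_changes(records):
    """List (session_id, from_lang, to_lang) for every switch of query
    language between consecutive searches of the same session.

    Records are assumed to be in chronological order.
    """
    last_lang = {}
    changes = []
    for r in records:
        sid, lang = r["session_id"], r["query_lang"]
        if sid in last_lang and last_lang[sid] != lang:
            changes.append((sid, last_lang[sid], lang))
        last_lang[sid] = lang
    return changes


def success_rate_by_language(records):
    """Share of searches per query language that were followed by at
    least one action from SUCCESS_ACTIONS."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for r in records:
        lang = r["query_lang"]
        totals[lang] += 1
        if SUCCESS_ACTIONS.intersection(r["actions"]):
            hits[lang] += 1
    return {lang: hits[lang] / totals[lang] for lang in totals}
```

Each record stands for one search, for instance `{"session_id": "s1", "query_lang": "de", "actions": ["send_email"]}`. The same grouping could be extended with the interface language or the country derived from the IP address to compare the dimensions named in the list above.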
Schedule
| January - May 2010 | Registration |
| May 2010 | Data Release |
| June 8 | Final Task Description Release |
| July 20 | Submission of Results |
| August 1 | Submission of Notebook Paper |
| August 10 | Submission of Revised Notebook Paper and Extended Abstract |
| September 22-23 | Workshop in Padua |
Contact
Thomas Mandl (mandl at uni-hildesheim.de), University of Hildesheim, Germany
Giorgio Maria Di Nunzio (dinunzio at dei.unipd.it), University of Padua, Italy
Steering Committee
Maristella Agosti, University of Padua, Italy
Jim Jansen, Pennsylvania State University, USA
Jaap Kamps, University of Amsterdam, The Netherlands
Vivien Petras, Humboldt University Berlin, Germany
Johannes Leveling, Dublin City University, Ireland
Inderjeet Mani, MITRE Corp., USA
