The objective of the computational linguistics subproject of SOLDISK is to operationalize the description of discourses of solidarity or desolidarization provided by the social sciences on the basis of theories and/or of a detailed analysis of individual texts. From such individual analyses, we intend to generalize towards the identification and classification of such discourses in large amounts of textual data.

For that purpose, we will use different tools and methods, such as part-of-speech tagging and parsing, as well as methods from Distributional Semantics. One example is the search for lexical or structural indicators of statements in migration discourses that express solidarity or desolidarization, or that provide motivation for such positions. Taking the retrieved data as (partial) training data, and in parallel to this indicator-based approach, we use Machine Learning to find further related data and/or to generalize from the observed data.
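
The indicator-based step described above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the project's actual pipeline: the indicator terms, the label names, and the helper functions `compile_patterns` and `label_sentences` are hypothetical, and real indicators would presumably also draw on part-of-speech tags and parse structures rather than on surface strings alone.

```python
import re
from typing import Dict, List, Pattern, Tuple

# Hypothetical indicator lexicon (assumption): the actual indicators used
# in the project are not listed in this description.
INDICATORS: Dict[str, List[str]] = {
    "solidarity": ["support", "welcome", "help"],
    "desolidarization": ["deport", "burden", "close the borders"],
}


def compile_patterns(lexicon: Dict[str, List[str]]) -> Dict[str, Pattern[str]]:
    """Build one case-insensitive regex per label from its indicator terms."""
    patterns = {}
    for label, terms in lexicon.items():
        alternation = "|".join(re.escape(term) for term in terms)
        # \w* after the term is a crude stand-in for lemmatization,
        # so that "support" also matches "supports" or "supporting".
        patterns[label] = re.compile(r"\b(?:" + alternation + r")\w*", re.IGNORECASE)
    return patterns


def label_sentences(
    sentences: List[str], patterns: Dict[str, Pattern[str]]
) -> List[Tuple[str, str]]:
    """Return (sentence, label) pairs for sentences matching exactly one label.

    Sentences matching no label or several labels are skipped, since they
    would be unreliable as seed training data.
    """
    labelled = []
    for sentence in sentences:
        hits = [label for label, pat in patterns.items() if pat.search(sentence)]
        if len(hits) == 1:
            labelled.append((sentence, hits[0]))
    return labelled


if __name__ == "__main__":
    examples = [
        "We must help and welcome refugees.",
        "They want to close the borders.",
        "The weather is nice.",
        "Help them or deport them?",
    ]
    for sentence, label in label_sentences(examples, compile_patterns(INDICATORS)):
        print(label, "|", sentence)
```

The resulting sentence-label pairs could then serve as (partial) training data for a classifier, while unmatched or ambiguous sentences remain candidates for manual inspection or relevance feedback.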

The subproject’s genuinely computational linguistic issues are to develop appropriate Machine Learning applications and to describe and document all steps of our research process, computational as well as manual (e.g. for relevance feedback), in a detailed way, ideally by means of process metadata.

In addition, the subproject is responsible, within the SOLDISK project, for providing all corpus data, for its linguistic and metadata annotation, and for preparing it for interactive or automatic exploration.



Dipl.-Ling. Max Kisselew, Prof. Dr. Ulrich Heid
IwiSt – Institute for Information Science and Natural Language Processing,
Work group on computational linguistics and language technology,
Universität Hildesheim
