2. Research questions addressed in the workshop — contributions of the participants
The workshop will comprise four sections which address the above-mentioned themes:
- theoretical and descriptive notions of terminological variation (section 1);
- the interplay between theoretical or descriptive notions on the one hand and techniques for (semi-)automatic data extraction from text on the other (section 2);
- the needs of translators, technical writers and terminologists with respect to the results of both data extraction and description (section 3);
- possibilities to take over or to adapt methods from neighbouring disciplines, such as terminology, lexical semantics, or the creation of resources for natural language processing (section 4).
We will now further detail the research questions to be addressed in these four sections of the planned workshop.
2.1 Concepts of terminological variation in linguistics and terminology
We intend to address, among other things, the following issues; we also indicate who among the invited experts is expected to address these issues.
- What is the purpose of defining term variation? For whom, and for which applications, is the phenomenon to be defined? To what extent is the definition, and consequently the treatment of variation, dependent on the intended user of the description (translators vs. terminologists vs. domain experts), or on the intended (semi-)automatic application (human vs. machine translation, data extraction from texts, passage retrieval, e-mail routing, ontology construction, etc.)?
This topic will be addressed from different angles: by Beatrice Daille for NLP, by Oliver Czulo for translators, human and machine translation, and by Ina Rösiger from the viewpoint of text retrieval and ontology construction.
- What is the exact relationship with synonymy: are variants 'exact synonyms' or 'quasi-synonyms'? To what extent can a more flexible approach to synonymy be useful for the specification of variation phenomena? While e.g. DE 'Renovierung' and DE 'Renovation' can be seen as full synonyms (although with a diatopic difference, the latter being typical of Swiss German), morphologically or syntactically related variants may rather fall under a notion of quasi-synonymy, cf. DE 'Holzfaserdämmplatten' vs. 'Holzfaserplatten zur Dämmung' (where the latter item makes explicit a purpose relation which is implicit in the compound).
This topic will be addressed by Laura Giacomini in her introduction to section 1, by Beatrice Daille in the framework of her proposals for a variant typology, as well as by Aleksandra Dralle, from the viewpoint of definition making and ontology design.
- What is the relation between variation and diasystematic marking: are all variants somehow 'marked', e.g. for register, region, level of formality, etc.? The above example of DE/CH 'Renovation' illustrates diatopic markedness, while e.g. the pair DE 'Fahrtrichtungswechselanzeiger' (standard automotive terminology) vs. DE 'Blinker' (lay people's term) shows the layering of specialized languages with respect to the targeted public. We do not believe that all variants are marked with respect to diasystematic properties: most morphologically and syntactically related variants (e.g. EN 'energy production' vs. 'production of energy') are not. What does this mean for the set of phenomena to be described as variants? Does it make sense to open the domain beyond those expressions which have so far been discussed in the science of specialized communication?
This topic will be touched upon by Josef Ruppenhofer, and it will be relevant to the talk by Oliver Czulo.
- Which lexical and grammatical objects are involved in variation? Does it affect only single-word terms (DE 'dämmen' vs. DE 'isolieren', '[to] insulate')? Or also multiword terms, collocations, verbs and their argument structure, morphologically related paraphrases? What are criteria at the levels of morphology, syntax, semantics, lexicalization, idiomatization that allow us to delimit the range of phenomena to be covered? In fact, DE 'begehbare Dachbodendämmung' is also found as 'begehbare Dämmung des Dachbodens'; while this pair satisfies most synonymy criteria, this is less clearly so with 'begehbare Dämmung am Dachboden' or 'Dachboden begehbar dämmen'.
The examples cited are from Laura Giacomini's ongoing work, and the issue will be taken up by Ina Rösiger.
- To what extent does the richness in variants depend on the nature of the domain of specialization (e.g. medicine vs. law vs. technical and scientific matters), and along with it, on extralinguistic phenomena, such as traditions of standardization, the technicality of the domain, or the fact that a domain is just emerging? It is well known that medical languages differ widely according to the public addressed (experts vs. lay people); it seems that established technical domains show less variation than new, emerging ones; but we do not have enough data yet to see clear patterns emerging.
This topic will be addressed by Pius ten Hacken, from the viewpoint of standardization; examples from different domains will be contributed by Tanja Wissik (legal texts) and Ornella Wandji-Tchami (medical language).
2.2 Data extraction for the description of terminological variation
In this context, we intend to address the following research questions:
- What are state-of-the-art techniques for the recognition of terminological variants in computational terminology (cf. the opening statement by Ulrich Heid)?
- To what extent is the concept of variation used in computational terminology work dependent on the technological possibilities available? Does this notion satisfy the requirements of practical applications and of 'end users'?
This issue is the guiding theme of Beatrice Daille's talk; it will also be addressed by Melanie Siegel.
- What inspiration can computational terminology draw from computational linguistics (e.g. from distributional semantics, from role labelling or from statistical approaches to ontology building), in order to improve the coverage of variation phenomena and their classification?
This topic will be addressed by Josef Ruppenhofer, as well as by Melanie Siegel.
2.3 Representing variation in interactive resources: needs of translators and technical authors
In this section, we focus on interactive tools for language professionals; the following issues will be covered:
- How are variants currently treated in models of terminological description, as well as in their implementation in current terminology tools, e.g. for computer-assisted translation? Whose needs are satisfied by this modelling, and how do translators and terminologists work with these models and tools?
This will be addressed by Laura Giacomini in her introduction, as well as by Aleksandra Dralle and by Rita Temmerman.
- What can terminologists and term bank designers learn from lexicographers with respect to (i) the representation, (ii) the presentation and (iii) the access to terminological variants in electronic resources for interactive use?
Ulrich Heid will point to some issues of presentation and access in his introductory statement. Examples of the current state of specialized dictionaries will be given by Ornella Wandji-Tchami.
- To what extent can theories of semantics, such as Frame Semantics, ongoing work in computational lexical semantics and the current work on linking lexicons and ontologies contribute to the modelling of variation in termbanks?
This issue will be addressed in Laura Giacomini's introduction. Aleksandra Dralle's presentation will show an application of a frame-based approach, from the viewpoint of practical work in a translation agency.
2.4 Towards new approaches for the representation of terminological variation
This section of the workshop will be devoted to examples of possible contributions to the modelling of variation stemming from research areas other than terminology. Its scope overlaps to some extent with that of section 3, but the following issues will be addressed in addition:
- Can approaches from lexicography and computational linguistics help us to improve not only the representation of variants, but also the access to knowledge about variants? How is this access dependent on the tasks users want to carry out with the resources?
This topic is dealt with by Melanie Siegel, as well as by Oliver Czulo.
- Which features beyond those described and discussed in section 1 need to be taken into account in the description of term variation?
Examples of this issue are discussed by Josef Ruppenhofer.
At the end of this section and of the workshop as a whole, we intend to briefly summarize what was discussed as 'best practice' in the field of variant description. To this end, the hosts will present, in the closing session, what they perceived as the consensus (or as still open issues) in the discussions of the individual talks and of each section.