Composition is a very productive word formation process in German. For many applications, it is helpful to have information about the parts of a compound, because the meaning of a compound is usually derived from the meaning of its parts. In GermaNet, nominal compounds are therefore split into their constituent parts, i.e., modifier and head. This splitting identifies the immediate constituents at each level of analysis and thus reflects the recursive nature of compounds that have more than two constituent parts, such as Autobahnanschlussstelle (‘motorway junction’). The immediate constituents of this compound are Autobahn and Anschlussstelle. The first constituent splits further into Auto and Bahn, and the second into Anschluss and Stelle (see Figure 1).
Figure 1: Split compound
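The recursive analysis can also be written down as a small binary tree. The following is a minimal Python sketch, not GermaNet's own data model: the class name `Node` and the method `leaves` are hypothetical names chosen for illustration, and only the example word from the text is used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A word with optional immediate constituents (modifier and head)."""
    word: str
    modifier: Optional["Node"] = None
    head: Optional["Node"] = None

    def leaves(self) -> List[str]:
        """Return the words that are not split any further, left to right."""
        if self.modifier is None or self.head is None:
            return [self.word]
        return self.modifier.leaves() + self.head.leaves()


# The example from the text: each level lists only the immediate constituents.
tree = Node(
    "Autobahnanschlussstelle",
    modifier=Node("Autobahn", Node("Auto"), Node("Bahn")),
    head=Node("Anschlussstelle", Node("Anschluss"), Node("Stelle")),
)

print(tree.modifier.word, "+", tree.head.word)
print(tree.leaves())
```

Storing only the immediate constituents at each node keeps every level of the analysis accessible, rather than flattening the compound into a list of four parts.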
What makes compound splitting for German a challenging task is the fact that compounding is not always simple string concatenation, but often involves intervening linking elements or the elision of word-final characters in the modifier constituent of a compound (Henrich & Hinrichs, 2011). In GermaNet, all modifiers are lemmatized, and if a modifier is ambiguous with respect to its word class (due to conversion), both possibilities are specified. For example, the modifier of Laufband is given as both laufen and Lauf.
Compound splitting in GermaNet is supported by an automatic algorithm, which combines several individual compound splitters. Please see the referenced paper below for more information on the automatic splitting. All automatically split compounds are manually post-corrected and enriched with relevant properties before they are inserted into GermaNet.
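To illustrate why plain string concatenation is not enough, the following minimal Python sketch classifies how a lemmatized modifier relates to its surface form inside a compound. It is not the GermaNet splitting algorithm. The function name `explain_join` and the list of linking elements are assumptions made for this illustration, and the example words are taken from this page.

```python
from typing import Optional, Tuple

# Assumption: a small, non-exhaustive set of German linking elements.
LINKING_ELEMENTS = ("", "s", "es", "n", "en", "er", "e")


def explain_join(compound: str, modifier: str, head: str) -> Optional[Tuple[str, str]]:
    """Describe how modifier and head were joined, or return None if unclear."""
    c, m, h = compound.lower(), modifier.lower(), head.lower()
    if not c.endswith(h):
        return None
    # Surface form of the modifier: everything before the head, minus a hyphen.
    surface = c[: len(c) - len(h)].rstrip("-")
    for link in LINKING_ELEMENTS:
        if surface == m + link:
            return ("linking element", link) if link else ("concatenation", "")
    # Elision: the surface form is a proper prefix of the lemma.
    if surface and m.startswith(surface):
        return ("elision", m[len(surface):])
    return None


print(explain_join("Apfelbaum", "Apfel", "Baum"))
print(explain_join("Hiobsbotschaft", "Hiob", "Botschaft"))
print(explain_join("Farbgebung", "Farbe", "Gebung"))
```

A check like this only verifies a given split. Finding the split in the first place is the harder problem that the automatic splitters and the manual post-correction address.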
The following properties are specified for modifiers and/or heads:
Abbreviation
If one part of the compound is an abbreviation, it is labelled as Abkürzung.
Examples:
Compound | Modifier | Head |
---|---|---|
SIM-Karte | SIM (abbreviation) | Karte |
ISO-Norm | ISO (abbreviation) | Norm |
Bonus-CD | Bonus | CD (abbreviation) |
Affixoid
Affixoids are morphemes with a special status between bound and free morphemes. As they have a clearly assigned meaning, it makes sense to split the respective words. The bound morpheme is labelled as Affixoid.
Examples:
Compound | Modifier | Head |
---|---|---|
Grundfrage | grund (affixoid) | Frage |
Riesenchance | riesen (affixoid) | Chance |
Hauptsaison | haupt (affixoid) | Saison |
Generalschlüssel | general (affixoid) | Schlüssel |
Foreign Word
If one or more parts of the compound are not German words, they are labelled as Fremdwort. Note that borrowed words that are now established loanwords listed in a standard German dictionary (such as Duden) are not considered foreign words in GermaNet (e.g. Drink and Pool in the examples below).
Examples:
Compound | Modifier | Head |
---|---|---|
Longdrink | long (foreign word) | Drink |
Swimmingpool | swimming (foreign word) | Pool |
Logdatei | log (foreign word) | Datei |
Konfix
The label Konfix refers to a morpheme that is borrowed from a foreign language, in many cases Latin or Greek, and whose meaning stems from that language. Konfixes are bound morphemes, but unlike affixes, two Konfixes can be combined to form a so-called Konfixkompositum. Such Konfixkomposita are not split in GermaNet, whereas compounds consisting of a Konfix and a native word are split.
Examples:
Compound | Modifier | Head |
---|---|---|
Milligramm | milli (Konfix) | Gramm |
Zentimeter | zenti (Konfix) | Meter |
Monokultur | mono (Konfix) | Kultur |
Opaque Morpheme
Modifiers whose meaning is no longer transparent without considering the etymology of the word are labelled with the property opaques Morphem.
Examples:
Compound | Modifier | Head |
---|---|---|
Himbeere | Him (opaque morpheme) | Beere |
Karfreitag | Kar (opaque morpheme) | Freitag |
Sintflut | Sint (opaque morpheme) | Flut |
Lebkuchen | Leb (opaque morpheme) | Kuchen |
Elfenbein | Elfen (opaque morpheme) | Bein |
Proper Name
If the whole compound is a named entity, it is not split in GermaNet. If only the modifier is a proper name, the compound is split and the label Eigenname is added to the modifier.
Examples:
Compound | Modifier | Head |
---|---|---|
Hubbleteleskop | Hubble (proper name) | Teleskop |
Wertherstimmung | Werther (proper name) | Stimmung |
Hiobsbotschaft | Hiob (proper name) | Botschaft |
Virtual Word Form
Virtual word forms, labelled as Virtuelle Bildung, are regularly built according to existing word formation rules. However, they do not exist in isolation, but only as part of a compound.
Examples:
Compound | Modifier | Head |
---|---|---|
Einflussnahme | Einfluss | Nahme (virtual word form) |
Fragesteller | Frage | Steller (virtual word form) |
Farbgebung | Farbe | Gebung (virtual word form) |
Word Group
Modifiers consisting of a phrase are marked as Wortgruppe and the parts of the phrase are annotated as the modifier.
Examples:
Compound | Modifier | Head |
---|---|---|
Dreiwege-Katalysator | drei Weg (word group) | Katalysator |
Nacht-und-Nebel-Aktion | Nacht und Nebel (word group) | Aktion |
Pro-Kopf-Einkommen | pro Kopf (word group) | Einkommen |
The following table gives an overview of the constituent parts of a compound (i.e. modifier and head) and the corresponding properties that are annotated for each constituent in GermaNet:
Property | Modifier | Head |
---|---|---|
Abbreviation | x | x |
Affixoid | x | x |
Foreign Word | x | x |
Konfix | x | |
Opaque Morpheme | x | x |
Proper Name | x | |
Virtual Word Form | x | |
Word Group | x | |
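For readers who process these annotations programmatically, the property labels can be modelled as an enumeration attached to a constituent. This is a minimal Python sketch under stated assumptions: the names `Property` and `Constituent` are hypothetical, the label strings are the German labels quoted in the sections above, and the actual encoding in the GermaNet data files may differ.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Property(Enum):
    """German labels as quoted in the property descriptions above."""
    ABBREVIATION = "Abkürzung"
    AFFIXOID = "Affixoid"
    FOREIGN_WORD = "Fremdwort"
    KONFIX = "Konfix"
    OPAQUE_MORPHEME = "opaques Morphem"
    PROPER_NAME = "Eigenname"
    VIRTUAL_WORD_FORM = "Virtuelle Bildung"
    WORD_GROUP = "Wortgruppe"


@dataclass
class Constituent:
    """A modifier or head, optionally annotated with one property."""
    lemma: str
    prop: Optional[Property] = None

    def describe(self) -> str:
        if self.prop is None:
            return self.lemma
        return f"{self.lemma} ({self.prop.value})"


# Example from the Abbreviation table: SIM-Karte = SIM (abbreviation) + Karte.
modifier = Constituent("SIM", Property.ABBREVIATION)
head = Constituent("Karte")
print(modifier.describe(), "+", head.describe())
```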
In addition to the information described above that is included in GermaNet (since release 8.0), a list of split compounds with their modifier(s) and head is freely available for download here:
The list of compound data is free for academic research as defined in GermaNet's academic research licence agreement. For any other intended purposes, please contact us.
The list contains one compound per line: first the compound itself, then a tab character, then the modifier (if there are two possible modifiers, they are separated by the pipe symbol, |), then another tab character, and finally the head. For example:
Apfelbaum Apfel Baum
Goldmünze Gold Münze
Laufband laufen|Lauf Band
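A file in this format can be read with a few lines of Python. The sketch below assumes the fields are separated by real tab characters, as described above. The function name `parse_split_line` is hypothetical, and the sample lines are the three examples given here rather than the contents of the downloadable list.

```python
from typing import List, Tuple


def parse_split_line(line: str) -> Tuple[str, List[str], str]:
    """Parse 'compound<TAB>modifier[|modifier]<TAB>head' into its three fields."""
    compound, modifiers, head = line.rstrip("\n").split("\t")
    return compound, modifiers.split("|"), head


sample = [
    "Apfelbaum\tApfel\tBaum",
    "Goldmünze\tGold\tMünze",
    "Laufband\tlaufen|Lauf\tBand",
]

for line in sample:
    compound, modifiers, head = parse_split_line(line)
    print(compound, modifiers, head)
```

Returning the modifiers as a list keeps the ambiguous case (two possible modifiers) and the ordinary case (one modifier) uniform for downstream code.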
The following paper describes the automatic compound splitting that is performed before the manual post-correction. If you want to use the split compounds in the context of scientific or research work, please refer to the paper:
Verena Henrich and Erhard Hinrichs: Determining Immediate Constituents of Compounds in GermaNet. In Proceedings of Recent Advances in Natural Language Processing (RANLP 2011), Hissar, Bulgaria, September 2011, pp. 420-426.