TüBa-D/DP is a machine-annotated dependency treebank of German. The goal of TüBa-D/DP is to offer high-quality syntactic annotations for a large amount of contemporary German text. The annotations follow the TüBa-D/Z UD annotation guidelines (Çöltekin et al., 2017) as closely as possible.
Each text of the TüBa-D/DP is annotated with the following layers:
- Universal part-of-speech tags
- STTS part-of-speech tags
- Inflectional morphology (UD and TüBa-D/Z)
- Topological fields
- UD dependency relations
A more detailed description of the annotation guidelines can be found in the stylebook.
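As a rough illustration of how the layers listed above fit together on a single token, here is a minimal Python sketch. It assumes the data is read from a CoNLL-U-style file with ten tab-separated columns; this page does not state the distribution format, so treat that as an assumption. The `TopoField` key in the last column, the `Token` class, and the example line are all hypothetical and invented for illustration; the stylebook is the authority on the actual encoding.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Token:
    form: str
    upos: str                   # universal part-of-speech tag
    xpos: str                   # STTS part-of-speech tag
    feats: Dict[str, str]       # inflectional morphology
    head: int                   # index of the syntactic head (0 = root)
    deprel: str                 # UD dependency relation
    topo_field: Optional[str]   # topological field (hypothetical encoding)


def parse_key_values(column: str) -> Dict[str, str]:
    """Turn 'A=b|C=d' into a dict; '_' marks an empty column."""
    result: Dict[str, str] = {}
    if column == "_":
        return result
    for item in column.split("|"):
        key, _, value = item.partition("=")
        result[key] = value
    return result


def parse_token(line: str) -> Token:
    """Parse one CoNLL-U-style token line into a Token."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 10:
        raise ValueError(f"expected 10 columns, got {len(cols)}")
    _id, form, _lemma, upos, xpos, feats, head, deprel, _deps, misc = cols
    # Assumption: the topological field is stored under a 'TopoField' key
    # in the last column. The real treebank may encode it differently.
    misc_map = parse_key_values(misc)
    return Token(
        form=form,
        upos=upos,
        xpos=xpos,
        feats=parse_key_values(feats),
        head=int(head),
        deprel=deprel,
        topo_field=misc_map.get("TopoField"),
    )


# Invented example line, not taken from the treebank.
EXAMPLE_LINE = (
    "1\tDie\tdie\tDET\tART\tCase=Nom|Gender=Fem|Number=Sing"
    "\t2\tdet\t_\tTopoField=VF"
)

token = parse_token(EXAMPLE_LINE)
print(token.upos, token.xpos, token.deprel, token.topo_field)
```

Keeping every layer on one record makes it straightforward to query across layers, for example to select all tokens with a given STTS tag inside a given topological field.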
- The political speeches corpus is provided by Adrien Barbaresi under the Creative Commons Attribution-ShareAlike 4.0 International License.
- The Wikipedia subcorpus is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License.
- The raw text of 'die tageszeitung' used in the corpus is copyrighted by contrapress media GmbH, Berlin. Licenses are granted on a case-by-case basis at the discretion of the copyright holder and may include charges or restrictions on the use of the data. Please contact tuebadz-info for more information.
Please cite the following reference if you use this treebank in your work:
TüBa-D/DP stylebook, Daniël de Kok and Sebastian Pütz, 2019, Seminar für Sprachwissenschaft, University of Tübingen