Tübingen's Partially Parsed Corpus of Written German - TüPP-D/Z
TüPP-D/Z is a collection of articles from the daily newspaper "die tageszeitung" that have been automatically annotated with clause structure, topological fields, and chunks, in addition to lower-level annotation such as parts of speech and morphological ambiguity classes.
The data in the current TüPP-D/Z release is taken from the 1999 HTML distribution (scientific edition) of the "tageszeitung", which covers newspaper articles from September 2, 1986 through May 7, 1999 and amounts to more than 200 million word tokens of text.