Heterogeneous Primary Research Data of the SFB 833 - Representation and Processing

The aim of the INF project was to archive the primary research data produced by the SFB 833 sustainably, with three goals:
1. to meet the requirement of the Deutsche Forschungsgemeinschaft (DFG) that all primary research data created with DFG funding remain available to the research community for at least ten years,
2. to realize the concept of an enhanced publication, in which a publication is linked to the primary research data that formed the empirical basis of the analyses it reports, and
3. to support the reuse of the data inside and outside the SFB 833, as well as the replication of experimental results published on the basis of these data sets.
Regarding the infrastructural aspects of the project, INF closely collaborated with the Center of Information, Communication and Media (IKM) of the University of Tübingen.


  • Hinrichs, E. (2021). Multilinguale Sprachressourcen für die linguistische Forschung. In H. Lobin, A. Witt & A. Wöllstein (Eds.), Deutsch in Europa: Sprachpolitisch, grammatisch, methodisch. Jahrbuch des Instituts für Deutsche Sprache 2020 (pp. 189-208). Berlin/Boston, MA: De Gruyter.
  • Hinrichs, E., Hinrichs, M., Kübler, S. & Trippel, T. (2019). Language technology for digital humanities: introduction to the special issue. Language Resources and Evaluation 53(4), 559-563. DOI: 10.1007/s10579-019-09482-4.
  • Trippel, T. & Zinn, C. (2019). Describing research data with CMDI – Challenges to establish contact with linked open data. In A. Pareja-Lora, M. Blume, B. C. Lust & C. Chiarcos (Eds.), Development of Linguistic Linked Open Data Resources for Collaborative Data-Intensive Research in the Language Sciences (pp. 99-114). Cambridge, MA: MIT Press.
  • Trippel, T. & Zinn, C. (2019). Lessons learned: On the challenges of migrating a research data repository from a research institution to a university library. Language Resources and Evaluation. link.springer.com/article/10.1007%2Fs10579-019-09474-4 DOI: 10.1007/s10579-019-09474-4.
  • Hinrichs, E., et al. (2018). Bridging the LAPPS grid and CLARIN. In H. Mazo, A. Moreno, J. Odijk, S. Piperidis & T. Tokunaga (Eds.), Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2018/pdf/662.pdf.
  • Hinrichs, E., Trippel, T., et al. (2018). Gute Forschungsdaten, bessere Forschung: wie Forschung durch Forschungsdatenmanagement unterstützt wird. Book of Abstracts -- Digital Humanities im deutschsprachigen Raum, March, Cologne, Germany. http://dhd2018.uni-koeln.de/wp-content/uploads/boa-DHd2018-web-ISBN.pdf.
  • Hinrichs, E. & Trippel, T. (2017). CLARIN-D: Eine Forschungsinfrastruktur für die sprachbasierte Forschung in den Geistes- und Sozialwissenschaften. Bibliothek - Forschung und Praxis 41(1).
  • Chernov, A., Hinrichs, E. & Hinrichs, M. (2017). Search your own treebank. Proceedings of the Fifteenth International Workshop on Treebanks and Linguistic Theories (TLT15) (pp. 25-34). Bloomington, IN, January 2017.

  • Zinn, C., Trippel, T., Kaminski, S. & Dima, E. (2016). Crosswalking from CMDI to Dublin Core and MARC 21. Paper presented at the Tenth International Conference on Language Resources and Evaluation (LREC 2016). Portorož, Slovenia.
  • Trippel, T. & Zinn, C. (2015). DMPTY - A wizard for generating data management plans. Paper presented at the CLARIN Annual Conference 2015. Wrocław, Poland.

  • Zinn, C. & Trippel, T. (2016). Enhancing the quality of metadata by using authority control. Paper presented at the LREC 2016 Workshop “LDL 2016 – 5th Workshop on Linked Data in Linguistics: Managing, Building and Using Linked Language Resources”. Portorož, Slovenia.

  • Haaf, S., Fankhauser, P., Trippel, T., Eckart, K., Eckart, T., Hedeland, H., Herold, A., Knappen, J., Schiel, F., Stegmann, J. & van Uytvanck, D. (2014). CLARIN's Virtual Language Observatory (VLO) under scrutiny -- The VLO taskforce of the CLARIN-D centres. Paper presented at the CLARIN Annual Conference. Soesterberg, The Netherlands.

  • Broeder, D., van Uytvanck, D., Gavrilidou, M. & Trippel, T. (2012). Standardizing a component metadata infrastructure. In N. Calzolari, K. Choukri, T. Declerck, M. Uğur Doğan, B. Maegaard, J. Mariani, J. Odijk & S. Piperidis (Eds.), Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012). Istanbul: European Language Resources Association (ELRA).

  • Dima, E., Henrich, V., Hinrichs, E., Hinrichs, M., Hoppermann, C., Trippel, T., Zastrow, T. & Zinn, C. (2012). A repository for the sustainable management of research data. In N. Calzolari, K. Choukri, T. Declerck, M. Uğur Doğan, B. Maegaard, J. Mariani, J. Odijk & S. Piperidis (Eds.), Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012). Istanbul: European Language Resources Association (ELRA).

  • Dima, E., Hinrichs, E., Hoppermann, C., Trippel, T. & Zinn, C. (2012). A metadata editor to support the description of linguistic resources. In N. Calzolari, K. Choukri, T. Declerck, M. Uğur Doğan, B. Maegaard, J. Mariani, J. Odijk & S. Piperidis (Eds.), Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012). Istanbul: European Language Resources Association (ELRA).

  • Zinn, C., Hoppermann, C. & Trippel, T. (2012). The ISOcat registry reloaded: A re-engineering proposal following schema.org. In E. Simperl, P. Cimiano, A. Polleres, O. Corcho & V. Presutti (Eds.), Proceedings of the 9th Extended Semantic Web Conference (ESWC 2012) (pp. 285-299). Lecture Notes in Computer Science (Vol. 7295). Berlin/Heidelberg: Springer. DOI: 10.1007/978-3-642-30284-8_26.

  • Barkey, R., Hinrichs, E., Hoppermann, C., Trippel, T. & Zinn, C. (2011). Komponenten-basierte Metadatenschemata und Facetten-basierte Suche: Ein flexibler und universeller Ansatz. In J. Griesbaum, T. Mandl & C. Womser-Hacker (Eds.), Information und Wissen: global, sozial und frei? Proceedings des 12. Internationalen Symposiums für Informationswissenschaft (ISI 2011), Hildesheim, 9. bis 11. März 2011 (pp. 62-73). Schriften zur Informationswissenschaft (Vol. 58). Boizenburg: VWH.

  • Barkey, R., Hinrichs, E., Hoppermann, C., Trippel, T. & Zinn, C. (2011). Trailblazing through forests of resources in linguistics. Digital Humanities (DH), June 19-22, 2011. Stanford, CA: Stanford University.

  • Hoppermann, C., Trippel, T. & Zinn, C. (2011). Devil’s advocate on metadata in science. In H. Hedeland, T. Schmidt & K. Wörner (Eds.), Multilingual Resources and Multilingual Applications - Proceedings of the Conference of the German Society for Computational Linguistics and Language Technology (GSCL) 2011 (pp. 105-109). Arbeiten zur Mehrsprachigkeit (Folge B, Nr. 96, 2011). Hamburg: Universität Hamburg.

  • Zinn, C. (2011). Building a faceted browser in CouchDB using views on views and Erlang metaprogramming. In H. Kuchen (Ed.), Functional and Constraint Logic Programming: 20th International Workshop, WFLP 2011, Odense, Denmark, July 19, 2011, Proceedings (pp. 104-122). Lecture Notes in Computer Science (Vol. 6816). Berlin/Heidelberg: Springer. DOI: 10.1007/978-3-642-22531-4_7.


The INF project supports the other projects of the SFB 833, offering the following:

  • Repository of primary research data
  • Support for the creation of metadata
  • An infrastructure for web-based experiments
  • Support for sustainable data formats
  • Search for and within resources
  • Technical support

This page collects frequently asked questions (FAQs) from SFB 833 members, with answers provided by INF, along with shortcuts to frequently used services. Some services and sections require authentication and authorisation. INF staff support SFB 833 members in all of these areas; the contact persons are listed below. INF also offers training in these areas. SFB 833 members with questions should not hesitate to contact INF.


1. What is TALAR and how can I archive my data?
2. What kind of resource types (CMDI profiles) are there in TALAR?
3. The resource types in TALAR do not fit my project. Is there a possibility to still archive my data in TALAR?
4. How can I use existing descriptions as a template for my project?
5. Which are the most important data fields that I should fill in?
6. What is the COMEDI editor?
7. How can I share my descriptions with other users in COMEDI?
8. How can I download a metadata set in the COMEDI editor?
9. Some information in my project has changed. How can I change metadata retroactively?
10. How can I delete or update data archived in TALAR?
11. How do I cite research data?
12. Can I reference publications which were published together with the data?
13. Who has access to the data in TALAR and how can I hand out read permissions?
14. I am a new member of the SFB 833. How do I get a username and password for TALAR?
15. How secure is my data?