The TCML cluster

The Training Center for Machine Learning (TCML) is a BMBF-funded project built around a large GPU cluster. The cluster is administered by the Cognitive Systems group in the Department of Computer Science (Wilhelm-Schickard-Institute). The following research groups are project partners in TCML, and their members and students may use the cluster for their research:

  1. PD Dr. Philipp Berens (Neural Data Science)
  2. Prof. Dr. Martin Butz (Cognitive modelling)
  3. Jun.-Prof. Dr. Enkeleyda Kasneci (Perception Eng.)
  4. Prof. Dr. Hendrik Lensch (Computergraphik)
  5. Prof. Dr. Ulrike von Luxburg (Machine Learning)
  6. Prof. Dr. Kay Nieselt (Integrative Transcriptomics)
  7. Prof. Dr. Nico Pfeifer (Medical Informatics)
  8. Prof. Dr. Wolfgang Rosenstiel (Computer Eng., BCI)
  9. Prof. Dr. Andreas Schilling (Media Inf., Visual Comp.)
  10. Prof. Dr. Felix Wichmann (Neural Inform. Processing)
  11. Prof. Dr. Andreas Zell (Cognitive Systems) (administrators)

The cluster serves several groups of users. Bachelor's and master's students working in machine learning can use its computing power for practical course assignments and for thesis or project research. It is also intended for machine learning training courses for participants from industry.

Overview of the cluster

The figure gives an overview of the cluster. To separate and control computation, data storage, and data transfer, the cluster is divided into compute, data, and head nodes.

The compute nodes: There are 40 compute nodes on which all the computations are performed. Each compute node has the following hardware specifications:

  • 2 TB SSD disk space
  • 256 GB memory
  • Intel XEON CPU E5-2650 v4
  • 4x GeForce GTX 1080 Ti

The data nodes: In addition, there are two data nodes which hold all data on the cluster. The hardware specifications of each data node are as follows:

  • 73 TB disk space
  • 135 GB memory
  • Intel XEON CPU E5-2620 v4

The head node: The head (master) node controls the cluster centrally, which ensures that jobs (tasks) are scheduled and monitored properly. Users may access only this node, and they submit their jobs from here.

A workload manager (Slurm) schedules jobs based on the resources they request, the resources currently available, and the priority of each job, and it allocates compute nodes to jobs accordingly.
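As a rough illustration of how a job requests resources from Slurm, the following sketch writes a small batch script and submits it. The file name, job name, and every resource value are illustrative assumptions rather than TCML defaults, and the cluster may require further options (such as a partition) that are described in the official documentation.

```shell
# Write a minimal batch script. The file name, job name, and all resource
# values below are illustrative assumptions, not TCML defaults.
cat > example_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=01:00:00
#SBATCH --output=example-%j.out

# Placeholder workload: report the node and the GPUs visible to the job.
hostname
nvidia-smi
EOF

# Submit only where Slurm is installed (for example on the head node).
if command -v sbatch >/dev/null 2>&1; then
    sbatch example_job.sh      # queue the job
    squeue -u "$USER"          # list your pending and running jobs
else
    echo "sbatch not found; run this on the head node" >&2
fi
```

Slurm reads the #SBATCH lines as resource requests and starts the job once a compute node can satisfy them; the job's output is written to the file named by --output, where %j is replaced by the job ID.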

Documentation Link

The detailed documentation can be found at this link. It describes the hardware and software of the cluster and includes an example of how to run your code on the cluster using the workload manager (Slurm) and the container virtualization software (Singularity).

Applying for a User Account

If you need an account on the cluster, send an email with the following content to tcml-contact@listserv.uni-tuebingen.de. By sending this email you accept our privacy policy:

Subject: Account application *your name*
Body:
First Name: *your first name*
Last Name: *your last name*
Department: *your department*
Phone number: *your university phone number if available*
Student ID: *your matriculation number if available*
Research Group/ Chair: *your research group (aka chair)*
Position: *Professor, Ph.D. student, university employee, student, ... *
Linux User ID: *Optional. For compatible data transfer from other WSI or ZDV clusters it might be useful to provide your ZDV Linux user ID here. You can find it by running the id command while logged in to your ZDV account.*
Reason: *For what reason do you want to use the cluster? What exactly do you want to compute on it?*
Comments: *your further comments, questions ...*

After a successful application, you will receive an email with your username and password. You can find your ZDV Linux user ID by running the id command while logged in to your ZDV account.
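A minimal sketch of that lookup, assuming the requested Linux user ID is the numeric UID of your ZDV account (this page does not say so explicitly):

```shell
# Run these while logged in to your ZDV account.

# Print only the numeric user ID.
id -u

# Print the full record: user ID, primary group ID, and group memberships.
id
```

The number printed by id -u is the value to enter in the "Linux User ID" field.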

Important notes:

  • After receiving your password, log in to the cluster and change your password with the passwd command.
  • You may not use the account for anything other than the purpose stated in your application. If you use it for any other purpose, your account will be removed.
  • If any illegal actions or actions prohibited by the university are carried out, the university can take disciplinary action against the individual, which in extreme cases might lead to exmatriculation.
  • Members of research groups that are project partners in the TCML project get precedence. If you are not a member, you can still apply for an account. However, we cannot guarantee an account for external users, since this depends on the current load of the cluster.
  • If your application is successful, you will automatically be added to the mailing list tcml-user@listserv.uni-tuebingen.de and will then receive important news concerning the cluster. You can unsubscribe from the mailing list by sending an empty email to tcml-users-leave@listserv.uni-tuebingen.de.

Logging in to the Head Node

To use the cluster, users must log in to the head node and run their tasks from there. The head node has the following IP address and alias:
IP address: 134.2.17.101
Alias: tcml-master01.uni-tuebingen.de

To access the head node, use ssh in the following way:

$ ssh username@tcml-master01.uni-tuebingen.de
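For repeated logins, a few small shell helpers can save typing. This is only a sketch: the function names are hypothetical, and copying data to the head node with scp is an assumption about the intended workflow, so check the detailed documentation for the recommended way to transfer data.

```shell
# Head-node alias from this page.
TCML_HOST="tcml-master01.uni-tuebingen.de"

# Build the user@host target for a given username.
tcml_target() {
    printf '%s@%s\n' "$1" "$TCML_HOST"
}

# Open an interactive shell on the head node.
tcml_login() {
    ssh "$(tcml_target "$1")"
}

# Copy a local file or directory into the remote home directory.
tcml_upload() {
    scp -r "$2" "$(tcml_target "$1"):"
}
```

After sourcing these definitions, tcml_login username opens a session, and tcml_upload username ./my_project copies a hypothetical local directory my_project to your home directory on the head node.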

Privacy policy

We use the data you provide only to create a user account on the cluster and to contact you about issues concerning the cluster. We will not share this data with anyone. Administrators are allowed to delete your data after warning you beforehand. As long as it does not contradict the preceding three sentences, the privacy policy of the Eberhard Karls Universitaet Tuebingen applies.

Contact

If you have any questions, complaints, suggestions for improvement, or anything else, feel free to contact us at tcml-contact@listserv.uni-tuebingen.de.