The figure gives an overview of the cluster. To keep computation separate from data storage and transfer, the cluster is divided into compute, data, and head nodes.
The compute nodes: There are 40 compute nodes on which all the computations are performed. Each compute node has the following hardware specifications:
- 2 TB SSD disk space
- 256 GB memory
- Intel XEON CPU E5-2650 v4
- 4x GeForce GTX 1080 Ti
The data nodes: In addition, there are two data nodes, which hold all data on the cluster. The hardware specifications of each data node are as follows:
- 73 TB disk space
- 135 GB memory
- Intel XEON CPU E5-2620 v4
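To get a feel for the overall capacity, the per-node figures listed above can be multiplied by the node counts. The following is a minimal sketch of that arithmetic; the `NodeGroup` class and its field names are made up for illustration and are not part of any cluster tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """Per-node specifications for one group of identical nodes (illustrative only)."""
    name: str
    count: int
    disk_tb: int
    memory_gb: int
    gpus_per_node: int = 0

    def totals(self) -> dict:
        # Aggregate capacity is the per-node figure times the number of nodes.
        return {
            "nodes": self.count,
            "disk_tb": self.count * self.disk_tb,
            "memory_gb": self.count * self.memory_gb,
            "gpus": self.count * self.gpus_per_node,
        }


# Figures taken from the specification lists above.
COMPUTE = NodeGroup("compute", count=40, disk_tb=2, memory_gb=256, gpus_per_node=4)
DATA = NodeGroup("data", count=2, disk_tb=73, memory_gb=135)

if __name__ == "__main__":
    for group in (COMPUTE, DATA):
        print(group.name, group.totals())
```

These are raw hardware totals. The capacity actually usable by jobs may be lower, for example because of file system overhead or memory reserved for the operating system.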
The head node: The head (master) node manages the cluster centrally and ensures that jobs (tasks) are scheduled and monitored properly. Users can log in only to this node, and they submit their jobs from here.
A workload manager (Slurm) schedules jobs based on the resources they request, the resources currently available, and the priority of each job. It then allocates compute nodes to the jobs accordingly.
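Because Slurm schedules on requested resources, each job has to state what it needs. The sketch below generates a batch script with standard `#SBATCH` directives. The helper `build_sbatch_script`, its defaults, and the example command are hypothetical, and the sketch assumes that the GPUs are exposed to Slurm as a generic resource named `gpu`, which is not confirmed here. The limits of 4 GPUs and 256 GB per node come from the compute node specifications.

```python
def build_sbatch_script(job_name, command, gpus=1, mem_gb=16,
                        time_limit="01:00:00", nodes=1):
    """Return the text of a Slurm batch script (hypothetical helper).

    The limits below mirror one compute node: 4 GPUs and 256 GB of memory.
    """
    if not 1 <= gpus <= 4:
        raise ValueError("a compute node has 4 GPUs; request between 1 and 4")
    if not 1 <= mem_gb <= 256:
        raise ValueError("a compute node has 256 GB of memory")
    if nodes < 1:
        raise ValueError("at least one node is required")

    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        # Assumption: GPUs are configured as the generic resource "gpu".
        f"#SBATCH --gres=gpu:{gpus}",
        # --mem is the memory requested per node.
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: a two-GPU job; "python train.py" is a placeholder command.
    print(build_sbatch_script("demo", "python train.py", gpus=2, mem_gb=32))
```

The generated text can be saved to a file, for example `job.sh`, and submitted from the head node with `sbatch job.sh`. Requesting only what a job actually needs generally makes it easier for the scheduler to place it.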