HPC servers¶
Introduction¶
There are several options for High Performance Computing (HPC) at TU Delft. You can use the new DelftBlue cluster, the Delft AI Cluster or the stand-alone servers of the department.
The HPC servers of the department are meant for calculations that cannot be executed on a desktop or laptop within an acceptable amount of time. These servers are equipped with a large number of cores and/or a large amount of memory. Some of them also have powerful GPU cards or high-speed network connections (InfiniBand).
We encourage our users to apply social load balancing (a.k.a. social queueing): monitor the current usage of the system by your co-workers and use the system accordingly, trying to share the resources evenly. It also means communicating with each other about how the system is used. If you plan to run huge calculations, you are invited to ask your co-workers whether they can spare you some calculation time or resources.
Important
In the end it’s all about sharing the available resources, respecting your co-workers and helping each other out.
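In practice, social queueing starts with checking what a machine is currently doing before you launch anything big. A few standard Linux commands show this on any of the servers below; `nvidia-smi` only exists on servers with NVIDIA GPUs:

```shell
# Load averages and number of logged-in users:
uptime

# Snapshot of the busiest processes (batch mode, one iteration):
top -b -n 1 | head -n 15

# Free and used memory:
free -h

# GPU usage, only on servers with NVIDIA cards:
# nvidia-smi
```

If the load is already high, consider waiting, using another server, or contacting the users currently running jobs.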
A distinction is made between standalone servers, head-nodes and cluster-nodes. Although all these computers are more or less the same, their usage and network interconnection differ:
- Standalone servers: servers that allow direct login and can be used interactively. Jobs are not executed by a workload manager (e.g. Slurm) as on a cluster. Note that these servers do not provide any load balancing of jobs, so users should be nice to each other by applying social queueing (communicating with fellow users about the use of a server) and by starting all jobs with `nice`. This distributes CPU time more fairly between users while leaving the system enough CPU time to run smoothly (see `man nice` for more information).
- Head-node: a server that is only used for submitting jobs to a cluster and for small pre- and/or post-processing of the data used on the cluster. Users log in to this server directly and start jobs on the cluster with the workload manager (e.g. Slurm). The head-node is also used for monitoring the progress of the calculations on the cluster.
- Cluster-nodes: or simply 'nodes', are only accessible by submitting jobs via the workload manager on the head-node. These servers cannot be accessed directly and are typically not used interactively. In some cases the nodes have a high-speed network connection such as InfiniBand, which can be used for very fast transport of data between jobs on different nodes (MPI).
To keep as much CPU power and memory as possible available for research calculations, the use of a GUI (Graphical User Interface) should be kept to a minimum or, preferably, avoided altogether. Graphical viewing of results is best done on your local workstation (desktop or laptop) after transferring the output data there.
For all HPC servers access is granted on a per-user basis. Contact the system administrator of MI or CI for more information.
Standalone HPC Linux¶
Department ImPhys¶
The department ImPhys has 2 standalone HPC servers:
| name | cores | cpu | memory | gpu | compute cap. |
|---|---|---|---|---|---|
| jupiter-imphys.tnw.tudelft.nl | 96 | 2x AMD EPYC 9454 2.75GHz | 1.5TB | 2x A40 | 8.6 |
| saturn-imphys.tnw.tudelft.nl | 96 | 2x AMD EPYC 9454 2.75GHz | 1.5TB | - | - |
These servers run Ubuntu 24.04 LTS.
More information on the use of these servers can be found here:
Cluster Optics¶
The group Optics has 2 standalone HPC servers. Roland Horsten has more information about these computers.
Cluster CI¶
The group CI currently has 3 standalone HPC servers:
| name | cores | cpu | memory | gpu | compute cap. |
|---|---|---|---|---|---|
| hpc24 | 48 | 2x Intel E5-2670 v3 | 256GB | 4x Tesla K40c | 3.5 |
| hpc29 | 80 | 2x Intel Gold 6148 | 384GB | 4x Tesla P100 | 6.0 |
| hpc30 | 80 | 2x Intel Gold 6148 | 384GB | 1x Tesla P100 | 6.0 |
These servers run CentOS 7.
More information on the use of these servers can be found here:
Note
The information about using the queueing system to submit jobs is not applicable to the `hpc` servers listed above, as they are all configured as standalone servers.
Standalone HPC Windows¶
There are four virtual (VMWare) standalone HPC servers with Windows:
| name | cores | cpu | memory | gpu | compute cap. |
|---|---|---|---|---|---|
| srv232 | 8 | Intel Gold 6248 | 64GB | - | - |
| srv263 | 8 | Intel Gold 6248 | 64GB | - | - |
| srv264 | 8 | Intel Gold 6248 | 64GB | - | - |
| srv265 | 8 | Intel Gold 6248 | 64GB | - | - |
All these servers run Windows Server 2016 and have the following software installed:
- Intel compiler
- Visual Studio
- Matlab
- PZFlex (only installed on `srv232`)
For more information and access please contact the system administrator.
Delft AI Cluster¶
The Delft AI Cluster (DAIC) was previously named HPC Cluster and before that INSY Cluster. Its usage is focusing more and more on AI (deep learning), providing nodes with powerful GPUs. Only groups/departments that contribute to the cluster are allowed to use it. Through the recent contribution of two PIs within our department, all members of ImPhys can use this cluster. Please contact Ronald to get access.
These nodes can be used with Slurm via the head-nodes `login.hpc.tudelft.nl`. SSH access to these head-nodes is only possible through `linux-bastion.tudelft.nl` for employees and `student-linux.tudelft.nl` for students; you cannot connect directly from your local machine to these head-nodes, see Connection. All nodes in this cluster run CentOS 7.
For all users of the ImPhys department a special Slurm partition named `imphys` is provided. Jobs submitted to this partition have unrestricted access to the nodes bought by members of our department.
More information (a lot more…) about the HPC cluster and how to submit a Slurm job can be found here:
Specifically on how to enable the use of the GPU:
- How can I use a GPU? If you would like to use 'our' nodes with the NVIDIA V100 GPU cards, use this:
#SBATCH --gres=gpu:v100
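Putting the partition and GPU request together, a minimal Slurm job script might look like the sketch below. The script name, resource numbers and program name are illustrative; adjust them to your own job:

```
#!/bin/sh
#SBATCH --job-name=example        # illustrative job name
#SBATCH --partition=imphys        # the ImPhys partition mentioned above
#SBATCH --gres=gpu:v100           # request one V100 GPU
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4         # illustrative resource request
#SBATCH --mem=16G
#SBATCH --time=01:00:00

# Run the calculation (program name is hypothetical):
srun python my_calculation.py
```

Submit the script from the head-node with `sbatch jobscript.sh` and monitor it with `squeue -u <netid>`.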
DelftBlue cluster¶
In 2022 the new DelftBlue cluster was taken into production; it can be used by all researchers and students at TU Delft. The cluster consists of 218 nodes with 48 cores and 192GB of memory each, 10 GPU nodes with 4 NVIDIA V100S cards each, 6 fat nodes with 768GB of memory and 4 fat nodes with 1536GB of memory.
The cluster uses Slurm queueing and can be accessed via the head-nodes `login.delftblue.tudelft.nl`.
More information can be found on:
Note
Make sure you at least read all the information on how to get an account and how to access DelftBlue. The documentation is searchable and contains a lot of information on how to use the cluster and the available software.
Connection¶
Linux standalone servers and head-node¶
To run commands on the Linux servers you can use the SSH protocol to open a command-line terminal. For more information on using ssh, see this link. All interactive servers and head-nodes are accessible from the internet by appending `tudelft.net` to the server name. For example, if you would like to connect to `hpc24` you would enter:
$ ssh <netid>@hpc24.tudelft.net
Note
Replace `<netid>` with your NetID
Important
If you cannot access the shared storage on `/tudelft` or `/tudelft.net`, check with `klist` whether you have the right Kerberos credentials. You can get these credentials with the command `kinit`. More information can be found here: https://hpcwiki.tudelft.nl/index.php/How_to_log_in#How_to_access_your_Windows_network_shares
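A typical check-and-renew session looks like this (the output shown is illustrative and will differ per account):

```
# Do I have a valid ticket?
$ klist
klist: No credentials cache found

# Get a new ticket (asks for your NetID password):
$ kinit

# The shares on /tudelft.net should now be accessible:
$ klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: <netid>@TUDELFT.NET
```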
If your computer cannot access the server directly, you can use the SSH entry point of the TU Delft: `linux-bastion.tudelft.nl` for employees and `student-linux.tudelft.nl` for students, and from there log in to the server. Do not use `linux-bastion-ex.tudelft.nl`! In this case you would enter:
# login to bastion host
$ ssh <netid>@linux-bastion.tudelft.nl
# or for students
$ ssh <netid>@student-linux.tudelft.nl
⋮
# from the bastion host login to the server
-bash-4.1$ ssh <netid>@hpc24.tudelft.net
Important
This is always the case for the head-nodes of the HPC cluster, `login.hpc.tudelft.nl`, and the head-nodes of the DelftBlue cluster, `login.delftblue.tudelft.nl`.
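Instead of logging in twice, OpenSSH (version 7.3 and later) can make the hop through the bastion automatically with `ProxyJump`. A hypothetical `~/.ssh/config` entry (replace `<netid>` with your NetID; students use `student-linux.tudelft.nl` as the jump host):

```
Host hpc24
    HostName hpc24.tudelft.net
    User <netid>
    ProxyJump <netid>@linux-bastion.tudelft.nl
```

After that, `ssh hpc24` connects in one step. The same works directly on the command line with `ssh -J <netid>@linux-bastion.tudelft.nl <netid>@hpc24.tudelft.net`.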
Windows servers¶
The Windows servers can be accessed with RDP. This can be done from your local computer or via the Citrix Workspace.