GPU Nodes

RCDS now has ten generally available GPU nodes. To use them, submit jobs with ‘sbatch’ on fortyfour.ibest.uidaho.edu. To request one or more GPUs, use an sbatch command like:

sbatch -p gpu-long --gres=gpu:1 ascript.slurm
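
The contents of ascript.slurm are not shown on this page. As a minimal sketch, assuming a Bash batch script, the same partition and GPU request can also be written as #SBATCH directives inside the script. The job name and the one-day time request are placeholders, not site defaults.

```shell
#!/bin/bash
#SBATCH --partition=gpu-long      # same partition as the -p option above
#SBATCH --gres=gpu:1              # request one GPU
#SBATCH --time=1-00:00:00         # placeholder: 1 day, within the partition's limit
#SBATCH --job-name=gpu-test       # hypothetical job name

# Slurm typically sets CUDA_VISIBLE_DEVICES for jobs that request GPUs through
# --gres. Fall back to "none" so the script also runs outside a job.
gpus="${CUDA_VISIBLE_DEVICES:-none}"
echo "Visible GPUs: ${gpus}"

# Show the allocated GPU if the NVIDIA tools are on the PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi || echo "nvidia-smi could not query a GPU"
fi
```

With the directives in the script, `sbatch ascript.slurm` should be enough. Options given on the sbatch command line take precedence over #SBATCH lines.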

The GPU nodes in the ‘gpu-long’ partition have a job time limit of one week. Here are the node specs:

Node(s)   GPU                   GPU RAM  Sys RAM  Processor
n105      2x Nvidia GTX 1080Ti  11 GB    128 GB   Xeon E5-2620 v4 (16 cores)
n106      2x Nvidia T4          16 GB    64 GB    2x Xeon E5-2680 v2 (40 cores)
n110-113  2x Nvidia GTX 1080Ti  11 GB    128 GB   2x Xeon E5-2623 v4 (8 cores each)
n114      2x Nvidia RTX 2080Ti  12 GB    128 GB   Ryzen 1920X (12 cores)
n118      1x Nvidia Titan RTX   24 GB    128 GB   Ryzen 2920X (12 cores)
n120-121  2x Nvidia T4          16 GB    192 GB   2x Xeon Silver 4216 (32 cores each)

There is also a single GPU node in the gpu-short partition. This lower-spec node (8 cores, 1x Nvidia 1080 Ti, 64 GB RAM) is intended for short jobs and for debugging scripts. The time limit on the gpu-short partition is 24 hours.
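
The two time limits suggest a simple rule for choosing a partition. The helper below is a hypothetical sketch (pick_gpu_partition is not a cluster command) that maps a requested run time in whole hours to a partition name, using the 24-hour and one-week (168-hour) limits from this page and the hyphenated partition names used in the sbatch examples.

```shell
# pick_gpu_partition HOURS
# Hypothetical helper: prints the GPU partition whose time limit fits HOURS.
pick_gpu_partition() {
    if [ "$1" -le 24 ]; then
        echo "gpu-short"      # short jobs and debugging, 24-hour limit
    elif [ "$1" -le 168 ]; then
        echo "gpu-long"       # one-week limit (7 * 24 = 168 hours)
    else
        echo "no GPU partition allows ${1} hours" >&2
        return 1
    fi
}

pick_gpu_partition 2      # prints: gpu-short
pick_gpu_partition 100    # prints: gpu-long
```

The result could be passed straight to sbatch, for example `sbatch -p "$(pick_gpu_partition 2)" --gres=gpu:1 --time=02:00:00 ascript.slurm`. A short job that needs more GPU memory or more cores than the gpu-short node offers may still belong in gpu-long.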

The GPU nodes all have TensorFlow installed in the Python modules.
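
One way to confirm that TensorFlow can see the allocated GPU from inside a job is sketched below. It assumes a python3 with TensorFlow 2.x is already on the PATH (which module to load is site-specific and not shown here), and it falls back to a message when the import fails.

```shell
# Run inside a GPU job, after loading the appropriate Python module.
if python3 -c 'import tensorflow' >/dev/null 2>&1; then
    # Lists the GPUs TensorFlow can see; an empty list means none were found.
    tf_gpus=$(python3 -c 'import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))' 2>/dev/null)
else
    tf_gpus="TensorFlow is not importable in this environment"
fi
echo "TensorFlow GPUs: ${tf_gpus}"
```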

IMCI

The Institute for Modeling Collaboration and Innovation (IMCI) purchased five GPU nodes for use by researchers associated with CMCI (or by others with special permission).

  • Nodes 65-66 are Dell R730 rack servers, each with two NVIDIA Tesla K80 cards. Each card has 4,992 GPU cores (9,984 per node), so between these two nodes there are 19,968 cores.
  • Nodes 115-116 are Supermicro servers, each with two NVIDIA V100 GPUs.
  • Node 123 is a Supermicro server with four NVIDIA A100 GPUs.

To use these nodes, you must be a member of the ‘cmci’ group, and you must request them explicitly by submitting your job to the cmci-gpu partition:

sbatch -p cmci-gpu --gres=gpu:1 job_script.slurm

To choose the type of GPU you’d like to use, add it to the gres specification:

sbatch -p cmci-gpu --gres=gpu:v100:2 job_script.slurm
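
The gres string follows the pattern gpu:TYPE:COUNT, with the type optional. As a sketch, the hypothetical helper below (gres_spec is not a Slurm command) builds that string. Only v100 appears on this page, so any other type name is an assumption and should be checked against what Slurm reports, for example with sinfo.

```shell
# gres_spec COUNT [TYPE]
# Hypothetical helper: builds the value for --gres from a count and an optional GPU type.
gres_spec() {
    if [ -n "${2:-}" ]; then
        echo "gpu:${2}:${1}"
    else
        echo "gpu:${1}"
    fi
}

gres_spec 1          # prints: gpu:1
gres_spec 2 v100     # prints: gpu:v100:2

# List the GPU types Slurm advertises per node (%N = node list, %G = generic resources).
# Skipped when sinfo is not available, e.g. off the cluster.
if command -v sinfo >/dev/null 2>&1; then
    sinfo -p cmci-gpu -o "%N %G" || echo "sinfo query failed"
fi
```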

Software

Applications must be compiled specifically to use these GPU nodes. Currently compiled software includes:

  • GROMACS
  • TensorFlow