TensorFlow

TensorFlow (TM) is an open source software library for numerical computation using data flow graphs.

TensorFlow 2.1

TensorFlow 2.1 is installed in the directories of the python 3.6.7 module. It is currently generally available only on the cluster nodes in the ‘gpu-short’ and ‘gpu-long’ partitions. To use TensorFlow with GPU support, you’ll need to explicitly request the GPU(s) as follows:

sbatch -p gpu-long --gres=gpu:1 tensorf.slurm

This version of TensorFlow is not MPI enabled. Example Slurm script:

#!/bin/bash
#SBATCH --mail-user=boswald@uidaho.edu
#SBATCH --mail-type=BEGIN,END

echo $(hostname)
nvidia-smi -L
module load python/3.6.7 cuda/10.1
python /mnt/lfs2/benji/tf_test.py
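
Before running the TensorFlow test itself, it can help to confirm that the job was actually granted a GPU. The following is a minimal sketch, not part of the installed software. It assumes that Slurm exports the CUDA_VISIBLE_DEVICES environment variable for jobs submitted with --gres=gpu:N, which is the usual behavior of Slurm's GPU support but depends on how the cluster is configured. The function name visible_gpus is hypothetical.

```python
import os


def visible_gpus(env=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES, or [] if none.

    Assumption: Slurm sets CUDA_VISIBLE_DEVICES for jobs that request
    GPUs with --gres=gpu:N. This depends on the site configuration.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    # Split the comma-separated list and drop empty entries.
    return [item.strip() for item in raw.split(",") if item.strip()]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus:
        print("GPUs assigned to this job:", ", ".join(gpus))
    else:
        print("No GPUs visible; was the job submitted with --gres=gpu:N?")
```

Saved as its own file, this could be run from the Slurm script just before the TensorFlow script. An empty result suggests the --gres option was left off the sbatch command.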

Here is an example TensorFlow test script that works on multiple versions of TensorFlow:

#!/usr/bin/python

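# Checks whether GPUs are available to TensorFlow.

import tensorflow as tf
import re

# tf.version is a module; under TensorFlow 2.X its string form is
# expected to contain "api.v2".
match = re.search(r'api\.v2',str(tf.version))
if match:
    # TensorFlow 2.X: print the list of GPUs that TensorFlow can see.
    print(tf.config.list_physical_devices('GPU'))
else:
    # Use the older method for TensorFlow 1.X.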
    # Creates a graph.
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)
    # Creates a session with log_device_placement set to True.
    sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
    # Runs the op.
    print(sess.run(c))
exit()
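
The script above chooses between the two GPU checks by looking at which TensorFlow major version is loaded. A possible alternative is to read the version string that TensorFlow exposes as tf.__version__. The sketch below shows only the parsing step, so it runs without TensorFlow installed. The helper names tf_major_minor and uses_v2_api are hypothetical, and the code assumes a version string of the form MAJOR.MINOR.PATCH.

```python
def tf_major_minor(version_string):
    """Parse a version such as '2.1.0' into a tuple of ints, e.g. (2, 1).

    Assumption: the string starts with MAJOR.MINOR, as in '1.15.0'.
    """
    parts = version_string.split(".")
    return int(parts[0]), int(parts[1])


def uses_v2_api(version_string):
    """Return True if the version belongs to the TensorFlow 2.X series."""
    major, _minor = tf_major_minor(version_string)
    return major >= 2


# Intended use inside a job script (requires TensorFlow to be loaded):
#   import tensorflow as tf
#   if uses_v2_api(tf.__version__):
#       print(tf.config.list_physical_devices('GPU'))
```

Comparing integers rather than raw strings avoids surprises with versions such as 2.10, which sorts before 2.2 when compared as text.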

TensorFlow 1.X

TensorFlow 1.X is installed in the directories of the python 3.5.2 module. It is currently generally available only on the cluster nodes in the ‘gpu-short’ and ‘gpu-long’ partitions. To use TensorFlow with GPU support, you’ll need to explicitly request the GPU(s) as follows:

sbatch -p gpu-long --gres=gpu:1 tensorf.slurm

This version of TensorFlow is compiled against OpenMPI, so you’ll need to load that module as well, regardless of whether you use the MPI features. Example Slurm script:

#!/bin/bash
#SBATCH --mail-user=boswald@uidaho.edu
#SBATCH --mail-type=BEGIN,END

echo $(hostname)
nvidia-smi -L
module load python/3.5.2 openmpi/1.10.2 cuda/10.0
python /mnt/lfs2/benji/tf_test.py

Example TensorFlow script (tf_test.py):

#!/usr/bin/python

import tensorflow as tf
# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
exit()
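
To judge whether the job ran correctly, it helps to know what the matrix product in the script should be. The sketch below recomputes it in plain Python, without TensorFlow, using the same values and shapes as the constants a and b. The matmul helper here is a hypothetical stand-in for illustration, not TensorFlow's implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    # zip(*b) iterates over the columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Same values as in tf_test.py: a has shape [2, 3], b has shape [3, 2].
a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

c = matmul(a, b)
print(c)  # [[22.0, 28.0], [49.0, 64.0]]
```

The TensorFlow job should report the same four values. The exact formatting differs, and with log_device_placement enabled the output also includes lines showing which device ran each operation.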