Sunday, September 22, 2019

Google Cloud Service GPU Profiling: K80/T4 vs GTX1060

Recently I tried training a neural network using Google Cloud. The user experience is generally good, and the $300 free credit helps. To support deep learning tasks, you can create virtual machines in Google Cloud that contain both CPUs and GPUs. There are several GPU options, ranging from the K80 to the V100, depending on how heavy the computation is and how much you are willing to spend. I compared the performance of two of these GPUs, the K80 and the T4, against my local machine.

The benchmark is a Keras example based on the MNIST data set, and the profiling metric is the time taken for one epoch of training. The table below shows the results.
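For reference, here is a minimal sketch of the kind of Keras MNIST benchmark used for the timing. I am assuming the standard Keras MNIST CNN example; the layer sizes and batch size below are illustrative and not necessarily the exact ones I ran.

```python
# Minimal sketch of a Keras MNIST benchmark timed per epoch (illustrative layer
# sizes and batch size; assumes the standard Keras MNIST CNN example).
import time
import tensorflow as tf
from tensorflow.keras import layers, models

# Load and normalize MNIST.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype("float32") / 255.0
x_test = x_test.reshape(-1, 28, 28, 1).astype("float32") / 255.0

# A small CNN, enough to keep the GPU busy.
model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Time a single epoch, which is the metric reported in the table.
start = time.time()
model.fit(x_train, y_train, batch_size=128, epochs=1,
          validation_data=(x_test, y_test))
print("One epoch took %.1f s" % (time.time() - start))
```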


GPU                      CPU                          Time per epoch
GTX 1060, 6GB GDDR5      Intel 6-core i5, 16GB RAM    35 s
Tesla K80, 12GB GDDR5    4 vCPUs, 26GB RAM            8 s
Tesla T4, 16GB GDDR6     4 vCPUs, 26GB RAM            4 s


The GTX 1060 is the GPU in my local machine. For this MNIST benchmark, the K80 takes about one fourth and the T4 about one ninth of the time my local machine needs. Note that when the CUDA/TensorFlow libraries are not set up correctly, the computation may silently fall back to the CPU, and the training time increases drastically: in my case, one epoch went from 8 s to 84 s when the computation ran on the CPU instead of the GPU. So if the computation time on Google Cloud is unexpectedly long, first confirm that the training is actually running on the GPU.
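One quick way to check is to list the devices TensorFlow can see, along the lines of the sketch below. The exact API depends on the TensorFlow version; these calls are for the 1.x/early 2.x releases that were current at the time.

```python
# Confirm TensorFlow actually sees a GPU before blaming the hardware
# for long epoch times.
import tensorflow as tf
from tensorflow.python.client import device_lib

# List every device TensorFlow can use; a correctly configured VM should show
# a device of type "GPU" (e.g. the Tesla K80 or T4) in addition to the CPU.
print(device_lib.list_local_devices())

# Returns True when a CUDA-enabled GPU is usable (available in TF 1.x,
# deprecated but still present in early TF 2.x).
print(tf.test.is_gpu_available())
```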