CUDA, cuDNN and NCCL for Anaconda Python

CUDA, cuDNN and NCCL functionality is accessed in a NumPy-like way from CuPy. CuPy also allows lower-level use of the GPU.

Before starting GPU work in any programming language, keep these general caveats in mind:

  • I/O heavy workloads may make realizing GPU benefits more difficult
  • Consumer GPUs (GeForce) can be more than 10x slower than workstation-class GPUs (Tesla, Quadro)

CUDA requires a discrete Nvidia GPU. Check whether an Nvidia GPU is present as follows:

  • Linux: run the command below. A blank response means an Nvidia GPU is not detected.

    lspci | grep -i nvidia

  • Windows: run the command below and look under the “render” tab to see if an Nvidia GPU exists.

    dxdiag
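The Linux check above can also be scripted from Python. This is only a minimal sketch: the function names `mentions_nvidia` and `nvidia_gpu_detected` are made up for illustration, and it assumes `lspci` is on the PATH. It mirrors `lspci | grep -i nvidia` by doing a case-insensitive search of the `lspci` output.

```python
import shutil
import subprocess


def mentions_nvidia(text):
    """Return True if any line of `text` mentions Nvidia (case-insensitive)."""
    return any("nvidia" in line.lower() for line in text.splitlines())


def nvidia_gpu_detected():
    """Run `lspci` and report whether an Nvidia device is listed.

    Returns None when `lspci` is not available (for example on Windows),
    since in that case nothing can be concluded either way.
    """
    if shutil.which("lspci") is None:
        return None
    result = subprocess.run(
        ["lspci"], capture_output=True, text=True, check=False
    )
    return mentions_nvidia(result.stdout)
```

Calling `nvidia_gpu_detected()` returns True, False, or None. A False result corresponds to the blank response described above.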

Determine the Compute Capability of the GPU and install a CUDA Toolkit version that supports it, then reboot. The CuPy package to install depends on the CUDA Toolkit version installed on your computer.

CuPy syntax is very similar to NumPy. There is a large set of CuPy functions relevant to many engineering and scientific computing tasks.

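import cupy

# Device() with no argument refers to the current CUDA device
dev = cupy.cuda.Device()
# compute_capability is a compact string, e.g. '75' for 7.5
print('Compute Capability', dev.compute_capability)
# mem_info is a (free, total) tuple in bytes
print('GPU Memory', dev.mem_info)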

This should print something like the following, along with a tuple of free and total GPU memory in bytes:

Compute Capability 75
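The raw values are compact: the compute capability comes back as a string such as '75' (meaning 7.5), and the memory info as a (free, total) tuple in bytes. A small sketch of making them more readable follows. The helper names `format_compute_capability`, `format_mem_info` and `describe_device` are hypothetical, not part of CuPy.

```python
def format_compute_capability(cc):
    """Turn the compact string, e.g. '75', into 'major.minor', e.g. '7.5'.

    Assumes the last character is the minor version.
    """
    cc = str(cc)
    return f"{cc[:-1]}.{cc[-1]}"


def format_mem_info(mem_info):
    """Format a (free_bytes, total_bytes) tuple in GiB."""
    free, total = mem_info
    gib = 1024 ** 3
    return f"{free / gib:.2f} GiB free of {total / gib:.2f} GiB"


def describe_device():
    """Query the current CUDA device through CuPy and format the result."""
    import cupy  # imported here so the helpers above work without a GPU

    dev = cupy.cuda.Device()
    return (
        format_compute_capability(dev.compute_capability),
        format_mem_info(dev.mem_info),
    )
```

On a machine with a working CuPy install, `print(*describe_device())` prints the two formatted values.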

You may get an error like:

cupy.cuda.runtime.CUDARuntimeError: cudaErrorInsufficientDriver: CUDA driver version is insufficient for CUDA runtime version

This means the installed CUDA Toolkit version expects a newer Nvidia driver than the one on your computer. The Nvidia driver can usually be updated with the standard Nvidia update program that came preinstalled on the computer. “Table 1” of the CUDA Toolkit release notes lists the driver versions required by each CUDA Toolkit version.
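To see the mismatch in numbers, the CUDA version supported by the driver can be compared with the CUDA runtime version that CuPy was built against. The sketch below assumes CUDA's usual integer encoding (1000*major + 10*minor, so 11020 means 11.2). The names `decode_cuda_version` and `report_cuda_versions` are made up for illustration. Treat the result as a hint only, since the release notes table remains the authority.

```python
def decode_cuda_version(version):
    """Decode CUDA's integer version, e.g. 11020 -> (11, 2)."""
    return version // 1000, (version % 1000) // 10


def report_cuda_versions():
    """Return a one-line summary of driver-supported and runtime CUDA versions."""
    import cupy  # imported here so decode_cuda_version works without a GPU

    driver = decode_cuda_version(cupy.cuda.runtime.driverGetVersion())
    runtime = decode_cuda_version(cupy.cuda.runtime.runtimeGetVersion())
    return (
        "driver supports up to CUDA %d.%d, runtime is CUDA %d.%d"
        % (driver + runtime)
    )
```

If the driver's number is lower than the runtime's, updating the Nvidia driver is the likely fix.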

Examples:
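As one minimal sketch of the NumPy-like syntax, the function below takes the array module as a parameter, so the same code runs with either `cupy` or `numpy`. The names `pick_array_module` and `dominant_frequency_bin` are hypothetical. The fallback to NumPy when no usable GPU is found is an assumption made so that the sketch runs anywhere.

```python
def pick_array_module():
    """Return cupy if it imports and a GPU is usable, otherwise numpy."""
    try:
        import cupy

        if cupy.is_available():
            return cupy
    except Exception:
        pass  # CuPy missing or the CUDA driver is unusable
    import numpy

    return numpy


def dominant_frequency_bin(xp, n=1024, cycles=50):
    """Build a sine wave with `cycles` periods and find its strongest FFT bin.

    `xp` is the array module (cupy or numpy); the calls are the same in both.
    """
    t = xp.linspace(0.0, 1.0, n, endpoint=False)
    x = xp.sin(2 * xp.pi * cycles * t)
    spectrum = xp.abs(xp.fft.rfft(x))
    # int() copies the 0-d result back to the host when xp is cupy
    return int(xp.argmax(spectrum))
```

Calling `dominant_frequency_bin(pick_array_module())` should give 50 for the default arguments, since the sine wave has exactly 50 periods in the sampled window.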

Alternatives to CuPy include Numba.cuda, which is a lower-level C-like CUDA interface from Python. CUDA for Julia is provided in JuliaGPU. Anaconda Accelerate was discontinued.