CUDA, cuDNN and NCCL for Anaconda Python

CuPy gives access to GPU CUDA, cuDNN and NCCL functionality in a NumPy-like way. CuPy also allows lower-level use of the GPU.

Before starting GPU work in any programming language, keep these general caveats in mind:

  • I/O heavy workloads may make realizing GPU benefits more difficult
  • Consumer GPUs (GeForce) can be more than 10x slower than workstation-class GPUs (Tesla, Quadro), most notably for double-precision (FP64) arithmetic


You must have a discrete Nvidia GPU in your laptop or desktop. Check whether your computer has an Nvidia GPU:

  • Linux: run the command below. A blank response means an Nvidia GPU is not detected.

    lspci | grep -i nvidia

  • Windows: in the DirectX Diagnostic Tool (dxdiag), look under the “Render” tab to see if an Nvidia GPU exists.
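The Linux check above can also be scripted. The following is a minimal sketch, assuming lspci is available on the PATH; the helper names mentions_nvidia and detect_nvidia_gpu are hypothetical and not part of any library.

```python
import shutil
import subprocess


def mentions_nvidia(lspci_output: str) -> bool:
    """Return True if any line of lspci output mentions Nvidia (case-insensitive)."""
    return any("nvidia" in line.lower() for line in lspci_output.splitlines())


def detect_nvidia_gpu():
    """Run lspci and report whether an Nvidia device is listed.

    Returns None when lspci is unavailable (for example, not on Linux), so the
    caller can tell "not detected" apart from "could not check".
    """
    if shutil.which("lspci") is None:
        return None
    result = subprocess.run(["lspci"], capture_output=True, text=True, check=False)
    return mentions_nvidia(result.stdout)
```

Like the grep command, this matches any PCI device whose description mentions Nvidia, so treat a True result as a hint rather than proof of a usable GPU.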



  1. Determine the Compute Capability of your GPU model and install the matching CUDA Toolkit version.
  2. Install the CuPy package that corresponds to the CUDA Toolkit version installed on your computer.
  3. Reboot; otherwise import cupy will fail with errors like:

AttributeError: type object 'cupy.core.core.Indexer' has no attribute '__reduce_cython__'
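Step 2 can be sketched as a lookup from the CUDA Toolkit version to a CuPy package name. This is only an illustration: the helper names are hypothetical, and the wheel names in the table are assumptions based on the cupy-cuda11x and cupy-cuda12x packages. CuPy's package naming has changed across releases, so confirm the right name in the CuPy installation guide.

```python
import re

# Illustrative mapping only (an assumption, not an exhaustive list): confirm
# the package name in the CuPy installation guide before installing.
WHEEL_BY_CUDA_MAJOR = {11: "cupy-cuda11x", 12: "cupy-cuda12x"}


def parse_nvcc_release(nvcc_output: str):
    """Extract (major, minor) from the 'release X.Y' text printed by nvcc --version."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def suggest_cupy_wheel(cuda_version):
    """Return the assumed wheel name for a (major, minor) CUDA Toolkit version."""
    major, minor = cuda_version
    # Assumption: the 11.x wheel targets CUDA 11.2 and newer.
    if major not in WHEEL_BY_CUDA_MAJOR or (major == 11 and minor < 2):
        raise LookupError(f"no wheel listed here for CUDA {major}.{minor}")
    return WHEEL_BY_CUDA_MAJOR[major]
```

Pass the text printed by nvcc --version to parse_nvcc_release, then install the suggested package with pip.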

Check CuPy

CuPy syntax is very similar to NumPy. There is a large set of CuPy functions relevant to many engineering and scientific computing tasks.

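import cupy

# Handle to the current CUDA device
dev = cupy.cuda.Device()
# Compute capability is a compact string, e.g. '75' for 7.5
print('Compute Capability', dev.compute_capability)
# mem_info is a (free, total) tuple in bytes
print('GPU Memory', dev.mem_info)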

This should return something like:

Compute Capability 75
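The raw values are compact: the compute capability is a string such as '75' (meaning 7.5), and mem_info is a tuple of free and total bytes. A small sketch for making them readable follows; the helper names are hypothetical.

```python
def parse_compute_capability(cc: str):
    """Split CuPy's compact string (e.g. '75') into (major, minor), here (7, 5)."""
    # The last digit is the minor version; everything before it is the major.
    return int(cc[:-1]), int(cc[-1])


def format_mem_info(mem_info):
    """Format a (free_bytes, total_bytes) tuple in GiB."""
    free, total = mem_info
    gib = 1024 ** 3
    return f"{free / gib:.2f} GiB free of {total / gib:.2f} GiB"
```

With the dev object from the check above, call parse_compute_capability(dev.compute_capability) and format_mem_info(dev.mem_info).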

If you get an error like:

cupy.cuda.runtime.CUDARuntimeError: cudaErrorInsufficientDriver: CUDA driver version is insufficient for CUDA runtime version

This means the installed CUDA Toolkit version expects a newer Nvidia driver. Update the Nvidia driver with the standard Nvidia update program that came preinstalled on your computer. “Table 1” of the CUDA Toolkit release notes lists the minimum driver version required by each CUDA Toolkit version.
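The mismatch can also be inspected from Python. CUDA reports versions as integers encoded as 1000 * major + 10 * minor, and the sketch below decodes and compares them. The helper names are hypothetical, and report_versions assumes that CuPy imports and that cupy.cuda.runtime.driverGetVersion() and runtimeGetVersion() are callable on your system.

```python
def decode_cuda_version(version: int):
    """Decode CUDA's integer version (1000 * major + 10 * minor) to (major, minor)."""
    return version // 1000, (version % 1000) // 10


def driver_supports_runtime(driver_version: int, runtime_version: int) -> bool:
    """True when the CUDA version supported by the driver is at least the runtime's."""
    return driver_version >= runtime_version


def report_versions():
    """Print both versions; requires CuPy to import successfully."""
    # Imported here so the helpers above can be used without CuPy installed.
    from cupy.cuda import runtime

    driver = runtime.driverGetVersion()
    rt = runtime.runtimeGetVersion()
    print("Driver supports CUDA", decode_cuda_version(driver))
    print("Runtime is CUDA", decode_cuda_version(rt))
    print("Driver sufficient:", driver_supports_runtime(driver, rt))
```

Calling report_versions() on a machine with CuPy installed prints both versions. If the driver value is lower than the runtime value, updating the driver is the likely fix.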