
Install NVIDIA GPU drivers


In this guide, you will learn how to install drivers for NVIDIA GPU machine types.

You can install the NVIDIA drivers with the package manager of the Linux distribution on your host system. The currently supported distributions are listed in the NVIDIA documentation. The following example uses Ubuntu 24.04 and the APT package manager. For other Linux distributions, please consult the official CUDA toolkit website.

  1. Install the CUDA repository public GPG key. Execute the following commands in order.

    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
    sudo dpkg -i cuda-keyring_1.1-1_all.deb
    sudo apt-get update
  2. Install the driver.

    sudo apt-get install -y nvidia-open
  3. Install the Fabric Manager (only for the machine type n3.104d.g8). The Fabric Manager is necessary to use the interconnect between multiple GPUs via NVIDIA NVSwitch.

    # The driver and the Fabric Manager should have the same version, e.g. version 560:
    sudo apt install nvidia-fabricmanager-560
    # Reboot your machine:
    sudo reboot
    # After the reboot, start the Fabric Manager:
    sudo systemctl start nvidia-fabricmanager
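
After the reboot, it can help to confirm that the driver and the Fabric Manager come from the same release branch. The following is a minimal sketch, not an official NVIDIA procedure. The helper name `same_branch` is hypothetical, and the package name `nvidia-fabricmanager-560` is simply the one from the example above, so adjust it to your installed version. The sketch assumes a Debian-based system with `dpkg-query` and systemd.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when two version strings share the same
# branch number (the part before the first dot), e.g. 560.35.03 and 560.28.03.
same_branch() {
  [ "${1%%.*}" = "${2%%.*}" ]
}

# Only query the system when the required tools are actually installed.
if command -v nvidia-smi >/dev/null 2>&1 && command -v dpkg-query >/dev/null 2>&1; then
  driver_version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
  # Assumption: package name taken from the example above; change 560 as needed.
  fm_version=$(dpkg-query -W -f='${Version}' nvidia-fabricmanager-560 2>/dev/null)
  if same_branch "$driver_version" "$fm_version"; then
    echo "Driver $driver_version and Fabric Manager $fm_version are on the same branch."
  else
    echo "Version mismatch: driver=$driver_version fabricmanager=$fm_version" >&2
  fi
  # Report whether the service is running (assumes systemd, as in the step above).
  systemctl is-active nvidia-fabricmanager || true
fi
```

On a machine without the NVIDIA tools the script prints nothing, because the system queries are skipped.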

How to use the NVIDIA System Management Interface (SMI)


nvidia-smi (also called NVSMI) provides monitoring and management capabilities for each of NVIDIA's GPU architecture families. It ships with the NVIDIA open drivers and the CUDA toolkit. Example nvidia-smi output:

nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100 80GB PCIe          On  |   00000000:05:00.0 Off |                    0 |
| N/A   31C    P0             45W /  300W |       1MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
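
The table above is meant for human readers. For scripts, nvidia-smi also offers the `--query-gpu` option together with `--format=csv`, which is easier to parse. The following is a minimal sketch of that approach. The helper name `format_gpu_lines` is hypothetical, and the selection of queried fields is just one possible choice.

```shell
#!/bin/sh
# Hypothetical helper: turns CSV lines of the form
#   index, name, temperature, memory used, memory total
# into one readable line per GPU.
format_gpu_lines() {
  while IFS=, read -r idx name temp used total; do
    # nvidia-smi separates fields with ", ", so strip the leading space.
    printf 'GPU %s (%s): %s C, %s MiB / %s MiB\n' \
      "${idx# }" "${name# }" "${temp# }" "${used# }" "${total# }"
  done
}

# Only call nvidia-smi when it is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi \
    --query-gpu=index,name,temperature.gpu,memory.used,memory.total \
    --format=csv,noheader,nounits | format_gpu_lines
fi
```

With `nounits`, memory values are printed as plain numbers in MiB, which is why the helper appends the unit itself.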

Before you can use the CUDA toolkit and driver, you need to set up your environment. The first step is adding the installation path to your PATH variable. If you used the runfile (.run) installation method, run the following command. Here, cuda-12.2 is used as an example version:

export PATH=/usr/local/cuda-12.2/bin${PATH:+:${PATH}}

For the runfile installation, you also need to add the library directory of the same installation to LD_LIBRARY_PATH. Again, cuda-12.2 is used as an example version:

# For 64-bit operating systems:
export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64\
${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
# For 32-bit operating systems:
export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib\
${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
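
The `${VAR:+:${VAR}}` expansion used in these exports appends `:` plus the old value only when the variable is already set and non-empty. This avoids a stray trailing colon, which the dynamic linker and the shell would otherwise interpret as the current directory. The following small sketch shows the behavior with a throwaway variable. `DEMO_LIBRARY_PATH` and `/opt/extra/lib` are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Start with the variable unset, as LD_LIBRARY_PATH often is on a fresh system.
unset DEMO_LIBRARY_PATH

# First assignment: the variable is unset, so the ":old value" part expands to nothing.
DEMO_LIBRARY_PATH=/usr/local/cuda-12.2/lib64${DEMO_LIBRARY_PATH:+:${DEMO_LIBRARY_PATH}}
echo "$DEMO_LIBRARY_PATH"   # prints /usr/local/cuda-12.2/lib64

# Second assignment: the variable is set, so the old value is kept after a colon.
DEMO_LIBRARY_PATH=/opt/extra/lib${DEMO_LIBRARY_PATH:+:${DEMO_LIBRARY_PATH}}
echo "$DEMO_LIBRARY_PATH"   # prints /opt/extra/lib:/usr/local/cuda-12.2/lib64
```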

If you used any of the package managers, you also need to add the path of Nsight Compute to the PATH variable:

export PATH=/opt/nvidia/nsight-compute/${PATH:+:${PATH}}
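
The exports in this section only affect the current shell session. A common way to make them permanent is to place them in a shell startup file such as `~/.bashrc`, but this depends on your shell and setup. Because startup files can be sourced more than once, a guard against duplicate entries is useful. The following is a minimal sketch with a hypothetical helper named `path_prepend`. It is not part of the CUDA toolkit.

```shell
#!/bin/sh
# Hypothetical helper: prepend a directory to PATH unless it is already listed.
path_prepend() {
  case ":${PATH}:" in
    *":$1:"*) ;;                       # already present, nothing to do
    *) PATH="$1${PATH:+:${PATH}}" ;;   # same expansion as in the exports above
  esac
  export PATH
}

# Example directory from this guide; adjust the version to your installation.
path_prepend /usr/local/cuda-12.2/bin
path_prepend /usr/local/cuda-12.2/bin  # second call changes nothing
```

Surrounding PATH and the candidate with colons in the `case` pattern ensures that only whole entries match, so `/usr/local/cuda` would not be mistaken for `/usr/local/cuda-12.2/bin`.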