Install NVIDIA GPU drivers
In this guide, you will learn how to install drivers for NVIDIA GPU machine types:
You can install the NVIDIA drivers with the package manager of your host system's Linux distribution. The currently supported distributions are listed in the NVIDIA documentation. The following example uses Ubuntu 24.04 and the APT package manager. For other Linux distributions, please consult the official website for the CUDA toolkit.

Install the CUDA repository public GPG key. Execute the following commands in order:

    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
    sudo dpkg -i cuda-keyring_1.1-1_all.deb
    sudo apt-get update

Install the driver:

    sudo apt-get install -y nvidia-open

Install the fabric manager (only for the machine type n3.104d.g8). The fabric manager is necessary to use the interconnection of multiple GPUs with NVIDIA NVSwitch.
The driver and the fabric manager should have the same version, for example version 560:

    sudo apt install nvidia-fabricmanager-560

Reboot your machine:

    sudo reboot

Start the fabric manager:

    sudo systemctl start nvidia-fabricmanager

To use VMs that have a GPU attached, the correct drivers must be installed. The necessary drivers and a list of supported Linux distributions can be found in the NVIDIA datacenter documentation. You can install the drivers by using a .run file.

Set the variable BASE_URL, which holds the first part of the download URL:

    BASE_URL=https://us.download.nvidia.com/tesla

Set the variable DRIVER_VERSION to the most recent driver version (565.57.01 at the time of writing):

    DRIVER_VERSION=565.57.01

Use the curl command to download the .run file from NVIDIA:

    curl -fSsl -O $BASE_URL/$DRIVER_VERSION/NVIDIA-Linux-x86_64-$DRIVER_VERSION.run

Install the dependencies:

    sudo apt install build-essential

Run the .run file to install the drivers:

    sudo sh NVIDIA-Linux-x86_64-$DRIVER_VERSION.run

You can install the NVIDIA drivers with the package manager of your host system's Linux distribution. The currently supported distributions are listed in the NVIDIA documentation. The following example uses Ubuntu 22.04 and the APT package manager. For other Linux distributions, please consult the official website for the CUDA toolkit.

Install the headers and development packages for the currently running kernel:

```bash
sudo apt-get install linux-headers-$(uname -r)
```

Install the CUDA repository public GPG key. Execute the following commands in order:

```bash
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.2.2/local_installers/cuda-repo-ubuntu2204-12-2-local_12.2.2-535.104.05-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-2-local_12.2.2-535.104.05-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/
```

Update the APT repository cache and install the driver and the CUDA toolkit:

```bash
sudo apt-get update
sudo apt-get -y install cuda-drivers
sudo apt-get -y install cuda
```

How to use the NVIDIA System Management Interface (SMI)
nvidia-smi (also NVSMI) provides monitoring and management capabilities for each of NVIDIA's architecture families. It is provided along with the NVIDIA open drivers and the CUDA toolkit.
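
For scripted checks, nvidia-smi can also print selected fields instead of the full table. The following is a minimal sketch, assuming the nvidia-smi options `--query-gpu` and `--format=csv,noheader`; the function names `driver_version`, `driver_branch`, and `gpu_summary` are hypothetical helpers introduced here for illustration.

```bash
# Print the driver version reported for the first GPU, e.g. 565.57.01.
# Returns 1 when nvidia-smi is not available on the PATH.
driver_version() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; is the driver installed?" >&2
        return 1
    fi
    nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1
}

# Reduce a full version such as 565.57.01 to its branch number (565).
driver_branch() {
    printf '%s\n' "${1%%.*}"
}

# Print a compact CSV summary with one line per GPU.
gpu_summary() {
    nvidia-smi --query-gpu=index,name,driver_version,memory.total --format=csv,noheader
}
```

The branch number printed by `driver_branch "$(driver_version)"` can be compared with the version suffix of the installed nvidia-fabricmanager package, since the driver and the fabric manager should have the same version.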
An example for an NVSMI output:

    nvidia-smi
    +-----------------------------------------------------------------------------------------+
    | NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
    |-----------------------------------------+------------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
    |                                         |                        |               MIG M. |
    |=========================================+========================+======================|
    |   0  NVIDIA A100 80GB PCIe          On  |   00000000:05:00.0 Off |                    0 |
    | N/A   31C    P0             45W /  300W |       1MiB /  81920MiB |      0%      Default |
    |                                         |                        |             Disabled |
    +-----------------------------------------+------------------------+----------------------+

    +-----------------------------------------------------------------------------------------+
    | Processes:                                                                              |
    |  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
    |        ID   ID                                                               Usage      |
    |=========================================================================================|
    |  No running processes found                                                             |
    +-----------------------------------------------------------------------------------------+

How to perform post-installation actions
Before you can use the CUDA toolkit and driver, you need to perform the following steps. The first one is adding the installation path to your PATH variable. If you used the .run installation method, execute the following command, which uses “cuda-12.2” as an example:

    export PATH=/usr/local/cuda-12.2/bin${PATH:+:${PATH}}

For the runfile installation, you also need to add the library directory of the same installation to LD_LIBRARY_PATH. The following commands again use “cuda-12.2” as an example.
For 64-bit operating systems:

    export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

For 32-bit operating systems:

    export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

If you used any of the package managers, you also need to add the path of Nsight Compute to the PATH variable:

    export PATH=/opt/nvidia/nsight-compute/${PATH:+:${PATH}}
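
Each export above prepends its directory every time it runs, so sourcing the same lines repeatedly (for example from a shell startup file) can leave duplicate entries. The following is a minimal sketch of a variant that only adds a directory when it is missing. The helper name `prepend_path` and the variable `CUDA_HOME` are illustrative assumptions, not part of the CUDA toolkit, and cuda-12.2 is the example version used above.

```bash
# Prepend a directory to a colon-separated list unless it is already in it.
# Usage: prepend_path <dir> <current-list>; prints the resulting list.
prepend_path() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;        # already present: unchanged
        *) printf '%s\n' "$dir${list:+:$list}" ;;   # prepend; no stray colon if empty
    esac
}

# Assumption: example install location; adjust to your CUDA version.
CUDA_HOME=/usr/local/cuda-12.2

PATH=$(prepend_path "$CUDA_HOME/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_path "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

Running the block a second time leaves both variables unchanged, which makes it safe to keep in a file that is sourced by every new shell.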