Install NVIDIA GPU drivers
In this guide, you will learn how to install drivers for NVIDIA GPU machine types.
You can install the NVIDIA drivers with the package manager of the Linux distribution on your host system. The currently supported distributions are listed in the NVIDIA documentation. The following example uses Ubuntu 24.04 and the APT package manager. For other Linux distributions, please consult the official website for the CUDA toolkit.
- Install the CUDA repository public GPG key. Execute the following commands in order.

      wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
      sudo dpkg -i cuda-keyring_1.1-1_all.deb
      sudo apt-get update

- Install the driver.

      sudo apt-get install -y nvidia-open

- Install the fabric manager (only for the machine type n3.104d.g8). The fabric manager is necessary to use the interconnection of multiple GPUs with NVIDIA NVSwitch.

      # The driver and the fabric manager should have the same version, e.g. version 560:
      sudo apt install nvidia-fabricmanager-560
      # Reboot your machine:
      sudo reboot
      # Start the fabric manager:
      sudo systemctl start nvidia-fabricmanager
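Because the driver and the fabric manager should share the same version, it can help to derive the package name from the installed driver instead of typing it by hand. The following is a minimal sketch, not an official procedure: the helper name `fabricmanager_package` is hypothetical, and it assumes the package follows the `nvidia-fabricmanager-<branch>` pattern from the step above, where `<branch>` is the part of the driver version before the first dot.

```shell
#!/bin/sh
# Hypothetical helper: build the fabric manager package name from a full
# driver version string such as 560.35.03.
# Assumption: packages are named nvidia-fabricmanager-<branch>.
fabricmanager_package() {
    driver_version=$1
    # ${driver_version%%.*} keeps only the text before the first dot.
    printf 'nvidia-fabricmanager-%s\n' "${driver_version%%.*}"
}

# Query the installed driver only if nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if [ -n "$installed" ]; then
        echo "Matching package: $(fabricmanager_package "$installed")"
    fi
fi
```

You could then pass the printed name to `sudo apt install`. Check that the package actually exists in your repository before relying on the derived name.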
To properly use the VMs that have a GPU attached to them, it is necessary to have the correct drivers installed. The necessary drivers as well as a list of supported Linux distributions can be found in the NVIDIA datacenter documentation. You can install them by using a .run-file.
- Set the variable BASE_URL. It describes the first part of the download URL.

      BASE_URL=https://us.download.nvidia.com/tesla

- Set the variable DRIVER_VERSION, which contains the driver version to install (here the most recent one at the time of writing).

      DRIVER_VERSION=565.57.01

- Use the curl command to download the .run-file from NVIDIA.

      curl -fSsl -O $BASE_URL/$DRIVER_VERSION/NVIDIA-Linux-x86_64-$DRIVER_VERSION.run

- Install the dependencies.

      sudo apt install build-essential

- Run the .run-file to install the drivers.

      sudo sh NVIDIA-Linux-x86_64-$DRIVER_VERSION.run
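To see how the two variables combine into the download location, here is a small sketch that only prints the resulting URL and downloads nothing. The function name `runfile_url` is hypothetical and introduced for illustration; the URL layout is taken from the curl command above.

```shell
#!/bin/sh
# Hypothetical helper: compose the .run-file URL from its two variable parts.
runfile_url() {
    base_url=$1
    driver_version=$2
    # The driver version appears twice: as a directory and in the file name.
    printf '%s/%s/NVIDIA-Linux-x86_64-%s.run\n' \
        "$base_url" "$driver_version" "$driver_version"
}

BASE_URL=https://us.download.nvidia.com/tesla
DRIVER_VERSION=565.57.01

# Print the URL so it can be inspected before downloading.
runfile_url "$BASE_URL" "$DRIVER_VERSION"
```

Printing the URL first makes it easy to spot a mistyped version number before curl fails on it.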
You can install the NVIDIA drivers with the package manager of the Linux distribution on your host system. The currently supported distributions are listed in the NVIDIA documentation. The following example uses Ubuntu 22.04 and the APT package manager. For other Linux distributions, please consult the official website for the CUDA toolkit.
- Install the headers and development packages for the currently running kernel.

      sudo apt-get install linux-headers-$(uname -r)

- Install the local CUDA repository and its public GPG key. Execute the following commands in order.

      wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
      sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
      wget https://developer.download.nvidia.com/compute/cuda/12.2.2/local_installers/cuda-repo-ubuntu2204-12-2-local_12.2.2-535.104.05-1_amd64.deb
      sudo dpkg -i cuda-repo-ubuntu2204-12-2-local_12.2.2-535.104.05-1_amd64.deb
      sudo cp /var/cuda-repo-ubuntu2204-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/

- Update the APT repository cache and install the driver and the CUDA toolkit.

      sudo apt-get update
      sudo apt-get -y install cuda-drivers
      sudo apt-get -y install cuda
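The file name of the local installer appears to encode both the CUDA toolkit version and the bundled driver version. The sketch below splits the name from the step above into these two parts, which can be useful when you need to know which driver a given installer ships. The function name `installer_versions` is hypothetical, and the naming scheme `<prefix>_<cuda>-<driver>-<revision>_<arch>.deb` is an assumption inferred from this one file name, not a documented format.

```shell
#!/bin/sh
# Hypothetical helper: extract CUDA and driver versions from a local
# installer file name.
# Assumption: the name looks like <prefix>_<cuda>-<driver>-<revision>_<arch>.deb
installer_versions() {
    versions=${1#*_}          # drop everything up to the first underscore
    versions=${versions%_*}   # drop the trailing _<arch>.deb part
    cuda=${versions%%-*}      # text before the first hyphen
    driver=${versions#*-}     # text after the first hyphen
    driver=${driver%-*}       # drop the trailing package revision
    printf 'cuda=%s driver=%s\n' "$cuda" "$driver"
}

installer_versions cuda-repo-ubuntu2204-12-2-local_12.2.2-535.104.05-1_amd64.deb
# prints: cuda=12.2.2 driver=535.104.05
```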
How to use the NVIDIA System Management Interface (SMI)
nvidia-smi (also NVSMI) provides monitoring and management capabilities for devices from each of NVIDIA's architecture families. It is provided along with the NVIDIA open drivers and the CUDA toolkit. An example of NVSMI output:
    nvidia-smi
    +-----------------------------------------------------------------------------------------+
    | NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
    |-----------------------------------------+------------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
    |                                         |                        |               MIG M. |
    |=========================================+========================+======================|
    |   0  NVIDIA A100 80GB PCIe          On  |   00000000:05:00.0 Off |                    0 |
    | N/A   31C    P0             45W /  300W |       1MiB /  81920MiB |      0%      Default |
    |                                         |                        |             Disabled |
    +-----------------------------------------+------------------------+----------------------+

    +-----------------------------------------------------------------------------------------+
    | Processes:                                                                              |
    |  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
    |        ID   ID                                                               Usage      |
    |=========================================================================================|
    |  No running processes found                                                             |
    +-----------------------------------------------------------------------------------------+

How to perform post-installation actions
Before you can use the CUDA toolkit and driver, you need to perform the following steps. The first is to add the installation path to your PATH variable. If you used the .run-file installation method, run the following command (cuda-12.2 is used as an example):

    export PATH=/usr/local/cuda-12.2/bin${PATH:+:${PATH}}

For the .run-file installation, you also need to add the library directory of the same installation to LD_LIBRARY_PATH. Again, cuda-12.2 is used as an example:

    # For 64-bit operating systems:
    export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

    # For 32-bit operating systems:
    export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

If you used any of the package managers, you also need to add the path of Nsight Compute to the PATH variable:

    export PATH=/opt/nvidia/nsight-compute/${PATH:+:${PATH}}
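The `${PATH:+:${PATH}}` expansion used above appends a colon and the old value only when the variable is already set and non-empty, which avoids a stray trailing colon. The following sketch illustrates that behaviour with a small function that also skips directories already present, so the exports can be placed in a shell profile without growing the variable on every login. The name `path_prepend` is hypothetical and not part of the CUDA toolkit.

```shell
#!/bin/sh
# Hypothetical helper: print a search path with a directory prepended.
#   $1 = directory to add, $2 = current value of the path variable
path_prepend() {
    case ":$2:" in
        # Directory already listed: return the value unchanged.
        *":$1:"*) printf '%s\n' "$2" ;;
        # ${2:+:$2} expands to ":<old value>" only if $2 is non-empty.
        *) printf '%s\n' "$1${2:+:$2}" ;;
    esac
}

# Example with cuda-12.2, as in the commands above.
PATH=$(path_prepend /usr/local/cuda-12.2/bin "$PATH")
export PATH
```

The same function could be used for LD_LIBRARY_PATH by passing the library directory and the current value of that variable.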