How can an NVIDIA vGPU VM determine the correct NVIDIA-grid driver version?

Question

NVIDIA GRID driver installation: https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/index.html#installing-vgpu-drivers-linux

I work in an environment with multiple hosts with Tesla cards, each serving vGPU slices to client VMs, but the hosts run a small number of different versions of the NVIDIA GRID driver. We install the NVIDIA GRID driver automatically, but I want to go from "one size fits most" (where the majority driver version is installed and the differences are fixed up manually) to a fully automated solution.

I cannot find anything in the NVIDIA documentation on how to query the host, from inside the VM, to determine which version that should be. It seems like it should be possible through lspci, dmesg, or nvidia-smi. But:

# lspci|grep VGA
02:00.0 VGA compatible controller: NVIDIA Corporation GV100GL [Tesla V100 PCIe 16GB] (rev a1)

No clue there. dmesg only reports a version once the module loads successfully, i.e. when the version already matches, and nvidia-smi reports no cards at all until the version matches.
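For reference, this is roughly how I pull the version out of dmesg once a driver does load. It is only a sketch: the function name nvrm_loaded_version is my own, and it assumes the kernel module logs a line of the form "NVRM: loading NVIDIA UNIX x86_64 Kernel Module  <version>  <build date>", which is what I see on my VMs and may differ between driver releases. It only reports the guest driver that is already loaded, so it says nothing about what the host expects.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print the version
# from the most recent "NVRM: loading NVIDIA ... Kernel Module" line.
# Prints nothing if no such line is present (module never loaded).
# Assumes the log line format described above; adjust the pattern if yours differs.
nvrm_loaded_version() {
    sed -nE 's/.*NVRM: loading NVIDIA .* Kernel Module +([0-9]+(\.[0-9]+)+).*/\1/p' \
        | tail -n 1
}
```

Typical use is `dmesg | nvrm_loaded_version`. When nvidia-smi works, `nvidia-smi --query-gpu=driver_version --format=csv,noheader` gives the same guest-side number.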

Is any of this information exposed to the client VM, or is this a lost cause (i.e. should I just tell the host maintainers to standardize on a single version)?
