Cluster GPU Configuration
Automatic GPU Operator Configuration
Opni can utilize GPU acceleration to enable log anomaly detection for workload and application logs. If you would like Opni to learn from your own workloads, follow the instructions below to configure your cluster. If GPU acceleration is not enabled, Opni can still analyze your Control Plane logs using our pretrained models.
Opni bundles its own modified version of the Nvidia GPU Operator, which is the recommended way to enable GPU acceleration. It may be possible to install the upstream GPU Operator via its Helm chart, but this is not recommended, as it will likely not work with Opni. Follow the instructions below to enable GPU acceleration using Opni's built-in GPU Operator.
Prerequisites
GPU Node Hardware Requirements:
- 1x NVIDIA GPU
  - If not using VGPU, any Quadro or Tesla GPU will work.
  - If using VGPU, one of the following GPUs is required:
    - Tesla M6, M10, M60
    - Tesla P4, P6, P40, P100
    - Tesla V100
    - Quadro RTX 6000/8000
    - Tesla T4
    - NVIDIA A10, A16, A30, A40, A100, RTX A5000/A6000
- MIG is not supported when using automatic GPU Operator configuration at this time.
- Ubuntu 20.04. No other host OS is supported at this time.
- The GPU node should not have an Nvidia driver or container runtime installed. The GPU Operator handles installing all the necessary drivers automatically.
Kubernetes Cluster Requirements:
- Vanilla upstream Kubernetes, or one of the following distributions:
  - RKE
  - K3S >= v1.22.2+k3s1
  - RKE2
  - Amazon EKS
  - Google GKE
  - Azure AKS
- Opni Operator installed
- Cert-Manager installed
- NFD (Node Feature Discovery) installed (see deploy/examples/nfd_aio.yaml in the Opni repo)
Install Custom Resources
GpuPolicyAdapter
Creating a GpuPolicyAdapter resource will trigger the GPU Operator to start configuring the cluster.
If not using VGPU, the GpuPolicyAdapter resource does not need any customization and can be created with an empty spec, as follows:

apiVersion: opni.io/v1beta1
kind: GpuPolicyAdapter
metadata:
  name: gpu
spec: {}

(Note that the empty brackets after spec: are important.)
If using VGPU, please follow the official documentation to build the VGPU driver image, and create a GpuPolicyAdapter resource with the following spec, filling in the driver and licenseConfigMap fields:

apiVersion: opni.io/v1beta1
kind: GpuPolicyAdapter
metadata:
  name: vgpu
spec:
  images:
    driver: # Image name for the VGPU driver
  vgpu:
    licenseConfigMap: # ConfigMap containing gridd.conf and the client config token
    licenseServerKind: nls
Once the GpuPolicyAdapter resource is created, the GPU Operator will begin configuring the cluster. This takes a few minutes, and the container engine will be restarted during this process. Once all the pods in the gpu-operator-resources namespace complete, GPUs will be available for use and the Opni GPU Controller pod should start.
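As a quick sanity check (not part of the Opni installation), you can schedule a throwaway pod that requests a GPU and runs nvidia-smi. This is a minimal sketch; the pod name and image tag below are arbitrary placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test   # hypothetical name - delete the pod when finished
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:11.4.2-base-ubuntu20.04   # any CUDA base image should work
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1   # schedules the pod onto a node advertising a GPU

If the pod completes and its logs show the GPU, the node has been configured successfully. Depending on how the container runtime is configured, you may also need to set runtimeClassName: nvidia in the pod spec (see the provider-specific notes below).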
Provider-specific Notes and Troubleshooting
K3S
- If you are using K3S, you must use v1.22.2+k3s1 or later. This version introduces automatic detection of Nvidia container runtimes on nodes, and a large number of bugfixes related to pod lifecycle were added in upstream Kubernetes for this release. If you are using an earlier version, you may experience issues where pods become stuck in the 'Terminating' state. If this happens, force-deleting the stuck pods should help resolve the issue.
RKE
RKE uses Docker as its container engine. If you are using RKE, you should be aware of how the Nvidia runtime interacts with Docker. When the GPU operator is installed for the first time, it will create a RuntimeClass object which allows Kubernetes to use the nvidia runtime, which is configured in the docker engine itself. The RuntimeClass will look similar to this when using RKE:

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: docker

The RuntimeClass name is used in a pod spec to identify the runtime to use. It is specified using the runtimeClassName field:

apiVersion: v1
kind: Pod
spec:
  # ...
  runtimeClassName: nvidia
  # ...

When using containerd, for example, the handler field in the RuntimeClass is used to select a specific container runtime by name. However, dockershim does not support selecting custom container runtimes, meaning the nvidia container runtime must be set as the default. The GPU Operator will still utilize a RuntimeClass, but its handler must be set to docker to work. Otherwise, you may see an error like this:

Failed to create pod sandbox: rpc error: code = Unknown desc = RuntimeHandler "nvidia" not supported

If you encounter this error, check the handler field in the RuntimeClass and ensure it is set to docker.
VGPU
If you are using VGPUs, you should be aware of the following:
- Nvidia gridd does not run on the virtualization host. Instead, it runs in guest VMs - when using the GPU Operator, gridd runs inside the nvidia driver daemonset pod.
- The VGPU driver must match the driver on the host. When downloading the driver package from the Nvidia License Portal, you will be presented with two driver runfiles, for example:
  - NVIDIA-Linux-x86_64-470.63-vgpu-kvm.run - installed on the host
  - NVIDIA-Linux-x86_64-470.63.01-grid.run - installed on the guest (automatic with the GPU Operator)
  Ensure the correct drivers are used for the host and guest.
- CUDA drivers do not need to be installed on the virtualization host (and are not included by default with the VGPU driver).
- VGPU mediated device UUIDs are not persistent across reboots. This can cause libvirt to fail after a reboot if you are running a VM with VGPUs. To fix this:
  - List instances on the host with virsh list --all (the instance will be shut off).
  - Find the instance that has the VGPU. If you don't know which one it is, use the virsh edit command to view the XML configuration of each instance and search for mdev.
  - Undefine the instance using virsh undefine and restart libvirtd.
- You might not be able to determine from the host whether a guest VGPU is licensed. To check license status, you should open a shell to a pod running with the nvidia container runtime and use nvidia-smi from there (see the example pod spec after this list).
- On the virtualization host, you may notice that your physical GPU is bound to the nvidia driver, even though the nvidia_vgpu_vfio driver is available. The official docs are ambiguous on this, but your GPU should be bound to the nvidia driver. The guest VGPU will also be bound to the nvidia driver once the guest drivers are installed.
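For reference, here is a minimal sketch of such a pod. The name is a placeholder, the image can be any CUDA base image (these typically already set the NVIDIA_* environment variables described under Other Notes), and runtimeClassName: nvidia assumes the RuntimeClass created by the GPU Operator exists in your cluster:

apiVersion: v1
kind: Pod
metadata:
  name: vgpu-license-check   # hypothetical name
spec:
  runtimeClassName: nvidia   # use the nvidia container runtime
  containers:
  - name: shell
    image: nvidia/cuda:11.4.2-base-ubuntu20.04   # placeholder image
    command: ["sleep", "infinity"]   # keep the pod running so it can be exec'd into

Once the pod is running, exec into it and run nvidia-smi -q; on a VGPU guest the output should include a license status section.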
Other Notes
The nvidia.com/gpu resource request behaves counterintuitively. It serves as a node scheduling hint for pods, and as a way to keep track of which (and how many) pods are using GPUs. nvidia.com/gpu is treated as a first-class node resource just like cpu and memory: if a pod makes a request for one, the node it is scheduled on must have that resource available. However, the existence of this resource request does not equate to a pod taking exclusive ownership of the device. In fact, a pod can use a GPU without requesting an nvidia.com/gpu resource at all, as long as it has the correct environment variables and runtimeClassName set. This is how the GPU Operator pods are configured - they do not request GPU resources by design; rather, they are scheduled onto GPU nodes by other means and set up to use the nvidia container runtime.

Regardless of nvidia.com/gpu resource requests or container runtime, GPUs will only be available if the NVIDIA_VISIBLE_DEVICES and NVIDIA_DRIVER_CAPABILITIES environment variables are set correctly. They should be set as follows:

NVIDIA_VISIBLE_DEVICES=all
NVIDIA_DRIVER_CAPABILITIES=compute,utility

They can be set either in the pod spec, or in the docker image itself by using ENV in the Dockerfile.
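To illustrate both points above, a pod along these lines can use a GPU with no nvidia.com/gpu request at all. This is a sketch rather than an Opni-provided manifest, and it assumes the nvidia RuntimeClass exists on the cluster:

apiVersion: v1
kind: Pod
spec:
  # ...
  runtimeClassName: nvidia
  containers:
  - name: app
    # ...
    env:
    - name: NVIDIA_VISIBLE_DEVICES      # expose all GPUs on the node to this container
      value: all
    - name: NVIDIA_DRIVER_CAPABILITIES  # make the compute (CUDA) and utility (nvidia-smi) libraries available
      value: compute,utility
    # no nvidia.com/gpu resource request is required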
If you are not using the GPU operator, but you are using the nvidia device plugin daemonset by itself, be aware that you must patch the daemonset to include runtimeClassName: nvidia in its pod template (a sample patch is sketched at the end of this section). The GPU operator will do this automatically, but the standalone daemonset does not have it set by default.

The GPU Operator installs the drivers in such a way that they are not directly visible or usable by the host. As such, tools like nvidia-smi will not be available on the host, so you will need to exec into a pod running with the nvidia container runtime to interact with a GPU.

A common way to pass a GPU through to a virtual machine is by using the vfio-pci driver. There are several tutorials online that explain how to set up vfio-pci. It is important to note that some Nvidia GPUs expose additional PCI devices (such as an audio or usb controller) in the same IOMMU group as the GPU device itself. When binding the GPU device to the vfio-pci driver, ensure that any additional devices belonging to the GPU are bound as well.
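Returning to the device plugin note above: the patch amounts to setting runtimeClassName in the daemonset's pod template. A minimal sketch of a strategic merge patch (the daemonset's name and namespace depend on how you installed the plugin):

spec:
  template:
    spec:
      runtimeClassName: nvidia   # run the plugin's pods with the nvidia container runtime

This can be applied with kubectl patch, or by editing the daemonset manifest directly.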