GPU Workloads
Overview
Bacalhau supports running jobs on GPUs out of the box. This guide covers how to set up and use GPUs with Bacalhau.
Supported GPU Types
Bacalhau currently supports:
- NVIDIA GPUs
- AMD GPUs
- Intel GPUs
GPU support is available only with the Docker executor.
Prerequisites
Basic Requirements
- Docker installed
- Appropriate GPU drivers for your hardware
GPU-Specific Setup
NVIDIA GPUs
- Install NVIDIA GPU Drivers
- Install NVIDIA Container Toolkit (nvidia-docker2)
- Verify with the nvidia-smi command
AMD GPUs
- Install AMD GPU drivers
- Set up Docker for ROCm following this guide
- Verify with the rocm-smi command
Intel GPUs
- Install Intel GPU drivers
- Set up Docker for Intel GPUs following this guide
- Verify with the xpu-smi command
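The three verification steps above can be combined into one quick check. The sketch below is only an illustration: the function names gpu_tool_for and check_gpu_tools are hypothetical helpers, not part of Bacalhau, and the script only reports whether each vendor tool is on the PATH. It does not confirm that the drivers or the container runtime are configured correctly.

```shell
#!/bin/sh
# Map a GPU vendor to the verification tool named in this guide.
gpu_tool_for() {
  case "$1" in
    nvidia) echo "nvidia-smi" ;;
    amd)    echo "rocm-smi" ;;
    intel)  echo "xpu-smi" ;;
    *)      return 1 ;;  # unknown vendor
  esac
}

# Print one line per vendor saying whether its tool is on the PATH.
check_gpu_tools() {
  for vendor in nvidia amd intel; do
    tool="$(gpu_tool_for "$vendor")"
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$vendor: $tool found"
    else
      echo "$vendor: $tool not found"
    fi
  done
}

check_gpu_tools
```

If a tool is reported as found, running it directly (for example nvidia-smi) should list the GPUs that the driver can see.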
Running GPU Jobs
Command Line
Use the --gpu flag to specify the number of GPUs your job requires:
bacalhau docker run --gpu=1 nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
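When the GPU count comes from a script or a variable, it may help to validate it before submitting. The following is a minimal sketch, not an official Bacalhau utility: run_gpu_job and the DRY_RUN variable are hypothetical names introduced here. With DRY_RUN=1 the wrapper prints the command instead of running it, so it can be tried on a machine without Bacalhau installed.

```shell
#!/bin/sh
# Hypothetical wrapper: submit a GPU job only if the count is valid.
# Usage: run_gpu_job <gpu-count> <image> [command...]
run_gpu_job() {
  gpus="$1"
  shift
  # Reject an empty value, anything containing a non-digit, and a literal 0.
  case "$gpus" in
    ''|*[!0-9]*|0)
      echo "invalid GPU count: $gpus" >&2
      return 1
      ;;
  esac
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Dry run: show the command that would be executed.
    echo "bacalhau docker run --gpu=$gpus $*"
  else
    bacalhau docker run --gpu="$gpus" "$@"
  fi
}

# Example: print the command from this guide without running it.
DRY_RUN=1
run_gpu_job 1 nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
```

Unset DRY_RUN (or set it to 0) to actually submit the job.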
Using YAML
You can also submit GPU jobs using YAML configuration:
Name: gpu-test-job
Type: batch
Count: 1
Tasks:
  - Engine:
      Type: docker
      Params:
        Image: 'nvidia/cuda:11.6.2-base-ubuntu20.04'
        Entrypoint:
          - /bin/bash
        Parameters:
          - -c
          - nvidia-smi && echo 'GPU is working!'
    Name: TestGPU
    ResourcesConfig:
      CPU: '1'
      Memory: '1GB'
      Disk: '10GB'
      GPU: '1'
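To submit the job, save the YAML above to a file and pass it to the CLI. The sketch below assumes the file is named gpu-job.yaml (a hypothetical name) and that your CLI version provides the bacalhau job run subcommand. The requested_gpus helper is also hypothetical: it does a simple text search for the first GPU key instead of parsing the YAML, and is only a sanity check before submission.

```shell
#!/bin/sh
# Hypothetical file name; save the job spec under this name first.
JOB_FILE="gpu-job.yaml"

# Print the digits of the first "GPU:" value found in a job spec.
# This is a plain text match, not a real YAML parser.
requested_gpus() {
  grep -E '^[[:space:]]*GPU:' "$1" | head -n 1 | tr -cd '0-9'
}

if [ -f "$JOB_FILE" ]; then
  echo "Requesting $(requested_gpus "$JOB_FILE") GPU(s)"
  bacalhau job run "$JOB_FILE"
else
  echo "job file not found: $JOB_FILE" >&2
fi
```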
Important Notes
- For NVIDIA GPUs, your container image must include a CUDA runtime that is compatible with the CUDA version on the node
- The number of GPUs a job can use is controlled through resource limits, such as the --gpu flag or the GPU field in the job YAML
- The Bacalhau network must have executor nodes with GPUs exposed