device Stanza
Placement | job -> group -> task -> resources -> device |
The device stanza is used to create both a scheduling and runtime requirement
that the given task has access to the specified devices. A device is a hardware
device that is attached to the node and may be made available to the task.
Examples are GPUs, FPGAs, and TPUs.
When a device stanza is added, Nomad will schedule the task onto a node that
contains the set of device(s) that meet the specified requirements. The device
stanza allows the operator to specify as little as just the type of device
required, such as gpu, all the way to specifying arbitrary constraints and
affinities.
Once the scheduler has placed the allocation on a suitable node, the Nomad
Client will invoke the device plugin to retrieve information on how to mount the
device and what environment variables to expose. For more information on the
runtime environment, please consult the individual device plugin's documentation.
See the device plugin's documentation for a list of supported devices.
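A job requesting devices might look like the following minimal sketch. The job, group, and task names are placeholders, and the affinity weight of 75 is an assumed value chosen only for illustration.

```hcl
job "docs" {
  group "example" {
    task "server" {
      resources {
        # Request two GPUs from the Nvidia vendor, any model.
        device "nvidia/gpu" {
          count = 2

          # Hard requirement: at least 2 GiB of device memory.
          constraint {
            attribute = "${device.attr.memory}"
            operator  = ">="
            value     = "2 GiB"
          }

          # Preference: devices with at least 4 GiB of memory.
          # The weight is an assumed, illustrative value.
          affinity {
            attribute = "${device.attr.memory}"
            operator  = ">="
            value     = "4 GiB"
            weight    = 75
          }
        }
      }
    }
  }
}
```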
In the above example, the task is requesting two GPUs from the Nvidia vendor,
but is not specifying the specific model required. Instead, it places a hard
constraint that the device has at least 2 GiB of memory and states that it would
prefer GPUs that have at least 4 GiB. This example shows how expressive the
device stanza can be.
Device support is currently limited to Linux and container-based drivers, because these drivers are able to isolate devices to specific tasks.
device Parameters
- name (string: "") - Specifies the device required. The following inputs are valid:
  - <device_type>: If a single value is given, it is assumed to be the device type, such as "gpu" or "fpga".
  - <vendor>/<device_type>: If two values are given separated by a /, the given device type will be selected, constraining on the provided vendor. Examples include "nvidia/gpu" or "amd/gpu".
  - <vendor>/<device_type>/<model>: If three values are given separated by a /, the given device type will be selected, constraining on the provided vendor and model name. Examples include "nvidia/gpu/1080ti" or "nvidia/gpu/2080ti".
- count (int: 1) - Specifies the number of instances of the given device that are required.
- constraint (Constraint: nil) - Constraints to restrict which devices are eligible. This can be provided multiple times to define additional constraints. See below for available attributes.
- affinity (Affinity: nil) - Affinity to specify a preference for which devices get selected. This can be provided multiple times to define additional affinities. See below for available attributes.
device Constraint and Affinity Attributes
The set of attributes available for use in a constraint or affinity are as
follows:
Variable | Description | Example Value |
---|---|---|
${device.type} | The type of device | "gpu", "tpu", "fpga" |
${device.vendor} | The device's vendor | "amd", "nvidia", "intel" |
${device.model} | The device's model | "1080ti" |
${device.attr.<property>} | Property of the device | ${device.attr.memory} => 8 GiB |
For the set of attributes available, please see the individual device plugin's documentation.
Attribute Units and Conversions
Devices report their attributes with strict types and can also provide unit information. For example, when a GPU is reporting its memory, it can report that it is "4096 MiB". Since Nomad has the associated unit information, a constraint that requires greater than "3.5 GiB" can match since Nomad can convert between these units.
The units Nomad supports are as follows:
Base Unit | Values |
---|---|
Byte | Base 2: KiB, MiB, GiB, TiB, PiB, EiB; Base 10: kB, KB (equivalent to kB), MB, GB, TB, PB, EB |
Byte Rates | Base 2: KiB/s, MiB/s, GiB/s, TiB/s, PiB/s, EiB/s; Base 10: kB/s, KB/s (equivalent to kB/s), MB/s, GB/s, TB/s, PB/s, EB/s |
Hertz | MHz, GHz |
Watts | mW, W, kW, MW, GW |
Conversion is only possible within the same base unit.
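As a sketch of how unit conversion plays out in practice, assume a GPU whose plugin reports its memory as 4096 MiB through ${device.attr.memory}. A constraint written in GiB, like the one below, would still match that device, since 4096 MiB is 4 GiB and both values share the Byte base unit.

```hcl
resources {
  device "gpu" {
    # Written in GiB; a device reporting "4096 MiB" satisfies this
    # because Nomad converts between units of the same base unit.
    constraint {
      attribute = "${device.attr.memory}"
      operator  = ">"
      value     = "3.5 GiB"
    }
  }
}
```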
device Examples
The following examples only show the device stanzas. Remember that the
device stanza is only valid in the placements listed above.
Single Nvidia GPU
This example schedules a task with a single Nvidia GPU made available.
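A minimal sketch of such a request, relying on the default count of 1:

```hcl
resources {
  # Vendor and device type only; count defaults to 1.
  device "nvidia/gpu" {}
}
```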
Multiple Nvidia GPUs
This example schedules a task with two Nvidia GPUs made available.
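A minimal sketch of a request for two Nvidia GPUs, using the count parameter:

```hcl
resources {
  device "nvidia/gpu" {
    # Two instances of the device are required.
    count = 2
  }
}
```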
Single Nvidia GPU with Specific Model
This example schedules a task with a single Nvidia GPU made available and uses the name to specify the exact model to be used.
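A minimal sketch using the three-part name form, with "1080ti" taken from the model examples given earlier:

```hcl
resources {
  # <vendor>/<device_type>/<model> selects an exact model.
  device "nvidia/gpu/1080ti" {}
}
```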
This is a simplification of the following:
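A sketch of the longer form, assuming the three-part name is equivalent to equality constraints on the ${device.vendor} and ${device.model} attributes:

```hcl
resources {
  device "gpu" {
    count = 1

    # Equivalent to the vendor part of "nvidia/gpu/1080ti".
    constraint {
      attribute = "${device.vendor}"
      operator  = "="
      value     = "nvidia"
    }

    # Equivalent to the model part of "nvidia/gpu/1080ti".
    constraint {
      attribute = "${device.model}"
      operator  = "="
      value     = "1080ti"
    }
  }
}
```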
Affinity with Unit Conversion
This example uses an affinity to tell the scheduler that it would prefer a GPU with at least 1.5 GiB of memory. The following two forms are equivalent, as Nomad can do unit conversions.
Specified in GiB:
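A sketch of the GiB form. The weight of 75 is an assumed value used only for illustration.

```hcl
resources {
  device "nvidia/gpu" {
    affinity {
      attribute = "${device.attr.memory}"
      operator  = ">="
      value     = "1.5 GiB"
      weight    = 75 # assumed, illustrative weight
    }
  }
}
```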
Specified in MiB:
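A sketch of the MiB form, where 1536 MiB equals 1.5 GiB. The weight of 75 is again an assumed, illustrative value.

```hcl
resources {
  device "nvidia/gpu" {
    affinity {
      attribute = "${device.attr.memory}"
      operator  = ">="
      value     = "1536 MiB" # 1.5 GiB expressed in MiB
      weight    = 75         # assumed, illustrative weight
    }
  }
}
```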