The eth0 interface is for public connectivity to the internet and eth1 is for private connectivity to other Droplets in the same VPC network. GPU multi-node applications must use interfaces starting from eth2, which are for GPU-to-GPU communication, while control traffic should use eth1 for private communication between nodes.
How to Configure Multi-Node GPU Droplets
Generated on 3 Jul 2026
DigitalOcean Droplets are Linux-based virtual machines (VMs) that run on top of virtualized hardware. Each Droplet you create is a new server you can use, either standalone or as part of a larger, cloud-based infrastructure.
To create multi-node GPU deployments, you must first contact support. Multi-node GPU deployments can only be created in multiples of 8 GPUs, and support needs to enable the specific Droplet plan slug for you to use when you create your GPU Droplets.
After creation, connecting the GPUs with an NCCL or RCCL topology requires additional network configuration, such as setting the MTU and assigning IPv4 addresses to the GPU network cards.
Configure the GPU Network Interface Controllers
With the exception of NVIDIA B300, each GPU has a dedicated network interface controller (NIC); B300 Droplets have two NICs per GPU. This means that each multi-node ready Droplet has additional interfaces, from eth2 to eth9, or from eth2 to eth17 on B300 Droplets.
For rail-only fabric deployments, each NIC must have its own subnet that is disjoint from the others. For example, eth2 could use 192.168.50.0/24, eth3 could use 192.168.51.0/24, and so on. IPv6 link-local addresses are automatically assigned to these interfaces once they are active, making them a simpler option for GPU-to-GPU communication.
As for IPv4, each NIC needs a unique IP address on each subnet. We recommend using the same final octet in each subnet for a given Droplet. For example, one Droplet would have the addresses 192.168.50.2, 192.168.51.2, and so on. An additional Droplet would have 192.168.50.3, 192.168.51.3, and so on.
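The addressing scheme above can be sketched as a small shell function. This is an illustration only: fabric_addr is a hypothetical helper, and it assumes the 192.168.50.0/24-and-up subnets used in this guide, with eth2 mapped to the first subnet.

```shell
# Hypothetical helper: print the IPv4 address for one fabric NIC.
# $1 = interface number (2 for eth2, 3 for eth3, ...)
# $2 = final octet shared by all NICs on this Droplet
fabric_addr() {
  # eth2 -> 192.168.50.x, eth3 -> 192.168.51.x, and so on
  echo "192.168.$(( 50 + $1 - 2 )).$2/24"
}

# Print the address plan for a Droplet whose final octet is 2.
for n in 2 3 4 5 6 7 8 9; do
  echo "eth$n $(fabric_addr "$n" 2)"
done
```

Keeping the final octet constant per Droplet makes it easy to tell which node an address belongs to when reading logs or packet captures.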
You can address the NICs in one of three ways:
- With user data, which is useful if you intend to use a base image that doesn’t support Netplan, but requires a specific naming convention for your Droplets.
- Manually with Netplan, which is useful if the Droplet naming convention for the user data script is not suitable for your needs.
- Using Ansible, which is useful if you want to apply changes to an existing set of GPU Droplets.
To use our user data script, you must adopt a specific naming convention for your Droplets:
- The name must end with a hyphen, -, followed by an integer from 1 to 254. For example, examplename-1.
- The name must have no other hyphens.
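As a rough sketch of the naming rules above, the following hypothetical helper, droplet_octet, checks a name and prints the trailing integer that the user data script uses as the final octet. Rejecting leading zeros is an extra assumption made here to keep the resulting addresses unambiguous; it is not stated in the rules.

```shell
# Hypothetical helper: validate a Droplet name against the convention above
# and print the trailing integer.
droplet_octet() {
  case $1 in
    *-*-*) return 1 ;;            # more than one hyphen
    -*|*-) return 1 ;;            # nothing before or after the hyphen
    *-*) ;;                       # exactly one hyphen: keep checking
    *) return 1 ;;                # no hyphen at all
  esac
  suffix=${1##*-}
  case $suffix in
    *[!0-9]*|0*|????*) return 1 ;;  # non-digits, leading zero, or too long
  esac
  [ "$suffix" -le 254 ] || return 1
  echo "$suffix"
}

droplet_octet examplename-1                       # prints 1
droplet_octet gpu-node-1 || echo "invalid name"   # prints "invalid name"
```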
Then, use the following cloud-config file when you create the Droplet. For B300 Droplets, also add eth10|eth11|eth12|eth13|eth14|eth15|eth16|eth17 to the grep -E pattern:
#cloud-config
write_files:
  - path: /usr/sbin/gpu-fabric.sh
    content: |
      #!/bin/bash
      IFACES=$(ip -br addr | grep eth | grep -E 'eth2|eth3|eth4|eth5|eth6|eth7|eth8|eth9' | awk '{print $1}')
      subnet=50
      octet=$(hostname | cut -d '-' -f 2)
      for i in ${IFACES}; do
        /usr/sbin/ip link set dev ${i} up
        /usr/sbin/ip link set dev ${i} mtu 4200
        ADDR="192.168.${subnet}.${octet}/24"
        /usr/sbin/ip addr add dev ${i} ${ADDR}
        subnet=$((subnet + 1))
      done
      /usr/sbin/ip -br addr
    permissions: '0755'
bootcmd:
  - /usr/sbin/gpu-fabric.sh
runcmd:
  - /usr/sbin/gpu-fabric.sh
You can pass this script when creating a GPU Droplet with doctl by using the --user-data-file flag.
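As a hedged illustration of that workflow, the sketch below prints, but does not run, one doctl command per Droplet so that the names follow the required convention. REGION, SIZE_SLUG, and IMAGE are placeholders, gpu-fabric.yaml is an assumed local filename for the cloud-config above, and you would likely add your own SSH key options.

```shell
# Sketch: print (not run) one doctl command per Droplet.
# $1 = name prefix without hyphens, $2 = number of Droplets
print_create_commands() {
  prefix=$1
  count=$2
  i=1
  while [ "$i" -le "$count" ]; do
    # REGION, SIZE_SLUG, and IMAGE are placeholders, not real values.
    echo "doctl compute droplet create $prefix-$i" \
      "--region REGION --size SIZE_SLUG --image IMAGE" \
      "--user-data-file gpu-fabric.yaml"
    i=$(( i + 1 ))
  done
}

print_create_commands gpunode 2
```

Printing the commands first lets you review the names and the plan slug that support enabled for you before creating billable resources.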
You can use Netplan to configure the NICs. The AI/ML-ready image we provide for GPU Droplets includes Netplan support.
On each Droplet, open /etc/netplan/50-cloud-init.yaml and add the following block after eth1, at the same indentation level as the eth1 entry:
        eth2:
            dhcp4: false
            dhcp6: false
            link-local: []
            addresses:
              - 192.168.50.2/24
            mtu: 4200
        eth3:
            dhcp4: false
            dhcp6: false
            link-local: []
            addresses:
              - 192.168.51.2/24
            mtu: 4200
        eth4:
            dhcp4: false
            dhcp6: false
            link-local: []
            addresses:
              - 192.168.52.2/24
            mtu: 4200
        eth5:
            dhcp4: false
            dhcp6: false
            link-local: []
            addresses:
              - 192.168.53.2/24
            mtu: 4200
        eth6:
            dhcp4: false
            dhcp6: false
            link-local: []
            addresses:
              - 192.168.54.2/24
            mtu: 4200
        eth7:
            dhcp4: false
            dhcp6: false
            link-local: []
            addresses:
              - 192.168.55.2/24
            mtu: 4200
        eth8:
            dhcp4: false
            dhcp6: false
            link-local: []
            addresses:
              - 192.168.56.2/24
            mtu: 4200
        eth9:
            dhcp4: false
            dhcp6: false
            link-local: []
            addresses:
              - 192.168.57.2/24
            mtu: 4200
You can optionally also set the eth1 MTU to 9002. NVIDIA B300 Droplets need additional stanzas for eth10 to eth17 to configure all available NICs.
Save the file and apply the changes:
sudo netplan apply
Repeat this process on every other Droplet, replacing the fourth octet each time. For example, change 192.168.50.2 to 192.168.50.3 on the next Droplet, then to 192.168.50.4 on the next, and so on.
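Because only the fourth octet changes between Droplets, the stanzas can be generated rather than edited by hand. The following is a minimal sketch, not part of the official tooling: netplan_fabric_stanzas is a hypothetical helper, and the leading indentation it prints assumes the usual cloud-init layout, so adjust it to match your file.

```shell
# Sketch: print the eth2..eth9 stanzas for one Droplet.
# $1 = final octet for this Droplet (2, 3, 4, ...)
netplan_fabric_stanzas() {
  n=2
  while [ "$n" -le 9 ]; do
    printf '        eth%s:\n' "$n"
    printf '            dhcp4: false\n'
    printf '            dhcp6: false\n'
    printf '            link-local: []\n'
    printf '            addresses:\n'
    # eth2 -> 192.168.50.x, eth3 -> 192.168.51.x, and so on
    printf '              - 192.168.%s.%s/24\n' "$(( 48 + n ))" "$1"
    printf '            mtu: 4200\n'
    n=$(( n + 1 ))
  done
}

# Print the stanzas for the Droplet that uses .3 on every subnet.
netplan_fabric_stanzas 3
```

Review the output before pasting it into /etc/netplan/50-cloud-init.yaml, then run sudo netplan apply as described above.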
You can use our gpu-fabric Ansible playbook to configure multi-node GPU Droplets.
The README of the repository has installation and usage instructions which are replicated here:
This content is automatically generated from https://github.com/digitalocean/gpu-fabric/blob/main/README.md.
This repository contains a simple Ansible playbook to configure multi-node GPU Droplets.
To use this playbook:
- On the machine that you will use to run this playbook, first install Ansible and then clone this repository.
- In the inventory/droplets file in your cloned version of this repository, in the [multinode_gpu_droplets] section, specify the public IP addresses of your GPU Droplets.
- Ansible uses SSH under the hood to configure Droplets. If you have never connected to your Droplets with SSH and the .ssh/config file on your machine does not include StrictHostKeyChecking no, add the following line to the inventory/droplets file:
ansible_ssh_common_args='-o StrictHostKeyChecking=no'
- Save the file, then run the playbook from the root of the repository:
ansible-playbook -i inventory/droplets customer-play.yaml
The output of a successful run looks similar to the following:
PLAY [multinode_gpu_droplets] ***********************************************************************************
TASK [Gathering Facts] ******************************************************************************************
ok: [10.10.10.10]
TASK [read /etc/netplan/50-cloud-init.yaml] *********************************************************************
ok: [10.10.10.10]
TASK [extract /etc/netplan/50-cloud-init.yaml] ******************************************************************
ok: [10.10.10.10]
TASK [set a unique index for each droplet] **********************************************************************
ok: [10.10.10.10] => (item=10.10.10.10)
TASK [adjust /etc/netplan/50-cloud-init.yaml] *******************************************************************
ok: [10.10.10.10]
TASK [write /etc/netplan/50-cloud-init.yaml] ********************************************************************
ok: [10.10.10.10]
TASK [install lldp] *********************************************************************************************
ok: [10.10.10.10]
PLAY RECAP ******************************************************************************************************
10.10.10.10 : ok=7 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
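As a sketch of the inventory step in the playbook instructions above, the hypothetical helper below writes an INI-style inventory from a list of public IP addresses. Placing the SSH option in a [multinode_gpu_droplets:vars] section is an assumption based on standard Ansible inventory syntax; check the inventory/droplets file shipped in the repository for its actual layout.

```shell
# Sketch: write an INI-style inventory for the playbook.
# $1 = output path; remaining arguments = public IP addresses of the Droplets
write_inventory() {
  out=$1
  shift
  {
    echo "[multinode_gpu_droplets]"
    for ip in "$@"; do
      echo "$ip"
    done
    echo
    # Assumption: the SSH option is set as a group variable.
    echo "[multinode_gpu_droplets:vars]"
    echo "ansible_ssh_common_args='-o StrictHostKeyChecking=no'"
  } > "$out"
}

# Example with documentation-range addresses, written to a scratch file.
scratch=$(mktemp)
write_inventory "$scratch" 203.0.113.10 203.0.113.11
cat "$scratch"
rm -f "$scratch"
```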
Verify Connectivity
You can check the IP addresses assigned to the fabric NICs:
ip -br a
This lists the network interfaces and their IP addresses, for example:
lo UNKNOWN 127.0.0.1/8 ::1/128
eth0 UP 162.243.220.179/24 10.13.0.5/16 fe80::4006:aff:fe4d:d7cb/64
eth1 UP 10.128.0.2/16
eth2 UP 192.168.50.1/24
eth3 UP 192.168.51.1/24
eth4 UP 192.168.52.1/24
eth5 UP 192.168.53.1/24
eth6 UP 192.168.54.1/24
eth7 UP 192.168.55.1/24
eth8 UP 192.168.56.1/24
eth9 UP 192.168.57.1/24
Make sure these match the addresses you assigned.
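This comparison can also be scripted. The sketch below is an illustration under stated assumptions: check_fabric is a hypothetical helper, and it expects the addressing scheme from this guide, where ethN for N of 2 or more carries 192.168.(48 + N).octet/24.

```shell
# Sketch: read `ip -br addr` output on stdin and confirm that every fabric
# NIC (eth2 and up) is UP and carries the expected address.
# $1 = final octet expected on this Droplet
check_fabric() {
  awk -v octet="$1" '
    $1 ~ /^eth[0-9]+$/ {
      n = substr($1, 4) + 0
      if (n < 2) next                      # skip eth0 and eth1
      want = "192.168." (48 + n) "." octet "/24"
      ok = 0
      for (i = 3; i <= NF; i++) if ($i == want) ok = 1
      if ($2 != "UP" || !ok) { print $1 ": expected UP with " want; bad = 1 }
      seen++
    }
    # Fail if any NIC mismatched or if no fabric NIC was listed at all.
    END { if (bad || seen == 0) exit 1 }
  '
}

# Usage on a Droplet whose final octet is 1:
#   ip -br addr | check_fabric 1 && echo "fabric addresses OK"
```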
Configure NCCL or RCCL
For the best performance with multi-node training using NCCL (NVIDIA GPUs) or RCCL (AMD GPUs), you must provide additional GPU-specific configuration on all Droplets in your multi-node deployment.
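Five settings are common to every GPU-specific section that follows. As a convenience sketch only, the hypothetical helper below writes them to a file of your choosing so you can review them before adding the GPU-specific lines. Note that the double equals sign is intentional: the value is =eth1, which NCCL treats as an exact interface name rather than a prefix, so interfaces such as eth10 are not matched.

```shell
# Sketch: write the settings shared by every GPU section to a file.
# $1 = output path (for example a scratch file to review before copying)
write_common_ccl_conf() {
  cat > "$1" <<'EOF'
NCCL_SOCKET_IFNAME==eth1
NCCL_CROSS_NIC=0
NCCL_NET_DISABLE_INTRA=1
NCCL_IB_TC=104
NCCL_IB_FIFO_TC=192
EOF
}

# Write to a scratch file and show the result.
scratch=$(mktemp)
write_common_ccl_conf "$scratch"
cat "$scratch"
rm -f "$scratch"
```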
NVIDIA H100
For H100 GPUs, you must download an NCCL topology file, then configure it in nccl.conf.
First, download the topology file and save it as /etc/nccl/topo.xml.
Then, edit /etc/nccl.conf to include the following lines:
NCCL_TOPO_FILE=/etc/nccl/topo.xml
NCCL_SOCKET_IFNAME==eth1
NCCL_CROSS_NIC=0
NCCL_NET_DISABLE_INTRA=1
NCCL_IB_TC=104
NCCL_IB_FIFO_TC=192
NVIDIA H200
For H200 GPUs, edit /etc/nccl.conf to include the following lines:
NCCL_SOCKET_IFNAME==eth1
NCCL_CROSS_NIC=0
NCCL_NET_DISABLE_INTRA=1
NCCL_IB_TC=104
NCCL_IB_FIFO_TC=192
AMD MI3XX Family
For AMD MI3XX GPUs, edit /etc/rccl.conf to include the following lines:
NCCL_SOCKET_IFNAME==eth1
NCCL_CROSS_NIC=0
NCCL_PXN_DISABLE=0
NCCL_NET_DISABLE_INTRA=1
NCCL_IB_TC=104
NCCL_IB_FIFO_TC=192
NCCL_IB_GID_INDEX=1
Note that you may need to update the NCCL_IB_GID_INDEX value for your environment. See GID Index Selection for more information.
NVIDIA B300
For B300 GPUs, edit /etc/nccl.conf to include the following lines:
NCCL_SOCKET_IFNAME==eth1
NCCL_CROSS_NIC=0
NCCL_NET_DISABLE_INTRA=1
NCCL_IB_TC=104
NCCL_IB_FIFO_TC=192
GID Index Selection
Both NCCL and RCCL use GID indexes to determine which RoCE version and address family to use for RDMA communication. By default, NCCL auto-selects an appropriate index and RCCL defaults to index 0. If either of these is incorrect, you must manually set the GID index based on the configuration of the RDMA device. You can list the available indexes by running show_gids for ConnectX series NICs or show_gid for Pollara series NICs.
$ show_gids
DEV PORT INDEX GID IPv4 VER DEV
--- ---- ----- --- ------------ --- ---
mlx5_0 1 0 fe80:0000:0000:0000:5c25:73ff:fec0:ae68 v1 eth2
mlx5_0 1 1 fe80:0000:0000:0000:5c25:73ff:fec0:ae68 v2 eth2
mlx5_0 1 2 0000:0000:0000:0000:0000:ffff:c0a8:3201 192.168.50.1 v1 eth2
mlx5_0 1 3 0000:0000:0000:0000:0000:ffff:c0a8:3201 192.168.50.1 v2 eth2
mlx5_1 1 0 fe80:0000:0000:0000:5c25:73ff:fec3:77e6 v1 eth3
mlx5_1 1 1 fe80:0000:0000:0000:5c25:73ff:fec3:77e6 v2 eth3
mlx5_1 1 2 0000:0000:0000:0000:0000:ffff:c0a8:3301 192.168.51.1 v1 eth3
mlx5_1 1 3 0000:0000:0000:0000:0000:ffff:c0a8:3301 192.168.51.1 v2 eth3
mlx5_2 1 0 fe80:0000:0000:0000:5c25:73ff:fec0:b5f8 v1 eth4
mlx5_2 1 1 fe80:0000:0000:0000:5c25:73ff:fec0:b5f8 v2 eth4
mlx5_2 1 2 0000:0000:0000:0000:0000:ffff:c0a8:3401 192.168.52.1 v1 eth4
mlx5_2 1 3 0000:0000:0000:0000:0000:ffff:c0a8:3401 192.168.52.1 v2 eth4
mlx5_3 1 0 fe80:0000:0000:0000:5c25:73ff:fec0:b560 v1 eth5
mlx5_3 1 1 fe80:0000:0000:0000:5c25:73ff:fec0:b560 v2 eth5
mlx5_3 1 2 0000:0000:0000:0000:0000:ffff:c0a8:3501 192.168.53.1 v1 eth5
mlx5_3 1 3 0000:0000:0000:0000:0000:ffff:c0a8:3501 192.168.53.1 v2 eth5
mlx5_4 1 0 fe80:0000:0000:0000:5c25:73ff:fec3:78fe v1 eth6
mlx5_4 1 1 fe80:0000:0000:0000:5c25:73ff:fec3:78fe v2 eth6
mlx5_4 1 2 0000:0000:0000:0000:0000:ffff:c0a8:3601 192.168.54.1 v1 eth6
mlx5_4 1 3 0000:0000:0000:0000:0000:ffff:c0a8:3601 192.168.54.1 v2 eth6
mlx5_5 1 0 fe80:0000:0000:0000:5c25:73ff:fec0:b3c8 v1 eth7
mlx5_5 1 1 fe80:0000:0000:0000:5c25:73ff:fec0:b3c8 v2 eth7
mlx5_5 1 2 0000:0000:0000:0000:0000:ffff:c0a8:3701 192.168.55.1 v1 eth7
mlx5_5 1 3 0000:0000:0000:0000:0000:ffff:c0a8:3701 192.168.55.1 v2 eth7
mlx5_6 1 0 fe80:0000:0000:0000:5c25:73ff:fec0:a870 v1 eth8
mlx5_6 1 1 fe80:0000:0000:0000:5c25:73ff:fec0:a870 v2 eth8
mlx5_6 1 2 0000:0000:0000:0000:0000:ffff:c0a8:3801 192.168.56.1 v1 eth8
mlx5_6 1 3 0000:0000:0000:0000:0000:ffff:c0a8:3801 192.168.56.1 v2 eth8
mlx5_7 1 0 fe80:0000:0000:0000:5c25:73ff:fec0:b680 v1 eth9
mlx5_7 1 1 fe80:0000:0000:0000:5c25:73ff:fec0:b680 v2 eth9
mlx5_7 1 2 0000:0000:0000:0000:0000:ffff:c0a8:3901 192.168.57.1 v1 eth9
mlx5_7 1 3 0000:0000:0000:0000:0000:ffff:c0a8:3901 192.168.57.1 v2 eth9
Use a GID index corresponding to RoCE v2 and the desired IP address. For the above example, you would use NCCL_IB_GID_INDEX=1 for an IPv6 address and NCCL_IB_GID_INDEX=3 for an IPv4 address.
Only a single index can be chosen for all interfaces, so if the desired index is not available or is not consistent across all NICs, review the IP address configuration of your interfaces.
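The selection rule above can be sketched as a filter over the show_gids output. This is an illustration, not an official tool: pick_gid_index is a hypothetical helper, and it assumes the column layout shown in the example, where rows with an IPv4 address have seven fields and link-local IPv6 rows have six.

```shell
# Sketch: read show_gids output on stdin and print the lowest RoCE v2 GID
# index that is present on every device port for the requested family.
# $1 = "ipv4" or "ipv6"; exits non-zero if no consistent index exists.
pick_gid_index() {
  awk -v family="$1" '
    $3 !~ /^[0-9]+$/ { next }              # skip header, separator, trailer
    {
      port = $1 ":" $2
      if (!(port in ports)) { ports[port] = 1; nports++ }
      is_v4 = (NF == 7 && $6 == "v2")      # row includes an IPv4 column
      is_v6 = (NF == 6 && $5 == "v2")      # row has no IPv4 column
      if ((family == "ipv4" && is_v4) || (family == "ipv6" && is_v6))
        count[$3]++
    }
    END {
      best = -1
      for (idx in count)
        if (count[idx] == nports && (best < 0 || idx + 0 < best))
          best = idx + 0
      if (best < 0) exit 1
      print best
    }
  '
}

# Usage:
#   show_gids | pick_gid_index ipv4
```

If the function exits with a non-zero status, no single index satisfies all NICs, which usually points back to the IP address configuration of the interfaces.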