DataParallel requires that every input tensor be placed on the first device in its device_ids list.

It uses that device as a staging area before scattering inputs to the other GPUs, and it is also the device where the final outputs are gathered before forward returns. If you want device 2 to be the primary device, just put it at the front of the list:

model = nn.DataParallel(model, device_ids=[2, 0, 1, 3])
model.to(f'cuda:{model.device_ids[0]}')
After that, every tensor passed to model should be on that first device as well:

x = ... # input tensor
x = x.to(f'cuda:{model.device_ids[0]}')
y = model(x)
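Putting the two steps together, here is a minimal end-to-end sketch. The 4-GPU ordering and the Linear model are illustrative assumptions, and the availability guard lets the script degrade gracefully on machines with fewer devices:

```python
import torch
import torch.nn as nn

device_ids = [2, 0, 1, 3]           # hypothetical ordering: cuda:2 is the primary device
primary = f'cuda:{device_ids[0]}'   # staging/gather device, i.e. 'cuda:2'

if torch.cuda.is_available() and torch.cuda.device_count() >= len(device_ids):
    model = nn.DataParallel(nn.Linear(16, 4), device_ids=device_ids)
    model.to(primary)                        # parameters start on cuda:2
    x = torch.randn(8, 16, device=primary)   # inputs must also start on cuda:2
    y = model(x)                             # scattered across the four GPUs, gathered on cuda:2
```

Note that the batch dimension (8 here) is what gets split across the GPUs, so outputs come back concatenated along dim 0 on the primary device.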