DataParallel expects the wrapped module's parameters and buffers, and by convention the input tensors, to be on the first device in its device_ids list.
It uses that device as a staging area before scattering to the other GPUs, and by default (output_device defaults to device_ids[0]) it is also the device where the final outputs are gathered before forward returns. If you want device 2 to be the primary device, put it at the front of the list:
model = nn.DataParallel(model, device_ids=[2, 0, 1, 3])
model.to(f'cuda:{model.device_ids[0]}')
After that, all tensors passed to model should be on the first device as well:
x = ... # input tensor
x = x.to(f'cuda:{model.device_ids[0]}')
y = model(x)
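If the device list is built at runtime, the reordering can be wrapped in a small helper. This is only a sketch: primary_first and wrap_on_primary are hypothetical names that are not part of PyTorch, and it assumes the listed GPUs actually exist on the machine.

```python
def primary_first(device_ids, primary):
    """Return device_ids reordered so that `primary` comes first.

    Hypothetical helper, not a PyTorch API.
    """
    if primary not in device_ids:
        raise ValueError(f"device {primary} is not in {device_ids}")
    # Keep the remaining devices in their original relative order.
    return [primary] + [d for d in device_ids if d != primary]


def wrap_on_primary(model, device_ids, primary):
    """Wrap `model` in nn.DataParallel with `primary` as device_ids[0].

    Hypothetical helper; assumes the requested CUDA devices are available.
    """
    # Imported lazily so primary_first can be used without PyTorch installed.
    import torch.nn as nn

    ordered = primary_first(device_ids, primary)
    parallel = nn.DataParallel(model, device_ids=ordered)
    # Parameters and buffers must live on device_ids[0].
    parallel.to(f"cuda:{ordered[0]}")
    return parallel
```

With this, wrap_on_primary(model, [0, 1, 2, 3], primary=2) should build the same device_ids = [2, 0, 1, 3] wrapper as above, and inputs can then be moved with x.to(f'cuda:{model.device_ids[0]}') as before.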