`DataParallel` requires every input tensor to be on the first device in its `device_ids` list. It uses that device as a staging area for scattering inputs to the other GPUs, and it is also the device where the outputs of each replica are gathered before `forward` returns. If you want device 2 to be the primary device, just put it at the front of the list:

```python
model = nn.DataParallel(model, device_ids=[2, 0, 1, 3])
model.to(f'cuda:{model.device_ids[0]}')
```

After that, every tensor you pass to `model` should be on that first device as well:

```python
x = ...  # input tensor
x = x.to(f'cuda:{model.device_ids[0]}')
y = model(x)
```
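Putting the pieces together, here is a minimal end-to-end sketch. It assumes PyTorch is installed; the `nn.Linear` toy model and the `primary_device` helper are illustrative, not part of the original answer, and the multi-GPU branch only runs on a machine with at least four GPUs:

```python
import torch
import torch.nn as nn

def primary_device(device_ids):
    """The device DataParallel scatters from and gathers to: device_ids[0]."""
    return f'cuda:{device_ids[0]}'

device_ids = [2, 0, 1, 3]  # device 2 is primary because it is listed first

if torch.cuda.device_count() >= 4:
    # Wrap a toy model; inputs and the module itself live on cuda:2.
    model = nn.DataParallel(nn.Linear(8, 4), device_ids=device_ids)
    model.to(primary_device(model.device_ids))
    x = torch.randn(16, 8).to(primary_device(model.device_ids))
    y = model(x)      # scattered across the 4 GPUs, gathered back on cuda:2
    print(y.device)   # cuda:2
else:
    # CPU-only fallback: just show which device would be primary.
    print(primary_device(device_ids))
```

The only rule being exercised is that both the module's parameters and its inputs must already sit on `device_ids[0]` before calling `forward`.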