Use PyTorch with a single GPU
Train multiple models at the same time with Dask
Train a single model across many GPUs with Dask