
Add LocalCUDA Cluster (do not merge) #2432

Closed · wants to merge 4 commits

Conversation

mrocklin
Member

This should not be merged. If this goes well then I'll move this to some other repository.

In the meantime, if people are able to try this out and give feedback, that would be welcome.

pip install git+https://github.com/mrocklin/distributed@cuda-cluster --upgrade
In [1]: from dask.distributed import LocalCUDACluster, Client

In [2]: cluster = LocalCUDACluster()

In [3]: client = Client(cluster)

In [4]: import os

In [5]: def f():
   ...:     return os.environ['CUDA_VISIBLE_DEVICES']
   ...:

In [6]: client.run(f)
Out[6]:
{'tcp://127.0.0.1:33502': '4',
 'tcp://127.0.0.1:35447': '6',
 'tcp://127.0.0.1:37706': '2',
 'tcp://127.0.0.1:38728': '3',
 'tcp://127.0.0.1:39490': '7',
 'tcp://127.0.0.1:40090': '1',
 'tcp://127.0.0.1:42862': '5',
 'tcp://127.0.0.1:43920': '0'}
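For reference, the one-GPU-per-worker behavior shown above amounts to starting each worker process with its own `CUDA_VISIBLE_DEVICES` value. A minimal sketch of the idea (`make_worker_env` and `worker_envs` are illustrative names, not part of the dask API):

```python
# Sketch: one worker per GPU, each pinned to its device via the
# CUDA_VISIBLE_DEVICES environment variable.

def make_worker_env(gpu_index):
    """Environment overrides for the worker that should own GPU `gpu_index`."""
    return {"CUDA_VISIBLE_DEVICES": str(gpu_index)}

def worker_envs(n_gpus):
    """One environment dict per worker, covering GPUs 0..n_gpus-1."""
    return [make_worker_env(i) for i in range(n_gpus)]

print(worker_envs(3))
# [{'CUDA_VISIBLE_DEVICES': '0'}, {'CUDA_VISIBLE_DEVICES': '1'}, {'CUDA_VISIBLE_DEVICES': '2'}]
```

Inside each worker, `os.environ['CUDA_VISIBLE_DEVICES']` then reports only that worker's device, which is exactly what `client.run(f)` shows above.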

@mrocklin mrocklin mentioned this pull request Dec 19, 2018
@lesteve
Member

lesteve commented Dec 20, 2018

Disclaimer: I am not a GPU expert at all. It feels, though, like if CUDA_VISIBLE_DEVICES is already set, you should only be able to use the devices listed in it.

This is linked to #2430 (comment): when a job starts running in our cluster, CUDA_VISIBLE_DEVICES is already set. Using other devices would mean "stealing" other jobs' GPUs and possibly making those jobs crash (for example, by having them run out of GPU memory).
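This suggestion can be sketched as: honor a pre-set CUDA_VISIBLE_DEVICES rather than enumerating all physical GPUs. A hypothetical helper, not actual dask code (`visible_gpu_indices` is an illustrative name):

```python
import os

def visible_gpu_indices(env=None, default_count=0):
    """Return the GPU indices this process is allowed to use.

    If CUDA_VISIBLE_DEVICES is set (e.g. by a cluster job scheduler),
    use only the devices it lists; otherwise fall back to assuming
    `default_count` local GPUs are available.
    """
    if env is None:
        env = os.environ.get("CUDA_VISIBLE_DEVICES")
    if env is None:
        return [str(i) for i in range(default_count)]
    return [d.strip() for d in env.split(",") if d.strip()]

print(visible_gpu_indices("4,6,2"))  # ['4', '6', '2']
```

A LocalCUDACluster built on top of such a helper would start one worker per *allowed* device instead of one per physical device, avoiding the GPU-stealing problem described above.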

@mrocklin
Member Author

@cmgreen210 should you find yourself with some free time I'd be curious if the following would have worked for your use case with multiprocessing:

pip install git+https://github.com/mrocklin/distributed@cuda-cluster --upgrade
from dask.distributed import LocalCUDACluster, Client, progress
cluster = LocalCUDACluster()
client = Client(cluster)

futures = client.map(your_function, arg_sequence)
progress(futures)

I think that this should naively handle the things that you ran into, but I wouldn't be surprised if I've left something out or this breaks in some other way. If you have an opportunity to break this and provide feedback I would find that valuable. No pressure though if you're busy.

@cmgreen210


Interesting @mrocklin, I'll give it a shot when I have time.


yield [
    self._start_worker(
        **self.worker_kwargs, env={"CUDA_VISIBLE_DEVICES": str(i)}

@kkraus14
Member
Just a note: while this will target the correct GPU with each worker, it will prevent workers from seeing other GPUs and prevent using CUDA IPC. If you'd want to use CUDA IPC with 2 GPUs you'd want something like:

CUDA_VISIBLE_DEVICES=0,1 ...
CUDA_VISIBLE_DEVICES=1,0 ...
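That rotation scheme can be sketched as follows: each worker lists its own GPU first, followed by all the others, so every worker targets its own device while still being able to see (and use CUDA IPC with) the rest. `rotated_devices` is an illustrative helper, not dask API:

```python
def rotated_devices(worker_index, n_gpus):
    """CUDA_VISIBLE_DEVICES value for one worker: its own GPU first,
    then the remaining GPUs in cyclic order, so peer devices stay
    visible for CUDA IPC."""
    return ",".join(str((worker_index + j) % n_gpus) for j in range(n_gpus))

for i in range(2):
    print(f"worker {i}: CUDA_VISIBLE_DEVICES={rotated_devices(i, 2)}")
# worker 0: CUDA_VISIBLE_DEVICES=0,1
# worker 1: CUDA_VISIBLE_DEVICES=1,0
```

Because CUDA numbers devices in the order they appear in CUDA_VISIBLE_DEVICES, "device 0" inside each worker is always its own GPU, which is what makes the rotation (rather than a bare single index) work.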

@mrocklin
Member Author
Done. Thanks @kkraus14!

@mrocklin
Member Author

mrocklin commented Jan 2, 2019

OK, I have a small dask-cuda repository locally that has this functionality (and also the consideration that @lesteve brought up earlier about respecting CUDA_VISIBLE_DEVICES that may be given to our process).

Where should this go? I can push this up to the dask github org, but I'd be happier to have it move into the rapidsai org (I suspect that rapids devs are more likely to do maintenance on this than Dask devs). If the answer is rapidsai then I'll need someone else to make the repository on github (blank repo ideally, no commits) and give me permissions.

@cjnolet

cjnolet commented Jan 4, 2019

+1. I also agree with keeping this LocalCudaCluster separate. Would be really nice to see a fully distributed CUDA cluster in the future as well (I certainly don't mind contributing / helping to maintain).

Makes me wonder if this is presenting a good opportunity to build a repository focused on developer tooling within the RAPIDS ecosystem.

@mrocklin
Member Author

mrocklin commented Jan 8, 2019

Closing in favor of https://github.com/mrocklin/dask-cuda

@mrocklin mrocklin closed this Jan 8, 2019
@mrocklin
Member Author

mrocklin commented Jan 8, 2019

Thanks all for the comments

@mrocklin mrocklin deleted the cuda-cluster branch January 8, 2019 16:47
@cjnolet

cjnolet commented Jan 8, 2019

@mrocklin, I've been so slammed the past 2 weeks and I would really like to make use of this (specifically for py.test within dask-cuml). What is the verdict on the new home for this? Are we moving it into RAPIDS?

@mrocklin
Member Author

mrocklin commented Jan 8, 2019 via email
