Adding support for Pathways proxy #690

Open
wants to merge 5 commits into
base: main

jesus-orozco
Contributor

Enable JAX+Pathways single-controller architecture for coordination of accelerators.

  • Set jax_backend to "proxy"
  • Additional logic to handle new jax_backend type
  • Import previewutilities library to identify pathways-enabled workloads
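A minimal sketch of how branching on the new backend value could look. This is not the PR's code: `BackendSetup` and `plan_setup` are hypothetical names, and the sketch assumes that a single-controller setup skips the per-process distributed initialization that multi-controller backends use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendSetup:
    """Hypothetical summary of which setup steps a backend needs."""

    use_pathways_helper: bool  # load the Pathways helper library
    use_multiprocess_init: bool  # run per-process distributed initialization


def plan_setup(jax_backend: str) -> BackendSetup:
    """Pick setup steps for a backend name (illustrative only)."""
    if jax_backend == "proxy":
        # Assumption: one controller talks to a proxy, so the per-host
        # multi-process initialization is skipped.
        return BackendSetup(use_pathways_helper=True, use_multiprocess_init=False)
    if jax_backend in ("tpu", "gpu", "cpu"):
        return BackendSetup(use_pathways_helper=False, use_multiprocess_init=True)
    raise ValueError(f"Unsupported jax_backend: {jax_backend!r}")
```

Keeping the decision in one small function makes the new branch easy to unit test without any accelerators attached.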

@jesus-orozco jesus-orozco marked this pull request as ready for review October 2, 2024 15:49
pyproject.toml Outdated
@@ -92,6 +92,7 @@ gcp = [
"google-cloud-build==3.24.1",
"ml_goodput_measurement==0.0.2",
"pyOpenSSL>=22.1.0", # compat with cryptography version.
"pathwaysutils@git+https://github.com/google/pathways-utils", # for JAX+Pathways single-controller accelerator coordinator
Contributor

I'm concerned about depending on a GitHub repo directly without pinning a specific version, since any change to the repo would be picked up automatically. Does pathways-utils have versioned releases that can be installed via pip install?

Contributor Author

Thanks for the feedback, @ruomingp! The short-term roadmap for the pathways-utils library doesn't include a PyPI release, so I'm adding a tag to the dependency to control which version gets installed with AXLearn.

Contributor

Hi @jesus-orozco, is there an ETA for a PyPI release? IIRC, introducing git hashes to pyproject causes issues for our own PyPI release. We prefer to avoid this if possible.
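For reference, a small sketch of how a dependency list could be scanned for direct (URL-based) requirements and whether they carry a pinned ref. PyPI rejects uploads whose metadata lists direct URL dependencies, which may be related to the release issue mentioned here. The `classify` helper and the tag `v0.0.0-example` are hypothetical, and the regex is a simplification rather than a full requirement parser.

```python
import re

# Matches "name @ url" or "name@url"; extras such as "name[extra]" are allowed.
_DIRECT_REF = re.compile(r"^\s*([A-Za-z0-9._-]+)(?:\[[^\]]*\])?\s*@\s*(\S+)")


def classify(requirement: str) -> str:
    """Return 'index', 'direct-unpinned', or 'direct-pinned'."""
    match = _DIRECT_REF.match(requirement)
    if match is None:
        # Ordinary requirement resolved from a package index.
        return "index"
    url = match.group(2)
    # Drop the scheme and the host (which may contain "user@"), then look
    # for an "@<ref>" suffix in the remaining path.
    _, _, rest = url.partition("://")
    _, _, path = rest.partition("/")
    return "direct-pinned" if "@" in path else "direct-unpinned"


deps = [
    "pyOpenSSL>=22.1.0",
    "pathwaysutils@git+https://github.com/google/pathways-utils",
    # Hypothetical tag, for illustration only.
    "pathwaysutils@git+https://github.com/google/pathways-utils@v0.0.0-example",
]
report = {dep: classify(dep) for dep in deps}
```

A check like this could run in CI to flag direct references before a release is cut.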

Contributor Author

I'm following up with the product team to get a better picture of the roadmap. At the moment, installing from GitHub is the only way to get the package, unless we bake the wheel directly into the Docker image.

Contributor

@markblee The PyPI release ETA is Oct 18.

Contributor

@ruomingp ruomingp left a comment

Thanks.

4 participants