[shortfin] Make SystemBuilder fully configuration/environment driven. #420

Merged (3 commits) on Nov 5, 2024

stellaraccident

  • shortfin.SystemBuilder (previously an abstract class) now has a constructor. When it is instantiated directly (rather than through one of the concrete subclasses), the system_type= keyword, the SHORTFIN_SYSTEM_TYPE env var, or the SystemBuilder.default_system_type property drives subclass selection.
  • This feature already existed in the C++ API; this patch merely exposes it to Python in an ergonomic way.
  • Pytest configuration and fixtures were updated to be system independent where possible/straightforward. As a result, some tests now run with --system amdgpu.
  • A new pytest --compile-flags= option provides explicit compilation flags for tests that need to compile a binary.
  • Both of the tests that compile a binary are currently non-functional on amdgpu, which needs further triage. This is OK to land because no one is passing --compile-flags yet, which leaves time to triage these tests.
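
The selection behavior described in the first bullet can be pictured with a small, self-contained sketch. This is not shortfin's implementation: the `Toy*` class names, the registry, the `"hostcpu"` default value, and the precedence order (keyword, then env var, then class default) are assumptions made for illustration. Only `system_type=`, `SHORTFIN_SYSTEM_TYPE`, and `default_system_type` come from the description above.

```python
import os


class ToySystemBuilder:
    """Illustrative stand-in, not shortfin.SystemBuilder itself."""

    # Assumed fallback value; the real default is not stated in this PR.
    default_system_type = "hostcpu"
    _registry = {}  # maps a system type name to a concrete subclass

    def __init_subclass__(cls, system_type=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if system_type is not None:
            ToySystemBuilder._registry[system_type] = cls

    def __new__(cls, *, system_type=None, **options):
        if cls is not ToySystemBuilder:
            # A concrete subclass was requested directly: no dispatch.
            return object.__new__(cls)
        # Assumed precedence: keyword, then env var, then class default.
        resolved = (
            system_type
            or os.environ.get("SHORTFIN_SYSTEM_TYPE")
            or cls.default_system_type
        )
        try:
            target = ToySystemBuilder._registry[resolved]
        except KeyError:
            raise ValueError(f"unknown system type: {resolved!r}") from None
        return object.__new__(target)

    def __init__(self, *, system_type=None, **options):
        # Python calls __init__ on the instance returned by __new__, so
        # the keyword has to be accepted here as well.
        self.options = options


class ToyCpuBuilder(ToySystemBuilder, system_type="hostcpu"):
    pass


class ToyAmdGpuBuilder(ToySystemBuilder, system_type="amdgpu"):
    pass
```

With this sketch, `ToySystemBuilder(system_type="amdgpu")` returns a `ToyAmdGpuBuilder`, while a bare `ToySystemBuilder()` falls back to the environment variable and then to the class default.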


@renxida renxida left a comment


In the future,

cpu system builder is replaced with sf.SystemBuilder()

right?

@stellaraccident

> In the future, cpu system builder is replaced with sf.SystemBuilder(), right?

Yes, except in the (hopefully) rare case where we are intentionally writing a CPU-only test (such as the test case that checks the API and options of the CPU builder specifically).

A lot of the tests are still CPU-only but can be fixed.

@stellaraccident stellaraccident merged commit 120851c into main Nov 5, 2024
11 checks passed
@stellaraccident stellaraccident deleted the shortfin_device_independence branch November 5, 2024 03:29
@stellaraccident

@ScottTodd for post review

Comment on lines +119 to +120
pytest tests/ --system amdgpu \
--compile-flags="--iree-hal-target-backends=rocm --iree-hip-target=gfx1100"

Clever - passing a list of flags through via this new option.
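
One way such an option could be wired up in a `conftest.py` is sketched below. The option names `--system` and `--compile-flags` come from this PR; the defaults, help strings, and the `compile_flags_from_config` helper are hypothetical, and shortfin's actual conftest may differ. `shlex.split` is used so that a single quoted command-line value becomes a list of individual flags.

```python
import shlex


def pytest_addoption(parser):
    # Standard pytest hook for registering command-line options.
    parser.addoption(
        "--system",
        action="store",
        default="hostcpu",  # assumed default, for illustration only
        help="System type to run tests against",
    )
    parser.addoption(
        "--compile-flags",
        action="store",
        default="",
        help="Space-separated compiler flags for tests that compile a binary",
    )


def compile_flags_from_config(config):
    """Hypothetical helper: split the raw option string into a flag list."""
    raw = config.getoption("--compile-flags") or ""
    return shlex.split(raw)
```

A fixture could call the helper with `request.config` and hand the resulting list to whatever compiles the test binary.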

We have a few tests in other places that pull --iree-hip-target from an IREE_HIP_TARGET environment variable, but that requires the tests to know about the env var for that specific backend and doesn't allow passing extra flags.

One recent example is #373, with this code:

CPU_SETTINGS = {
    "device_flags": [
        "-iree-hal-target-backends=llvm-cpu",
        "--iree-llvmcpu-target-cpu=host",
    ],
    "device": "local-task",
}
IREE_HIP_TARGET = os.environ.get("IREE_HIP_TARGET", "gfx1100")
gpu_settings = {
    "device_flags": [
        "-iree-hal-target-backends=rocm",
        f"--iree-hip-target={IREE_HIP_TARGET}",
    ],
    "device": "hip",
}

fyi @stbaione we might want to use this --compile-flags pattern for pytest options across more of the project, including that integration test.

One advantage to putting the flags in the test vs loading them from CLI flags is that the test itself can encode xfail behavior for each set of parameters. Hopefully we won't have that many xfails in shortfin 🤞

In yet another place, I have config files specify compile and run flags along with lists of xfails that are used to instantiate parameterized tests, like https://github.com/iree-org/iree/blob/main/tests/external/iree-test-suites/onnx_ops/onnx_ops_cpu_llvm_sync.json
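
The config-file approach can be sketched roughly as follows. The JSON keys (`compile_flags`, `run_flags`, `expected_failures`) and the test names are invented for illustration and are not claimed to match the schema of the linked file. The point is only that flags and expected failures live together in data and get expanded into per-test cases.

```python
import json
from dataclasses import dataclass

# Hypothetical config contents; a real suite would read this from a file.
EXAMPLE_CONFIG = """
{
  "compile_flags": ["--iree-hal-target-backends=llvm-cpu"],
  "run_flags": ["--device=local-task"],
  "expected_failures": ["test_b"]
}
"""


@dataclass(frozen=True)
class Case:
    name: str
    compile_flags: tuple
    run_flags: tuple
    expect_failure: bool


def build_cases(config_text, test_names):
    """Expand one config into a case per test, marking expected failures."""
    config = json.loads(config_text)
    xfails = set(config.get("expected_failures", []))
    return [
        Case(
            name=name,
            compile_flags=tuple(config.get("compile_flags", [])),
            run_flags=tuple(config.get("run_flags", [])),
            expect_failure=name in xfails,
        )
        for name in test_names
    ]
```

Each `Case` could then be wrapped in `pytest.param(...)` with a `pytest.mark.xfail` mark when `expect_failure` is set.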


This will get better. I'm going to poke through enough API so that the compile flags can be completely inferred. But that was too much for tonight (and in any case, you do want to be able to set them explicitly too).


Presumably then, too, if worse comes to worst, XFAIL could be inferred based on what device it actually is.


I don't anticipate having a lot of XFAIL'y stuff in shortfin. We should just be fixing things.
