[shortfin] Make SystemBuilder fully configuration/environment driven. #420
* Now `shortfin.SystemBuilder` (which was an abstract class) has a constructor. When used (vs one of the concrete subclasses), the `system_type=` keyword (or SHORTFIN_SYSTEM_TYPE env var, or `SystemBuilder.default_system_type` property) is used to drive subclass selection.
* This feature was already in the C++ API: this patch merely exposes it to Python in an ergonomic way.
* Pytest configuration and fixtures were updated to support configurable system independence where possible/straightforward. The result is that some tests now run for `--system amdgpu`.
* Pytest `--compile-flags=` is now available to also provide explicit compilation flags for tests that need to compile a binary.
* Both such tests are currently non functional on amdgpu, which needs further triage. This is ok to land because no one is passing `--compile-flags` yet, leaving time to triage these tests.
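The selection rule in the first bullet can be sketched in plain Python. This is only an illustration and does not use the shortfin API: `resolve_system_type`, `make_builder`, the two `*Sketch` classes, and the `"hostcpu"` name are hypothetical, and the precedence (keyword, then environment variable, then default) is an assumption read from the order in which the description lists them.

```python
import os

ENV_VAR = "SHORTFIN_SYSTEM_TYPE"  # environment variable named in the PR description


def resolve_system_type(system_type=None, *, default="hostcpu", environ=None):
    """Pick a system type: explicit keyword, then env var, then default.

    The precedence order and the "hostcpu" default are assumptions made
    for this sketch, not confirmed shortfin behavior.
    """
    environ = os.environ if environ is None else environ
    if system_type is not None:
        return system_type
    env_value = environ.get(ENV_VAR)
    if env_value:
        return env_value
    return default


class HostCPUBuilderSketch:
    """Hypothetical stand-in for a concrete CPU builder."""


class AMDGPUBuilderSketch:
    """Hypothetical stand-in for a concrete AMD GPU builder."""


# Hypothetical registry mapping system type names to concrete builders.
BUILDERS = {
    "hostcpu": HostCPUBuilderSketch,
    "amdgpu": AMDGPUBuilderSketch,
}


def make_builder(system_type=None, *, environ=None):
    """Instantiate the concrete builder selected by resolve_system_type."""
    name = resolve_system_type(system_type, environ=environ)
    try:
        builder_cls = BUILDERS[name]
    except KeyError:
        raise ValueError(f"unknown system type: {name!r}") from None
    return builder_cls()
```

With this shape, a test that calls `make_builder()` without arguments follows whatever the environment selects, while a test that is intentionally tied to one backend can still pass `system_type=` explicitly.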
In the future, the CPU system builder is replaced with `sf.SystemBuilder()`, right?
Yes, except in the (hopefully) rare case where we are intentionally writing a CPU-only test (such as the test case where we check the API and options of the CPU builder specifically). A lot of the tests are still CPU-only but can be fixed.
@ScottTodd for post-review
pytest tests/ --system amdgpu \
    --compile-flags="--iree-hal-target-backends=rocm --iree-hip-target=gfx1100"
Clever - passing a list of flags through via this new option.
We have a few tests in other places that pull `--iree-hip-target` from an `IREE_HIP_TARGET` environment variable, but that requires the tests to know about the env var for that specific backend and doesn't allow for passing extra flags.
One recent example is #373, with this code:
import os

# Compile flags and runtime device for the CPU configuration.
CPU_SETTINGS = {
    "device_flags": [
        "-iree-hal-target-backends=llvm-cpu",
        "--iree-llvmcpu-target-cpu=host",
    ],
    "device": "local-task",
}
# The HIP target comes from the environment, defaulting to gfx1100.
IREE_HIP_TARGET = os.environ.get("IREE_HIP_TARGET", "gfx1100")
gpu_settings = {
    "device_flags": [
        "-iree-hal-target-backends=rocm",
        f"--iree-hip-target={IREE_HIP_TARGET}",
    ],
    "device": "hip",
}
fyi @stbaione we might want to use this `--compile-flags` pattern for pytest options across more of the project, including that integration test.
One advantage to putting the flags in the test vs loading them from CLI flags is that the test itself can encode xfail behavior for each set of parameters. Hopefully we won't have that many xfails in shortfin 🤞
In yet another place, I have config files specify compile and run flags along with lists of xfails that are used to instantiate parameterized tests, like https://github.com/iree-org/iree/blob/main/tests/external/iree-test-suites/onnx_ops/onnx_ops_cpu_llvm_sync.json
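For readers unfamiliar with the pattern, here is a minimal sketch of how a `--compile-flags` style option could be wired into a `conftest.py`. It is not the code from this PR: `parse_compile_flags` and `compile_flags_from_config` are hypothetical helper names, and splitting the value with `shlex` is an assumption about how one quoted string becomes a list of flags. `pytest_addoption`, `parser.addoption`, and `config.getoption` are standard pytest hooks and methods.

```python
import shlex


def pytest_addoption(parser):
    """Standard pytest hook: register the command-line option."""
    parser.addoption(
        "--compile-flags",
        action="store",
        default="",
        help="Extra compiler flags, passed as a single quoted string.",
    )


def parse_compile_flags(raw):
    """Split the quoted option value into a list of individual flags.

    Hypothetical helper; shell-style splitting is an assumption.
    """
    return shlex.split(raw or "")


def compile_flags_from_config(config):
    """Read the option from a pytest config object and return a flag list."""
    return parse_compile_flags(config.getoption("--compile-flags"))
```

A fixture could then return `compile_flags_from_config(request.config)` so that tests which compile a binary receive the flags as a list, with no knowledge of backend-specific environment variables.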
This will get better. I'm going to poke through enough API so that the compile flags can be completely inferred. But that was too much for tonight (and in any case, you do want to be able to set them explicitly too).
Presumably then, too, if worse comes to worst, XFAIL could be inferred based on what device it actually is.
I don't anticipate having a lot of XFAIL'y stuff in shortfin. We should just be fixing things.