RFC: Promote riscv64gc-unknown-linux-gnu to Tier-1 (without host tools) #3707

@@ -0,0 +1,185 @@
- Feature Name: `promote-riscv64gc-unknown-linux-gnu-to-tier-1-without-host-tools`
- Start Date: 2024/10/03
- RFC PR: TODO
- Rust Issue: TODO

# Summary
[summary]: #summary

Promote the `riscv64gc-unknown-linux-gnu` Rust target to be the first Tier-1 (without host tools) platform.


# Motivation
[motivation]: #motivation

The `riscv64gc-unknown-linux-gnu` target is [currently a Tier 2 (with host tools) Rust target](https://forge.rust-lang.org/release/platform-support.html#tier-2), in accordance with the target tier policy [here](https://doc.rust-lang.org/nightly/rustc/platform-support.html).

Since the introduction of the target, its use has trended upward. Several operating system environments (Linux, FreeBSD, Android, NuttX) support RISC-V systems based on the `riscv64gc` ISA, and this number is increasing.

During discussions with users and partners, the [RISE project](https://riseproject.dev/) has received feedback that they would like to use Rust but are hesitant due to the target's Tier 2 status.

In the last 2 quarters, good progress has been made in understanding and filling the gaps that remain in the path to attaining [Tier 1 (without host tools)](https://doc.rust-lang.org/nightly/rustc/target-tier-policy.html#tier-1-target-policy) status for this target.
Comment on lines +19 to +21
Member

If this is truly the cause, then how do you explain how aarch64-apple-darwin had been tier 2 for several years and only very recently moved to tier 1, but people felt absolutely confident building on it (Zed comes to mind). That may be because there were other tier 1 aarch64 and darwin targets, but that only makes this problem worse, doesn't it? A tier 1 riscv target will be the first.


I will have to defer to more experienced folks to explain that.

Contributor

This is good feedback and should be clarified. Actual rustc quality and tier level are not necessarily the same. Particularly in tier 2, we see a wide variety of quality levels in practice. There are production-grade tier 2 targets and some that frequently have bugs.

However, having also done the arm tier 1 raise, we've observed that arm quality sharply increased by just having one target "up there". It means the target architecture gets care and will be improved to a level where it rarely fails CI. Tier 2 targets in the "same flock" benefit from that. I would assume something similar to happen to RISC-V.


As a direct result, those gaps have either already been filled or are very close to being filled.

As such, this RFC aims to demonstrate what has been done.

Please note that this RFC's authors are performing this work as part of the [RISE Project](https://wiki.riseproject.dev/display/HOME/Project+RP004%3A+Support+64-bit+RISC-V+Linux+port+of+Rust+to+Tier-1).


# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

Currently, users of the `riscv64gc-unknown-linux-gnu` target can add it to their local installation with:

```bash
rustup target add riscv64gc-unknown-linux-gnu
```

This is possible because `riscv64gc-unknown-linux-gnu` is a tier 2 target as described in the [Platform Support](https://doc.rust-lang.org/nightly/rustc/platform-support.html) document, and the Rust project produces official binaries of the host tools used on the target (e.g. `cargo`) and of the libraries used in binaries for the target (e.g. `std`).

These binaries are only "guaranteed to build," not "guaranteed to work" as they would be if the target were Tier 1. While these host tools and libraries are built, there is no promise that all (or any) of the tests pass.

This RFC seeks to demonstrate that libraries of the target are currently in a state where all tests are passing. It seeks to demonstrate that the target sufficiently fulfills the other criteria required to promote it to be the first Tier 1 (without host tools) target.

This RFC does not seek to demonstrate that `rustc`, `cargo`, or other host platform tools are passing all tests, or that they are suitable for tier promotion.


# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

The following is a point-by-point breakdown of [the Tier 1 Target Policy](https://doc.rust-lang.org/nightly/rustc/target-tier-policy.html#tier-1-target-policy) and the state of the `riscv64gc-unknown-linux-gnu` target with regard to it.

> 1.a. Tier 1 targets must have substantial, widespread interest within the
developer community, and must serve the ongoing needs of multiple production
users of Rust across multiple organizations or projects. These requirements
are subjective, and determined by consensus of the approving teams

It is generally fair to state that there is a clear upward trend in the use of `riscv64gc-unknown-linux-gnu` as a compile target. Several operating system environments (Linux, FreeBSD, Android, NuttX) support riscv64gc-based systems, and that number is growing.

One key production user of this compilation target is Google: Rust is used to implement several key Android subsystems. Here is a quote from Google, with permission:

> "Android has added support for RISC-V as a target as of Android 15, with the projected RVA23 profile as a baseline for Android 16. With Android's well-known reliance on Rust as a memory safe alternative to C/C++, it's critical to have RISC-V support at Tier-1." Lars Bergstrom (@larsbergstrom), Google.

The RISE project has received feedback from other users that they would like to use Rust, but they are hesitant due to the Tier 2 status.

> 1.b. The target maintainer team must include at least 3 developers.

There are currently 4 maintainers listed on [the target's platform page](https://doc.rust-lang.org/nightly/rustc/platform-support/riscv64gc-unknown-linux-gnu.html): Kito Cheng, Michael Maitland, Robin Randhawa, and Craig Topper.

> 1.c. The target must build and pass tests reliably in CI, for all components that Rust's CI considers mandatory.

In https://github.com/rust-lang/rust/pull/126641 the `riscv64gc-gnu` job will be enabled in bors pre-merge tests. Those tests have passed for several months and the PR has no current blockers.

There are a few ignored tests on the platform:
- `tests/codegen/call-llvm-intrinsics.rs`: Covered by `tests/codegen/riscv-abi/call-llvm-intrinsics.rs` instead.
- `tests/codegen/catch-unwind.rs`: The closure is another function, placed before fn foo so CHECK can't find it.
- `tests/codegen/repr/transparent.rs`: Ignored because RISC-V has an i128 type used with `test_Vector`.
- `tests/run-make/inaccessible-temp-dir/`: Ignored because the test container runs as root and the test cannot create a directory it cannot access. (This issue is also present in arm test containers)
- `tests/run-make/rustdoc-io-error/rmake.rs`: Ignored for the same reason as `inaccessible-temp-dir` above.
- `tests/run-make/split-debuginfo/`: On this platform only `-Csplit-debuginfo=off` is supported, see [#120518](https://github.com/rust-lang/rust/pull/120518).
- `tests/ui/debuginfo/debuginfo-emit-llvm-ir-and-split-debuginfo.rs`: On this platform `-Csplit-debuginfo=unpacked` is unstable, see [#120518](https://github.com/rust-lang/rust/pull/120518).
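
A list like the one above can be cross-checked against a checkout of the compiler repository by searching for compiletest ignore directives. The following is a minimal sketch, assuming the directive is spelled `//@ ignore-riscv64` (run-make `Makefile`s or older tests may use a different comment syntax); `list_ignored_tests` is a hypothetical helper, not something the repository provides.

```bash
# Hypothetical helper: list files under a test root that carry a compiletest
# directive of the form `//@ <directive>`.
# Assumption: the riscv64 directive is spelled `ignore-riscv64`.
list_ignored_tests() {
    local root="${1:-tests}"
    local directive="${2:-ignore-riscv64}"
    # -r: recurse, -l: print only names of matching files, -E: extended regex.
    # The directive must be followed by whitespace, a colon, or end of line,
    # so that longer directive names are not matched by accident.
    grep -rlE "^[[:space:]]*//@[[:space:]]*${directive}([[:space:]:]|\$)" "$root" | sort
}
```

For example, `list_ignored_tests tests/codegen` would print the codegen tests that carry the directive, one path per line.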


> 1.d. The target must provide as much of the Rust standard library as is feasible
and appropriate to provide. For instance, if the target can support dynamic
memory allocation, it must provide an implementation of `alloc` and the
associated data structures.

`alloc` is implemented. There is currently no specific `std` functionality disabled for `riscv64gc-unknown-linux-gnu`.

> 1.e. Building the target and running the testsuite for the target must not take
substantially longer than other targets, and should not substantially raise
the maintenance burden of the CI infrastructure.

Running the `riscv64gc-gnu` job from scratch takes approximately 73 minutes on CI, less than the `i686-gnu` job (78 minutes) or the `x86_64-gnu` job (93 minutes). It's fair to conclude that this proposal would not substantially lengthen CI jobs.
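
The comparison above can be restated as percentages. This is a small sketch that uses only the durations quoted in this section; `percent_of` is an illustrative helper name, and the results are truncated by integer division.

```bash
# Durations in minutes, as quoted in this section.
riscv=73
i686=78
x86_64=93

# Illustrative helper: first argument as a whole-number percentage of the second.
percent_of() {
    echo $(( $1 * 100 / $2 ))
}

echo "RISC-V job vs i686-gnu:   $(percent_of "$riscv" "$i686")%"    # 93%
echo "RISC-V job vs x86_64-gnu: $(percent_of "$riscv" "$x86_64")%"  # 78%
```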

The existing `riscv64gc-gnu` test job is nearly identical to the `armhf-gnu` job and works as expected in existing processes. Emulating `riscv64gc-unknown-linux-gnu` can be done using standard tools such as `qemu`, `docker`, or `lima`, just as for other platforms such as `aarch64-unknown-linux-gnu` or `x86_64-unknown-linux-gnu`. It's fair to conclude that this proposal would not substantially raise the maintenance burden of the CI infrastructure.

> 1.f. If running the testsuite requires additional infrastructure (such as physical
systems running the target), the target maintainers must arrange to provide
such resources to the Rust project, to the satisfaction and approval of the
Rust infrastructure team.

Running the test suite does not require physical systems running the target. Emulating `riscv64gc-unknown-linux-gnu` can be done using normal tools like `qemu`, `docker`, or `lima` like other platforms such as `aarch64-unknown-linux-gnu` or `x86_64-unknown-linux-gnu`.

An emulated or real `riscv64gc-unknown-linux-gnu` can make use of the existing tier 2 host tools, or self-bootstrap in the event the host system cannot cross compile the appropriate artifacts to run the necessary tests.
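
As an illustration of emulation from an ordinary crate (not the compiler's own test suite or the CI job), the sketch below configures Cargo to link with a cross toolchain and to run test binaries under QEMU user-mode emulation. The linker name `riscv64-linux-gnu-gcc`, the sysroot path `/usr/riscv64-linux-gnu`, and the helper name `riscv_test` are assumptions based on a Debian/Ubuntu-style host; adjust them for other setups.

```bash
# Sketch: cross-build and run a crate's tests for the target under QEMU
# user-mode emulation.
# Assumptions (not from the RFC): the cross toolchain provides
# `riscv64-linux-gnu-gcc`, its sysroot lives at /usr/riscv64-linux-gnu,
# and `qemu-riscv64` is installed on the host.
TARGET=riscv64gc-unknown-linux-gnu

# Cargo reads per-target settings from CARGO_TARGET_<TRIPLE>_* variables,
# where <TRIPLE> is the target name upper-cased with '-' replaced by '_'.
export CARGO_TARGET_RISCV64GC_UNKNOWN_LINUX_GNU_LINKER=riscv64-linux-gnu-gcc
export CARGO_TARGET_RISCV64GC_UNKNOWN_LINUX_GNU_RUNNER="qemu-riscv64 -L /usr/riscv64-linux-gnu"

# Hypothetical helper: run the current crate's tests for the target; Cargo
# launches each test binary through the runner configured above.
riscv_test() {
    cargo test --target "$TARGET" "$@"
}
```

After `rustup target add riscv64gc-unknown-linux-gnu`, calling `riscv_test` from a crate directory would build the tests for the target and execute them through `qemu-riscv64`.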


> 1.g. Tier 1 targets must not have a hard requirement for signed, verified, or
otherwise "approved" binaries. Developers must be able to build, run, and
test binaries for the target on systems they control, or provide such
binaries for others to run. (Doing so may require enabling some appropriate
"developer mode" on such systems, but must not require the payment of any
additional fee or other consideration, or agreement to any onerous legal
agreements.)

No hard requirement of signing, verifying, or "approving" binaries exists for the `riscv64gc-unknown-linux-gnu` platform.

> 2.a. The long term viability of the existence of a target specific ecosystem should be clear.

RISC-V has a roughly 9-year history and there are a variety of vendors providing silicon using this instruction set. They include (but are not limited to) [Alibaba Cloud](https://www.alibabagroup.com/), [Allwinner](http://www.allwinnertech.com/), [Antmicro](http://antmicro.com/), [BeagleBoard](https://beagleboard.org/), [Deep Computing](https://deepcomputing.io/), [Microchip](https://www.microchip.com/), [RIOS](http://rioslab.org/), [SiFive](https://sifive.com/), [SOPHGO](https://en.sophgo.com/site/index.html), and [StarFive](https://starfivetech.com/).

There is already an existing ecosystem of downstream users of this target. [Debian](https://wiki.debian.org/Ports/riscv64), [Ubuntu](https://ubuntu.com/download/risc-v), and [openSUSE](https://en.opensuse.org/openSUSE:RISC-V) all provide `riscv64` distributions of Linux and also package Rust. [Scaleway](https://labs.scaleway.com/en/em-rv1/) is offering `riscv64gc-unknown-linux-gnu` cloud instances.

Some of the ongoing development of this target has been supported by the [RISE Project](https://riseproject.dev/) which represents a broad array of industrial interests including, for example, Google, Intel, NVIDIA, and SiFive.

It is fair to say that the target specific ecosystem has been a viable target for some time now, and that this is likely to continue into the long term.

> 2.b. The long term viability of supporting the target should be clear.

It is hard to concretely quantify this aspect. This work was initiated and supported by the [RISE Project](https://riseproject.dev/) and there is an intention to continue to support the target and eventually propose a *Tier 1 with Host Tools* RFC when sufficiently fast hardware exists.

# Drawbacks
[drawbacks]: #drawbacks

Adopting the platform would require additional commitments by the Rust project. Future contributions may impact the target and cause changes to become delayed or halted entirely due to problems on the target.

In general, it should be uncomplicated for contributors to build for and use a `riscv64gc-unknown-linux-gnu` emulator like `qemu`, `docker`, or `lima`. Additionally, the platform is a `*-unknown-linux-gnu` target, which is generally quite well understood, so contributors do not need to learn what could otherwise be an unfamiliar operating system.

This target does not place significant burdens on the project that would not be present on any other target.
Member

I am not so sure.

Everyone will be expected to be able to easily contribute to maintaining RISCV targets, because all PRs are blocked on a failure for a tier 1 target's tests. Having to set up QEMU, etc. at all in order to debug a contribution failing on RISCV will be a significant and novel additional burden for many contributors, above and beyond what many will expect for contributions, and may easily block PRs. The reason the tier 1 targets that exist now are acceptable (and there is argument that some are rapidly becoming not!) is because it is relatively easy to obtain access to these machines by sheer dint of their commonality, and they usually are okay at performance. The entire reason this proposal exists in its current form is because this target, however, does not fulfill either criterion.

Yes, it is "merely software", but in this case, running the test suite has to be made as turnkey as possible: none of the existing test infra that runs in CI should be taken as "good enough". No other tier 1 target will have to always be emulated in order to effectively run its test suite.

Fortunately, there has been work on making testing easier for e.g. better testing the wasm targets, by supporting a "runtool" in our test infrastructure. This makes it possible to simply use `x.py test --target wasm32-wasip1` if one sets the `WASI_SDK_PATH`. So, it should be relatively easy to make testing this target from a different machine host about as simple as `./x.py test --target riscv64gc-unknown-linux-gnu`, and I would expect that before this is promoted to tier 1.

Member

> In general, it should be uncomplicated for contributors to build for and use a riscv64gc-unknown-linux-gnu emulator like qemu, docker, or lima.

+1 to what Jubilee said. If as a compiler contributor I have to build an emulator to run the test suite (especially to bless tests that fail in PR CI or full CI) for riscv specific tests or revisions, then that is a very significant burden. For current Tier 1 targets I'm lucky enough to have access to both Windows and Linux via dev-desktop and I don't need to build an emulator for those targets. Even apple-specific failures can already be a pain. I don't want to have to litter `//@ ignore-riscv` into our test suites more than there already exists if there are test failures that are blocking PRs but neither the PR author nor reviewer can bless the test easily.

I have to also note that many of our tests are `//@ ignore-cross-compile` and are not exercised on the cross-compiled target when running tests for various reasons. This means that if a Tier 1 target is emulated it will likely receive a lot less test coverage than you might expect for a Tier 1 target.

Member

@the8472 Oct 10, 2024

> Having to set up QEMU, etc. at all in order to debug a contribution failing on RISCV will be a significant and novel additional burden for many contributors

I think of apple OSes as a far worse target than that. There isn't even emulation available. For windows microsoft offers VMs. For other CPUs there's QEMU at least. Apple is worse than all of that.

Despite that I don't have to care about apple because there are dedicated target maintainers. I'd expect a new tier 1 target to get the same level of care from its dedicated maintainers, taking the burden off everyone else.

That said, good documentation and anything that would streamline the emulation setup and testing would be welcome.

Member

@kennytm Oct 10, 2024

> I think of apple OSes as a far worse target than that. There isn't even emulation available. For windows microsoft offers VMs. For other CPUs there's QEMU at least. Apple is worse than all of that.

True, you can't (legally) emulate the Apple targets. On the other hand, the user base developing for an Apple platform is huge enough that, even if you (the contributor) don't have access it is easy to find a collaborator knowing how a Mac or iPhone work to help. Same story for Windows and Linux. The point is not how easy to emulate/simulate the target, but how easy for an average contributor to understand the target-specific issues.

(Personally, without checking the maintenance status, I find it surprising that RISC-V can be promoted before WASM32.)

Member

@workingjubilee Oct 10, 2024

Yes, what I was eliding here is that e.g. common RISCV hardware doesn't work with common Linux distros because of insufficient upstreaming of necessary kernel patches. Often, the manufacturer doesn't even ship a sufficiently updated kernel! So if a contributor DOES get a RISCV device and sets it up, they may have trouble running rustc on it. Meanwhile, Apple devices at least give me the courtesy of booting and running rustup out-of-the-box.


@workingjubilee, we'll get back ASAP with the exact host targets we've used. I would be surprised if the qemu-system-riscv64 invocation from any Tier-1 host would be any different but yes, the fundamental invocation needs to be known. I would assume that that would be reasonably discernible from the CI report itself when a failure occurs and therefore be easily invokable 'locally' but we'll ensure that that gets suitable coverage, either in the RFC text itself or some suitable proxy.

Your point is taken all the same (although speaking for myself I wouldn't use some of the terms you've chosen to make your point but that's just me).

Thanks all the same and please stand by.


> riscv64gc-unknown-linux-gnu from an x86_64-unknown-linux-gnu host?

I've run `DEPLOY=1 ./src/ci/docker/run.sh riscv64gc-gnu` (which is exactly what the CI uses) from aarch64-darwin and x86_64-unknown-linux-gnu several times over the last months. This uses the qemu support Rust has already existing.

I've not run the tests from a Windows host.

The experience does leave some things to be desired, for example, it's a bit hard to run a specific test filter using `run.sh`, and there are quite a few container build steps. I think it's fair to say this process is not super obvious and may provide barriers to contributors, I also think this problem is somewhat shared with other targets (such as armhf).

Locally, most of my RISC-V development happens within a lima VM. All tests pass even using RISC-V host tools here. However this is not the most common tool. It's (almost annoyingly) fairly easy to 'just get' a RISC-V container with the tool.

Running a RISC-V VM in QEMU is generically documented, for example, here. Users would need to have (as it documents) `opensbi` and some other usual QEMU dependencies. I believe a user could set `target.riscv64gc-unknown-linux-gnu.qemu-rootfs` in their `config.toml`, and have it work via `x.py`... if they had everything set up correctly. (The problem is: getting there)

I believe if a user has docker and binfmt configured correctly (sadly, I lack a Linux host with binfmt support) it should be possible to run `docker run --platform linux/riscv64 --rm -ti docker.io/riscv64/ubuntu:24.04` and get a RISC-V container.

It's, unfortunately, not quite the `./x.py test --target "${TARGET}"` we discussed recently.


While I haven't tested `docker run --platform linux/riscv64 --rm -ti docker.io/riscv64/ubuntu:24.04` exactly, I have made use of binfmt via a [toolbox](https://docs.fedoraproject.org/en-US/fedora-silverblue/toolbox/) container, which required only minimal setup for binfmt to work for RISC-V, all from an x86_64-unknown-linux-gnu host.

My usual testing for RISC-V over the past few months has used a RISC-V VM running Ubuntu, which I then test against via rustc's remote testing; I've found this to be the most pain-free approach for me. It requires only a disk image and a readily available QEMU command, both of which can be found via Ubuntu's wiki.

I'll also note that I'm running Fedora Linux, and despite running a lot of this via ubuntu, it's an almost entirely seamless experience.

Member

All of that could also be done for many of the other tier 2 targets, including arm-unknown-linux-gnueabihf that is already at the same level of CI testing that is being proposed for RISC-V, where armhf-gnu runs library tests through qemu. So to risk a slippery-slope argument, I'm not sure why we would promote RISC-V and not ARM, or potentially many others.

Member

> So to risk a slippery-slope argument, I'm not sure why we would promote RISC-V and not ARM, or potentially many others.

well riscv64gc-unknown-linux-gnu does have 4 maintainers listed in https://doc.rust-lang.org/nightly/rustc/platform-support/riscv64gc-unknown-linux-gnu.html proving the "The target maintainer team must include at least 3 developers" requirement, while arm-unknown-linux-gnueabihf does not even have a target-specific doc.


# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

There exist two alternatives: Promoting the target to Tier 1 (with host tools), or not promoting the target at all.

## Tier 1 (with host tools)

The [Tier 1 Target Tier Policy](https://doc.rust-lang.org/nightly/rustc/target-tier-policy.html#tier-1-target-policy) section 1.e states:

> 1.e. Building the target and running the testsuite for the target must not take
> substantially longer than other targets...

During testing on [Scaleway Elastic Metal RV1](https://labs.scaleway.com/en/em-rv1/) it was determined that running a full `x.py test` run takes roughly 6 hours. A similar amount of time was taken on a 6 CPU, 16GB RAM VM.

During this testing, it was noted that all tests pass.

It's fair to conclude that the virtualization and hardware currently available for `riscv64gc-unknown-linux-gnu` take substantially longer to run the test suite than other targets. If sufficiently fast hardware existed, this RFC would be for Tier-1 with host tools instead.

## Not promoting the target

Not promoting the target could lead to a situation where the `riscv64gc-unknown-linux-gnu` tests are no longer passing, and this could impact users.

Anecdotally, not having the Tier 1 'badge' has been seen to become an obstacle to increasing mindshare in Rust for this target. Organisations tend to associate a Tier 1 categorisation with better quality, suitability for key projects, longevity etc. With a reasonably justified Tier 1 'badge' in place, the likelihood is that such organisations will tend to pick up and promote the use of Rust in production.
Member

This should also discuss the option of having a vendor other than the rust-lang org provide the equivalent of Tier 1 support, e.g. by running CI externally and communicating equivalence properly.

That way even host tools could be tested without the infra burden of maintaining custom github runners.

Member

That roughly happens with additional arches on Linux distros already.


Because of this, not proceeding with promoting `riscv64gc-unknown-linux-gnu` to Tier 1 could result in a degradation of the state of the platform and impact users.

# Prior art
[prior-art]: #prior-art

There are currently no Tier 1 (without host tools) targets, so existing Tier 1 targets represent the closest prior-art. In addition, no RISC-V based target has ever been promoted to Tier 1 (with or without host tools).

Therefore, the `riscv64gc-unknown-linux-gnu` target is in somewhat uncharted territory.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

No unresolved questions or issues remain.
Member

@workingjubilee Oct 3, 2024

I have a few:

- What is RISE's position about Qualcomm's proposal that the C extension... part of the proposed target definition... be dispreferred next to a different way of handling instruction packeting? Perhaps not all backers of RISE agree, but RISE Project does have Qualcomm as a member, so...
- In general, the RISCV spec still seems to have evolutionary growing pains, like Zicsr and Zifencei. What'll the future hold? Are we sure that this is actually going to be the target everyone's going to want us to have moved to tier 1 even 5 years from now?


Good points, again. Thanks.

For the first one: RISE's position matches the decision made by RVIA which is that the C extension is mandatory for RVA profiles. Some vendors may continue to disagree and that is just fine - which is the very point of an open ISA. However the majority opinion stands. We are well beyond any ripples as a result. Major OS vendors et al have internalised this position and that is the way things have moved.

For the second: The ratification ready RVA23 and RVB23 profiles fundamentally aim to promote the evolution of new extensions while drawing a line in the sand for those extension 'collections' that are deemed mandatory to service specific market vertical requirements (mobile, datacenter, et al).

The evolutionary growing pains you correctly allude to are in fact encouraged so long as the base mandate stands, which is the basis of commitments made by Google et al in so far as Android goes.

My personal take - FWIW - is that a lot of thought has been put into the definition of these profiles with robust discussion with a very broad and diverse collection of industry reps - a luxury that the incumbent ISAs do not have - and I think that this shall result in net goodness.

Happy to discuss further of course.


# Future possibilities
[future-possibilities]: #future-possibilities

As the first non-i686/x86_64/aarch64 target to be considered for promotion to Tier-1, the `riscv64gc-unknown-linux-gnu` target will likely set a precedent for other `riscv*` targets to follow in the future.

As the first Tier 1 (without Host Tools) target, the `riscv64gc-unknown-linux-gnu` target will likely set a precedent for other Tier 1 (without host tools) targets to follow in the future.