[RFC] Static Function Argument Unpacking #3723
### Type Coercions of Collections

If the collection being unpacked is a reference to the collection type, whether argument unpacking works depends on whether accessing it directly with the field access expression (`.idx`, or `[idx]`) would work at compile time. If it does, then argument unpacking works. (For reference, see [`std::ops::Deref`](https://doc.rust-lang.org/std/ops/trait.Deref.html) and [type coercions](https://doc.rust-lang.org/reference/type-coercions.html).)
Earlier it is stated that "Tuples, tuple structs, and fixed-size arrays can be unpacked.", but this seems to indicate that anything implementing `Index<Int>` could be unpacked. I think tuples definitely make sense and fixed-length `[T; N]` arrays could make sense, but anything involving `Index` should be out of scope for now because it turns the complexity up quite a bit.
It is pretty easy to turn slice-like collections into fixed-length arrays anyway; the user can handle that if they need this support.
On a similar note, it is probably worth mentioning type inference. For example:
fn foo(a: u8, b: u8, c: u8, d: u8, e: u8) { /* ... */ }
fn bar(buf: &[u8]) {
foo(...buf.try_into().unwrap());
}
Is that expected to work because `&[T; 5]` has `TryFrom<&[T]>` and `u8` is `Copy`? What about
fn bar(buf: &[u8], other: &[u8]) {
foo(1, ...buf.try_into().unwrap(), ...other.try_into().unwrap());
}
That probably needs to be forbidden as ambiguous, but that is something that should be spelled out here with some T-Types input.
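For comparison, here is a minimal sketch of how the two-slice call has to be written in current Rust. It assumes a hypothetical `foo` that returns a sum (only so the result can be checked) and a hypothetical 2 + 2 split between the slices. The explicit `[u8; 2]` annotations are exactly the information that `...buf.try_into().unwrap()` alone would not supply, which is why the double-spread form looks ambiguous.

```rust
use std::convert::TryInto;

// Hypothetical stand-in for the `foo` above; it returns a sum only so
// that the result can be checked.
fn foo(a: u8, b: u8, c: u8, d: u8, e: u8) -> u32 {
    [a, b, c, d, e].iter().map(|&x| u32::from(x)).sum()
}

// Current Rust: the array lengths must be spelled out, because
// `try_into()` alone cannot tell how many of the four remaining
// parameter slots each slice should fill.
fn bar(buf: &[u8], other: &[u8]) -> u32 {
    let [b, c]: [u8; 2] = buf.try_into().unwrap();
    let [d, e]: [u8; 2] = other.try_into().unwrap();
    foo(1, b, c, d, e)
}
```

Calling `bar(&[2, 3], &[4, 5])` passes `1, 2, 3, 4, 5` to `foo`; a slice of any other length makes the `unwrap()` panic at runtime.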
Good catch – I seem to have written this part about accessing the fields in an ambiguous and self-contradictory way. I'll need to make some changes and also study the problem a bit more.
On the whole, I'm aiming at a design that's infallible during compilation. But I think I'll need to look into if that claim can actually be made after all, i.e. if some seemingly infallible cases wouldn't be totally infallible, but only just as infallible as they currently are. (Leading into a runtime panic upon failure?)
I'll definitely need to add a subchapter on type inference next to the referred "Type Coercions of Collections" subchapter. 👍🏼
4. All of the items inside the collection are unpacked.
   - For example, attempting to unpack a thousand-element array just to pass the first two elements as arguments to a function taking two parameters seems like a mistake that should be explicitly prevented.
   - Consequently, there must be at least as many unfilled parameter slots left in the function call as there are items inside the collection being unpacked. If there are *N* items in the collection being unpacked, the immediately next *N* parameter slots in the function call are filled with the collection's items as the arguments.
5. Minimum of one element/field is required in the collection being unpacked.
If this RFC is intended to help push variadic generics along, we need to allow unpacking zero-element things, since it is necessary for stuff like defining variadic `Fn`:
struct CallLocked<F>(Mutex<F>);
// this should work for F: FnMut() -> u8 and that needs unpacking zero-element things
impl<F: FnMut(...Args) -> R, ...Args, R> Fn(...Args) -> R for CallLocked<F> {
fn call(&self, ...args: ...Args) -> R {
self.0.lock().unwrap()(...args)
}
}
I'm a big fan of not special-casing zero things in the language semantics. That seems especially important for being able to unpack arrays of const-generic length.
(We should probably lint on sketchy things like `let x = [...v.clear()];`, like we lint on `let x = v.clear();` -- or `clippy::let_unit_value` does, at least -- but the language should allow it for all lengths.)
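The const-generic point can be illustrated in current Rust. A function generic over the array length needs no special case for `N = 0`, so an unpacking rule that rejected empty collections would be the odd one out. A minimal sketch, where `sum_all` is a hypothetical helper:

```rust
// Hypothetical helper: generic over the array length, with no special
// handling for the empty array.
fn sum_all<const N: usize>(xs: [u32; N]) -> u32 {
    xs.iter().sum()
}
```

Both `sum_all([1, 2, 3])` and `sum_all([])` type-check; the second simply sums nothing.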
Noted, these arguments sound reasonable to me. Initially, I leaned towards starting with a strict design, since it's easier to relax rather than add constraints later on, but that's because I couldn't come up with a reason why zero-length collections could ever be unpacked. :)
In light of this, I'll make the following two changes to the RFC:
- Zero-length collections won't be errors anymore. Instead, in these cases, the argument unpacking syntax will just desugar into nothingness – i.e. no arguments are unpacked. This corner case will be explained in the text.
- The error diagnostic for this will be changed into a lint.
I've updated the RFC based on the above comments in commit 395b6e4.
I think this design is now better and nicely in line with the guiding principles I had set, specifically "Compatibility with other features" and "Avoiding ambiguity with simple rules and by requiring explicit control by the user (developer)". :)
This isn't exactly variadics but it is related. Cc @Jules-Bertholet who I believe has done some design work in that area. (Some recent discussion in that area: https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/Variadic.20generics.20experiment)

The ellipsis symbol composed from three consecutive ASCII dot characters is used in the "et cetera" or "and so on" sense in many design documents and code examples. Giving it an actual syntactical meaning could lead to some confusion or readability issues. Preferring `…`, i.e. the Unicode character U+2026, Horizontal Ellipsis, in those places could help.

# Rationale and Alternatives
One thing I'd like to see discussed is why this is limited to "functions, methods, and closures".
Why not also allow other things like array literals as well?
let a = [1, 2, 3];
let b = [7, 8, 9];
let c = [...a, ...b];
assert_eq!(c, [1, 2, 3, 7, 8, 9]);
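For reference, a sketch of what `[...a, ...b]` has to be spelled as in current stable Rust. `concat3` is a hypothetical helper fixed to two three-element arrays; it is not part of the RFC.

```rust
// Hypothetical helper: destructure both arrays and rebuild the result
// element by element, which is what `[...a, ...b]` would abbreviate.
fn concat3(a: [i32; 3], b: [i32; 3]) -> [i32; 6] {
    let [a0, a1, a2] = a;
    let [b0, b1, b2] = b;
    [a0, a1, a2, b0, b1, b2]
}
```

The helper only works for one pair of lengths, which is part of the appeal of a general unpacking syntax in array literals.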
Or for extra fun,
let c = [...*b"hello world", b'\0'];
There's a bit about this under "Future Possibilities" subchapter "Unpacking in Fixed-Size Array and Tuple Literals", but I should make it more general and expand it to cover additional cases as well – such as that char array example you gave.
Ultimately, I'd love to see unpacking expanded thus.
The reason I left it out of scope for now is that, presumably, non-trivial additional work would be required to think over the possible interactions with other features and explore the design space thoroughly enough.
My preference would be to postpone expansion of the feature this way into a future RFC, but I'm not strongly opposed to including it in this one. I think postponing would also have the benefit, that – assuming this current RFC gets accepted – the RFC would be more lightweight. Either way, maybe I should elaborate on the reason for omission from the RFC.
What's your view on this – include now or have a separate RFC?
I added further elaboration on the subject of unpacking within collection literals in commit 5033917. There's an IRLO thread from the end of 2020 with a pre-RFC on array expansion syntax by nwn that pretty much covers this idea, but would need synchronization on the selected syntax. (I wonder if they'd like to continue on this by themselves or work together to write a new version?)
I've also written a bit more about the limited scope into the RFC in commit 3a2f450. FWIW, I'd be happy to write full RFCs for some of the listed future possibilities expanding on this proposal, provided that this one has a good chance of ultimately getting accepted. :)
Some general thoughts on "minimum viable RFCs", somewhat offtopic maybe, take as you will
As I see it, planning, writing, sharing, and discussing major initiatives such as this RFC requires a non-trivial investment of time – don't get me wrong, so far I'm enjoying this. This is why I think suggesting a bite-sized proposal at a time to gauge interest works well from an RFC author's perspective and decreases the risk of burnout or sadness stemming from seeing the work not receiving interest or getting rejected (or at worst, getting ridiculed, but I'm sure the risks for that in the Rust community are very low to begin with).
One possibility for a way forward that just occurred to me is landing these connected features (with accepted RFCs) into nightly one by one, but stabilizing them in one batch. This way, users of stable Rust would benefit from the features without any of them feeling unfinished.
A place where I found myself wanting something similar is when passing function pointers instead of closures:
// Compiles
[10].into_iter().map(usize::count_ones);
// Does not
std::iter::zip([10], [11]).map(usize::min);
The second iterator will error by stating that this
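A sketch of how the second case is usually written today: a closure destructures each tuple by hand before calling the two-parameter function. The wrapper functions `ones` and `mins` are hypothetical and exist only to make the snippet checkable; `Vec` inputs are used so the snippet does not depend on the edition's array `into_iter` behaviour.

```rust
// Compiles: each item is a plain `usize`, so the method path can be
// passed directly.
fn ones() -> Vec<u32> {
    vec![10usize].into_iter().map(usize::count_ones).collect()
}

// The zipped iterator yields `(usize, usize)` tuples, so a closure has
// to take the tuple apart before calling the two-parameter function.
fn mins() -> Vec<usize> {
    std::iter::zip(vec![10usize], vec![11usize])
        .map(|(a, b)| usize::min(a, b))
        .collect()
}
```

The closure `|(a, b)| usize::min(a, b)` is the boilerplate that the comment above would like to avoid.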
When we get variadic generics, you could probably just have:
pub trait Iterator {
    // assumes variadic generics are built on tuples
    fn splatted_map<F: FnMut(...Self::Item) -> R, R>(self, f: F) -> SplattedMap<Self, F>
    where
        Self::Item: Tuple,
    {
        todo!()
    }
    // ... all existing trait methods
}
This papercut is definitely related, and I've collected some links related to the problem under the "Prior Art" subchapter "Using Tuples in Place of Argument Lists" (possibly the Zulip thread there was started by you?). Unfortunately, I couldn't come up with a nice design leveraging the syntax proposed here to address this.
Are you suggesting this as a change to
yeah, that Zulip thread was me!
That would sadly not work as a fix then, since this papercut is not
I don't see why we can't add
I'd call this a workaround, not a fix, because it doesn't really address the issue itself. All in all, this is a papercut, and on its own I don't think it is enough justification for adding new dedicated methods, especially when it could feasibly be addressed by the language itself/some other change later on.
The same ellipsis syntax with a very similar meaning could be adopted to defining fixed-size arrays and tuple literals as well. For example:
```rust
const CDE: [char; 3] = ['C', 'D', 'E'];
const ABCDEFG1: [char; 7] = ['A', 'B', ...CDE, 'F', 'G'];
```
If this is added, then I’d expect for the syntax to be supported in `vec!` as well (eventually, if not at the same time as in arrays).
True. Using `vec![]` is a good example I should add to the subchapter "Unpacking Arguments for Macro Invocations" as a synergy for unpacking in arrays (or the other way around).
I've now added a mention of `vec![]` use under "Unpacking Arguments for Macro Invocations" in commit e7f696c.
While it is not spelled out, I suppose unpacking
Example 1 with variadic function call:
use std::ffi::c_char;
unsafe extern "C" {
unsafe fn printf(fmt: *const c_char, ...);
}
fn main() {
let expr = (1, 2, 3);
unsafe {
printf(c"%d %d %d\n".as_ptr(), ...expr);
}
}
Example 2 with overloaded unboxed closure:
#![feature(fn_traits, unboxed_closures)]
#[derive(Copy, Clone)]
struct F;
impl FnOnce<(u8, u8)> for F {
type Output = ();
extern "rust-call" fn call_once(self, args: (u8, u8)) {
println!("2 args: {args:?}");
}
}
impl FnOnce<(u8, u8, u8)> for F {
type Output = ();
extern "rust-call" fn call_once(self, args: (u8, u8, u8)) {
println!("3 args: {args:?}");
}
}
fn main() {
let f = F;
f(1, 2);
f(3, 4, 5);
let expr = (6, 7);
f(...expr);
}
Because I've seen proposal above suggesting
// error[E0284]: type annotations needed
printf(c"%d %d %d\n".as_ptr(), ...buf.try_into().unwrap());
// error[E0284]: type annotations needed
f(...buf.try_into().unwrap());
Yes, I think so. I hadn't thought of this, but it makes sense. Thanks for the examples – I'll need to do some further thinking w.r.t. calling variadic functions and update the text accordingly!
It should be noted that, if you can unpack into a closure, you can also unpack into a tuple struct, since tuple structs implement the
struct Foo(i32, i32);
Foo(...tup2)
It would also make sense to allow unpacking into tuples:
let tup2 = (1, 2);
let tup3 = (...tup2, 3);
Also note that array/slice patterns already have a counterpart to unpacking:
match some_slice {
    [1, 2, rest @ ..] => {}
}
So we could use the
The rest pattern is just
## Unpacking Arguments for Macro Invocations

Macros, callable with the `macro_name!(…)`, `macro_name![…]`, or `macro_name!{…}` syntax have been omitted from the scope of this proposal. The reason for omission is the time concerns related to doing the due diligence researching the design. For example, some macros (e.g. `println!`) accept an indefinite number of arguments.
Why is handling macro invocation even considered in this RFC lol. The input to a macro always sees `...expr` as two tokens and it should never change.
macro_rules! backward_compatiblity_check {
($($t:tt)*) => { println!(concat!($("<", stringify!($t), "> "),*)) }
}
fn main() {
backward_compatiblity_check![1, ...expr, 2];
// <1> <,> <...> <expr> <,> <2>
}
If `vec![]` or `println!()` want to support unpacking, it is the responsibility of the implementation of those macros themselves individually.
You're being nitpicky, but I updated my comment. The idea is to use
I was not being nitpicky. Perhaps you could clarify what you mean by this:
I understand this RFC focuses on function calls, however I do appreciate the shout-out at the end to Functional Record Update. Specifically, a natural extension of Functional Record Update would be to allow unpacking a different struct, with public fields which happen to match, something like:
struct Large { a: i8, b: u16, c: i32, d: u64 }
struct Small { b: u16, d: u64 }
fn foo(small: Small) -> Large {
    Large { a: 8, c: 32, ...small }
}
In which case indeed the same "unpack" syntax should be used (deprecating
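For contrast, a sketch of what that `foo` has to look like in current Rust. Functional Record Update (`..small`) only accepts a base expression of the same struct type, so the matching fields are copied one by one. The struct definitions repeat the ones above with `d: u64` in both.

```rust
struct Large { a: i8, b: u16, c: i32, d: u64 }
struct Small { b: u16, d: u64 }

// Current Rust: `..small` is rejected here because `Small` is not
// `Large`, so each matching field is named explicitly.
fn foo(small: Small) -> Large {
    Large { a: 8, c: 32, b: small.b, d: small.d }
}
```

With more shared fields, the explicit list grows accordingly, which is the repetition the suggested extension would remove.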
const CONST_NUMBER: u8 = 42;

fn ret_one_refarg() -> &'static u8 {
    &CONST_NUMBER
nit: constants aren't statics. Referencing a constant is completely identical to writing `&42`.
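The distinction can be sketched as follows. `STATIC_NUMBER` and the three function names are hypothetical additions for illustration; only `CONST_NUMBER` comes from the RFC text.

```rust
const CONST_NUMBER: u8 = 42;
static STATIC_NUMBER: u8 = 42;

// A constant is inlined at each use, so `&CONST_NUMBER` borrows a
// promoted temporary, just like `&42` does.
fn ret_const_ref() -> &'static u8 {
    &CONST_NUMBER
}

fn ret_literal_ref() -> &'static u8 {
    &42
}

// A static, by contrast, names one fixed memory location.
fn ret_static_ref() -> &'static u8 {
    &STATIC_NUMBER
}
```

All three compile and return a reference to `42`, but only `ret_static_ref` is guaranteed to return the same address on every call.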
### Generic Parameters

When function parameters are generic, using `<T>`, `impl` or `dyn`, exactly the same should happen as when the arguments are passed by hand. I.e., the argument's type must be *compatible* with the parameter's type, just as when entering that argument manually with a field access expression.
nit: `dyn` has nothing to do with generics.
What about generic arguments? Can I write `fn foo<T>(t: T) { bar(...t) }` then call `foo((1, 2, 3))`?
Rust's generics aren't checked post-monomorphization (as in C++ or D), so no, `fn foo<T>(t: T) { bar(...t) }` should not compile at all.
For this to work it should require a constraint such as
// in std::marker::*
trait Unpack {
type Target: Tuple;
fn unpack(self) -> Self::Target;
}
fn foo<T: Unpack<Target = (u32, u32, u32)>>(t: T) {
bar(...t);
}
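The shape of that constraint can be approximated in stable Rust today. This is only a sketch: the `Unpack` trait below is hypothetical, it drops the unstable `Tuple` bound, `Rgb` and the summing `bar` are made up for illustration, and the call passes the elements by hand where the proposal would write `bar(...t)`.

```rust
// Hypothetical trait mirroring the `Unpack` idea above, without the
// unstable `Tuple` bound on `Target`.
trait Unpack {
    type Target;
    fn unpack(self) -> Self::Target;
}

// Made-up tuple struct that can be turned into a plain tuple.
struct Rgb(u32, u32, u32);

impl Unpack for Rgb {
    type Target = (u32, u32, u32);
    fn unpack(self) -> Self::Target {
        (self.0, self.1, self.2)
    }
}

fn bar(a: u32, b: u32, c: u32) -> u32 {
    a + b + c
}

// The bound fixes the element types up front, so the body type-checks
// before monomorphization; the elements are passed one by one.
fn foo<T: Unpack<Target = (u32, u32, u32)>>(t: T) -> u32 {
    let (a, b, c) = t.unpack();
    bar(a, b, c)
}
```

Here `foo(Rgb(1, 2, 3))` is accepted, while a type without a matching `Unpack` impl is rejected at the call to `foo` rather than inside its body.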
Here are a few comments on motivation, largely based on my personal experience and feeling instead of facts, but others might agree as well. Some data-driven evidence might be helpful here.
Argument unpacking reduces the verbosity and increases the ergonomics of Rust, it:

- Improves code writing ergonomics by removing the need for repetitive, unneeded intermediate steps.
Why would tuples avoid repetition to begin with? If they are repetitive, why isn't the function defined to accept the parameters as a struct instead?
- Allows more concise code in terms of number of lines and, occasionally, line length.
You need something that constructs the tuple to begin with. Why would we coincidentally have a tuple that happens to have the same order of items as another parameter? If we actually happen to have such a tuple, shouldn't it already be a type (e.g. `struct Vec3 { x: f32, y: f32, z: f32 }`), and shouldn't the called function accept such a type?
- Allows reducing the number of named local variables in scope.
Since you already have a tuple/array of this, you could simply pass `tuple.0, tuple.1, tuple.2` to the argument list instead of `...tuple`. I don't see how this avoids additional local variables, unless you are referring to avoiding naming `tuple` itself, which would only be the case when the output of another function happens to coincide with (a subsequence of) the inputs of the called function, in which case, again, it should have been made its own struct.
- Is intuitive for developers accustomed to argument unpacking from other programming languages.
The majority of use cases of argument unpacking in other languages almost always pass into a variadic parameter of the same type [citation needed].
- Adds a missing piece to the family of certain kind of syntactic sugar already in Rust, with features such as *struct update syntax* and *destructuring assignment*.
Struct update syntax and destructuring assignment have the same type on both sides. The sequence of parameters of a function (or its subsequence) is rarely identical to another type, except in the niche cases of pairs (e.g. a function that accepts two parameters of the same type and compares them) and well-defined vector types (e.g. `(x: f32, y: f32, z: f32)` or `(r: u8, g: u8, b: u8)`). In the former case the function should accept `[T; 2]` instead, and in the latter case the function should just define a type to avoid the boilerplate.
Furthermore, argument unpacking provides groundwork for both the syntax and its intended use for possible next steps and related proposals: As long as compatibility is sufficiently considered, the proposed feature could also reduce the workload and scope of more general and ambitious initiatives, e.g. *variadic generics*, by iterating towards them in smaller steps. This may be a double-edged sword, however, as argued under [Drawbacks](#drawbacks).
How does this reduce the workload of variadic generics? I can only see it introducing additional interactions to consider in those features.
It seems odd to allow use for constructing tuple-like structs, but not for anonymous tuples. Is there any reason not to support unpacking in tuples?
yes, I think we should avoid adding more ways that tuples can't be used like tuple structs, since we already have enough pain from that for macro authors (e.g. you can't write
Thanks for the suggestion. Having more alternatives from Rust-specific prior art helps, and so I've listed this Table 1 in commit 857517a. It's actually a bit similar to Scala's syntax, which I added in commit e06e70a.
Personally, I see zero technical impediments and am totally in favor of having these as well. There's some discussion about human reasons under this comment thread: #3723 (review) @tmccombs, @programmerjake, what's your view on these?
I think that it should work in any reasonable future implementation of variadics to be able to create a tuple or array with syntax like
Thanks for the critical feedback! Anecdotally, I ran into the need for argument unpacking myself in a situation very similar to the one under "Guide-Level Explanation", where functions I was using were in different crates and I was merely passing the returned tuple from one as the conventional arguments in another. This is actually what prompted me to write the RFC. :) I've added a bit of conjecture under the "Motivation" chapter in commit 31eb229, but you're right in that data-driven evidence would be great. Here are a couple of experiments I've had in mind that I could try to run if I have the time:
Summary
This RFC adds call-site unpacking of tuples, tuple structs, and fixed-size arrays, using `...expr` within the function call's parentheses as a shorthand for passing arguments. The full contents of these collections with known sizes are unpacked directly as the next arguments of a function call, desugaring into the corresponding element accesses during compilation.
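The desugaring mentioned in the summary can be sketched in current Rust by writing out the element accesses that `...expr` would stand for. The names `area`, `Size`, and the three wrapper functions are hypothetical and chosen only for illustration.

```rust
// Hypothetical two-parameter function to call.
fn area(width: u32, height: u32) -> u32 {
    width * height
}

// Hypothetical tuple struct with two fields.
struct Size(u32, u32);

// `area(...t)` on a tuple would stand for the field accesses below.
fn from_tuple(t: (u32, u32)) -> u32 {
    area(t.0, t.1)
}

// `area(...s)` on a tuple struct: the same, via `.0` and `.1`.
fn from_struct(s: Size) -> u32 {
    area(s.0, s.1)
}

// `area(...a)` on a fixed-size array: index expressions instead.
fn from_array(a: [u32; 2]) -> u32 {
    area(a[0], a[1])
}
```

In each case the number of elements is known at compile time, which is what lets the proposal treat the unpacking as pure syntactic sugar.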