New struct syntax #866
As far as declaration, I agree with this post: that's exactly how it makes sense in my head, so I would prefer my declaration syntax. This matches how we declare functions - functions have types in the signatures like I would never use the shortcut syntax because of two reasons:
That said, I would prefer the "fake tuples" syntax if we were to have it over any "new" syntax because it looks more Rusty. |
Note how part of the function argument pattern reads `x: x1: f32`.
The syntax might not be completely ambiguous, but it definately is confusing.
typo: definitely
+1 to this RFC, but I'd propose that we leave the struct declaration syntax unchanged. I am aware that you think my proposal violates the "initialization follows declaration" rule, however: I believe all instances of "initialization follows declaration" can be reinterpreted as "declaration and initialization omit different parts of a full syntax", once type ascriptions are landed. And the latter rule, which I think is more consistent, also avoids giving |
It seems that my line notes failed to be line notes. Sorry. @iopq, I think the "fake tuples" shortcut syntax is indeed a good variant of tuple-pattern based multiple assignment. The fact that it uses |
A vague proof why all data structures that obey the "following" rule also obey the "omitting" rule: Let's say a slot is a position in a "following"-rule-obeying data structure's syntax, where a type is expected in declarations, and a value expression is expected in initializations. For any "following"-rule-obeying data structure D, the so called full syntax F is just like its declaration syntax, but with the types in every slot replaced by ascripted expressions of the form Now, we can say, the declaration syntax of D is F with the And the initialization syntax is F with the (Note: my interpretation for tuple structs in the diff comments is more complex than necessary, the |
I'm surprised by this: this is an RFC essentially saying that every syntax element should just be used for one thing, by example of ":", going as far as calling it "weird syntax" in the alternatives section. Then it moves on to suggesting "=>" and happily admits that it is already used in the I'm still opposed to the change completely: I would prefer to keep the old syntax and still don't see examples that would be impossible without type ascription syntax further down (even if more verbose). It introduces a lot of noise into the ecosystem, on the final stretches to stability. There was a decision to finally release Rust - for better or worse, to start using it as a practical language, even if it has some warts. That's a wart I can certainly live with. |
About
Is this matching |
@skade, without a syntax change, we would not be able to introduce keyword arguments later without some hacks, like the However, given that this change would introduce ambiguities into |
@CloudiDust I can also live without keyword arguments. Never put much weight on them, they tend to lead to huge parameter lists where passing a struct might just be better. And for short lists, I don't see them that important. |
@skade, personally I think keyword arguments (and default arguments) are nice things to have (especially, over ad-hoc function overloading). But when we try to give them a consistent syntax, we should not introduce ambiguities into I hope something like |
@skade, also, it is already largely agreed that Rust's keyword and default arguments are going to be sugars of struct arguments. Using my syntax, |
You're comparing apples to oranges here. If we used Remember that the declaration is meant to be "backwards" and
Nice find, but I think the compiler won't have any problem with it, and we can tell which block is a match block and which is the struct body by looking at the word before it. If it's a path (
There's no syntax ambiguity, it could be visually confusing at most. |
About the ambiguity. Can you just change |
@dpc I wanted to leave the More precisely, because every function might be defined in the future as having one input value (usually a tuple, but we would make structs possible as input value to introduce named arguments) it would allow us to drop |
@phaux, Rust's paths are expressions too, if I am not mistaken, bare variables and type names are all path expressions. I think the above snippet will always be parsed as matching It's like bare |
@CloudiDust Paths of enum variants and unit-like (empty) structs are expressions, but they can't be followed by a block. Seems pretty unambiguous to me. Edit: It turns out that having a struct literal as a match expression is not allowed anyway. It's always interpreted as the opening of the match block, even if it's a struct literal block.
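The parsing rule described in this comment can be seen with the existing struct syntax. The following is a minimal sketch, not code from the thread: `Point` and `classify` are hypothetical names chosen for illustration. It shows that a struct literal used as the matched expression has to be wrapped in parentheses, because a bare `{` after `match Point` is taken as the start of the match body.

```rust
// Hypothetical type used only for this illustration.
struct Point {
    x: i32,
    y: i32,
}

// Classify a point by matching on a struct literal built in place.
fn classify(x: i32, y: i32) -> &'static str {
    // The parentheses are required here. Without them, `match Point { ...`
    // is parsed as matching on the path `Point`, and the `{` is read as
    // the opening brace of the match body.
    match (Point { x, y }) {
        Point { x: 0, y: 0 } => "origin",
        Point { x: 0, .. } | Point { y: 0, .. } => "on an axis",
        _ => "elsewhere",
    }
}

fn main() {
    println!("{}", classify(0, 0));
    println!("{}", classify(3, 0));
    println!("{}", classify(3, 4));
}
```

So the visual ambiguity discussed here never reaches the parser: the unparenthesized form is simply rejected.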
set_color(Color(0: u8, 0, 0)); // initialization
let Color(red: u8, grn, blu) = get_color(); // pattern matching

struct Color<T>{r=>T, g=>T, b=>T}; // declaration
I would prefer
struct Color<T> {
    r: T,
    g: T,
    b: T
}
as it is still type ascription.
Ya, I'm not sure why `=>` is used as "has type" here. Wasn't the idea to have `:` mean "has type" everywhere?
@jmesmon The reasoning behind it is right there in the RFC.
TL;DR: Deciding on `=>` or `:` is cosmetic, but using `=>` is actually a little bit more consistent if you take into account the use-follows-declaration rule and the fact that type ascription is an operator that can be used in expressions and patterns only.
Seeing that PR was not outright closed, I will push a change to the declaration syntax so we can have some kind of consensus on this issue.
@phaux, I'd say this is expected behaviour given Rust's grammar (I should have realized this when I posted my comment above, the "always" bit. :-P) Rust has (for the most part) an LL grammar, so there should be no ambiguity as far as the parser is concerned. But programmers aren't LL parsers.
@dpc |
@phaux how do you tell whether you see that the error is in the second block because it's not valid block syntax although I might be confused, but it looks like the parser might have to backtrack to handle this ambiguity? |
@iopq Anything involving curly brackets is not allowed in |
@phaux doesn't that mean the parser has to backtrack or look ahead? |
@iopq, the parser will always parse it as matching Your playpen code will still fail to compile even if records are added to the language, only that it will not be a parsing error then, but one with semantics. (It will be a So basically while what I find has visual ambiguity, it doesn't compile anyway. Still when using the proposed |
There's basically only ONE more arrow construct that I could think of.
|
@iopq, |
I'd like to suggest rejecting this and #841. Both don't represent anything actionable and don't take the potential damage (massive churn in all existing code) into account. Both are speculative not only on the path they could take for their syntax, but also on the syntax of future features. We've already burned through the third arrow style in this discussion alone, #841 not being much better. |
For reference, here is a possible extension to the current syntax that can do type ascription, type-inferred struct literals/patterns and named arguments without breaking changes. The basic idea is using
A pattern Examples:
let foo1 = {value: Foo}; // block
let foo2 = .{field: foo1}; // struct
// bind to other name:
let .{x: point_x, y: point_y} = get_point();
// bind to other name and ascribe:
let .{x: point_x: f32, y: point_y: f32} = get_point();
// bind to same name:
let .{x, y} = get_point();
// bind to same name and ascribe:
let .{.x: f32, .y: f32} = get_point();
// function with an anonymous struct as the last argument for supporting named/default arguments
fn draw_rect<N: Float>(.{x: N, y: N, w: N, h: N}) { … }
// calling the function
draw_rect(.x: x1: f32, .y: y1, .w: x2 - x1, .h: y2 - y1);
This solution is less consistent and seems more arbitrary than this RFC's solution. (But the fact that it is not a breaking change is an advantage.)
Along the line of the previous non-breaking solution, there is another possible breaking solution that:
The solution is: instead of using Current: And the declaration syntax remains unchanged. Basically, if we see a Examples:
// struct declaration, not changed, as it is types that are after field names:
struct Point { x: f32, y: f32 };
let foo1 = {value: Foo}; // block
let foo2 = {.field: foo1}; // anonymous/type-inferred struct
// bind to other name:
let {.x: point_x, .y: point_y} = get_point();
// bind to other name and ascribe:
let {.x: point_x: f32, .y: point_y: f32} = get_point();
// bind to same name:
let {.x, .y} = get_point();
// or
let {x, y} = get_point();
// bind to same name and ascribe:
let {x: f32, y: f32} = get_point();
// function with an anonymous struct as the last argument for supporting named/default arguments
fn draw_rect<N: Float>({x: N, y: N, w: N, h: N}) { … }
// calling the function
draw_rect(.x: x1: f32, .y: y1, .w: x2 - x1, .h: y2 - y1); |
I just had the same idea after reading your previous post, but then I realized: The dot before a field doesn't solve the ambiguity of what is what in This RFC + type ascription RFC propose only If this proposal doesn't get accepted, we have to specify all three values every time. This might not be the end of the world (after all this thinking this syntax is becoming clear enough to me), but why do we want such a confusing ambiguity when we might as well just fix it before 1.0?
I think by putting a dot before a field, it makes Compared with Also |
@CloudiDust This is even more confusing than what we have now. I don't see how these dots add any clarity over the current syntax. Here's your dot syntax example rewritten with current syntax:
let foo1 = {value: Foo}; // block
// removed dot, added comma.
// comma is currently only used for single-value tuples,
// but might be added with anonymous structs RFC:
let foo2 = {field: foo1,}; // struct
// with dots removed it's still unambiguous
let {x: point_x, y: point_y} = get_point();
let {x: point_x: f32, y: point_y: f32} = get_point();
let {x, y} = get_point();
// currently that's the only way of doing this,
// but it's still more clear than a mysterious dot before block:
let {x: x: f32, y: y: f32} = get_point(); |
@phaux without dots, what With dots, the rule is simple and consistent: only That's like, with this RFC: only Personally, I think |
+1 to this rfc as currently written (the thing to the right of |
+1, this sounds great. |
@CloudiDust When field names are long then you would want to bind to another name to shorten them anyways :) If I was to think of a sugar for Here's a small comparison I wrote:
// current methods
let Rectangle{width, height, ..}: Rectangle<f32> = get_area();
let Rectangle{width, height, ..} = get_area::<Rectangle<f32>>();
// type ascription + anon structs
let {width: w: f32, height: h, ..} = get_area();
// as above + sugar (this is the syntax weirdness I mentioned in the RFC)
let {width:: f32, height, ..} = get_area();
let {.width: f32, height, ..} = get_area();
// new struct syntax (`:` is unambiguous so no need for weird syntax)
let {width: f32, height, ..} = get_area(); |
@phaux, I thought of With the
// If only the field name is specified, then no need for `.`
let Rectangle{width, height, ..}: Rectangle<f32> = get_area();
let Rectangle {width, height, ..} = get_area::<Rectangle<f32>>();
// type ascription + anon structs
let {.width: w: f32, .height: h, ..} = get_area();
// when specifying types only, no need to put `.` before width (I made a mistake above, this is the correct syntax)
let {width: f32, height, ..} = get_area(); |
Notice how in the above snippet, when we didn't need to rename the fields, the syntax didn't involve |
When trying to write an RFC for While Then, if we need a "switch" anyway, why isn't the existing Which means Actually, no alternative has a clear practical advantage over the current syntax. The following examples compare the current syntax and the various proposed alternatives, with type ascription and type-inferred struct literals/named arguments implemented:
// Proposal A is the C99 designated initializer syntax.
// Proposal B is the `=>` syntax.
// Proposal C is the `.ident:` syntax.
struct Point { x: f64, y: f64 }
fn get_point() -> Point { ... }
fn use_point(p: Point) { ... }
// Point literals:
let point = Point {x: 3.0, y: 4.0}; // current
let point = Point {.x = 3.0, .y = 4.0}; // A
let point = Point {x => 3.0, y => 4.0}; // B
let point = Point {.x: 3.0, .y: 4.0}; // C
// Point patterns:
// without renaming and type ascription:
let Point {x, y} = get_point(); // current and all proposals
// with renaming but no type ascription:
let Point {x: new_x, y: new_y} = get_point(); // current
let Point {.x = new_x, .y = new_y} = get_point(); // A
let Point {x => new_x, y => new_y} = get_point(); // B
let Point {.x: new_x, .y: new_y} = get_point(); // C
// with type ascription but no renaming:
// *THIS IS THE ONLY SITUATION WHERE THE ALTERNATIVES HAVE A CLEAR ADVANTAGE
// OVER THE CURRENT SYNTAX*
let Point {x: x: f64, y: y: f64} = get_point(); // current
let Point {x: f64, y: f64} = get_point(); // all proposals
// with both renaming and type ascription:
let Point {x: new_x: f64, y: new_y: f64} = get_point(); // current
let Point {.x = new_x: f64, .y = new_y: f64} = get_point(); // A
let Point {x => new_x: f64, y => new_y: f64} = get_point(); // B
let Point {.x: new_x: f64, .y: new_y: f64} = get_point(); // C
// By omitting the `Point` part above,
// we can get the corresponding type-inferred struct literal/pattern syntax.
// While in theory `{x: new_x}` is visually ambiguous,
// it can be disambiguated by the trailing `,`:
// `{x: new_x,}` is a type-inferred struct literal,
// and `{x: new_x}` is a block with type ascription.
// named arguments:
// with type-inferred struct literals:
use_point({x: 3.0, y: 4.0}); // current
use_point({.x = 3.0, .y = 4.0}); // A
use_point({x => 3.0, y => 4.0}); // B
use_point({.x: 3.0, .y: 4.0}); // C
// further sugaring: getting rid of `{}`:
// Not possible with the current syntax due to ambiguities
use_point(.x = 3.0, .y = 4.0); // A
use_point(x => 3.0, y => 4.0); // B
use_point(.x: 3.0, .y: 4.0); // C
// Note that, while the alternatives all permit us to do further sugaring,
// the results are not necessarily better than `use_point({x: 3.0, y: 4.0})`.
We can see in the above examples that, while the alternatives have consistency advantages in theory, such advantages may not exactly translate into practical advantages.
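For comparison, the "current" rows of the table can be written so that they compile with the existing struct syntax alone, without type ascription or type-inferred struct literals. This is only a sketch: the body of `get_point` and the helper names `norm` and `sum` are assumptions added for illustration, and the type annotation on the whole pattern stands in for the per-field ascription discussed above.

```rust
struct Point {
    x: f64,
    y: f64,
}

// Assumed body; the thread leaves `get_point` unspecified.
fn get_point() -> Point {
    Point { x: 3.0, y: 4.0 } // struct literal: `field: value`
}

// Pattern with renaming: `field: new_binding`.
fn norm() -> f64 {
    let Point { x: new_x, y: new_y } = get_point();
    (new_x * new_x + new_y * new_y).sqrt()
}

// Pattern without renaming. The type is stated once for the whole
// pattern instead of being ascribed field by field.
fn sum() -> f64 {
    let Point { x, y }: Point = get_point();
    x + y
}

fn main() {
    println!("norm = {}, sum = {}", norm(), sum());
}
```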
You forgot one more example
all of the proposals can disambiguate blocks from struct literals without a trailing comma, while the current syntax cannot so if Rust gets default parameters, there will be cases where you just want to pass one member of an anonymous struct to a method and have the remaining members get default values
this would be an example where the
this is visually more confusing |
@iopq, nice find. However, we may accept if we don't change the current syntax, I'd expect the named argument declaration syntax to be:
fn slice(&self, {
    from: 0: usize,
    to: self.len(): usize
})
I don't see much advantage in declaring two names for the same named parameters ( That said, this may not be as clear as:
fn slice(&self, {
    from => 0: usize,
    to => self.len(): usize
})
or
fn slice(&self, {
    .from = 0: usize,
    .to = self.len(): usize
})
Rust already supports renaming struct fields:
this is already valid Rust, so the slots in structs do not have to be the same as the local bindings in the function so given anonymous structs and type ascription it would look like
with the
but if you were to add default parameters to current Rust it would look like
by extension, the
Although it would be weird because you'd have to pass an empty record into |
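The field renaming referred to in this comment can be illustrated with existing Rust, where a function parameter is itself a pattern. This is not the snippet from the original comment, only a small sketch with hypothetical names (`Point`, `sum_fields`, `diff_fields`): the field names of the struct and the local bindings inside the function need not be the same.

```rust
// Hypothetical struct for illustration.
struct Point {
    x: i32,
    y: i32,
}

// The parameter is a struct pattern: field `x` is bound to the local `a`,
// and field `y` to the local `b`.
fn sum_fields(Point { x: a, y: b }: Point) -> i32 {
    a + b
}

// Shorthand form: each field is bound to a local of the same name.
fn diff_fields(Point { x, y }: Point) -> i32 {
    x - y
}

fn main() {
    println!("{}", sum_fields(Point { x: 1, y: 2 }));
    println!("{}", diff_fields(Point { x: 5, y: 2 }));
}
```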
@iopq, I think this is a matter of perspective and preference. You prefer to combine struct definitions and patterns for convenience here, which is a valid goal. But I prefer to keep them separated, because they are separated for named structs. The following is my understanding/preference of the named argument definition syntax. In On the other hand, anonymous structs in function signatures are struct definitions, not literals or patterns. And struct definitions don't support field renaming. So instead of
fn foo({x: y,}: {x: i32,}) -> i32 { y }
Where And if we do not need renaming, we can omit the pattern and
fn foo({x: i32,}) -> i32 { x }
And we can have default values:
fn foo({x: 42: i32,}) -> i32 { x }
(Trailing commas may or may not be needed.) I think empty records should be
Whoops, I forgot to declare the type of the arguments. Why would default values use given I seriously hope to avoid |
@iopq, I think by separating definitions from patterns like I proposed in the above comment, we can avoid things like And the worst we have then would be |
I've seen anonymous struct syntax looking like |
@phaylon that breaks symmetry between named structs and records. We already have tuples with just |
Why not just leverage existing Default trait for default arguments? We are already agreeing on structs as a substitute for named arguments. The more core Rust we reuse, the better. Ideas (offtopic) |
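One way to read this suggestion, using only what Rust already has, is an argument struct that implements `Default` combined with struct update syntax at the call site. The sketch below is an assumption about what was meant, not a design from the thread: `SliceArgs` and the free function `slice` are hypothetical, and `to: Option<usize>` stands in for the `self.len()` default mentioned earlier, since a derived default cannot depend on the receiver.

```rust
// Hypothetical argument struct; `Default` supplies the omitted arguments.
#[derive(Default)]
struct SliceArgs {
    from: usize,       // defaults to 0
    to: Option<usize>, // defaults to None, meaning "up to the end"
}

// A free function standing in for a method like `slice(&self, ...)`.
fn slice(s: &str, args: SliceArgs) -> &str {
    let to = args.to.unwrap_or(s.len());
    &s[args.from..to]
}

fn main() {
    let s = "hello world";
    // Name only the arguments that differ from the defaults.
    println!("{}", slice(s, SliceArgs { from: 6, ..Default::default() }));
    println!("{}", slice(s, SliceArgs { to: Some(5), ..Default::default() }));
    println!("{}", slice(s, SliceArgs::default()));
}
```

The cost compared with the sugared proposals is the explicit struct name and the `..Default::default()` tail at every call site.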
@iopq: Well, the |
@phaylon Underscore could be used to signify that you want your struct/tuple literal to coerce to a named one. Omitting
struct Point ( i32, i32 );
let p: Point = _(1, 2); // coerces to Point (new syntax)
let p: Point = (1, 2); // error
let p: (i32, i32) = (1, 2); // fully anonymous
struct Point { x: i32, y: i32 };
let p: Point = _{x: 1, y: 2}; // coerces to Point (new syntax)
let p: Point = {x: 1, y: 2}; // error
let p: {x: i32, y: i32} = {x: 1, y: 2}; // fully anonymous (new syntax)
Look at all that symmetry!
@phaylon I understand this can be added to Rust:
where the name of the struct |
I'm talking about both, because just re-using blocks would make it visually very hard to find. Anonymous structs still have a name; you just can't publicly refer to them.
I think by "an anonymous struct type", we may mean one of two things:
Note in Rust, we have anonymous tuples and named tuples (tuple structs), and anonymous tuples are structural types. So if we are to introduce anonymous struct types, I think we should introduce the "structural" one for symmetry. However, I have realized, by utilizing Actually, neither interpretation A nor B helps us define named/optional/default arguments if we are to utilize Under Interpretation A, when we want to implement And Interpretation B means that something like I now agree that, for anonymous/type-inferred struct literals/patterns, we should use the underscore syntax brought up by @phaylon, to disambiguate and signify that we just "don't care" about the struct names, not that the structs don't have names. (If And what's more important (at least for this comment thread) is that the current struct syntax is fine. |
Sadly, changes to struct literal syntax at this stage are not possible. Even if there were a perfect new syntax, the level of disruption at this late stage would be hard to justify. In particular, there is not any technical clash with type ascription, only potential confusion. Since type ascription should be rare, that does not justify such a large change. There are also many people who like the current syntax (not me, actually, I much prefer |
`Foo{x=>1, y=>2}` struct syntax. Hopefully an improvement over #841. I tried to explain the implementation details more thoroughly. Rendered