Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support optional fields #5

Closed
dpup opened this issue Apr 19, 2024 · 12 comments
Closed

Support optional fields #5

dpup opened this issue Apr 19, 2024 · 12 comments
Labels
enhancement New feature or request

Comments

@dpup
Copy link
Owner

dpup commented Apr 19, 2024

This is discussed in grpc-ecosystem#21

From the language guide: https://protobuf.dev/programming-guides/proto3/#field-labels

optional: An optional field is in one of two possible states:

the field is set, and contains a value that was explicitly set or parsed from the wire. It will be serialized to the wire.
the field is unset, and will return the default value. It will not be serialized to the wire.
You can check to see if the value was explicitly set.

If I recall, this wasn't originally supported in proto3 but is now being recommended as a way of differentiating between nil and zero values, without having to use the Wrapper types.

@dpup dpup added the enhancement New feature or request label Apr 19, 2024
@dpup
Copy link
Owner Author

dpup commented Apr 19, 2024

@dpup
Copy link
Owner Author

dpup commented Apr 20, 2024

Because proto3 optional is implemented in terms of a singleton oneof, this almost works out of the box. The generated type looks like this:

type BaseOptionalFieldsRequest = {
  str?: string
  number?: number
}

export type OptionalFieldsRequest = BaseOptionalFieldsRequest
  & OneOf<{ optStr: string }>
  & OneOf<{ optNumber: number }>

The problems with this are that:

  1. str is already optional, so there's not really a functional difference.
  2. I think it's going to require the gateway to have EmitOptional set on the MarshalOptions

wdyt @seanami ?

@dpup
Copy link
Owner Author

dpup commented Apr 22, 2024

The above PR make it work, but doesn't update the typescript definitions. That means a non-optional field that is empty, implies zero-value. But there's no good way to differentiate on the client without knowing the API spec.

An option could be to add a flag to the generator: --strict_types which at least requires an explicit zero value to be specified. This would work with EmitUnpopulated=true, but with EmitUnpopulated=false undefined values could be sent over the wire and parsed into an object with non-null fields.

dpup added a commit that referenced this issue Apr 22, 2024
…he gRPC gateway (#7)

Support for proto3 optional fields, with
EmitUnpopulated enabled.

Issue: #5
@seanami
Copy link
Collaborator

seanami commented Apr 22, 2024

After reading through https://github.com/protocolbuffers/protobuf/blob/main/docs/field_presence.md, my assumption is that optional is primarily concerned with the ability to determine field presence without using the well-known wrapper types.

Thinking through the cases that need to be possible for field values:

  • For a field without presence (no optional keyword)
    • Default value
    • Non-default value
  • For a field with presence (with optional keyword)
    • Empty "not set" value
    • Default value
    • Non-default value

Thinking through what the most appropriate TypeScript-native way of representing the intent behind an optional field is:

  • By default, message fields in TypeScript are represented as TypeScript-optional (e.g. string my_field = 1 turns into myField?: string)
    • I believe this is done because we can never be 100% sure whether a field will be included in JSON over the wire or not, and we want our application to be resilient to unexpectedly omitted fields. So we want undefined to always be included in the possible types for every field.
    • This is a bit weird in the context of fields with presence, because undefined and the zero value (e.g. "" for a string field) are considered to be the same thing in TypeScript-land. We have two values for each field that both mean "default value". Otherwise, we could just use undefined to mean "not set".
  • The field presence docs suggest that null is a value that JSON can use for communicating the "not set" state.
  • This suggests that a desirable TypeScript type for an optional field would be: optional string my_field = 1 turns into myField?: string | null

However, this may not be what protojson actually outputs. We also have to make the TypeScript types match what protojson will output, not just what would be most naturally in a TypeScript environment...

I don't have an easy protojson test rig handy. Can you tell me what protojson.Marshal outputs for msg1 and msg2 below?

message TestMsg {
  string str = 1;
  optional opt_str = 2;
  TestSubMsg msg = 3;
  optional TestSubMsg opt_msg = 4;
}

message TestSubMsg {
  string str = 1;
  optional opt_str = 2;
}
var msg1 TestMsg
msg2 := TestMsg{Str: "test", OptStr: "test", Msg: &TestSubMsg{Str: "test", OptStr: "test"}, OptMsg: &TestSubMsg{Str: "test", OptStr: "test"}}

@dpup
Copy link
Owner Author

dpup commented Apr 22, 2024

I added a test case here that shows some of the behavior.

For you examples:

With EmitUnpopulated on:

  • msg1: {"str":"", "msg":null}
  • msg2: {"str":"test", "optStr":"test", "msg":{"str":"test", "optStr":"test"}, "optMsg":{"str":"test", "optStr":"test"}}

With EmitUnpopulated off:

  • msg1: {}
  • msg2: {"str":"test", "optStr":"test", "msg":{"str":"test", "optStr":"test"}, "optMsg":{"str":"test", "optStr":"test"}}

@dpup
Copy link
Owner Author

dpup commented Apr 22, 2024

So with EmitUnpopulated=true I think the types might make sense to be as follows:

message TestMsg {
  string str = 1;                  // string
  optional opt_str = 2;            // string?
  TestSubMsg msg = 3;              // TestSubMsg | null
  optional TestSubMsg opt_msg = 4; // TestSubMsg?
}

And with EmitUnpopulated=false:

message TestMsg {
  string str = 1;                  // string?
  optional opt_str = 2;            // string?
  TestSubMsg msg = 3;              // TestSubMsg?
  optional TestSubMsg opt_msg = 4; // TestSubMsg?
}

In which case, a flag seems like the best bet so that users can align the gRPC Gateway settings with the generated typescript. In the same way as the useProtoNames / use_proto_names is passed.

@seanami
Copy link
Collaborator

seanami commented Apr 22, 2024

Thanks for the test case illumination!

I see, so null actually gets used for non-optional fields when they're not set. (And I'm imagining that they'd also get used for optional fields when it's set to the empty/default value? Or would that just be the string instead?)

I guess I needed one more test case for each protojson option, which is "What does a 'explicitly set to empty' optional field look like?" Does it ever use null, or does it just use default values like "" or 0 or etc.?

Depending on that answer, I think your proposal is either basically right, or you might need to add null as a possible type for the optional values when EmitUnpopulated=true.


One thing that's not great about your EmitUnpopulated=true proposal: It's always possible that a field could be missing unexpectedly because the server is running a different version of the proto def. I think that's why the current types always includes undefined as a possible value for every field, because it might just not show up on the wire, and the app code should be resilient to that.

But, I agree that your proposal would be nice, and when using protos in TypeScript I've often found myself being annoyed by having to handle undefined everywhere, especially for outgoing messages, where it's really not necessary.

One idea to make your proposal work would be to add templatized helper code in between receiving the response message from an endpoint and handing it to app code. It would use the proto definition to check for all of the expected/non-optional fields, and if any are missing over the wire, it would add the field with the default value, rather than letting it be undefined. This would ensure that, once app code had a response message, it really DOES conform to the type information without falling back on undefined.

@dpup
Copy link
Owner Author

dpup commented Apr 22, 2024

Your point about compatibility is an interesting one, getting concrete it would happen under the following situations:

  • the server deprecates a field prior to clients being expired
  • a server is rolled back independently of clients
  • a client is pushed before the server

I'd expect that a lot of code doesn't handle these cases very gracefully right now :-/

Coercing undefined to zero type would require the proto descriptors (or some abstraction) on the client. Supporting nested messages seems especially tricky.


I have a POC of a parameter that updates the types per the above proposal. Here's your example from above:

type BaseOptionalTestMsg = {
  str: string
  msg: OptionalTestSubMsg | null
}

export type OptionalTestMsg = BaseOptionalTestMsg
  & OneOf<{ optStr: string }>
  & OneOf<{ optMsg: OptionalTestSubMsg }>


type BaseOptionalTestSubMsg = {
  str: string
}

export type OptionalTestSubMsg = BaseOptionalTestSubMsg
  & OneOf<{ optStr: string }>

Here's the same message without the flag specified. Based on the corresponding behavior in the gRPC Gateway you can't tell the difference between msg and optMsg.

type BaseOptionalTestMsg = {
  str?: string
  msg?: OptionalTestSubMsg
}

export type OptionalTestMsg = BaseOptionalTestMsg
  & OneOf<{ optStr: string }>
  & OneOf<{ optMsg: OptionalTestSubMsg }>

type BaseOptionalTestSubMsg = {
  str?: string
}

export type OptionalTestSubMsg = BaseOptionalTestSubMsg
  & OneOf<{ optStr: string }>

This seems like a good incremental step to me, since currently the typescript types aren't accurate if the server is set to use EmitUnpopulated.

Then we could explore ways of mapping

@dpup
Copy link
Owner Author

dpup commented Apr 23, 2024

Ok, I may have mispoke above.

Here's the proto message:

message OptionalFieldsMsg {
  string empty_str = 1;
  int32 empty_number = 2;
  OptionalFieldsSubMsg empty_msg = 3;
  optional string empty_opt_str = 4;
  optional int32 empty_opt_number = 5;
  optional OptionalFieldsSubMsg empty_opt_msg = 6;

  string zero_str = 7;
  int32 zero_number = 8;
  OptionalFieldsSubMsg zero_msg = 9;
  optional string zero_opt_str = 10;
  optional int32 zero_opt_number = 11;
  optional OptionalFieldsSubMsg zero_opt_msg = 12;

  string defined_str = 13;
  int32 defined_number = 14;
  OptionalFieldsSubMsg defined_msg = 15;
  optional string defined_opt_str = 16;
  optional int32 defined_opt_number = 17;
  optional OptionalFieldsSubMsg defined_opt_msg = 18;
}

message OptionalFieldsSubMsg {
  string str = 1;
  optional string opt_str = 2;
}

This is the default response.

  • All empty fields are omitted.
  • Zero fields are undefined unless they are optional — so the client needs to know they are optional in order to convert to empty.
{
  "zeroMsg": {},
  "zeroOptStr": "",
  "zeroOptNumber": 0,
  "zeroOptMsg": {},
  "definedStr": "hello",
  "definedNumber": 123,
  "definedMsg": {
    "str": "hello",
    "optStr": "hello"
  },
  "definedOptStr": "hello",
  "definedOptNumber": 123,
  "definedOptMsg": {
    "str": "hello",
    "optStr": "hello"
  }
}
type BaseOptionalFieldsMsg = {
  emptyStr?: string
  emptyNumber?: number
  emptyMsg?: OptionalFieldsSubMsg
  zeroStr?: string
  zeroNumber?: number
  zeroMsg?: OptionalFieldsSubMsg
  definedStr?: string
  definedNumber?: number
  definedMsg?: OptionalFieldsSubMsg
}

export type OptionalFieldsMsg = BaseOptionalFieldsMsg
  & OneOf<{ emptyOptStr: string }>
  & OneOf<{ emptyOptNumber: number }>
  & OneOf<{ emptyOptMsg: OptionalFieldsSubMsg }>
  & OneOf<{ zeroOptStr: string }>
  & OneOf<{ zeroOptNumber: number }>
  & OneOf<{ zeroOptMsg: OptionalFieldsSubMsg }>
  & OneOf<{ definedOptStr: string }>
  & OneOf<{ definedOptNumber: number }>
  & OneOf<{ definedOptMsg: OptionalFieldsSubMsg }>


type BaseOptionalFieldsSubMsg = {
  str?: string
}

export type OptionalFieldsSubMsg = BaseOptionalFieldsSubMsg
  & OneOf<{ optStr: string }>

Here is the response with emit_unpopulated enabled:

  • Only unset optional fields are undefined, which is more explicit IMO.
{
  "emptyStr": "",
  "emptyNumber": 0,
  "emptyMsg": null,
  "zeroStr": "",
  "zeroNumber": 0,
  "zeroMsg": {
    "str": ""
  },
  "zeroOptStr": "",
  "zeroOptNumber": 0,
  "zeroOptMsg": {
    "str": ""
  },
  "definedStr": "hello",
  "definedNumber": 123,
  "definedMsg": {
    "str": "hello",
    "optStr": "hello"
  },
  "definedOptStr": "hello",
  "definedOptNumber": 123,
  "definedOptMsg": {
    "str": "hello",
    "optStr": "hello"
  }
}
type BaseOptionalFieldsMsg = {
  emptyStr: string
  emptyNumber: number
  emptyMsg: OptionalFieldsSubMsg | null
  zeroStr: string
  zeroNumber: number
  zeroMsg: OptionalFieldsSubMsg | null
  definedStr: string
  definedNumber: number
  definedMsg: OptionalFieldsSubMsg | null
}

export type OptionalFieldsMsg = BaseOptionalFieldsMsg
  & OneOf<{ emptyOptStr: string }>
  & OneOf<{ emptyOptNumber: number }>
  & OneOf<{ emptyOptMsg: OptionalFieldsSubMsg }>
  & OneOf<{ zeroOptStr: string }>
  & OneOf<{ zeroOptNumber: number }>
  & OneOf<{ zeroOptMsg: OptionalFieldsSubMsg }>
  & OneOf<{ definedOptStr: string }>
  & OneOf<{ definedOptNumber: number }>
  & OneOf<{ definedOptMsg: OptionalFieldsSubMsg }>


type BaseOptionalFieldsSubMsg = {
  str: string
}

export type OptionalFieldsSubMsg = BaseOptionalFieldsSubMsg

@dpup
Copy link
Owner Author

dpup commented Apr 26, 2024

I'm going to call this good for now. Client-side mapping of undefined to zero types would be a nice followup.

@seanami
Copy link
Collaborator

seanami commented Apr 26, 2024

Thanks for all of that test code and examples, that's super helpful. 🙏

One annoying thing about this type of OneOf approach in the TypeScript is that the types of the fields become "baked in" after the initial object is defined. So, for example:

export function test() {
  let m: OptionalFieldsMsg = {
    emptyStr: '',
    emptyNumber: 0,
    emptyMsg: {str: ''},
    zeroStr: '',
    zeroNumber: 0,
    zeroMsg: null,
    definedStr: 'test',
    definedNumber: 4,
    definedMsg: { str: 'test'},
    emptyOptStr: 'test',
  };

  m.emptyOptNumber = 4;
}

You'll notice that emptyOptStr is defined as part of the original object literal, whereas emptyOptNumber is assigned to the field after the initial object is created.

Even though this is fine from a "are the fields all valid?" standpoint, TypeScript still throws an error:

image

This makes working with these types of objects annoying in practice, because you have to do things like this to assign new values after object creation:

  m = {
    ...m,
    emptyOptNumber: 4,
  };

This creates unnecessary objects and is also more verbose. So not ideal. Also, I think that using object spreads like ...m can mask other type issues, as I'm not certain that all of the members of the spread are checked.

Instead of using the OneOf and Absent helpers in TypeScript, do you think we could make the types something like this?

// emit_unpopulated = false

type BaseOptionalFieldsMsg = {
  emptyStr?: string;
  emptyOptStr?: string;
  emptyNumber?: number;
  emptyOptNumber?: number;
  emptyMsg?: OptionalFieldsSubMsg;
  emptyOptMsg?: OptionalFieldsSubMsg;
  // etc...
}

type BaseOptionalFieldsSubMsg = {
  str?: string;
  optStr?: string;
}

// emit_unpopulated = true

type BaseOptionalFieldsMsg = {
  emptyStr: string;
  emptyOptStr?: string;
  emptyNumber: number;
  emptyOptNumber?: number;
  emptyMsg: OptionalFieldsSubMsg | null;
  emptyOptMsg?: OptionalFieldsSubMsg; // Should this have `| null` added? Seems like it doesn't happen...
  // etc...
}

type BaseOptionalFieldsSubMsg = {
  str: string;
  optStr?: string;
}

This would make the types much easier to work with in the editor, and prevent unnecessary contortions. It's not clear that the OneOf/Absent is really getting us anything additional?

@dpup dpup reopened this Apr 26, 2024
@dpup
Copy link
Owner Author

dpup commented Apr 26, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants