Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enums of named integer constants/enums with "other" variant #574

Open
miwig opened this issue May 2, 2024 · 4 comments
Open

Enums of named integer constants/enums with "other" variant #574

miwig opened this issue May 2, 2024 · 4 comments

Comments

@miwig
Copy link

miwig commented May 2, 2024

This schema:

{
    "title": "My enum",
    "oneOf": [
        {
            "const": 1,
            "description": "Foo",
            "type": "integer"
        },
        {
            "const": 5,
            "description": "Bar",
            "type": "integer"
        }
    ]
}

generates this non-functional output:

#[derive(Clone, Debug, Deserialize, Serialize)]
#[serde(untagged)]
pub enum MyEnum {
    Variant0(i64),
    Variant1(i64),
}

Desured output would be:

pub enum MyEnum {
    Foo = 1,
    Bar = 5,
}

I think the serde_repr crate could be used in this case.

Similarly, this:

{
    "title": "My enum",
    "anyOf": [
        {
            "const": 1,
            "description": "Foo",
            "type": "integer"
        },
        {
            "const": 5,
            "description": "Bar",
            "type": "integer"
        },
        {
            "type": "integer"
        }
    ]
}

generates

#[derive(Clone, Debug, Deserialize, Serialize)]
pub struct MyEnum {
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_0: Option<i64>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_1: Option<i64>,
    #[serde(flatten, default, skip_serializing_if = "Option::is_none")]
    pub subtype_2: Option<i64>,
}

Desired output would be something like

pub enum MyEnum {
    Foo = 1,
    Bar = 5,
    Other(i64)
}

I think for integer variants this would require custom (de)serialization code. However, this Other feature would also be useful for enums with string variants, where I believe it might be accomplished with #[serde(other)]
Examples in the wild:

@ahl
Copy link
Collaborator

ahl commented May 2, 2024

First, clearly typify is not doing a good job of handling oneOfs with constant values. This type is awful:

pub enum MyEnum {
    Variant0(i64),
    Variant1(i64),
}

I would note that if you use the title field rather than the description field you'll get:

pub enum MyEnum {
    Foo(i64),
    Bar(i64),
}

... but that's only slightly better looking and no more faithful.

Here's what I'd want to see (more or less).

pub enum MyEnum {
    Foo,
    Bar,
}

impl<'de> serde::Deserialize<'de> for MyEnum {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        match u64::deserialize(deserializer)? {
            1 => Ok(Self::Foo),
            5 => Ok(Self::Bar),
            _ => Err(D::Error::invalid_value(...)),
        }
    }
}

I don't think I'd use the discriminant value since 1. serde itself doesn't consider that (and I'm loath to pull in another serde-like dependency) and 2. we need something similar for other, constant, non-integral values.


With regard to that second type (the anyOf) can you tell me more about how folks would use the Rust type? Or is there a more significant example you can share? My concern is that one could either say MyEnum::Foo or MyEnum::Other(1) and those would be different values that were equivalent when serialized.

I get that's the point of anyOf but wouldn't we then be within our right so just always deserialize into the Other variant?

@miwig
Copy link
Author

miwig commented May 2, 2024

I don't think I'd use the discriminant value since 1. serde itself doesn't consider that (and I'm loath to pull in another serde-like dependency) and 2. we need something similar for other, constant, non-integral values.

Sure, that code looks good. Setting a discriminant would make serialization code shorter but doesn't help that much for deserialization since there's no from_discriminant. It might be useful to have it available though if the schema goes to the trouble of defining it. But yes, non-integral values need a different solution.

With regard to that second type (the anyOf) can you tell me more about how folks would use the Rust type? Or is there a more significant example you can share? My concern is that one could either say MyEnum::Foo or MyEnum::Other(1) and those would be different values that were equivalent when serialized.

I get that's the point of anyOf but wouldn't we then be within our right so just always deserialize into the Other variant?

Sure, looking purely at the schema, with anyOf all the other variants become obsolete once you have {"type":"integer"} in there. From that standpoint, just generating the field as i64 would be justifiable, and certainly more correct than what typify is doing now.
Semantically, I would understand Other to mean "none of the above", so always deserializing into that variant would be a semantic error. I suppose a more correct way to express this in the schema would be to add a not subschema excluding the known variants to the Other variant?

In any case, what I suspect the authors of that schema meant is: "These are the known values as of now, but there may be user-defined values or future extensions that should be accepted for forward compatibility".

@ahl
Copy link
Collaborator

ahl commented May 2, 2024

Semantically, I would understand Other to mean "none of the above", so always deserializing into that variant would be a semantic error. I suppose a more correct way to express this in the schema would be to add a not subschema excluding the known variants to the Other variant?

In any case, what I suspect the authors of that schema meant is: "These are the known values as of now, but there may be user-defined values or future extensions that should be accepted for forward compatibility".

Agreed! I'm trying to figure out how we would infer that to be the case or come up with a heuristic that is generally applicable (i.e. rather than one that might be wrong). In general our current anyOf handling is terrible (and needs to be expanded into a oneOf powerset). Perhaps it would suffice to say "if one case overlaps with all others and no other overlap then we consider this to be discrete values plus a none-of-the-above". I think I'd still want to do something like this:

enum MyEnum {
    Foo,
    Bar,
    Other(MyEnumOther),
}
struct MyEnumOther(i64);

impl From<i64> for MyEnum {
    fn from(value: i64) -> Self {
        match value {
            1 => Self::Foo,
            5 => Self::Bar,
            n => Self::Other(n),
        }
    }
}

impl From<MyEnum> for i64 { .. }
impl From<MyEnumOther> for i64 {
    fn from(value: MyEnumOther) -> Self {
        value.0
    }
}

// .. and maybe this as well:
impl TryFrom<i64> for MyEnumOther {
    fn try_from(value: i64) -> Result<Self, ..> {
         match (value) {
            /* known values */ => Err(..),
            n => Ok(Self(n)),
        }
    }
}

This might be overly fussy but it would mean that a particular value had a single representation in the generated code.

@miwig
Copy link
Author

miwig commented May 2, 2024

That seems reasonable to me. For what it's worth, I don't know if this is a common pattern in the json schema world. I would be content to just write these types myself for now, but there it is somewhat annoying that the replacements in TypeSpaceSettings only seem to apply to types generated from references.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants