Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Either Data needs shape options or the default shape should be any shape #83

Open
rly opened this issue Aug 8, 2024 · 7 comments · May be fixed by #84
Open

Either Data needs shape options or the default shape should be any shape #83

rly opened this issue Aug 8, 2024 · 7 comments · May be fixed by #84
Assignees
Labels
category: enhancement improvements of code or code behavior priority: medium non-critical problem and/or affecting only a small set of NWB users

Comments

@rly
Copy link
Contributor

rly commented Aug 8, 2024

By default, a dataset's shape is null which means the attribute/dataset is a scalar. ref
In object-oriented programming - specifically, the Liskov substitution principle - and our type system, subtypes (subclasses) should not expand the range of allowed options from the supertype (superclass).

The dataset data type Data has no shape defined, so its shape is scalar. Pretty much all subtypes of Data define a non-scalar shape. This breaks the above principle.

This had low impact because almost all subtypes of Data that are used define a shape (except for NWB's ScratchData) but as we are now trying to improve how the API resolves and validates shapes, the base shape matters.

Option 1: Add scalar, 1D, 2D, 3D, 4D, 5D shape options to Data.

  • Problem: The schema language does not allow one to specify a dataset is either scalar or 1D.
  • Solution 1: We could add an option where shape = 0 means scalar.
  • Solution 2: We could add an option where shape = "ANY" means any shape

Option 2: Make Data a special case where the default is any shape. This is kind of hacky.

Option 3: Change the default shape to be any shape for datasets. This could break existing schema that do not specify a shape and assume it is a scalar. I think other options are better.

@rly rly added category: enhancement improvements of code or code behavior priority: medium non-critical problem and/or affecting only a small set of NWB users labels Aug 8, 2024
@rly rly self-assigned this Aug 8, 2024
@rly
Copy link
Contributor Author

rly commented Aug 8, 2024

@oruebel curious your thoughts. I think there is value to having a special key that represents Any shape. We have added that to LinkML: https://linkml.io/linkml/schemas/arrays.html#any

@oruebel
Copy link
Contributor

oruebel commented Aug 8, 2024

. I think there is value to having a special key that represents Any shape

That sounds reasonable

@bendichter
Copy link
Contributor

What's wrong with just saying shape: null means Any?

@bendichter
Copy link
Contributor

It seems to me the scalar should be the special case.

@rly rly linked a pull request Aug 8, 2024 that will close this issue
4 tasks
@rly
Copy link
Contributor Author

rly commented Aug 8, 2024

It seems to me the scalar should be the special case.

I agree the scalar should be the special case. But this will modify the meaning of existing schema where a shape was not defined, e.g., https://github.com/NeurodataWithoutBorders/nwb-schema/blob/dev/core/nwb.file.yaml#L27

Thinking through this some more, I realize that this is not totally a breaking change. It just means more options (all options) will be valid when no shape is specified, including scalar. These schema would need to be updated to say to restrict the shape to a scalar using a new special value for scalar.

We version the language and are already preparing a 3.0.0 release so such a change would be OK.

I am happy to make scalar be the special case instead.

@bendichter
Copy link
Contributor

I like having null mean any shape for a few reasons

  • It feels logically consistent to me. null already represents freedom in the shape spec, so just a null would mean ultimate freedom.
  • While we do have some scalar datasets, it seems to me like an odd thing to do to make a scalar dataset instead of making it an attribute, so I think that should be specified explicitly with shape: scalar

@rly
Copy link
Contributor Author

rly commented Aug 8, 2024

Sounds good. In LinkML, we also decided that shape: null = any shape. I was just worrying about compatibility. I'll make the change to shape: scalar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: enhancement improvements of code or code behavior priority: medium non-critical problem and/or affecting only a small set of NWB users
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants