Source Config: Metadata - Revisit DatasetDescription fields #131

Open
glass-ships opened this issue Apr 5, 2024 · 1 comment
Labels: documentation, help wanted, question

Comments

glass-ships (Collaborator) commented Apr 5, 2024

Currently, the metadata.yaml file (its properties are defined in koza/model/config/source_config.py) is only used for documentation, and several of its fields have names that don't match how they're actually being used:

  • id: its purpose simply isn't clear

  • name appears to be used as the name of the data source, which is redundant with ingest_title

  • ingest_title suggests the title of the ingest itself, not the name of the data source, which seems to be what's expected here?

  • ingest_url: same thing. I would expect this to be the link to the ingest repo, not to the data source.
    We should consider renaming these to reflect what we expect the user to put here.

  • source / provided_by: are these not the same thing? If they differ, we should add a docstring making the distinction clear.

    • It seems like source has been deprecated in favor of provided_by and is no longer used. Can we remove it from the model?
  • rights / license: are these duplicates, or do they refer to different things? Again, a docstring clarifying this would be good.

    • license does not appear to be used in any of our ingests. Can we remove it from the model?
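
To make the source / provided_by point concrete, here is a minimal sketch of how a deprecated key could be folded into its replacement when the metadata is loaded, rather than silently dropped. It assumes the metadata has already been parsed into a plain dict. `normalize_metadata` and `DEPRECATED_KEYS` are hypothetical names for illustration and are not part of Koza.

```python
import warnings

# Hypothetical mapping of deprecated key -> preferred key.
# Only the source -> provided_by pair is suggested by the list above.
DEPRECATED_KEYS = {"source": "provided_by"}


def normalize_metadata(raw: dict) -> dict:
    """Return a copy of `raw` with deprecated keys folded into their replacements."""
    result = dict(raw)
    for old, new in DEPRECATED_KEYS.items():
        if old not in result:
            continue
        value = result.pop(old)
        warnings.warn(
            f"'{old}' is deprecated; use '{new}' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # An explicit value for the new key wins over the deprecated one.
        result.setdefault(new, value)
    return result
```

A step like this would let existing metadata files keep working for a while after a field is removed from the model, with a warning pointing users at the preferred name.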

One option is to remove the definition and usage of metadata within Koza entirely, and let the file exist purely for documentation purposes alongside the ingest files.

Another would be to have Koza read the metadata file and use its contents as default values for various fields during the transform process, possibly writing them to an output metadata file or adding them as columns in the transform output.
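
As a rough sketch of that second option: metadata values act as defaults for columns that a transform row leaves empty. This assumes the contents of metadata.yaml are already loaded into a dict (for example with a YAML parser). `apply_metadata_defaults` and the choice of `provided_by` as the defaulted field are illustrative assumptions, not existing Koza behaviour.

```python
def apply_metadata_defaults(row: dict, metadata: dict, fields=("provided_by",)) -> dict:
    """Fill missing or empty columns in `row` from `metadata`.

    Values already present in the row always win, and only the listed
    fields are copied, so unrelated metadata never leaks into the output.
    """
    filled = dict(row)
    for field in fields:
        if filled.get(field) in (None, "") and field in metadata:
            filled[field] = metadata[field]
    return filled


# Example: the row has no provided_by, so the metadata value is used.
metadata = {"name": "example-source", "provided_by": "example-provider"}
row = {"id": "X:1"}
print(apply_metadata_defaults(row, metadata))
```

Restricting the copy to an explicit list of fields keeps the behaviour predictable: which metadata keys are allowed to become output columns would be a decision for the model, not something inferred from whatever happens to be in the file.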

I do think our move towards modularized ingests adds some importance to sorting this out.
Maybe we can add this as an agenda item for one of the data calls?

glass-ships added the documentation, help wanted, and question labels Apr 5, 2024
matentzn (Member) commented Apr 6, 2024

When you meet about this, please let me know when and where before finalising the metadata collected and the shape it is deployed in.
