Support data schema (column metadata) #20
Comments
@jwestw This is related to what we discussed today. As mentioned above, this is relevant to the output of CSVW. In addition, it would meet a need in Open SDG, where we don't have any way to control the ordering of the columns.

Countries that are inputting their data from SDMX already have a data schema in their DSD (data structure definition), so I think we can focus on the use case of countries that are using CSV files, like the UK.

I'll throw out some ideas for approaches below. Personally I lean towards "jsonschema per indicator" along with "auto-generated".

One central jsonschema file

With this approach, there would be a single (very long) jsonschema file in the country's data repository, like "data-schema.json". It would be a full collection of all the columns and values used across all indicators. For example, part of it might look like this:
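As a rough illustration of what such a central file could hold, here is a minimal sketch. The column names and values (`Year`, `Sex`, `Age`, `Value`) are hypothetical and not taken from any real data repository, and the helper `column_order` is an assumed name. The sketch also assumes column ordering is read from the order of keys under `properties`, which Python dicts preserve but the JSON specification itself does not guarantee.

```python
import json

# Hypothetical columns and allowed values, for illustration only.
central_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "Year": {"type": "integer"},
        "Sex": {"type": "string", "enum": ["Female", "Male"]},
        "Age": {"type": "string", "enum": ["0 to 14", "15 to 64", "65 and over"]},
        "Value": {"type": "number"},
    },
}


def column_order(schema):
    """Return column names in the order they are declared in the schema."""
    return list(schema["properties"])


# Serialise the schema as it might be written to "data-schema.json".
schema_text = json.dumps(central_schema, indent=2)
print(schema_text)
print(column_order(central_schema))
```

If key order turns out to be too fragile a way to express column ordering, an explicit list of column names alongside `properties` would be one alternative.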
Pros: centrally located and comprehensive (this is analogous to an SDMX DSD)

Jsonschema per indicator

With this approach there would be a separate jsonschema file for each indicator. It would look the same as the above, but would only contain the columns/values that are used in that indicator.

Pros: each indicator can be configured separately

Auto-generated

This approach could be combined with one of the other two. In this approach, if a column did not have any jsonschema representation, then that jsonschema would be auto-generated, assuming

Using this same code we could also provide a way for countries to "initialize" a jsonschema file, for the purposes of customizing it. For example, say a country wants to customize their data schema for indicator 1.1.1 - they could run a Python script like

Pros: spares the countries from needing to maintain jsonschema
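The auto-generation idea could be sketched roughly as follows. This is not the project's implementation: the function name `generate_schema` and the inference rules are assumptions made for illustration. Here a column whose observed values all parse as numbers becomes a `number`, and any other column becomes a `string` restricted to the values seen in the data, with columns kept in CSV header order.

```python
import csv
import io


def _is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def generate_schema(csv_text):
    """Build a jsonschema-style dict from CSV text (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = list(reader.fieldnames or [])

    # Collect distinct non-empty values per column, in first-seen order.
    observed = {name: [] for name in columns}
    for row in reader:
        for name in columns:
            value = row[name]
            if value and value not in observed[name]:
                observed[name].append(value)

    properties = {}
    for name in columns:
        values = observed[name]
        if values and all(_is_number(v) for v in values):
            # Assumption: numeric columns are not enumerated.
            properties[name] = {"type": "number"}
        elif values:
            properties[name] = {"type": "string", "enum": values}
        else:
            # No data seen for this column, so no enum can be inferred.
            properties[name] = {"type": "string"}
    return {"type": "object", "properties": properties}


sample = "Year,Sex,Value\n2015,Female,10.5\n2015,Male,9.8\n2016,Female,11.0\n"
schema = generate_schema(sample)
print(schema)
```

The same function could serve the "initialize" use case: its output can be written to a file once and then edited by hand.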
This is to support reading from (#9) and writing to (#15) SDMX and CSVW (#19).
The indicator class will store the raw data and metadata, and we also want it to hold the schema for the data. This includes things such as: