-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added IRIs as extra field properties in datamodels #434
base: master
Are you sure you want to change the base?
Added IRIs as extra field properties in datamodels #434
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #434 +/- ##
=======================================
Coverage 82.79% 82.79%
=======================================
Files 14 14
Lines 616 616
=======================================
Hits 510 510
Misses 106 106
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementing the IRI
key the way you have is deprecated in pydantic v2, see https://github.com/EMMC-ASBL/oteapi-core/actions/runs/8063034440/job/22024046478?pr=434#step:7:224.
I have made the according suggested changes in one of the files, but this should be repeated throughout all the config files, where IRI
is implemented/added.
Also, I think you can remove the "type: ignore" comment afterwards.
@@ -36,6 +38,5 @@ class DataCacheConfig(AttrDict): | |||
tag: Optional[str] = Field( | |||
None, | |||
description="Tag assigned to the downloaded content, typically " | |||
"identifying a session. Used with the `evict()` method to clean up a " | |||
"all cache entries with a given tag.", | |||
"identifying a session. Used with the `evict()` method to clean up all cache entries with a given tag.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please ahead to the PEP8 Python conventions. Max line length is 80 characters.
oteapi/models/filterconfig.py
Outdated
query: Optional[str] = Field( | ||
None, | ||
description="Define a query operation.", | ||
IRI="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement", # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rdf:Statement
refer to a RDF statement as a triple. That is not the meaning of the query
field.
I haven't looked much into the details of the data cube vocabulary, but qb:slice seems to be a possible property one could refer to, since slicing is about selecting a subpart of a dataset, witch is exactly the purpose of a filter strategy. However, a big issue with data cube vocabulary is that it is not related to dcat, so I will not recommend to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The suggested properties for the query
, condition
and limit
fields seems arbitrary chosen. Since there exists no properties in the DCAT and related W3C vocabularies that correspond to these fields, I think that it is better that we define our them in the OTE Interface Ontology (OTEIO).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are absolutely right about the IRIs being a bit random for certain properties, but I would much rather go to more generic concepts in established ontologies, than making up our own concepts in a custom ontology that nobody has adopted.
) | ||
prefixes: Optional[Dict[str, str]] = Field( | ||
None, | ||
description=( | ||
"Dictionary of shortnames that expands to an IRI given as local " | ||
"value/IRI-expansion-pairs." | ||
), | ||
IRI="http://www.w3.org/2004/02/skos/core#notation", # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If using skos:notation
for prefixes, then we need a to define a custom datatype, like oteio:prefixType
, such that we can serialise a prefix as:
:mappingFilter skos:notation "rdfs: <http://www.w3.org/2000/01/rdf-schema#>"^^oteio:prefixType .
However, I think it would be simpler to define our own oteio:prefix
, such that we can express the above as:
:mappingFilter oteio:prefix "rdfs: <http://www.w3.org/2000/01/rdf-schema#>" .
description="Type of registered parser strategy.", | ||
IRI="http://purl.org/dc/terms/type", | ||
) # type: ignore | ||
entity: AnyHttpUrl = Field( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Entity is a difficult name. It can mean anything. I am not found of freely mixing different vocabularies, but if we really want to identify this with schema:url
, then I think the field should be named url
.
None, description="Type of registered resource strategy." | ||
None, | ||
description="Type of registered resource strategy.", | ||
IRI="http://purl.org/dc/terms/type", # type: ignore | ||
) | ||
|
||
downloadUrl: Optional[HostlessAnyUrl] = Field( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please spell the field name as downloadURL
to be consistent with DCAT.
@@ -42,6 +45,7 @@ class ResourceConfig(GenericConfig, SecretConfig): | |||
" type of the distribution is defined in IANA " | |||
"[[IANA-MEDIA-TYPES](https://www.w3.org/TR/vocab-dcat-2/#bib-iana-media-types)]." | |||
), | |||
IRI="http://www.w3.org/ns/dcat#mediaType", # type: ignore | |||
) | |||
accessUrl: Optional[HostlessAnyUrl] = Field( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please spell the field name as accessURL
to be consistent with DCAT.
None, description="Time when the transformation process started. Given in UTC." | ||
None, | ||
description="Time when the transformation process started. Given in UTC.", | ||
IRI="http://purl.org/dc/terms/date", # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
startTime
and finishTime
cannot have the same IRI. However, DCAT has dcat:startDate
and dcat:endDate
which are suitable here.
The only issue with dcat:startDate
and dcat:endDate
are that they have domain dcterms:PeriodOfTime
. Since a TransformationStatus
describes a time period, but isn't a time periode itself, the RDF serialisation cannot be straight forward, like
:transformation_status1
dcat:startTime "2024-02-29 08:45" ;
dcat:endTime "2024-02-29 09:00" .
but has to has to be expressed as:
:transformation_status1
dcterms:temporal [
a dcterms:PeriodOfTime ;
dcat:startTime "2024-02-29 08:45" ;
dcat:endTime "2024-02-29 09:00" ;
] .
Co-authored-by: Casper Welzel Andersen <[email protected]>
Co-authored-by: Casper Welzel Andersen <[email protected]>
Co-authored-by: Casper Welzel Andersen <[email protected]>
Co-authored-by: Casper Welzel Andersen <[email protected]>
Co-authored-by: Casper Welzel Andersen <[email protected]>
Co-authored-by: Jesper Friis <[email protected]>
Description:
Type of change:
Checklist for the reviewer:
This checklist should be used as a help for the reviewer.