Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added IRIs as extra field properties in datamodels #434

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

quaat
Copy link
Contributor

@quaat quaat commented Feb 27, 2024

Description:

Type of change:

  • Bug fix.
  • New feature.
  • Documentation update.

Checklist for the reviewer:

This checklist should be used as a help for the reviewer.

  • Is the change limited to one issue?
  • Does this PR close the issue?
  • Is the code easy to read and understand?
  • Do all new feature have an accompanying new test?
  • Has the documentation been updated as necessary?

@quaat quaat linked an issue Feb 27, 2024 that may be closed by this pull request
Copy link

codecov bot commented Feb 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.79%. Comparing base (f5587a6) to head (3f9b721).
Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #434   +/-   ##
=======================================
  Coverage   82.79%   82.79%           
=======================================
  Files          14       14           
  Lines         616      616           
=======================================
  Hits          510      510           
  Misses        106      106           
Flag Coverage Δ
linux 82.79% <ø> (ø)
linux-strategies 82.79% <ø> (ø)
windows 82.14% <ø> (ø)
windows-strategies 82.14% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@CasperWA CasperWA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementing the IRI key the way you have is deprecated in pydantic v2, see https://github.com/EMMC-ASBL/oteapi-core/actions/runs/8063034440/job/22024046478?pr=434#step:7:224.

I have made the according suggested changes in one of the files, but this should be repeated throughout all the config files, where IRI is implemented/added.
Also, I think you can remove the "type: ignore" comment afterwards.

oteapi/models/datacacheconfig.py Outdated Show resolved Hide resolved
oteapi/models/filterconfig.py Outdated Show resolved Hide resolved
oteapi/models/filterconfig.py Outdated Show resolved Hide resolved
oteapi/models/filterconfig.py Outdated Show resolved Hide resolved
oteapi/models/filterconfig.py Outdated Show resolved Hide resolved
@@ -36,6 +38,5 @@ class DataCacheConfig(AttrDict):
tag: Optional[str] = Field(
None,
description="Tag assigned to the downloaded content, typically "
"identifying a session. Used with the `evict()` method to clean up a "
"all cache entries with a given tag.",
"identifying a session. Used with the `evict()` method to clean up all cache entries with a given tag.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please ahead to the PEP8 Python conventions. Max line length is 80 characters.

query: Optional[str] = Field(
None,
description="Define a query operation.",
IRI="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement", # type: ignore
Copy link
Contributor

@jesper-friis jesper-friis Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rdf:Statement refer to a RDF statement as a triple. That is not the meaning of the query field.

I haven't looked much into the details of the data cube vocabulary, but qb:slice seems to be a possible property one could refer to, since slicing is about selecting a subpart of a dataset, witch is exactly the purpose of a filter strategy. However, a big issue with data cube vocabulary is that it is not related to dcat, so I will not recommend to use it.

Copy link
Contributor

@jesper-friis jesper-friis Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suggested properties for the query, condition and limit fields seems arbitrary chosen. Since there exists no properties in the DCAT and related W3C vocabularies that correspond to these fields, I think that it is better that we define our them in the OTE Interface Ontology (OTEIO).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are absolutely right about the IRIs being a bit random for certain properties, but I would much rather go to more generic concepts in established ontologies, than making up our own concepts in a custom ontology that nobody has adopted.

)
prefixes: Optional[Dict[str, str]] = Field(
None,
description=(
"Dictionary of shortnames that expands to an IRI given as local "
"value/IRI-expansion-pairs."
),
IRI="http://www.w3.org/2004/02/skos/core#notation", # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If using skos:notation for prefixes, then we need a to define a custom datatype, like oteio:prefixType, such that we can serialise a prefix as:

:mappingFilter skos:notation "rdfs: <http://www.w3.org/2000/01/rdf-schema#>"^^oteio:prefixType .

However, I think it would be simpler to define our own oteio:prefix, such that we can express the above as:

:mappingFilter oteio:prefix "rdfs: <http://www.w3.org/2000/01/rdf-schema#>" .

description="Type of registered parser strategy.",
IRI="http://purl.org/dc/terms/type",
) # type: ignore
entity: AnyHttpUrl = Field(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Entity is a difficult name. It can mean anything. I am not found of freely mixing different vocabularies, but if we really want to identify this with schema:url, then I think the field should be named url.

None, description="Type of registered resource strategy."
None,
description="Type of registered resource strategy.",
IRI="http://purl.org/dc/terms/type", # type: ignore
)

downloadUrl: Optional[HostlessAnyUrl] = Field(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please spell the field name as downloadURL to be consistent with DCAT.

@@ -42,6 +45,7 @@ class ResourceConfig(GenericConfig, SecretConfig):
" type of the distribution is defined in IANA "
"[[IANA-MEDIA-TYPES](https://www.w3.org/TR/vocab-dcat-2/#bib-iana-media-types)]."
),
IRI="http://www.w3.org/ns/dcat#mediaType", # type: ignore
)
accessUrl: Optional[HostlessAnyUrl] = Field(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please spell the field name as accessURL to be consistent with DCAT.

None, description="Time when the transformation process started. Given in UTC."
None,
description="Time when the transformation process started. Given in UTC.",
IRI="http://purl.org/dc/terms/date", # type: ignore
Copy link
Contributor

@jesper-friis jesper-friis Feb 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

startTime and finishTime cannot have the same IRI. However, DCAT has dcat:startDate and dcat:endDate which are suitable here.

The only issue with dcat:startDate and dcat:endDate are that they have domain dcterms:PeriodOfTime. Since a TransformationStatus describes a time period, but isn't a time periode itself, the RDF serialisation cannot be straight forward, like

:transformation_status1 
  dcat:startTime "2024-02-29 08:45" ;
  dcat:endTime "2024-02-29 09:00" .

but has to has to be expressed as:

:transformation_status1 
  dcterms:temporal [
    a dcterms:PeriodOfTime ;
    dcat:startTime "2024-02-29 08:45" ;
    dcat:endTime "2024-02-29 09:00" ;
  ] .

quaat and others added 6 commits February 29, 2024 13:00
Co-authored-by: Casper Welzel Andersen <[email protected]>
Co-authored-by: Casper Welzel Andersen <[email protected]>
Co-authored-by: Casper Welzel Andersen <[email protected]>
Co-authored-by: Casper Welzel Andersen <[email protected]>
Co-authored-by: Casper Welzel Andersen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IRI in Pydantic Datamodels for RDF serialization
3 participants