Issue #2812: Reflect explicitly XSD-typed Literals in JSON-LD serialization #2889

Merged (3 commits) on Sep 29, 2024

lu-pl commented Aug 13, 2024

Summary of changes

This PR removes a single code path in the JSON-LD serializer responsible for dropping XSD type information explicitly supplied in an rdflib.Literal.

Without this change, the following example loses the XSD.string datatype in the JSON-LD serialization, while the Turtle and XML serializations retain it.

from rdflib import Graph, Literal, URIRef, XSD

graph = Graph()

graph.add(
    (
        URIRef("https://test.subject"),
        URIRef("https://test.predicate"),
        Literal("test value explicit type", datatype=XSD.string),
    )
)
graph.add(
    (
        URIRef("https://test.subject"),
        URIRef("https://test.predicate"),
        Literal("test value implicit type"),
    )
)

# Turtle is the default format; compare how each serializer renders the typed literal
print(graph.serialize())
print(graph.serialize(format="xml"))
print(graph.serialize(format="json-ld"))

Note: A test in the JSON-LD test-suite had to be modified for this change; this test will fail without the change.

Closes: #2812

Checklist

  • Checked that there aren't other open pull requests for
    the same change.
  • Checked that all tests and type checking passes.
  • If the change adds new features or changes the RDFLib public API:
    • Created an issue to discuss the change and get in-principle agreement.
    • Considered adding an example in ./examples.
  • If the change has a potential impact on users of this project:
    • Added or updated tests that fail without the change.
    • Updated relevant documentation to avoid inaccuracies.
    • Considered adding additional documentation.
  • Considered granting push permissions to the PR branch,
    so maintainers can fix minor issues and keep your PR up to date.

Supplying an XSD type argument to the rdflib.Literal datatype
parameter should be reflected in JSON-LD serializations.

Closes: RDFLib#2812
Modify test "t#0018" in JSON-LD test-suite: Add XSD types to the
expected JSON-LD output.

Add test "t#0020" in JSON-LD test-suite: Add another test with mixed
explicit typing in the input source.

ashleysommer commented Aug 26, 2024

@lu-pl
I thought in an RDFLib Literal there is no way to distinguish between a provided datatype and an inferred datatype.

How does this know whether to emit a datatype for only user-provided datatypes?

lu-pl commented Aug 26, 2024

@lu-pl I thought in an RDFLib Literal there is no way to distinguish between a provided datatype and an inferred datatype.

How does this know whether to emit a datatype for only user-provided datatypes?

Thanks for the reply!

Is there such a thing as an inferred explicit datatype in rdflib.Literal?

A datatype is either explicitly assigned, in which case rdflib.Literal.datatype holds the datatype URI, or not, in which case rdflib.Literal.datatype is None.

from rdflib import Literal, XSD

literal_implicit_type = Literal("value")
literal_explicit_type = Literal("value", datatype=XSD.string)

print(literal_implicit_type.datatype)  # None
print(literal_explicit_type.datatype)  # http://www.w3.org/2001/XMLSchema#string

The only type inference defined in the standard concerns simple literals: a literal that is not explicitly typed is interpreted as syntactic sugar for an xsd:string literal.

Please note that concrete syntaxes MAY support simple literals consisting of only a lexical form without any datatype IRI, language tag, or base direction. Simple literals are syntactic sugar for abstract syntax literals with the datatype IRI http://www.w3.org/2001/XMLSchema#string (which is commonly abbreviated as xsd:string). (3.3 Literals)

(E.g. Turtle provides a type-inferring shorthand syntax for numbers and booleans, see 2.5.2)
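The simple-literal rule quoted above can be written down as a small function. This is a sketch of the abstract-syntax rule only, not of anything rdflib does internally: the function name effective_datatype is made up for illustration, it works on plain strings rather than rdflib terms, and it follows RDF 1.1 (language-tagged strings get rdf:langString) without covering base direction.

```python
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANG_STRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


def effective_datatype(datatype: Optional[str], language: Optional[str] = None) -> str:
    """Return the abstract-syntax datatype IRI of a literal (hypothetical helper)."""
    if language is not None:
        # Language-tagged strings always have the datatype rdf:langString.
        return RDF_LANG_STRING
    if datatype is None:
        # Simple literal: syntactic sugar for an xsd:string literal.
        return XSD_STRING
    # Explicitly typed literal: the given datatype IRI is kept unchanged.
    return datatype


print(effective_datatype(None))
print(effective_datatype(XSD_STRING))
print(effective_datatype(None, language="en"))
```

Under this rule a simple literal and an explicit xsd:string literal are the same abstract literal, while rdflib.Literal.datatype still records whether a datatype was supplied.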

The problem I am trying to address is that the JSON-LD serializer does not retain datatypes defined in an rdflib.Graph instance (regardless of how that graph got populated).

lu-pl commented Aug 27, 2024

One thing that seems worth discussing is the intended purpose of line 400 in plugins.serializers.jsonld.py.

This code path is responsible for dropping type info for explicitly typed Literals (#2812), so the PR removes it. But I don't see any reason for that code path to exist in the first place.

@nicholascar nicholascar merged commit 14d1006 into RDFLib:main Sep 29, 2024
22 checks passed