sh:targetClass does not use shacl/ont graph hierarchies #148

Closed
gtfierro opened this issue Jun 29, 2022 · 5 comments

@gtfierro (Contributor) commented Jun 29, 2022

Say I have a simple SHACL-based ontology as follows:

@prefix ex: <urn:ex#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:Class a owl:Class .
ex:SubClass a owl:Class ;
    rdfs:subClassOf ex:Class .
ex:SubSubClass a owl:Class ;
    rdfs:subClassOf ex:SubClass .

ex:FailedRule a sh:NodeShape ;
    sh:targetClass ex:Class ;
    sh:rule [
        a sh:TripleRule ;
        sh:object ex:Inferred ;
        sh:predicate ex:hasProperty ;
        sh:subject sh:this ;
    ] .

and a separate data graph:

@prefix ex: <urn:ex#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

ex:A a ex:SubSubClass .

I would expect that running pySHACL with advanced features enabled would generate the triple ex:A ex:hasProperty ex:Inferred on the graph. However, this only happens when I put the ontology/shapes and the data in the same graph object.

Succeeds:

def test_ruleTargetClass_onegraph():
    data_g = rdflib.Graph().parse(data=shapes_and_ontology_data, format='turtle').parse(data=model_data, format='turtle')

    conforms, results_graph, results_text = pyshacl.validate(
        data_graph=data_g, advanced=True
    )
    assert conforms
    assert (rdflib.URIRef("urn:ex#A"), rdflib.URIRef("urn:ex#hasProperty"), rdflib.URIRef("urn:ex#Inferred")) in data_g

Fails:

def test_ruleTargetClass_twograph():
    shape_g = rdflib.Graph().parse(data=shapes_and_ontology_data, format='turtle')
    data_g = rdflib.Graph().parse(data=model_data, format='turtle')

    conforms, results_graph, results_text = pyshacl.validate(
        data_graph=data_g, shacl_graph=shape_g, advanced=True
    )
    assert conforms
    assert (rdflib.URIRef("urn:ex#A"), rdflib.URIRef("urn:ex#hasProperty"), rdflib.URIRef("urn:ex#Inferred")) in data_g

I am not running RDFS or OWL inference in this scenario, and per the SHACL specification I shouldn't have to: the sh:targetClass property should be aware of the rdfs:subClassOf hierarchy. Indeed, this works as expected when the data graph contains the rdfs:subClassOf statements. However, as implemented, sh:targetClass only considers triples inside the data_graph argument to validate, not the shacl_graph or ont_graph arguments.

I believe there is a straightforward fix: pass the SHACL and ontology graphs into the apply_rules function within pySHACL. I've developed a reproducible test case (above) and will start looking at implementing a fix. How does the proposed approach sound?

@gtfierro (Contributor, Author)

This issue also occurs when I pass the SHACL graph as an argument to shacl_graph, ont_graph, or both.

@ashleysommer (Collaborator) commented Jun 30, 2022

Hi @gtfierro
It appears to me that you've come across two different common PySHACL stumbling points, and you're treating them as a single issue.

The first problem is this:

this seems to work great in the case where the data graph has the rdfs:subClassOf statements. However, the way sh:targetClass is implemented, it only considers triples inside the data graph argument to validate

That is correct. All OWL- and RDFS-defined relationships/axioms need to be part of the data graph at runtime. This comes from the SHACL spec document, section 2.1.3.2:

Note that, according to the SHACL instance definition, all the rdfs:subClassOf declarations needed to walk the class hierarchy need to exist in the data graph

This affects not only sh:targetClass but all SHACL constraints that operate on classes. And it is not specific to PySHACL; you'll find the same behaviour in any SHACL validator that adheres strictly to the spec.
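
Concretely, for the example in this issue that means the data graph itself must carry the class-hierarchy triples before validation. A minimal sketch, reusing model_data from above:

import rdflib

# Illustrative only: append the rdfs:subClassOf hierarchy to the data
# so a spec-compliant validator can walk from ex:SubSubClass up to ex:Class.
hierarchy = '''
@prefix ex: <urn:ex#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
ex:SubSubClass rdfs:subClassOf ex:SubClass .
ex:SubClass rdfs:subClassOf ex:Class .
'''
data_g = rdflib.Graph().parse(data=model_data, format='turtle').parse(data=hierarchy, format='turtle')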

This is the most frequently asked question on the PySHACL issue tracker; see #142, #46, and #38 (the second part). The main conversation about this was in #6.

The solution is to use the ont_graph feature to mix the ontological definitions into your data graph at runtime; this issue is exactly why that feature exists. See this example for how that works.
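
For the two-graph test above, that looks something like the following sketch (here the shapes graph doubles as the ontology graph, since it already holds the rdfs:subClassOf statements):

import rdflib
import pyshacl

shape_g = rdflib.Graph().parse(data=shapes_and_ontology_data, format='turtle')
data_g = rdflib.Graph().parse(data=model_data, format='turtle')

# ont_graph is mixed into a clone of the data graph before validation,
# so sh:targetClass can see the class hierarchy without mutating data_g.
conforms, results_graph, results_text = pyshacl.validate(
    data_graph=data_g, shacl_graph=shape_g, ont_graph=shape_g, advanced=True
)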

Your second problem is:

I would expect that running pyshacl with advanced features will generate the ex:A ex:hasProperty ex:Inferred triple on the graph

This is a duplicate of #78 (and closely related to #20)

The SHACL spec specifies that the validator should not modify either the data graph or shapes graph at run time:

performed on the fly as part of SHACL processing (without modifying either data graph or shapes graph)

PySHACL creates a clone of your source data graph at runtime, and any operations on the data graph (e.g. inferencing/entailment and triple rules) are performed on that cloned graph. That is why the inferred triples do not exist on data_g after validation is run, even when the run succeeds.

Note, however, that you did find a bug! The example you labeled "Succeeds" should actually fail. PySHACL sometimes skips cloning the data graph when it thinks there are no modifications to make, e.g. when there is no ont_graph to mix in and RDFS/OWL inferencing is disabled. In that case it operates directly on the input graph, which is incorrect here, and is why the triples do exist on the graph in your test.
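
Once that clone bug is fixed, the spec-compliant behaviour looks like this sketch (same shape_g/data_g setup as above): validation conforms, but the inferred triple never lands on the input graph.

conforms, _, _ = pyshacl.validate(
    data_graph=data_g, shacl_graph=shape_g, ont_graph=shape_g, advanced=True
)
assert conforms
# The TripleRule fired on PySHACL's internal clone, not on data_g:
assert (rdflib.URIRef("urn:ex#A"), rdflib.URIRef("urn:ex#hasProperty"), rdflib.URIRef("urn:ex#Inferred")) not in data_g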

See the long-running discussion thread in #60 about a proposed alternative mode that would operate PySHACL as a new kind of inferencing engine, emitting the rule-generated triples back into the input data_graph.

In your PR #149, you've implemented an alternative solution to the first problem in an effort to solve the second. That is why you are getting inconsistent results in your tests when using ont_graph and when RDFS inferencing is enabled.

@ashleysommer (Collaborator) commented Jun 30, 2022

Note: there is a little hack/workaround you can use if you really do want PySHACL to modify your input data graph rather than creating a clone: the undocumented inplace switch, e.g.:

validate(data_g, shacl_graph=shapes_g, ont_graph=ont_g, advanced=True, inplace=True)

That puts the PySHACL validator into a non-spec-compliant mode where it skips the clone step and emits any changes directly back into the input graph. It is normally used when your data graph is not cloneable (e.g. you are using a SPARQL connector to a remote data graph, or the graph cannot fit in memory), but I've seen users use it to emulate the behaviour you're expecting in this issue.

See here for an example
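
Applied to the failing test from this issue, the inplace switch makes the original assertion pass. A sketch, assuming the same shape_g/data_g setup as above (note data_g is mutated, and will also contain the mixed-in ontology triples):

conforms, _, _ = pyshacl.validate(
    data_g, shacl_graph=shape_g, ont_graph=shape_g, advanced=True, inplace=True
)
assert conforms
# With inplace=True the rule output is written straight back into data_g:
assert (rdflib.URIRef("urn:ex#A"), rdflib.URIRef("urn:ex#hasProperty"), rdflib.URIRef("urn:ex#Inferred")) in data_g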

@gtfierro (Contributor, Author) commented Jul 4, 2022

Wow, thank you for the extremely thoughtful and detailed answer! I see now how my mental model of pySHACL was incorrect, and I've adjusted my code to access the inferred triples. My final solution is a little awkward because I have to subtract out the triples I don't want from my expanded data graph, but it works reliably and generates the same output as TopBraid Composer.
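
For reference, one possible shape of that workaround, sketched under the assumption that everything in shape_g counts as an unwanted ontology/shape triple:

# Merge data and shapes/ontology into one graph, expand it in place,
# then subtract the shapes/ontology triples so only the original data
# plus the rule-inferred triples remain.
expanded = rdflib.Graph()
for t in shape_g:
    expanded.add(t)
for t in data_g:
    expanded.add(t)

pyshacl.validate(expanded, advanced=True, inplace=True)

result = expanded - shape_g  # rdflib graph set-difference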

I can leave this open if you would like an open reminder of the small bug that I inadvertently found, or I can close this because my original issue is technically resolved. Let me know!

@ashleysommer (Collaborator)

I can leave this open if you would like an open reminder of the small bug that I inadvertently found

No need, a fix for that was already included in the v0.19.1 release.

I'm glad to know you've solved your problem with a workaround.
