Hi, thank you for making this library available to everyone! It is of great use to my university research project.
I believe I have spotted a bug concerning the with_reverse walk option:
Expected Behavior
When generating walks using the RandomWalker in combination with the with_reverse = True flag, the returned walks should contain zero or more predecessor triples, followed by the vertex of interest, followed by zero or more successor triples. In particular, it should be possible to read the returned walks from left to right as a valid traversal of the directed graph. This behavior should not change with the source of the KG.
Current Behavior
When using a local KG, the returned walks are well formed and follow the requirements from above. If the KG instead uses a remote SPARQL source, the resulting walks are no longer legal traversals of the graph. Instead, the generated walks consist of a mirrored successor part, followed by the vertex of interest, followed by another successor part (in correct order).
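To make the requirement concrete, here is a minimal sketch of a check that a walk reads left to right as a valid traversal. It is not part of pyRDF2Vec: the helper name is_valid_traversal and the toy triples are hypothetical and only illustrate the difference between a well-formed walk and a mirrored one.

```python
def is_valid_traversal(walk, triples):
    """Return True if every hop in `walk` is a directed triple in `triples`.

    `walk` alternates vertices and predicates: (v0, p0, v1, p1, v2, ...).
    `triples` is a set of (subject, predicate, object) tuples.
    """
    # Pair up consecutive vertices with the predicate between them.
    hops = zip(walk[0::2], walk[1::2], walk[2::2])
    return all(hop in triples for hop in hops)


# Hypothetical toy graph (assumption, not real DBpedia data).
triples = {
    ("Sequel", "follows", "Film"),   # Sequel is a predecessor of Film
    ("Film", "writer", "Author"),    # Author is a successor of Film
    ("Film", "type", "SciFi"),       # SciFi is a successor of Film
}

# Predecessor part, vertex of interest, successor part.
well_formed = ("Sequel", "follows", "Film", "type", "SciFi")

# Mirrored successor part, vertex of interest, successor part.
mirrored = ("Author", "writer", "Film", "type", "SciFi")

well_formed_ok = is_valid_traversal(well_formed, triples)
mirrored_ok = is_valid_traversal(mirrored, triples)
```

With this toy graph, the first walk passes the check, while the second fails because ("Author", "writer", "Film") is not a triple of the graph; only its inverse is.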
Steps to Reproduce
from pyrdf2vec import RDF2VecTransformer
from pyrdf2vec.embedders import Word2Vec
from pyrdf2vec.graphs import KG
from pyrdf2vec.walkers import RandomWalker

dbpedia = KG("https://dbpedia.org/sparql")
transformer = RDF2VecTransformer(
    Word2Vec(sg=0, vector_size=10),
    walkers=[RandomWalker(max_walks=1, max_depth=1, with_reverse=True, md5_bytes=None)],
    verbose=1,
)
transformer.get_walks(dbpedia, ["http://dbpedia.org/resource/The_Matrix"])
"""
e.g. [[('http://dbpedia.org/resource/The_Wachowskis', 'http://dbpedia.org/property/writer', 'http://dbpedia.org/resource/The_Matrix', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type', 'http://dbpedia.org/class/yago/Wikicat1990sScienceFictionFilms')]]
Notice that the first triple does not exist on DBpedia, only its inverse does.
"""
Thank you for reporting this @rgrenz. You are correct that the behaviour of with_reverse seems faulty and should be fixed (something on our roadmap). Unfortunately, our bandwidth is rather limited, so this might take some time. Feel free to open a PR if you fix it locally. You are spot on that fetch_hops needs to be extended to include reverse walking logic, which should use a different SPARQL query (with the object rather than the subject filled in). I think it can be fixed by extending only get_query (https://github.com/IBCNServices/pyRDF2Vec/blob/main/pyrdf2vec/connectors.py#L136) and the suggested fetch_hops (the latter should do nothing more than pass with_reverse on to get_query).
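As a rough illustration of the "object rather than subject filled in" idea, the two query shapes could look like the sketch below. This is not the library's get_query: the function name build_hop_query, its reverse parameter, and the exact query text are assumptions made for the example.

```python
def build_hop_query(entity: str, reverse: bool = False) -> str:
    """Build a SPARQL query for the direct neighbours of `entity`.

    Hypothetical helper: forward hops bind the entity as subject,
    reverse hops bind it as object.
    """
    if reverse:
        # Predecessors: triples of the form (?s, ?p, entity).
        return f"SELECT ?s ?p WHERE {{ ?s ?p <{entity}> . }}"
    # Successors: triples of the form (entity, ?p, ?o).
    return f"SELECT ?p ?o WHERE {{ <{entity}> ?p ?o . }}"


forward_query = build_hop_query("http://dbpedia.org/resource/The_Matrix")
reverse_query = build_hop_query("http://dbpedia.org/resource/The_Matrix", reverse=True)
```

Because the two queries return differently shaped rows, any cache keyed on the entity alone would presumably also need to distinguish the direction.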
Possible Solution
The fetch_hops() function from below should support the with_reverse option, as does its local counterpart _get_hops(). However, this probably also requires modifications to the querying and caching code.
https://github.com/IBCNServices/pyRDF2Vec/blob/fb7da659f67b6486a403a46bc2d3c589b802304c/pyrdf2vec/graphs/kg.py#L241-L256
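One way to picture the intended result, independent of how the hops are fetched, is the sketch below. It assumes predecessor hops arrive as (subject, predicate) pairs and successor hops as (predicate, object) pairs; the helper name assemble_walk and the toy values are hypothetical and do not reflect pyRDF2Vec's internal data structures.

```python
def assemble_walk(root, predecessor_hops, successor_hops):
    """Combine hops around `root` into one walk readable left to right.

    predecessor_hops: (subject, predicate) pairs, nearest to `root` first.
    successor_hops: (predicate, object) pairs, nearest to `root` first.
    """
    walk = (root,)
    for subject, predicate in predecessor_hops:
        # A predecessor is prepended in (subject, predicate) order, not mirrored.
        walk = (subject, predicate) + walk
    for predicate, obj in successor_hops:
        walk = walk + (predicate, obj)
    return walk


# Hypothetical toy values (assumption, not real DBpedia data).
walk = assemble_walk("Film", [("Sequel", "follows")], [("type", "SciFi")])
```

Here the walk comes out as Sequel, follows, Film, type, SciFi, so the predecessor triple keeps its original direction.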