conversion:links_via

Tim L edited this page Jun 19, 2013 · 57 revisions
csv2rdf4lod-automation is licensed under the [Apache License, Version 2.0](https://github.com/timrdf/csv2rdf4lod-automation/wiki/License)

See conversion:Enhancement.

Introduction

Objects that are [promoted to Resources](conversion:range rdfs:Resource) may be linked to external resources. A list of RDF files containing mappings and a list of predicates to query those mappings may be specified (with conversion:links_via and conversion:subject_of, respectively). All predicates listed by conversion:subject_of will be used in all files listed by conversion:links_via. To express more granular control, use multiple ObjectSameAsEnhancements listing different files and predicates.
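The "all predicates in all files" rule amounts to a cross product. The following Python sketch only illustrates that rule and is not the converter's implementation; the function name `lookup_plan` and the file names are hypothetical.

```python
from itertools import product

def lookup_plan(links_via, subject_of):
    """Return every (file, predicate) pair that one enhancement would consult."""
    return list(product(links_via, subject_of))

# Hypothetical file names, for illustration only.
# One enhancement listing two files and two predicates consults all four pairs.
coarse = lookup_plan(
    ["states.ttl", "countries.ttl"],
    ["dcterms:identifier", "your:foo"],
)

# Finer control: two enhancements, each with its own file and predicate.
fine = (lookup_plan(["states.ttl"], ["dcterms:identifier"])
        + lookup_plan(["countries.ttl"], ["your:foo"]))

print(len(coarse))  # 4
print(len(fine))    # 2
```

Splitting the enhancement avoids querying countries.ttl with dcterms:identifier and states.ttl with your:foo.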

Example

diagram showing two datasets lod-linked to the same entity

What datasets mention Maine (results)?

PREFIX owl:        <http://www.w3.org/2002/07/owl#>
PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
SELECT ?g ?maine
WHERE {
  GRAPH ?g {
    ?maine owl:sameAs <http://dbpedia.org/resource/Maine>
  }
  filter(?g != conversion:SameAsDataset)
} order by ?g ?maine

Relationship to owl:InverseFunctionalProperty

conversion:links_via cites the lod-link files used to perform locally scoped owl:InverseFunctionalProperty or owl:FunctionalProperty reasoning, which derives an owl:sameAs for the subject or object of a triple created during conversion.

Writing the enhancement parameters

e.g. data-gov's 1147 dataset

countyoutflow0708.csv.e1.params.ttl:

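@prefix rdfs:       <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:        <http://www.w3.org/2001/XMLSchema#> .
@prefix dcterms:    <http://purl.org/dc/terms/> .
@prefix void:       <http://rdfs.org/ns/void#> .
@prefix ov:         <http://open.vocab.org/terms/> .
@prefix conversion: <http://purl.org/twc/vocab/conversion/> .
@prefix :           <http://logd.tw.rpi.edu/source/data-gov/dataset/1147/params/enhancement/1/> .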
 
:dataset a void:Dataset;
 
   conversion:base_uri           "http://logd.tw.rpi.edu"^^xsd:anyURI;
   conversion:source_identifier  "data-gov";
   conversion:dataset_identifier "1147";
   conversion:dataset_version    "2009-Oct-08";
   conversion:conversion_process [
      a conversion:RawConversionProcess;
      conversion:enhancement_identifier "1";
      conversion:enhance [
         ov:csvCol           1;
         ov:csvHeader       "State_Code_Origin";
         conversion:property_name "state_code_origin";
         conversion:range          rdfs:Resource;
         a conversion:TypedResourcePromotionEnhancement;
         conversion:range_name    "state";
 
         a conversion:ObjectSameAsEnhancementViaLookup;
         conversion:links_via <http://www.rpi.edu/~lebot/lod-links/state-fips-dbpedia.ttl>,
                              <http://www.rpi.edu/~lebot/lod-links/state-fips-geonames.ttl>,
                              <http://www.rpi.edu/~lebot/lod-links/state-fips-govtrack.ttl>,
                              <http://logd.tw.rpi.edu/source/twc-rpi-edu/file/instance-hub-us-states-and-territories/version/2011-Apr-09/conversion/instance-hub-us-states-and-territories.csv.e1.ttl>;
         conversion:subject_of dcterms:identifier;
         # a conversion:DirectSameAsEnhancement; # This will reference any matching external URIs, too.
      ];
   ];
.

The lod-link files state-fips-dbpedia.ttl, state-fips-geonames.ttl, state-fips-govtrack.ttl:

@prefix dcterms: <http://purl.org/dc/terms/> .
 
# state-fips-dbpedia.ttl:
<http://dbpedia.org/resource/Alabama> dcterms:identifier "AL", "01", "Alabama", "ALABAMA", "alabama" .
# state-fips-geonames.ttl:
<http://sws.geonames.org/4829764/> dcterms:identifier "AL", "01", "Alabama", "ALABAMA", "alabama" .
# state-fips-govtrack.ttl:
<http://www.rdfabout.com/rdf/usgov/geo/us/AL> dcterms:identifier "01", "AL", "Alabama", "ALABAMA", "alabama" .

And input:

 @prefix ds1147: <http://logd.tw.rpi.edu/source/data-gov/dataset/1147/version/2009-Oct-08/> .
 
 ds1147:thing_1 raw:state_code_origin  "01".

becomes (e1)

@prefix e1: <http://logd.tw.rpi.edu/source/data-gov/dataset/1147/vocab/enhancement/1/> .
 
ds1147:thing_1  e1:state_code_origin <http://logd.tw.rpi.edu/source/data-gov/dataset/1147/type/state/01> .
 
<http://logd.tw.rpi.edu/source/data-gov/dataset/1147/type/state/01>
   rdfs:label "01";
   owl:sameAs <http://dbpedia.org/resource/Alabama>, 
              <http://sws.geonames.org/4829764/>, 
              <http://www.rdfabout.com/rdf/usgov/geo/us/AL>;
.
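The lookup behind this result can be sketched in Python. This is a minimal illustration under stated assumptions, not the converter's actual code: the lod-link files are stood in for by in-memory lists of triples, and the names `link_file`, `same_as`, and `LINKS_VIA` are hypothetical.

```python
# Identifiers each lod-link file lists for Alabama (copied from the files above).
ALABAMA_IDS = ("AL", "01", "Alabama", "ALABAMA", "alabama")

def link_file(external_uri):
    """Build (subject, predicate, literal) triples standing in for one lod-link file."""
    return [(external_uri, "dcterms:identifier", v) for v in ALABAMA_IDS]

# Stand-ins for the files cited by conversion:links_via.
LINKS_VIA = [
    link_file("http://dbpedia.org/resource/Alabama"),
    link_file("http://sws.geonames.org/4829764/"),
    link_file("http://www.rdfabout.com/rdf/usgov/geo/us/AL"),
]

def same_as(local_uri, cell_value, links_via, subject_of):
    """Return an owl:sameAs triple for every external subject whose
    subject_of predicate carries the cell value."""
    return [
        (local_uri, "owl:sameAs", s)
        for triples in links_via
        for (s, p, o) in triples
        if p in subject_of and o == cell_value
    ]

local = "http://logd.tw.rpi.edu/source/data-gov/dataset/1147/type/state/01"
links = same_as(local, "01", LINKS_VIA, {"dcterms:identifier"})
for triple in links:
    print(triple)
```

A cell value that no file lists (for example "99") yields no links, so the promoted resource would carry no owl:sameAs in this sketch.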

conversion:subject_of

conversion:subject_of specifies the property used to look up links. For example, if the lod-link file gives the identifiers with your:foo, specifying conversion:subject_of your:foo achieves the same result as above but matches on your:foo instead of the default dcterms:identifier.

@prefix dcterms: <http://purl.org/dc/terms/> .
 
# state-fips-dbpedia.ttl:
<http://dbpedia.org/resource/Alabama> your:foo "AL", "01", "Alabama", "ALABAMA", "alabama" .
# state-fips-geonames.ttl:
<http://sws.geonames.org/4829764/> your:foo "AL", "01", "Alabama", "ALABAMA", "alabama" .
# state-fips-govtrack.ttl:
<http://www.rdfabout.com/rdf/usgov/geo/us/AL> your:foo "01", "AL", "Alabama", "ALABAMA", "alabama" .

      conversion:enhance [
         ov:csvCol           1;
         ov:csvHeader       "State_Code_Origin";
         conversion:property_name "state_code_origin";
         conversion:range          rdfs:Resource;

         conversion:links_via <http://your.org/lod-links-file>;
         conversion:subject_of your:foo;
      ];

Example: lod-linking countries

Thanks to Maryam for this example from her enhancements to World Bank's World Development Indicators dataset:

     conversion:enhance [
        ov:csvCol          3;
        ov:csvHeader       "Country Code";
        conversion:label   "country";
        conversion:property_name "country";
        conversion:comment "";
        conversion:range_template "[/sd]typed/country/[.]";
        conversion:range   rdfs:Resource;
        conversion:range_name   "Country";

        a conversion:ObjectSameAsEnhancement;
        conversion:links_via <http://www.cs.utoronto.ca/~mfazel/lod-links/country-dbpedia.ttl>;
        conversion:subject_of dcterms:identifier;
     ];

Queries

What predicates point to something that is owl:sameAs something else (results)?

PREFIX owl:        <http://www.w3.org/2002/07/owl#>
PREFIX dcterms:    <http://purl.org/dc/terms/>
PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
PREFIX ov:         <http://open.vocab.org/terms/>
PREFIX e1:  <http://logd.tw.rpi.edu/source/epa-gov-mcmahon-ethan/dataset/environmental-reports/vocab/enhancement/1/>

SELECT distinct ?predicate
WHERE {
  GRAPH <http://logd.tw.rpi.edu/source/epa-gov-mcmahon-ethan/dataset/environmental-reports/version/2011-Jan-12>  {
    ?s ?predicate ?link .
    ?link owl:sameAs ?o .
    optional { ?o dcterms:isReferencedBy ?dataset }
    filter(!bound(?dataset))
  }
}

Quality Assurance: For a predicate that we expect to point to a lod-linked resource, which objects did not link (results)?

PREFIX owl:        <http://www.w3.org/2002/07/owl#>
PREFIX dcterms:    <http://purl.org/dc/terms/>
PREFIX conversion: <http://purl.org/twc/vocab/conversion/>
PREFIX ov:         <http://open.vocab.org/terms/>
PREFIX roe: <http://logd.tw.rpi.edu/source/epa-gov-mcmahon-ethan/dataset/environmental-reports/vocab/>
PREFIX e1:  <http://logd.tw.rpi.edu/source/epa-gov-mcmahon-ethan/dataset/environmental-reports/vocab/enhancement/1/>

SELECT distinct ?taxon ?id
WHERE {
  GRAPH <http://logd.tw.rpi.edu/source/epa-gov-mcmahon-ethan/dataset/environmental-reports/version/2011-Jan-12>  {
    [] roe:epa_web_taxonomy_term ?taxon .
    optional { ?taxon dcterms:identifier ?id }
    optional { ?taxon owl:sameAs ?o } filter(!bound(?o))
  }
}

What datasets use which lod-link files (results)?

prefix conversion: <http://purl.org/twc/vocab/conversion/>

select ?dataset ?e ?lodlinks
where {
  graph <http://purl.org/twc/vocab/conversion/ConversionProcess> {
    ?dataset conversion:conversion_process [
      conversion:enhancement_identifier ?e;
      conversion:enhance [ 
        conversion:links_via ?lodlinks
      ]
    ] 
  }
}order by desc(?e) ?dataset ?lodlinks

Which lod-link files are the most popular (results)?

prefix conversion: <http://purl.org/twc/vocab/conversion/>

select ?lodlinks (count(*) as ?count)
where {
  graph <http://purl.org/twc/vocab/conversion/ConversionProcess> {
    ?dataset conversion:conversion_process [
      conversion:enhance [ 
        conversion:links_via ?lodlinks
      ]
    ] 
  }
} group by ?lodlinks order by desc(?count)

Case Insensitivity

a conversion:CaseInsensitiveLODLink;
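This page does not spell out what the type does. Assuming it makes identifier matching ignore case (which would remove the need to list "Alabama", "ALABAMA", and "alabama" separately in a lod-link file), the difference can be sketched in Python. The function `matches` and its flag are hypothetical and only illustrate that assumption.

```python
def matches(cell_value, identifier, case_insensitive=False):
    """Compare a cell value with a lod-link identifier, optionally ignoring case."""
    if case_insensitive:
        return cell_value.casefold() == identifier.casefold()
    return cell_value == identifier

# A lod-link file that lists only one spelling of the state name.
identifiers = ["AL", "01", "Alabama"]

exact = any(matches("ALABAMA", i) for i in identifiers)
relaxed = any(matches("ALABAMA", i, case_insensitive=True) for i in identifiers)

print(exact)    # False
print(relaxed)  # True
```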

See also

See also conversion:range_template to control the URI produced for an object. This allows direct connection because the URI for the new entity will be identical to the URI of the external entity. conversion:domain_template does the same thing for naming the subject.

P. Bouquet, H. Stoermer, and D. Giacomuzzi. OKKAM: Enabling a Web of Entities. In I3: Identity, Identifiers, Identification. Proceedings of the WWW2007 Workshop on Entity-Centric Approaches to Information and Knowledge Management on the Web, Banff, Canada, May 8, 2007. CEUR Workshop Proceedings, ISSN 1613-0073, May 2007. Online: http://CEUR-WS.org/Vol-249/submission_150.pdf

DRAFT

Regarding materializing from row to external same-as resources

 :thing_1 e1:state_code_origin <http://dbpedia.org/resource/Alabama> .

On one hand, materializing the links would allow loading this RDF into a store with DBpedia data and querying across both without having to know that owl:sameAs links exist. On the other hand, a query for ds1147:thing_1 e1:state_code_origin ?origin would then return several results instead of the single one a user might expect.
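The trade-off can be made concrete with a small Python sketch that treats a dataset as a set of triples. It is only an illustration; the variable and function names are hypothetical, and the URIs are the ones from the data-gov 1147 example on this page.

```python
THING = "ds1147:thing_1"
PRED = "e1:state_code_origin"
LOCAL = "http://logd.tw.rpi.edu/source/data-gov/dataset/1147/type/state/01"
EXTERNAL = [
    "http://dbpedia.org/resource/Alabama",
    "http://sws.geonames.org/4829764/",
    "http://www.rdfabout.com/rdf/usgov/geo/us/AL",
]

# Current behavior: the row points only at the local resource;
# the owl:sameAs links are asserted separately on that resource.
linked = {(THING, PRED, LOCAL)}

# Materialized: the row also points directly at every external resource.
materialized = linked | {(THING, PRED, uri) for uri in EXTERNAL}

def origins(triples):
    """Answer the pattern: ds1147:thing_1 e1:state_code_origin ?origin."""
    return sorted(o for (s, p, o) in triples if s == THING and p == PRED)

print(len(origins(linked)))        # 1
print(len(origins(materialized)))  # 4
```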

SubjectSameAsEnhancement parameter

Although it is much more common that an object will need to link to an external entity, the subject may need to as well. For example, many tables describing the same nuclear plants should be linked. However, conversion:domain_template can be used to rename the subject to overlap directly with the name of the external entity. This more direct approach can also be done for the object using conversion:range_template.

conversion:enhance [
   a conversion:SubjectSameAsEnhancement, conversion:TypedResourcePromotionEnhancement;
   ov:csvCol           7;
   conversion:property_name "state";
   conversion:range          rdfs:Resource;
   conversion:links_via      <http://url.to/my_mappings.rdf>;
   conversion:subject_of      dcterms:identifier;
   # For TypedResourcePromotionEnhancement
   conversion:type          "state";
];

Same multi-typed as ObjectSameAsEnhancement

e.g., nuclear reactor 957?
