Project-level Linked Data

The Semantic Web underpins a lot of the technology we are exploring in Transforming Musicology, particularly in its more pragmatic Linked Data incarnation. This technology relies on authorities publishing conformant, linkable data in their knowledge domains.

As an initial contribution to this, and also as a way of road-testing some of the technology before using it in more musical applications, we have recently started producing Linked Data versions of some of the content on our project website. Many of the pages on our website can be considered resources representing some real-world entities which make up the project. For example, the URI <http://www.t-mus.org/> can be considered the name or the label that uniquely identifies Transforming Musicology in the web of Linked Data. And similarly <http://www.t-mus.org/people/richard-lewis/> identifies (although not necessarily uniquely) me.

The next step for making Linked Data is to ensure that such a URI is dereferenceable as some machine-readable encoding of information about the resource it represents and—crucially—that it links to other resources. We have achieved this by publishing RDF (Resource Description Framework) encoded information about some of the resources alongside the "human readable" HTML. RDF is a very simple data model for making statements about resources. Each statement is in the form of a triple: subject — predicate — object, where the subject is the resource itself. The object may also be a URI of a resource or it may just be a plain value. The predicate describes the relationship between the subject and the object. Some examples:

<http://www.t-mus.org/> <title> "Transforming Musicology"
<http://www.t-mus.org/> <fundedBy> <http://www.ahrc.ac.uk/>
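To make the shape of a triple a little more concrete, here is a minimal Python sketch of the two statements above. It is only an illustration, not how our site is built: the class names, the `links` helper, and the `http://example.org/vocab/` predicate URIs are all made up for this example and stand in for the shorthand `<title>` and `<fundedBy>`.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    """A resource identified by a URI."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A plain value, such as a string of text."""
    value: str


@dataclass(frozen=True)
class Triple:
    """One RDF-style statement: subject, predicate, object."""
    subject: URI
    predicate: URI
    object: Union[URI, Literal]


# Hypothetical vocabulary namespace, used only for this illustration.
VOCAB = "http://example.org/vocab/"
T_MUS = URI("http://www.t-mus.org/")

triples = [
    # The object here is a plain value.
    Triple(T_MUS, URI(VOCAB + "title"), Literal("Transforming Musicology")),
    # The object here is another resource, so this statement is a link.
    Triple(T_MUS, URI(VOCAB + "fundedBy"), URI("http://www.ahrc.ac.uk/")),
]


def links(statements):
    """Return only the statements whose object is itself a resource."""
    return [t for t in statements if isinstance(t.object, URI)]


for t in links(triples):
    print(t.subject.value, "links to", t.object.value)
```

Only the second statement comes back from `links`, because only its object is a URI that another data set could in turn describe.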

Of course, the data becomes more linked when the objects are resource URIs. RDF also requires, and this is important, that the predicate be the URI of a resource too. Consequently, when an RDF triple makes an assertion it does so using predicate semantics that are public and that may well also be used in other assertions in other data sets. When your data set asserts <http://purl.org/dc/elements/1.1/title> about a resource and my data asserts <http://purl.org/dc/elements/1.1/title> about a resource, we are both using exactly the same, publicly defined concept of a title.
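A rough sketch of why a shared predicate URI matters: if two independently published data sets use the same predicate, they can be merged and queried together with no mapping step. The second data set, its `http://example.org/` URIs, and the `titles` helper are invented for this illustration; only the Dublin Core title URI comes from the text above.

```python
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

# Our statements, as (subject, predicate, object) tuples.
ours = {
    ("http://www.t-mus.org/", DC_TITLE, "Transforming Musicology"),
}

# A hypothetical data set published by someone else entirely.
theirs = {
    ("http://example.org/projects/other", DC_TITLE, "Some Other Project"),
    ("http://example.org/projects/other", "http://example.org/vocab/colour", "blue"),
}


def titles(*datasets):
    """Merge any number of data sets and map each subject to its title."""
    merged = set().union(*datasets)
    return {s: o for s, p, o in merged if p == DC_TITLE}


for subject, title in sorted(titles(ours, theirs).items()):
    print(subject, "is titled", title)
```

Because both data sets name the predicate with the same URI, a single comparison finds every title, whoever published it.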

One of the practices in the Semantic Web, therefore, is to publish collections of related predicates together, forming a kind of controlled vocabulary (in library science terminology) known as an ontology. When publishing your data as RDF it's good to try to make use of existing predicates where possible. So in the case of the Semantic Web publication of our project information, we've made use of:

SWRC
Semantic Web for Research Communities, which provides predicates for making assertions about research projects such as project membership, funding, and time-frame.
DCAT
Data Catalog Vocabulary, which provides predicates for describing published data.
SIOC
Semantically-Interlinked Online Communities, which provides predicates for describing online interactions and social networks. We're using this to describe our blog.
FOAF
Friend of a Friend, which provides predicates for describing persons and the relationships between them. We're using this to describe our project team.
Dublin Core
Dublin Core provides basic terms for describing intellectual creations including concepts such as title, publication date, and summary.
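In practice, predicates from vocabularies like these are usually written in a compact prefix:name form and expanded to full URIs. The sketch below shows that expansion for three of the vocabularies listed, using their commonly published namespace URIs. The `expand` function and the choice of prefixes are just for illustration, and the table is deliberately incomplete.

```python
# Namespace URIs for three of the vocabularies above; others can be added.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
}


def expand(compact: str) -> str:
    """Expand a compact name such as 'foaf:name' into a full URI."""
    prefix, _, local = compact.partition(":")
    if prefix not in PREFIXES or not local:
        raise KeyError(f"cannot expand {compact!r}")
    return PREFIXES[prefix] + local


for name in ("foaf:name", "dc:title", "sioc:Post"):
    print(name, "=>", expand(name))
```

So `dc:title` is simply a shorter way of writing the full Dublin Core title URI, and an unknown prefix is rejected rather than guessed at.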

You (or more appropriately your computer) can see the RDF versions of our resources by sending an HTTP request to the URI with an Accept header of application/rdf+xml or text/turtle. To see what that looks like, have a look at: http://www.t-mus.org/index.rdf. If you have curl installed, try this:

$ curl -s -L -H "Accept: text/turtle" http://www.t-mus.org/
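If you would rather do the same thing from a program, here is a rough Python equivalent of that curl command using only the standard library. The function names `build_request` and `fetch` are our own for this sketch, and it assumes the server still answers with Turtle for this URI.

```python
import urllib.request


def build_request(uri: str, media_type: str = "text/turtle") -> urllib.request.Request:
    """Prepare a GET request that asks for a particular RDF serialisation."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


def fetch(uri: str, media_type: str = "text/turtle", timeout: float = 10.0) -> str:
    """Send the request and return the response body as text.

    urlopen follows redirects by default, much like curl's -L option.
    """
    request = build_request(uri, media_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Calling `fetch("http://www.t-mus.org/")` should then return the Turtle document as a string, and passing `"application/rdf+xml"` as the second argument asks for the RDF/XML version instead.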

As the project progresses we will be publishing more Linked Data and, hopefully, more musical Linked Data. Having a Semantic Web presence for the project itself (which this work achieves) will be useful here: we'll be able to point to Transforming Musicology as the provenance of such data sets, with T-Mus itself available as a semantic resource.
