Absence Is Not A Bug

From GetSemantic


When building RDF-based applications, be aware that absence of data is not a bug, and do not rely on the absence of data to infer things. This is not to say that it is impossible to do so - you can certainly use SPARQL to search for the absence of triples (using "FILTER (!bound(?variable))", where ?variable appears only in an OPTIONAL triple pattern in the WHERE clause).

The reason behind the 'absence is not a bug' design is the 'open world assumption' - the assumption that you do not have all the data, and that other data may exist that could later be added to your data model.

If you had this data:

:Joe :first_name "Joe".

You cannot know that Joe does not have a second name - all you know is that Joe's second name is not in this data. When you run a query, you are asking "is this data in the model?", not "does this data exist anywhere?".
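The point can be sketched in a few lines of Python. This is only an illustration, not a real RDF library: the triple store is a plain set of tuples, and the :second_name predicate and the value "Bloggs" are made up for the example.

```python
# Hypothetical in-memory model: a set of (subject, predicate, object) tuples.
triples = {(":Joe", ":first_name", "Joe")}


def objects(store, subject, predicate):
    """Return every object recorded in the store for this subject and predicate."""
    return {o for (s, p, o) in store if s == subject and p == predicate}


# The query only answers "is it in this model?". An empty result means
# "not recorded here", not "Joe has no second name".
known_before = objects(triples, ":Joe", ":second_name")

# Under the open world assumption, more data may arrive later, e.g. by
# merging in another source. The predicate and value are assumptions.
merged = triples | {(":Joe", ":second_name", "Bloggs")}
known_after = objects(merged, ":Joe", ":second_name")

print(known_before)  # empty set
print(known_after)
```

Any conclusion drawn from the first, empty result (for example "Joe has only one name") would have been invalidated by the merge.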

When designing an RDF vocabulary, you should take into account the Absence Is Not A Bug design pattern - try to design it in such a way that when people use data in that vocabulary, they do not have to look for data absence.

Example

For instance, if you have a task in XML, you may represent it like this:

<task>
<description>Book tickets to Paris.</description>
<done />
</task>

You could then use XPath to query it for uncompleted tasks with something like task[not(done)].

In RDF, this is not the way to do it. You should explicitly represent whether the task is done:

[] a :Task; :description "Book tickets to Paris."; :status :complete.
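As a rough sketch of why the explicit status helps, the Python below models a few tasks as plain tuples rather than using a real RDF library. The blank-node labels (_:t1 and so on), the :incomplete value, and the extra task descriptions are assumptions made for the illustration; only :Task, :description, :status, and :complete come from the example above.

```python
# Hypothetical in-memory model: a set of (subject, predicate, object) tuples.
tasks = {
    ("_:t1", "a", ":Task"),
    ("_:t1", ":description", "Book tickets to Paris."),
    ("_:t1", ":status", ":complete"),
    ("_:t2", "a", ":Task"),
    ("_:t2", ":description", "Renew passport."),
    ("_:t2", ":status", ":incomplete"),  # assumed counterpart to :complete
    ("_:t3", "a", ":Task"),
    ("_:t3", ":description", "Pack bags."),  # no status recorded at all
}


def tasks_with_status(store, status):
    """Positive match: tasks whose status is stated explicitly."""
    return sorted(s for (s, p, o) in store if p == ":status" and o == status)


def tasks_without_status(store):
    """Absence check: tasks for which this model holds no status triple."""
    all_tasks = {s for (s, p, o) in store if p == "a" and o == ":Task"}
    with_status = {s for (s, p, o) in store if p == ":status"}
    return sorted(all_tasks - with_status)


incomplete = tasks_with_status(tasks, ":incomplete")
unknown = tasks_without_status(tasks)

print(incomplete)
print(unknown)
```

The query for :incomplete tasks is a positive match and stays valid as more data is merged in. The task with no status is simply unknown: treating it as uncompleted would repeat the XPath-style absence check.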