Bug 787549

Summary: RFE: Use RDF triples to encode metadata
Product: [Community] PressGang CCMS
Component: Web-UI
Version: 2.0
Status: NEW
Severity: low
Priority: low
Reporter: Joshua Wulf <jwulf>
Assignee: pressgang-ccms-dev
CC: topic-tool-list
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified

Description Joshua Wulf 2012-02-06 05:29:38 UTC
The proposal is to use RDF triples to encode metadata in Skynet.

* RDF triples can be encoded in DocBook (from version 5) and in DITA. XML in a non-DocBook namespace is ignored by DocBook 5 processing tools, so data and metadata can both be encapsulated in the same XML (a sketch follows this list).

* The RDF metadata can be read from the XML and stored in an RDF triple store, which can then be used as an index for query operations.

* When the topic record is exported, the RDF triples are encoded in the XML and travel with it.

* The RDF metadata can be made available in the rendered output with a suitable processing chain (for example as HTML + RDFa).
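
As a rough sketch only (the skynet.example.com namespace, the property names, and the use of Apache Jena are illustrative assumptions, not a committed design), the triples for a topic could be built with an RDF library and serialized as RDF/XML that DocBook 5 tools would simply ignore:

// Sketch only: builds an illustrative triple for a topic and serializes it as
// RDF/XML, which could sit inside a DocBook 5 <info> element where the foreign
// namespace is ignored by DocBook processing tools.
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;

public class TopicMetadataSketch {
    public static void main(String[] args) {
        // Placeholder namespace for topic metadata.
        String ns = "http://skynet.example.com/metadata#";
        Model model = ModelFactory.createDefaultModel();
        model.setNsPrefix("skynet", ns);

        Resource topic = model.createResource("http://skynet.example.com/topic/10");
        Property technology = model.createProperty(ns, "technology");
        topic.addProperty(technology, "Application Server");

        // RDF/XML output that travels with the topic's XML.
        model.write(System.out, "RDF/XML-ABBREV");
    }
}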

Benefits include:

* Portability of data between different instances of Skynet becomes possible, because RDF allows centralized ontologies to be used: the metadata schema is no longer specific to a single Skynet instance.

* Semantic assembly of output pages based on defined rulesets (see http://www.bbc.co.uk/blogs/bbcinternet/2010/07/bbc_world_cup_2010_dynamic_sem.html for an example).

See also: https://engineering.redhat.com/rt/Ticket/Display.html?id=141002

Comment 1 Matthew Casperson 2012-02-22 22:34:13 UTC
RDF does not define what information is stored, only how it is represented. For that reason Skynet will not use RDF to store the properties and relationships internally, but it can export the data captured in the database as RDF.

A decision needs to be made as to which schema to target. Take your pick from http://en.wikipedia.org/wiki/Metadata_standards#Available_metadata_standards.
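
For example, if Dublin Core were the schema chosen, exporting a topic record might look roughly like the sketch below (the topic URI, the field values, and the use of Apache Jena are illustrative assumptions only; in practice the values would come from the Skynet database):

// Sketch only: maps an illustrative topic record onto Dublin Core properties.
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.DC;

public class DublinCoreExportSketch {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        model.setNsPrefix("dc", DC.getURI());

        Resource topic = model.createResource("http://skynet.example.com/topic/10");
        topic.addProperty(DC.title, "Installing the Application Server");
        topic.addProperty(DC.creator, "jwulf");
        topic.addProperty(DC.subject, "Application Server");

        model.write(System.out, "RDF/XML-ABBREV");
    }
}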

Comment 2 Joshua Wulf 2012-02-23 13:31:54 UTC
We might need a generic ability to generate and consume our own namespaced schemas, in addition to reusing applicable schemas that are already out there.

Comment 3 Joshua Wulf 2012-02-23 13:39:39 UTC
The consequence of that is that our metadata schemas would also become reusable, rather than only existing as a set of tags and categories in the database of an individual Skynet instance.

So information would be portable between Fedora and Red Hat instances, for example, because rather than referencing a metadata schema (tags and categories) with a local namespace (the Skynet instance database), the topics in each would reference commonly accessible metadata namespaces.

Once that issue is solved for one instance, all instances of Skynet become compatible, because the metadata schema is no longer locked up inside a single database.

Comment 4 Matthew Casperson 2012-02-25 22:43:20 UTC
A common schema would be useful, but that would have to come after a review of EAP6 and what tags worked or didn't work, with an eye to tying it into an established schema (or defining our own).

Currently the categories, tags, property tags and topic relationships map either directly or indirectly to RDF triples (see the sketch after the examples below).

Relationships (direct relationship): "Topic 10" has a "Prerequisite" whose value is "Topic 15"

Tags (indirect relationship from tag to category): "Topic 10" has a "Technology" whose value is "Application Server"

Property Tags (indirect relationship from tag to property tag): "Topic 10" has a "Bugzilla Product" whose value is "Hibernate"
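
As a sketch of how those three mappings could be expressed as triples (the predicate URIs, the skynet.example.com namespace, and the use of Apache Jena are illustrative assumptions, not the current implementation):

// Sketch only: expresses the three mappings above as RDF statements.
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;

public class ExistingFieldsAsTriplesSketch {
    public static void main(String[] args) {
        String ns = "http://skynet.example.com/metadata#";
        Model model = ModelFactory.createDefaultModel();
        model.setNsPrefix("skynet", ns);

        Resource topic10 = model.createResource("http://skynet.example.com/topic/10");
        Resource topic15 = model.createResource("http://skynet.example.com/topic/15");

        // Relationship: Topic 10 has a Prerequisite of Topic 15.
        topic10.addProperty(model.createProperty(ns, "prerequisite"), topic15);

        // Tag (via its category): Topic 10 has a Technology of "Application Server".
        topic10.addProperty(model.createProperty(ns, "technology"), "Application Server");

        // Property tag: Topic 10 has a Bugzilla Product of "Hibernate".
        topic10.addProperty(model.createProperty(ns, "bugzillaProduct"), "Hibernate");

        model.write(System.out, "TURTLE");
    }
}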

But after some thought, RDF itself is almost useless under the current restriction of having to produce static HTML. RDF is designed to be machine readable, and without a SPARQL endpoint it can only be consumed by scraping the docs website for RDFa data or by downloading a separate RDF file and querying it locally (sketched below). Neither option really adds much value.
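
To make the "download and query locally" option concrete, this is roughly what a reader would have to do (the file name, namespace, and query are illustrative assumptions, and it presumes an RDF export is published alongside the static HTML):

// Sketch only: loads a hypothetical downloaded RDF export and runs a local
// SPARQL query against it, since no SPARQL endpoint is available.
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.ResultSet;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.RDFDataMgr;

public class LocalSparqlSketch {
    public static void main(String[] args) {
        // Hypothetical RDF file downloaded alongside the docs.
        Model model = RDFDataMgr.loadModel("topics.rdf");

        // Find every topic that lists topic 15 as a prerequisite.
        String query =
            "PREFIX skynet: <http://skynet.example.com/metadata#> "
          + "SELECT ?topic WHERE { ?topic skynet:prerequisite "
          + "<http://skynet.example.com/topic/15> }";

        try (QueryExecution qe = QueryExecutionFactory.create(query, model)) {
            ResultSet results = qe.execSelect();
            while (results.hasNext()) {
                System.out.println(results.next().getResource("topic"));
            }
        }
    }
}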