The proposal is to use RDF triples to encode metadata in Skynet.

* RDF triples can be encoded in DocBook (from version 5) and DITA. XML in a non-DocBook namespace is ignored by DocBook 5 processing tools, so data and metadata can both be encapsulated in the XML.
* RDF metadata can be read from the XML and stored in an RDF triple store, which can then be used as an index for querying operations.
* When the topic record is exported, the RDF triples are encoded in the XML and travel with it.
* The RDF metadata can be made available in the rendered output with a suitable processing chain (for example as HTML + RDFa).

Benefits include:

* Portability of data between different instances of Skynet becomes possible, as RDF allows centralized ontologies to be used - the metadata schema is no longer specific to the Skynet instance.
* Semantic assembly of output pages based on defined rulesets. (See http://www.bbc.co.uk/blogs/bbcinternet/2010/07/bbc_world_cup_2010_dynamic_sem.html for an example.)

See also: https://engineering.redhat.com/rt/Ticket/Display.html?id=141002
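To illustrate the first two points of the proposal, here is a minimal sketch of reading RDF metadata out of a DocBook 5 topic. It assumes the triples are embedded as an RDF/XML block inside the topic's info element; the `http://example.org/skynet/...` URIs and the `sky:` vocabulary are hypothetical placeholders, not an existing Skynet schema, and only the simplest RDF/XML form (`rdf:Description` with property children) is handled.

```python
import xml.etree.ElementTree as ET

DOCBOOK_NS = "http://docbook.org/ns/docbook"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKY_NS = "http://example.org/skynet/terms#"  # hypothetical vocabulary namespace

# A DocBook 5 topic carrying its metadata as RDF/XML in a foreign namespace.
TOPIC_XML = f"""<section xmlns="{DOCBOOK_NS}" version="5.0">
  <info>
    <title>Configuring the datasource</title>
    <rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:sky="{SKY_NS}">
      <rdf:Description rdf:about="http://example.org/skynet/topic/10">
        <sky:prerequisite rdf:resource="http://example.org/skynet/topic/15"/>
        <sky:technology>Application Server</sky:technology>
      </rdf:Description>
    </rdf:RDF>
  </info>
  <para>Topic body.</para>
</section>"""


def extract_triples(xml_text):
    """Return (subject, predicate, object) tuples found in embedded RDF/XML."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree reports tags as "{namespace}local"; join them into a URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            # The object is either a resource reference or a literal value.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


if __name__ == "__main__":
    for triple in extract_triples(TOPIC_XML):
        print(triple)
```

The extracted tuples could then be loaded into whatever triple store is chosen as the query index; that step is left out here because no store has been selected.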
RDF does not define what information is stored, just how it is presented. For that reason Skynet will not use RDF to store the properties and relationships, but it can output the data captured in the database as RDF. A decision needs to be made as to which schema to target. Take your pick from http://en.wikipedia.org/wiki/Metadata_standards#Available_metadata_standards.
We might need a generic ability to generate and consume our own namespaced schemas, in addition to reusing applicable schemas that are already out there.
As a consequence, our metadata schemas would also become reusable, rather than existing only as a set of tags and categories in the database of an individual Skynet instance. Information would be portable between Fedora and Red Hat instances, for example: rather than referencing a metadata schema (tags and categories) with a local namespace (the Skynet instance database), the topics in each would reference commonly accessible metadata namespaces. Once that issue is solved for one instance, all instances of Skynet become compatible, because the metadata schema is no longer locked up inside a database.
A common schema would be useful, but that would have to come after a review of EAP6 and which tags worked or didn't work, with an eye to tying it into an established schema (or defining our own).

Currently the categories, tags, property tags and topic relationships map either directly or indirectly to RDF triples:

* Relationships (direct relationship): "Topic 10" has a "Prerequisite" whose value is "Topic 15"
* Tags (indirect relationship from tag to category): "Topic 10" has a "Technology" whose value is "Application Server"
* Property Tags (indirect relationship from tag to property tag): "Topic 10" has a "Bugzilla Product" whose value is "Hibernate"

After some thought, though, RDF itself is almost useless given the current restriction of having to produce static HTML. RDF is designed to be machine readable, and without a SPARQL endpoint it can only be read by scraping the docs website for RDFa data or by downloading a separate RDF file and querying it locally. Neither option really adds any value.
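The mapping from relationships, tags and property tags to triples described above can be sketched as follows. This is only an illustration using the three examples from this comment; the `Triple` type and the `match` helper are hypothetical names, and the pattern match is a plain in-memory filter, not SPARQL.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The three kinds of Skynet metadata, each expressed as one triple.
TRIPLES = [
    Triple("Topic 10", "Prerequisite", "Topic 15"),            # relationship
    Triple("Topic 10", "Technology", "Application Server"),    # tag -> category
    Triple("Topic 10", "Bugzilla Product", "Hibernate"),       # property tag
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every field that is not None."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


if __name__ == "__main__":
    # Which technology is Topic 10 tagged with?
    for t in match(TRIPLES, subject="Topic 10", predicate="Technology"):
        print(t.obj)
```

A lookup like this only works once the data has been downloaded and loaded locally, which is the limitation noted above for a static HTML site without a SPARQL endpoint.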