Bug 738833 - Use xmllint on topic import.
Summary: Use xmllint on topic import.
Alias: None
Product: Topic Tool
Classification: Other
Component: cli-Topic_Tool
Version: 0.0.x
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Stephen Gordon
QA Contact:
Depends On:
Reported: 2011-09-15 23:28 UTC by Stephen Gordon
Modified: 2011-09-29 04:27 UTC

Fixed In Version: topic-tool-0.0.8-0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2011-09-29 04:24:13 UTC


Description Stephen Gordon 2011-09-15 23:28:21 UTC
Description of problem:

It is possible to validate an XML fragment against a DTD referenced by URL, like so:

xmllint --dtdvalid http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd Removing_Red_Hat_Enterprise_Virtualization_Manager.xml

On import we should be able to use this to validate topics. To facilitate this:

- Add xmllint to package Requires.
- Add logic to the tool to:
   * Check topic.conf for the value VALIDATION_DTD and use it if set.
   * Otherwise set VALIDATION_DTD=TOPIC_REPO/validation.dtd, which lets the owner of the repository choose the DTD. In our case this would be DocBook 4.5.
   * On import, run xmllint --dtdvalid VALIDATION_DTD TOPIC_FILE and collect the output. If a diff of the output and TOPIC_FILE shows no differences then the XML is valid (yay!); otherwise print the output and abandon the import.

- Add a flag to override the DTD check, and also a config value for additional xmllint parameters. I think this will be needed as an escape hatch for the initial implementation, in case we need to pass less strict options to xmllint.
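
The steps proposed above could be sketched roughly as follows. This is an illustration under assumptions, not the tool's actual code: topic.conf is assumed to be already parsed into a dict, XMLLINT_ARGS is a hypothetical name for the "additional xmllint parameters" config value, and skip_validation stands in for the proposed override flag. The sketch also swaps the diff-based check for xmllint's exit status (with --noout), which is non-zero when validation fails.

```python
import subprocess


def resolve_dtd(conf, topic_repo):
    """Return the DTD to validate against.

    Uses VALIDATION_DTD from topic.conf when set; otherwise falls back to
    TOPIC_REPO/validation.dtd so the repository owner chooses the DTD.
    """
    dtd = conf.get("VALIDATION_DTD")
    if dtd:
        return dtd
    return topic_repo.rstrip("/") + "/validation.dtd"


def build_xmllint_cmd(dtd, topic_file, extra_args=()):
    """Build the xmllint command line; extra_args is the escape hatch."""
    # --noout suppresses the echoed document, so success is read from the
    # exit status rather than from a diff of the output.
    return ["xmllint", "--noout", "--dtdvalid", dtd, *extra_args, topic_file]


def validate_topic(conf, topic_repo, topic_file, skip_validation=False):
    """Return (ok, messages); ok is False when the import should be abandoned."""
    if skip_validation:  # hypothetical flag overriding the DTD check
        return True, ""
    extra = conf.get("XMLLINT_ARGS", "").split()  # hypothetical config key
    cmd = build_xmllint_cmd(resolve_dtd(conf, topic_repo), topic_file, extra)
    result = subprocess.run(cmd, capture_output=True, text=True)
    # xmllint writes parser and validity errors to stderr.
    return result.returncode == 0, result.stderr
```

A caller would print the returned messages and stop the import whenever ok is False.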

Comment 1 Stephen Gordon 2011-09-16 01:26:25 UTC
$ topic import test.xml References
ERROR: test.xml:3: parser error : Opening and ending tag mismatch: stitle line 3 and title
ERROR:    <stitle>test</title>
ERROR:                        ^
ERROR: xmllint returned validation errors.

Comment 2 Stephen Gordon 2011-09-16 03:55:55 UTC
$ topic import test.xml References
ERROR: test.xml:2: element section: validity error : Element section content does not follow the DTD, expecting (sectioninfo? , (title , subtitle? , titleabbrev?) , (toc | lot | index | glossary | bibliography)* , (((calloutlist | glosslist | bibliolist | itemizedlist | orderedlist | segmentedlist | simplelist | variablelist | caution | important | note | tip | warning | literallayout | programlisting | programlistingco | screen | screenco | screenshot | synopsis | cmdsynopsis | funcsynopsis | classsynopsis | fieldsynopsis | constructorsynopsis | destructorsynopsis | methodsynopsis | formalpara | para | simpara | address | blockquote | graphic | graphicco | mediaobject | mediaobjectco | informalequation | informalexample | informalfigure | informaltable | equation | example | figure | table | msgset | procedure | sidebar | qandaset | task | anchor | bridgehead | remark | highlights | abstract | authorblurb | epigraph | indexterm | beginpage)+ , (refentry* | section* | simplesect*)) | refentry+ | section+ | simplesect+) , (toc | lot | index | glossary | bibliography)*), got (title fail table )
ERROR: test.xml:4: element fail: validity error : No declaration for element fail
ERROR: Document test.xml does not validate against http://topicrepo.englab.bne.redhat.com/TopicRepository/dtd/validate.dtd
ERROR: xmllint returned validation errors.

Comment 3 Stephen Gordon 2011-09-16 04:03:28 UTC
Final output is similar to the above but, by default, validates against the proper DocBook 4.5 DTD. By doing this, and requiring the docbook-dtds package, we can use the built-in catalog so that validation runs against the local copy of the DTD.

This is much faster than the other approach I had been trialling, which was setting the DTD on the repository side. The topic.conf value to override the DTD is TOPIC_DTD.
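
A minimal sketch of how the default and the TOPIC_DTD override might combine, assuming topic.conf has been parsed into a dict and that the system XML catalog installed by docbook-dtds maps the DocBook 4.5 URL to the local copy. The function name is hypothetical, and adding --nonet in the default case is an assumption of this sketch rather than something the commit is known to do.

```python
# System identifier of the DocBook 4.5 DTD, as used earlier in this report.
DOCBOOK_45_DTD = "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"


def import_validation_cmd(conf, topic_file):
    """Build the xmllint command for validating a topic on import."""
    dtd = conf.get("TOPIC_DTD")
    cmd = ["xmllint", "--noout"]
    if not dtd:
        # Default case: rely on the XML catalog to resolve the URL locally.
        # --nonet makes xmllint refuse network fetches, so a missing
        # catalog entry fails visibly instead of downloading the DTD.
        dtd = DOCBOOK_45_DTD
        cmd.append("--nonet")
    return cmd + ["--dtdvalid", dtd, topic_file]
```

With an empty configuration this validates against the local DocBook 4.5 DTD; setting TOPIC_DTD replaces the DTD and leaves network access enabled so that a remote DTD can still be fetched.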

Comment 4 Stephen Gordon 2011-09-16 04:09:01 UTC
Committed revision 70461.
