Bug 790316

Summary: xmllint error output complicates problem solving (Was: Invalid sequence in interleave)
Product: Red Hat Enterprise Linux 6
Reporter: Dag Wieers <dag>
Component: libxml2
Assignee: Daniel Veillard <veillard>
Status: CLOSED WONTFIX
QA Contact: qe-baseos-tools-bugs
Severity: unspecified
Priority: unspecified
Version: 6.2
CC: dag, mnowak
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
URL: http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-schema.rng
Doc Type: Bug Fix
Last Closed: 2012-03-05 15:01:09 UTC
Attachments:
Example document showing the error (flags: none)

Description Dag Wieers 2012-02-14 08:56:52 UTC
Created attachment 561820
Example document showing the error

Description of problem:
We are experiencing two more RelaxNG-related issues with xmllint when validating OpenDocument files. I have debugged them and can only conclude that both seem to be caused by xmllint, and not by the document.

If you validate the attached README.fodt against the OpenDocument v1.2 schema with xmllint, you'll see this:

----
[dag@moria ~]$ xmllint --noout --relaxng relaxng/OpenDocument-v1.2-cs01-schema.rng README.fodt
README.fodt:863: element text-properties: Relax-NG validity error : Expecting element map, got text-properties
README.fodt:863: element text-properties: Relax-NG validity error : Element style has extra content: text-properties
Relax-NG validity error : Extra element style in interleave
README.fodt:336: element automatic-styles: Relax-NG validity error : Invalid sequence in interleave
README.fodt:336: element automatic-styles: Relax-NG validity error : Element automatic-styles failed to validate content
README.fodt fails to validate
----

Since the 'style:map' element is inside a <zeroOrMore> construct, it should not be expected in this block.

The second error I do not understand at all. The line number points to the start of the <office:automatic-styles> block, but there is also a 'Relax-NG validity error : Extra element style in interleave' error that does not show any line number.

Any help on this is appreciated. I looked at the xmllint source-code and have to admit that I do not understand anything :-)

The OpenDocument RNG schema is available from: http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-schema.rng

Comment 1 Dag Wieers 2012-02-14 09:31:05 UTC
We found one cause. Apparently xmllint needs RelaxNG <optional> elements to appear in the order listed in the schema. So if the schema lists 3 optional elements, graphic-properties, paragraph-properties and text-properties, xmllint expects any of these elements that are present to appear in that order. If you put them in reverse order, it will complain about some element it expects (which it should not expect at all).

So in this case it's not actually a zeroOrMore bug, but a superficial ordering requirement of xmllint for optional elements.

The quest continues :-/

Comment 2 Dag Wieers 2012-02-14 09:47:14 UTC
The quest has ended. It is all related to the ordering of the optional elements.

I don't think that RelaxNG requires the correct ordering of optional elements inside a block.

Comment 3 Dag Wieers 2012-02-14 12:36:54 UTC
After even more investigation, it seems that RelaxNG requires optional child elements to be in order, unless they are 'interleaved'.

So it means that:

    <group>
        <attribute name="style:family">
            <value>paragraph</value>
        </attribute>
        <optional>
            <ref name="style-paragraph-properties"/>
        </optional>
        <optional>
            <ref name="style-text-properties"/>
        </optional>
    </group>

Requires first style:paragraph-properties and then style:text-properties, while:

    <group>
        <attribute name="style:family">
            <value>paragraph</value>
        </attribute>
        <interleave>
            <optional>
                <ref name="style-paragraph-properties"/>
            </optional>
            <optional>
                <ref name="style-text-properties"/>
            </optional>
        </interleave>
    </group>

Allows them to appear in a different order. So I don't think this in itself is a bug. However, the error message is not very clear about what exactly the problem is, the line numbers do not help in finding the exact cause (they point to the parent element, not to the child element in question), and this one problem triggers lots of unrelated errors.
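
To make the difference concrete, here is a minimal Python sketch of the two content models. It is only a simplified model of a <group> or <interleave> that contains nothing but <optional> elements with distinct names; it is not how libxml2 validates, and the function names are made up for this illustration.

```python
from typing import Sequence


def matches_group(optional_names: Sequence[str], children: Sequence[str]) -> bool:
    """Model of a <group> of <optional> elements.

    Each child may appear at most once, and the children that are
    present must follow the order in which the schema lists them.
    Assumes the names in optional_names are distinct.
    """
    position = 0
    for child in children:
        try:
            # Only look at schema entries that come after the last match.
            found = list(optional_names).index(child, position)
        except ValueError:
            return False
        position = found + 1
    return True


def matches_interleave(optional_names: Sequence[str], children: Sequence[str]) -> bool:
    """Model of an <interleave> of <optional> elements.

    Each child may appear at most once, in any order.
    """
    no_duplicates = len(set(children)) == len(children)
    all_known = set(children) <= set(optional_names)
    return no_duplicates and all_known


names = ["paragraph-properties", "text-properties"]
in_order = ["paragraph-properties", "text-properties"]
reversed_order = ["text-properties", "paragraph-properties"]

print(matches_group(names, in_order))             # True
print(matches_group(names, reversed_order))       # False
print(matches_interleave(names, reversed_order))  # True
```

Leaving out an optional element is fine in both models; only the relative order of the elements that are present differs.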

Maybe there is room for improvement in the xmllint error output to help troubleshoot and solve validation errors. (I've lost countless days debugging those errors :-/)

Comment 5 Dag Wieers 2012-02-15 06:36:43 UTC
Some more info in our asciidoc-odf issue-tracker: https://github.com/dagwieers/asciidoc-odf/issues/27#issuecomment-3975720

Comment 6 Daniel Veillard 2012-02-17 07:52:51 UTC
  Dag,

 yes, if you are in a <group>, items are supposed to be sequenced,
i.e. the order of appearance in the group has to match the order
of appearance in the XML instance. And the fact that some items
may be optional doesn't change anything: if the item is present,
it must be there at the right place. There is however an exception
to this for the items of the group defining attributes, as
attributes are not ordered in the XML document model (though a parser
will try to preserve the order of appearance, a validator cannot expect
this from the underlying parser).
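
A small Python sketch of that distinction, using only the standard library. The two documents are hypothetical and leave out the ODF namespaces for brevity; they differ both in attribute order and in child element order.

```python
import xml.etree.ElementTree as ET

# Same attributes and same children, but both listed in a different order.
first = ET.fromstring(
    '<style name="P1" family="paragraph">'
    '<paragraph-properties/><text-properties/>'
    '</style>'
)
second = ET.fromstring(
    '<style family="paragraph" name="P1">'
    '<text-properties/><paragraph-properties/>'
    '</style>'
)

# Attributes are exposed as a mapping, so their order does not matter
# when the two elements are compared.
attributes_equal = first.attrib == second.attrib

# Child elements form a sequence, so their order is significant.
children_equal = [child.tag for child in first] == [child.tag for child in second]

print(attributes_equal)  # True
print(children_equal)    # False
```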

Now I agree that the error reporting of libxml2's Relax-NG implementation
is far from ideal. It comes in part from the method of implementation
(I did not use derivation as suggested by James Clark but another
algorithm with different trade-offs), and from the fact that RNG is so
flexible that it's hard to raise the error at the right place.
(For example, one can write an RNG which will validate a document if there
 is an id attribute anywhere in the document; good luck reporting
 a good message in case of failure...)

The error reporting is a limitation, but not strictly a bug. I agree
that it's seriously inconvenient! There is one well-known bug in the
libxml2 RNG implementation, with the nesting of interleaves, but
not that many people hit it and it will be excruciatingly hard to fix
with the current implementation.

  That said, is there a bug here apart from the limitation of the
error reporting?

Daniel

Comment 7 Dag Wieers 2012-02-17 08:47:28 UTC
There is a bug in the error output, in that if the sequence of the <group> is not followed, and the first item is optional, it will complain that it is expecting the first item (which is optional).

In the very first example I reported, xmllint is expecting <map> according to the error, but <map> in ODF means a conditional style, which made no sense in what we wanted to do. So it should have said:

Error: Expecting element map or element paragraph-properties, got text-properties

On the other hand, if you don't want to make it more complex (or if it is impossible because of design limitations), it's probably better to be less specific but correct:

Error: Expecting other element (in sequence for style), got text-properties

It would be nice if the error indicated somehow that there is a strict sequence required for this element, rather than expecting the user to understand "interleave".
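
As a rough illustration of the kind of message meant here, the following Python sketch reports an out-of-order child by naming the required sequence. It is a hypothetical helper, not libxml2 code: the function name and the message wording are invented for this example, and it only models a parent whose content is a sequence of distinct optional elements.

```python
from typing import Optional, Sequence


def explain_sequence_error(
    parent: str, sequence: Sequence[str], children: Sequence[str]
) -> Optional[str]:
    """Return None if children respect the sequence, else a readable message.

    sequence lists the optional child elements in schema order; each may
    appear at most once. Names in sequence are assumed to be distinct.
    """
    names = list(sequence)
    position = 0
    for child in children:
        if child in names[position:]:
            # Child is still allowed: move past it in the schema order.
            position = names.index(child, position) + 1
            continue
        if child in names:
            # Known element, but it came too late (or was repeated).
            return (
                f"Element {child} is out of order in {parent}: "
                f"the schema requires the sequence {', '.join(names)}"
            )
        return f"Element {child} is not allowed in {parent}"
    return None


style_children = ["graphic-properties", "paragraph-properties", "text-properties"]
message = explain_sequence_error(
    "style", style_children, ["text-properties", "paragraph-properties"]
)
print(message)
```

The point of the sketch is that the message names the element that is actually misplaced and states that a sequence is required, instead of naming an unrelated element the validator happened to try next.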

Feel free to do with the feedback as you like and close this bug :-) Hopefully this bug-report will end up high in Google so that it may help the next person with similar problems.

Comment 8 Daniel Veillard 2012-03-05 15:01:09 UTC
Okay, let's be honest, I don't see how I could fix that in the RHEL-6
line. This would probably require major surgery to the error handling code
in the Relax-NG engine of libxml2, and that's not something I would have
time for, and the risk is fairly large too.

  So for RHEL-6 which is what this bug is about, I think WONTFIX is the
right keyword, and if by miracle we had a simple patch for it we can
still re-open this.

    sorry about that, and thanks for the feedback,

Daniel