Bug 111468

Summary: libgsf's gsf_xml_parser_context() broken
Product: [Fedora] Fedora
Reporter: W. Michael Petullo <redhat>
Component: libgsf
Assignee: Caolan McNamara <caolanm>
Status: CLOSED UPSTREAM
Severity: medium
Priority: medium    
Version: 1   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2004-05-03 21:50:23 UTC Type: ---

Description W. Michael Petullo 2003-12-04 00:14:11 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 Galeon/1.2.7 (X11; Linux ppc; U;) Gecko/20030130

Description of problem:
Libgsf's gsf_xml_parser_context() function appears to be broken when
used to open an uncompressed XML file.  I am working on an application
that does the following:

1.  initializes the library
2.  creates a new gsf input entity using foo=gsf_input_gnomevfs_new()
3.  calls gsf_xml_parser_context(foo)
4.  does some stuff

The problem is that between steps 3 and 4, the cur_offset of the gsf
input entity foo is set to 10.  This happens even though the
application has not read any data directly.

Since cur_offset is used to calculate the amount of unread data, the
result of this is that gsf_input_remaining returns an incorrect value
and reading gets mixed up.

gsf-libxml.c:gsf_xml_parser_context looks like this:

gsf_xml_parser_context (GsfInput *input)
{
        GsfInputGZip *gzip;

        g_return_val_if_fail (GSF_IS_INPUT (input), NULL);

        gzip = gsf_input_gzip_new (input, NULL);
        if (gzip != NULL)
                input = GSF_INPUT (gzip);
        else {
                gsf_input_seek(input, 0, G_SEEK_SET);
                g_object_ref (G_OBJECT (input));
        }

        return xmlCreateIOParserCtxt (
                NULL, NULL,
                (xmlInputReadCallback) gsf_libxml_read,
                (xmlInputCloseCallback) gsf_libxml_close,
                input, XML_CHAR_ENCODING_NONE);
}

It appears that "gzip = gsf_input_gzip_new (input, NULL);" is what
causes the wrong cur_offset value.  This makes sense, because
gsf_input_gzip_new would have to read a few bytes from input in order
to determine whether the file is valid gzipped data.
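
The effect can be modelled without libgsf.  The following is a minimal,
self-contained sketch and not libgsf code: MemInput, mem_read,
mem_remaining and gzip_probe are hypothetical names invented for
illustration.  The assumption that the probe consumes the 10-byte fixed
gzip header (RFC 1952) is an inference from the observed cur_offset of
10, not something confirmed from the libgsf source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a seekable input; not the real GsfInput. */
typedef struct {
    const unsigned char *data;
    size_t size;
    size_t cur_offset;
} MemInput;

/* Return a pointer to the next n bytes and advance the cursor,
 * or return NULL (cursor unchanged) if fewer than n bytes remain. */
static const unsigned char *mem_read(MemInput *in, size_t n)
{
    assert(in != NULL);
    if (n > in->size - in->cur_offset)
        return NULL;
    const unsigned char *p = in->data + in->cur_offset;
    in->cur_offset += n;
    return p;
}

/* Unread bytes, computed from the cursor like the report describes. */
static size_t mem_remaining(const MemInput *in)
{
    assert(in != NULL);
    return in->size - in->cur_offset;
}

/* Assumption: the probe reads the whole fixed gzip header. */
#define GZIP_HEADER_SIZE 10

/* Return 1 if the input starts with the gzip magic bytes, else 0.
 * Models the suspected bug: the cursor is NOT rewound on failure. */
static int gzip_probe(MemInput *in)
{
    const unsigned char *hdr = mem_read(in, GZIP_HEADER_SIZE);
    return hdr != NULL && hdr[0] == 0x1f && hdr[1] == 0x8b;
}
```

For any non-gzip input of at least 10 bytes, gzip_probe returns 0 but
leaves cur_offset at 10, so mem_remaining comes up 10 bytes short for a
caller that believes nothing has been read yet.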

So...

gsf_xml_parser_context (GsfInput *input)
{
        GsfInputGZip *gzip;

        g_return_val_if_fail (GSF_IS_INPUT (input), NULL);

        return xmlCreateIOParserCtxt (
                NULL, NULL,
                (xmlInputReadCallback) gsf_libxml_read,
                (xmlInputCloseCallback) gsf_libxml_close,
                input, XML_CHAR_ENCODING_NONE);
}

...works fine (but only with uncompressed files of course).

Either gsf_input_seek() should be used if gsf_input_gzip_new() fails
or gsf_input_gzip_new() should reset the file cursor before failing.
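
The second option (the probe restoring the cursor itself before
failing) might look roughly like the sketch below.  This is a
self-contained model under stated assumptions, not a patch against
libgsf: MemInput and gzip_probe_rewinding are hypothetical names, and
the 10-byte header size is assumed from the gzip format (RFC 1952).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a seekable input; not the real GsfInput. */
typedef struct {
    const unsigned char *data;
    size_t size;
    size_t cur_offset;
} MemInput;

/* Assumption: the probe inspects the whole fixed gzip header. */
#define GZIP_HEADER_SIZE 10

/* Return 1 and leave the cursor just past the header if the input
 * looks like gzip; otherwise return 0 with the cursor restored to
 * where it was on entry. */
static int gzip_probe_rewinding(MemInput *in)
{
    assert(in != NULL);
    size_t saved = in->cur_offset;

    if (in->size - in->cur_offset >= GZIP_HEADER_SIZE) {
        const unsigned char *hdr = in->data + in->cur_offset;
        in->cur_offset += GZIP_HEADER_SIZE;
        if (hdr[0] == 0x1f && hdr[1] == 0x8b)
            return 1;
    }

    /* Not gzip (or too short): undo any cursor movement. */
    in->cur_offset = saved;
    return 0;
}
```

Restoring the saved offset, rather than seeking to 0, is a design
choice in this sketch: it keeps the caller's position intact even if
the input was not at its start when the probe was attempted.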

Version-Release number of selected component (if applicable):
1.8.2

How reproducible:
Always

Steps to Reproduce:
1.  Load an application that uses gsf_xml_parser_context(foo) into
your favorite debugger.

2.  Notice that foo->cur_offset is set to 10 (decimal) after
gsf_xml_parser_context is executed on an uncompressed file.

Additional info:

Comment 1 Caolan McNamara 2004-05-03 21:50:23 UTC
This is being handled upstream at gnome.org; see
http://bugzilla.gnome.org/show_bug.cgi?id=141765.