Bug 1207685

Summary: repomd.xml parser error: Parse error at line: 1 (not well-formed (invalid token)), disabling
Product: [Fedora] Fedora Reporter: Michael Mráka <mmraka>
Component: librepoAssignee: Tomas Mlcoch <tmlcoch>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhideCC: tmlcoch
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: librepo-1.7.16-1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-06-12 09:12:16 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Michael Mráka 2015-03-31 13:28:20 UTC
Description of problem:
Using librepo to download repo data from cookie based authenticated web server fails with repomd.xml parser error: Parse error at line: 1 (not well-formed (invalid token)), disabling.
Authentication is set using LRO_HTTPHEADER.

Version-Release number of selected component (if applicable):
librepo-1.7.14-1.fc23.x86_64

How reproducible:
always


Steps to Reproduce:
No simple reproducer, it's a rather complex dnf plugin accessing repositories on Spacewalk / Red Hat Satellite Server.

Plugin sets custom http headers via librepo.setopt(librepo.LRO_HTTPHEADER,...).
On server side I can see repomd.xml was sent ok.
  access_log: 10.34.43.38 - - [31/Mar/2015:14:31:51 +0200] "GET /XMLRPC/GET-REQ/rhel-x86_64-server-6/repodata/repomd.xml HTTP/1.1" 200 1545 "-" "dnf/0.6.4"
But dnf fails with
  Failed to synchronize cache for repo 'rhel-x86_64-server-6' from 'http://sputnik.brq.redhat.com/XMLRPC/GET-REQ/rhel-x86_64-server-6': repomd.xml parser error: Parse error at line: 1 (not well-formed (invalid token)), disabling.

Running strace on dnf process show that (most likely) librepo saves repomd.xml including http headers:
open("/tmp/dnf-ZNuLf6/repodata/repomd.xml", O_RDWR|O_CREAT|O_TRUNC, 0666) = 6
dup(6)                                  = 7
fcntl(7, F_GETFL)                       = 0x8002 (flags O_RDWR|O_LARGEFILE)
fsetxattr(7, "user.Librepo.DownloadInProgress", "\x00", 1, 0) = -1 EOPNOTSUPP (Operation not supported)
...
sendto(9, "GET /XMLRPC/GET-REQ/rhel-x86_64-server-6/repodata/repomd.xml HTTP/1.1\r\nUser-Agent: dnf/0.6.4..."320, MSG_NOSIGNAL, NULL, 0) = 320
...
recvfrom(9, "HTTP/1.1 200 OK\r\nDate: Tue, 31 Mar 2015 12:44:04 GMT\r\nServer: Apache\r\nContent-Length: 1545\r\nX-RHN-Proxy-Version: 0\r\nX-RHN-Client-Version: 0\r\nLast-Modified: Tue, 31 Mar 2015 02:40:07 GMT\r\nConnection: close\r\nX-Package-FileName: repomd.xml\r\nContent-Type: text/xml\r\n\r\n<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<repomd xmlns=...", 16384, 0, NULL, NULL) = 1809
...
write(7, "HTTP/1.1 200 OK\r\nDate: Tue, 31 Mar 2015 12:44:04 GMT\r\nServer: Apache\r\nContent-Length: 1545\r\nX-RHN-Proxy-Version: 0\r\nX-RHN-Client-Version: 0\r\nLast-Modified: Tue, 31 Mar 2015 02:40:07 GMT\r\nConnection: close\r\nX-Package-FileName: repomd.xml\r\nContent-Type: text/xml\r\n\r\n<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<repomd xmlns=...", 1809) = 1809
...
read(6, "HTTP/1.1 200 OK\r\nDate: Tue, 31 Mar 2015 12:44:04 GMT\r\nServer: Apache\r\nContent-Length: 1545\r\nX-RHN-Proxy-Version: 0\r\nX-RHN-Client-Version: 0\r\nLast-Modified: Tue, 31 Mar 2015 02:40:07 GMT\r\nConnection: close\r\nX-Package-FileName: repomd.xml\r\nContent-Type: text/xml\r\n\r\n<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<repomd xmlns=...", 8192) = 1809


See the difference between repomd.xml size reported in access_log and Content-Length: header (1545) and written/read from temp file (1806).