Bug 802839 - rpm2cpio fails intermittently when file is piped to stdin
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: rpm
Version: 6.2
Hardware: All Linux
Priority: medium Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Panu Matilainen
QA Contact: Patrik Kis
Keywords: Regression
Depends On:
Blocks: 840699
Reported: 2012-03-13 11:37 EDT by David Rennalls
Modified: 2013-02-21 05:51 EST
Doc Type: Bug Fix
Last Closed: 2013-02-21 05:51:16 EST
Description David Rennalls 2012-03-13 11:37:58 EDT
Description of problem:
Piping a large .rpm from a local disk into rpm2cpio sometimes fails, for example:
[/tmp]$ cat large.rpm | rpm2cpio 1>/dev/null
error: rpm2cpio: headerRead failed: hdr blob(154603): BAD, read returned 98008
error reading header from package

strace confirms that read() returns fewer bytes than requested. I ended up looking at the code, since this used to work fine with rpm 4.4.x (in RHEL 5.x). It seems that some code which handled all the possible outcomes of a read() was removed in 4.8.x. The case I'm hitting is straight from 'man 2 read':

...It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or...etc...
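
The short-read behaviour quoted above is easy to observe on a pipe. A minimal sketch in Python, whose os.read() is a thin wrapper around read(2); the variable names are only illustrative:

```python
import os

# Create an anonymous pipe and put only 10 bytes into it.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"0123456789")

# Ask for 100 bytes. The call returns what is available right now
# instead of blocking until the full count has arrived.
chunk = os.read(read_fd, 100)
print(len(chunk))  # prints 10

os.close(read_fd)
os.close(write_fd)
```

A caller that treats "fewer bytes than requested" as an error would reject this perfectly valid read.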

So this appears to be a regression unless I'm missing something.
In rpm 4.4.x (used on RHEL 5.x), ufdio_s->read=ufdRead, and ufdRead loops until it has all the requested data, i.e. it handles short reads correctly:
see http://rpm.org/gitweb?p=rpm.git;a=blob;f=rpmio/rpmio.c;h=dcb68c676a4f90a355d28b6f700a3322c5c62f3b;hb=refs/heads/rpm-4.4.x#l2346

In rpm 4.8.x (RHEL 6), ufdio_s->read=fdRead, which does a single 'plain' read():
see http://rpm.org/gitweb?p=rpm.git;a=blob;f=rpmio/rpmio.c;h=6473d557766b1941fc9cc1ceaec7a1e9b90d572e;hb=refs/heads/rpm-4.8.x#l760

That was introduced with this change...
Eliminate ufdio-specific read, write, seek and close
- we dont do network IO anymore so ufdio only differs from fdio by
  downloading the file on open if necessary, after that it's just fdio
http://rpm.org/gitweb?p=rpm.git;a=commit;h=2dc82d4e3e9c2959f4f731895993645761905073

So I guess it's an oversight?
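
For illustration, the kind of loop that a single plain read() lacks can be sketched as follows. This is not rpm's code: read_fully and demo are hypothetical names, and Python is used only because its os.read() maps directly onto read(2). Python already retries interrupted calls (EINTR) by itself, which a C version would have to handle explicitly.

```python
import os
import threading


def read_fully(fd, count):
    """Read until `count` bytes have arrived or EOF is reached."""
    parts = []
    remaining = count
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than asked
        if not chunk:                   # empty result means EOF
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)


def demo(total=300_000, piece=4096):
    """Feed `total` bytes through a pipe in small pieces and read them back."""
    read_fd, write_fd = os.pipe()
    payload = (bytes(range(256)) * (total // 256 + 1))[:total]

    def writer():
        # The writer delivers the data piecemeal, as `cat` does on a pipe.
        for offset in range(0, total, piece):
            os.write(write_fd, payload[offset:offset + piece])
        os.close(write_fd)

    thread = threading.Thread(target=writer)
    thread.start()
    data = read_fully(read_fd, total)
    thread.join()
    os.close(read_fd)
    return data == payload
```

demo() returns True because read_fully keeps asking for the remainder, whereas a single os.read(read_fd, total) on the same pipe could return only part of the payload.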

Version-Release number of selected component (if applicable):
[~]$ rpm -q rpm
rpm-4.8.0-19.el6.i686

How reproducible:
Happens most of the time for me if the rpm is fairly large (>150M) and on a local disk.


Steps to Reproduce:
1. Run 'cat large.rpm | rpm2cpio 1>/dev/null' for a large rpm on a local disk
  
Actual results:
error: rpm2cpio: headerRead failed: hdr blob(154603): BAD, read returned 98008
error reading header from package

Expected results:
no error

Additional info:
Comment 3 Charlie Brady 2012-04-25 09:25:49 EDT
Why does this bug have the Needinfo flag set? What additional information do you need? Clearly rpm 4.8.x misuses read().
Comment 4 Miroslav Vadkerti 2012-04-25 09:37:11 EDT
It's a needinfo on the developer (you are probably not authorized to see the recipient, or I forgot to add the developer's mail address).
Comment 5 Miroslav Vadkerti 2012-04-25 09:37:37 EDT
Needinfo reset back on Panu to get his attention on this bug.
Comment 6 Panu Matilainen 2012-04-26 04:54:44 EDT
I suppose "oversight" is a fitting description, rpm2cpio being the only thing in rpm that accepts input from a pipe. I don't think I was even aware of it supporting reading from stdin prior to this bug...

While this technically is a regression, it's in a very rarely used feature, and the "normal" usage of 'rpm2cpio package.rpm|cpio ...' works just fine. The rpmio subsystem is such a house of cards that I'm not going to risk breaking more commonly used functionality with a hurried fix for a corner-case issue. Moving to 6.4.0.
Comment 7 Panu Matilainen 2012-07-05 09:07:27 EDT
devel_ack, in 4.8.x this can be handled in the timedRead() wrapper easily enough. Upstream will need something different though...
Comment 11 Patrik Kis 2012-10-05 08:44:30 EDT
I'm trying to reproduce this issue, but no error occurred in 200 attempts.

# rpm -q rpm
rpm-4.8.0-27.el6.x86_64
# du -h large-1-1.noarch.rpm
190M	large-1-1.noarch.rpm
# for i in `seq 1 200`; do cat large-1-1.noarch.rpm | rpm2cpio 1>/dev/null; done
#

Any ideas? Is it possible that it depends on the particular rpm used for the test?

Or is there another scenario that can be used to verify the bug fix?
Comment 12 Charlie Brady 2012-10-05 09:12:18 EDT
(In reply to comment #11)

> Any ideas? Is it possible that it depends on the particular rpm used for the test?

It will depend on OS and file system implementation. Maybe on memory pressure too. Try booting your system with mem=...
Comment 13 Panu Matilainen 2012-10-08 02:07:23 EDT
To reproduce, you need a package with a header larger than the system pipe buffer; the total size of the package (or the amount of memory) is not important. Of course there tends to be a correlation: packages with a large header tend to be large overall.

Anyway, the kernel package should be a fairly reliable reproducer.
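
To check the precondition described above on a given machine, the pipe buffer size can be measured empirically. A minimal sketch, assuming Python 3.5 or later; pipe_capacity is a hypothetical helper name, and it simply fills a fresh non-blocking pipe until the kernel refuses more data:

```python
import os


def pipe_capacity(step=4096):
    """Fill a fresh non-blocking pipe and return how many bytes it accepted."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)
    filler = b"\0" * step
    total = 0
    try:
        while True:
            # os.write returns the number of bytes actually accepted.
            total += os.write(write_fd, filler)
    except BlockingIOError:
        # The pipe is full: nothing is reading from the other end.
        pass
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return total


print(pipe_capacity())
```

If this reports 65536, a common Linux default, then the 154603-byte header blob from the description cannot fit into the pipe at once, so a single read() of the header may come back short.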
Comment 15 Panu Matilainen 2012-10-30 06:01:35 EDT
Here's an actual reproducer (although you'll probably want to find a closer mirror for the kernel package):

$ wget http://ftp.funet.fi/pub/mirrors/fedora.redhat.com/pub/fedora/linux/releases/17/Fedora/x86_64/os/Packages/k/kernel-3.3.4-5.fc17.x86_64.rpm
$ cat kernel-3.3.4-5.fc17.x86_64.rpm |rpm2cpio|wc -l

Output with bug present will be something like:
error: rpm2cpio: headerRead failed: hdr blob(681753): BAD, read returned 129672
error reading header from package
0

Output with bug fixed (assuming same kernel package etc):
299593
Comment 16 Patrik Kis 2012-11-14 04:17:48 EST
Thanks Panu for the reproducer; it works well on my workstation, but the success rate drops dramatically on all of the servers in our lab.

I checked the pipe buffer size and it seems to be 65536 bytes on all systems except ppc64, where it is 1048576 bytes. On ppc64 I therefore cannot reproduce it at all, but at least that makes sense. What I don't understand is why it is so hard to reproduce even on x86_64. Any idea? Is there anything other than the pipe buffer size that should be considered on the test systems?

I also tried to create a test package which contained nothing but a large changelog; that again worked fine on my desktop but not on the servers.
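
A test package of the kind described above could be generated along these lines. This is only a sketch: make_spec, the package name and all tag values are hypothetical, and the actual spec file used for testing is not shown in this report.

```python
def make_spec(changelog_bytes):
    """Return spec text whose %changelog is at least `changelog_bytes` long."""
    # Hypothetical minimal spec: no sources, no files, one changelog entry.
    preamble = "\n".join([
        "Name: large-changelog",
        "Version: 1",
        "Release: 1",
        "Summary: Test package with an oversized changelog",
        "Group: Development/Tools",
        "License: Public Domain",
        "BuildArch: noarch",
        "",
        "%description",
        "Package whose header is inflated by a large changelog.",
        "",
        "%files",
        "",
        "%changelog",
        "* Wed Nov 14 2012 Test User <test@example.com> - 1-1",
        "",
    ])
    # The changelog text is stored in the package header, so repeating a
    # filler line grows the header rather than the payload.
    line = "- filler entry to inflate the package header " + "x" * 30 + "\n"
    repeats = changelog_bytes // len(line) + 1
    return preamble + line * repeats


if __name__ == "__main__":
    # 4 MiB is an arbitrary size well above both pipe buffer sizes
    # reported above (65536 and 1048576 bytes).
    with open("large-changelog.spec", "w") as spec_file:
        spec_file.write(make_spec(4 * 1024 * 1024))
```

The resulting file would then be built with 'rpmbuild -bb large-changelog.spec' and the package piped to rpm2cpio as in the earlier comments.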
Comment 17 Panu Matilainen 2012-11-15 06:16:52 EST
I don't see what else besides pipe buffer size could affect it. Where are you getting the pipe buffer size from?

Also... if everything else fails, use a bigger hammer ;) Bigger hammer being a larger package in this case: even the kernel header is below 1M in size, whereas header max size is 64M.
Comment 18 Charlie Brady 2012-11-15 08:49:18 EST
Do you need more than one reproducer? Isn't this a simple problem, fixed by correct use of read() inside fdRead():

http://linux.die.net/man/2/read
Comment 19 Patrik Kis 2012-11-15 10:48:44 EST
(In reply to comment #17)
> I don't see what else besides pipe buffer size could affect it. Where are you
> getting the pipe buffer size from?
> 
I used the script found here:
http://unix.stackexchange.com/questions/11946/how-big-is-the-pipe-buffer

> Also... if everything else fails, use a bigger hammer ;) Bigger hammer being
> a larger package in this case: even the kernel header is below 1M in size,
> whereas header max size is 64M.
Well, the big hammer seems to help. A spec file with a 15M changelog gives quite a good chance of failure. (The lowest rate is on s390x, about 5%, but that is enough to verify the issue given more attempts.)

(In reply to comment #18)
> Do you need more than one reproducer? Isn't this a simple problem, fixed by
> correct use of read() inside fdRead():
> 
> http://linux.die.net/man/2/read
Yes, the problem seems to be quite simple, but I need a deterministic way to reproduce the issue, so that we can confirm there is no regression in the future.
Now I think we have found the way.
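
As a rough guide to how many attempts a 5% per-run failure rate calls for, the arithmetic can be sketched as below. It assumes the runs are independent, which the report does not establish, and both function names are hypothetical.

```python
import math


def chance_of_seeing_failure(failure_rate, attempts):
    """Probability that at least one of `attempts` independent runs fails."""
    return 1.0 - (1.0 - failure_rate) ** attempts


def attempts_needed(failure_rate, confidence):
    """Smallest n for which chance_of_seeing_failure(rate, n) >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


# With the roughly 5% rate seen on s390x, 90 runs give a 99% chance
# of hitting the bug at least once on an unfixed rpm.
print(attempts_needed(0.05, 0.99))  # prints 90
```

Under the same assumption, a loop of 200 attempts like the one in comment 11 would almost certainly show at least one failure on an affected system.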
Comment 23 errata-xmlrpc 2013-02-21 05:51:16 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0461.html
