Bug 854798
Summary: | rpm crash with SIGBUS in certain operations when package file has I/O error(s) | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Frantisek Hanzlik <franta> |
Component: | rpm | Assignee: | Fedora Packaging Toolset Team <packaging-team> |
Status: | CLOSED NEXTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Priority: | unspecified |
Version: | 17 | CC: | ffesti, jzeleny, packaging-team, pknirsch, pmatilai, stuart |
Hardware: | Unspecified | OS: | Linux |
Doc Type: | Bug Fix | Type: | Bug |
Last Closed: | 2013-07-04 07:39:05 UTC | | |
Description
Frantisek Hanzlik
2012-09-06 02:18:37 UTC
Rpm needs to look at previous package contents when it's a %config file which might've been modified and thus could need a backup. --justdb only affects writing of the new contents, not everything rpm does. Obviously rpm shouldn't crash in case of an I/O error on a file, but these kinds of errors are rare and hard to reproduce or systematically test against. Do you happen to have a way of reproducing the "Input/output error" on that file?

*** Bug 877742 has been marked as a duplicate of this bug. ***

FWIW if somebody can easily reproduce disk IO errors, it wouldn't hurt to test with rpm >= 4.10.1 (i.e. Fedora >= 18), as the newer versions no longer use mmap() in this context.

dmsetup simulates IO errors with the error target. I've never tried to use device-mapper directly - just use lvm2. But I'll give it a try - should be educational.

o I'll try to make a script that creates a file of a few megs, creates a loopback device, then a device mapper device that replaces selected sectors with error.
o then that bad_disk is mounted on /var/test, and a test.rpm installs files to /var/test.
o somehow, we ensure that the simulated bad blocks are in one of the files.

I think debugfs can provide block mapping for files, and then maybe the dmsetup script above can be smart enough to *add* specific bad blocks while mounted. But that may be too advanced for me. Might have to create a few, and reinstall the rpm until it hits one.

Example block device with errors here: http://mbroz.fedorapeople.org/talks/DeviceMapperBasics/dm.pdf

Created attachment 648127: SRPM of test case for rpm

Created a simple test case for this rpm bug. Uploaded abrt from running it to bug#878201 (abrt doesn't like to let you specify an existing bug).
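The plan described above (a loop device with a device-mapper table that turns selected sectors into read errors) might look roughly like the sketch below. It is not the contents of the attached testrpm.dm, which is not shown in this report. The function name make_bad_table and the sector numbers in the usage comment are made up for illustration; the table lines use the standard "start length target args" layout of the linear and error targets.

```shell
#!/bin/sh
# Hypothetical helper: print a device-mapper table that maps a backing
# device 1:1, except for one run of sectors handed to the "error" target,
# so that any read touching those sectors fails with an I/O error.
#   $1 backing device        $2 total size in 512-byte sectors
#   $3 first bad sector      $4 number of bad sectors
make_bad_table() {
    dev=$1 total=$2 bad_start=$3 bad_len=$4
    bad_end=$((bad_start + bad_len))
    if [ "$bad_start" -lt 0 ] || [ "$bad_len" -le 0 ] || [ "$bad_end" -gt "$total" ]; then
        echo "make_bad_table: bad range does not fit on the device" >&2
        return 1
    fi
    # Healthy region before the bad run (skipped when the run starts at 0).
    if [ "$bad_start" -gt 0 ]; then
        echo "0 $bad_start linear $dev 0"
    fi
    # The simulated bad sectors.
    echo "$bad_start $bad_len error"
    # Healthy region after the bad run (skipped when the run ends the device).
    if [ "$bad_end" -lt "$total" ]; then
        echo "$bad_end $((total - bad_end)) linear $dev $bad_end"
    fi
}

# Usage sketch (needs root; sector numbers are arbitrary examples for a
# 2 MiB image, i.e. 4096 sectors):
#   make_bad_table /dev/loop0 4096 2000 8 > bad.dm
#   dmsetup create bad_disk bad.dm
#   mount /dev/mapper/bad_disk /mnt/testrpm
```

Whether the bad run actually lands inside a packaged file depends on where the filesystem placed that file, which is the "somehow" step in the list above.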
Instructions:
1) inspect, build SRPM
2) install RPM
3) cd /var/testrpm
4) sh -x testrpm.sh

[root@sdg testrpm]# sh -x testrpm.sh
++ losetup -f
+ loopdev=/dev/loop0
+ '[' /dev/loop0 '!=' /dev/loop0 ']'
+ dd if=/dev/zero of=testdisk.img bs=1k count=2048
2048+0 records in
2048+0 records out
2097152 bytes (2.1 MB) copied, 0.0355833 s, 58.9 MB/s
+ mke2fs -F testdisk.img
mke2fs 1.42.3 (14-May-2012)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
Stride=0 blocks, Stripe width=0 blocks
256 inodes, 2048 blocks
102 blocks (4.98%) reserved for the super user
First data block=1
Maximum filesystem blocks=2097152
1 block group
8192 blocks per group, 8192 fragments per group
256 inodes per group
Allocating group tables: done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
+ mv /mnt/testrpm /mnt/testrpm.save
+ mkdir -p /mnt/testrpm
+ mount -o loop testdisk.img /mnt/testrpm
+ rsync -rav /mnt/testrpm.save/ /mnt/testrpm
sending incremental file list
./
testrpm.dat
sent 1048796 bytes received 34 bytes 2097660.00 bytes/sec
total size is 1048576 speedup is 1.00
+ rpm -V testrpm
+ umount /mnt/testrpm
+ losetup /dev/loop0 /home/stuart/testrpm/testdisk.img
+ dmsetup create bad_disk testrpm.dm
+ mount /dev/mapper/bad_disk /mnt/testrpm
+ rpm -V testrpm
testrpm.sh: line 28: 2813 Bus error (core dumped) rpm -V testrpm
+ umount /mnt/testrpm
+ dmsetup remove bad_disk
+ losetup -d /dev/loop0
+ rmdir /mnt/testrpm
+ mv /mnt/testrpm.save /mnt/testrpm
+ rpm -V testrpm
BDB2053 Freeing read locks for locker 0x501: 2813/3078138240
BDB2053 Freeing read locks for locker 0x502: 2813/3078138240
BDB2053 Freeing read locks for locker 0x503: 2813/3078138240
BDB2053 Freeing read locks for locker 0x504: 2813/3078138240
BDB2053 Freeing read locks for locker 0x505: 2813/3078138240
BDB2053 Freeing read locks for locker 0x506: 2813/3078138240
BDB2053 Freeing read locks for locker 0x507: 2813/3078138240
BDB2053 Freeing read locks for locker 0x508: 2813/3078138240
BDB2053 Freeing read locks for locker 0x509: 2813/3078138240
BDB2053 Freeing read locks for locker 0x50a: 2813/3078138240
BDB2053 Freeing read locks for locker 0x50b: 2813/3078138240
BDB2053 Freeing read locks for locker 0x50c: 2813/3078138240
BDB2053 Freeing read locks for locker 0x50d: 2813/3078138240
BDB2053 Freeing read locks for locker 0x50e: 2813/3078138240
[root@sdg testrpm]#

test case provided. Can install testrpm and run repeatedly until fixed.

*** Bug 878201 has been marked as a duplicate of this bug. ***

Ooh, nice. Thanks for the very nice reproducer. And education :) I had no idea I had such disk-error simulation tools right under my nose all this time.

The good news is that rpm >= 4.10.1 doesn't crash on it, not using mmap() is where the difference comes from. It doesn't report an actual IO error on it though, just a "cannot compute" question mark but that's still a whole lot better than crashing:

[root@localhost testrpm]# sh testrpm.sh 2>&1|grep /testrpm
S.5....T. /var/testrpm/testrpm.sh
S.?...... /mnt/testrpm/testrpm.dat
S.5....T. /var/testrpm/testrpm.sh
S.5....T. /var/testrpm/testrpm.sh
[root@localhost testrpm]#

Is there a performance difference between mmap and read for verify?

I would like to see an E for Error replace the 5 when there is an IO Error. Although a '?' meets the specification ("the test could not be performed"), IO errors, when they occur, are something you *really* want to know about.

This message is a reminder that Fedora 17 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 17. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '17'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 17 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Closing NEXTRELEASE, there are no updates planned for F17 at this point anymore. As for better indication of IO errors, that's more of an upstream issue.
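Until rpm reports I/O errors more explicitly, the '?' marker seen in the verify output earlier in this report can at least be picked out mechanically. The following is a minimal sketch, not part of rpm: the function name list_unverifiable is made up, it assumes the path is the last whitespace-separated field (so paths containing spaces are not handled), and a '?' only means the test could not be performed, which can also have causes other than an I/O error, such as unreadable permissions.

```shell
#!/bin/sh
# Hypothetical helper: read `rpm -V` output on stdin and print the path of
# every entry whose attribute string (first field) contains a '?', i.e. at
# least one verification test could not be performed on that file.
list_unverifiable() {
    awk '$1 ~ /\?/ { print $NF }'
}

# Usage sketch:
#   rpm -V testrpm | list_unverifiable
```

For the output shown above this would print only /mnt/testrpm/testrpm.dat; the entries with a plain 5 (digest mismatch) are left out.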