Created attachment 525098
script I used to produce the crash
Description of problem:
System crashed when I tried the equivalent of "yum update" after boot.
Version-Release number of selected component (if applicable):
I verified the ISO checksum, and K3b reported success when I burned the CD with verification of the written data enabled.
How reproducible:
Probably every time. I tried twice.
The first time, I saved the output of "echo n | yum update" and hand-edited it to get a list of my saved RPM filenames. Then I launched the yum update in a terminal window. While yum was updating, the whole GNOME 3 interface disappeared, leaving only the desktop background and a responsive mouse pointer. CTRL-ALT-Fn showed messages including "auditd disappeared". CTRL-ALT-BKSP gave me a dark screen. I had to power down because CTRL-ALT-DEL got no response.
Booted again in mode 3 to rerun my saved script and save output, which I will post.
Steps to Reproduce:
1. Boot Fedora-16-Beta-x86_64-Live-Desktop.iso
2. In a terminal window, run su -c 'yum update'
Normal Yum update
Will post the following:
yummm: script I used to produce the bug
yummm.one: output of the above
yummm.two: error output of the above
Created attachment 525099
output of yummm
Created attachment 525100
Created attachment 525101
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/yum/rpmtrans.py", line 444, in callback
File "/usr/lib/python2.7/site-packages/yum/rpmtrans.py", line 510, in _instCloseFile
File "/usr/lib/python2.7/site-packages/yum/history.py", line 757, in trans_data_pid_end
File "/usr/lib/python2.7/site-packages/yum/history.py", line 640, in _commit
sqlite3.OperationalError: disk I/O error
error: python callback <bound method RPMTransaction.callback of <yum.rpmtrans.RPMTransaction instance at 0x601dc68>> failed, aborting!
...are you sure that you aren't running out of disk space? Not sure what else could cause that.
That's what I thought. The disk containing the RPMs had about 5G of available space. Maybe it's a system ramdisk that's running out of space. The message doesn't tell me which disk is running out of space.
I could try running df periodically in another window while the update is running.
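For what it's worth, the "run df periodically" idea could be sketched in Python roughly like this. This is only an illustration, not something that was run on the failing system: the function name sample_free_space and the choice of "/" as the path to watch are my own assumptions, and the numbers describe the filesystem only, not any block-level overlay underneath it.

```python
import shutil
import time


def sample_free_space(paths, samples=3, interval=1.0):
    """Return a list of (timestamp, path, free_bytes) tuples.

    Each path is sampled `samples` times, sleeping `interval` seconds
    between rounds.
    """
    readings = []
    for i in range(samples):
        for path in paths:
            usage = shutil.disk_usage(path)  # named tuple: total, used, free
            readings.append((time.time(), path, usage.free))
        if i < samples - 1:
            time.sleep(interval)
    return readings


if __name__ == "__main__":
    # "/" is only an illustrative choice of mount point.
    for stamp, path, free in sample_free_space(["/"], samples=2, interval=0.5):
        print(f"{stamp:.0f} {path} {free / 2**20:.1f} MiB free")
```

Leaving this running in a second terminal, with its output written to a disk that is not part of the live system, would show whether free space was shrinking before the errors started.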
I also noticed a selinux error fly by in the first ten or so updates. If I can get that error again, I'll file another bug report.
I can also try rpm -Uvh .
Well, I can tell you that from the traceback it's failing when it's trying to add data to the yum history file which should be in:
...if that helps. This is just a plain "yum update" right? No anaconda or anything in there that might be doing something weird?
Yes, it's supposed to be a plain yum update, such as a Fedora Live user would issue to try the Live with the latest test updates.
I tried booting in mode 3, and substituting "rpm -Uvh" for "yum update". I sent standard output and standard error to different files. The update ran fine until it got to around "mesa...", at which point there were a number of errors on dm-0. I don't have an exact trace of that yet. At that point, about the only command that worked was "cat", and of course the Power Off button. When I looked at the standard error file, it contained one line, something like "[nnnn blocks]". The fact of this being in standard error is significant, for I also noticed messages, something like "No such device stderr" in dmesg!
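Sending standard output and standard error to different files, as described above, can also be done from a small Python wrapper. The sketch below is only an assumption about how one might do it: run_with_split_output and the output file names are hypothetical, and the demo command is a harmless stand-in for the actual rpm run.

```python
import subprocess
import sys


def run_with_split_output(command, stdout_path, stderr_path):
    """Run command, writing stdout and stderr to separate files.

    Returns the command's exit code.
    """
    with open(stdout_path, "wb") as out, open(stderr_path, "wb") as err:
        completed = subprocess.run(command, stdout=out, stderr=err)
    return completed.returncode


if __name__ == "__main__":
    # Stand-in command that writes one line to each stream.
    demo = [sys.executable, "-c",
            "import sys; print('out'); print('err', file=sys.stderr)"]
    code = run_with_split_output(demo, "update.stdout", "update.stderr")
    print("exit code:", code)
```

Keeping the two streams apart makes it obvious when something unexpected lands in standard error, which is what drew attention to the "[nnnn blocks]" line here.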
I'll repost my results, using dmesg again, and perhaps a new batch of updates that will have been issued since my first attempt to use yum.
Created attachment 525317
Script to show problem
After a number of tries, I have a script (attachment 525098) that shows the bug using RPM only. The procedure to run it is as follows:
1. Place this script on a R/W disk, preferably with journaling
2. Boot Fedora Live and mount aforementioned disk
3. Run the script using typescript, saving result to aforementioned disk
4. When endless errors occur, power down and restart into a normal Linux installation. The output on the disk should be recovered from the journal.
Created attachment 525320
Output of above script
The output shows that I am getting RPMDB errors. But the first sign of trouble is inability to read the CPIO archives.
The first sign of trouble is not being able to write to the disk:
error: unpacking of archive failed on file /usr/lib64/evolution/3.2/libevolution-calendar.so.0.0.0;4e82c503: cpio: write failed - Read-only file system
error: evolution-3.2.0-1.fc16.x86_64: install failed
With the file system turning read-only in the middle of a transaction, things are not going to end well. The question is: who is remounting it read-only (rpm doesn't touch the system mounts)? Or is it just, e.g., a ramdisk giving odd errors when it gets full, or something else?
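One way to confirm when the root file system flips to read-only is to look at its mount options in /proc/mounts. The following is a minimal sketch under that assumption. The helper names are hypothetical and the sample text in the demo is made up, not taken from the failing system.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text.

    Each line has the fields: device, mount point, fs type, options, dump, pass.
    Returns None if the mount point is not listed.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            found = fields[3].split(",")  # a later entry overrides an earlier one
    return found


def is_read_only(mounts_text, mount_point="/"):
    """True if mount_point is listed and its options include 'ro'."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


if __name__ == "__main__":
    # Made-up sample; on a running Linux system, pass the contents of /proc/mounts.
    sample = "/dev/mapper/live-rw / ext4 ro,relatime 0 0\n"
    print("root read-only:", is_read_only(sample, "/"))
```

Checking this alongside dmesg at the moment of the first "Read-only file system" error would help tell a remount apart from errors coming from the block device itself.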
Ehm... tried to update F16-Beta-RC3 live to reproduce this, yum got killed by OOM. Ugh.
Full system crash is reproducible also with Fedora 17-Beta-RC3 Live image.
Install 3 Packages (+2 Dependent packages)
Upgrade 256 Packages
Total download size: 282 M
I have 4 GB of RAM, and this bug is also reproducible with a 2 GB persistent overlay on a 4 GB flash disk:
sudo livecd-iso-to-disk --overlay-size-mb 2048 /path/to/ISO /dev/USBPARTITIONNAME
This bug affects many users who run "yum update" on Fedora Live images during Fedora 17 Test Days. We need it working in order to test last-minute fixes. This bug should be resolved before Beta goes out and more users join the testing. It has already annoyed many testers and could discourage them from attending future Test Days.
Therefore I propose this bug as F17Beta blocker.
No. If you want an updated live spin, build an updated live spin. live images aren't designed for yum update to work. never have been.
Fedora Bugzappers volunteer triage team
Story from Gnome Shell Software Rendering test day on 2012-03-29:
A developer pushed important fixes on the evening of 2012-03-28, and they didn't make it into the nightly build. yum update crashed the whole system with a block I/O error. So I spent the test day building custom images instead of testing and supporting other testers.
In some cases custom images can't be built because of broken dependencies in the updates-testing repo.
A disk installation can often be upgraded successfully from a broken updates-testing repo using "yum upgrade --skip-broken".
Discussed at 2012-04-05 go/no-go meeting (http://meetbot.fedoraproject.org/fedora-meeting-1/2012-04-05/gono-go_continuation_f17_beta_rc3_part_two_or_three.2012-04-05-15.00.html ). Whether it would be useful or not, the fact is that yum update has never worked reliably on live boots, and it is not designed to be so. There is no requirement for this to work in the release criteria. It was agreed that this bug is rejected as a beta blocker.
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
(In reply to comment #13)
> live images aren't designed for yum update to work. never have been.
--> not even supposed to work -> NOTABUG.
This is not a bug in yum/rpm, but in the kernel.
Quoted from http://fedoraproject.org/wiki/How_to_create_and_use_Live_USB#limited_overlay
One very important note about using the "primary" persistent overlay for system changes is that due to the way it's currently implemented (as a Device-mapper copy-on-write snapshot), every single change to it (writes AND deletes) subtracts from its free space, so it will eventually be "used up" and your USB stick will no longer boot.
The persistent overlay status may be queried by issuing this command on the live system:
dmsetup status live-rw
_ _ _
Using "dmsetup" is the only way to determine free space when using a DM overlay. yum and other programs (even "df") don't have correct information.
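To turn the dmsetup output into a usage figure, something like the sketch below could be used. It assumes the snapshot target's status line carries an "allocated/total" sector pair right after the word "snapshot" (or a single word such as "Invalid" once the snapshot has been invalidated). The function name and the sample line are hypothetical, not output captured from this system.

```python
def parse_snapshot_status(line):
    """Parse one device-mapper snapshot status line into a dict.

    Returns None if the line does not describe a snapshot target.
    """
    fields = line.split()
    if "snapshot" not in fields:
        return None
    rest = fields[fields.index("snapshot") + 1:]
    if not rest:
        return None
    if "/" not in rest[0]:
        # A bare word (for example "Invalid") instead of the sector counts.
        return {"state": rest[0], "allocated": None, "total": None,
                "percent_used": None}
    allocated, total = (int(part) for part in rest[0].split("/", 1))
    percent = 100.0 * allocated / total if total else None
    return {"state": "ok", "allocated": allocated, "total": total,
            "percent_used": percent}


if __name__ == "__main__":
    # Made-up sample line in the assumed format.
    print(parse_snapshot_status("0 8388608 snapshot 524288/1048576 2048"))
```

Feeding it the output of "dmsetup status live-rw" at intervals during an update would show how quickly the overlay fills up.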
Even when the DM overlay is full, the system should not crash with I/O errors.
How can device-mapper be configured to store only the latest snapshot of the filesystem overlay instead of the whole history of write/rewrite/delete changes?
Can the Fedora live system use a different overlay implementation instead of DM?
That is way out of scope of this bug. There could theoretically be improvements in how we do live overlay, sure, but a bug report about not being able to yum update a live image isn't really the place. It would be a Feature.