Bug 741594

Summary: yum update crashes 16-Beta-RC3 Desktop Live
Product: [Fedora] Fedora Reporter: Peter H. Jones <jones>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED NOTABUG QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhide CC: awilliam, ffesti, gansalmon, itamar, james.antill, jnovy, jonathan, jones, kernel-maint, madhu.chinakonda, maxamillion, mishu, pmatilai, robatino, tla, zpavlas
Target Milestone: --- Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: RejectedBlocker
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-11-19 15:16:25 EST Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Attachments:
Description                          Flags
script I used to produce the crash   none
output of yummm                      none
Error output                         none
dmesg output                         none
Script to show problem               none
Output of above script               none

Description Peter H. Jones 2011-09-27 07:42:15 EDT
Created attachment 525098 [details]
script I used to produce the crash

Description of problem:
System crashed when I tried the equivalent of "yum update" after boot.

Version-Release number of selected component (if applicable):
Fedora-16-Beta-x86_64-Live-Desktop.iso
Fedora-16-Beta-x86_64-Live-CHECKSUM containing:
807303a13e8a744022e978b9c2458970d793a2d3b5e1d6bfbfac719797f821eb *Fedora-16-Beta-x86_64-Live-Desktop.iso

I verified the checksum, and K3b reported success when I created the CD with verification of the written data.

How reproducible:
Probably every time. I tried twice.
The first time, I saved the output of "echo n | yum update" and hand-edited it to get a list of my saved RPM filenames. Then I launched the yum update in a terminal window. While yum was updating, the whole Gnome 3 interface disappeared, leaving the desktop background and a responsive mouse pointer. CTRL-ALT-Fn showed messages including "auditd disappeared". CTRL-ALT-BKSP gave me a dark screen. I had to power down because CTRL-ALT-DEL showed no response.

Booted again in mode 3 to rerun my saved script and save output, which I will post.

Steps to Reproduce:
1. Boot Fedora-16-Beta-x86_64-Live-Desktop.iso
2. In a terminal window, run su -c 'yum update'
  
Actual results:
Crash

Expected results:
Normal Yum update

Additional info:

Will post the following:

yummm script I used to produce bug
yummm.one output of above
yummm.two error output of above
dmesg output
Comment 1 Peter H. Jones 2011-09-27 07:43:06 EDT
Created attachment 525099 [details]
output of yummm
Comment 2 Peter H. Jones 2011-09-27 07:43:52 EDT
Created attachment 525100 [details]
Error output
Comment 3 Peter H. Jones 2011-09-27 07:45:53 EDT
Created attachment 525101 [details]
dmesg output
Comment 4 James Antill 2011-09-27 12:23:53 EDT
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/yum/rpmtrans.py", line 444, in callback    
  File "/usr/lib/python2.7/site-packages/yum/rpmtrans.py", line 510, in _instCloseFile    
  File "/usr/lib/python2.7/site-packages/yum/history.py", line 757, in trans_data_pid_end
  File "/usr/lib/python2.7/site-packages/yum/history.py", line 640, in _commit    
sqlite3.OperationalError: disk I/O error
error: python callback <bound method RPMTransaction.callback of <yum.rpmtrans.RPMTransaction instance at 0x601dc68>> failed, aborting!

...are you sure that you aren't running out of disk space? Not sure what else could cause that.
Comment 5 Peter H. Jones 2011-09-27 15:17:30 EDT
That's what I thought. The disk containing the RPMs had about 5G of available space. Maybe it's a system ramdisk that's running out of space. The message doesn't tell me which disk is running out of space.

I could try running df periodically in another window while the update is running.

I also noticed a selinux error fly by in the first ten or so updates. If I can get that error again, I'll file another bug report.

I can also try rpm -Uvh .
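
The idea of running df periodically can be sketched as a small poller. This is only an illustration: it assumes Python 3 (the live image in this report ships Python 2.7), the names free_space_report and watch are made up for the example, and it only helps if the filesystem reports its free space accurately.

```python
import shutil
import time


def free_space_report(paths):
    """Return {path: (free_bytes, percent_used)} for each given path."""
    report = {}
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        percent_used = 100.0 * usage.used / usage.total if usage.total else 0.0
        report[path] = (usage.free, percent_used)
    return report


def watch(paths, interval=5.0, rounds=3):
    """Print one line per path every `interval` seconds, `rounds` times."""
    for _ in range(rounds):
        stamp = time.strftime("%H:%M:%S")
        for path, (free, pct) in free_space_report(paths).items():
            print(f"{stamp} {path}: {free // (1024 * 1024)} MiB free, {pct:.1f}% used")
        time.sleep(interval)
```

Calling watch(["/"], interval=5.0, rounds=100) in a second terminal while the update runs would leave a trail of readings up to the point of failure.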
Comment 6 James Antill 2011-09-27 15:25:30 EDT
Well, I can tell you that from the traceback it's failing when it's trying to add data to the yum history file which should be in:

<installroot>/var/lib/yum/history/blah.sqlite

...if that helps. This is just a plain "yum update" right? No anaconda or anything in there that might be doing something weird?
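
Since the error message does not say which disk is involved, one way to get that context is to wrap the commit and attach the database path and the free space of its filesystem to the exception. This is a hypothetical sketch in Python 3, not yum's actual code; commit_with_context and describe_filesystem are names invented for the example.

```python
import os
import sqlite3


def describe_filesystem(path):
    """Return a short string with free space for the filesystem holding `path`."""
    directory = os.path.dirname(os.path.abspath(path)) or "/"
    st = os.statvfs(directory)  # Unix only
    free_mib = st.f_bavail * st.f_frsize // (1024 * 1024)
    return f"{directory}: {free_mib} MiB available"


def commit_with_context(conn, db_path):
    """Commit; on OperationalError, re-raise with the database location attached."""
    try:
        conn.commit()
    except sqlite3.OperationalError as exc:
        raise sqlite3.OperationalError(
            f"{exc} (database {db_path}; {describe_filesystem(db_path)})"
        ) from exc
```

The free-space figure comes from statvfs, so it is only as trustworthy as what the filesystem itself reports.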
Comment 7 Peter H. Jones 2011-09-27 18:48:08 EDT
Yes, it's supposed to be a plain yum update, such as a Fedora Live user would issue to try the Live with the latest test updates.

I tried booting in mode 3, and substituting "rpm -Uvh" for "yum update". I sent standard output and standard error to different files. The update ran fine until it got to around "mesa...", at which point there were a number of errors on dm-0. I don't have an exact trace of that yet. At that point, about the only command that worked was "cat", and of course the Power Off button. When I looked at the standard error file, it contained one line, something like "[nnnn blocks]". The fact of this being in standard error is significant, for I also noticed messages, something like "No such device stderr" in dmesg!

I'll repost my results, using dmesg again, and perhaps a new batch of updates that will have been issued since my first attempt to use yum.
Comment 8 Peter H. Jones 2011-09-28 07:10:24 EDT
Created attachment 525317 [details]
Script to show problem

After a number of tries, I have a script at attachment 525098 [details] that shows the bug, using RPM only. Procedure to run it is as follows:

1. Place this script on a R/W disk, preferably with journaling
2. Boot Fedora Live and mount aforementioned disk
3. Run the script using typescript, saving result to aforementioned disk
4. When endless errors occur, power down, and restart with a normal Linux. Output on the disk should be recovered from the journal.
Comment 9 Peter H. Jones 2011-09-28 07:12:53 EDT
Created attachment 525320 [details]
Output of above script

The output shows that I am getting RPMDB errors. But the first sign of trouble is inability to read the CPIO archives.
Comment 10 Panu Matilainen 2011-09-29 01:41:18 EDT
The first sign of trouble is not being able to write to the disk:
error: unpacking of archive failed on file /usr/lib64/evolution/3.2/libevolution-calendar.so.0.0.0;4e82c503: cpio: write failed - Read-only file system
error: evolution-3.2.0-1.fc16.x86_64: install failed

With the file system turning read-only in the middle of a transaction, things are not going to end well... The question is: who's remounting it read-only (rpm doesn't touch the system mounts), or is that just e.g. a ramdisk giving funny errors when it gets full (or something else).
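
Whether the root filesystem has actually gone read-only can be checked from /proc/mounts. The following is a minimal Python 3 sketch; the function names are invented for the example, and it only reports the current mount options, not who changed them.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list of the last /proc/mounts entry for mount_point, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")  # a later entry shadows an earlier one
    return options


def is_read_only(mounts_text, mount_point="/"):
    """True if the mount point is listed with the "ro" option."""
    opts = mount_options(mounts_text, mount_point)
    return opts is not None and "ro" in opts
```

On a running system the input would come from open("/proc/mounts").read(), checked before and after the failing transaction.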
Comment 11 Panu Matilainen 2011-09-29 10:07:44 EDT
Ehm... tried to update F16-Beta-RC3 live to reproduce this, yum got killed by OOM. Ugh.
Comment 12 Martin 2012-04-05 10:36:35 EDT
The full system crash is also reproducible with the Fedora 17-Beta-RC3 Live image.

Transaction Summary
================================================================================
Install    3 Packages (+2 Dependent packages)
Upgrade  256 Packages

Total download size: 282 M

I have 4 GB of RAM, and this bug is also reproducible with a 2 GB persistent overlay on a 4 GB flash disk:
sudo livecd-iso-to-disk --overlay-size-mb 2048 /path/to/ISO /dev/USBPARTITIONNAME


This bug affects many users who run "yum update" on Fedora Live images during Fedora 17 Test Days. We need it working to test last-minute fixes. This bug should be resolved before Beta goes out and more users join the testing. It has already annoyed many testers and could discourage them from attending future Test Days.

Therefore I propose this bug as F17Beta blocker.
Comment 13 Adam Williamson 2012-04-05 10:50:35 EDT
No. If you want an updated live spin, build an updated live spin. live images aren't designed for yum update to work. never have been.



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 14 Martin 2012-04-05 11:18:33 EDT
Story from the Gnome Shell Software Rendering test day on 2012-03-29:
A developer pushed important fixes on the evening of 2012-03-28, and they didn't make it into the nightly build. yum update crashed the whole system with a block I/O error. So I spent the test day building custom images instead of testing and supporting other testers.
In some cases custom images can't be built because of broken dependencies in the updates-testing repo.

A disk installation can often be upgraded successfully from a broken updates-testing repo using "yum upgrade --skip-broken".
Comment 15 Adam Williamson 2012-04-06 00:04:09 EDT
Discussed at 2012-04-05 go/no-go meeting (http://meetbot.fedoraproject.org/fedora-meeting-1/2012-04-05/gono-go_continuation_f17_beta_rc3_part_two_or_three.2012-04-05-15.00.html ). Whether it would be useful or not, the fact is that yum update has never worked reliably on live boots, and it is not designed to be so. There is no requirement for this to work in the release criteria. It was agreed that this bug is rejected as a beta blocker.
Comment 16 Fedora Admin XMLRPC Client 2012-04-13 19:07:10 EDT
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 17 Fedora Admin XMLRPC Client 2012-04-13 19:10:42 EDT
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 18 Panu Matilainen 2012-11-19 06:47:19 EST
(In reply to comment #13)
> live images aren't designed for yum update to work. never have been.

--> not even supposed to work -> NOTABUG.
Comment 19 Martin 2012-11-19 09:14:14 EST
This is not a bug in yum/rpm, but in the kernel.

Quoting from http://fedoraproject.org/wiki/How_to_create_and_use_Live_USB#limited_overlay

One very important note about using the "primary" persistent overlay for system changes is that due to the way it's currently implemented (as a Device-mapper copy-on-write snapshot), every single change to it (writes AND deletes) subtracts from its free space, so it will eventually be "used up" and your USB stick will no longer boot.
...
The persistent overlay status may be queried by issuing this command on the live system:
dmsetup status live-rw
_ _ _

Problems:
Using "dmsetup" is the only way to determine free space when using the DM overlay. yum and other programs (even "df") don't have correct information.

Even when the DM overlay is full, the system should not crash with I/O errors.
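
As an illustration of what a program would have to do to learn the overlay's fill level, here is a minimal Python 3 sketch that parses one line of "dmsetup status live-rw" output. It assumes the snapshot target's usual layout of start, length, "snapshot", allocated/total sectors, and an optional metadata count; the function names are invented for the example, and a non-numeric state such as "Invalid" is returned as None.

```python
def parse_snapshot_status(line):
    """Parse one `dmsetup status` line for a snapshot target.

    Returns (allocated_sectors, total_sectors), or None when the fourth
    field is not a numeric allocated/total pair (e.g. "Invalid").
    """
    fields = line.split()
    # Assumed layout: <start> <length> snapshot <allocated>/<total> [<metadata>]
    if len(fields) < 4 or fields[2] != "snapshot":
        raise ValueError("not a snapshot status line: %r" % line)
    if "/" not in fields[3]:
        return None
    allocated, total = fields[3].split("/", 1)
    return int(allocated), int(total)


def percent_used(line):
    """Return overlay usage as a percentage, or None if it cannot be computed."""
    parsed = parse_snapshot_status(line)
    if parsed is None:
        return None
    allocated, total = parsed
    return 100.0 * allocated / total
```

A caller could run the dmsetup command with subprocess and refuse to start a large transaction once percent_used gets close to 100.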

Questions:
How can device-mapper be configured to store only the latest state of the filesystem overlay instead of the whole history of write/rewrite/delete changes?

Can the Fedora live system use a different overlay implementation instead of DM?
Comment 20 Adam Williamson 2012-11-19 15:16:25 EST
That is way out of scope of this bug. There could theoretically be improvements in how we do live overlay, sure, but a bug report about not being able to yum update a live image isn't really the place. It would be a Feature.