Bug 475633
Summary: | yum silently bails out trying to update kernel | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Valdis Kletnieks <valdis.kletnieks> | ||||
Component: | yum | Assignee: | Seth Vidal <skvidal> | ||||
Status: | CLOSED NEXTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | low | ||||||
Version: | rawhide | CC: | ffesti, james.antill, katzj, michal, pmatilai, tim.lauridsen, yaneti, zaitcev | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2008-12-16 16:11:31 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
So: 1. check 'dmesg' or your logs and see if the process is being oom-killed 2. how much ram do you have on this machine. 3. does it happen for anything else or just kernels? 1) Absolutely nothing in dmesg and /var/log/messages. /var/log/yum just has lines listing the RPMs that successfully updated, nothing resembling an error message. Plus, if a child process had OOM'ed or taken a signal, we'd have seen a SIGCHLD in the strace. 2) 2G of RAM, another 2G of swap, 0% used. /proc/meminfo and related show no memory pressure. 3) I've updated 31 other RPMs from Rawhide today, including kernel-headers and kernel-firmware. It's only the kernel RPM that causes issue. I wonder if it's the fact that the kernel is 'installonly' causing a problem. The code looks like: self.verbose_logger.log(yum.logginglevels.INFO_2, _('Finished Transaction Test')) if len(tserrors) > 0: errstring = _('Transaction Check Error:\n') for descr in tserrors: errstring += ' %s\n' % to_unicode(descr) raise yum.Errors.YumBaseError, errstring + '\n' + \ self.errorSummary(errstring) self.verbose_logger.log(yum.logginglevels.INFO_2, _('Transaction Test Succeeded')) ...so I'll put money on the fact that "transaction check error" is being raised, but something is going wrong and the output turns into '\n\n\n' (and I'd put money on a unicode thing upsetting the logging there). So your best bet to get to the bottom of this is to add print statements to find out what the problem is (just before the raise, line 440 in cli.py). Yeh, this is where the raise ends up: def exFatal(e): logger.critical('\n\n%s', to_unicode(e)) ...and the logger puts a \n on the end automatically ... and if it doesn't like the unicode => str conversion it just replaces the %s to '' ... which gives you \n\n\n. I would guess your /boot is getting low on space. Actually, it wasn't /boot, it looks like / is a tad tight: # df / Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/VolGroup00-root 1566576 1358472 129464 92% / Which looked like plenty of space because I "knew" that a kernel would use about 90M in /lib/modules, leaving 40M to spare. However.... # du -s /lib/modules/2.6.2*.x86_64 85948 /lib/modules/2.6.27.5-113.fc10.x86_64 85972 /lib/modules/2.6.27.5-117.fc10.x86_64 195384 /lib/modules/2.6.28-0.110.rc7.git3.fc11.x86_64 195384 /lib/modules/2.6.28-0.113.rc7.git5.fc11.x86_64 195788 /lib/modules/2.6.28-0.114.rc7.git5.fc11.x86_64 Those kernel modules got awfully piggy. Compiled with -g? # size 2.6.27.5-113.fc10.x86_64/kernel/arch/x86/oprofile/oprofile.ko text data bss dec hex filename 19269 10752 736 30757 7825 2.6.27.5-113.fc10.x86_64/kernel/arch/x86/oprofile/oprofile.ko # size 2.6.28-0.114.rc7.git5.fc11.x86_64/kernel/arch/x86/oprofile/oprofile.ko text data bss dec hex filename 22741 68784 864 92389 168e5 2.6.28-0.114.rc7.git5.fc11.x86_64/kernel/arch/x86/oprofile/oprofile.ko So about the same binary image size... # ls -l 2.6.27.5-113.fc10.x86_64/kernel/arch/x86/oprofile/oprofile.ko -rwxr--r-- 1 root root 67552 2008-11-17 04:22 2.6.27.5-113.fc10.x86_64/kernel/arch/x86/oprofile/oprofile.ko # ls -l 2.6.28-0.114.rc7.git5.fc11.x86_64/kernel/arch/x86/oprofile/oprofile.ko -rwxr--r-- 1 root root 134216 2008-12-07 22:49 2.6.28-0.114.rc7.git5.fc11.x86_64/kernel/arch/x86/oprofile/oprofile.ko But the .ko's have doubled in size. Yes, this looks like a kernel RPM packaging issue - but yum should be able to output the message that it needs more room on '/'. It manages to do so for other filesystems. Is the fact it's '/' with no chars before-after causing a Python indigestion? 1. What lang/locale are you running in? 2. The 3 blank lines ARE that warning message. # echo $LANG en_US.UTF-8 And yes, I realize the blank lines are what little of the message made it out. Just commenting that the *rest* of the text would have been nice to have. ;) what ver of yum were you using when this happened? I'm trying to replicate it and I just cannot. Wow, I thought I included it - yum-3.2.20-7.fc11.noarch Is this python 2.6? python-2.6-2.fc11.x86_64 I just replicated it - 'df' reports 120M free on /. I was able to successfully install one 'kernel' update the other day after 'rpm -e' an older kernel so there was enough free space on /. So it's definitely root-caused by not enough space for the /lib/modules/<kernel> directory, but unclear why the error message gets bollixed up. (Incidentally, I was able to verify that out-of-space on some *other* filesystem (/usr/share in this case) properly reports the error, so it's something having to do with '/' in particular. Is some code someplace treating the filesystem name as '/$something' and going south when $something is ""? That's my first guess... Okay in python 2.6 only, so far the following is happening: We're passing an exception instance to yum.misc.to_unicode() which should just return the object b/c the exception instance is not also an instance of basestring. Then we're logger.critical('\n\n%s', to_unicode(e)). Which is not, apparently able to do the string assertion any longer. I tested this with a simple test and it looks like the logging module logger.critical() which I eventually traced back to logging/__init__.py LogRecord.getMessage() method. which does: msg % self.args which, apparently on python 2.6 no longer does the string assertion for exception objects. I scanned through the python 2.6 whats new pages looking for something related to this and I couldn't quite find anything. I can make a change to yum's to_unicode() which sorta 'handles' this. But I ultimately think it's an issue in logging or maybe in python's str assertions itself. Welcoming input from any of the other python folks about this. Okay, Ignacio found the specific problem bug report to python: http://bugs.python.org/issue1551432 patch to solve this in yum: http://yum.baseurl.org/gitweb?p=yum.git;a=commitdiff;h=5c4577185666afe7928073924f8c1ebd71de07ac it'll be out in the next yum release (3.2.21) If somebody needs it I have a repeatable case with tons of a disk space free and not a kernel. Packaging for gawk is broken in 3.1.6-3.fc11 (see bug 476240). 'yum -y -d 10 localupdate gawk-3.1.6-3.fc11.x86_64.rpm' prints the usual stuff, nothing really relevant to the problem on hand, and after that: Total download size: 2.6 M Downloading Packages: and this is it (including all those newlines). Michal, right, it is probably a transaction failure due to fileconflicts or something like that. It doesn't really matter, the issue is not related to the transaction at all and I'm pretty sure the patch I posted in comment 14 solves it. have you tried it? > it is probably a transaction failure due to fileconflicts or something > like that. Actually in the case of bug 476240 it is a broken attempt to create a broken cross-device hard link. > and I'm pretty sure the patch I posted in comment 14 solves it. > have you tried it? One apparently needs git to retrieve it; it does not matter as it is small and I applied it by hand. That indeed printed "Package gawk-3.1.6-3.fc11.x86_64.rpm is not signed" which I forgotten about. Only with 'yum -y -d 10 --nogpgcheck localupdate gawk-3.1.6-3.fc11.x86_64.rpm' I see now: .... Running Transaction Updating : gawk 1/2 Error unpacking rpm package gawk-3.1.6-3.fc11.x86_64 error: unpacking of archive failed on file /bin/pgawk-3.1.6a;49497911: cpio: link Warning: scriptlet or other non-fatal errors occurred during transaction. Running "posttrans" handler for "refresh-updatesd" plugin yum-updatesd not on the bus Running "posttrans" handler for "refresh-packagekit" plugin Transaction time: 1.641 =============================== Leaving rpm code =============================== Updated: gawk.x86_64 0:3.1.6-3.fc11 Complete! Errors are actually pretty fatal and I am left with gawk-3.1.6-2.fc11.x86_64. I am not sure if this is much of an improvement. :-) Since the goal was to output the error report and not have it blankly disappear I think we achieved our goal. If you have another bug please file it separately. > If you have another bug please file it separately. Ah, but that another kind of bugs was already reported in the past (see, for example, bug 216221; although this one was reassigned to rpm) and closed. In general the issue is that yum reports something done while it was not. *** Bug 477884 has been marked as a duplicate of this bug. *** yum-3.2.21-2.fc10 has been submitted as an update for Fedora 10. http://admin.fedoraproject.org/updates/yum-3.2.21-2.fc10 yum-3.2.21-2.fc9 has been submitted as an update for Fedora 9. http://admin.fedoraproject.org/updates/yum-3.2.21-2.fc9 yum-3.2.21-2.fc10 has been pushed to the Fedora 10 stable repository. If problems still persist, please make note of it in this bug report. yum-3.2.21-2.fc9 has been pushed to the Fedora 9 stable repository. If problems still persist, please make note of it in this bug report. |
Created attachment 326405 [details] strace -r -f yum -y update kernel Description of problem: While trying to upgrade kernel 2.6.28-0.121.rc7.git5.fc11, I found that yum would simply exit after running the transaction test, with no indication of what the problem was. # rpm -qa kernel kernel-2.6.28-0.114.rc7.git5.fc11.x86_64 kernel-2.6.28-0.110.rc7.git3.fc11.x86_64 kernel-2.6.28-0.113.rc7.git5.fc11.x86_64 # yum -y update kernel Loaded plugins: refresh-packagekit, refresh-updatesd Excluding Packages in global exclude list Finished Setting up Update Process Resolving Dependencies There are unfinished transactions remaining. You might consider running yum-complete-transaction first to finish them. --> Running transaction check ---> Package kernel.x86_64 0:2.6.28-0.121.rc7.git5.fc11 set to be installed --> Finished Dependency Resolution --> Running transaction check ---> Package kernel.x86_64 0:2.6.28-0.110.rc7.git3.fc11 set to be erased --> Finished Dependency Resolution Dependencies Resolved ================================================================================ Package Arch Version Repository Size ================================================================================ Installing: kernel x86_64 2.6.28-0.121.rc7.git5.fc11 rawhide 22 M Removing: kernel x86_64 2.6.28-0.110.rc7.git3.fc11 installed 173 M Transaction Summary ================================================================================ Install 1 Package(s) Update 0 Package(s) Remove 1 Package(s) Total size: 22 M Downloading Packages: ============================== Entering rpm code =============================== Running rpm_check_debug Running Transaction Test Finished Transaction Test # And that's it. It burps out several blank lines and quits. (Note that I had already upgraded kernel-firmware and kernel-headers separately - those upgraded just fine, it's just the 'kernel' RPM exhibiting this). I've attached the tail end of an 'strace -r -f yum update -y kernel', and from that it's pretty obvious that after it prints 'Finished transaction test', it's ready to pack it in and go home - about all that happens after that are closing files. Oddest thing - what's writing those 3 newlines out, and why? Version-Release number of selected component (if applicable): How reproducible: Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info: