Description of problem:

Upgrading to the latest kernel (8.1.6) has been hanging for 4.5 hours in the following state:

root 3605 0.0 0.0 62888 1092 pts/0 S+ 16:46 0:00 | \_ /bin/sh /var/tmp/rpm-tmp.34420 4
root 3609 0.0 0.0 62892 1220 pts/0 S+ 16:46 0:00 |   \_ /bin/bash /sbin/new-kernel-pkg --package kernel --mkinitrd --depmod --install 2.6.18-8.1.6.el5
root 3618 0.0 0.0 63288 1560 pts/0 S+ 16:46 0:00 |     \_ /bin/bash --norc /sbin/mkinitrd --allow-missing -f /boot/initrd-2.6.18-8.1.6.el5.img 2.6.18-8.1.6.el5
root 3872 0.0 0.0 63288  912 pts/0 S+ 16:46 0:00 |       \_ /bin/bash --norc /sbin/mkinitrd --allow-missing -f /boot/initrd-2.6.18-8.1.6.el5.img 2.6.18-8.1.6.el5
root 3873 0.0 0.0 56348  928 pts/0 S+ 16:46 0:00 |         \_ lvm.static lvs --ignorelockingfailure --noheadings -o vg_name /dev/systemjunior/root

# strace -p 3873
Process 3873 attached - interrupt to quit
read(3, <unfinished ...>
Process 3873 detached

# lsof -p 3873
COMMAND    PID USER   FD   TYPE             DEVICE     SIZE     NODE NAME
lvm.stati 3873 root  cwd    DIR              253,0     4096        2 /
lvm.stati 3873 root  rtd    DIR              253,0     4096        2 /
lvm.stati 3873 root  txt    REG              253,0  1696856  7798889 /sbin/lvm.static
lvm.stati 3873 root  mem    REG              253,0 55516736  5475361 /usr/lib/locale/locale-archive
lvm.stati 3873 root  mem    REG              253,0    25462  5570819 /usr/lib64/gconv/gconv-modules.cache
lvm.stati 3873 root    0r  FIFO                0,6          99681795 pipe
lvm.stati 3873 root    1w  FIFO                0,6          99682059 pipe
lvm.stati 3873 root    2w   CHR                1,3              1457 /dev/null
lvm.stati 3873 root    3u  unix 0xffff880043e73c40          99682062 socket

# lsof | grep 99682062
lvm.stati 3873 root    3u  unix 0xffff880043e73c40          99682062 socket

So lvm.static is trying to read from a socket that no other process appears to be talking to.

Version-Release number of selected component (if applicable):
kernel-2.6.18-8.1.6.el5
lvm2-2.02.16-3.el5
mkinitrd-5.1.19.6-1

How reproducible:
-

Steps to Reproduce:
1. yum upgrade from an otherwise up-to-date RHEL5 system.

Actual results:
See above

Expected results:
No hanging of mkinitrd

Additional info:
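For anyone wanting to repeat the "lsof | grep <inode>" check without lsof, the following is a minimal sketch that scans /proc/<pid>/fd for a given socket inode. The function name holders_of_socket is made up for illustration, and the sketch assumes a Linux /proc filesystem and enough privileges (normally root) to read other processes' fd directories.

```python
import os

def holders_of_socket(inode):
    """Return sorted PIDs that have socket:[inode] open (hypothetical helper)."""
    target = "socket:[%d]" % inode
    pids = set()
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    pids.add(int(entry))
            except OSError:
                continue  # fd was closed while we were scanning
    return sorted(pids)

if __name__ == "__main__":
    # 99682062 is the NODE value from the lsof output above.
    print(holders_of_socket(99682062))
```

Note that the two ends of a connected Unix socket have different inode numbers, so a search like this (or the lsof grep) only finds processes sharing the same end of the socket, not the peer.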
I have the same problem on RHEL 5.3! The affected system was carefully upgraded from 4.7 using a 5.3 DVD. All issues with .rpmsave/.rpmnew files have been resolved. Everything is working fine except for the problem with kernel upgrades.

Here's my process tree:

 9909   | - /usr/bin/python /usr/bin/yum upgrade
10510   | - /bin/sh /var/tmp/rpm-tmp.20713 3
10512   | - /bin/bash /sbin/new-kernel-pkg --package kernel-PAE --mkinitrd --depmod --install 2.6.18-128.1.10.el5PAE
10521   | - /bin/bash --norc /sbin/mkinitrd --allow-missing -f /boot/initrd-2.6.18-128.1.10.el5PAE.img 2.6.18-128.1.10.el5PAE
10649   | - /bin/bash --norc /sbin/mkinitrd --allow-missing -f /boot/initrd-2.6.18-128.1.10.el5PAE.img 2.6.18-128.1.10.el5PAE
10650 D | - lvm.static lvs --ignorelockingfailure --noheadings -o vg_name /dev/md1

I don't know what this lvm.static call is for. /dev/md1 is used for swap only. The only LVM2 physical volume I have is on /dev/md2. /dev/md0 is used directly for /boot. All three /dev/mdX devices are RAID1 arrays of two IDE hard disks.

Nevertheless, I found out why it is hanging! Running

strace lvm.static lvs --ignorelockingfailure --noheadings -o vg_name /dev/md1

instead of strace -p PID, as Axel tried, gave me the whole picture: lvm.static is trying to access /dev/cdrom. This fails because there is no CD in the drive. If there is a CD in the drive, the kernel upgrade succeeds! The drive is a Panasonic slot-in IDE CD-ROM.

Please fix this problem, as it is really annoying and certainly a bug. It is 100% reproducible and happens every time on my system.
- lvm.static lvs --ignorelockingfailure --noheadings -o vg_name /dev/md1

(1) Why is this running lvm.static instead of lvm? (lvm.static is deprecated and nothing should be using it any more.)

(2) Why is it ignoring locking failure when run as a normal process on the system? (Perhaps because in other cases it has to be run in environments that need that argument?)

(3) Why is it using 'lvs' to output a VG property? ('vgs' would suffice.)

(4) Why is it asking LVM to say whether or not there is a VG called 'md1'?
As /sbin/mkinitrd is a shell script, I added a set -x at the top and called

/sbin/mkinitrd --allow-missing -f /boot/initrd-testing.img 2.6.18-128.1.10.el5PAE

It's now hanging at

lvm.static lvs --ignorelockingfailure --noheadings -o vg_name /dev/Volume00/root

and making no progress. Please see the attached output from set -x, which shows every command that is called.
Created attachment 343450
Trace produced using set -x
I could simply interrupt lvm.static with Ctrl-C. To show you that it hangs at

open("/dev/cdrom", O_RDONLY|O_DIRECT|O_LARGEFILE|O_NOATIME

I called

strace lvm.static lvs --ignorelockingfailure --noheadings -o vg_name /dev/Volume00/root

Please see the attached output. Note that /dev/Volume00/root is an ext3 logical volume and is mounted as /.
Created attachment 343451
strace output for lvm call
There were some changes to mkinitrd to cache dm devices (and not scan everything, bug #516047), so a kernel update should be quick even with many DM/LVM devices.

As for the hang on opening /dev/cdrom: this is possibly a different problem. I think it must block other commands as well, such as "blkid -p /dev/cdrom" or "blockdev --getsz /dev/cdrom"? (In that case it may be a kernel problem.)

Closing this for now. If you still see the problem with a recent update, please reopen it with new logs. Thanks.
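One rough way to check whether a plain open() of the device blocks, independently of LVM, is to attempt it in a child process with a timeout. The sketch below is an illustration only: the helper name open_blocks, the 5-second default and the default of plain O_RDONLY are assumptions, not anything taken from mkinitrd or lvm. The strace earlier in this report shows lvm.static opening /dev/cdrom with O_RDONLY|O_DIRECT|O_LARGEFILE|O_NOATIME; similar flags can be passed in through the flags parameter.

```python
import os
import subprocess
import sys

# Tiny child program: open the path with the given flags, then close it.
_CHILD = "import os, sys; os.close(os.open(sys.argv[1], int(sys.argv[2])))"

def open_blocks(path, timeout=5.0, flags=os.O_RDONLY):
    """Return True if open(path, flags) has not finished after `timeout` seconds.

    Hypothetical helper. An open() that fails quickly with an error
    (for example, no medium present) counts as "did not block".
    """
    try:
        subprocess.run(
            [sys.executable, "-c", _CHILD, path, str(flags)],
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child on timeout. If the child is stuck
        # in uninterruptible sleep (state D), that kill may itself not return
        # promptly.
        return True
    return False

if __name__ == "__main__":
    print(open_blocks("/dev/cdrom"))
```

To get closer to what lvm.static does, one could call open_blocks("/dev/cdrom", flags=os.O_RDONLY | os.O_DIRECT | os.O_NOATIME) as root. If this also reports a blocked open with no disc in the drive, that would point away from LVM and towards the driver or kernel.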