Bug 998065
| Summary: | libguestfs kernel hang in RHEL 6.5 | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Colin Walters <walters> | |
| Component: | libguestfs | Assignee: | Richard W.M. Jones <rjones> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs> | |
| Severity: | unspecified | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 6.5 | CC: | bfan, leiwang, ptoscano, walters, wshi | |
| Target Milestone: | rc | |||
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 998108 | Environment: | ||
| Last Closed: | 2014-05-20 11:05:16 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
With the -412 kernel, nested (i.e. using TCG), I'm getting a slightly different problem. Lots of:

[ 24.678207] Clocksource tsc unstable (delta = 75815452 ns). Enable clocksource failover by adding clocksource_failover kernel parameter.
[ 26.311900] Clocksource tsc unstable (delta = 78919174 ns). Enable clocksource failover by adding clocksource_failover kernel parameter.
[ 32.618435] Clocksource tsc unstable (delta = 197138175 ns). Enable clocksource failover by adding clocksource_failover kernel parameter.
[ 34.296502] Clocksource tsc unstable (delta = 167980713 ns). Enable clocksource failover by adding clocksource_failover kernel parameter.
[ 63.056573] Clocksource tsc unstable (delta = 71039379 ns). Enable clocksource failover by adding clocksource_failover kernel parameter.
[ 69.328987] Clocksource tsc unstable (delta = 226577207 ns). Enable clocksource failover by adding clocksource_failover kernel parameter.
[ 72.595910] Clocksource tsc unstable (delta = 267708526 ns). Enable clocksource failover by adding clocksource_failover kernel parameter.
[ 78.234690] Clocksource tsc unstable (delta = 85922093 ns). Enable clocksource failover by adding clocksource_failover kernel parameter.
[ 81.805477] Clocksource tsc unstable (delta = 71247525 ns). Enable clocksource failover by adding clocksource_failover kernel parameter.
[ 82.420955] Clocksource tsc unstable (delta = 115523027 ns). Enable clocksource failover by adding clocksource_failover kernel parameter.

and not much progress being made.

Upstream we switched over to using kvmclock (038ed0a08e & c53b459fdd), which we should probably do in RHEL too since it would avoid most of this trouble.

I will test on baremetal next.

It works OK for me on baremetal (with the -412 kernel).

Is the error reproducible every time, or only occasionally?

Are you using this on baremetal or nested (e.g. in a cloud VM)?

(In reply to Richard W.M. Jones from comment #2)
> It works OK for me on baremetal (with the -412 kernel).

Hmm. So you're booting -412 on -412? I'm sadly stuck on 2.6.32-381.el6.x86_64 due to https://bugzilla.redhat.com/show_bug.cgi?id=987060
Although I haven't tested -412 yet as a host. Give me a bit to context switch and try it.

> Is the error reproducible every time, or only occasionally?

Hangs every time.

> Are you using this on baremetal or nested (e.g. in a cloud VM)?

Baremetal; Lenovo T420s laptop.

Updates from IRC conversations and others:

- It would be interesting to know whether the kernel eventually prints out anything, or whether nothing is printed before the libguestfs-test-tool timeout (10 mins).
- I have tried the -412 kernel on 3 systems (2 baremetal, 1 virtualized) and I can't reproduce it. Note the bug was reported on -410, so this is not necessarily indicative. It would be interesting to know whether the -412 or -413 kernel also shows the bug.
- Colin tried adding -cpu host,+kvmclock to the qemu command line, but that didn't make any difference. The bug still happened with kvmclock enabled (hence I'm removing the blocked bugs).

This request was not resolved in time for the current release. Red Hat invites you to ask your support representative to propose this request, if still desired, for consideration in the next release of Red Hat Enterprise Linux.

Hello Colin Walters, do you still hit this kernel hang with the latest RHEL 6 kernel?

I don't see this, and we'd have heard about it if it was happening in the released RHEL 6.5 kernel. My guess is it was a temporary blip in an unreleased kernel.
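When comparing kernels (-410, -412, -413) it can help to tally the "Clocksource tsc unstable" warnings from a saved appliance console log rather than eyeballing them. The following is a minimal sketch, not part of libguestfs: the function name `summarize_tsc_warnings` is hypothetical, and the regular expression assumes the message format exactly as quoted in the comments above.

```python
import re

# Assumed message format, taken from the console output quoted in this bug:
# [ 24.678207] Clocksource tsc unstable (delta = 75815452 ns). ...
TSC_WARNING = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+Clocksource tsc unstable \(delta = (-?\d+) ns\)"
)


def summarize_tsc_warnings(log_text):
    """Return (count, largest delta in ns, last timestamp in s), or None.

    Hypothetical helper: scans a console log for tsc instability warnings.
    """
    matches = TSC_WARNING.findall(log_text)
    if not matches:
        return None
    timestamps = [float(ts) for ts, _ in matches]
    deltas = [int(delta) for _, delta in matches]
    return len(matches), max(deltas), max(timestamps)


# Two lines copied from the report, used here as sample input.
sample = (
    "[ 24.678207] Clocksource tsc unstable (delta = 75815452 ns). "
    "Enable clocksource failover by adding clocksource_failover kernel parameter.\n"
    "[ 26.311900] Clocksource tsc unstable (delta = 78919174 ns). "
    "Enable clocksource failover by adding clocksource_failover kernel parameter.\n"
)

if __name__ == "__main__":
    print(summarize_tsc_warnings(sample))
```

A log with no such warnings returns None, which separates "hung silently" from "making slow progress with an unstable clocksource".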
This is almost certainly a regression in kernel 2.6.32-410.el6.x86_64, but filing here.

[root@pluto libvirt]# LIBGUESTFS_DEBUG=1 guestfish -a /var/lib/libvirt/gnome-ostree-local.img --ro -m /dev/sda3 -m /dev/sda1:/boot
libguestfs: create: flags = 0, handle = 0x22f1540
libguestfs: launch: attach-method=appliance
libguestfs: launch: tmpdir=/tmp/libguestfsQ8lPnC
libguestfs: launch: umask=0022
libguestfs: launch: euid=0
libguestfs: command: run: febootstrap-supermin-helper
libguestfs: command: run: \ --verbose
libguestfs: command: run: \ -f checksum
libguestfs: command: run: \ /usr/lib64/guestfs/supermin.d
libguestfs: command: run: \ x86_64
supermin helper [00000ms] whitelist = (not specified), host_cpu = x86_64, kernel = (null), initrd = (null), appliance = (null)
supermin helper [00000ms] inputs[0] = /usr/lib64/guestfs/supermin.d
checking modpath /lib/modules/2.6.32-381.el6.x86_64 is a directory
picked vmlinuz-2.6.32-381.el6.x86_64 because modpath /lib/modules/2.6.32-381.el6.x86_64 exists
checking modpath /lib/modules/2.6.32-400.el6.x86_64 is a directory
picked vmlinuz-2.6.32-400.el6.x86_64 because modpath /lib/modules/2.6.32-400.el6.x86_64 exists
checking modpath /lib/modules/2.6.32-410.el6.x86_64 is a directory
picked vmlinuz-2.6.32-410.el6.x86_64 because modpath /lib/modules/2.6.32-410.el6.x86_64 exists
supermin helper [00001ms] finished creating kernel
supermin helper [00002ms] visiting /usr/lib64/guestfs/supermin.d
supermin helper [00002ms] visiting /usr/lib64/guestfs/supermin.d/base.img
supermin helper [00002ms] visiting /usr/lib64/guestfs/supermin.d/daemon.img
supermin helper [00002ms] visiting /usr/lib64/guestfs/supermin.d/hostfiles
supermin helper [00022ms] visiting /usr/lib64/guestfs/supermin.d/init.img
supermin helper [00022ms] visiting /usr/lib64/guestfs/supermin.d/udev-rules.img
supermin helper [00022ms] adding kernel modules
supermin helper [00057ms] finished creating appliance
libguestfs: checksum of existing appliance: 99c7bcc2b5b4a1d498810ddcb2dc02b23cc4fcab261201e01e5df4f12ba112e7
libguestfs: [00061ms] begin testing qemu features
libguestfs: command: run: /usr/libexec/qemu-kvm
libguestfs: command: run: \ -nographic
libguestfs: command: run: \ -help
libguestfs: command: run: /usr/libexec/qemu-kvm
libguestfs: command: run: \ -nographic
libguestfs: command: run: \ -version
libguestfs: qemu version 0.12
libguestfs: command: run: /usr/libexec/qemu-kvm
libguestfs: command: run: \ -nographic
libguestfs: command: run: \ -machine accel=kvm:tcg
libguestfs: command: run: \ -device ?
libguestfs: [00183ms] finished testing qemu features
libguestfs: accept_from_daemon: 0x22f1540 g->state = 1
[00184ms] /usr/libexec/qemu-kvm \
-global virtio-blk-pci.scsi=off \
-nodefconfig \
-nodefaults \
-nographic \
-device virtio-scsi-pci,id=scsi \
-drive file=/var/lib/libvirt/gnome-ostree-local.img,snapshot=on,id=hd0,if=none \
-device scsi-hd,drive=hd0 \
-drive file=/var/tmp/.guestfs-0/root.14884,snapshot=on,id=appliance,if=none,cache=unsafe \
-device scsi-hd,drive=appliance \
-machine accel=kvm:tcg \
-m 500 \
-no-reboot \
-device virtio-serial \
-serial stdio \
-device sga \
-chardev socket,path=/tmp/libguestfsQ8lPnC/guestfsd.sock,id=channel0 \
-device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
-kernel /var/tmp/.guestfs-0/kernel.14884 \
-initrd /var/tmp/.guestfs-0/initrd.14884 \
-append 'panic=1 console=ttyS0 udevtimeout=600 no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm'\x1b[1;256r\x1b[256;256H\x1b[6n
Google, Inc.
Serial Graphics Adapter 07/26/11
SGABIOS $Id: sgabios.S 8 2010-04-22 00:03:40Z nlaredo $ (mockbuild.redhat.com) Tue Jul 26 15:05:08 UTC 2011
Term: 80x24
4 0
SeaBIOS (version seabios-0.6.1.2-28.el6)
Probing EDD (edd=off to disable)... ok
\x1b[2JInitializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.32-410.el6.x86_64 (mockbuild.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Wed Aug 7 12:07:46 EDT 2013
Command line: panic=1 console=ttyS0 udevtimeout=600 no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm
KERNEL supported cpus:
Intel GenuineIntel
AMD AuthenticAMD
Centaur CentaurHauls
Disabled fast string operations
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009d800 (usable)
BIOS-e820: 000000000009d800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000001f3fd000 (usable)
BIOS-e820: 000000001f3fd000 - 000000001f400000 (reserved)
BIOS-e820: 00000000fffbc000 - 0000000100000000 (reserved)
DMI 2.4 present.
SMBIOS version 2.4 @ 0xFDA40
Hypervisor detected: KVM
last_pfn = 0x1f3fd max_arch_pfn = 0x400000000
PAT not supported by CPU.
init_memory_mapping: 0000000000000000-000000001f3fd000
RAMDISK: 1f1a8000 - 1f3efc00
No NUMA configuration found
Faking a node at 0000000000000000-000000001f3fd000
Bootmem setup node 0 0000000000000000-000000001f3fd000
NODE_DATA [0000000000009000 - 000000000003cfff]
bootmap [000000000003d000 - 0000000000040e7f] pages 4
(7 early reservations) ==> bootmem [0000000000 - 001f3fd000]
#0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
#1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
#2 [0001000000 - 000201fae4] TEXT DATA BSS ==> [0001000000 - 000201fae4]
#3 [001f1a8000 - 001f3efc00] RAMDISK ==> [001f1a8000 - 001f3efc00]
#4 [000009d800 - 0000100000] BIOS reserved ==> [000009d800 - 0000100000]
#5 [0002020000 - 0002020059] BRK ==> [0002020000 - 0002020059]
#6 [0000008000 - 0000009000] PGTABLE ==> [0000008000 - 0000009000]
found SMP MP-table at [ffff8800000fda60] fda60
kvm-clock: Using msrs 4b564d01 and 4b564d00
kvm-clock: cpu 0, msr 0:1c277c1, boot clock
Zone PFN ranges:
DMA 0x00000001 -> 0x00001000
DMA32 0x00001000 -> 0x00100000
Normal 0x00100000 -> 0x00100000
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
0: 0x00000001 -> 0x0000009d
0: 0x00000100 -> 0x0001f3fd
SFI: Simple Firmware Interface v0.7 http://simplefirmware.org
Intel MultiProcessor Specification v1.4
MPTABLE: OEM ID: BOCHSCPU
MPTABLE: Product ID: 0.1
MPTABLE: APIC at: 0xFEE00000
Processor #0 (Bootup-CPU)
I/O APIC #0 Version 17 at 0xFEC00000.
Processors: 1
SMP: Allowing 1 CPUs, 0 hotplug CPUs
PM: Registered nosave memory: 000000000009d000 - 000000000009e000
PM: Registered nosave memory: 000000000009e000 - 00000000000a0000
PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
PM: Registered nosave memory: 00000000000f0000 - 0000000000100000
Allocating PCI resources starting at 1f400000 (gap: 1f400000:e0bbc000)
Booting paravirtualized kernel on KVM
NR_CPUS:4096 nr_cpumask_bits:1 nr_cpu_ids:1 nr_node_ids:1
PERCPU: Embedded 31 pages/cpu @ffff880002200000 s94872 r8192 d23912 u2097152
pcpu-alloc: s94872 r8192 d23912 u2097152 alloc=1*2097152
pcpu-alloc: [0] 0
kvm-clock: cpu 0, msr 0:22167c1, primary cpu clock
kvm-stealtime: cpu 0, msr 220e880
Built 1 zonelists in Node order, mobility grouping on. Total pages: 126045
Policy zone: DMA32
Kernel command line: panic=1 console=ttyS0 udevtimeout=600 no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm
[ 0.000000] Disabling memory control group subsystem
[ 0.000000] PID hash table entries: 2048 (order: 2, 16384 bytes)
[ 0.000000] Checking aperture...
[ 0.000000] No AGP bridge found
[ 0.000000] Memory: 483936k/511988k available (5295k kernel code, 400k absent, 27652k reserved, 7053k data, 1268k init)
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000] NR_IRQS:33024 nr_irqs:256
[ 0.000000] Console: colour dummy device 80x25
[ 0.000000] console [ttyS0] enabled
[ 0.000000] Detected 2591.580 MHz processor.
[ 0.001999] Calibrating delay loop (skipped) preset value.. 5183.16 BogoMIPS (lpj=2591580)
[ 0.001999] pid_max: default: 32768 minimum: 301
[ 0.002130] Security Framework initialized
[ 0.002422] SELinux: Disabled at boot.
[ 0.003066] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
[ 0.003648] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
[ 0.004058] Mount-cache hash table entries: 256
[ 0.004728] Initializing cgroup subsys ns
[ 0.005007] Initializing cgroup subsys cpuacct
[ 0.005341] Initializing cgroup subsys memory
[ 0.005648] Initializing cgroup subsys devices
[ 0.006005] Initializing cgroup subsys freezer
[ 0.006294] Initializing cgroup subsys net_cls
[ 0.006616] Initializing cgroup subsys blkio
[ 0.007010] Initializing cgroup subsys perf_event
[ 0.007329] Initializing cgroup subsys net_prio
[ 0.008033] Disabled fast string operations
[ 0.008568] mce: CPU supports 10 MCE banks
[ 0.008902] alternatives: switching to unfair spinlock
[ 0.011475] SMP alternatives: switching to UP code
[ 0.022889] Freeing SMP alternatives: 36k freed
[ 0.023020] ftrace: converting mcount calls to 0f 1f 44 00 00
[ 0.023382] ftrace: allocating 21719 entries in 86 pages
[ 0.027066] APIC routing finalized to flat.
[ 0.027591] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.027998] CPU0: Intel QEMU Virtual CPU version (cpu64-rhel6) stepping 03