Created attachment 396509
snapshot

Description of problem:
Boot up a VM and suspend it to disk. When resuming from s4, the guest becomes unresponsive. The same problem exists on rhel3.9, win7, and rhel5.4 guests.

Version-Release number of selected component (if applicable):
(host)# rpm -qa |grep kvm
etherboot-zroms-kvm-5.4.4-13.el5
kvm-83-160.el5
kvm-qemu-img-83-160.el5
kmod-kvm-83-160.el5
kvm-tools-83-160.el5
kvm-debuginfo-83-160.el5

How reproducible:
Can reproduce 100%

Steps to Reproduce:
1. Boot up a VM
2. Suspend guest to disk
3. Resume guest

Actual results:
Guest could not resume from s4.

Expected results:
Guest resumes from s4 successfully.

Additional info:
Commandline of VM:
# qemu-kvm -drive file=/tmp/kvm_autotest_root/images/RHEL-Server-5.4-64.qcow2,if=ide,boot=on -net nic,vlan=0,model=rtl8139,macaddr=00:19:B7:5E:9A:00 -net tap,vlan=0,ifname=rtl8139_0_6001,script=/etc/qemu-ifup-switch -m 2048 -vnc :0

(guest) # uname -a
Linux localhost.localdomain 2.6.18-164.11.1.el5 #1 SMP Wed Jan 6 13:26:04 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

(host)# uname -a
Linux localhost.localdomain.englab.nay.redhat.com 2.6.18-189.el5 #1 SMP Tue Feb 16 11:10:22 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
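For a Linux guest, step 2 of the steps to reproduce can be sketched roughly as below. This is only an illustration, not the exact procedure used by autotest: it assumes the guest kernel exposes the standard /sys/power/state interface, and the function name request_s4 and its optional path argument are hypothetical.

```shell
# Hypothetical helper: ask a Linux guest to suspend to disk (s4).
# Assumption: the guest kernel exposes /sys/power/state and lists "disk" in it.
# The optional first argument overrides the path, which allows a dry run
# against an ordinary file instead of the real sysfs entry.
request_s4() {
    state_file=${1:-/sys/power/state}

    # Refuse to continue if the kernel does not advertise hibernation support.
    if ! grep -qw disk "$state_file" 2>/dev/null; then
        echo "suspend to disk not supported according to $state_file" >&2
        return 1
    fi

    # Writing "disk" starts hibernation; on a real guest this write only
    # returns after the guest has been resumed.
    echo disk > "$state_file"
}
```

Run as root inside the guest, calling request_s4 with no argument writes to the real sysfs file. Once the guest has powered off, starting qemu-kvm again with the same command line should trigger the resume from the saved image.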
Is this from an autotest report? Has this been verified manually? win7 and rhel5.4 both work for me. I am not even sure rhel3.9 supports hibernate to disk. Retest _manually_.
(In reply to comment #1)
> Is this from autotest report? Have this been verified manually? win7 and
> rhel5.4
> both work for me. rhel3.9 I am not even sure supports hibernate to disk. Retest
> _manually_.

Hello gleb,

it was found by autotest, but I verified it manually before reporting this bug.

I've re-tested manually with a rhel54 guest, and the bug can be reproduced.

host kernel: 2.6.18-189.el5
guest kernel: 2.6.18-164.2.1.el5

Command line:
qemu-kvm -name 'vm1' -monitor tcp:0:6001,server,nowait -drive file=./RHEL-Server-5.4-64.qcow2,if=ide,cache=none,boot=on -net nic,vlan=0,model=e1000,macaddr=00:FF:B9:FE:01:77 -net tap,vlan=0,ifname=e1000_0_6001,script=/etc/qemu-ifup-switch,downscript=no -m 512 -smp 1 -soundhw ac97 -usbdevice tablet -rtc-td-hack -no-hpet -cpu qemu64,+sse2 -no-kvm-pit-reinjection -redir tcp:5000::22 -vnc :0 -serial unix:/tmp/serial-20100514-124115-Tay4,server,nowait

Output of serial:
# nc -U /tmp/serial-20100514-124115-Tay4
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
type=1404 audit(1273813267.281:2): enforcing=1 old_enforcing=0 auid=4294967295 ses=4294967295
type=1403 audit(1273813267.555:3): policy loaded auid=4294967295 ses=4294967295
hdc: drive_cmd: status=0x41 { DriveReady Error }
hdc: drive_cmd: error=0x04 { AbortedCommand }
ide: failed opcode was: 0xec
mtrr: type mismatch for c2000000,100000 old: uncachable new: write-combining
mtrr: type mismatch for c2000000,400000 old: uncachable new: write-combining
Disabling non-boot CPUs ...
Stopping tasks: ===========================================================================================================|
Shrinking memory... done (54236 pages freed)
pci_set_power_state(): 0000:00:05.0: state=3, current state=5
swsusp: Need to copy 60889 pages
swsusp: critical section/: done (60889 pages copied)
PCI: Enabling device 0000:00:01.2 (0000 -> 0001)
PCI: Enabling device 0000:00:04.0 (0000 -> 0001)
pnp: Failed to activate device 00:02.
pnp: Failed to activate device 00:03.
pnp: Failed to activate device 00:05.
pnp: Failed to activate device 00:06.
Saving image data pages (61008 pages) ... done
Wrote 244032 kbytes in 7.25 seconds (33.65 MB/s)
S|
Shutdown: hda
Power down.
acpi_power_off called
---------------------------------------------------------------------------
[root@intel-i7-12-3 ~]# nc -U /tmp/serial-20100514-124115-Tay4
Linux version 2.6.18-164.2.1.el5 (mockbuild.bos.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Mon Sep 21 04:37:42 EDT 2009
Command line: ro root=/dev/VolGroup00/LogVol00 rhgb console=ttyS0,115200 console=tty0
BIOS-provided physical RAM map:
BIOS-e820: 0000000000010000 - 000000000009f000 (usable)
BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e8000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000001fff0000 (usable)
BIOS-e820: 000000001fff0000 - 0000000020000000 (ACPI data)
BIOS-e820: 00000000c0000000 - 00000000c1000000 (reserved)
BIOS-e820: 00000000fffbc000 - 0000000100000000 (reserved)
DMI 2.4 present.
kvm-clock: cpu 0, msr 7eff:80433401, boot clock
No NUMA configuration found
Faking a node at 0000000000000000-000000001fff0000
Bootmem setup node 0 0000000000000000-000000001fff0000
Memory for crash kernel (0x0 to 0x0) notwithin permissible range
disabling kdump
ACPI: PM-Timer IO Port: 0xb008
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:6 APIC version 20
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] disabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] disabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] disabled)
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x05] disabled)
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] disabled)
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] disabled)
ACPI: LAPIC (acpi_id[0x08] lapic_id[0x08] disabled)
ACPI: LAPIC (acpi_id[0x09] lapic_id[0x09] disabled)
ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x0a] disabled)
ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x0b] disabled)
ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x0c] disabled)
ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x0d] disabled)
ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x0e] disabled)
ACPI: LAPIC (acpi_id[0x0f] lapic_id[0x0f] disabled)
ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 1, version 17, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
Setting APIC routing to physical flat
Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009f000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e8000
Nosave address range: 00000000000e8000 - 0000000000100000
Allocating PCI resources starting at 30000000 (gap: 20000000:a0000000)
SMP: Allowing 16 CPUs, 15 hotplug CPUs
kvm-clock: cpu 0, msr 0:1035401, primary cpu clock
Built 1 zonelists. Total pages: 127865
Kernel command line: ro root=/dev/VolGroup00/LogVol00 rhgb console=ttyS0,115200 console=tty0
Initializing CPU#0
PID hash table entries: 2048 (order: 11, 16384 bytes)
kvm_get_tsc_khz: cpu 0, msr 0:10be001
time.c: Using tsc for timekeeping HZ 1000
Console: colour VGA+ 80x25
Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
Checking aperture...
ACPI: DMAR not present
Memory: 506560k/524224k available (2548k kernel code, 17212k reserved, 1292k data, 208k init)
Calibrating delay loop (skipped), value calculated using timer frequency.. 5320.13 BogoMIPS (lpj=2660068)
Security Framework v1.0.0 initialized
SELinux: Initializing.
selinux_register_security: Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 256 , L1 D cache: 32K
SMP alternatives: switching to UP code
ACPI: Core revision 20060707
Using local APIC timer interrupts.
WARNING calibrate_APIC_clock: the APIC timer calibration may be wrong.
Detected 62.503 MHz APIC timer.
Brought up 1 CPUs
testing NMI watchdog ... <4>WARNING: CPU#0: NMI appears to be stuck (0->0)!
time.c: Using 1.193182 MHz WALL KVM GTOD KVM timer.
time.c: Detected 2660.068 MHz processor.
checking if image is initramfs... it is
Freeing initrd memory: 3279k freed
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: No dock devices found.
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI quirk: region b000-b03f claimed by PIIX4 ACPI
PCI quirk: region b100-b10f claimed by PIIX4 SMB
ACPI: PCI Interrupt Link [LNKA] (IRQs 5 *10 11)
ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
ACPI: PCI Interrupt Link [LNKD] (IRQs 5 10 *11)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 7 devices
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
NetLabel: Initializing
NetLabel: domain hash size = 128
NetLabel: protocols = UNLABELED CIPSOv4
NetLabel: unlabeled traffic allowed by default
ACPI: DMAR not present
PCI-GART: No AMD northbridge found.
NET: Registered protocol family 2
IP route cache hash table entries: 4096 (order: 3, 32768 bytes)
TCP established hash table entries: 16384 (order: 6, 262144 bytes)
TCP bind hash table entries: 8192 (order: 5, 131072 bytes)
TCP: Hash tables configured (established 16384 bind 8192)
TCP reno registered
audit: initializing netlink socket (disabled)
type=2000 audit(1273813521.619:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
Initializing Cryptographic API
alg: No test for crc32c (crc32c-generic)
ksign: Installing public key data
Loading keyring
- Added public key 44A9ABA9643110BD
- User ID: Red Hat, Inc. (Kernel Module GPG key)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
Limiting direct PCI/PCI transfers.
PCI: PIIX3: Enabling Passive Release on 0000:00:01.0
Activating ISA DMA hang workarounds.
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
Real Time Clock Driver v1.12ac
Non-volatile memory driver v1.2
Linux agpgart interface v0.101 (c) Dave Jones
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:06: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
brd: module loaded
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX3: IDE controller at PCI slot 0000:00:01.1
PIIX3: chipset revision 0
PIIX3: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xc000-0xc007, BIOS settings: hda:pio, hdb:pio
ide1: BM-DMA at 0xc008-0xc00f, BIOS settings: hdc:pio, hdd:pio
hda: QEMU HARDDISK, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdc: QEMU DVD-ROM, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 512KiB
hda: 41943040 sectors (21474 MB) w/256KiB Cache, CHS=16383/255/63, (U)DMA
hda: cache flushes supported
hda: hda1 hda2
ide-floppy driver 0.99.newide
usbcore: registered new driver hiddev
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
TCP bic registered
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
ACPI: (supports S3 S4 S5)
Initalizing network drop monitor service
Freeing unused kernel memory: 208k freed
Write protecting the kernel read-only data: 497k
input: AT Translated Set 2 keyboard as /class/input/input0
input: ImExPS/2 Generic Explorer Mouse as /class/input/input1
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11
ACPI: PCI Interrupt 0000:00:01.2[D] -> Link [LNKD] -> GSI 11 (level, high) -> IRQ 11
uhci_hcd 0000:00:01.2: UHCI Host Controller
uhci_hcd 0000:00:01.2: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:01.2: irq 11, io base 0x0000c020
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
SCSI subsystem initialized
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.11.5-ioctl (2007-12-12) initialised: dm-devel
device-mapper: dm-raid45: initialized v0.2594l
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
ACPI: PCI Interrupt 0000:00:05.0[A] -> Link [LNKA] -> GSI 10 (level, high) -> IRQ 10
usb 1-2: new full speed USB device using uhci_hcd and address 2
usb 1-2: configuration #1 chosen from 1 choice
input: QEMU 0.9.1 QEMU USB Tablet as /class/input/input2
input: USB HID v0.01 Pointer [QEMU 0.9.1 QEMU USB Tablet] on usb-0000:00:01.2-2
Attempting manual resume
Disabling non-boot CPUs ...
Stopping tasks: ======|
Shrinking memory... done (0 pages freed)
Loading image data pages (61008 pages) ... done
Read 244032 kbytes in 5.39 seconds (45.27 MB/s)
pci_set_power_state(): 0000:00:05.0: state=3, current state=5
ACPI: PCI interrupt for device 0000:00:01.2 disabled
BUG: soft lockup - CPU#0 stuck for 600s! [bash:2678]
CPU 0:
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac ipv6 xfrm_nalgo crypto_api lp floppy joydev snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 e1000 i2c_core ide_cd parport_pc cdrom parport pcspkr serio_raw virtio_net virtio_blk virtio_pci virtio_ring virtio dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 2678, comm: bash Not tainted 2.6.18-164.2.1.el5 #1
RIP: 0010:[<ffffffff80012322>] [<ffffffff80012322>] __do_softirq+0x51/0x133
RSP: 0018:ffffffff8043df60 EFLAGS: 00000206
RAX: 0000000000000002 RBX: 0000000000000002 RCX: ffffffff8005e2fc
RDX: ffff810016bd9fd8 RSI: 0000000000000080 RDI: ffff8100045427e0
RBP: ffffffff8043dee0 R08: 0000000000000001 R09: 000000000000003f
R10: ffff81001fc10008 R11: 0000000000000050 R12: ffffffff8005dc8e
R13: 0000000000000046 R14: ffffffff80077874 R15: ffffffff8043dee0
FS: 00002ae1796c7dd0(0000) GS:ffffffff803c1000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000002c70036 CR3: 0000000016481000 CR4: 00000000000006e0

Call Trace:
<IRQ> [<ffffffff8005e2fc>] call_softirq+0x1c/0x28
[<ffffffff8006cb20>] do_softirq+0x2c/0x85
[<ffffffff8005dc8e>] apic_timer_interrupt+0x66/0x6c
<EOI> [<ffffffff800a86ba>] swsusp_suspend+0x4f/0x51
[<ffffffff800a86b7>] swsusp_suspend+0x4c/0x51
[<ffffffff800a8afd>] pm_suspend_disk+0x42/0xce
[<ffffffff800a79e7>] enter_state+0x5e/0x19b
[<ffffffff800a7b93>] state_store+0x5e/0x79
[<ffffffff8010ac88>] sysfs_write_file+0xb9/0xe8
[<ffffffff80016927>] vfs_write+0xce/0x174
[<ffffffff800171df>] sys_write+0x45/0x6e
[<ffffffff8005d28d>] tracesys+0xd5/0xe0
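When retesting, a saved copy of the serial output can be checked mechanically for the failure signature shown in the log above. The following is a minimal sketch, assuming the serial output has been saved to a file; the function name check_resume_log and the three result strings are hypothetical and only illustrate the idea.

```shell
# Hypothetical helper: classify a saved serial log from a resume attempt.
# Prints one of:
#   no-resume-attempt  the log never reached "Loading image data pages"
#   soft-lockup        the image was loaded and a soft lockup was reported
#   no-lockup-seen     the image was loaded and no soft lockup was reported
check_resume_log() {
    log_file=$1

    if ! grep -q 'Loading image data pages' "$log_file"; then
        echo "no-resume-attempt"
    elif grep -q 'BUG: soft lockup' "$log_file"; then
        echo "soft-lockup"
    else
        echo "no-lockup-seen"
    fi
}
```

For the log in this comment the function would print soft-lockup. A result of no-lockup-seen only means the message was absent from the captured output; it does not prove that the guest resumed correctly.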
Update your guest kernel. Old 2.6.18 kernels were known to have bugs in the PM area. If suspend/resume does not work for you with some kernel version, check exactly that kernel version on real HW first.
Hello gleb,

I installed rhel54 (kernel: 2.6.18-164.2.1.el5) on my real HW, and s4 works on it.

----

I also tested with the latest guest kernels (both rhel55 and rhel54) 10 times.
host kernel: 2.6.18-189.el5
# rpm -qa |grep kvm
etherboot-zroms-kvm-5.4.4-12.el5
etherboot-zroms-kvm-5.4.4-10.el5
kvm-qemu-img-83-160.el5
etherboot-zroms-kvm-5.4.4-13.el5
kvm-debuginfo-83-160.el5
kmod-kvm-83-160.el5
kvm-tools-83-160.el5
kvm-83-160.el5

PASS <- rhel55 guest kernel: 2.6.18-196.el5
PASS <- rhel54 guest kernel: 2.6.18-164.18.1.el5
(In reply to comment #4)
> Hello gleb,
>
> I installed rhel54(kernel: 2.6.18-164.2.1.el5) on my real HW, s4 works on it.

The outcome of s4 may depend on the specific HW. The HW of a real host is very different from virtual HW.

>
> ----
>
> I also tested with latest guest kernel(both rhel55 and rhel54) for 10 times.
> host kernel: 2.6.18-189.el5
> # rpm -qa |grep kvm
> etherboot-zroms-kvm-5.4.4-12.el5
> etherboot-zroms-kvm-5.4.4-10.el5
> kvm-qemu-img-83-160.el5
> etherboot-zroms-kvm-5.4.4-13.el5
> kvm-debuginfo-83-160.el5
> kmod-kvm-83-160.el5
> kvm-tools-83-160.el5
> kvm-83-160.el5
>
>
> PASS <- rhel55 guest kernel: 2.6.18-196.el5
> PASS <- rhel54 guest kernel: 2.6.18-164.18.1.el5

I am closing the bug then.