Bug 588984 - RHEL5.5-i386 guest kernel panic on resuming from s4
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.6
Hardware: All Linux
Priority: low Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Gleb Natapov
QA Contact: Virtualization Bugs
Depends On:
Blocks: Rhel5KvmTier1
Reported: 2010-05-04 22:50 EDT by Amos Kong
Modified: 2015-05-24 20:05 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-05-18 01:12:44 EDT
Attachments: None
Description Amos Kong 2010-05-04 22:50:39 EDT
Description of problem:
After a RHEL-5.5-i386 guest is suspended to disk (S4), the guest kernel
panics when it is resumed.

Version-Release number of selected component (if applicable):
host kernel: 2.6.18-194.el5
guest kernel: 2.6.18-189.el5PAE

# rpm -qa |grep kvm
kmod-kvm-83-164.el5_5.8
kvm-83-164.el5_5.8
kvm-qemu-img-83-164.el5_5.8
etherboot-zroms-kvm-5.4.4-13.el5
kvm-tools-83-164.el5_5.8
kvm-debuginfo-83-164.el5_5.8

How reproducible:
always

Steps to Reproduce:
1. Boot a RHEL-5.5-i386 guest.
2. Check whether the guest OS supports S4:
# grep disk /sys/power/state
3. Suspend to disk:
# echo 'disk' > /sys/power/state
4. Resume the suspended guest.
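
Steps 2 and 3 can be combined so that the suspend is attempted only when the kernel advertises S4. The following is a minimal sketch, not part of the original test setup: the function names and the overridable STATE_FILE variable are assumptions added for illustration (the report writes to /sys/power/state directly).

```shell
#!/bin/sh
# STATE_FILE is an illustrative override so the check can be exercised
# against a sample file; by default it is the real sysfs node.
STATE_FILE="${STATE_FILE:-/sys/power/state}"

# Succeeds if "disk" (S4) is listed among the supported sleep states.
supports_s4() {
    grep -qw disk "$1"
}

# Writes "disk" to the state file only when S4 is advertised.
suspend_to_disk() {
    if supports_s4 "$STATE_FILE"; then
        echo disk > "$STATE_FILE"
    else
        echo "S4 (disk) not listed in $STATE_FILE" >&2
        return 1
    fi
}
```

Inside the guest, as root, calling suspend_to_disk would then perform steps 2 and 3 in one go.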

Actual results:
The guest kernel panics when resuming from S4.

Expected results:
The guest resumes from S4 successfully.

Additional info:

1. commandline:
/usr/local/staf/test/RHEV/kvm-new/kvm-test/tests/kvm/qemu -name 'vm1' \
  -monitor tcp:0:6001,server,nowait \
  -drive file=/usr/local/staf/test/RHEV/kvm-new/kvm-test/tests/kvm/images/RHEL-Server-5.5-32.raw,if=ide,cache=writethrough,boot=on \
  -net nic,vlan=0,model=e1000,macaddr=00:FF:B9:FE:59:4b \
  -net tap,vlan=0,ifname=e1000_0_6001,script=/usr/local/staf/test/RHEV/kvm-new/kvm-test/tests/kvm/scripts/qemu-ifup-switch,downscript=no \
  -m 2048 -smp 1 -soundhw ac97 -usbdevice tablet -rtc-td-hack -no-hpet \
  -cpu qemu64,+sse2 -no-kvm-pit-reinjection -redir tcp:5000::22 -vnc :0 \
  -serial unix:/tmp/serial-20100429-012412-lZjL,server,nowait

2. serial output:
Memory for crash kernel (0x0 to 0x0) notwithin permissible range

PCI: PIIX3: Enabling Passive Release on 0000:00:01.0

Red Hat nash version 5.1.19.6 starting

  Reading all physical volumes.  This may take a while...

  Found volume group "VolGroup00" using metadata type lvm2

  2 logical volume(s) in volume group "VolGroup00" now active

		Welcome to Red Hat Enterprise Linux Server

		Press 'I' to enter interactive startup.

Setting clock  (utc): Tue May  4 23:40:09 CST 2010 [  OK  ]


Starting udev: [  OK  ]


Loading default keymap (us): [  OK  ]


Setting hostname dhcp-66-82-223.nay.redhat.com:  [  OK  ]


Setting up Logical Volume Management:   2 logical volume(s) in volume group "VolGroup00" now active

[  OK  ]


Checking filesystems

Checking all file systems.

[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/VolGroup00/LogVol00 

/dev/VolGroup00/LogVol00: clean, 107428/2613760 files, 647205/2613248 blocks

[/sbin/fsck.ext3 (1) -- /boot] fsck.ext3 -a /dev/hda1 

/boot: clean, 40/26104 files, 21581/104388 blocks

[  OK  ]


Remounting root filesystem in read-write mode:  [  OK  ]


Mounting local filesystems:  [  OK  ]


Enabling local filesystem quotas:  [  OK  ]


Enabling /etc/fstab swaps:  [  OK  ]


INIT: Entering runlevel: 5


Entering non-interactive startup

Applying Intel CPU microcode update: [  OK  ]


Starting monitoring for VG VolGroup00:   2 logical volume(s) in volume group "VolGroup00" monitored

[  OK  ]


Starting background readahead: [  OK  ]


Checking for hardware changes [  OK  ]


Applying ip6tables firewall rules: [  OK  ]


Applying iptables firewall rules: [  OK  ]


Loading additional iptables modules: ip_conntrack_netbios_ns [  OK  ]


Bringing up loopback interface:  [  OK  ]


Bringing up interface eth0:  

Determining IP information for eth0... done.

[  OK  ]


Starting auditd: [  OK  ]


Starting system logger: [  OK  ]


Starting kernel logger: [  OK  ]


Starting irqbalance: [  OK  ]


Starting portmap: [  OK  ]


Starting NFS statd: [  OK  ]


Starting RPC idmapd: [  OK  ]


Starting system message bus: [  OK  ]


Starting Bluetooth services:[  OK  ]
[  OK  ]


Mounting other filesystems:  [  OK  ]


Starting PC/SC smart card daemon (pcscd): [  OK  ]


Starting acpi daemon: [  OK  ]


Starting HAL daemon: [  OK  ]


Starting hidd: [  OK  ]


Starting autofs:  Loading autofs4: [  OK  ]


Starting automount: [  OK  ]


[  OK  ]


Starting sshd: [  OK  ]


Starting cups: [  OK  ]


Starting xinetd: [  OK  ]


Starting sendmail: [  OK  ]


Starting sm-client: [  OK  ]


Starting console mouse services: [  OK  ]


Starting crond: [  OK  ]


Starting xfs: [  OK  ]


Starting anacron: [  OK  ]


Starting atd: [  OK  ]
[  OK  ]


Starting background readahead: [  OK  ]


Starting yum-updatesd: [  OK  ]


Starting Avahi daemon... [  OK  ]


Starting autotest:  No autotest jobs outstanding

[  OK  ]


Starting smartd: hdc: drive_cmd: status=0x41 { DriveReady Error }

hdc: drive_cmd: error=0x04 { AbortedCommand }

ide: failed opcode was: 0xec

[  OK  ]


mtrr: type mismatch for c2000000,100000 old: uncachable new: write-combining

mtrr: type mismatch for c2000000,400000 old: uncachable new: write-combining



Red Hat Enterprise Linux Server release 5.5 Beta (Tikanga)

Kernel 2.6.18-196.el5 on an i686



dhcp-66-82-223.nay.redhat.com login: Disabling non-boot CPUs ...

Stopping tasks: =======================================================================|

Shrinking memory...  -\|done (34433 pages freed)

pci_set_power_state(): 0000:00:05.0: state=3, current state=5

pci_set_power_state(): 0000:00:03.0: state=3, current state=5

..............................

swsusp: Need to copy 56744 pages

swsusp: critical section/: done (56744 pages copied)

swsusp: Restoring Highmem

PCI: Enabling device 0000:00:01.2 (0000 -> 0001)

PCI: Enabling device 0000:00:04.0 (0000 -> 0001)

pnp: Failed to activate device 00:02.

pnp: Failed to activate device 00:03.

pnp: Failed to activate device 00:05.

pnp: Failed to activate device 00:06.

Saving image data pages (56800 pages) ...       0%  1%  2%  3%  4%  5%  6%  7%  8%  9% 10% 11% 12% 13% 14% 15% 16% 17% 18% 19% 20% 21% 22% 23% 24% 25% 26% 27% 28% 29% 30% 31% 32% 33% 34% 35% 36% 37% 38% 39% 40% 41% 42% 43% 44% 45% 46% 47% 48% 49% 50% 51% 52% 53% 54% 55% 56% 57% 58% 59% 60% 61% 62% 63% 64% 65% 66% 67% 68% 69% 70% 71% 72% 73% 74% 75% 76% 77% 78% 79% 80% 81% 82% 83% 84% 85% 86% 87% 88% 89% 90% 91% 92% 93% 94% 95% 96% 97% 98% 99%done

Wrote 227200 kbytes in 6.18 seconds (36.76 MB/s)

S|

Shutdown: hda

pci_set_power_state(): 0000:00:03.0: state=3, current state=5

Power down.

acpi_power_off called

Memory for crash kernel (0x0 to 0x0) notwithin permissible range

PCI: PIIX3: Enabling Passive Release on 0000:00:01.0

Red Hat nash version 5.1.19.6 starting

  Reading all physical volumes.  This may take a while...

  Found volume group "VolGroup00" using metadata type lvm2

  2 logical volume(s) in volume group "VolGroup00" now active

Resuming from /dev/VolGroup00/LogVol01.

pci_set_power_state(): 0000:00:05.0: state=3, current state=5

BUG: unable to handle kernel paging request at virtual address c0655000
 printing eip:
*pde = 004001e3
Oops: 0003 [#1]
SMP
last sysfs file: /power/resume
Modules linked in: virtio_net virtio_pci virtio_ring dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod virtio_blk virtio ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU:    0
EIP:    0060:[<c05b661e>]    Not tainted VLI
EFLAGS: 00010082   (2.6.18-196.el5 #1)
EIP is at copy_loop+0xe/0x15
eax: 000006d0   ebx: 0fc00001   ecx: 00000400   edx: eb977180
esi: ea12f000   edi: c0655000   ebp: 00000014   esp: f7c8af28
ds: 007b   es: 007b   ss: 0068
Process init (pid: 1, ti=f7c8a000 task=f7c89aa0 task.ti=f7c8a000)
Stack: c0440af5 0fc00001 c0440f29 f7f5f000 c0440ff4 c0640a23 00000001 000000fc
       c0440f72 c0691a34 c04ad8ba c21006f0 c069a3dc f7c78300 c04ad994 00000014
       bfb4c1d7 c21478c0 f7c78314 c21478c0 c04ad903 bfb4c1d7 00000014 c04761b7
Call Trace:
 [<c0440af5>] swsusp_resume+0x25/0x51
 [<c0440f29>] software_resume+0xb1/0xcc
 [<c0440ff4>] resume_store+0x82/0x93
 [<c0440f72>] resume_store+0x0/0x93
 [<c04ad8ba>] subsys_attr_store+0x1e/0x22
 [<c04ad994>] sysfs_write_file+0x91/0xbb
 [<c04ad903>] sysfs_write_file+0x0/0xbb
 [<c04761b7>] vfs_write+0xa1/0x143
 [<c04767a9>] sys_write+0x3c/0x63
 [<c0404f17>] syscall_call+0x7/0xb
 =======================
Code: 7f c0 e8 65 b5 e8 ff c3 90 b9 00 40 6f 00 0f 22 d9 8b 15 18 e0 7b c0 8d b6 00 00 00 00 85 d2 74 11 8b 32 8b 7a 04 b9 00 04 00 00 <f3> a5 8b 52 08 eb eb a1 2c 80 78 c0 89 c2 81 e2 7f ff ff ff 0f
EIP: [<c05b661e>] copy_loop+0xe/0x15 SS:ESP 0068:f7c8af28
 <0>Kernel panic - not syncing: Fatal exception


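The number printed after "Oops:" in the serial output is the x86 page-fault error code, whose low three bits describe the faulting access. The sketch below decodes those bits for the value 0003 seen above; decode_oops_code is a hypothetical helper written for this note and it only interprets the error code, it does not diagnose the root cause.

```shell
#!/bin/sh
# Decode the low three bits of an i386 page-fault error code, given in hex
# as printed after "Oops:" (e.g. 0003). Hypothetical helper for illustration.
#   bit 0: 0 = page not present, 1 = protection violation
#   bit 1: 0 = read access,      1 = write access
#   bit 2: 0 = kernel mode,      1 = user mode
decode_oops_code() {
    code=$((0x$1))
    if [ $((code & 1)) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    if [ $((code & 2)) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $((code & 4)) -ne 0 ]; then mode="user mode"; else mode="kernel mode"; fi
    echo "$access access, $cause, $mode"
}

decode_oops_code 0003   # prints: write access, protection violation, kernel mode
```

For this trace, the code indicates a kernel-mode write to a mapped page that was rejected, which matches the faulting instruction in copy_loop with edi equal to the reported address c0655000.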
3. host cpuinfo
processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz
stepping        : 4
cpu MHz         : 1600.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips        : 5319.96
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management: [8]
Comment 1 Amos Kong 2010-05-05 03:45:00 EDT
This bug also reproduces on RHEL 5.6, so I changed the 'Version' field to 5.6.
---
host kernel: 2.6.18-194.el5
# rpm -qa |grep kvm
kmod-kvm-83-172.el5
kvm-83-172.el5
kvm-qemu-img-83-172.el5
etherboot-zroms-kvm-5.4.4-13.el5
kvm-tools-83-172.el5
kvm-debuginfo-83-172.el5
Comment 2 Gleb Natapov 2010-05-14 13:06:01 EDT
Was this verified manually?
Comment 3 Amos Kong 2010-05-17 23:35:16 EDT
Gleb, I found a mistake in my report: this bug was found and verified manually with guest kernel 2.6.18-196.el5.i686 and host kernel 2.6.18-194.el5.x86_64.

I've retested with guest kernel 2.6.18-196.el5.i686 and host kernel (both 2.6.18-196.el5.x86_64 and 2.6.18-196.el5.x86_64); the bug could not be reproduced.
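
A version mix-up of this kind can be caught by reading the guest kernel release from the login banner in the captured serial log ("Kernel 2.6.18-196.el5 on an i686") instead of relying on the reported value. A minimal sketch, assuming the serial output has been saved to a file; kernel_from_serial_log is a hypothetical helper and not part of kvm-test.

```shell
#!/bin/sh
# Print the kernel release found in the first login banner line of a
# serial log, i.e. a line of the form "Kernel <release> on an <arch>".
kernel_from_serial_log() {
    sed -n 's/^Kernel \([^ ]*\) on an .*$/\1/p' "$1" | head -n 1
}
```

Comparing its output with `uname -r` from the guest under test, or with the version written in the report, would flag the discrepancy before filing.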
Comment 4 Gleb Natapov 2010-05-18 01:12:44 EDT
Close then.
