Bug 624451

Summary: RHEL5 KVM guest slowly locks up on F13 (clocksource)
Product: [Fedora] Fedora
Reporter: Rik van Riel <riel>
Component: kernel
Assignee: Fedora Virtualization Maintainers <virt-maint>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Docs Contact:
Priority: low
Version: 13
CC: anton, dougsland, gansalmon, gcosta, itamar, jcm, jforbes, jonathan, kernel-maint, madhu.chinakonda
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2011-06-28 14:33:56 UTC

Description Rik van Riel 2010-08-16 14:15:55 UTC
Description of problem:

After running for 50 to 90 minutes, a RHEL5 guest on an F13 host essentially locks up.  This appears to be a clocksource problem: the RHEL5 guest uses only the "jiffies" clocksource and nothing else.

F12 running on top of F13 uses kvm-clock just fine.
RHEL5 running on top of F12 also uses kvm-clock.
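
For anyone comparing guests, the following is a minimal sketch of how the active clocksource could be read from inside a guest. It assumes the guest kernel exposes the generic sysfs clocksource interface under /sys/devices/system/clocksource/clocksource0, which may not be the case on every RHEL5 kernel, and it assumes Python 3 is available (the helper name read_clocksource is made up for this example).

```python
import os

# Assumed location of the generic clocksource sysfs interface.
# Older kernels may not provide it at all.
CLOCKSOURCE_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_clocksource(base_dir=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources.

    Returns (None, []) when the sysfs files are not present.
    """
    def read(name):
        path = os.path.join(base_dir, name)
        try:
            with open(path) as f:
                return f.read().strip()
        except OSError:
            # File or directory missing: interface not exposed here.
            return None

    current = read("current_clocksource")
    available = read("available_clocksource")
    return current, (available.split() if available else [])


if __name__ == "__main__":
    current, available = read_clocksource()
    print("current:", current)
    print("available:", " ".join(available))
```

Running this in both a working guest and an affected guest would show whether kvm-clock is listed and selected in each case.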

Version-Release number of selected component (if applicable):

guest kernel: 2.6.18-164.11.1.el5
host kernel:  2.6.33.6-147.2.4.fc13.x86_64
              qemu-kvm-0.12.3-8.fc13.x86_64

Steps to Reproduce:
1. start RHEL5 guest on F13 (confirmed with an SMP guest; a UP guest is being tested now)
2. make the guest busy
3. wait an hour or so
  
Actual results:

The guest gradually becomes slower, until things that should take fractions of a second take minutes.  Programs like "sleep 1" start taking forever.
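
One rough way to put a number on the slowdown is to time a one-second sleep repeatedly and log how far it overruns. The sketch below is only an illustration (sleep_overrun is a hypothetical helper, and Python 3 is assumed). Note that if the guest's own clock is misbehaving, a measurement taken inside the guest may itself be unreliable, so it is probably best compared against a reference outside the guest.

```python
import time


def sleep_overrun(seconds=1.0, clock=time.monotonic, sleeper=time.sleep):
    """Sleep for `seconds` and return how much longer it took, in seconds.

    `clock` and `sleeper` are injectable so that a different time
    reference can be substituted for the local monotonic clock.
    """
    start = clock()
    sleeper(seconds)
    return clock() - start - seconds
```

Calling sleep_overrun() once a minute and printing the result would show whether the overrun grows over the 50 to 90 minutes it takes for the problem to appear.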

Expected results:

Things hum along quietly.

Additional info:

Linux version 2.6.18-164.11.1.el5 (mockbuild.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Wed Jan 6 13:26:04 EST 2010
Command line: ro root=/dev/vg0/root console=ttyS0,tty panic=30
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000010000 - 000000000009cc00 (usable)
 BIOS-e820: 000000000009cc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000007fffb000 (usable)
 BIOS-e820: 000000007fffb000 - 0000000080000000 (reserved)
 BIOS-e820: 00000000fffbc000 - 0000000100000000 (reserved)
DMI 2.4 present.
kvm-clock: cpu 0, msr 7eff:80433401, boot clock
ACPI: RSDP (v000 BOCHS                                 ) @ 0x00000000000f83e0
ACPI: RSDT (v001 BOCHS  BXPCRSDT 0x00000001 BXPC 0x00000001) @ 0x000000007fffde10
ACPI: FADT (v001 BOCHS  BXPCFACP 0x00000001 BXPC 0x00000001) @ 0x000000007ffffe10
ACPI: SSDT (v001 BOCHS  BXPCSSDT 0x00000001 BXPC 0x00000001) @ 0x000000007fffdf30
ACPI: MADT (v001 BOCHS  BXPCAPIC 0x00000001 BXPC 0x00000001) @ 0x000000007fffde40
ACPI: DSDT (v001   BXPC   BXDSDT 0x00000001 INTL 0x20090123) @ 0x0000000000000000
No NUMA configuration found
Faking a node at 0000000000000000-000000007fffb000
Bootmem setup node 0 0000000000000000-000000007fffb000
Memory for crash kernel (0x0 to 0x0) notwithin permissible range
disabling kdump
On node 0 totalpages: 515710
  DMA zone: 2634 pages, LIFO batch:0
  DMA32 zone: 513076 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0xb008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:2 APIC version 20
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1 6:2 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
Processor #2 6:2 APIC version 20
ACPI: IOAPIC (id[0x03] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 3, version 17, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ5 used by override.
ACPI: IRQ9 used by override.
ACPI: IRQ10 used by override.
ACPI: IRQ11 used by override.
Setting APIC routing to physical flat
Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009c000 - 000000000009d000
Nosave address range: 000000000009d000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000f0000
Nosave address range: 00000000000f0000 - 0000000000100000
Allocating PCI resources starting at 88000000 (gap: 80000000:7ffbc000)
SMP: Allowing 3 CPUs, 0 hotplug CPUs
kvm-clock: cpu 0, msr 0:2375401, primary cpu clock
Built 1 zonelists.  Total pages: 515710
Kernel command line: ro root=/dev/vg0/root console=ttyS0,tty panic=30
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
time.c: Using tsc for timekeeping HZ 1000
Console: colour VGA+ 80x25
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Checking aperture...
ACPI: DMAR not present
Memory: 2056180k/2097132k available (2551k kernel code, 40488k reserved, 1290k data, 208k init)
Calibrating delay loop (skipped), value calculated using timer frequency.. 4533.06 BogoMIPS (lpj=2266534)
Security Framework v1.0.0 initialized
SELinux:  Initializing.
SELinux:  Starting in permissive mode
selinux_register_security:  Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 256
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
SMP alternatives: switching to UP code
ACPI: Core revision 20060707
Using local APIC timer interrupts.
WARNING calibrate_APIC_clock: the APIC timer calibration may be wrong.
Detected 62.497 MHz APIC timer.
SMP alternatives: switching to SMP code
Booting processor 1/3 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4533.72 BogoMIPS (lpj=2266863)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
QEMU Virtual CPU version 0.12.3 stepping 03
kvm-clock: cpu 1, msr 0:237da81, secondary cpu clock
SMP alternatives: switching to SMP code
Booting processor 2/3 APIC 0x2
Initializing CPU#2
Calibrating delay using timer specific routine.. 4532.16 BogoMIPS (lpj=2266080)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
QEMU Virtual CPU version 0.12.3 stepping 03
kvm-clock: cpu 2, msr 0:2386101, secondary cpu clock
Brought up 3 CPUs
testing NMI watchdog ... <4>WARNING: CPU#0: NMI appears to be stuck (0->0)!
time.c: Using 1.193182 MHz WALL KVM GTOD KVM timer.
time.c: Detected 2266.534 MHz processor.
sizeof(vma)=176 bytes
sizeof(page)=56 bytes
sizeof(inode)=560 bytes
sizeof(dentry)=216 bytes
sizeof(ext3inode)=760 bytes
sizeof(buffer_head)=96 bytes
sizeof(skbuff)=248 bytes
migration_cost=30
checking if image is initramfs... it is
Freeing initrd memory: 3163k freed
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
mtrr: your CPUs had inconsistent variable MTRR settings
mtrr: your CPUs had inconsistent MTRRdefType settings
mtrr: probably your BIOS does not setup all CPUs.
mtrr: corrected configuration.
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: No dock devices found.
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI quirk: region b000-b03f claimed by PIIX4 ACPI
PCI quirk: region b100-b10f claimed by PIIX4 SMB
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 5 *10 11)
ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
ACPI: PCI Interrupt Link [LNKD] (IRQs 5 10 *11)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 6 devices
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
ACPI: DMAR not present
PCI-GART: No AMD northbridge found.
NET: Registered protocol family 2
IP route cache hash table entries: 65536 (order: 7, 524288 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
audit: initializing netlink socket (disabled)
type=2000 audit(1281967626.636:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
SELinux:  Registering netfilter hooks
Initializing Cryptographic API
alg: No test for crc32c (crc32c-generic)
ksign: Installing public key data
Loading keyring
- Added public key 66178A9C3289FFFB
- User ID: Red Hat, Inc. (Kernel Module GPG key)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
Limiting direct PCI/PCI transfers.
PCI: PIIX3: Enabling Passive Release on 0000:00:01.0
Activating ISA DMA hang workarounds.
Boot video device is 0000:00:02.0
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
Real Time Clock Driver v1.12ac
hpet_acpi_add: no address or irqs in _CRS
Non-volatile memory driver v1.2
Linux agpgart interface v0.101 (c) Dave Jones
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
brd: module loaded
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX3: IDE controller at PCI slot 0000:00:01.1
PIIX3: chipset revision 0
PIIX3: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xc000-0xc007, BIOS settings: hda:pio, hdb:pio
    ide1: BM-DMA at 0xc008-0xc00f, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
Probing IDE interface ide1...
Probing IDE interface ide0...
Probing IDE interface ide1...
ide-floppy driver 0.99.newide
usbcore: registered new driver hiddev
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
TCP bic registered
Initializing IPsec netlink socket
input: AT Translated Set 2 keyboard as /class/input/input0
NET: Registered protocol family 1
NET: Registered protocol family 17
ACPI: (supports S3 S4 S5)
Initalizing network drop monitor service
Freeing unused kernel memory: 208k freed
Write protecting the kernel read-only data: 497k
input: ImExPS/2 Generic Explorer Mouse as /class/input/input1
ohci_hcd: 2005 April 22 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11
ACPI: PCI Interrupt 0000:00:01.2[D] -> Link [LNKD] -> GSI 11 (level, high) -> IRQ 11
PCI: Setting latency timer of device 0000:00:01.2 to 64
uhci_hcd 0000:00:01.2: UHCI Host Controller
uhci_hcd 0000:00:01.2: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:01.2: irq 11, io base 0x0000c020
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 10
ACPI: PCI Interrupt 0000:00:03.0[A] -> Link [LNKC] -> GSI 10 (level, high) -> IRQ 10
ACPI: PCI Interrupt 0000:00:04.0[A] -> Link [LNKD] -> GSI 11 (level, high) -> IRQ 11
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
ACPI: PCI Interrupt 0000:00:05.0[A] -> Link [LNKA] -> GSI 10 (level, high) -> IRQ 10
 vda: vda1 vda2 vda3 < vda5 > vda4
device-mapper: uevent: version 1.0.3
device-mapper: ioctl: 4.11.5-ioctl (2007-12-12) initialised: dm-devel
device-mapper: dm-raid45: initialized v0.2594l
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
type=1404 audit(1281967629.818:2): enforcing=1 old_enforcing=0 auid=4294967295 ses=4294967295
security:  3 users, 6 roles, 2039 types, 263 bools, 1 sens, 1024 cats
security:  61 classes, 79077 rules
SELinux:  Completing initialization.
SELinux:  Setting up existing superblocks.
SELinux: initialized (dev dm-0, type ext3), uses xattr
SELinux: initialized (dev usbfs, type usbfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
SELinux: initialized (dev hugetlbfs, type hugetlbfs), uses genfs_contexts
SELinux: initialized (dev devpts, type devpts), uses transition SIDs
SELinux: initialized (dev eventpollfs, type eventpollfs), uses task SIDs
SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev futexfs, type futexfs), uses genfs_contexts
SELinux: initialized (dev anon_inodefs, type anon_inodefs), uses genfs_contexts
SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
SELinux: initialized (dev cpuset, type cpuset), uses genfs_contexts
SELinux: initialized (dev proc, type proc), uses genfs_contexts
SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
type=1403 audit(1281967629.926:3): policy loaded auid=4294967295 ses=4294967295
SCSI subsystem initialized
libata version 3.00 loaded.
input: PC Speaker as /class/input/input2
piix4_smbus 0000:00:01.3: Found 0000:00:01.3 device
FDC 0 is a S82078B
lp: driver loaded but no devices found
ACPI: Power Button (FF) [PWRF]
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
device-mapper: multipath: version 1.0.5 loaded
EXT3 FS on dm-0, internal journal
kjournald starting.  Commit interval 5 seconds
EXT3 FS on vda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: initialized (dev vda1, type ext3), uses xattr
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
kjournald starting.  Commit interval 5 seconds
EXT3 FS on dm-1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: initialized (dev dm-1, type ext3), uses xattr
kjournald starting.  Commit interval 5 seconds
EXT3 FS on dm-3, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: initialized (dev dm-3, type ext3), uses xattr
SELinux: initialized (dev tmpfs, type tmpfs), uses mountpoint labeling
SELinux: initialized (dev tmpfs, type tmpfs), uses mountpoint labeling
Adding 589816k swap on /dev/vg0/swap.  Priority:-1 extents:1 across:589816k
Adding 409592k swap on /swapfile.  Priority:-2 extents:3176 across:2819088k
SELinux: initialized (dev binfmt_misc, type binfmt_misc), uses genfs_contexts
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
IPv6 over IPv4 tunneling driver
SELinux: initialized (dev rpc_pipefs, type rpc_pipefs), uses genfs_contexts
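
Since the log above is long, here is a small sketch for pulling out only the timekeeping-related lines (the kvm-clock registrations, the "time.c:" messages, and the APIC timer calibration). The pattern list is an assumption about which messages are relevant, not an exhaustive filter, and timekeeping_lines is a name invented for this example.

```python
import re

# Assumed set of markers for timekeeping-related boot messages.
TIMEKEEPING = re.compile(
    r"kvm-clock|clocksource|time\.c:|APIC timer", re.IGNORECASE
)


def timekeeping_lines(log_text):
    """Return the lines of a boot log that mention timekeeping."""
    return [line for line in log_text.splitlines() if TIMEKEEPING.search(line)]


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 this_script.py
    for line in timekeeping_lines(sys.stdin.read()):
        print(line)
```

Filtering the boot logs of a working and a non-working guest this way makes them easier to compare side by side.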

Comment 1 Jon Masters 2010-08-17 17:58:09 UTC
So it's at least a RHEL5.5 guest then?

Comment 2 Chuck Ebbert 2010-08-18 12:42:17 UTC
Can you try older F13 2.6.33 kernels to see if this was recently introduced?

Comment 3 Bug Zapper 2011-06-01 11:24:34 UTC
This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 4 Bug Zapper 2011-06-28 14:33:56 UTC
Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.