Description of problem: Xen is unstable with 2.6.30 rebase. I've attached relevant dmesg and kernel timeouts. Version-Release number of selected component (if applicable): kernel 2.6.30 Actual results: System hang. Expected results: Stable system. Additional info: [kev009@gnucapplus ~]$ dmesg Initializing cgroup subsys cpuset Initializing cgroup subsys cpu Linux version 2.6.30.8-64.fc11.x86_64 (mockbuild.phx.redhat.com) (gcc version 4.4.1 20090725 (Red Hat 4.4.1-2) (GCC) ) #1 SMP Fri Sep 25 04:43:32 EDT 2009 Command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD Centaur CentaurHauls ACPI in unprivileged domain disabled BIOS-provided physical RAM map: Xen: 0000000000000000 - 00000000000a0000 (usable) Xen: 00000000000a0000 - 0000000000100000 (reserved) Xen: 0000000000100000 - 0000000002166000 (usable) Xen: 0000000002166000 - 0000000002369000 (reserved) Xen: 0000000002369000 - 0000000040000000 (usable) DMI not present or invalid. last_pfn = 0x40000 max_arch_pfn = 0x100000000 init_memory_mapping: 0000000000000000-0000000040000000 0000000000 - 0040000000 page 4k kernel direct mapping tables up to 40000000 @ 100000-302000 RAMDISK: 019f2000 - 02166000 No NUMA configuration found Faking a node at 0000000000000000-0000000040000000 Bootmem setup node 0 0000000000000000-0000000040000000 NODE_DATA [0000000000008000 - 000000000001cfff] bootmap [000000000001d000 - 0000000000024fff] pages 8 (6 early reservations) ==> bootmem [0000000000 - 0040000000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000] #1 [0002369000 - 000237e000] XEN PAGETABLES ==> [0002369000 - 000237e000] #2 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000] #3 [0001000000 - 00019d11a0] TEXT DATA BSS ==> [0001000000 - 00019d11a0] #4 [00019f2000 - 0002166000] RAMDISK ==> [00019f2000 - 0002166000] #5 [0000100000 - 00002ea000] PGTABLE ==> [0000100000 - 00002ea000] Zone PFN ranges: DMA 0x00000000 -> 0x00001000 DMA32 0x00001000 -> 0x00100000 Normal 0x00100000 -> 0x00100000 Movable zone start PFN for each node early_node_map[3] active PFN ranges 0: 0x00000000 -> 0x000000a0 0: 0x00000100 -> 0x00002166 0: 0x00002369 -> 0x00040000 On node 0 totalpages: 261533 DMA zone: 56 pages used for memmap DMA zone: 493 pages reserved DMA zone: 3451 pages, LIFO batch:0 DMA32 zone: 3528 pages used for memmap DMA32 zone: 254005 pages, LIFO batch:31 SMP: Allowing 2 CPUs, 0 hotplug CPUs No local APIC present nr_irqs_gsi: 16 PM: Registered nosave memory: 00000000000a0000 - 0000000000100000 PM: Registered nosave memory: 0000000002166000 - 0000000002369000 Allocating PCI resources starting at 50000000 (gap: 40000000:c0000000) NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:2 nr_node_ids:1 PERCPU: Allocated 20 4k pages, static data 79584 bytes trying to map vcpu_info 0 at ffffc2000000b020, mfn 1ed7cf, offset 32 cpu 0 using vcpu_info at ffffc2000000b020 trying to map vcpu_info 1 at ffffc20000021020, mfn 1ed7bb, offset 32 cpu 1 using vcpu_info at ffffc20000021020 Xen: using vcpu_info placement Built 1 zonelists in Node order, mobility grouping on. Total pages: 257456 Policy zone: DMA32 Kernel command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet Initializing CPU#0 ------------[ cut here ]------------ WARNING: at arch/x86/xen/enlighten.c:482 cvt_gate_to_trap+0x86/0xbe() (Not tainted) Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.30.8-64.fc11.x86_64 #1 Call Trace: [<ffffffff810571a8>] warn_slowpath_common+0x95/0xc3 [<ffffffff810571fd>] warn_slowpath_null+0x27/0x3d [<ffffffff8100abc0>] cvt_gate_to_trap+0x86/0xbe [<ffffffff8100ac50>] xen_convert_trap_info+0x58/0x8d [<ffffffff8100b056>] xen_load_idt+0x4c/0x70 [<ffffffff8148f238>] cpu_init+0x11f/0x2ff [<ffffffff8100ad93>] ? xen_write_idt_entry+0x46/0xa9 [<ffffffff8175c9d1>] trap_init+0x257/0x274 [<ffffffff8122433b>] ? sort_extable+0x32/0x4b [<ffffffff81757e62>] start_kernel+0x240/0x433 [<ffffffff817572d0>] x86_64_start_reservations+0xbb/0xd6 [<ffffffff8175b4e0>] xen_start_kernel+0x4b8/0x4bc ---[ end trace a7919e7f17c0a725 ]--- NR_IRQS:4352 nr_irqs:288 PID hash table entries: 4096 (order: 12, 32768 bytes) Detected 1991.513 MHz processor. Console: colour dummy device 80x25 console [tty0] enabled console [hvc0] enabled allocated 10485760 bytes of page_cgroup please try cgroup_disable=memory option if you don't want Checking aperture... No AGP bridge found Calgary: detecting Calgary via BIOS EBDA area Calgary: Unable to locate Rio Grande table in EBDA - bailing! Memory: 1001400k/1048576k available (4736k kernel code, 2444k absent, 44732k reserved, 2498k data, 1348k init) SLUB: Genslabs=14, HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1 Xen: using vcpuop timer interface installing Xen timer for CPU 0 Calibrating delay loop (skipped), value calculated using timer frequency.. 3983.02 BogoMIPS (lpj=1991513) Security Framework initialized SELinux: Initializing. SELinux: Starting in permissive mode Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes) Inode-cache hash table entries: 65536 (order: 7, 524288 bytes) Mount-cache hash table entries: 256 Initializing cgroup subsys ns Initializing cgroup subsys cpuacct Initializing cgroup subsys memory Initializing cgroup subsys devices Initializing cgroup subsys freezer Initializing cgroup subsys net_cls CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 1024K (64 bytes/line) CPU 0/0x0 -> Node 0 tseg: 0000000000 SMP alternatives: switching to UP code ftrace: converting mcount calls to 0f 1f 44 00 00 ftrace: allocating 19374 entries in 76 pages installing Xen timer for CPU 1 ------------[ cut here ]------------ WARNING: at arch/x86/xen/enlighten.c:482 cvt_gate_to_trap+0x86/0xbe() (Tainted: G W ) Modules linked in: Pid: 1, comm: swapper Tainted: G W 2.6.30.8-64.fc11.x86_64 #1 Call Trace: [<ffffffff810571a8>] warn_slowpath_common+0x95/0xc3 [<ffffffff810571fd>] warn_slowpath_null+0x27/0x3d [<ffffffff8100abc0>] cvt_gate_to_trap+0x86/0xbe [<ffffffff8100ac50>] xen_convert_trap_info+0x58/0x8d [<ffffffff8100aca9>] xen_copy_trap_info+0x24/0x26 [<ffffffff8148d271>] xen_cpu_up+0x193/0x35f [<ffffffff814939ef>] _cpu_up+0xbc/0x157 [<ffffffff81493af9>] cpu_up+0x6f/0x90 [<ffffffff81757827>] kernel_init+0xc8/0x24b [<ffffffff8101313a>] child_rip+0xa/0x20 [<ffffffff81012afd>] ? restore_args+0x0/0x30 [<ffffffff81013130>] ? child_rip+0x0/0x20 ---[ end trace a7919e7f17c0a726 ]--- SMP alternatives: switching to SMP code Initializing CPU#1 ------------[ cut here ]------------ WARNING: at arch/x86/xen/enlighten.c:482 cvt_gate_to_trap+0x86/0xbe() (Tainted: G W ) Modules linked in: Pid: 0, comm: swapper Tainted: G W 2.6.30.8-64.fc11.x86_64 #1 Call Trace: [<ffffffff810571a8>] warn_slowpath_common+0x95/0xc3 [<ffffffff810571fd>] warn_slowpath_null+0x27/0x3d [<ffffffff8100abc0>] cvt_gate_to_trap+0x86/0xbe [<ffffffff8100ac50>] xen_convert_trap_info+0x58/0x8d [<ffffffff8100b056>] xen_load_idt+0x4c/0x70 [<ffffffff8148f236>] cpu_init+0x11d/0x2ff [<ffffffff8148cfc5>] cpu_bringup+0x22/0xac [<ffffffff8148d0c3>] cpu_bringup_and_idle+0x21/0x3c ---[ end trace a7919e7f17c0a727 ]--- CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 1024K (64 bytes/line) CPU 1/0x0 -> Node 0 Brought up 2 CPUs sizeof(vma)=176 bytes sizeof(page)=56 bytes sizeof(inode)=576 bytes sizeof(dentry)=192 bytes sizeof(ext3inode)=784 bytes sizeof(buffer_head)=104 bytes sizeof(skbuff)=232 bytes sizeof(task_struct)=5960 bytes CPU0 attaching sched-domain: domain 0: span 0-1 level CPU groups: 0 1 CPU1 attaching sched-domain: domain 0: span 0-1 level CPU groups: 1 0 net_namespace: 1952 bytes Booting paravirtualized kernel on Xen Xen version: 3.4.0 (preserve-AD) Grant table initialized regulator: core version 0.5 Time: 165:165:165 Date: 165/165/65 NET: Registered protocol family 16 PCI: Fatal: No config space access function found bio: create slab <bio-0> at 0 ACPI: Interpreter disabled. xen_balloon: Initialising balloon driver. SCSI subsystem initialized libata version 3.00 loaded. usbcore: registered new interface driver usbfs usbcore: registered new interface driver hub usbcore: registered new device driver usb PCI: System does not support PCI PCI: System does not support PCI NetLabel: Initializing NetLabel: domain hash size = 128 NetLabel: protocols = UNLABELED CIPSOv4 NetLabel: unlabeled traffic allowed by default pnp: PnP ACPI: disabled Switched to high resolution mode on CPU 0 Switched to high resolution mode on CPU 1 NET: Registered protocol family 2 IP route cache hash table entries: 32768 (order: 6, 262144 bytes) TCP established hash table entries: 131072 (order: 9, 2097152 bytes) TCP bind hash table entries: 65536 (order: 8, 1048576 bytes) TCP: Hash tables configured (established 131072 bind 65536) TCP reno registered NET: Registered protocol family 1 Trying to unpack rootfs image as initramfs... Freeing initrd memory: 7632k freed platform rtc_cmos: registered platform RTC device (no PNP device found) audit: initializing netlink socket (disabled) type=2000 audit(1254332332.712:1): initialized HugeTLB registered 2 MB page size, pre-allocated 0 pages VFS: Disk quotas dquot_6.5.2 Dquot-cache hash table entries: 512 (order 0, 4096 bytes) msgmni has been set to 2062 SELinux: Registering netfilter hooks alg: No test for stdrng (krng) Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252) io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered (default) pci_hotplug: PCI Hot Plug PCI Core version: 0.5 pciehp: PCI Express Hot Plug Controller Driver version: 0.4 acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5 Non-volatile memory driver v1.3 Linux agpgart interface v0.103 Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled brd: module loaded loop: module loaded input: Macintosh mouse button emulation as /devices/virtual/input/input0 Driver 'sd' needs updating - please use bus_type methods Driver 'sr' needs updating - please use bus_type methods Fixed MDIO Bus: probed ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver uhci_hcd: USB Universal Host Controller Interface driver PNP: No PS/2 controller found. Probing ports directly. mice: PS/2 mouse device common for all mice rtc_cmos: probe of rtc_cmos failed with error -16 device-mapper: uevent: version 1.0.3 device-mapper: ioctl: 4.14.0-ioctl (2008-04-23) initialised: dm-devel cpuidle: using governor ladder cpuidle: using governor menu usbcore: registered new interface driver hiddev usbcore: registered new interface driver usbhid usbhid: v2.6:USB HID core driver nf_conntrack version 0.5.0 (8192 buckets, 32768 max) CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Please use nf_conntrack.acct=1 kernel paramater, acct=1 nf_conntrack module option or sysctl net.netfilter.nf_conntrack_acct=1 to enable it. ip_tables: (C) 2000-2006 Netfilter Core Team TCP cubic registered Initializing XFRM netlink socket NET: Registered protocol family 17 PM: Resume from disk failed. registered taskstats version 1 XENBUS: Device with no driver: device/console/0 XENBUS: Device with no driver: device/vbd/51712 XENBUS: Device with no driver: device/vif/0 Magic number: 1:252:3141 Initalizing network drop monitor service Freeing unused kernel memory: 1348k freed Write protecting the kernel read-only data: 6728k xvda: xvda1 xvda2 EXT3-fs: INFO: recovery required on readonly filesystem. EXT3-fs: write access will be enabled during recovery. kjournald starting. Commit interval 5 seconds EXT3-fs: dm-0: orphan cleanup on readonly fs ext3_orphan_cleanup: deleting unreferenced inode 16579 ext3_orphan_cleanup: deleting unreferenced inode 16530 ext3_orphan_cleanup: deleting unreferenced inode 16476 ext3_orphan_cleanup: deleting unreferenced inode 16469 ext3_orphan_cleanup: deleting unreferenced inode 16455 ext3_orphan_cleanup: deleting unreferenced inode 16423 ext3_orphan_cleanup: deleting unreferenced inode 16416 EXT3-fs: dm-0: 7 orphan inodes deleted EXT3-fs: recovery complete. EXT3-fs: mounted filesystem with ordered data mode. SELinux: Disabled at runtime. SELinux: Unregistering netfilter hooks type=1404 audit(1254332342.191:2): selinux=0 auid=4294967295 ses=4294967295 udev: starting version 141 Initialising Xen virtual ethernet driver. input: PC Speaker as /devices/platform/pcspkr/input/input1 device-mapper: multipath: version 1.0.5 loaded EXT3 FS on dm-0, internal journal kjournald starting. Commit interval 5 seconds EXT3 FS on xvda1, internal journal EXT3-fs: mounted filesystem with ordered data mode. Adding 1671160k swap on /dev/mapper/VolGroup00-LogVol01. Priority:-1 extents:1 across:1671160k NET: Registered protocol family 10 lo: Disabled Privacy Extensions RPC: Registered udp transport module. RPC: Registered tcp transport module. eth0: no IPv6 routers present ---------- [root@buildbox-a2 ~]# xm console 8 INFO: task kjournald:61 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D ffff88003e932000 0 61 2 ffff88003e919d40 0000000000000246 ffffffff8100e45c 0000000000000000 000000001cee5db8 ffff88003e919d20 ffffffff8100ee82 0000000000000202 ffff88003e9c83a8 000000000000e2e8 ffff88003e9c83a8 0000000000012d00 Call Trace: [<ffffffff8100e45c>] ? xen_force_evtchn_callback+0x20/0x36 [<ffffffff8100ee82>] ? check_events+0x12/0x20 [<ffffffff8100ee6f>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff814993de>] ? _spin_unlock_irqrestore+0x4e/0x64 [<ffffffff81496bf6>] schedule+0x21/0x49 [<ffffffff811b8b33>] journal_commit_transaction+0x13d/0xe42 [<ffffffff8100e45c>] ? xen_force_evtchn_callback+0x20/0x36 [<ffffffff81070bfb>] ? autoremove_wake_function+0x0/0x5f [<ffffffff810632bc>] ? try_to_del_timer_sync+0x69/0x87 [<ffffffff811bcdf7>] kjournald+0xfd/0x253 [<ffffffff81070bfb>] ? autoremove_wake_function+0x0/0x5f [<ffffffff811bccfa>] ? kjournald+0x0/0x253 [<ffffffff81070709>] kthread+0x6d/0xae [<ffffffff8101313a>] child_rip+0xa/0x20 [<ffffffff81012afd>] ? restore_args+0x0/0x30 [<ffffffff81013130>] ? child_rip+0x0/0x20 INFO: task ns-slapd:1034 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ns-slapd D ffffc20000000000 0 1034 1 ffff88003dd87908 0000000000000282 ffff88003dd87868 ffffffff8100ed0d ffff88003dd86000 00000000e59205a0 ffff88003dd87888 ffffffff8107957a ffff88003d4fe0e8 000000000000e2e8 ffff88003d4fe0e8 0000000000012d00 Call Trace: [<ffffffff8100ed0d>] ? xen_clocksource_get_cycles+0x1c/0x32 [<ffffffff8107957a>] ? clocksource_read+0x22/0x38 [<ffffffff81074986>] ? ktime_get_ts+0x61/0x7d [<ffffffff8113b0ee>] ? sync_buffer+0x0/0x6b [<ffffffff81496bf6>] schedule+0x21/0x49 [<ffffffff81496c62>] io_schedule+0x44/0x6c [<ffffffff8113b141>] sync_buffer+0x53/0x6b [<ffffffff81497294>] __wait_on_bit_lock+0x55/0xb2 [<ffffffff810d2d1f>] ? find_get_page+0x64/0xa3 [<ffffffff8149736e>] out_of_line_wait_on_bit_lock+0x7d/0x9c [<ffffffff8113b0ee>] ? sync_buffer+0x0/0x6b [<ffffffff81070c5a>] ? wake_bit_function+0x0/0x5a [<ffffffff8113b380>] __lock_buffer+0x3d/0x53 [<ffffffff811b6eda>] lock_buffer+0x49/0x64 [<ffffffff811b7a15>] do_get_write_access+0x82/0x3f3 [<ffffffff811bbdb3>] ? journal_add_journal_head+0xce/0x162 [<ffffffff811b7dc0>] journal_get_write_access+0x3a/0x65 [<ffffffff8118c209>] __ext3_journal_get_write_access+0x34/0x74 [<ffffffff8117e464>] ext3_reserve_inode_write+0x50/0xaa [<ffffffff8117e50d>] ext3_mark_inode_dirty+0x4f/0x80 [<ffffffff8117e6b8>] ext3_dirty_inode+0x79/0xa7 [<ffffffff81135095>] __mark_inode_dirty+0x45/0x190 [<ffffffff81129603>] file_update_time+0xc0/0x113 [<ffffffff810eb167>] do_wp_page+0x610/0x658 [<ffffffff81INFO: task kjournald:61 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D ffff88003e932000 0 61 2 ffff88003e919d40 0000000000000246 ffffffff8100e45c 0000000000000000 000000001cee5db8 ffff88003e919d20 ffffffff8100ee82 0000000000000202 ffff88003e9c83a8 000000000000e2e8 ffff88003e9c83a8 0000000000012d00 Call Trace: [<ffffffff8100e45c>] ? xen_force_evtchn_callback+0x20/0x36 [<ffffffff8100ee82>] ? check_events+0x12/0x20 [<ffffffff8100ee6f>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff814993de>] ? _spin_unlock_irqrestore+0x4e/0x64 [<ffffffff81496bf6>] schedule+0x21/0x49 [<ffffffff811b8b33>] journal_commit_transaction+0x13d/0xe42 [<ffffffff8100e45c>] ? xen_force_evtchn_callback+0x20/0x36 [<ffffffff81070bfb>] ? autoremove_wake_function+0x0/0x5f [<ffffffff810632bc>] ? try_to_del_timer_sync+0x69/0x87 [<ffffffff811bcdf7>] kjournald+0xfd/0x253 [<ffffffff81070bfb>] ? autoremove_wake_function+0x0/0x5f [<ffffffff811bccfa>] ? kjournald+0x0/0x253 [<ffffffff81070709>] kthread+0x6d/0xae [<ffffffff8101313a>] child_rip+0xa/0x20 [<ffffffff81012afd>] ? restore_args+0x0/0x30 [<ffffffff81013130>] ? child_rip+0x0/0x20 INFO: task ns-slapd:1034 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ns-slapd D ffffc20000000000 0 1034 1 ffff88003dd87908 0000000000000282 ffff88003dd87868 ffffffff8100ed0d ffff88003dd86000 00000000e59205a0 ffff88003dd87888 ffffffff8107957a ffff88003d4fe0e8 000000000000e2e8 ffff88003d4fe0e8 0000000000012d00 Call Trace: [<ffffffff8100ed0d>] ? xen_clocksource_get_cycles+0x1c/0x32 [<ffffffff8107957a>] ? clocksource_read+0x22/0x38 [<ffffffff81074986>] ? ktime_get_ts+0x61/0x7d [<ffffffff8113b0ee>] ? sync_buffer+0x0/0x6b [<ffffffff81496bf6>] schedule+0x21/0x49 [<ffffffff81496c62>] io_schedule+0x44/0x6c [<ffffffff8113b141>] sync_buffer+0x53/0x6b [<ffffffff81497294>] __wait_on_bit_lock+0x55/0xb2 [<ffffffff810d2d1f>] ? find_get_page+0x64/0xa3 [<ffffffff8149736e>] out_of_line_wait_on_bit_lock+0x7d/0x9c [<ffffffff8113b0ee>] ? sync_buffer+0x0/0x6b [<ffffffff81070c5a>] ? wake_bit_function+0x0/0x5a [<ffffffff8113b380>] __lock_buffer+0x3d/0x53 [<ffffffff811b6eda>] lock_buffer+0x49/0x64 [<ffffffff811b7a15>] do_get_write_access+0x82/0x3f3 [<ffffffff811bbdb3>] ? journal_add_journal_head+0xce/0x162 [<ffffffff811b7dc0>] journal_get_write_access+0x3a/0x65 [<ffffffff8118c209>] __ext3_journal_get_write_access+0x34/0x74 [<ffffffff8117e464>] ext3_reserve_inode_write+0x50/0xaa [<ffffffff8117e50d>] ext3_mark_inode_dirty+0x4f/0x80 [<ffffffff8117e6b8>] ext3_dirty_inode+0x79/0xa7 [<ffffffff81135095>] __mark_inode_dirty+0x45/0x190 [<ffffffff81129603>] file_update_time+0xc0/0x113 [<ffffffff810eb167>] do_wp_page+0x610/0x658 [<ffffffff8100bc21>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e [<ffffffff810eccd9>] handle_mm_fault+0x6a2/0x72e [<ffffffff814993de>] ? _spin_unlock_irqrestore+0x4e/0x64 [<ffffffff8149be99>] do_page_fault+0x226/0x24f [<ffffffff81499965>] page_fault+0x25/0x30 INFO: task ns-slapd:1040 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ns-slapd D ffff88003e932024 0 1040 1 ffff88003bc119f8 0000000000000282 ffffffff8100e45c ffffc20000025410 00000000f1efb74c ffff88003bc119d8 ffffffff8100ee82 0000000000000004 ffff88003bc0b248 000000000000e2e8 ffff88003bc0b248 0000000000012d00 Call Trace: [<ffffffff8100e45c>] ? xen_force_evtchn_callback+0x20/0x36 [<ffffffff8100ee82>] ? check_events+0x12/0x20 [<ffffffff8100ee6f>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff814993de>] ? _spin_unlock_irqrestore+0x4e/0x64 [<ffffffff8100ee82>] ? check_events+0x12/0x20 [<ffffffff81496bf6>] schedule+0x21/0x49 [<ffffffff811b841d>] start_this_handle+0x2d4/0x373 [<ffffffff81070bfb>] ? autoremove_wake_function+0x0/0x5f [<ffffffff811b865d>] journal_start+0xb7/0x106 [<ffffffff81187903>] ext3_journal_start_sb+0x62/0x78 [<ffffffff8117d60b>] ext3_journal_start+0x28/0x3e [<ffffffff8117e67d>] ext3_dirty_inode+0x3e
Moving this to the kernel component, but leaving xen-maint assigned.
I've had a stable system thus far by appending "clocksource=jiffies" to the kernel boot line. The default clocksource is otherwise "xen". The dmesg boot warnings in my initial report still occur (enlighten.c).
That's interesting. I guess there are two possibilities here: 1. there's something wrong with the Xen/KVM pvclock clocksource that's upsetting the IO scheduling code somehow (maybe non-monotony or something), or 2. there's a bug in IO scheduling when it has very high-res timing info It could well be 1. The tsc clocksource has some code to specifically guarantee monotonicity; maybe we should do the same...
Created attachment 364778 [details] Check to make sure clocksource returns minimally monotone results Please try this and see if it makes any difference. I'm not sure that it will, but it looks like a difference between the normal tsc-based clocksources and the Xen one (there's an equivalent kvm patch too).
I haven't tested the patch yet since I don't have facilities to build a Fedora kernel installed at the moment, but I do have another piece of information. When running with xen clocksource, the system is stable if I get a message such as this one once: hrtimer: interrupt too slow, forcing clock min delta to 19541868 ns If the hrtimer message does not appear, the system will crash usually fairly quickly.
I've built a 2.6.30.8-64.fc11.x86_64 kernel containing this patch, and am running it on a server that experiences this issue. Testing results will be forthcoming as soon as they exist. Also, this issue appears to only manifest on guests that are running with multiple vcpus on AMD hardware - one vcpu on AMD is fine, and multiple vcpus on Intel is fine. Kevin, is the host you're seeing this problem on AMD-based as well?
Yes, the box is a dual opterton 246 and the guest has two VCPUs. I have an identically configured box that does not exhibit these problems. So far, the patched monotone kernel has been running well for 17 hours. I will update if anything changes. Before applying the monotone patch, I upgraded to RHEL 5.4 and Xen 3.4.1 on the dom0 but that had no positive impact for this bug.
Unfortunately, it seems that the server I was testing this on experienced the same old lockup 4.25 hours after it was started under the patched kernel. Since then, it's been running for a bit over 26 hours without incident (we've essentially eliminated all the user load that is usually on the server, to see if it still fails under no load). So, while the patch looks like a strictly good thing, I don't think we've actually pegged and fixed the cause of this lockup. Note that we experienced similar lockups under 2.6.29 on multi-vcpu AMD guests, but instead of the backtraces mostly showing things line xen_clocksource_get_cycles, clocksource_read, and check_events, the prominent entry in the backtraces in 2.6.29 was getnstimeofday. Other than the kernel upgrade, the configuration of the machines I see this issue on has not changed, so I strongly suspect that the same trigger for this lockup exists in both cases, and the kernel version bump has affected the way in which it manifests and how the backtraces look. Perhaps these hints about what it looked like in two successive kernel versions will give Jeremy a better idea of what could be behind it.
What clocksource does Linux use on this system if you boot it natively (ie, without Xen?). Your symptoms are pointing to something like mis-synchronized tscs between sockets in some way which either Xen isn't detecting and compensating for, or not passing the info to the guests, or perhaps the guest is not using/misusing that info. However, I mentioned this problem to Jes Axboe last week and he said that the IO scheduler should not be sensitive to these kinds of timing upsets. Certainly not going into indefinite sulks. What IO scheduler are you using (CFQ by the looks of it)? What happens if you switch to no-op? (Boot with elevator=noop on the kernel command line.) (noop should be preferred for guests anyway, since they have no real insight into the underlying IO devices; dom0 is in a better position to do a good job.)
Based on my results, I'd say it is an interaction with the Dom0 timing or hardware. One of my identical servers is perfectly stable with the unpatched domU kernels. Both boxes selected 'jiffies' as dom0 current_clocksource. This is the RHEL 5.4 xen dom0 kernel. FWIW the monotone patch is working well on my F11 domU.. 60 hours uptime, but it has very light disk workload..
The host in question here, when booted into the native Ubuntu 8.04 2.6.24-25 kernel, has a clocksource0 of acpi_pm. Indeed, the guest had been using the CFQ scheduler. We switched it to the noop scheduler with no change in behavior - the guest hung with the same usual blocked task output after just over an hour under its usual load. We've now attempted adding clocksource=tsc to the host's kernel line, and are running the guest with the noop scheduler (and the F11 2.6.30.8-64 kernel with your monotonicity patch) to see if a change in host clocksource has any effect.
One other difference from the configuration that Kevin sees is that our dom0 reports that its clocksource0 is xen rather than jiffies.
The results of our further tests indicate that clocksource=tsc and elevator=noop did not change behavior; the guest hung about five hours after we had rebooted it with the characteristic backtraces. We have rebooted the kernel with the default IO scheduler (keeping the host with clocksource=tsc) but are not optimistic about the outcome.
Default IO scheduler and host with clocksource=tsc hung after 34 hours.
Crash and burn, but it took several days with monotone patch. [root@buildbox-a2 ~]# xm console gnucapplus awstats.pl[9839]: segfault at 7fff470cdff8 ip 00007fff47945005 sp 00007fff470ce000 error 6 in libperl.so[7fff47895000+196000] INFO: task pdflush:29 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. pdflush D ffffc20000000000 0 29 2 ffff88003eeb5a20 0000000000000246 ffff88003eeb5980 ffffffff8100ed0d ffff88003f824be8 00000000dcba5f53 ffff88003eeb59a0 ffffffff8107957a ffff88003ff160e8 000000000000e2e8 ffff88003ff160e8 0000000000012d00 Call Trace: [<ffffffff8100ed0d>] ? xen_clocksource_get_cycles+0x1c/0x40 [<ffffffff8107957a>] ? clocksource_read+0x22/0x38 [<ffffffff81074986>] ? ktime_get_ts+0x61/0x7d [<ffffffff810d300b>] ? sync_page+0x0/0x71 [<ffffffff81496c56>] schedule+0x21/0x49 [<ffffffff81496cc2>] io_schedule+0x44/0x6c [<ffffffff810d3064>] sync_page+0x59/0x71 [<ffffffff814972f4>] __wait_on_bit_lock+0x55/0xb2 [<ffffffff810d2f53>] __lock_page+0x73/0x8e [<ffffffff81070c5a>] ? wake_bit_function+0x0/0x5a [<ffffffff810db339>] write_cache_pages+0x21b/0x40c [<ffffffff810dab42>] ? __writepage+0x0/0x56 [<ffffffff81498d04>] ? trace_hardirqs_on_thunk+0x3a/0x3c [<ffffffff8100bc21>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e [<ffffffff8100e45c>] ? xen_force_evtchn_callback+0x20/0x36 [<ffffffff810db55f>] generic_writepages+0x35/0x4b [<ffffffff810db5b6>] do_writepages+0x41/0x60 [<ffffffff811346b8>] __writeback_single_inode+0x168/0x2a8 [<ffffffff8100e45c>] ? xen_force_evtchn_callback+0x20/0x36 [<ffffffff81134c2a>] generic_sync_sb_inodes+0x21a/0x361 [<ffffffff81134fe7>] writeback_inodes+0xb0/0x119 [<ffffffff810db7e8>] wb_kupdate+0xc0/0x14a [<ffffffff810dc8ae>] pdflush+0x16d/0x272 [<ffffffff810db728>] ? wb_kupdate+0x0/0x14a [<ffffffff810dc741>] ? pdflush+0x0/0x272 [<ffffffff81070709>] kthread+0x6d/0xae [<ffffffff8101313a>] child_rip+0xa/0x20 [<ffffffff81012afd>] ? restore_args+0x0/0x30 [<ffffffff81013130>] ? child_rip+0x0/0x20 INFO: task kjournald:61 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D ffff88003e81a3c0 0 61 2 ffff88003e949bc0 0000000000000246 ffff88003e949b20 ffffffff8100ed0d 0000000064706931 0000000064706931 ffff88003e949b40 ffffffff8107957a ffff88003e9c03a8 000000000000e2e8 ffff88003e9c03a8 0000000000012d00 Call Trace: [<ffffffff8100ed0d>] ? xen_clocksource_get_cycles+0x1c/0x40 [<ffffffff8107957a>] ? clocksource_read+0x22/0x38 [<ffffffff8100e45c>] ? xen_force_evtchn_callback+0x20/0x36 [<ffffffff81074986>] ? ktime_get_ts+0x61/0x7d [<ffffffff8113b0ee>] ? sync_buffer+0x0/0x6b [<ffffffff81496c56>] schedule+0x21/0x49 [<ffffffff81496cc2>] io_schedule+0x44/0x6c [<ffffffff8113b141>] sync_buffer+0x53/0x6b [<ffffffff8113b0ee>] ? sync_buffer+0x0/0x6b [<ffffffff81497444>] __wait_on_bit+0x57/0x9e [<ffffffff8100e45c>] ? xen_force_evtchn_callback+0x20/0x36 [<ffffffff81497508>] out_of_line_wait_on_bit+0x7d/0x9c [<ffffffff8113b0ee>] ? sync_buffer+0x0/0x6b [<ffffffff81070c5a>] ? wake_bit_function+0x0/0x5a [<ffffffff8113aff2>] __wait_on_buffer+0x37/0x4d [<ffffffff811b88b2>] wait_on_buffer+0x4a/0x65 [<ffffffff811b8e8d>] journal_commit_transaction+0x497/0xe42 [<ffffffff8100ee7f>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8149943e>] ? _spin_unlock_irqrestore+0x4e/0x64 [<ffffffff810632bc>] ? try_to_del_timer_sync+0x69/0x87 [<ffffffff811bcdf7>] kjournald+0xfd/0x253 [<ffffffff81070bfb>] ? autoremove_wake_function+0x0/0x5f [<ffffffff811bccfa>] ? kjournald+0x0/0x253 [<ffffffff81070709>] kthread+0x6d/0xae [<ffffffff8101313a>] child_rip+0xa/0x20 [<ffffffff81012afd>] ? restore_args+0x0/0x30 [<ffffffff81013130>] ? child_rip+0x0/0x20 INFO: task ns-slapd:1034 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ns-slapd D ffffc20000016000 0 1034 1 ffff88003d549c68 0000000000000286 ffff88003d549bc8 ffffffff8100ed0d ffffffff8113cdf2 00000000585e4218 ffff88003d549be8 ffffffff8107957a ffff88003e069af8 000000000000e2e8 ffff88003e069af8 0000000000012d00 Call Trace: [<ffffffff8100ed0d>] ? xen_clocksource_get_cycles+0x1c/0x40 [<ffffffff8113cdf2>] ? end_buffer_async_write+0x0/0x16c [<ffffffff8107957a>] ? clocksource_read+0x22/0x38 [<ffffffff81074986>] ? ktime_get_ts+0x61/0x7d [<ffffffff810d300b>] ? sync_page+0x0/0x71 [<ffffffff81496c56>] schedule+0x21/0x49 [<ffffffff81496cc2>] io_schedule+0x44/0xINFO: task pdflush:29 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. pdflush D ffffc20000000000 0 29 2 ffff88003eeb5a20 0000000000000246 ffff88003eeb5980 ffffffff8100ed0d ffff88003f824be8 00000000dcba5f53 ffff88003eeb59a0 ffffffff8107957a ffff88003ff160e8 000000000000e2e8 ffff88003ff160e8 0000000000012d00 Call Trace: [<ffffffff8100ed0d>] ? xen_clocksource_get_cycles+0x1c/0x40 [<ffffffff8107957a>] ? clocksource_read+0x22/0x38 [<ffffffff81074986>] ? ktime_get_ts+0x61/0x7d [<ffffffff810d300b>] ? sync_page+0x0/0x71 [<ffffffff81496c56>] schedule+0x21/0x49 [<ffffffff81496cc2>] io_schedule+0x44/0x6c [<ffffffff810d3064>] sync_page+0x59/0x71 [<ffffffff814972f4>] __wait_on_bit_lock+0x55/0xb2 [<ffffffff810d2f53>] __lock_page+0x73/0x8e [<ffffffff81070c5a>] ? wake_bit_function+0x0/0x5a [<ffffffff810db339>] write_cache_pages+0x21b/0x40c [<ffffffff810dab42>] ? __writepage+0x0/0x56 [<ffffffff81498d04>] ? trace_hardirqs_on_thunk+0x3a/0x3c [<ffffffff8100bc21>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e [<ffffffff8100e45c>] ? xen_force_evtchn_callback+0x20/0x36 [<ffffffff810db55f>] generic_writepages+0x35/0x4b [<ffffffff810db5b6>] do_writepages+0x41/0x60 [<ffffffff811346b8>] __writeback_single_inode+0x168/0x2a8 [<ffffffff8100e45c>] ? xen_force_evtchn_callback+0x20/0x36 [<ffffffff81134c2a>] generic_sync_sb_inodes+0x21a/0x361 [<ffffffff81134fe7>] writeback_inodes+0xb0/0x119 [<ffffffff810db7e8>] wb_kupdate+0xc0/0x14a [<ffffffff810dc8ae>] pdflush+0x16d/0x272 [<ffffffff810db728>] ? wb_kupdate+0x0/0x14a [<ffffffff810dc741>] ? pdflush+0x0/0x272 [<ffffffff81070709>] kthread+0x6d/0xae [<ffffffff8101313a>] child_rip+0xa/0x20 [<ffffffff81012afd>] ? restore_args+0x0/0x30 [<ffffffff81013130>] ? child_rip+0x0/0x20 INFO: task kjournald:61 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D ffff88003e81a3c0 0 61 2 ffff88003e949bc0 0000000000000246 ffff88003e949b20 ffffffff8100ed0d 0000000064706931 0000000064706931 ffff88003e949b40 ffffffff8107957a ffff88003e9c03a8 000000000000e2e8 ffff88003e9c03a8 0000000000012d00 Call Trace: [<ffffffff8100ed0d>] ? xen_clocksource_get_cycles+0x1c/0x40 [<ffffffff8107957a>] ? clocksource_read+0x22/0x38 [<ffffffff8100e45c>] ? xen_force_evtchn_callback+0x20/0x36 [<ffffffff81074986>] ? ktime_get_ts+0x61/0x7d [<ffffffff8113b0ee>] ? sync_buffer+0x0/0x6b [<ffffffff81496c56>] schedule+0x21/0x49 [<ffffffff81496cc2>] io_schedule+0x44/0x6c [<ffffffff8113b141>] sync_buffer+0x53/0x6b [<ffffffff8113b0ee>] ? sync_buffer+0x0/0x6b [<ffffffff81497444>] __wait_on_bit+0x57/0x9e [<ffffffff8100e45c>] ? xen_force_evtchn_callback+0x20/0x36 [<ffffffff81497508>] out_of_line_wait_on_bit+0x7d/0x9c [<ffffffff8113b0ee>] ? sync_buffer+0x0/0x6b [<ffffffff81070c5a>] ? wake_bit_function+0x0/0x5a [<ffffffff8113aff2>] __wait_on_buffer+0x37/0x4d [<ffffffff811b88b2>] wait_on_buffer+0x4a/0x65 [<ffffffff811b8e8d>] journal_commit_transaction+0x497/0xe42 [<ffffffff8100ee7f>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8149943e>] ? _spin_unlock_irqrestore+0x4e/0x64 [<ffffffff810632bc>] ? try_to_del_timer_sync+0x69/0x87 [<ffffffff811bcdf7>] kjournald+0xfd/0x253 [<ffffffff81070bfb>] ? autoremove_wake_function+0x0/0x5f [<ffffffff811bccfa>] ? kjournald+0x0/0x253 [<ffffffff81070709>] kthread+0x6d/0xae [<ffffffff8101313a>] child_rip+0xa/0x20 [<ffffffff81012afd>] ? restore_args+0x0/0x30 [<ffffffff81013130>] ? child_rip+0x0/0x20
Mitchell/Edward, Have you tried F12 yet? I am gearing up this week to see how it performs on these hosts. F11 with jiffies forced works well otherwise (24 days and counting).
Just saw this with F12 (kernel 2.6.31)!! Any ideas what is going on here? This is bringing down production VMs for me. Fedora release 12 (Constantine) Kernel 2.6.31.12-174.2.3.fc12.x86_64 on an x86_64 (/dev/hvc0) buildmaster login: type=1305 audit(1265446533.138:11726): audit_enabled=0 old=1 auid=4294967295 ses=4294967295 subj=kernel res=1 INFO: task kjournald:242 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D ffff88002c73fb80 0 242 2 0x00000000 ffff8800031d1c00 0000000000000246 000906996fc8e4d9 ffff88000315a318 0000000001311302 ffffffff8111c908 ffff8800031d1cb0 0000000000000002 ffff8800037861c8 000000000000f8f0 ffff8800037861c8 0000000000015600 Call Trace: [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff8106e8e3>] ? clocksource_read+0xf/0x11 [<ffffffff8106acf2>] ? ktime_get_ts+0x4e/0x53 [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff8141bf53>] io_schedule+0x31/0x42 [<ffffffff8111c948>] sync_buffer+0x40/0x44 [<ffffffff8141c465>] __wait_on_bit+0x48/0x7b [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141c506>] out_of_line_wait_on_bit+0x6e/0x79 [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff81067b70>] ? wake_bit_function+0x0/0x33 [<ffffffff8111c86e>] __wait_on_buffer+0x24/0x26 [<ffffffff81199272>] wait_on_buffer+0x3b/0x3f [<ffffffff81199c59>] journal_commit_transaction+0x92e/0xe55 [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff8105c19b>] ? try_to_del_timer_sync+0x5a/0x66 [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8119ce1f>] kjournald+0xeb/0x230 [<ffffffff81067b37>] ? autoremove_wake_function+0x0/0x39 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff8119cd34>] ? kjournald+0x0/0x230 [<ffffffff810677b5>] kthread+0x91/0x99 [<ffffffff81012daa>] child_rip+0xa/0x20 [<ffffffff81011f97>] ? int_ret_from_sys_call+0x7/0x1b [<ffffffff8101271d>] ? retint_restore_args+0x5/0x6 [<ffffffff81012da0>] ? child_rip+0x0/0x20 INFO: task java:1446 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. java D ffff88002c73fb80 0 1446 1 0x00000080 ffff88002d8cbb38 0000000000000282 00003ffffffff000 ffff88002d8c00d8 ffff88002f271188 00000000000000ab ffffffff8100e139 ffff88002d8cbb08 ffff88002d864a48 000000000000f8f0 ffff88002d864a48 0000000000015600 Call Trace: [<ffffffff8100e139>] ? xen_force_evtchn_callback+0xd/0xf [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff81198743>] do_get_write_access+0x1d4/0x3bc [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff81067b70>] ? wake_bit_function+0x0/0x33 [<ffffffff8119c095>] ? journal_add_journal_head+0xbf/0x13c [<ffffffff81198956>] journal_get_write_access+0x2b/0x40 [<ffffffff81164539>] __ext3_journal_get_write_access+0x25/0x51 [<ffffffff81157b4f>] ext3_reserve_inode_write+0x43/0x87 [<ffffffff8100e812>] ? check_events+0x12/0x20 [<ffffffff81157bd3>] ext3_mark_inode_dirty+0x40/0x5d [<ffffffff81157d32>] ext3_dirty_inode+0x6a/0x81 [<ffffffff811173d1>] __mark_inode_dirty+0x34/0xe3 [<ffffffff8110de79>] file_update_time+0xae/0xed [<ffffffff810d74a8>] do_wp_page+0x60c/0x640 [<ffffffff8100c3e5>] ? __raw_callee_save_xen_pud_val+0x11/0x1e [<ffffffff8100c3a9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e [<ffffffff810d8de8>] handle_mm_fault+0x690/0x705 [<ffffffff8100e139>] ? xen_force_evtchn_callback+0xd/0xf [<ffffffff8100e812>] ? check_events+0x12/0x20 [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff8141fba1>] do_page_fault+0x281/0x299 [<ffffffff8141db35>] page_fault+0x25/0x30 INFO: task kjournald:242 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D ffff88002c73fb80 0 242 2 0x00000000 ffff8800031d1c00 0000000000000246 000906996fc8e4d9 ffff88000315a318 0000000001311302 ffffffff8111c908 ffff8800031d1cb0 0000000000000002 ffff8800037861c8 000000000000f8f0 ffff8800037861c8 0000000000015600 Call Trace: [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff8106e8e3>] ? clocksource_read+0xf/0x11 [<ffffffff8106acf2>] ? ktime_get_ts+0x4e/0x53 [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff8141bf53>] io_schedule+0x31/0x42 [<ffffffff8111c948>] sync_buffer+0x40/0x44 [<ffffffff8141c465>] __wait_on_bit+0x48/0x7b [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141c506>] out_of_line_wait_on_bit+0x6e/0x79 [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff81067b70>] ? wake_bit_function+0x0/0x33 [<ffffffff8111c86e>] __wait_on_buffer+0x24/0x26 [<ffffffff81199272>] wait_on_buffer+0x3b/0x3f [<ffffffff81199c59>] journal_commit_transaction+0x92e/0xe55 [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff8105c19b>] ? try_to_del_timer_sync+0x5a/0x66 [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8119ce1f>] kjournald+0xeb/0x230 [<ffffffff81067b37>] ? autoremove_wake_function+0x0/0x39 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff8119cd34>] ? kjournald+0x0/0x230 [<ffffffff810677b5>] kthread+0x91/0x99 [<ffffffff81012daa>] child_rip+0xa/0x20 [<ffffffff81011f97>] ? int_ret_from_sys_call+0x7/0x1b [<ffffffff8101271d>] ? retint_restore_args+0x5/0x6 [<ffffffff81012da0>] ? child_rip+0x0/0x20 INFO: task java:1446 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. java D ffff88002c73fb80 0 1446 1 0x00000080 ffff88002d8cbb38 0000000000000282 00003ffffffff000 ffff88002d8c00d8 ffff88002f271188 00000000000000ab ffffffff8100e139 ffff88002d8cbb08 ffff88002d864a48 000000000000f8f0 ffff88002d864a48 0000000000015600 Call Trace: [<ffffffff8100e139>] ? xen_force_evtchn_callback+0xd/0xf [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff81198743>] do_get_write_access+0x1d4/0x3bc [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff81067b70>] ? wake_bit_function+0x0/0x33 [<ffffffff8119c095>] ? journal_add_journal_head+0xbf/0x13c [<ffffffff81198956>] journal_get_write_access+0x2b/0x40 [<ffffffff81164539>] __ext3_journal_get_write_access+0x25/0x51 [<ffffffff81157b4f>] ext3_reserve_inode_write+0x43/0x87 [<ffffffff8100e812>] ? check_events+0x12/0x20 [<ffffffff81157bd3>] ext3_mark_inode_dirty+0x40/0x5d [<ffffffff81157d32>] ext3_dirty_inode+0x6a/0x81 [<ffffffff811173d1>] __mark_inode_dirty+0x34/0xe3 [<ffffffff8110de79>] file_update_time+0xae/0xed [<ffffffff810d74a8>] do_wp_page+0x60c/0x640 [<ffffffff8100c3e5>] ? __raw_callee_save_xen_pud_val+0x11/0x1e [<ffffffff8100c3a9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e [<ffffffff810d8de8>] handle_mm_fault+0x690/0x705 [<ffffffff8100e139>] ? xen_force_evtchn_callback+0xd/0xf [<ffffffff8100e812>] ? check_events+0x12/0x20 [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff8141fba1>] do_page_fault+0x281/0x299 [<ffffffff8141db35>] page_fault+0x25/0x30 INFO: task kjournald:242 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D ffff88002c73fb80 0 242 2 0x00000000 ffff8800031d1c00 0000000000000246 000906996fc8e4d9 ffff88000315a318 0000000001311302 ffffffff8111c908 ffff8800031d1cb0 0000000000000002 ffff8800037861c8 000000000000f8f0 ffff8800037861c8 0000000000015600 Call Trace: [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff8106e8e3>] ? clocksource_read+0xf/0x11 [<ffffffff8106acf2>] ? ktime_get_ts+0x4e/0x53 [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff8141bf53>] io_schedule+0x31/0x42 [<ffffffff8111c948>] sync_buffer+0x40/0x44 [<ffffffff8141c465>] __wait_on_bit+0x48/0x7b [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141c506>] out_of_line_wait_on_bit+0x6e/0x79 [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff81067b70>] ? wake_bit_function+0x0/0x33 [<ffffffff8111c86e>] __wait_on_buffer+0x24/0x26 [<ffffffff81199272>] wait_on_buffer+0x3b/0x3f [<ffffffff81199c59>] journal_commit_transaction+0x92e/0xe55 [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff8105c19b>] ? try_to_del_timer_sync+0x5a/0x66 [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8119ce1f>] kjournald+0xeb/0x230 [<ffffffff81067b37>] ? autoremove_wake_function+0x0/0x39 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff8119cd34>] ? kjournald+0x0/0x230 [<ffffffff810677b5>] kthread+0x91/0x99 [<ffffffff81012daa>] child_rip+0xa/0x20 [<ffffffff81011f97>] ? int_ret_from_sys_call+0x7/0x1b [<ffffffff8101271d>] ? retint_restore_args+0x5/0x6 [<ffffffff81012da0>] ? child_rip+0x0/0x20 INFO: task rsyslogd:3571 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. rsyslogd D ffff880003797480 0 3571 1 0x00000080 ffff88001b0cbd28 0000000000000282 ffffffff8100e812 ffff88002d961f58 0000000001314f32 ffff88001b0cbdc8 0000000000000002 ffffffff810c154d ffff88000e0383c8 000000000000f8f0 ffff88000e0383c8 0000000000015600 Call Trace: [<ffffffff8100e812>] ? check_events+0x12/0x20 [<ffffffff810c154d>] ? sync_page+0x0/0x4a [<ffffffff8106e8e3>] ? clocksource_read+0xf/0x11 [<ffffffff8106acf2>] ? ktime_get_ts+0x4e/0x53 [<ffffffff810c154d>] ? sync_page+0x0/0x4a [<ffffffff8141bf53>] io_schedule+0x31/0x42 [<ffffffff810c1593>] sync_page+0x46/0x4a [<ffffffff8141c35b>] __wait_on_bit_lock+0x46/0x8f [<ffffffff810e4381>] ? read_swap_cache_async+0x41/0x121 [<ffffffff810c14f5>] __lock_page+0x66/0x6d [<ffffffff81067b70>] ? wake_bit_function+0x0/0x33 [<ffffffff810d681d>] lock_page+0x3a/0x3e [<ffffffff810d8b77>] handle_mm_fault+0x41f/0x705 [<ffffffff8100e139>] ? xen_force_evtchn_callback+0xd/0xf [<ffffffff8100e812>] ? check_events+0x12/0x20 [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141fba1>] do_page_fault+0x281/0x299 [<ffffffff8141db35>] page_fault+0x25/0x30 INFO: task java:1446 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. java D ffff88002c73fb80 0 1446 1 0x00000080 ffff88002d8cbb38 0000000000000282 00003ffffffff000 ffff88002d8c00d8 ffff88002f271188 00000000000000ab ffffffff8100e139 ffff88002d8cbb08 ffff88002d864a48 000000000000f8f0 ffff88002d864a48 0000000000015600 Call Trace: [<ffffffff8100e139>] ? xen_force_evtchn_callback+0xd/0xf [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff81198743>] do_get_write_access+0x1d4/0x3bc [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff81067b70>] ? wake_bit_function+0x0/0x33 [<ffffffff8119c095>] ? journal_add_journal_head+0xbf/0x13c [<ffffffff81198956>] journal_get_write_access+0x2b/0x40 [<ffffffff81164539>] __ext3_journal_get_write_access+0x25/0x51 [<ffffffff81157b4f>] ext3_reserve_inode_write+0x43/0x87 [<ffffffff8100e812>] ? check_events+0x12/0x20 [<ffffffff81157bd3>] ext3_mark_inode_dirty+0x40/0x5d [<ffffffff81157d32>] ext3_dirty_inode+0x6a/0x81 [<ffffffff811173d1>] __mark_inode_dirty+0x34/0xe3 [<ffffffff8110de79>] file_update_time+0xae/0xed [<ffffffff810d74a8>] do_wp_page+0x60c/0x640 [<ffffffff8100c3e5>] ? __raw_callee_save_xen_pud_val+0x11/0x1e [<ffffffff8100c3a9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e [<ffffffff810d8de8>] handle_mm_fault+0x690/0x705 [<ffffffff8100e139>] ? xen_force_evtchn_callback+0xd/0xf [<ffffffff8100e812>] ? check_events+0x12/0x20 [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff8141fba1>] do_page_fault+0x281/0x299 [<ffffffff8141db35>] page_fault+0x25/0x30 INFO: task kjournald:242 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D ffff88002c73fb80 0 242 2 0x00000000 ffff8800031d1c00 0000000000000246 000906996fc8e4d9 ffff88000315a318 0000000001311302 ffffffff8111c908 ffff8800031d1cb0 0000000000000002 ffff8800037861c8 000000000000f8f0 ffff8800037861c8 0000000000015600 Call Trace: [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff8106e8e3>] ? clocksource_read+0xf/0x11 [<ffffffff8106acf2>] ? ktime_get_ts+0x4e/0x53 [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff8141bf53>] io_schedule+0x31/0x42 [<ffffffff8111c948>] sync_buffer+0x40/0x44 [<ffffffff8141c465>] __wait_on_bit+0x48/0x7b [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141c506>] out_of_line_wait_on_bit+0x6e/0x79 [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff81067b70>] ? wake_bit_function+0x0/0x33 [<ffffffff8111c86e>] __wait_on_buffer+0x24/0x26 [<ffffffff81199272>] wait_on_buffer+0x3b/0x3f [<ffffffff81199c59>] journal_commit_transaction+0x92e/0xe55 [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff8105c19b>] ? try_to_del_timer_sync+0x5a/0x66 [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8119ce1f>] kjournald+0xeb/0x230 [<ffffffff81067b37>] ? autoremove_wake_function+0x0/0x39 [<ffffffff8141d6e9>] ? _spin_unlock_irqrestore+0x19/0x1b [<ffffffff8119cd34>] ? kjournald+0x0/0x230 [<ffffffff810677b5>] kthread+0x91/0x99 [<ffffffff81012daa>] child_rip+0xa/0x20 [<ffffffff81011f97>] ? int_ret_from_sys_call+0x7/0x1b [<ffffffff8101271d>] ? retint_restore_args+0x5/0x6 [<ffffffff81012da0>] ? child_rip+0x0/0x20 INFO: task rsyslogd:3571 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. rsyslogd D ffff880003797480 0 3571 1 0x00000080 ffff88001b0cbd28 0000000000000282 ffffffff8100e812 ffff88002d961f58 0000000001314f32 ffff88001b0cbdc8 0000000000000002 ffffffff810c154d ffff88000e0383c8 000000000000f8f0 ffff88000e0383c8 0000000000015600 Call Trace: [<ffffffff8100e812>] ? check_events+0x12/0x20 [<ffffffff810c154d>] ? sync_page+0x0/0x4a [<ffffffff8106e8e3>] ? clocksource_read+0xf/0x11 [<ffffffff8106acf2>] ? ktime_get_ts+0x4e/0x53 [<ffffffff810c154d>] ? sync_page+0x0/0x4a [<ffffffff8141bf53>] io_schedule+0x31/0x42 [<ffffffff810c1593>] sync_page+0x46/0x4a [<ffffffff8141c35b>] __wait_on_bit_lock+0x46/0x8f [<ffffffff810e4381>] ? read_swap_cache_async+0x41/0x121 [<ffffffff810c14f5>] __lock_page+0x66/0x6d [<ffffffff81067b70>] ? wake_bit_function+0x0/0x33 [<ffffffff810d681d>] lock_page+0x3a/0x3e [<ffffffff810d8b77>] handle_mm_fault+0x41f/0x705 [<ffffffff8100e139>] ? xen_force_evtchn_callback+0xd/0xf [<ffffffff8100e812>] ? check_events+0x12/0x20 [<ffffffff8100e7ff>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff8141fba1>] do_page_fault+0x281/0x299 [<ffffffff8141db35>] page_fault+0x25/0x30 INFO: task ntpd:1005 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ntpd D ffff88002c73c380 0 1005 1 0x00000080 ffff88002d4aba18 0000000000000282 ffff88002d4abc38 ffff88002d4f5c48 0000000001317516 ffff88002d4abac8 0000000000000002 ffffffff8111c908 ffff8800030e1b48 000000000000f8f0 ffff8800030e1b48 0000000000015600 Call Trace: [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff8106e8e3>] ? clocksource_read+0xf/0x11 [<ffffffff8106acf2>] ? ktime_get_ts+0x4e/0x53 [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff8141bf53>] io_schedule+0x31/0x42 [<ffffffff8111c948>] sync_buffer+0x40/0x44 [<ffffffff8141c35b>] __wait_on_bit_lock+0x46/0x8f [<ffffffff8141c412>] out_of_line_wait_on_bit_lock+0x6e/0x79 [<ffffffff8111c908>] ? sync_buffer+0x0/0x44 [<ffffffff81067b70>] ? wake_bit_function+0x0/0x33 [<ffffffff8111cac8>] __lock_buffer+0x2a/0x2c [<ffffffff81197bd3>] lock_buffer+0x3a/0x3f [<ffffffff811985e2>] do_get_write_access+0x73/0x3bc [<ffffffff8119c01a>] ? journal_add_journal_head+0x44/0x13c [<ffffffff81197848>] ? __journal_file_buffer+0xcc/0x177 [<ffffffff8119c095>] ? journal_add_journal_head+0xbf/0x13c [<ffffffff81198956>] journal_get_write_access+0x2b/0x40 [<ffffffff81164539>] __ext3_journal_get_write_access+0x25/0x51 [<ffffffff8115664f>] ext3_new_inode+0x608/0x9da [<ffffffff811990bf>] ? journal_start+0x65/0xe1 [<ffffffff810f36eb>] ? kmem_cache_alloc+0xa7/0x11c [<ffffffff8115d534>] ext3_create+0xa0/0x126 [<ffffffff81105f20>] vfs_create+0x78/0x9a [<ffffffff81106e02>] do_filp_open+0x314/0x967 [<ffffffff811074e0>] ? getname+0x31/0x1c3 [<ffffffff810401ab>] ? __might_sleep+0x27/0xe8 [<ffffffff81204c64>] ? might_fault+0x1f/0x21 [<ffffffff81204d63>] ? __strncpy_from_user+0x1e/0x49 [<ffffffff8111045f>] ? alloc_fd+0x7b/0x124 [<ffffffff810fae02>] do_sys_open+0x63/0x10f [<ffffffff810faee1>] sys_open+0x20/0x22 [<ffffffff81011cf2>] system_call_fastpath+0x16/0x1b
boot dmesg from this system: [root@buildmaster ~]# dmesg Initializing cgroup subsys cpuset Initializing cgroup subsys cpu Linux version 2.6.31.12-174.2.3.fc12.x86_64 (mockbuild.fedoraproject.org) (gcc version 4.4.2 20091222 (Red Hat 4.4.2-20) (GCC) ) #1 SMP Mon Jan 18 19:52:07 UTC 2010 Command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=us KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD Centaur CentaurHauls ACPI in unprivileged domain disabled BIOS-provided physical RAM map: Xen: 0000000000000000 - 00000000000a0000 (usable) Xen: 00000000000a0000 - 0000000000100000 (reserved) Xen: 0000000000100000 - 0000000030000000 (usable) DMI not present or invalid. last_pfn = 0x30000 max_arch_pfn = 0x400000000 initial memory mapped : 0 - 20000000 init_memory_mapping: 0000000000000000-0000000030000000 0000000000 - 0030000000 page 4k kernel direct mapping tables up to 30000000 @ 100000-282000 RAMDISK: 019d1000 - 03921000 No NUMA configuration found Faking a node at 0000000000000000-0000000030000000 Bootmem setup node 0 0000000000000000-0000000030000000 NODE_DATA [0000000000008000 - 000000000001dfff] bootmap [000000000001e000 - 0000000000023fff] pages 6 (7 early reservations) ==> bootmem [0000000000 - 0030000000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000] #1 [0003aa4000 - 0003ac5000] XEN PAGETABLES ==> [0003aa4000 - 0003ac5000] #2 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000] #3 [0001000000 - 00019b0b40] TEXT DATA BSS ==> [0001000000 - 00019b0b40] #4 [00019d1000 - 0003921000] RAMDISK ==> [00019d1000 - 0003921000] #5 [0003921000 - 0003aa4000] XEN START INFO ==> [0003921000 - 0003aa4000] #6 [0000100000 - 000025e000] PGTABLE ==> [0000100000 - 000025e000] Zone PFN ranges: DMA 0x00000000 -> 0x00001000 DMA32 0x00001000 -> 0x00100000 Normal 0x00100000 -> 0x00100000 Movable zone start PFN for each node early_node_map[2] active PFN ranges 0: 0x00000000 -> 0x000000a0 0: 0x00000100 -> 0x00030000 On node 0 totalpages: 196512 DMA zone: 56 pages used for memmap DMA zone: 353 pages reserved DMA zone: 3591 pages, LIFO batch:0 DMA32 zone: 2632 pages used for memmap DMA32 zone: 189880 pages, LIFO batch:31 SMP: Allowing 2 CPUs, 0 hotplug CPUs No local APIC present APIC: disable apic facility nr_irqs_gsi: 16 PM: Registered nosave memory: 00000000000a0000 - 0000000000100000 Allocating PCI resources starting at 30000000 (gap: 30000000:d0000000) NR_CPUS:256 nr_cpumask_bits:256 nr_cpu_ids:2 nr_node_ids:1 PERCPU: Allocated 22 4k pages, static data 90080 bytes trying to map vcpu_info 0 at ffffc9000000b020, mfn 1ec2b9, offset 32 cpu 0 using vcpu_info at ffffc9000000b020 trying to map vcpu_info 1 at ffffc90000023020, mfn 1ec2a3, offset 32 cpu 1 using vcpu_info at ffffc90000023020 Xen: using vcpu_info placement Built 1 zonelists in Node order, mobility grouping on. Total pages: 193471 Policy zone: DMA32 Kernel command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=us PID hash table entries: 4096 (order: 12, 32768 bytes) Initializing CPU#0 Checking aperture... No AGP bridge found Calgary: detecting Calgary via BIOS EBDA area Calgary: Unable to locate Rio Grande table in EBDA - bailing! Memory: 729800k/786432k available (4238k kernel code, 384k absent, 56248k reserved, 2924k data, 1328k init) SLUB: Genslabs=14, HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1 Hierarchical RCU implementation. NR_IRQS:4352 nr_irqs:288 Detected 1991.509 MHz processor. Console: colour dummy device 80x25 console [tty0] enabled console [hvc0] enabled allocated 7864320 bytes of page_cgroup please try 'cgroup_disable=memory' option if you don't want memory cgroups Xen: using vcpuop timer interface installing Xen timer for CPU 0 Calibrating delay loop (skipped), value calculated using timer frequency.. 3983.01 BogoMIPS (lpj=1991509) Security Framework initialized SELinux: Initializing. SELinux: Starting in permissive mode Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes) Inode-cache hash table entries: 65536 (order: 7, 524288 bytes) Mount-cache hash table entries: 256 Initializing cgroup subsys ns Initializing cgroup subsys cpuacct Initializing cgroup subsys memory Initializing cgroup subsys devices Initializing cgroup subsys freezer Initializing cgroup subsys net_cls CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 1024K (64 bytes/line) CPU 0/0x1 -> Node 0 tseg: 0000000000 Performance Counters: AMD PMU driver. ------------[ cut here ]------------ WARNING: at arch/x86/kernel/apic/apic.c:247 native_apic_write_dummy+0x35/0x41() (Not tainted) Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.31.12-174.2.3.fc12.x86_64 #1 Call Trace: [<ffffffff81051710>] warn_slowpath_common+0x84/0x9c [<ffffffff8105173c>] warn_slowpath_null+0x14/0x16 [<ffffffff81026175>] native_apic_write_dummy+0x35/0x41 [<ffffffff8101cdaa>] perf_counters_lapic_init+0x33/0x35 [<ffffffff8171ebc0>] init_hw_perf_counters+0x300/0x39d [<ffffffff8171e77b>] identify_boot_cpu+0x3c/0x3e [<ffffffff8171e89c>] check_bugs+0x9/0x2d [<ffffffff81716db8>] start_kernel+0x3e0/0x3fa [<ffffffff817162a1>] x86_64_start_reservations+0xac/0xb0 [<ffffffff81719cf1>] xen_start_kernel+0x4e4/0x4e8 ---[ end trace a7919e7f17c0a725 ]--- ... version: 0 ... bit width: 48 ... generic counters: 4 ... value mask: 0000ffffffffffff ... max period: 00007fffffffffff ... fixed-purpose counters: 0 ... counter mask: 000000000000000f SMP alternatives: switching to UP code ftrace: converting mcount calls to 0f 1f 44 00 00 ftrace: allocating 20894 entries in 82 pages installing Xen timer for CPU 1 SMP alternatives: switching to SMP code Initializing CPU#1 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 1024K (64 bytes/line) CPU 1/0x1 -> Node 0 Brought up 2 CPUs sizeof(vma)=176 bytes sizeof(page)=56 bytes sizeof(inode)=584 bytes sizeof(dentry)=192 bytes sizeof(ext3inode)=776 bytes sizeof(buffer_head)=104 bytes sizeof(skbuff)=248 bytes sizeof(task_struct)=6008 bytes CPU0 attaching sched-domain: domain 0: span 0-1 level CPU groups: 0 1 CPU1 attaching sched-domain: domain 0: span 0-1 level CPU groups: 1 0 Booting paravirtualized kernel on Xen Xen version: 3.4.1 (preserve-AD) Grant table initialized regulator: core version 0.5 Time: 165:165:165 Date: 165/165/65 NET: Registered protocol family 16 PCI: Fatal: No config space access function found bio: create slab <bio-0> at 0 ACPI: Interpreter disabled. xen_balloon: Initialising balloon driver. vgaarb: loaded SCSI subsystem initialized libata version 3.00 loaded. usbcore: registered new interface driver usbfs usbcore: registered new interface driver hub usbcore: registered new device driver usb PCI: System does not support PCI PCI: System does not support PCI NetLabel: Initializing NetLabel: domain hash size = 128 NetLabel: protocols = UNLABELED CIPSOv4 NetLabel: unlabeled traffic allowed by default Switched to high resolution mode on CPU 0 pnp: PnP ACPI: disabled NET: Registered protocol family 2 IP route cache hash table entries: 32768 (order: 6, 262144 bytes) TCP established hash table entries: 131072 (order: 9, 2097152 bytes) Switched to high resolution mode on CPU 1 TCP bind hash table entries: 65536 (order: 8, 1048576 bytes) TCP: Hash tables configured (established 131072 bind 65536) TCP reno registered NET: Registered protocol family 1 Trying to unpack rootfs image as initramfs... Freeing initrd memory: 32064k freed platform rtc_cmos: registered platform RTC device (no PNP device found) audit: initializing netlink socket (disabled) type=2000 audit(1265674112.410:1): initialized HugeTLB registered 2 MB page size, pre-allocated 0 pages VFS: Disk quotas dquot_6.5.2 Dquot-cache hash table entries: 512 (order 0, 4096 bytes) msgmni has been set to 1598 SELinux: Registering netfilter hooks alg: No test for stdrng (krng) Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252) io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered (default) pci_hotplug: PCI Hot Plug PCI Core version: 0.5 pciehp: PCI Express Hot Plug Controller Driver version: 0.4 acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5 Non-volatile memory driver v1.3 Linux agpgart interface v0.103 Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled brd: module loaded loop: module loaded input: Macintosh mouse button emulation as /devices/virtual/input/input0 Fixed MDIO Bus: probed ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver uhci_hcd: USB Universal Host Controller Interface driver PNP: No PS/2 controller found. Probing ports directly. mice: PS/2 mouse device common for all mice rtc_cmos: probe of rtc_cmos failed with error -16 device-mapper: uevent: version 1.0.3 device-mapper: ioctl: 4.15.0-ioctl (2009-04-01) initialised: dm-devel cpuidle: using governor ladder cpuidle: using governor menu usbcore: registered new interface driver hiddev usbcore: registered new interface driver usbhid usbhid: v2.6:USB HID core driver nf_conntrack version 0.5.0 (6144 buckets, 24576 max) CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Please use nf_conntrack.acct=1 kernel parameter, acct=1 nf_conntrack module option or sysctl net.netfilter.nf_conntrack_acct=1 to enable it. ip_tables: (C) 2000-2006 Netfilter Core Team TCP cubic registered Initializing XFRM netlink socket NET: Registered protocol family 17 PM: Resume from disk failed. registered taskstats version 1 No TPM chip found, activating TPM-bypass! XENBUS: Device with no driver: device/console/0 XENBUS: Device with no driver: device/vbd/51712 XENBUS: Device with no driver: device/vif/0 Magic number: 1:252:3141 drivers/rtc/hctosys.c: unable to open rtc device (rtc0) Initalizing network drop monitor service Freeing unused kernel memory: 1328k freed Write protecting the kernel read-only data: 6356k dracut: dracut-004-4.fc12 udev: starting version 145 dracut: Starting plymouth daemon xvda: xvda1 xvda2 dracut: Scanning devices xvda2 for LVM volume groups dracut: Reading all physical volumes. This may take a while... dracut: Found volume group "VolGroup00" using metadata type lvm2 dracut: 2 logical volume(s) in volume group "VolGroup00" now active EXT3-fs: INFO: recovery required on readonly filesystem. EXT3-fs: write access will be enabled during recovery. kjournald starting. Commit interval 5 seconds EXT3-fs: dm-0: orphan cleanup on readonly fs ext3_orphan_cleanup: deleting unreferenced inode 1289059 ext3_orphan_cleanup: deleting unreferenced inode 1736760 ext3_orphan_cleanup: deleting unreferenced inode 1289563 ext3_orphan_cleanup: deleting unreferenced inode 1290658 ext3_orphan_cleanup: deleting unreferenced inode 1286573 ext3_orphan_cleanup: deleting unreferenced inode 1736944 ext3_orphan_cleanup: deleting unreferenced inode 1289670 ext3_orphan_cleanup: deleting unreferenced inode 1657226 ext3_orphan_cleanup: deleting unreferenced inode 1657225 ext3_orphan_cleanup: deleting unreferenced inode 1657224 ext3_orphan_cleanup: deleting unreferenced inode 1657223 ext3_orphan_cleanup: deleting unreferenced inode 1657222 ext3_orphan_cleanup: deleting unreferenced inode 1657221 ext3_orphan_cleanup: deleting unreferenced inode 1657220 ext3_orphan_cleanup: deleting unreferenced inode 1657219 ext3_orphan_cleanup: deleting unreferenced inode 1657218 ext3_orphan_cleanup: deleting unreferenced inode 1657217 ext3_orphan_cleanup: deleting unreferenced inode 1657216 ext3_orphan_cleanup: deleting unreferenced inode 1736792 ext3_orphan_cleanup: deleting unreferenced inode 1290604 ext3_orphan_cleanup: deleting unreferenced inode 1288243 ext3_orphan_cleanup: deleting unreferenced inode 1736926 ext3_orphan_cleanup: deleting unreferenced inode 1287216 ext3_orphan_cleanup: deleting unreferenced inode 1290092 ext3_orphan_cleanup: deleting unreferenced inode 1287519 ext3_orphan_cleanup: deleting unreferenced inode 1287269 ext3_orphan_cleanup: deleting unreferenced inode 1736791 ext3_orphan_cleanup: deleting unreferenced inode 1289235 ext3_orphan_cleanup: deleting unreferenced inode 1393164 ext3_orphan_cleanup: deleting unreferenced inode 1287902 ext3_orphan_cleanup: deleting unreferenced inode 1736775 ext3_orphan_cleanup: deleting unreferenced inode 1507843 ext3_orphan_cleanup: deleting unreferenced inode 1507366 ext3_orphan_cleanup: deleting unreferenced inode 1507841 ext3_orphan_cleanup: deleting unreferenced inode 1384449 ext3_orphan_cleanup: deleting unreferenced inode 1507837 ext3_orphan_cleanup: deleting unreferenced inode 1507836 ext3_orphan_cleanup: deleting unreferenced inode 1507835 ext3_orphan_cleanup: deleting unreferenced inode 1507331 ext3_orphan_cleanup: deleting unreferenced inode 1507833 ext3_orphan_cleanup: deleting unreferenced inode 1507618 ext3_orphan_cleanup: deleting unreferenced inode 1507410 ext3_orphan_cleanup: deleting unreferenced inode 1507831 ext3_orphan_cleanup: deleting unreferenced inode 1287648 ext3_orphan_cleanup: deleting unreferenced inode 1507830 ext3_orphan_cleanup: deleting unreferenced inode 1507829 ext3_orphan_cleanup: deleting unreferenced inode 1287398 ext3_orphan_cleanup: deleting unreferenced inode 1507825 ext3_orphan_cleanup: deleting unreferenced inode 1287301 ext3_orphan_cleanup: deleting unreferenced inode 1287248 ext3_orphan_cleanup: deleting unreferenced inode 1287220 ext3_orphan_cleanup: deleting unreferenced inode 1286853 ext3_orphan_cleanup: deleting unreferenced inode 1287219 ext3_orphan_cleanup: deleting unreferenced inode 1507726 ext3_orphan_cleanup: deleting unreferenced inode 1507676 ext3_orphan_cleanup: deleting unreferenced inode 1507335 ext3_orphan_cleanup: deleting unreferenced inode 1287100 ext3_orphan_cleanup: deleting unreferenced inode 1287012 ext3_orphan_cleanup: deleting unreferenced inode 1286960 ext3_orphan_cleanup: deleting unreferenced inode 1507666 ext3_orphan_cleanup: deleting unreferenced inode 1507658 ext3_orphan_cleanup: deleting unreferenced inode 1507657 ext3_orphan_cleanup: deleting unreferenced inode 1507656 ext3_orphan_cleanup: deleting unreferenced inode 1507652 ext3_orphan_cleanup: deleting unreferenced inode 1507644 ext3_orphan_cleanup: deleting unreferenced inode 1507425 ext3_orphan_cleanup: deleting unreferenced inode 1286865 ext3_orphan_cleanup: deleting unreferenced inode 1286813 ext3_orphan_cleanup: deleting unreferenced inode 1286738 ext3_orphan_cleanup: deleting unreferenced inode 1507582 ext3_orphan_cleanup: deleting unreferenced inode 1507545 ext3_orphan_cleanup: deleting unreferenced inode 1507536 ext3_orphan_cleanup: deleting unreferenced inode 1507461 ext3_orphan_cleanup: deleting unreferenced inode 1507441 ext3_orphan_cleanup: deleting unreferenced inode 1286653 ext3_orphan_cleanup: deleting unreferenced inode 1507430 ext3_orphan_cleanup: deleting unreferenced inode 1507414 ext3_orphan_cleanup: deleting unreferenced inode 1507384 ext3_orphan_cleanup: deleting unreferenced inode 1507379 ext3_orphan_cleanup: deleting unreferenced inode 1507367 ext3_orphan_cleanup: deleting unreferenced inode 1507350 ext3_orphan_cleanup: deleting unreferenced inode 1507342 ext3_orphan_cleanup: deleting unreferenced inode 1507341 ext3_orphan_cleanup: deleting unreferenced inode 1507337 EXT3-fs: dm-0: 84 orphan inodes deleted EXT3-fs: recovery complete. EXT3-fs: mounted filesystem with ordered data mode. dracut: Mounted root filesystem /dev/mapper/VolGroup00-LogVol00 dracut: Switching root udev: starting version 145 Initialising Xen virtual ethernet driver. device-mapper: multipath: version 1.1.0 loaded EXT3 FS on dm-0, internal journal kjournald starting. Commit interval 5 seconds EXT3 FS on xvda1, internal journal EXT3-fs: mounted filesystem with ordered data mode. Adding 1671160k swap on /dev/mapper/VolGroup00-LogVol01. Priority:-1 extents:1 across:1671160k NET: Registered protocol family 10 lo: Disabled Privacy Extensions RPC: Registered udp transport module. RPC: Registered tcp transport module. eth0: no IPv6 routers present
Is clocksource=jiffies help in this case? Similar Xen systems that might be having these problems are bug #551552 and bug #550724 But I'm only guessing.
(In reply to comment #19) > Is clocksource=jiffies help in this case? > > Similar Xen systems that might be having these problems are bug #551552 and bug > #550724 > > But I'm only guessing. It appears F12/Kernel 2.6.31 have no option for jiffies, just xen clocksource. bug #551552 looks like a dupe of this, the other one could be different but related.
just installed kernel-2.6.30.10-105.2.23.fc11.x86_64 on my F12 domU system similar to your setup (see #550724) and set clocksource=jiffies Too early to tell if this is more stable, but the clock gains time very quickly. ntpd does not seem to be keeping it stable. Perhaps I have ntpd setup wrong.
Well that didn't last long. 2.6.30.10-105.2.23 crashed pretty quickly. So 2.6.30 with clocksource=jiffies was not stable for me.
Neen quite here for awhile, are you still seeing this problem? Could you also post the dmesg of your dom0?
I suspect that 2.6.32.23-170.fc12.x86_64 fixes this problem. See bug 636534 and bug 550724
This message is a reminder that Fedora 12 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 12. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '12'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 12's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 12 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.