Bug 709856
Summary: | Kernel trace on m2.4xlarge or m2.2xlarge instances in EC2 | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Jay Greguske <jgreguske> | |
Component: | kernel | Assignee: | Andrew Jones <drjones> | |
Status: | CLOSED ERRATA | QA Contact: | Red Hat Kernel QE team <kernel-qe> | |
Severity: | urgent | Docs Contact: | ||
Priority: | urgent | |||
Version: | 6.1 | CC: | behoward, bsarathy, clalance, cmorgan, dhoward, drjones, fhrbata, kzhang, lersek, pbonzini, qwan, sforsber, sghosh, tburke, whayutin, yugzhang | |
Target Milestone: | rc | Keywords: | EC2, ZStream | |
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | kernel-2.6.32-156.el6 | Doc Type: | Bug Fix | |
Doc Text: |
Xen guests cannot make use of every CPU feature, and advertising some of them to a guest is risky. One such feature is CONSTANT_TSC. It prevents the TSC (Time Stamp Counter) from being marked as unstable, which in turn allows the sched_clock_stable option to be enabled. Enabling sched_clock_stable is problematic for Xen PV guests because sched_clock() is overridden there by xen_sched_clock(), which is not synchronized between virtual CPUs. This update fixes the issue with a patch that sets all x86_power features to 0, which also guards against other potentially dangerous assumptions the kernel could make based on those features.
|
Story Points: | --- | |
Clone Of: | ||||
: | 711317 711322 (view as bug list) | Environment: | ||
Last Closed: | 2011-12-06 13:08:46 UTC | Type: | --- | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 653816, 710609, 711317, 711322 | |||
Attachments: |
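The Doc Text above describes the fix as setting all x86_power features to 0. The following is a minimal sketch of that idea, not the actual kernel patch. It assumes, as in the mainline x86 code, that x86_power is read from EDX of CPUID leaf 0x80000007 and that bit 8 of that value (invariant TSC) is what leads to CONSTANT_TSC being set. The function names and the choice to apply the mask while filtering CPUID results are hypothetical, chosen for illustration only.

```python
# Illustrative model only: these names are hypothetical, not kernel code.
POWER_LEAF = 0x80000007      # CPUID leaf whose EDX holds the power-management flags
INVARIANT_TSC_BIT = 1 << 8   # EDX bit 8: invariant TSC


def filter_cpuid_for_pv_guest(leaf, eax, ebx, ecx, edx):
    """Return the register values a PV guest would see for one CPUID leaf."""
    if leaf == POWER_LEAF:
        # Hide every power feature, not just the invariant-TSC bit.
        edx = 0
    return eax, ebx, ecx, edx


def derives_constant_tsc(x86_power):
    """Model of the check that turns x86_power bit 8 into CONSTANT_TSC."""
    return bool(x86_power & INVARIANT_TSC_BIT)


if __name__ == "__main__":
    host_edx = 0x100  # example host value with only the invariant-TSC bit set
    _, _, _, guest_edx = filter_cpuid_for_pv_guest(POWER_LEAF, 0, 0, 0, host_edx)
    print(derives_constant_tsc(host_edx))   # True
    print(derives_constant_tsc(guest_edx))  # False
```

With the whole register cleared, the guest can no longer derive CONSTANT_TSC, so the TSC can still be marked unstable and sched_clock_stable is not enabled on top of xen_sched_clock().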
Description
Jay Greguske
2011-06-01 19:17:26 UTC
Ben, the RH kernel folks are asking for all of the output in addition to the kernel trace.
i-bb12a1d5 2011-06-01T20:24:40+0000 Xen Minimal OS! start_info: 0x4dec000(VA) nr_pages: 0x88b800 shared_inf: 0xbf58d000(MA) pt_base: 0x4def000(VA) nr_pt_frames: 0x2b mfn_list: 0x990000(VA) mod_start: 0x0(VA) mod_len: 0 flags: 0x0 cmd_line: root=/dev/sda1 ro 4 stack: 0x94f860-0x96f860 MM: Init _text: 0x0(VA) _etext: 0x5ff6d(VA) _erodata: 0x78000(VA) _edata: 0x80b00(VA) stack start: 0x94f860(VA) _end: 0x98fe68(VA) start_pfn: 4e1d max_pfn: 88b800 Mapping memory range 0x5000000 - 0x88b800000 setting 0x0-0x78000 readonly skipped 0x1000 MM: Initialise page allocator for 9273000(9273000)-88b800000(88b800000) MM: done Demand map pfns at 88b801000-288b801000. Heap resides at 288b802000-488b802000. Initialising timer interface Initialising console ... done. gnttab_table mapped at 0x88b801000. Initialising scheduler Thread "Idle": pointer: 0x288b802010, stack: 0xa010000 Initialising xenbus Thread "xenstore": pointer: 0x288b8027c0, stack: 0xa020000 Dummy main: start_info=0x96f960 Thread "main": pointer: 0x288b802f70, stack: 0xa030000 "main" "root=/dev/sda1" "ro" "4" vbd 2048 is hd0 ******************* BLKFRONT for device/vbd/2048 ********** backend at /local/domain/0/backend/vbd/62/2048 Failed to read /local/domain/0/backend/vbd/62/2048/feature-barrier. Failed to read /local/domain/0/backend/vbd/62/2048/feature-flush-cache. 
Booting 'RHEL6.1-20110510.1-Server-x86_64-starter-ec2 (2.6.32-131.0.15.el6.x8 6_64)' root (hd0) Filesystem type is ext2fs, using whole disk kernel /boot/vmlinuz-2.6.32-131.0.15.el6.x86_64 ro root=LABEL=_/ initrd /boot/initramfs-2.6.32-131.0.15.el6.x86_64.img close blk: backend at /local/domain/0/backend/vbd/62/2048 close blk: backend at /local/domain/0/backend/vbd/62/2128 Initializing cgroup subsys cpuset Initializing cgroup subsys cpu Linux version 2.6.32-131.0.15.el6.x86_64 (mockbuild.bos.redhat.com) (gcc version 4.4.4 20100726 (Red Hat 4.4.4-13) (GCC) ) #1 SMP Tue May 10 15:42:40 EDT 2011 Command line: ro root=LABEL=_/ KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD Centaur CentaurHauls ACPI in unprivileged domain disabled BIOS-provided physical RAM map: Xen: 0000000000000000 - 00000000000a0000 (usable) Xen: 00000000000a0000 - 0000000000100000 (reserved) Xen: 0000000000100000 - 0000000800000000 (usable) DMI not present or invalid. last_pfn = 0x800000 max_arch_pfn = 0x400000000 last_pfn = 0x100000 max_arch_pfn = 0x400000000 init_memory_mapping: 0000000000000000-0000000100000000 init_memory_mapping: 0000000100000000-0000000800000000 RAMDISK: 01f68000 - 046d8000 No NUMA configuration found Faking a node at 0000000000000000-0000000800000000 Bootmem setup node 0 0000000000000000-0000000800000000 NODE_DATA [0000000000008000 - 000000000003bfff] bootmap [00000000008b7000 - 00000000009b6fff] pages 100 (8 early reservations) ==> bootmem [0000000000 - 0800000000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000] #1 [0008b37000 - 0008b82000] XEN PAGETABLES ==> [0008b37000 - 0008b82000] #2 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000] #3 [0001000000 - 0001f474e4] TEXT DATA BSS ==> [0001000000 - 0001f474e4] #4 [0001f68000 - 00046d8000] RAMDISK ==> [0001f68000 - 00046d8000] #5 [00046d8000 - 0008b37000] XEN START INFO ==> [00046d8000 - 0008b37000] #6 [0000100000 - 00008b7000] PGTABLE ==> [0000100000 - 00008b7000] #7 
[0008b82000 - 000c39e000] PGTABLE ==> [0008b82000 - 000c39e000] Zone PFN ranges: DMA 0x00000001 -> 0x00001000 DMA32 0x00001000 -> 0x00100000 Normal 0x00100000 -> 0x00800000 Movable zone start PFN for each node early_node_map[2] active PFN ranges 0: 0x00000001 -> 0x000000a0 0: 0x00000100 -> 0x00800000 SFI: Simple Firmware Interface v0.7 http://simplefirmware.org SMP: Allowing 4 CPUs, 0 hotplug CPUs No local APIC present APIC: disable apic facility PM: Registered nosave memory: 00000000000a0000 - 0000000000100000 PCI: Warning: Cannot find a gap in the 32bit address range PCI: Unassigned devices with 32bit resource registers may break! Allocating PCI resources starting at 800100000 (gap: 800100000:400000) Booting paravirtualized kernel on Xen Xen version: 3.1.2-128.1.10.el5 (preserve-AD) NR_CPUS:4096 nr_cpumask_bits:4 nr_cpu_ids:4 nr_node_ids:1 PERCPU: Embedded 30 pages/cpu @ffff880028050000 s92504 r8192 d22184 u122880 pcpu-alloc: s92504 r8192 d22184 u122880 alloc=30*4096 pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 Xen: using vcpu_info placement Built 1 zonelists in Zone order, mobility grouping on. Total pages: 8271845 Policy zone: Normal Kernel command line: ro root=LABEL=_/ PID hash table entries: 4096 (order: 3, 32768 bytes) Checking aperture... No AGP bridge found AMD-Vi disabled by default: pass amd_iommu=on to enable PCI-DMA: Using software bounce buffering for IO (SWIOTLB) Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000 software IO TLB at phys 0x20000000 - 0x24000000 Memory: 32835652k/33554432k available (5014k kernel code, 388k absent, 718392k reserved, 6906k data, 1232k init) Hierarchical RCU implementation. NR_IRQS:33024 nr_irqs:304 Console: colour dummy device 80x25 console [tty0] enabled console [hvc0] enabled allocated 335544320 bytes of page_cgroup please try 'cgroup_disable=memory' option if you don't want memory cgroups installing Xen timer for CPU 0 Detected 2666.760 MHz processor. 
Calibrating delay loop (skipped), value calculated using timer frequency.. 5333.52 BogoMIPS (lpj=2666760) pid_max: default: 32768 minimum: 301 Security Framework initialized SELinux: Initializing. Dentry cache hash table entries: 4194304 (order: 13, 33554432 bytes) Inode-cache hash table entries: 2097152 (order: 12, 16777216 bytes) Mount-cache hash table entries: 256 Initializing cgroup subsys ns Initializing cgroup subsys cpuacct Initializing cgroup subsys memory Initializing cgroup subsys devices Initializing cgroup subsys freezer Initializing cgroup subsys net_cls Initializing cgroup subsys blkio CPU: Unsupported number of siblings 16 SMP alternatives: switching to UP code ftrace: converting mcount calls to 0f 1f 44 00 00 ftrace: allocating 20700 entries in 82 pages Performance Events: PEBS fmt1+, Nehalem events, no APIC, boot with the "lapic" boot parameter to force-enable it. no hardware sampling interrupt available. Broken PMU hardware detected, using software events only. NMI watchdog disabled (cpu0): hardware events not enabled installing Xen timer for CPU 1 SMP alternatives: switching to SMP code CPU: Unsupported number of siblings 16 installing Xen timer for CPU 2 CPU: Unsupported number of siblings 16 installing Xen timer for CPU 3 CPU: Unsupported number of siblings 16 Brought up 4 CPUs devtmpfs: initialized Grant table initialized regulator: core version 0.5 NET: Registered protocol family 16 PCI: Fatal: No config space access function found bio: create slab <bio-0> at 0 ACPI: Interpreter disabled. xen_balloon: Initialising balloon driver. 
vgaarb: loaded SCSI subsystem initialized usbcore: registered new interface driver usbfs usbcore: registered new interface driver hub usbcore: registered new device driver usb PCI: System does not support PCI PCI: System does not support PCI NetLabel: Initializing NetLabel: domain hash size = 128 NetLabel: protocols = UNLABELED CIPSOv4 NetLabel: unlabeled traffic allowed by default Switching to clocksource xen pnp: PnP ACPI: disabled NET: Registered protocol family 2 IP route cache hash table entries: 524288 (order: 10, 4194304 bytes) TCP established hash table entries: 524288 (order: 11, 8388608 bytes) TCP bind hash table entries: 65536 (order: 8, 1048576 bytes) TCP: Hash tables configured (established 524288 bind 65536) TCP reno registered NET: Registered protocol family 1 Trying to unpack rootfs image as initramfs... Freeing initrd memory: 40384k freed platform rtc_cmos: registered platform RTC device (no PNP device found) audit: initializing netlink socket (disabled) type=2000 audit(1306959870.823:1): initialized HugeTLB registered 2 MB page size, pre-allocated 0 pages VFS: Disk quotas dquot_6.5.2 Dquot-cache hash table entries: 512 (order 0, 4096 bytes) msgmni has been set to 32768 alg: No test for stdrng (krng) ksign: Installing public key data Loading keyring - Added public key C9F0DFBFCE81A817 - User ID: Red Hat, Inc. 
(Kernel Module GPG key) - Added public key D4A26C9CCD09BEDA - User ID: Red Hat Enterprise Linux Driver Update Program <secalert> Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252) io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered (default) pci_hotplug: PCI Hot Plug PCI Core version: 0.5 pciehp: PCI Express Hot Plug Controller Driver version: 0.4 acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5 pci-stub: invalid id string "" Non-volatile memory driver v1.3 Linux agpgart interface v0.103 crash memory driver: version 1.1 Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled brd: module loaded loop: module loaded input: Macintosh mouse button emulation as /devices/virtual/input/input0 Fixed MDIO Bus: probed ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver uhci_hcd: USB Universal Host Controller Interface driver PNP: No PS/2 controller found. Probing ports directly. 
mice: PS/2 mouse device common for all mice rtc_cmos: probe of rtc_cmos failed with error -16 cpuidle: using governor ladder cpuidle: using governor menu invalid opcode: 0000 [#1] SMP last sysfs file: CPU 3 Modules linked in: Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.32-131.0.15.el6.x86_64 #1 RIP: e030:[<ffffffff812bb828>] [<ffffffff812bb828>] intel_idle+0x98/0x170 RSP: e02b:ffff8807dc979e80 EFLAGS: 00010046 RAX: ffff8807dc978010 RBX: 0000000000000004 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff8807dc979fd8 RDI: ffff8800280b5040 RBP: ffff8807dc979ef0 R08: 0000000000000000 R09: 0000000000000320 R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000010 R13: 12234113f555059a R14: 0000000000000002 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff8800280aa000(0000) knlGS:0000000000000000 CS: e033 DS: 002b ES: 002b CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000001a25000 CR4: 0000000000002620 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000 Process swapper (pid: 0, threadinfo ffff8807dc978000, task ffff8807dc974ac0) Stack: 0000000000000002 ffff8800280b0240 0000000000000001 ffff8800280b6640 <0> 0000000006acfc00 00000000000861c3 ffffffff81007b3f 00000003814e0cb6 <0> ffff8807dc979ef0 ffff8800280c72c0 ffff8800280c7390 0000000000000000 Call Trace: [<ffffffff81007b3f>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff81007af9>] ? 
xen_irq_enable_direct_end+0x0/0x7 [<ffffffff814cee35>] cpu_bringup_and_idle+0x13/0x15 Code: 8b 7d cc 85 c0 0f 85 b3 00 00 00 65 48 8b 34 25 08 cc 00 00 48 8b 86 38 e0 ff ff a8 08 75 25 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 <0f> 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 08 b1 01 4c 89 RIP [<ffffffff812bb828>] intel_idle+0x98/0x170 RSP <ffff8807dc979e80> invalid opcode: 0000 [#2] ---[ end trace 3a5e7c9663b4e1f1 ]--- SMP last sysfs file: CPU 0 Modules linked in: Modules linked in: Pid: 0, comm: swapper Tainted: G D ---------------- 2.6.32-131.0.15.el6.x86_64 #1 RIP: e030:[<ffffffff812bb828>] [<ffffffff812bb828>] intel_idle+0x98/0x170 RSP: e02b:ffffffff81a01ea8 EFLAGS: 00010046 RAX: ffffffff81a00010 RBX: 0000000000000008 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff81a01fd8 RDI: ffff88002805b040 RBP: ffffffff81a01f18 R08: 0000000000000000 R09: 00000000000000c8 R10: 000000003b9aca00 R11: 00000000fffb6c80 R12: 0000000000000020 R13: 12234113f5554508 R14: 0000000000000003 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff880028050000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000289560f000 CR3: 0000000001a25000 CR4: 0000000000002620 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000 Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a2d020) Stack: ffff88002805c640 ffff880028050240 000000001dbe00c0 ffff88002805c640 <0> 0000000000000001 0000000017044e80 ffffffff81007b3f 00000000814e0cb6 <0> ffffffff81a01f18 ffff88002806d2c0 ffff88002806d3f0 0000000000000000 Call Trace: [<ffffffff81007b3f>] ? 
xen_restore_fl_direct_end+0x0/0x1 [<ffffffff813eccb7>] cpuidle_idle_call+0xa7/0x140 [<ffffffff81009e96>] cpu_idle+0xb6/0x110 [<ffffffff814c376a>] rest_init+0x7a/0x80 [<ffffffff81bbdf28>] start_kernel+0x41d/0x429 [<ffffffff81bbd33a>] x86_64_start_reservations+0x125/0x129 [<ffffffff81bc111b>] xen_start_kernel+0x582/0x586 Code: 8b 7d cc 85 c0 0f 85 b3 00 00 00 65 48 8b 34 25 08 cc 00 00 48 8b 86 38 e0 ff ff a8 08 75 25 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 <0f> 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 08 b1 01 4c 89 RIP [<ffffffff812bb828>] intel_idle+0x98/0x170 RSP <ffffffff81a01ea8> ---[ end trace 3a5e7c9663b4e1f2 ]--- invalid opcode: 0000 [#3] SMP Kernel panic - not syncing: Fatal exception RAX: ffffffff81a00010 RBX: 0000000000000008 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff81a01fd8 RDI: ffff88002805b040 RBP: ffffffff81a01f18 R08: 0000000000000000 R09: 00000000000000c8 R10: 000000003b9aca00 R11: 00000000fffb6c80 R12: 0000000000000020 R13: 12234113f5554508 R14: 0000000000000003 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff880028050000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000289560f000 CR3: 0000000001a25000 CR4: 0000000000002620 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000 Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a2d020) Stack: ffff88002805c640 ffff880028050240 000000001dbe00c0 ffff88002805c640 <0> 0000000000000001 0000000017044e80 ffffffff81007b3f 00000000814e0cb6 <0> ffffffff81a01f18 ffff88002806d2c0 ffff88002806d3f0 0000000000000000 Call Trace: [<ffffffff81007b3f>] ? 
xen_restore_fl_direct_end+0x0/0x1 [<ffffffff813eccb7>] cpuidle_idle_call+0xa7/0x140 [<ffffffff81009e96>] cpu_idle+0xb6/0x110 [<ffffffff814c376a>] rest_init+0x7a/0x80 [<ffffffff81bbdf28>] start_kernel+0x41d/0x429 [<ffffffff81bbd33a>] x86_64_start_reservations+0x125/0x129 [<ffffffff81bc111b>] xen_start_kernel+0x582/0x586 Code: 8b 7d cc 85 c0 0f 85 b3 00 00 00 65 48 8b 34 25 08 cc 00 00 48 8b 86 38 e0 ff ff a8 08 75 25 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 <0f> 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 08 b1 01 4c 89 RIP [<ffffffff812bb828>] intel_idle+0x98/0x170 RSP <ffffffff81a01ea8> ---[ end trace 3a5e7c9663b4e1f2 ]--- invalid opcode: 0000 [#3] SMP Kernel panic - not syncing: Fatal exception last sysfs file: Pid: 0, comm: swapper Tainted: G D ---------------- 2.6.32-131.0.15.el6.x86_64 #1 CPU 1 Call Trace: Modules linked in: [<ffffffff814dac28>] ? panic+0x78/0x143 Modules linked in: [<ffffffff81007b3f>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff814ddb2c>] ? _spin_unlock_irqrestore+0x1c/0x20 Pid: 0, comm: swapper Tainted: G D ---------------- 2.6.32-131.0.15.el6.x86_64 #1 RIP: e030:[<ffffffff812bb828>] [<ffffffff814dec74>] ? oops_end+0xe4/0x100 [<ffffffff812bb828>] intel_idle+0x98/0x170 RSP: e02b:ffff8807dc973e80 EFLAGS: 00010046 [<ffffffff8100f2fb>] ? 
die+0x5b/0x90 RAX: ffff8807dc972010 RBX: 0000000000000004 RCX: 0000000000000000 root (hd0) Filesystem type is ext2fs, using whole disk kernel /boot/vmlinuz-2.6.32-131.0.15.el6.x86_64 ro root=LABEL=_/ initrd /boot/initramfs-2.6.32-131.0.15.el6.x86_64.img close blk: backend at /local/domain/0/backend/vbd/115/2048 close blk: backend at /local/domain/0/backend/vbd/115/2128 Initializing cgroup subsys cpuset Initializing cgroup subsys cpu Linux version 2.6.32-131.0.15.el6.x86_64 (mockbuild.bos.redhat.com) (gcc version 4.4.4 20100726 (Red Hat 4.4.4-13) (GCC) ) #1 SMP Tue May 10 15:42:40 EDT 2011 Command line: ro root=LABEL=_/ KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD Centaur CentaurHauls ACPI in unprivileged domain disabled BIOS-provided physical RAM map: Xen: 0000000000000000 - 00000000000a0000 (usable) Xen: 00000000000a0000 - 0000000000100000 (reserved) Xen: 0000000000100000 - 0000000800000000 (usable) DMI not present or invalid. last_pfn = 0x800000 max_arch_pfn = 0x400000000 last_pfn = 0x100000 max_arch_pfn = 0x400000000 init_memory_mapping: 0000000000000000-0000000100000000 init_memory_mapping: 0000000100000000-0000000800000000 RAMDISK: 01f68000 - 046d8000 No NUMA configuration found Faking a node at 0000000000000000-0000000800000000 Bootmem setup node 0 0000000000000000-0000000800000000 NODE_DATA [0000000000008000 - 000000000003bfff] bootmap [00000000008b7000 - 00000000009b6fff] pages 100 (8 early reservations) ==> bootmem [0000000000 - 0800000000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000] #1 [0008b37000 - 0008b82000] XEN PAGETABLES ==> [0008b37000 - 0008b82000] #2 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000] #3 [0001000000 - 0001f474e4] TEXT DATA BSS ==> [0001000000 - 0001f474e4] #4 [0001f68000 - 00046d8000] RAMDISK ==> [0001f68000 - 00046d8000] #5 [00046d8000 - 0008b37000] XEN START INFO ==> [00046d8000 - 0008b37000] #6 [0000100000 - 00008b7000] PGTABLE ==> [0000100000 - 00008b7000] #7 
[0008b82000 - 000c39e000] PGTABLE ==> [0008b82000 - 000c39e000] Zone PFN ranges: DMA 0x00000001 -> 0x00001000 DMA32 0x00001000 -> 0x00100000 Normal 0x00100000 -> 0x00800000 Movable zone start PFN for each node early_node_map[2] active PFN ranges 0: 0x00000001 -> 0x000000a0 0: 0x00000100 -> 0x00800000 SFI: Simple Firmware Interface v0.7 http://simplefirmware.org SMP: Allowing 4 CPUs, 0 hotplug CPUs No local APIC present APIC: disable apic facility PM: Registered nosave memory: 00000000000a0000 - 0000000000100000 PCI: Warning: Cannot find a gap in the 32bit address range PCI: Unassigned devices with 32bit resource registers may break! Allocating PCI resources starting at 800100000 (gap: 800100000:400000) Booting paravirtualized kernel on Xen Xen version: 3.1.2-128.1.10.el5 (preserve-AD) NR_CPUS:4096 nr_cpumask_bits:4 nr_cpu_ids:4 nr_node_ids:1 PERCPU: Embedded 30 pages/cpu @ffff880028050000 s92504 r8192 d22184 u122880 pcpu-alloc: s92504 r8192 d22184 u122880 alloc=30*4096 pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 Xen: using vcpu_info placement Built 1 zonelists in Zone order, mobility grouping on. Total pages: 8271845 Policy zone: Normal Kernel command line: ro root=LABEL=_/ PID hash table entries: 4096 (order: 3, 32768 bytes) Checking aperture... No AGP bridge found AMD-Vi disabled by default: pass amd_iommu=on to enable PCI-DMA: Using software bounce buffering for IO (SWIOTLB) Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000 software IO TLB at phys 0x20000000 - 0x24000000 Memory: 32835652k/33554432k available (5014k kernel code, 388k absent, 718392k reserved, 6906k data, 1232k init) Hierarchical RCU implementation. NR_IRQS:33024 nr_irqs:304 Console: colour dummy device 80x25 console [tty0] enabled console [hvc0] enabled allocated 335544320 bytes of page_cgroup please try 'cgroup_disable=memory' option if you don't want memory cgroups installing Xen timer for CPU 0 Detected 2666.760 MHz processor. 
Calibrating delay loop (skipped), value calculated using timer frequency.. 5333.52 BogoMIPS (lpj=2666760) pid_max: default: 32768 minimum: 301 Security Framework initialized SELinux: Initializing. Dentry cache hash table entries: 4194304 (order: 13, 33554432 bytes) Inode-cache hash table entries: 2097152 (order: 12, 16777216 bytes) Mount-cache hash table entries: 256 Initializing cgroup subsys ns Initializing cgroup subsys cpuacct Initializing cgroup subsys memory Initializing cgroup subsys devices Initializing cgroup subsys freezer Initializing cgroup subsys net_cls Initializing cgroup subsys blkio CPU: Unsupported number of siblings 16 SMP alternatives: switching to UP code ftrace: converting mcount calls to 0f 1f 44 00 00 ftrace: allocating 20700 entries in 82 pages Performance Events: PEBS fmt1+, Nehalem events, no APIC, boot with the "lapic" boot parameter to force-enable it. no hardware sampling interrupt available. Broken PMU hardware detected, using software events only. NMI watchdog disabled (cpu0): hardware events not enabled installing Xen timer for CPU 1 SMP alternatives: switching to SMP code CPU: Unsupported number of siblings 16 installing Xen timer for CPU 2 CPU: Unsupported number of siblings 16 installing Xen timer for CPU 3 CPU: Unsupported number of siblings 16 Brought up 4 CPUs devtmpfs: initialized Grant table initialized regulator: core version 0.5 NET: Registered protocol family 16 PCI: Fatal: No config space access function found bio: create slab <bio-0> at 0 ACPI: Interpreter disabled. xen_balloon: Initialising balloon driver. 
vgaarb: loaded SCSI subsystem initialized usbcore: registered new interface driver usbfs usbcore: registered new interface driver hub usbcore: registered new device driver usb PCI: System does not support PCI PCI: System does not support PCI NetLabel: Initializing NetLabel: domain hash size = 128 NetLabel: protocols = UNLABELED CIPSOv4 NetLabel: unlabeled traffic allowed by default Switching to clocksource xen pnp: PnP ACPI: disabled NET: Registered protocol family 2 IP route cache hash table entries: 524288 (order: 10, 4194304 bytes) TCP established hash table entries: 524288 (order: 11, 8388608 bytes) TCP bind hash table entries: 65536 (order: 8, 1048576 bytes) TCP: Hash tables configured (established 524288 bind 65536) TCP reno registered NET: Registered protocol family 1 Trying to unpack rootfs image as initramfs... Freeing initrd memory: 40384k freed platform rtc_cmos: registered platform RTC device (no PNP device found) audit: initializing netlink socket (disabled) type=2000 audit(1306960050.551:1): initialized HugeTLB registered 2 MB page size, pre-allocated 0 pages VFS: Disk quotas dquot_6.5.2 Dquot-cache hash table entries: 512 (order 0, 4096 bytes) msgmni has been set to 32768 alg: No test for stdrng (krng) ksign: Installing public key data Loading keyring - Added public key C9F0DFBFCE81A817 - User ID: Red Hat, Inc. 
(Kernel Module GPG key) - Added public key D4A26C9CCD09BEDA - User ID: Red Hat Enterprise Linux Driver Update Program <secalert> Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252) io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered (default) pci_hotplug: PCI Hot Plug PCI Core version: 0.5 pciehp: PCI Express Hot Plug Controller Driver version: 0.4 acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5 pci-stub: invalid id string "" Non-volatile memory driver v1.3 Linux agpgart interface v0.103 crash memory driver: version 1.1 Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled brd: module loaded loop: module loaded input: Macintosh mouse button emulation as /devices/virtual/input/input0 Fixed MDIO Bus: probed ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver uhci_hcd: USB Universal Host Controller Interface driver PNP: No PS/2 controller found. Probing ports directly. 
mice: PS/2 mouse device common for all mice rtc_cmos: probe of rtc_cmos failed with error -16 cpuidle: using governor ladder cpuidle: using governor menu invalid opcode: 0000 [#1] SMP last sysfs file: CPU 3 Modules linked in: Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.32-131.0.15.el6.x86_64 #1 RIP: e030:[<ffffffff812bb828>] [<ffffffff812bb828>] intel_idle+0x98/0x170 RSP: e02b:ffff8807dc979e80 EFLAGS: 00010046 RAX: ffff8807dc978010 RBX: 0000000000000004 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff8807dc979fd8 RDI: ffff8800280b5040 RBP: ffff8807dc979ef0 R08: 0000000000000000 R09: 0000000000000320 R10: 00000000ffffffff R11: 0000000000000000 R12: 0000000000000010 R13: 1223413dcdc8106e R14: 0000000000000002 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff8800280aa000(0000) knlGS:0000000000000000 CS: e033 DS: 002b ES: 002b CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000001a25000 CR4: 0000000000002620 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000 Process swapper (pid: 0, threadinfo ffff8807dc978000, task ffff8807dc974ac0) Stack: 0000000000000002 ffff8800280b0240 0000000000000001 ffff8800280b6640 <0> 0000000006ea0500 00000000000a9a47 ffffffff81007b3f 00000003814e0cb6 <0> ffff8807dc979ef0 ffff8800280c72c0 ffff8800280c7390 0000000000000000 Call Trace: [<ffffffff81007b3f>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff813eccb7>] cpuidle_idle_call+0xa7/0x140 [<ffffffff81009e96>] cpu_idle+0xb6/0x110 [<ffffffff81007af9>] ? 
xen_irq_enable_direct_end+0x0/0x7 [<ffffffff814cee35>] cpu_bringup_and_idle+0x13/0x15 Code: 8b 7d cc 85 c0 0f 85 b3 00 00 00 65 48 8b 34 25 08 cc 00 00 48 8b 86 38 e0 ff ff a8 08 75 25 31 d2 48 8d 86 38 e0 ff ff 48 89 d1 <0f> 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 08 b1 01 4c 89 RIP [<ffffffff812bb828>] intel_idle+0x98/0x170 RSP <ffff8807dc979e80> invalid opcode: 0000 [#2] ---[ end trace 7163428e87a00d47 ]--- Kernel panic - not syncing: Fatal exception Pid: 0, comm: swapper Tainted: G D ---------------- 2.6.32-131.0.15.el6.x86_64 #1 Call Trace: [<ffffffff814dac28>] ? panic+0x78/0x143 [<ffffffff81007b3f>] ? xen_restore_fl_direct_end+0x0/0x1 [<ffffffff814ddb2c>] ? _spin_unlock_irqrestore+0x1c/0x20 [<ffffffff814dec74>] ? oops_end+0xe4/0x100 [<ffffffff8100f2fb>] ? die+0x5b/0x90 [<ffffffff814de544>] ? do_trap+0xc4/0x160 [<ffffffff8100ceb5>] ? do_invalid_op+0x95/0xb0 [<ffffffff812bb828>] ? intel_idle+0x98/0x170 [<ffffffff8109d1d7>] ? tick_broadcast_oneshot_control+0xc7/0x120 [<ffffffff8109cd05>] ? tick_notify+0x325/0x410 [<ffffffff8100bf5b>] ? invalid_op+0x1b/0x20 Command line: ro root=LABEL=_/ KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD Centaur CentaurHauls ACPI in unprivileged domain disabled BIOS-provided physical RAM map: Xen: 0000000000000000 - 00000000000a0000 (usable) Xen: 00000000000a0000 - 0000000000100000 (reserved) Xen: 0000000000100000 - 0000000800000000 (usable) DMI not present or invalid. 
last_pfn = 0x800000 max_arch_pfn = 0x400000000 last_pfn = 0x100000 max_arch_pfn = 0x400000000 init_memory_mapping: 0000000000000000-0000000100000000 init_memory_mapping: 0000000100000000-0000000800000000 RAMDISK: 01f68000 - 046d8000 No NUMA configuration found Faking a node at 0000000000000000-0000000800000000 Bootmem setup node 0 0000000000000000-0000000800000000 NODE_DATA [0000000000008000 - 000000000003bfff] bootmap [00000000008b7000 - 00000000009b6fff] pages 100 (8 early reservations) ==> bootmem [0000000000 - 0800000000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000] #1 [0008b37000 - 0008b82000] XEN PAGETABLES ==> [0008b37000 - 0008b82000] #2 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000] #3 [0001000000 - 0001f474e4] TEXT DATA BSS ==> [0001000000 - 0001f474e4] #4 [0001f68000 - 00046d8000] RAMDISK ==> [0001f68000 - 00046d8000] #5 [00046d8000 - 0008b37000] XEN START INFO ==> [00046d8000 - 0008b37000] #6 [0000100000 - 00008b7000] PGTABLE ==> [0000100000 - 00008b7000] #7 [0008b82000 - 000c39e000] PGTABLE ==> [0008b82000 - 000c39e000] Zone PFN ranges: DMA 0x00000001 -> 0x00001000 DMA32 0x00001000 -> 0x00100000 Normal 0x00100000 -> 0x00800000 Movable zone start PFN for each node early_node_map[2] active PFN ranges 0: 0x00000001 -> 0x000000a0 0: 0x00000100 -> 0x00800000 SFI: Simple Firmware Interface v0.7 http://simplefirmware.org SMP: Allowing 4 CPUs, 0 hotplug CPUs No local APIC present APIC: disable apic facility PM: Registered nosave memory: 00000000000a0000 - 0000000000100000 PCI: Warning: Cannot find a gap in the 32bit address range PCI: Unassigned devices with 32bit resource registers may break! 
Allocating PCI resources starting at 800100000 (gap: 800100000:400000) Booting paravirtualized kernel on Xen Xen version: 3.1.2-128.1.10.el5 (preserve-AD) NR_CPUS:4096 nr_cpumask_bits:4 nr_cpu_ids:4 nr_node_ids:1 PERCPU: Embedded 30 pages/cpu @ffff880028050000 s92504 r8192 d22184 u122880 pcpu-alloc: s92504 r8192 d22184 u122880 alloc=30*4096 pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3 Xen: using vcpu_info placement Built 1 zonelists in Zone order, mobility grouping on. Total pages: 8271845 Policy zone: Normal Kernel command line: ro root=LABEL=_/ PID hash table entries: 4096 (order: 3, 32768 bytes) Checking aperture... No AGP bridge found AMD-Vi disabled by default: pass amd_iommu=on to enable PCI-DMA: Using software bounce buffering for IO (SWIOTLB) Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000 software IO TLB at phys 0x20000000 - 0x24000000 Memory: 32835652k/33554432k available (5014k kernel code, 388k absent, 718392k reserved, 6906k data, 1232k init) Hierarchical RCU implementation. NR_IRQS:33024 nr_irqs:304 Console: colour dummy device 80x25 console [tty0] enabled console [hvc0] enabled allocated 335544320 bytes of page_cgroup please try 'cgroup_disable=memory' option if you don't want memory cgroups installing Xen timer for CPU 0 Detected 2666.760 MHz processor. Calibrating delay loop (skipped), value calculated using timer frequency.. 5333.52 BogoMIPS (lpj=2666760) pid_max: default: 32768 minimum: 301 Security Framework initialized SELinux: Initializing. 
Dentry cache hash table entries: 4194304 (order: 13, 33554432 bytes) Inode-cache hash table entries: 2097152 (order: 12, 16777216 bytes) Mount-cache hash table entries: 256 Initializing cgroup subsys ns Initializing cgroup subsys cpuacct Initializing cgroup subsys memory Initializing cgroup subsys devices Initializing cgroup subsys freezer Initializing cgroup subsys net_cls Initializing cgroup subsys blkio CPU: Unsupported number of siblings 16 SMP alternatives: switching to UP code ftrace: converting mcount calls to 0f 1f 44 00 00 ftrace: allocating 20700 entries in 82 pages Performance Events: PEBS fmt1+, Nehalem events, no APIC, boot with the "lapic" boot parameter to force-enable it. no hardware sampling interrupt available. Broken PMU hardware detected, using software events only. NMI watchdog disabled (cpu0): hardware events not enabled installing Xen timer for CPU 1 SMP alternatives: switching to SMP code CPU: Unsupported number of siblings 16 installing Xen timer for CPU 2 CPU: Unsupported number of siblings 16 installing Xen timer for CPU 3 CPU: Unsupported number of siblings 16 Brought up 4 CPUs devtmpfs: initialized Grant table initialized regulator: core version 0.5 NET: Registered protocol family 16 PCI: Fatal: No config space access function found bio: create slab <bio-0> at 0 ACPI: Interpreter disabled. xen_balloon: Initialising balloon driver. 
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: System does not support PCI
PCI: System does not support PCI
NetLabel: Initializing
NetLabel: domain hash size = 128
NetLabel: protocols = UNLABELED CIPSOv4
NetLabel: unlabeled traffic allowed by default
Switching to clocksource xen
pnp: PnP ACPI: disabled
NET: Registered protocol family 2
IP route cache hash table entries: 524288 (order: 10, 4194304 bytes)
TCP established hash table entries: 524288 (order: 11, 8388608 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 524288 bind 65536)
TCP reno registered
NET: Registered protocol family 1
Trying to unpack rootfs image as initramfs...
Freeing initrd memory: 40384k freed
platform rtc_cmos: registered platform RTC device (no PNP device found)
audit: initializing netlink socket (disabled)
type=2000 audit(1306959872.310:1): initialized
HugeTLB registered 2 MB page size, pre-allocated 0 pages
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
msgmni has been set to 32768
alg: No test for stdrng (krng)
ksign: Installing public key data
Loading keyring
- Added public key C9F0DFBFCE81A817
- User ID: Red Hat, Inc. (Kernel Module GPG key)
- Added public key D4A26C9CCD09BEDA
- User ID: Red Hat Enterprise Linux Driver Update Program <secalert>
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
pciehp: PCI Express Hot Plug Controller Driver version: 0.4
acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
pci-stub: invalid id string ""
Non-volatile memory driver v1.3
Linux agpgart interface v0.103
crash memory driver: version 1.1
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
brd: module loaded
loop: module loaded
input: Macintosh mouse button emulation as /devices/virtual/input/input0
Fixed MDIO Bus: probed
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
uhci_hcd: USB Universal Host Controller Interface driver
PNP: No PS/2 controller found. Probing ports directly.
mice: PS/2 mouse device common for all mice
rtc_cmos: probe of rtc_cmos failed with error -16
cpuidle: using governor ladder
cpuidle: using governor menu
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
usbhid: v2.6:USB HID core driver
BUG: soft lockup - CPU#0 stuck for 67s! [swapper:1]
Modules linked in:
CPU 0:
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.32-131.0.15.el6.x86_64 #1
RIP: e030:[<ffffffff810a4512>] [<ffffffff810a4512>] smp_call_function_many+0x1b2/0x210
RSP: e02b:ffff8807dc96bc10 EFLAGS: 00000202
RAX: 0000000000000004 RBX: ffff880028061160 RCX: 000000000000003c
RDX: 0000000000000004 RSI: 0000000000000004 RDI: ffff8807ffe7dc00
RBP: ffff8807dc96bc50 R08: ffff8807ffe7dc00 R09: 0000000000000000
R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000000000
R13: ffffffff81b9e080 R14: ffffffff81b9e080 R15: ffffffff811594a0
FS: 0000000000000000(0000) GS:ffff880028050000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001a25000 CR4: 0000000000002620
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Call Trace:
 [<ffffffff811594a0>] ? do_ccupdate_local+0x0/0x40
 [<ffffffff810a4592>] smp_call_function+0x22/0x30
 [<ffffffff8106f334>] on_each_cpu+0x24/0x50
 [<ffffffff8115c46f>] do_tune_cpucache+0x12f/0x630
 [<ffffffff8115cb4b>] enable_cpucache+0x3b/0xf0
 [<ffffffff814c5d3f>] setup_cpu_cache+0x22f/0x340
 [<ffffffff81007b3f>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8115ac92>] ? kmem_cache_alloc+0x182/0x190
 [<ffffffff8115d7da>] kmem_cache_create+0x3fa/0x580
 [<ffffffff81bfd1b7>] flow_cache_init+0x4a/0x1af
 [<ffffffff81bfd16d>] ? flow_cache_init+0x0/0x1af
 [<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0
 [<ffffffff81bbd884>] kernel_init+0x29d/0x2f9
 [<ffffffff8100c1ca>] child_rip+0xa/0x20
 [<ffffffff8100b393>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff8100bb1d>] ? retint_restore_args+0x5/0x6
 [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
BUG: soft lockup - CPU#0 stuck for 67s! [swapper:1]
Modules linked in:
CPU 0:
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.32-131.0.15.el6.x86_64 #1
RIP: e030:[<ffffffff810a4516>] [<ffffffff810a4516>] smp_call_function_many+0x1b6/0x210
RSP: e02b:ffff8807dc96bc10 EFLAGS: 00000202
RAX: 0000000000000004 RBX: ffff880028061160 RCX: 000000000000003c
RDX: 0000000000000004 RSI: 0000000000000004 RDI: ffff8807ffe7dc00
RBP: ffff8807dc96bc50 R08: ffff8807ffe7dc00 R09: 0000000000000000
R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000000000
R13: ffffffff81b9e080 R14: ffffffff81b9e080 R15: ffffffff811594a0
FS: 0000000000000000(0000) GS:ffff880028050000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001a25000 CR4: 0000000000002620
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Call Trace:
 [<ffffffff811594a0>] ? do_ccupdate_local+0x0/0x40
 [<ffffffff810a4592>] smp_call_function+0x22/0x30
 [<ffffffff8106f334>] on_each_cpu+0x24/0x50
 [<ffffffff8115c46f>] do_tune_cpucache+0x12f/0x630
 [<ffffffff8115cb4b>] enable_cpucache+0x3b/0xf0
 [<ffffffff814c5d3f>] setup_cpu_cache+0x22f/0x340
 [<ffffffff81007b3f>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8115ac92>] ? kmem_cache_alloc+0x182/0x190
 [<ffffffff8115d7da>] kmem_cache_create+0x3fa/0x580
 [<ffffffff81bfd1b7>] flow_cache_init+0x4a/0x1af
 [<ffffffff81bfd16d>] ? flow_cache_init+0x0/0x1af
 [<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0
 [<ffffffff81bbd884>] kernel_init+0x29d/0x2f9
 [<ffffffff8100c1ca>] child_rip+0xa/0x20
 [<ffffffff8100b393>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff8100bb1d>] ? retint_restore_args+0x5/0x6
 [<ffffffff8100c1c0>] ? child_rip+0x0/0x20

The four stack traces are from four separate hosts. Dom0 in all cases is RHEL 5.3 x86_64. We are observing this across multiple vendors. Also, it looks like the launch failure rate is 50%. 15% of the hosts that do launch seem to die at some point later in their life cycle. The m2.2xlarge has more failures than the m2.4xlarge hosts.

Correction -- "multiple vendors" should read "multiple hardware vendors".

I'm testing a m2.4xlarge x86_64 RHEL 6.1 and I'm having trouble recreating. As far as I understand it the recreate is..
1. bring up a RHEL 6.1, x86_64 w/ m2.2xlarge or m2.4xlarge
2. wait roughly 60 minutes
3. 50% of those instances fail

Questions:
1. Are there any operations running on the host while it's up?
2. How many guests should be started to recreate, roughly.. ?

Can you please provide the output of cat /proc/cpuinfo on the guest?

Created attachment 502501 [details]
uname -a for instance in apac-sing
Uname -a output from m2.4xlarge in apac-sing.
Created attachment 502502 [details]
/proc/cpuinfo from apac-sing
This is output from a running m2.4xlarge instance in apac-sing.
Much like Wes, I cannot repeat the issue with one instance, but given my usage, this falls into the stats supplied by Ben. I can try other instances, if needed, but the inconsistency is frustrating. It should also be noted that the error appears not to occur with RHEL 5.5, 5.6, and 6.0.

Wanted to add that my response time is slow in the instance, but that could be more of a factor of connecting from Raleigh to Singapore. The instance is still going after 22 minutes of uptime.

Created attachment 502528 [details]
cpu proc, uptime data for 10 instances across all regions
FYI.. these instances are also m2.4xlarge.. I'll bring up 10 m2.2xlarge right now to test
Another Question: If an instance does soft crash.. is it still listed in the ec2 webui console after the crash?

Chris,

(In reply to comment #13)
> Much like Wes, I cannot repeat the issue with one instance, but given my usage,
> this falls into the stats supplied by Ben. I can try other instances, if
> needed, but the inconsistency is frustrating. It should also be noted that the
> error appears not to occur with RHEL 5.5, 5.6, and 6.0.

could you please elaborate a bit more on the last sentence? Are you saying that this is a guest regression from 6.0 to 6.1? Or do you mean that you could not reproduce the problem either with 6.0 or with 6.1?

Thanks!

Laszlo, Amazon told us that they only experience the problem themselves with 6.1 images. None of the 5.5, 5.6 or 6.0 images have the issue. So it would imply that some sort of incompatibility began between 6.0 and 6.1 errata.

Created attachment 502568 [details]
eu-west x86_64 m2.2xlarge crash console
Found one that did indeed crash.. console log attached
Info for the above crash
AMI: RHEL-6.1-Starter-EBS-x86_64-1-Hourly (ami-4efcca3a)
Zone: eu-west-1a
Security Groups: default
Type: m2.2xlarge
Status: running
Owner: 673500695950
VPC ID: -
Subnet ID: -
Source/Dest. Check:
Virtualization: paravirtual
Placement Group:
Reservation: r-6ae7481c
RAM Disk ID: -
Platform: -
Kernel ID: aki-a90a3ddd
Monitoring: basic
AMI Launch Index: 0
Elastic IP: -
Root Device:
Root Device Type: ebs
Tenancy: default
Lifecycle: normal
Block Devices: sda
Public DNS: ec2-79-125-54-178.eu-west-1.compute.amazonaws.com
Private DNS: ip-10-230-55-234.eu-west-1.compute.internal
Private IP Address: 10.230.55.234
Launch Time: 2011-06-02 09:49 EDT

Created attachment 502574 [details]
ap-southeast-x86_64_rhel61_m2.2xlargeCRASH
RHEL-6.1-Starter-EBS-x86_64-1-Hourly (ami-aefe87fc)
Zone: ap-southeast-1b
Security Groups: default
Type: m2.2xlarge
Status: running
Owner: 673500695950
VPC ID: -
Subnet ID: -
Source/Dest. Check:
Virtualization: paravirtual
Placement Group:
Reservation: r-9718b5c2
RAM Disk ID: -
Platform: -
Kernel ID: aki-82235ad0
Monitoring: basic
AMI Launch Index: 0
Elastic IP: -
Root Device:
Root Device Type: ebs
Tenancy: default
Lifecycle: normal
Block Devices: sda
Public DNS: ec2-175-41-173-14.ap-southeast-1.compute.amazonaws.com
Private DNS: ip-10-130-227-206.ap-southeast-1.compute.internal
Private IP Address: 10.130.227.206
Launch Time: 2011-06-02 09:16 EDT

two out of the ten instances across the regions have crashed within two hours.

actually to clarify... I have two instances w/ m2.4xlarge and m2.2xlarge in each region. one crash in eu-west, one crash in ap-southeast
m2.4xlarge 0/10 crashed after 10+ hours
m2.2xlarge 2/10 crashed after 2+ hours

OK, I looked at the boot logs, traces, and the cpuinfo and have a *guess* at what the problem could be. The cpuinfo given to me, which I assume is from a working guest on the problem host, has nonstop_tsc set in it. This is a risky feature to have on a guest. Furthermore, the 6.0 -> 6.1 regression that Laszlo confirmed with Chris adds another clue, because the Intel cpuidle driver was added during that development phase. The boot messages generally showed the crash shortly after printing

cpuidle: using governor ladder
cpuidle: using governor menu

and traces appear to involve that driver (intel_idle is on the stack) and that's tied to nonstop_tsc because intel_idle_cpuidle_devices_init() introduced with that patchset does NOT set the tsc unstable if nonstop_tsc is present.

I think the best way to proceed is to try a trial guest kernel with nonstop_tsc masked out. I can create one now.
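For anyone wanting to check a guest for the flags discussed above, here is a minimal sketch, assuming the usual /proc/cpuinfo layout with one "flags" line per CPU. The helper name risky_tsc_flags and the choice of which flags to report are illustrative assumptions, not something taken from the patch or from any existing tool.

```python
# Hypothetical helper (not from the bug report): report which of the
# TSC-related flags named in this thread each CPU advertises.
RISKY_TSC_FLAGS = ("constant_tsc", "nonstop_tsc")


def risky_tsc_flags(cpuinfo_text):
    """Return one list per 'flags' line with the risky flags found on it."""
    per_cpu = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            per_cpu.append([f for f in RISKY_TSC_FLAGS if f in present])
    return per_cpu
```

Feeding it the text of /proc/cpuinfo from a guest should give one entry per vCPU; an empty list for every vCPU would mean neither flag is being advertised.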
$ addr2line -pife \
    /usr/lib/debug/lib/modules/2.6.32-131.0.15.el6.x86_64/vmlinux \
    <<< ffffffff812bb828
__monitor at /usr/src/debug/kernel-2.6.32-131.0.15.el6/linux-2.6.32-131.0.15.el6.x86_64/arch/x86/include/asm/processor.h:751
 (inlined by) intel_idle at /usr/src/debug/kernel-2.6.32-131.0.15.el6/linux-2.6.32-131.0.15.el6.x86_64/drivers/idle/intel_idle.c:192

[drivers/idle/intel_idle.c]
168 static int intel_idle(struct cpuidle_device *dev, struct cpuidle_state *state)
169 {
170     unsigned long ecx = 1; /* break on interrupt flag */
171     unsigned long eax = (unsigned long)cpuidle_get_statedata(state);
172     unsigned int cstate;
173     ktime_t kt_before, kt_after;
174     s64 usec_delta;
175     int cpu = smp_processor_id();
176
177     cstate = (((eax) >> MWAIT_SUBSTATE_SIZE) & MWAIT_CSTATE_MASK) + 1;
178
179     local_irq_disable();
180
181     if (!(lapic_timer_reliable_states & (1 << (cstate))))
182         clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu);
183
184     kt_before = ktime_get_real();
185
186     stop_critical_timings();
187 #ifndef MODULE
188     trace_power_start(POWER_CSTATE, (eax >> 4) + 1, cpu);
189 #endif
190     if (!need_resched()) {
191
192         __monitor((void *)&current_thread_info()->flags, 0, 0); /* HERE */
193         smp_mb();
194         if (!need_resched())
195             __mwait(eax, ecx);
196     }
197
198     start_critical_timings();
199
200     kt_after = ktime_get_real();
201     usec_delta = ktime_to_us(ktime_sub(kt_after, kt_before));
202
203     local_irq_enable();
204
205     if (!(lapic_timer_reliable_states & (1 << (cstate))))
206         clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu);
207
208     return usec_delta;
209 }

[arch/x86/include/asm/processor.h]
747 static inline void __monitor(const void *eax, unsigned long ecx,
748                              unsigned long edx)
749 {
750     /* "monitor %eax, %ecx, %edx;" */
751     asm volatile(".byte 0x0f, 0x01, 0xc8;"
752                  :: "a" (eax), "c" (ecx), "d"(edx));
753 }

[root@ip-10-230-53-220 ~]# uname -a
Linux ip-10-230-53-220 2.6.32-131.0.15.el6mask_nonstop_tsc.x86_64 #1 SMP Thu Jun 2 13:10:15 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@ip-10-230-53-220 ~]# cat /proc/cpuinfo |grep nonstop_tsc
flags : fpu de tsc msr pae cx8 cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm
flags : fpu de tsc msr pae cx8 cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm
flags : fpu de tsc msr pae cx8 cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm
flags : fpu de tsc msr pae cx8 cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm

patch is not good.. don't distribute.

(In reply to comment #30)
> [root@ip-10-230-53-220 ~]# uname -a
> Linux ip-10-230-53-220 2.6.32-131.0.15.el6mask_nonstop_tsc.x86_64 #1 SMP Thu
> Jun 2 13:10:15 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
> [root@ip-10-230-53-220 ~]# cat /proc/cpuinfo |grep nonstop_tsc
> flags : fpu de tsc msr pae cx8 cmov pat clflush mmx fxsr sse sse2 ss ht
> syscall nx lm constant_tsc rep_good nonstop_tsc aperfmperf unfair_spinlock pni
> ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm

Argh... This is disappointing. I don't know why it didn't work... I'll find a machine with nonstop_tsc to fix and test it on before spinning another rpm.

Created attachment 502732 [details]
mask nonstop_tsc
Ah, here's the issue. Sigh, I should have looked at this code closer when I was grepping for NONSTOP_TSC.

In early_init_intel() we have

    if (c->x86_power & (1 << 8)) {
        set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
        set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
        if (!check_tsc_unstable())
            sched_clock_stable = 1;
    }

also in early_init_amd() we have a corresponding

    if (c->x86_power & (1 << 8)) {
        set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
        set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
    }

so it's getting flipped back on again. I should probably just clear all of x86_power, which is 8000_0007 edx. The patch I tried before only cleared the one NONSTOP_TSC bit, I'll roll another one that zeros out all of edx.

Created attachment 502735 [details]
zero all x86_power feature bits
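To make the "flipped back on again" effect concrete, here is a toy model of the check quoted from early_init_intel(). It is illustrative Python rather than kernel code, it is not the attached patch, and it assumes (as the comment describes) that the capability bits are derived from x86_power after any earlier masking; the function name early_init and the string flag names are made up for the sketch.

```python
# Toy model only; mirrors the quoted kernel check, not the real patch.
INVARIANT_TSC_BIT = 1 << 8  # bit 8 of x86_power (CPUID 8000_0007 edx)


def early_init(x86_power, caps):
    """Mimic the quoted check: bit 8 of x86_power sets both TSC caps."""
    caps = set(caps)
    if x86_power & INVARIANT_TSC_BIT:
        caps.update({"constant_tsc", "nonstop_tsc"})
    return caps


# First attempt: nonstop_tsc was cleared from caps, but x86_power still has
# bit 8 set, so the check above puts both flags straight back.
first_attempt = early_init(0x100, set())

# Second attempt: x86_power is zeroed, so the check never fires.
second_attempt = early_init(0, set())
```

In this model first_attempt still contains nonstop_tsc while second_attempt stays empty, which matches the before/after grep results reported in this thread.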
[root@ip-10-58-159-28 ~]# cat /root/BEFORE_INSTALL | grep nonstop_tsc
flags : fpu de tsc msr pae cx8 cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm
flags : fpu de tsc msr pae cx8 cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm
flags : fpu de tsc msr pae cx8 cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm
flags : fpu de tsc msr pae cx8 cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nonstop_tsc aperfmperf unfair_spinlock pni ssse3 cx16 sse4_1 sse4_2 popcnt hypervisor lahf_lm
[root@ip-10-58-159-28 ~]#
[root@ip-10-58-159-28 ~]# uname -a
Linux ip-10-58-159-28 2.6.32-131.0.15.el6zero_x86_power.x86_64 #1 SMP Fri Jun 3 03:09:47 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@ip-10-58-159-28 ~]#
[root@ip-10-58-159-28 ~]# cat /proc/cpuinfo | grep nonstop_tsc

OK.. I'm going to give my response in two parts.

First off.. yes.. the patch fixes the issue. After 15 hours 10/10 of the x86_64 6.1 eu-west-1 ami's are still running. So.. yes the patch is a PASS

Just to note 3 ami's crashed before I could get the patch installed, so I'm in a region that can reproduce the issue.

I am running kernel-qe beaker tier1/2 tests on 2 of the 10 boxes with the patch. I will post the results I have thus far, but the tests are still running and I do see some failures. Something to be checked by the kernel-qe team for sure.

Created attachment 502982 [details]
beaker tier1/2 kernel qe tests.. tests still running no complete
Comment on attachment 502982 [details]
beaker tier1/2 kernel qe tests.. tests still running no complete
have full logs.. tests finished
Created attachment 503004 [details]
tier1 & tier2 kernel qe tests
(In reply to comment #47)
> First off.. yes.. the patch fixes the issue. After 15 hours 10/10 of the x86_64
> 6.1 eu-west-1 ami's are still running. So.. yes the patch is a PASS
>
> Just to note 3 ami's crashed before I could get the patch installed, so I'm in
> a region that can reproduce the issue.
>

This is good news.

> I am running kernel-qe beaker tier1/2 tests on 2 of the 10 boxes with the patch. I
> will post the results I have thus far, but the tests are still running and I do
> see some failures. Something to be checked by the kernel-qe team for sure.

This is bad news. It sounds like I still shouldn't post the patch until we've determined whether or not it introduced these other test failures. That said, I don't imagine these tests were run before on this exact setup, since these setups weren't booting before, so there's a good chance these are different issues.

We should do the following
a) run the test kernel through all the regression tests here in house on our test machines and ensure all PASS (or determine the patch introduces regressions)
b) investigate the failures (get debug logs, etc) and use Igor's test machines for experiments - we would do this work under separate bugzillas for each issue.

I received an update from Amazon yesterday:

"Sorry, I thought I had sent a confirmation. The patch appears to fix the problem.
Thanks,
Ben"

I just tested the attachment.
1. download the attachment
 1003 bzip2 -d tests.tar.bz2
 1004 ls -ltra
 1005 tar -xvf tests.tar

should get two dirs..
drwxr-xr-x. 24 whayutin whayutin 4096 Jun 4 14:25 Tests_x86_64_01
drwxr-xr-x. 24 whayutin whayutin 4096 Jun 4 14:30 Tests_x86_6402

[whayutin@localhost Downloads]$ tar -tvf tests.tar | more
drwxr-xr-x whayutin/whayutin 0 2011-06-04 14:25 Tests_x86_64_01/
drwxr-xr-x whayutin/whayutin 0 2011-06-04 14:24 Tests_x86_64_01/operational/
-rw------- whayutin/whayutin 715 2011-06-04 14:24 Tests_x86_64_01/operational/tmp.klT1c1
-rw-r--r-- whayutin/whayutin 113 2011-06-04 14:24 Tests_x86_64_01/operational/operational.error.log
-rw-r--r-- whayutin/whayutin 0 2011-06-04 14:24 Tests_x86_64_01/operational/dmesg_operational.log
-rw-r--r-- whayutin/whayutin 336 2011-06-04 14:24 Tests_x86_64_01/operatio

Now that we know turning off nonstop_tsc resolves the issue, I've done another round of code analysis to see why, and if the patch I proposed would introduce other problems or not.

I believe the 'why' is because with nonstop_tsc turned off, sched_clock_stable stays off. With sched_clock_stable turned on the scheduler assumes sched_clock() is synchronized between processors. However, it's not synchronized with RHEL6 xenpv guests because it's been overridden with xen_sched_clock(), which attempts to measure unstolen time (which is different per vcpu). Upstream dropped this idea about a year ago with the following patch

commit 8a22b9996b001c88f2bfb54c6de6a05fc39e177a
Author: Jeremy Fitzhardinge <jeremy.fitzhardinge>
Date: Mon Jul 12 11:49:59 2010 -0700

    xen: drop xen_sched_clock in favour of using plain wallclock time

Likely the correct fix for this issue is to backport this patch, rather than to mask the cpuid features. If Fedora 15 instances didn't have problems running on the same hosts, then that would be some confirmation of this theory.

OTOH, the patch I've already created looks to be safe from the second part of my analysis, and it's a conservative/defensive patch to have in place. A clear advantage to using it over a backport of the patch above is that we don't have to switch sched_clock() now at the last minute, which could introduce regressions.
Masking cpuid bits of unneeded features is in general safe, and it reduces potential for other regressions as well. So, I guess I've convinced myself to post the masking patch for now. I'll open a new bug to consider what to do during 6.2 development.

(In reply to comment #57)
> commit 8a22b9996b001c88f2bfb54c6de6a05fc39e177a
> Author: Jeremy Fitzhardinge <jeremy.fitzhardinge>
> Date: Mon Jul 12 11:49:59 2010 -0700
>
> xen: drop xen_sched_clock in favour of using plain wallclock time
>

The mail thread related to this patch is here
http://lists.xensource.com/archives/html/xen-devel/2010-07/msg00738.html

If there are any rhts/beaker based tests that can specifically test this issue, please let cloud-qe know. Thank you!

Patch(es) available on kernel-2.6.32-156.el6

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Xen guests cannot make use of all CPU features, and in some cases they are even risky to be advertised. One such feature is CONSTANT_TSC. This feature prevents the TSC (Time Stamp Counter) from being marked as unstable, which allows the sched_clock_stable option to be enabled. Having the sched_clock_stable option enabled is problematic for Xen PV guests because the sched_clock() function has been overridden with the xen_sched_clock() function, which is not synchronized between virtual CPUs. This update provides a patch, which sets all x86_power features to 0 as a preventive measure against other potentially dangerous assumptions the kernel could make based on the features, fixing this issue.

verified per.. https://bugzilla.redhat.com/show_bug.cgi?id=710609

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1530.html
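As a footnote to the sched_clock analysis in this thread, the following sketch shows in miniature why a per-vCPU "unstolen time" clock cannot be treated as synchronized. It is a toy model with made-up numbers, not kernel code; unstolen_clock and cross_vcpu_delta are hypothetical names, and the only thing assumed from the thread is that each vCPU's clock excludes the time stolen from that vCPU.

```python
# Toy model with invented numbers; not xen_sched_clock() itself.

def unstolen_clock(wall_ns, stolen_ns):
    """Per-vCPU clock that advances only while that vCPU is running."""
    return wall_ns - stolen_ns


def cross_vcpu_delta(earlier, later):
    """Elapsed time between two (wall_ns, stolen_ns) samples from different
    vCPUs, as computed by code that trusts the clocks to be in sync."""
    return unstolen_clock(*later) - unstolen_clock(*earlier)


# vCPU 0 samples at wall time 100 with nothing stolen; vCPU 1 samples later,
# at wall time 110, but has had 30 units stolen by the hypervisor.
delta = cross_vcpu_delta((100, 0), (110, 30))
```

Here delta comes out negative even though the second sample was taken later in wall-clock terms, which is the kind of result code relying on sched_clock_stable does not expect.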