Description of problem:
Unable to boot Rawhide kernels in F7/Rawhide QEMU.
F7 kernel-2.6.21-1.3228.fc7.x86_64 boots there fine.
It may also be a QEMU bug, going to crosspost it there.

Version-Release number of selected component (if applicable):
kernel-2.6.23-0.15.rc0.git1.fc8.x86_64
kernel-2.6.23-0.29.rc0.git6.fc8.x86_64
kernel-2.6.23-0.35.rc0.git6.fc8.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. echo 'MODULES="$MODULES ata_piix"' >/etc/sysconfig/mkinitrd
   (to find the QEMU IDE driver using initrd built in the next step on later QEMU guest OS boot)
2. rpm -i kernel*.x86_64.rpm
3. Add to /etc/grub.conf: console=ttyS0 earlyprintk=serial,ttyS0
   (to be able to catch the crash messages)
4. sync;hdparm -f /dev/sda{,1,2};nice qemu-kvm -hda /dev/sda -snapshot -net nic -net tap -serial stdio -m 256
   or any other command to boot the new kernel in QEMU
Both `qemu-kvm' and `qemu-system-x86_64' behave the same way.

Actual results:
Unable to handle kernel NULL pointer dereference at 0000000000000018 RIP:
 [<ffffffff81167181>] acpi_idle_init+0x17/0x110

Expected results:
Successful guest OS boot.

Additional info:
Workaroundable by setting in `config-x86_64-generic':
CONFIG_ACPI_PROCESSOR=n

/**
 * acpi_idle_init - attaches the driver to a CPU
 * @dev: the CPU
 */
static int acpi_idle_init(struct cpuidle_device *dev)
{
	int cpu = dev->cpu;
	int i, count = 0;
	struct acpi_processor_cx *cx;
	struct cpuidle_state *state;

	struct acpi_processor *pr = processors[cpu];
/*!!! pr == NULL !!!*/
	if (!pr->flags.power_setup_done)
		return -EINVAL;

Dump of assembler code for function acpi_idle_init:
0xffffffff8116716a <acpi_idle_init+0>:  push   %r14
0xffffffff8116716c <acpi_idle_init+2>:  push   %r13
0xffffffff8116716e <acpi_idle_init+4>:  push   %r12
0xffffffff81167170 <acpi_idle_init+6>:  mov    %rdi,%r12
0xffffffff81167173 <acpi_idle_init+9>:  push   %rbp
0xffffffff81167174 <acpi_idle_init+10>: push   %rbx
0xffffffff81167175 <acpi_idle_init+11>: movslq 0x4(%rdi),%rax
0xffffffff81167179 <acpi_idle_init+15>: mov    0xffffffff81902fa0(,%rax,8),%rbp
/*!!! %rbp == NULL !!!*/
0xffffffff81167181 <acpi_idle_init+23>: mov    0x18(%rbp),%al

Linux version 2.6.23-0.35.rc0.git6.fc8 (kojibuilder.redhat.com) (gcc version 4.1.2 20070704 (Red Hat 4.1.2-15)) #1 SMP Thu Jul 19 17:21:21 EDT 2007
Command line: ro root=LABEL=host0-root console=ttyS0 earlyprintk=serial,ttyS0 2
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e8000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000000fff0000 (usable)
 BIOS-e820: 000000000fff0000 - 0000000010000000 (ACPI data)
 BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
end_pfn_map = 1048576
DMI not present or invalid.
ACPI: RSDP 000FA6A0, 0014 (r0 BOCHS )
ACPI: RSDT 0FFF0000, 002C (r1 BOCHS BXPCRSDT 1 BXPC 1)
ACPI: FACP 0FFF002C, 0074 (r1 BOCHS BXPCFACP 1 BXPC 1)
ACPI: DSDT 0FFF0100, 0832 (r1 BXPC BXDSDT 1 INTL 20060912)
ACPI: FACS 0FFF00C0, 0040
ACPI: APIC 0FFF0938, 0040 (r1 BOCHS BXPCAPIC 1 BXPC 1)
No NUMA configuration found
Faking a node at 0000000000000000-000000000fff0000
Bootmem setup node 0 0000000000000000-000000000fff0000
Zone PFN ranges:
  DMA 0 -> 4096
  DMA32 4096 -> 1048576
  Normal 1048576 -> 1048576
early_node_map[2] active PFN ranges
  0: 0 -> 159
  0: 256 -> 65520
ACPI: PM-Timer IO Port: 0xb008
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 1, address 0xfec00000, GSI 0-23
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
swsusp: Registered nosave memory region: 000000000009f000 - 00000000000a0000
swsusp: Registered nosave memory region: 00000000000a0000 - 00000000000e8000
swsusp: Registered nosave memory region: 00000000000e8000 - 0000000000100000
Allocating PCI resources starting at 20000000 (gap: 10000000:effc0000)
SMP: Allowing 1 CPUs, 0 hotplug CPUs
PERCPU: Allocating 42504 bytes of per cpu data
Built 1 zonelists. Total pages: 61646
Kernel command line: ro root=LABEL=host0-root console=ttyS0 earlyprintk=serial,ttyS0 2
Initializing CPU#0
PID hash table entries: 1024 (order: 10, 8192 bytes)
TSC calibration disturbed by SMI, using PIT calibration result
Marking TSC unstable due to TSCs unsynchronized
time.c: Detected 1994.477 MHz processor.
Console: colour VGA+ 80x25
Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES: 8
... MAX_LOCK_DEPTH: 30
... MAX_LOCKDEP_KEYS: 2048
... CLASSHASH_SIZE: 1024
... MAX_LOCKDEP_ENTRIES: 8192
... MAX_LOCKDEP_CHAINS: 16384
... CHAINHASH_SIZE: 8192
 memory used by lock dependency info: 1648 kB
 per task-struct memory footprint: 1680 bytes
Checking aperture...
Memory: 243312k/262080k available (2432k kernel code, 18380k reserved, 1485k data, 324k init)
SLUB: Genslabs=23, HWalign=64, Order=0-1, MinObjects=4, CPUs=1, Nodes=1
Calibrating delay using timer specific routine.. 4077.21 BogoMIPS (lpj=2038607)
Security Framework v1.0.0 initialized
SELinux: Initializing.
selinux_register_security: Registering secondary module capability
Capability LSM initialized as secondary
Dentry cache hash table entries: 32768 (order: 6, 262144 bytes)
Inode-cache hash table entries: 16384 (order: 5, 131072 bytes)
Mount-cache hash table entries: 256
CPU: L1 I cache: 8K
CPU: L2 cache: 128K
CPU 0/0 -> Node 0
SMP alternatives: switching to UP code
Freeing SMP alternatives: 24k freed
ACPI: Core revision 20070126
Using local APIC timer interrupts.
Brought up 1 CPUs
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Interpreter enabled
ACPI: (supports S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
* Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
* this clock source is slow. Consider trying other clock sources
PCI quirk: region b000-b03f claimed by PIIX4 ACPI
PCI quirk: region b100-b10f claimed by PIIX4 SMB
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *9 10 11 12)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 *9 10 11 12)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 6 devices
ACPI: ACPI bus type pnp unregistered
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
PCI-GART: No AMD northbridge found.
NET: Registered protocol family 2
Time: acpi_pm clocksource has been installed.
Switched to high resolution mode on CPU 0
IP route cache hash table entries: 2048 (order: 2, 16384 bytes)
TCP established hash table entries: 8192 (order: 7, 524288 bytes)
TCP bind hash table entries: 8192 (order: 6, 458752 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP reno registered
checking if image is initramfs... it is
Freeing initrd memory: 3089k freed
audit: initializing netlink socket (disabled)
audit(1184959335.171:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
ksign: Installing public key data
Loading keyring
- Added public key A49870A22C495EDE
- User ID: Red Hat, Inc. (Kernel Module GPG key)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
Limiting direct PCI/PCI transfers.
PCI: PIIX3: Enabling Passive Release on 0000:00:01.0
Activating ISA DMA hang workarounds.
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
Unable to handle kernel NULL pointer dereference at 0000000000000018 RIP:
 [<ffffffff81167181>] acpi_idle_init+0x17/0x110
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.23-0.35.rc0.git6.fc8 #1
RIP: 0010:[<ffffffff81167181>] [<ffffffff81167181>] acpi_idle_init+0x17/0x110
RSP: 0000:ffff81000fc21e40 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff81000fc5b9b0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff813baba0 RDI: ffff81000fc5b9b0
RBP: 0000000000000000 R08: ffffffff813baba0 R09: ffff81000fc21e50
R10: 0000000000000000 R11: ffff81000fcf9780 R12: ffff81000fc5b9b0
R13: 0000000000000000 R14: ffffffff814b4dc0 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffffff813d4000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000018 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff81000fc20000, task ffff81000fc1e000)
Stack: ffff81000fc5b9b0 00000000fffffffb ffffffff813a49a0 0000000000000000
 ffffffff814b4dc0 ffffffff811d6444 2222222222222222 ffff81000fc5b9b0
 0000000000000001 ffffffff811d651f 0000000000000000 ffffffff813a49a0
Call Trace:
 [<ffffffff811d6444>] cpuidle_attach_driver+0x55/0xa3
 [<ffffffff811d651f>] cpuidle_switch_driver+0x8d/0x100
 [<ffffffff811d663b>] cpuidle_register_driver+0x6c/0xac
 [<ffffffff814aa702>] acpi_processor_init+0xe0/0xf1
 [<ffffffff81490a48>] kernel_init+0x206/0x375
 [<ffffffff8125a008>] trace_hardirqs_on_thunk+0x35/0x37
 [<ffffffff810507d5>] trace_hardirqs_on+0x12f/0x153
 [<ffffffff8100aa28>] child_rip+0xa/0x12
 [<ffffffff8100a13c>] restore_args+0x0/0x30
 [<ffffffff81490842>] kernel_init+0x0/0x375
 [<ffffffff8100aa1e>] child_rip+0x0/0x12

Code: 8a 45 18 84 c0 0f 89 e0 00 00 00 a8 01 0f 84 d8 00 00 00 48
RIP [<ffffffff81167181>] acpi_idle_init+0x17/0x110
 RSP <ffff81000fc21e40>
CR2: 0000000000000018
Kernel panic - not syncing: Attempted to kill init!
Created attachment 159731
Kernel fix.

An easier workaround: qemu -no-acpi

This is a feature request for QEMU: it could provide the ACPI CPU type nodes, but it currently does not. Either way, the kernel should not crash when they are missing.
Problem no longer present on: kernel-2.6.23-0.44.rc0.git16.fc8
Crashed again (without the -no-acpi workaround) on:
kernel-2.6.23-0.129.rc3.git4.fc8.x86_64
kvm-24-1.x86_64

Activating ISA DMA hang workarounds.
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
Unable to handle kernel NULL pointer dereference at 000000000000001c RIP:
 [<ffffffff811701d9>] acpi_idle_init+0x17/0x110
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.23-0.129.rc3.git4.fc8 #1
RIP: 0010:[<ffffffff811701d9>] [<ffffffff811701d9>] acpi_idle_init+0x17/0x110
RSP: 0000:ffff81000fcb9e40 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8100013f0000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff813be6e0 RDI: ffff8100013f0000
RBP: 0000000000000000 R08: ffffffff813be6e0 R09: ffff81000fcb9e50
R10: ffffffff811e14ec R11: ffff81000ff31be0 R12: ffff8100013f0000
R13: 0000000000000000 R14: ffffffff814bbd40 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffffff813d9000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000000000001c CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff81000fcb8000, task ffff81000fcb6000)
Stack: ffff8100013f0000 00000000fffffffb ffffffff813a7800 0000000000000000
 ffffffff814bbd40 ffffffff811e1340 2222222222222222 ffff8100013f0000
 0000000000000001 ffffffff811e141b 0000000000000000 ffffffff813a7800
Call Trace:
 [<ffffffff811e1340>] cpuidle_attach_driver+0x55/0xa3
 [<ffffffff811e141b>] cpuidle_switch_driver+0x8d/0x100
 [<ffffffff811e1537>] cpuidle_register_driver+0x6c/0xac
 [<ffffffff814b137d>] acpi_processor_init+0xe0/0xf1
 [<ffffffff81496768>] kernel_init+0x206/0x375
 [<ffffffff81269a3b>] trace_hardirqs_on_thunk+0x35/0x37
 [<ffffffff810541d9>] trace_hardirqs_on+0x12e/0x151
 [<ffffffff8100cb18>] child_rip+0xa/0x12
 [<ffffffff8100c22c>] restore_args+0x0/0x30
 [<ffffffff81496562>] kernel_init+0x0/0x375
 [<ffffffff8100cb0e>] child_rip+0x0/0x12

Code: 8a 45 1c 84 c0 0f 89 e0 00 00 00 a8 01 0f 84 d8 00 00 00 48
RIP [<ffffffff811701d9>] acpi_idle_init+0x17/0x110
 RSP <ffff81000fcb9e40>
CR2: 000000000000001c
Kernel panic - not syncing: Attempted to kill init!
ok, so this bug went away for a while when I dropped the cpuidle patch (part of the highres timer/tickless64 patchkit we carry). I'll point the upstream cpuidle developer at this.
Fixed by adding a check for NULL in the code.
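
For readers following along, here is a rough userspace sketch of what such a NULL check might look like. It is not the actual kernel patch: the real change is not quoted in this report, and every sketch_* name, the table size, and the choice of -EINVAL for the NULL case are assumptions made for illustration. It only models the pattern from the original excerpt, where processors[cpu] can be NULL and pr->flags is read without a test.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4 /* hypothetical table size, illustration only */

/* Simplified stand-ins for the kernel structures named in the report. */
struct sketch_processor {
    struct { unsigned power_setup_done : 1; } flags;
};

struct sketch_cpuidle_device {
    int cpu;
};

/* Models processors[]: an entry stays NULL when no ACPI processor
 * object was registered for that CPU. */
static struct sketch_processor *sketch_processors[SKETCH_NR_CPUS];

static int sketch_acpi_idle_init(struct sketch_cpuidle_device *dev)
{
    struct sketch_processor *pr;

    if (dev == NULL || dev->cpu < 0 || dev->cpu >= SKETCH_NR_CPUS)
        return -EINVAL;

    pr = sketch_processors[dev->cpu];

    /* The guard: test pr before touching pr->flags. */
    if (pr == NULL || !pr->flags.power_setup_done)
        return -EINVAL;

    return 0;
}

/* Exercises both the missing-processor and the ready-processor case. */
static void sketch_self_check(void)
{
    static struct sketch_processor ready = { .flags = { .power_setup_done = 1 } };
    struct sketch_cpuidle_device dev = { .cpu = 0 };

    assert(sketch_acpi_idle_init(&dev) == -EINVAL); /* entry is NULL */
    sketch_processors[0] = &ready;
    assert(sketch_acpi_idle_init(&dev) == 0);       /* entry is set up */
    sketch_processors[0] = NULL;                    /* restore the table */
}
```

With a guard of this kind, a guest whose ACPI tables describe no processor object would get an error back from the init routine instead of an oops.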
So I assume the fix is the patch from Comment 1.