Bug 249105

Summary: [cpuidle] Crash as guest in qemu-0.9.0-2
Product: [Fedora] Fedora
Reporter: Jan Kratochvil <jan.kratochvil>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: rawhide
Keywords: Patch, Reopened
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Fixed In Version: kernel-2.6.23-0.44.rc0.git16.fc8
Doc Type: Bug Fix
Last Closed: 2007-10-05 00:03:12 UTC
Attachments:
  Kernel fix. (flags: none)

Description Jan Kratochvil 2007-07-20 21:56:19 UTC
Description of problem:
Rawhide kernels fail to boot as guests under the F7/Rawhide QEMU.
The F7 kernel kernel-2.6.21-1.3228.fc7.x86_64 boots fine there.
This may also be a QEMU bug; I am going to cross-post it there.

Version-Release number of selected component (if applicable):
kernel-2.6.23-0.15.rc0.git1.fc8.x86_64
kernel-2.6.23-0.29.rc0.git6.fc8.x86_64
kernel-2.6.23-0.35.rc0.git6.fc8.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. echo 'MODULES="$MODULES ata_piix"' >/etc/sysconfig/mkinitrd
   (so that the initrd built in the next step includes the driver for the
   QEMU IDE controller, which the guest needs when it boots later)
2. rpm -i kernel*.x86_64.rpm
3. Add to /etc/grub.conf: console=ttyS0 earlyprintk=serial,ttyS0
   (to be able to catch the crash messages)
4. sync;hdparm -f /dev/sda{,1,2};nice qemu-kvm -hda /dev/sda -snapshot -net nic -net tap -serial stdio -m 256
   or any other command that boots the new kernel in QEMU.
   Both `qemu-kvm' and `qemu-system-x86_64' behave the same way.

Actual results:
Unable to handle kernel NULL pointer dereference at 0000000000000018 RIP: 
 [<ffffffff81167181>] acpi_idle_init+0x17/0x110

Expected results:
Successful guest OS boot.

Additional info:
Can be worked around by setting the following in `config-x86_64-generic':
  CONFIG_ACPI_PROCESSOR=n


/**
 * acpi_idle_init - attaches the driver to a CPU
 * @dev: the CPU
 */
static int acpi_idle_init(struct cpuidle_device *dev)
{
        int cpu = dev->cpu;
        int i, count = 0;
        struct acpi_processor_cx *cx; 
        struct cpuidle_state *state;

        struct acpi_processor *pr = processors[cpu];
/*!!! pr == NULL !!!*/
        if (!pr->flags.power_setup_done)
                return -EINVAL;

Dump of assembler code for function acpi_idle_init:
0xffffffff8116716a <acpi_idle_init+0>:  push   %r14
0xffffffff8116716c <acpi_idle_init+2>:  push   %r13
0xffffffff8116716e <acpi_idle_init+4>:  push   %r12
0xffffffff81167170 <acpi_idle_init+6>:  mov    %rdi,%r12
0xffffffff81167173 <acpi_idle_init+9>:  push   %rbp
0xffffffff81167174 <acpi_idle_init+10>: push   %rbx
0xffffffff81167175 <acpi_idle_init+11>: movslq 0x4(%rdi),%rax
0xffffffff81167179 <acpi_idle_init+15>: mov    0xffffffff81902fa0(,%rax,8),%rbp
/*!!! %rbp == NULL !!!*/
0xffffffff81167181 <acpi_idle_init+23>: mov    0x18(%rbp),%al
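
The fault address reported in the oops follows from the disassembly above: the load at acpi_idle_init+23 reads from %rbp + 0x18, and with %rbp == NULL that is address 0x18. A minimal sketch of that arithmetic, assuming a hypothetical stand-in structure (`fake_acpi_processor' and both helper names are made up here; the real layout of struct acpi_processor is not reproduced):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct acpi_processor: only the distance of
 * `flags' from the start of the structure matters for this sketch. */
struct fake_acpi_processor {
        uint8_t pad[0x18];      /* whatever precedes `flags' in this build */
        uint8_t flags;          /* the byte loaded at acpi_idle_init+23 */
};

/* Address touched by `mov disp(%base),%al': base register plus displacement. */
static uintptr_t effective_address(uintptr_t base, size_t disp)
{
        return base + disp;
}

/* Fault address when the structure pointer is NULL (base register holds 0). */
static uintptr_t null_deref_fault_address(void)
{
        size_t disp = offsetof(struct fake_acpi_processor, flags);

        assert(disp == 0x18);
        return effective_address(0, disp);
}
```

Under these assumptions null_deref_fault_address() yields 0x18, the CR2 value in the oops. A small non-zero fault address like this is typical of reading a member through a NULL structure pointer.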

Linux version 2.6.23-0.35.rc0.git6.fc8 (kojibuilder.redhat.com) (gcc version 4.1.2 20070704 (Red Hat 4.1.2-15)) #1 SMP Thu Jul 19 17:21:21 EDT 2007
Command line: ro root=LABEL=host0-root console=ttyS0 earlyprintk=serial,ttyS0 2
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e8000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000000fff0000 (usable)
 BIOS-e820: 000000000fff0000 - 0000000010000000 (ACPI data)
 BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
end_pfn_map = 1048576
DMI not present or invalid.
ACPI: RSDP 000FA6A0, 0014 (r0 BOCHS )
ACPI: RSDT 0FFF0000, 002C (r1 BOCHS  BXPCRSDT        1 BXPC        1)
ACPI: FACP 0FFF002C, 0074 (r1 BOCHS  BXPCFACP        1 BXPC        1)
ACPI: DSDT 0FFF0100, 0832 (r1   BXPC   BXDSDT        1 INTL 20060912)
ACPI: FACS 0FFF00C0, 0040
ACPI: APIC 0FFF0938, 0040 (r1 BOCHS  BXPCAPIC        1 BXPC        1)
No NUMA configuration found
Faking a node at 0000000000000000-000000000fff0000
Bootmem setup node 0 0000000000000000-000000000fff0000
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 ->  1048576
early_node_map[2] active PFN ranges
    0:        0 ->      159
    0:      256 ->    65520
ACPI: PM-Timer IO Port: 0xb008
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 1, address 0xfec00000, GSI 0-23
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
swsusp: Registered nosave memory region: 000000000009f000 - 00000000000a0000
swsusp: Registered nosave memory region: 00000000000a0000 - 00000000000e8000
swsusp: Registered nosave memory region: 00000000000e8000 - 0000000000100000
Allocating PCI resources starting at 20000000 (gap: 10000000:effc0000)
SMP: Allowing 1 CPUs, 0 hotplug CPUs
PERCPU: Allocating 42504 bytes of per cpu data
Built 1 zonelists.  Total pages: 61646
Kernel command line: ro root=LABEL=host0-root console=ttyS0 earlyprintk=serial,ttyS0 2
Initializing CPU#0
PID hash table entries: 1024 (order: 10, 8192 bytes)
TSC calibration disturbed by SMI, using PIT calibration result
Marking TSC unstable due to TSCs unsynchronized
time.c: Detected 1994.477 MHz processor.
Console: colour VGA+ 80x25
Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES:    8
... MAX_LOCK_DEPTH:          30
... MAX_LOCKDEP_KEYS:        2048
... CLASSHASH_SIZE:           1024
... MAX_LOCKDEP_ENTRIES:     8192
... MAX_LOCKDEP_CHAINS:      16384
... CHAINHASH_SIZE:          8192
 memory used by lock dependency info: 1648 kB
 per task-struct memory footprint: 1680 bytes
Checking aperture...
Memory: 243312k/262080k available (2432k kernel code, 18380k reserved, 1485k data, 324k init)
SLUB: Genslabs=23, HWalign=64, Order=0-1, MinObjects=4, CPUs=1, Nodes=1
Calibrating delay using timer specific routine.. 4077.21 BogoMIPS (lpj=2038607)
Security Framework v1.0.0 initialized
SELinux:  Initializing.
selinux_register_security:  Registering secondary module capability
Capability LSM initialized as secondary
Dentry cache hash table entries: 32768 (order: 6, 262144 bytes)
Inode-cache hash table entries: 16384 (order: 5, 131072 bytes)
Mount-cache hash table entries: 256
CPU: L1 I cache: 8K
CPU: L2 cache: 128K
CPU 0/0 -> Node 0
SMP alternatives: switching to UP code
Freeing SMP alternatives: 24k freed
ACPI: Core revision 20070126
Using local APIC timer interrupts.
Brought up 1 CPUs
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Interpreter enabled
ACPI: (supports S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
* Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
* this clock source is slow. Consider trying other clock sources
PCI quirk: region b000-b03f claimed by PIIX4 ACPI
PCI quirk: region b100-b10f claimed by PIIX4 SMB
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *9 10 11 12)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 *9 10 11 12)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 6 devices
ACPI: ACPI bus type pnp unregistered
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
PCI-GART: No AMD northbridge found.
NET: Registered protocol family 2
Time: acpi_pm clocksource has been installed.
Switched to high resolution mode on CPU 0
IP route cache hash table entries: 2048 (order: 2, 16384 bytes)
TCP established hash table entries: 8192 (order: 7, 524288 bytes)
TCP bind hash table entries: 8192 (order: 6, 458752 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP reno registered
checking if image is initramfs... it is
Freeing initrd memory: 3089k freed
audit: initializing netlink socket (disabled)
audit(1184959335.171:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
ksign: Installing public key data
Loading keyring
- Added public key A49870A22C495EDE
- User ID: Red Hat, Inc. (Kernel Module GPG key)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
Limiting direct PCI/PCI transfers.
PCI: PIIX3: Enabling Passive Release on 0000:00:01.0
Activating ISA DMA hang workarounds.
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
Unable to handle kernel NULL pointer dereference at 0000000000000018 RIP: 
 [<ffffffff81167181>] acpi_idle_init+0x17/0x110
PGD 0 
Oops: 0000 [1] SMP 
CPU 0 
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.23-0.35.rc0.git6.fc8 #1
RIP: 0010:[<ffffffff81167181>]  [<ffffffff81167181>] acpi_idle_init+0x17/0x110
RSP: 0000:ffff81000fc21e40  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff81000fc5b9b0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff813baba0 RDI: ffff81000fc5b9b0
RBP: 0000000000000000 R08: ffffffff813baba0 R09: ffff81000fc21e50
R10: 0000000000000000 R11: ffff81000fcf9780 R12: ffff81000fc5b9b0
R13: 0000000000000000 R14: ffffffff814b4dc0 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffffffff813d4000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000018 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff81000fc20000, task ffff81000fc1e000)
Stack:  ffff81000fc5b9b0 00000000fffffffb ffffffff813a49a0 0000000000000000
 ffffffff814b4dc0 ffffffff811d6444 2222222222222222 ffff81000fc5b9b0
 0000000000000001 ffffffff811d651f 0000000000000000 ffffffff813a49a0
Call Trace:
 [<ffffffff811d6444>] cpuidle_attach_driver+0x55/0xa3
 [<ffffffff811d651f>] cpuidle_switch_driver+0x8d/0x100
 [<ffffffff811d663b>] cpuidle_register_driver+0x6c/0xac
 [<ffffffff814aa702>] acpi_processor_init+0xe0/0xf1
 [<ffffffff81490a48>] kernel_init+0x206/0x375
 [<ffffffff8125a008>] trace_hardirqs_on_thunk+0x35/0x37
 [<ffffffff810507d5>] trace_hardirqs_on+0x12f/0x153
 [<ffffffff8100aa28>] child_rip+0xa/0x12
 [<ffffffff8100a13c>] restore_args+0x0/0x30
 [<ffffffff81490842>] kernel_init+0x0/0x375
 [<ffffffff8100aa1e>] child_rip+0x0/0x12


Code: 8a 45 18 84 c0 0f 89 e0 00 00 00 a8 01 0f 84 d8 00 00 00 48 
RIP  [<ffffffff81167181>] acpi_idle_init+0x17/0x110
 RSP <ffff81000fc21e40>
CR2: 0000000000000018
Kernel panic - not syncing: Attempted to kill init!

Comment 1 Jan Kratochvil 2007-07-21 20:36:35 UTC
Created attachment 159731 [details]
Kernel fix.

An easier workaround: qemu -no-acpi

This is a feature request for QEMU: it could provide the ACPI CPU-type nodes,
but it currently does not.

Even so, the kernel should not crash when they are missing.

Comment 2 Jan Kratochvil 2007-07-22 06:58:59 UTC
The problem is no longer present on: kernel-2.6.23-0.44.rc0.git16.fc8

Comment 3 Jan Kratochvil 2007-08-22 21:49:36 UTC
Crashed again (without the -no-acpi workaround) on:
kernel-2.6.23-0.129.rc3.git4.fc8.x86_64
kvm-24-1.x86_64

Activating ISA DMA hang workarounds.
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
Unable to handle kernel NULL pointer dereference at 000000000000001c RIP: 
 [<ffffffff811701d9>] acpi_idle_init+0x17/0x110
PGD 0 
Oops: 0000 [1] SMP 
CPU 0 
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.23-0.129.rc3.git4.fc8 #1
RIP: 0010:[<ffffffff811701d9>]  [<ffffffff811701d9>] acpi_idle_init+0x17/0x110
RSP: 0000:ffff81000fcb9e40  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8100013f0000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff813be6e0 RDI: ffff8100013f0000
RBP: 0000000000000000 R08: ffffffff813be6e0 R09: ffff81000fcb9e50
R10: ffffffff811e14ec R11: ffff81000ff31be0 R12: ffff8100013f0000
R13: 0000000000000000 R14: ffffffff814bbd40 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffffffff813d9000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000000000001c CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff81000fcb8000, task ffff81000fcb6000)
Stack:  ffff8100013f0000 00000000fffffffb ffffffff813a7800 0000000000000000
 ffffffff814bbd40 ffffffff811e1340 2222222222222222 ffff8100013f0000
 0000000000000001 ffffffff811e141b 0000000000000000 ffffffff813a7800
Call Trace:
 [<ffffffff811e1340>] cpuidle_attach_driver+0x55/0xa3
 [<ffffffff811e141b>] cpuidle_switch_driver+0x8d/0x100
 [<ffffffff811e1537>] cpuidle_register_driver+0x6c/0xac
 [<ffffffff814b137d>] acpi_processor_init+0xe0/0xf1
 [<ffffffff81496768>] kernel_init+0x206/0x375
 [<ffffffff81269a3b>] trace_hardirqs_on_thunk+0x35/0x37
 [<ffffffff810541d9>] trace_hardirqs_on+0x12e/0x151
 [<ffffffff8100cb18>] child_rip+0xa/0x12
 [<ffffffff8100c22c>] restore_args+0x0/0x30
 [<ffffffff81496562>] kernel_init+0x0/0x375
 [<ffffffff8100cb0e>] child_rip+0x0/0x12


Code: 8a 45 1c 84 c0 0f 89 e0 00 00 00 a8 01 0f 84 d8 00 00 00 48 
RIP  [<ffffffff811701d9>] acpi_idle_init+0x17/0x110
 RSP <ffff81000fcb9e40>
CR2: 000000000000001c
Kernel panic - not syncing: Attempted to kill init!
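
The Code: line of this oops starts with 8a 45 1c, whereas the one in the Description starts with 8a 45 18. Both decode to the same one-byte load through %rbp, differing only in the displacement, which matches the change in CR2 from 0x18 to 0x1c. A minimal sketch of that decoding (the function and structure names are made up for illustration; it handles only this single instruction form and is no general disassembler):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mov_load {
        unsigned mod;   /* ModRM bits 7..6: addressing mode */
        unsigned reg;   /* ModRM bits 5..3: destination register (0 = %al) */
        unsigned rm;    /* ModRM bits 2..0: base register (5 = %rbp) */
        int8_t disp8;   /* sign-extended 8-bit displacement */
};

/* Decode opcode 0x8a (MOV r8, r/m8) with mod == 01, i.e. base register plus
 * disp8, and no prefixes. Returns 0 on success, -1 for any other form. */
static int decode_mov_r8_disp8(const uint8_t *code, struct mov_load *out)
{
        assert(code != NULL && out != NULL);

        if (code[0] != 0x8a)
                return -1;
        out->mod = code[1] >> 6;
        out->reg = (code[1] >> 3) & 7;
        out->rm = code[1] & 7;
        if (out->mod != 1 || out->rm == 4)      /* rm == 4 would need a SIB byte */
                return -1;
        out->disp8 = (int8_t)code[2];
        return 0;
}
```

For both byte sequences this gives reg 0 (%al) and rm 5 (%rbp), with displacements 0x18 and 0x1c. That suggests the member being read sits 4 bytes further into the structure in this build; this is an inference from the bytes, not something confirmed in this report.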


Comment 4 Dave Jones 2007-08-22 22:07:33 UTC
OK, so this bug went away for a while when I dropped the cpuidle patch (part of
the highres timer/tickless64 patchkit we carry).

I'll point the upstream cpuidle developer at this.

Comment 5 Chuck Ebbert 2007-10-05 00:03:12 UTC
Fixed by adding a check for NULL in the code.
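
A minimal sketch of what such a check could look like, modelled on the acpi_idle_init() excerpt in the Description. The structures are reduced stand-ins, and NR_CPUS, the function name and the -EINVAL return value are assumptions for illustration; the committed patch is not shown in this report.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_CPUS 4       /* arbitrary size for this sketch */

/* Reduced stand-ins for the kernel structures quoted in the Description. */
struct acpi_processor {
        struct {
                unsigned power_setup_done:1;
        } flags;
};

struct cpuidle_device {
        int cpu;
};

/* A slot stays NULL for a CPU that has no ACPI processor object. */
static struct acpi_processor *processors[NR_CPUS];

static int acpi_idle_init_sketch(struct cpuidle_device *dev)
{
        struct acpi_processor *pr;

        assert(dev->cpu >= 0 && dev->cpu < NR_CPUS);
        pr = processors[dev->cpu];

        /* The added guard: bail out instead of dereferencing NULL.
         * The error code chosen here is an assumption. */
        if (!pr)
                return -EINVAL;
        if (!pr->flags.power_setup_done)
                return -EINVAL;

        return 0;       /* the real function goes on to set up the C-states */
}
```

With an empty processors[] slot, as in the QEMU guest, the sketch returns an error instead of faulting. Once a processor with power_setup_done set is registered in the slot, it returns 0.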

Comment 6 Jan Kratochvil 2007-10-05 07:14:16 UTC
So I assume this is the patch from Comment 1.