Bug 459413 - Suspend + WARNING + WQ on CPU0, prefer CPU1, corrupt root fs (TAINTED)
Suspend + WARNING + WQ on CPU0, prefer CPU1, corrupt root fs (TAINTED)
Status: CLOSED NEXTRELEASE
Product: Fedora
Classification: Fedora
Component: kernel (Show other bugs)
9
i686 Linux
medium Severity urgent
: ---
: ---
Assigned To: Kernel Maintainer List
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2008-08-18 13:31 EDT by Vegard Nossum
Modified: 2008-09-10 03:06 EDT (History)
0 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-09-05 08:20:02 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Vegard Nossum 2008-08-18 13:31:45 EDT
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1

NOTICE: The kernel in question was tainted by NVIDIA module. End of notice.

Report:

I ran suspend (as usual) with this command, from tty1:

# echo disk > /sys/power/state

This seems to have triggered a WARNING, although I did not notice it at the time:

Aug 17 23:34:35 localhost acpid: client has disconnected
Aug 17 23:34:35 localhost kernel: PM: Syncing filesystems ... done.
Aug 17 23:34:35 localhost acpid: client has disconnected
Aug 17 23:34:35 localhost kernel: Freezing user space processes ... (elapsed 0.00 seconds) done.
Aug 17 23:34:35 localhost kernel: Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
Aug 17 23:34:35 localhost kernel: PM: Shrinking memory... done (66905 pages freed)
Aug 17 23:34:35 localhost kernel: PM: Freed 267620 kbytes in 1.56 seconds (171.55 MB/s)
Aug 17 23:34:35 localhost kernel: Suspending console(s)
Aug 17 23:34:35 localhost kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
Aug 17 23:34:35 localhost kernel: ACPI: PCI interrupt for device 0000:07:00.1 disabled
Aug 17 23:34:35 localhost kernel: ACPI: PCI interrupt for device 0000:06:00.0 disabled
Aug 17 23:34:35 localhost kernel: NVRM: RmPowerManagement: 3
Aug 17 23:34:35 localhost kernel: ACPI: PCI interrupt for device 0000:00:1f.2 disabled
Aug 17 23:34:35 localhost kernel: ACPI: PCI interrupt for device 0000:00:1f.1 disabled
Aug 17 23:34:35 localhost kernel: ACPI: PCI interrupt for device 0000:00:1d.7 disabled
Aug 17 23:34:35 localhost kernel: ACPI: PCI interrupt for device 0000:00:1d.2 disabled
Aug 17 23:34:35 localhost kernel: ACPI: PCI interrupt for device 0000:00:1d.1 disabled
Aug 17 23:34:35 localhost kernel: ACPI: PCI interrupt for device 0000:00:1d.0 disabled
Aug 17 23:34:35 localhost kernel: ACPI: PCI interrupt for device 0000:00:1b.0 disabled
Aug 17 23:34:35 localhost kernel: ACPI: PCI interrupt for device 0000:00:1a.7 disabled
Aug 17 23:34:35 localhost kernel: ACPI: PCI interrupt for device 0000:00:1a.1 disabled
Aug 17 23:34:35 localhost kernel: ACPI: PCI interrupt for device 0000:00:1a.0 disabled
Aug 17 23:34:35 localhost kernel: ACPI: Preparing to enter system sleep state S4
Aug 17 23:34:35 localhost kernel: Disabling non-boot CPUs ...
Aug 17 23:34:35 localhost kernel: NOHZ: local_softirq_pending 100
Aug 17 23:34:35 localhost kernel: CPU 1 is now offline
Aug 17 23:34:35 localhost kernel: SMP alternatives: switching to UP code
Aug 17 23:34:35 localhost kernel: CPU1 is down
Aug 17 23:34:35 localhost kernel: ------------[ cut here ]------------
Aug 17 23:34:35 localhost kernel: WARNING: at arch/x86/kernel/smp_32.c:565 native_smp_call_function_mask+0
Aug 17 23:34:35 localhost kernel: Modules linked in: oprofile osst mtdchar usbserial ftl ircomm_tty ircomm
Aug 17 23:34:35 localhost kernel: snd joydev i2c_i801 soundcore i2c_core ssb iTCO_wdt iTCO_vendor_support 
Aug 17 23:34:35 localhost kernel: Pid: 10027, comm: bash Tainted: P         2.6.25.11-97.fc9.i686 #1
Aug 17 23:34:35 localhost kernel: [<c0426bc5>] warn_on_slowpath+0x47/0x73
Aug 17 23:34:35 localhost kernel: [<c04f3862>] ? number+0x106/0x1c0
Aug 17 23:34:35 localhost kernel: [<c062d4f3>] ? kprobe_flush_task+0x56/0x90
Aug 17 23:34:35 localhost kernel: [<c0423121>] ? hrtick_set+0x80/0xe5
Aug 17 23:34:35 localhost kernel: [<c06298f2>] ? schedule+0x6a9/0x6db
Aug 17 23:34:35 localhost kernel: [<c04f4104>] ? vsnprintf+0x3e7/0x420
Aug 17 23:34:35 localhost kernel: [<c041c1af>] ? __wake_up_locked+0x11/0x15
Aug 17 23:34:35 localhost kernel: [<c04133b5>] native_smp_call_function_mask+0x32/0xdc
Aug 17 23:34:35 localhost kernel: [<f8e9186a>] ? nmi_cpu_stop+0x0/0x1b [oprofile]
Aug 17 23:34:35 localhost kernel: [<f8e9186a>] ? nmi_cpu_stop+0x0/0x1b [oprofile]
Aug 17 23:34:35 localhost kernel: [<f8e9186a>] ? nmi_cpu_stop+0x0/0x1b [oprofile]
Aug 17 23:34:35 localhost kernel: [<c0413b78>] smp_call_function+0x18/0x1d
Aug 17 23:34:35 localhost kernel: [<c042acbe>] on_each_cpu+0x12/0x36
Aug 17 23:34:35 localhost kernel: [<f8e91911>] nmi_suspend+0x1c/0x21 [oprofile]
Aug 17 23:34:35 localhost kernel: [<c0560ad5>] sysdev_suspend+0x67/0x175
Aug 17 23:34:35 localhost kernel: [<c0564cdb>] device_power_down+0xa9/0xb1
Aug 17 23:34:35 localhost kernel: [<c04485b0>] hibernation_snapshot+0xc7/0x176
Aug 17 23:34:35 localhost kernel: [<c04486fc>] hibernate+0x9d/0x169
Aug 17 23:34:35 localhost kernel: [<c04473b3>] ? state_store+0x0/0xa8
Aug 17 23:34:35 localhost kernel: [<c0447409>] state_store+0x56/0xa8
Aug 17 23:34:35 localhost kernel: [<c04473b3>] ? state_store+0x0/0xa8
Aug 17 23:34:35 localhost kernel: [<c04f05b1>] kobj_attr_store+0x1a/0x22
Aug 17 23:34:35 localhost kernel: [<c04bb19a>] sysfs_write_file+0xb4/0xdf
Aug 17 23:34:35 localhost kernel: [<c04bb0e6>] ? sysfs_write_file+0x0/0xdf
Aug 17 23:34:35 localhost kernel: [<c048272f>] vfs_write+0x8a/0x12e
Aug 17 23:34:35 localhost kernel: [<c048286c>] sys_write+0x3b/0x60
Aug 17 23:34:35 localhost kernel: [<c0405bf2>] syscall_call+0x7/0xb
Aug 17 23:34:35 localhost kernel: [<c0620000>] ? agp_amd64_probe+0x134/0x3ee
Aug 17 23:34:35 localhost kernel: =======================
Aug 17 23:34:35 localhost kernel: ---[ end trace 94272564199f17c5 ]---
Aug 17 23:34:35 localhost kernel: PM: Creating hibernation image: 
Aug 17 23:34:35 localhost kernel: PM: Need to copy 127181 pages
Aug 17 23:34:35 localhost kernel: Intel machine check architecture supported.
Aug 17 23:34:35 localhost kernel: Intel machine check reporting enabled on CPU#0.
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: Enabling non-boot CPUs ...
Aug 17 23:34:35 localhost kernel: SMP alternatives: switching to SMP code
Aug 17 23:34:35 localhost kernel: Booting processor 1/1 ip 4000
Aug 17 23:34:35 localhost kernel: CPU 1 irqstacks, hard=c07b9000 soft=c0799000
Aug 17 23:34:35 localhost kernel: Initializing CPU#1
Aug 17 23:34:35 localhost kernel: Calibrating delay using timer specific routine.. 2926.04 BogoMIPS (lpj=1
Aug 17 23:34:35 localhost kernel: CPU: L1 I cache: 32K, L1 D cache: 32K
Aug 17 23:34:35 localhost kernel: CPU: L2 cache: 1024K
Aug 17 23:34:35 localhost kernel: CPU: Physical Processor ID: 0
Aug 17 23:34:35 localhost kernel: CPU: Processor Core ID: 1
Aug 17 23:34:35 localhost kernel: Intel machine check architecture supported.
Aug 17 23:34:35 localhost kernel: Intel machine check reporting enabled on CPU#1.
Aug 17 23:34:35 localhost kernel: CPU1: Intel(R) Pentium(R) Dual  CPU  T2310  @ 1.46GHz stepping 0d
Aug 17 23:34:35 localhost kernel: CPU1 is up
Aug 17 23:34:35 localhost kernel: ACPI: EC: acpi_ec_wait timeout, status = 8, expect_event = 1
Aug 17 23:34:35 localhost kernel: ACPI: EC: read timeout, command = 130
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: ACPI: PCI Interrupt 0000:00:1a.0[A] -> GSI 16 (level, low) -> IRQ 16
Aug 17 23:34:35 localhost kernel: usb usb3: root hub lost power or was reset
Aug 17 23:34:35 localhost kernel: ACPI: PCI Interrupt 0000:00:1a.1[B] -> GSI 21 (level, low) -> IRQ 21
Aug 17 23:34:35 localhost kernel: usb usb4: root hub lost power or was reset
Aug 17 23:34:35 localhost kernel: ACPI: PCI Interrupt 0000:00:1a.7[C] -> GSI 18 (level, low) -> IRQ 18
Aug 17 23:34:35 localhost kernel: usb usb1: root hub lost power or was reset
Aug 17 23:34:35 localhost kernel: ehci_hcd 0000:00:1a.7: debug port 1
Aug 17 23:34:35 localhost kernel: ACPI: PCI Interrupt 0000:00:1b.0[A] -> GSI 22 (level, low) -> IRQ 22
Aug 17 23:34:35 localhost kernel: ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 23 (level, low) -> IRQ 23
Aug 17 23:34:35 localhost kernel: usb usb5: root hub lost power or was reset
Aug 17 23:34:35 localhost kernel: ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 19
Aug 17 23:34:35 localhost kernel: usb usb6: root hub lost power or was reset
Aug 17 23:34:35 localhost kernel: ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 18
Aug 17 23:34:35 localhost kernel: usb usb7: root hub lost power or was reset
Aug 17 23:34:35 localhost kernel: ACPI: PCI Interrupt 0000:00:1d.7[A] -> GSI 23 (level, low) -> IRQ 23
Aug 17 23:34:35 localhost kernel: usb usb2: root hub lost power or was reset
Aug 17 23:34:35 localhost kernel: ehci_hcd 0000:00:1d.7: debug port 1
Aug 17 23:34:35 localhost kernel: ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18
Aug 17 23:34:35 localhost kernel: ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 19
Aug 17 23:34:35 localhost kernel: NVRM: RmPowerManagement: 5
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: ata2: SATA link down (SStatus 0 SControl 300)
Aug 17 23:34:35 localhost kernel: ata4.00: ACPI cmd ef/03:0c:00:00:00:a0 filtered out
Aug 17 23:34:35 localhost kernel: ata4.00: ACPI cmd ef/03:42:00:00:00:a0 filtered out
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Aug 17 23:34:35 localhost kernel: ata1.00: configured for UDMA/100
Aug 17 23:34:35 localhost kernel: sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
Aug 17 23:34:35 localhost kernel: sd 0:0:0:0: [sda] Write Protect is off
Aug 17 23:34:35 localhost kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't sup
Aug 17 23:34:35 localhost kernel: ata4.00: configured for UDMA/33
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: ACPI: PCI Interrupt 0000:06:00.0[A] -> GSI 19 (level, low) -> IRQ 19
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: ACPI: PCI Interrupt 0000:07:00.1[B] -> GSI 22 (level, low) -> IRQ 22
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: sd 0:0:0:0: [sda] Starting disk
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: b43-phy0: Loading firmware version 410.2160 (2007-05-26 15:32:10)
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: Restarting tasks ... WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: usb 2-4: USB disconnect, address 4
Aug 17 23:34:35 localhost kernel: done.
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: usb 2-4: new high speed USB device using ehci_hcd and address 5
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: usb 2-4: configuration #1 chosen from 1 choice
Aug 17 23:34:35 localhost kernel: uvcvideo: Found UVC 1.00 device Acer CrystalEye webcam (064e:a101)
Aug 17 23:34:35 localhost kernel: input: Acer CrystalEye webcam as /devices/pci0000:00/0000:00:1d.7/usb2/2
Aug 17 23:34:35 localhost kernel: usb 2-4: New USB device found, idVendor=064e, idProduct=a101
Aug 17 23:34:35 localhost kernel: usb 2-4: New USB device strings: Mfr=2, Product=1, SerialNumber=3
Aug 17 23:34:35 localhost kernel: usb 2-4: Product: Acer CrystalEye webcam
Aug 17 23:34:35 localhost kernel: usb 2-4: Manufacturer: SuYin
Aug 17 23:34:35 localhost kernel: usb 2-4: SerialNumber: CN0314-OV03-VA-R02.00.00
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:35 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:36 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:36 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:36 localhost kernel: WQ on CPU0, prefer CPU1
Aug 17 23:34:36 localhost kernel: WQ on CPU0, prefer CPU1
[...]

So it resumed all right, with the exception of the "WQ on CPU0, prefer CPU1" message which hit my log a total of 373801 times (!) in about 12 hours, during which the machine was mostly fully operational (I did not notice that anything was wrong).

Then I inserted a CD, but it wouldn't mount. Again, I found this in the log:

Aug 18 17:31:39 localhost kernel: sr 3:0:0:0: [sr0] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
Aug 18 17:31:39 localhost kernel: sr 3:0:0:0: [sr0] Sense Key : Medium Error [current] 
Aug 18 17:31:39 localhost kernel: Info fld=0x46ac0
Aug 18 17:31:39 localhost kernel: sr 3:0:0:0: [sr0] Add. Sense: L-EC uncorrectable error
Aug 18 17:31:39 localhost kernel: end_request: I/O error, dev sr0, sector 1157888
Aug 18 17:31:39 localhost kernel: Buffer I/O error on device sr0, logical block 289472
...
Aug 18 17:31:46 localhost kernel: WQ on CPU0, prefer CPU1
Aug 18 17:31:46 localhost kernel: sr 3:0:0:0: [sr0] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
Aug 18 17:31:46 localhost kernel: sr 3:0:0:0: [sr0] Sense Key : Medium Error [current] 
Aug 18 17:31:46 localhost kernel: Info fld=0x46ac0
Aug 18 17:31:46 localhost kernel: sr 3:0:0:0: [sr0] Add. Sense: L-EC uncorrectable error
Aug 18 17:31:46 localhost kernel: end_request: I/O error, dev sr0, sector 1157888
Aug 18 17:31:46 localhost kernel: Buffer I/O error on device sr0, logical block 289472
Aug 18 17:31:46 localhost kernel: WQ on CPU0, prefer CPU1

When trying to mount the disk,

# mount /dev/cdrom /mnt

I would get some strange error about stale NFS. Running "strace" on mount showed that open("/etc/mtab") failed with -ESTALE.

So I rebooted. When the machine came back, it started fsck, and then I had to do a "manual" fsck, which found a lot of problems, including orphaned blocks, blocks owned by multiple inodes, wrong refcounts, wrong filetypes (/etc/mtab was "file type 1 instead of 6", etc.) Luckily, after this, the machine booted all right again.

I had been using oprofile to profile a userspace program. The fact that oprofile shows up in the WARNING backtrace seems significant to me. Maybe oprofile + suspend is the culprit?

Note that I've been using suspend/resume with this kernel (2.6.25.11-97.fc9.i686 + NVIDIA module) without problems, so I am inclined to believe that this is NOT the problem. Moreover, I always run suspend from the console to avoid switching modes on suspend/resume, since this has been problematic in the past.

Reproducible: Didn't try

Steps to Reproduce:
(This is guesswork, but I am reluctant to try to reproduce the failure because of the disk corruption):

1. Run oprofile on userspace program (no kernel profiling)
2. Suspend
3. Resume
Comment 1 Vegard Nossum 2008-08-18 15:12:52 EDT
I don't know if this is related to the corruption that I saw, but I managed to get this oprofile warning on the latest -git kernel, at this commit (+ one unrelated build fix):

commit b09c3e3f1710b554348c98e78fbf4a661918779a
Author: Huang Weiyi <weiyi.huang@gmail.com>
Date:   Sun Aug 17 07:51:05 2008 +0800


BUG: using smp_processor_id() in preemptible [00000000] code: oprofiled/4133
APIC error on CPU1: 00(40)
APIC error on CPU0: 00(40)
APIC error on CPU1: 40(40)
APIC error on CPU0: 40(40)
caller is get_stagger+0x9/0x30
Pid: 4133, comm: oprofiled Not tainted 2.6.27-rc3-00415-g122c9e0 #9
 [<c037228e>] debug_smp_processor_id+0xce/0xd0
 [<c0583d19>] get_stagger+0x9/0x30
 [<c05843de>] p4_fill_in_addresses+0x1e/0x3a0
 [<c058313d>] nmi_setup+0xcd/0x190
 [<c05811fa>] oprofile_setup+0x3a/0xc0
 [<c05820a6>] event_buffer_open+0x56/0x80
 [<c01a22c4>] __dentry_open+0xf4/0x1f0
 [<c01a2407>] nameidata_to_filp+0x47/0x60
 [<c0582050>] ? event_buffer_open+0x0/0x80
 [<c01ae1bc>] do_filp_open+0x18c/0x6d0
 [<c015626d>] ? put_lock_stats+0xd/0x30
 [<c01b883c>] ? alloc_fd+0xdc/0x100
 [<c0371bb6>] ? _raw_spin_unlock+0x46/0x80
 [<c066af57>] ? _spin_unlock+0x27/0x50
 [<c01a208b>] do_sys_open+0x4b/0xe0
 [<c0362da4>] ? trace_hardirqs_on_thunk+0xc/0x10
 [<c01a2189>] sys_open+0x29/0x40
 [<c01040db>] sysenter_do_call+0x12/0x3f
 =======================

I will also report this directly to the kernel people, however.

Steps to reproduce:

1. oprofile --reset
2. oprofile --setup --no-vmlinux --image=/bin/bash
3. oprofile --start
Comment 2 Chuck Ebbert 2008-08-23 00:47:52 EDT
The warning in smp_32.c is:

WARN_ON(irqs_disabled());
Comment 3 Chuck Ebbert 2008-08-23 00:53:49 EDT
drivers/oprofile/cpu_buffer.c:                printk("WQ on CPU%d, prefer CPU%d\n",
Comment 4 Vegard Nossum 2008-08-23 01:17:52 EDT
Hi,

I started a thread on LKML (http://lkml.org/lkml/2008/8/18/459), which is related, but not exactly the same. Andi Kleen posted a patch in response, which is now in mainline -git (it fixes one of the warnings in oprofile code that appeared during the suspend phase).

I've been testing the oprofile + suspend a couple of more times, but the disk corruption was not seen. This makes me think it was the fault of the bad CD. I might try it again after doing a backup.

Thanks for looking into it!


Vegard
Comment 5 Fedora Update System 2008-08-31 11:02:16 EDT
kernel-2.6.26.3-27.fc9 has been submitted as an update for Fedora 9.
http://admin.fedoraproject.org/updates/kernel-2.6.26.3-27.fc9
Comment 6 Fedora Update System 2008-09-03 12:49:24 EDT
kernel-2.6.26.3-29.fc9 has been submitted as an update for Fedora 9.
http://admin.fedoraproject.org/updates/kernel-2.6.26.3-29.fc9
Comment 7 Fedora Update System 2008-09-05 08:19:36 EDT
kernel-2.6.26.3-29.fc9 has been pushed to the Fedora 9 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 8 Fedora Update System 2008-09-10 03:06:59 EDT
kernel-2.6.26.3-29.fc9 has been pushed to the Fedora 9 stable repository.  If problems still persist, please make note of it in this bug report.

Note You need to log in before you can comment on or make changes to this bug.