Bug 834116 - ext4lazyinit goes a bit crazy on i386
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: e2fsprogs
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Eric Sandeen
QA Contact: Fedora Extras Quality Assurance
 
Reported: 2012-06-20 21:22 UTC by Richard W.M. Jones
Modified: 2012-06-21 07:33 UTC (History)

Doc Type: Bug Fix
Last Closed: 2012-06-20 21:38:39 UTC
Type: Bug

Description Richard W.M. Jones 2012-06-20 21:22:01 UTC
Description of problem:

When building libguestfs in Rawhide, on i386 *only* (and *not*
on x86-64) I see this message repeated over and over again:

[74256.755716] BUG: soft lockup - CPU#0 stuck for 22s! [ext4lazyinit:172]
[74256.755719] Modules linked in: kvm_amd kvm i2c_piix4 i2c_core virtio_net virtio_scsi virtio_blk virtio_rng virtio_balloon virtio_mmio sparse_keymap rfkill sym53c8xx scsi_transport_spi crc8 crc_ccitt crc_itu_t libcrc32c
[74256.755719] irq event stamp: 811
[74256.755719] hardirqs last  enabled at (811): [<c09ed8bd>] restore_all_notrace+0x0/0x18
[74256.755719] hardirqs last disabled at (809): [<c044585e>] __do_softirq+0xde/0x330
[74256.755719] softirqs last  enabled at (810): [<c0445895>] __do_softirq+0x115/0x330
[74256.755719] softirqs last disabled at (805): [<c0405005>] do_softirq+0xa5/0x100
[74256.755719] Modules linked in: kvm_amd kvm i2c_piix4 i2c_core virtio_net virtio_scsi virtio_blk virtio_rng virtio_balloon virtio_mmio sparse_keymap rfkill sym53c8xx scsi_transport_spi crc8 crc_ccitt crc_itu_t libcrc32c
[74256.755719] 
[74256.755719] Pid: 172, comm: ext4lazyinit Tainted: G        W    3.5.0-0.rc3.git0.1.fc18.i686 #1 Bochs Bochs
[74256.755719] EIP: 0060:[<c06c47db>] EFLAGS: 00000246 CPU: 0
[74256.755719] EIP is at do_raw_spin_lock+0x5b/0x130
[74256.755719] EAX: dcda6000 EBX: dcc520f0 ECX: 00000000 EDX: 00000050
[74256.755719] ESI: 590f4008 EDI: 00000000 EBP: dcda7ee4 ESP: dcda7ed0
[74256.755719]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[74256.755719] CR0: 8005003b CR2: b77926f0 CR3: 1cecc000 CR4: 000006d0
[74256.755719] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[74256.755719] DR6: 00000000 DR7: 00000000
[74256.755719] Process ext4lazyinit (pid: 172, ti=dcda6000 task=dcd4ab20 task.ti=dcda6000)
[74256.755719] Stack:
[74256.755719]  97015d28 00000000 dcc52100 dcc520f0 00000001 dcda7f04 c09ecc52 00000000
[74256.755719]  00000002 00000000 c05e0641 dcc520f0 dccea0d0 dcda7f60 c05e0641 db92bd98
[74256.755719]  dcda7f18 db7f14cc dcda7f60 c09e9eec 00000000 00000002 00002105 00000000
[74256.755719] Call Trace:
[74256.755719]  [<c09ecc52>] _raw_spin_lock+0x62/0x80
[74256.755719]  [<c05e0641>] ? ext4_init_inode_table+0x1e1/0x390
[74256.755719]  [<c05e0641>] ext4_init_inode_table+0x1e1/0x390
[74256.755719]  [<c09e9eec>] ? mutex_lock_nested+0x27c/0x330
[74256.755719]  [<c05f4a0c>] ext4_lazyinit_thread+0x22c/0x280
[74256.755719]  [<c05f47e0>] ? ext4_unregister_li_request+0x60/0x60
[74256.755719]  [<c04620dd>] kthread+0x7d/0x90
[74256.755719]  [<c0462060>] ? kthread_worker_fn+0x170/0x170
[74256.755719]  [<c09f5982>] kernel_thread_helper+0x6/0x10
[74256.755719] Code: 58 d7 c0 39 43 08 0f 84 b9 00 00 00 0f b7 13 38 d6 74 4d 69 05 20 df c3 c0 e8 03 00 00 c7 45 f0 01 00 00 00 89 45 ec 31 f6 31 ff <31> c0 39 f8 73 0e 83 7d f0 00 0f 85 9b 00 00 00 31 f6 31 ff 39 

(Repeated once per second).

This eventually appears to cause builds on i386 to fail.

You can see a full log here:

http://kojipkgs.fedoraproject.org//work/tasks/4753/4174753/build.log

(Warning: the file is very long. The errors start about
halfway through.)
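
Since the log is long, a small script can locate the soft lockup messages instead of scrolling. The following is only a sketch: the regular expression is modelled on the single message format quoted above, and the function names (find_soft_lockups, summarize) are made up for illustration. It assumes the log has already been downloaded to a local file.

```python
import re
from collections import Counter

# Pattern modelled on the message quoted in this report, e.g.
# "[74256.755716] BUG: soft lockup - CPU#0 stuck for 22s! [ext4lazyinit:172]"
SOFT_LOCKUP = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(lines):
    """Yield one dict per soft lockup message, with its 1-based line number."""
    for lineno, line in enumerate(lines, start=1):
        m = SOFT_LOCKUP.search(line)
        if m:
            yield {
                "line": lineno,
                "timestamp": float(m.group("ts")),
                "cpu": int(m.group("cpu")),
                "stuck_seconds": int(m.group("secs")),
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
            }


def summarize(lines):
    """Return the number of hits, the first matching line, and hits per task."""
    hits = list(find_soft_lockups(lines))
    return {
        "count": len(hits),
        "first_line": hits[0]["line"] if hits else None,
        "by_task": dict(Counter(h["comm"] for h in hits)),
    }
```

With the log saved as build.log (a local file name chosen here), calling summarize(open("build.log", errors="replace")) reports where the first message appears and which tasks were stuck.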

Version-Release number of selected component (if applicable):

kernel 3.5.0-0.rc3.git0.1.fc18.i686
e2fsprogs-1.42.4-1.fc18.i686

How reproducible:

?

Steps to Reproduce:
1. unknown

Comment 1 Richard W.M. Jones 2012-06-20 21:38:39 UTC
I cannot reproduce this on kernel git0.2.

The kernel %changelog mentions only "Disable debugging options",
so I don't know whether the problem has genuinely gone away or
whether we're just not debugging the problem anymore.
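
One way to see what "Disable debugging options" changed is to compare the lock-debugging settings in the two kernels' config files. The sketch below parses a kernel config and reports a handful of options. The option names exist in mainline Kconfig, but which of them the Fedora build actually toggled between git0.1 and git0.2 is an assumption, not something stated in this bug. The function names are hypothetical.

```python
import re

# Lock-debugging options from mainline Kconfig. Whether these are the ones
# the changelog entry refers to is an assumption.
LOCK_DEBUG_OPTIONS = (
    "CONFIG_DEBUG_SPINLOCK",
    "CONFIG_PROVE_LOCKING",
    "CONFIG_LOCKDEP",
    "CONFIG_DEBUG_MUTEXES",
)

SET_LINE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_LINE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Map option name to its value; '# CONFIG_X is not set' becomes 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = SET_LINE.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = UNSET_LINE.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def lock_debug_status(text):
    """Return the value of each lock-debugging option, 'n' if absent."""
    options = parse_kconfig(text)
    return {name: options.get(name, "n") for name in LOCK_DEBUG_OPTIONS}
```

On Fedora the config for an installed kernel is normally found at /boot/config-<release>, so running lock_debug_status on the contents of each kernel's file and comparing the two results would show whether spinlock debugging was switched off.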

Comment 2 Eric Sandeen 2012-06-20 21:46:23 UTC
Hmm, well, this would have been a kernel bug anyway ;)

Sorry, what changed to make this go away?

Comment 3 Eric Sandeen 2012-06-20 21:48:58 UTC
Oh, if you switched to a nondebug kernel, and it went away, I'm _guessing_ we still have a real problem ;)

Comment 4 Richard W.M. Jones 2012-06-21 07:33:28 UTC
Yes, kernel git0.1 -> git0.2 *apparently* fixed the problem.

