Bug 222550 - inconsistent lock state, with new firewire driver
Summary: inconsistent lock state, with new firewire driver
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: All
OS: Linux
medium
high
Target Milestone: ---
Assignee: Kristian Høgsberg
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2007-01-13 23:22 UTC by Deji Akingunola
Modified: 2007-11-30 22:11 UTC (History)
7 users (show)

Fixed In Version: kernel-2.6.21-1.3218.fc8
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-06-13 05:37:14 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
Firewire dmesg output from before today's HAL update (4.82 KB, text/plain)
2007-02-09 03:45 UTC, Michel Alexandre Salim
no flags Details

Description Deji Akingunola 2007-01-13 23:22:55 UTC
Description of problem: On pluging in an external disk drive using the firewire
interface, i got the below in the logs.
On a positive note, this driver actually detects the device and I was able to
mount the partitions on it; whereas previous kernels were unable to activate the
device.

<<
Jan 13 18:05:31 agape kernel: 
Jan 13 18:05:31 agape kernel: =================================
Jan 13 18:05:31 agape kernel: [ INFO: inconsistent lock state ]
Jan 13 18:05:31 agape kernel: 2.6.19-1.2911.fc7 #1
Jan 13 18:05:31 agape kernel: ---------------------------------
Jan 13 18:05:31 agape kernel: inconsistent {softirq-on-W} -> {in-softirq-W} usage.
Jan 13 18:05:31 agape kernel: swapper/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
Jan 13 18:05:31 agape kernel:  (&k->k_lock){-+..}, at: [<ffffffff80457a32>]
klist_next+0x24/0x8a
Jan 13 18:05:31 agape kernel: {softirq-on-W} state was registered at:
Jan 13 18:05:31 agape kernel:   [<ffffffff802a52e2>] __lock_acquire+0x454/0xa50
Jan 13 18:05:31 agape kernel:   [<ffffffff8020b39b>] kfree+0x26c/0x27e
Jan 13 18:05:31 agape kernel:   [<ffffffff802a5bbb>] lock_acquire+0x4c/0x65
Jan 13 18:05:31 agape kernel:   [<ffffffff80457a32>] klist_next+0x24/0x8a
Jan 13 18:05:31 agape kernel:   [<ffffffff80263a05>] _spin_lock+0x25/0x31
Jan 13 18:05:31 agape kernel:   [<ffffffff80457a32>] klist_next+0x24/0x8a
Jan 13 18:05:31 agape kernel:   [<ffffffff803b0946>] __device_attach+0x0/0xb
Jan 13 18:05:31 agape kernel:   [<ffffffff803afba7>] bus_for_each_drv+0x52/0x7d
Jan 13 18:05:31 agape kernel:   [<ffffffff803b09d8>] device_attach+0x6a/0x7e
Jan 13 18:05:32 agape kernel:   [<ffffffff803afb2d>] bus_attach_device+0x24/0x4c
Jan 13 18:05:32 agape kernel:   [<ffffffff803aea4d>] device_add+0x356/0x528
Jan 13 18:05:32 agape kernel:   [<ffffffff8029f379>] init_waitqueue_head+0x1f/0x30
Jan 13 18:05:32 agape kernel:   [<ffffffff803aec38>] device_register+0x19/0x1e
Jan 13 18:05:32 agape kernel:   [<ffffffff8067d330>] acpi_scan_init+0xda/0x1a4
Jan 13 18:05:32 agape kernel:   [<ffffffff80269c43>] init+0x1e7/0x3ae
Jan 13 18:05:32 agape kernel:   [<ffffffff802a4a0b>] trace_hardirqs_on+0x11a/0x13e
Jan 13 18:05:32 agape kernel:   [<ffffffff8025cf38>] child_rip+0xa/0x12
Jan 13 18:05:32 agape kernel:   [<ffffffff80263cb0>] _spin_unlock_irq+0x2b/0x31
Jan 13 18:05:32 agape kernel:   [<ffffffff8025c6bc>] restore_args+0x0/0x30
Jan 13 18:05:32 agape kernel:   [<ffffffff80269a5c>] init+0x0/0x3ae
Jan 13 18:05:32 agape kernel:   [<ffffffff8025cf2e>] child_rip+0x0/0x12
Jan 13 18:05:32 agape kernel:   [<ffffffffffffffff>] 0xffffffffffffffff
Jan 13 18:05:32 agape kernel: irq event stamp: 2706171
Jan 13 18:05:32 agape kernel: hardirqs last  enabled at (2706170):
[<ffffffff80263d26>] _spin_unlock_irqrestore+0x3f/0x47
Jan 13 18:05:32 agape kernel: hardirqs last disabled at (2706171):
[<ffffffff80263d88>] _spin_lock_irqsave+0x13/0x3c
Jan 13 18:05:32 agape kernel: softirqs last  enabled at (2706158):
[<ffffffff80211c21>] __do_softirq+0xda/0xe3
Jan 13 18:05:32 agape kernel: softirqs last disabled at (2706163):
[<ffffffff8025d2ac>] call_softirq+0x1c/0x28
Jan 13 18:05:32 agape kernel: 
Jan 13 18:05:32 agape kernel: other info that might help us debug this:
Jan 13 18:05:32 agape kernel: 1 lock held by swapper/0:
Jan 13 18:05:32 agape kernel:  #0:  (&card->lock){.+..}, at:
[<ffffffff8818595f>] fw_core_handle_bus_reset+0x3b/0x548 [fw_core]
Jan 13 18:05:32 agape kernel: 
Jan 13 18:05:32 agape kernel: stack backtrace:
Jan 13 18:05:32 agape kernel: 
Jan 13 18:05:32 agape kernel: Call Trace:
Jan 13 18:05:32 agape kernel:  <IRQ>  [<ffffffff802a3ceb>]
print_usage_bug+0x162/0x173
Jan 13 18:05:32 agape kernel:  [<ffffffff802a4505>] mark_lock+0x139/0x411
Jan 13 18:05:32 agape kernel:  [<ffffffff802a5263>] __lock_acquire+0x3d5/0xa50
Jan 13 18:05:32 agape kernel:  [<ffffffff802569fb>] klist_children_put+0x0/0xf
Jan 13 18:05:32 agape kernel:  [<ffffffff802a5bbb>] lock_acquire+0x4c/0x65
Jan 13 18:05:32 agape kernel:  [<ffffffff80457a32>] klist_next+0x24/0x8a
Jan 13 18:05:32 agape kernel:  [<ffffffff80263a05>] _spin_lock+0x25/0x31
Jan 13 18:05:32 agape kernel:  [<ffffffff80457a32>] klist_next+0x24/0x8a
Jan 13 18:05:32 agape kernel:  [<ffffffff8818749e>] :fw_core:update_unit+0x0/0x2f
Jan 13 18:05:32 agape kernel:  [<ffffffff803ae05f>] next_device+0x9/0x19
Jan 13 18:05:32 agape kernel:  [<ffffffff803ae0dd>] device_for_each_child+0x36/0x59
Jan 13 18:05:32 agape kernel:  [<ffffffff8818726e>]
:fw_core:fw_node_event+0x110/0x17b
Jan 13 18:05:32 agape kernel:  [<ffffffff881858cb>]
:fw_core:fw_node_create+0x2a/0x83
Jan 13 18:05:32 agape kernel:  [<ffffffff88185cd9>]
:fw_core:fw_core_handle_bus_reset+0x3b5/0x548
Jan 13 18:05:32 agape kernel:  [<ffffffff88192baf>]
:fw_ohci:bus_reset_tasklet+0x1ae/0x1bb
Jan 13 18:05:32 agape kernel:  [<ffffffff8029162f>] tasklet_action+0x5e/0xb2
Jan 13 18:05:32 agape kernel:  [<ffffffff80211ba6>] __do_softirq+0x5f/0xe3
Jan 13 18:05:32 agape kernel:  [<ffffffff8025d2ac>] call_softirq+0x1c/0x28
Jan 13 18:05:32 agape kernel:  [<ffffffff8026bdfc>] do_softirq+0x39/0x9f
Jan 13 18:05:32 agape kernel:  [<ffffffff80291505>] irq_exit+0x4e/0x50
Jan 13 18:05:32 agape kernel:  [<ffffffff8026bfb0>] do_IRQ+0x14e/0x170
Jan 13 18:05:32 agape kernel:  [<ffffffff8026a699>] default_idle+0x0/0x51
Jan 13 18:05:32 agape kernel:  [<ffffffff8025c666>] ret_from_intr+0x0/0xf
Jan 13 18:05:32 agape kernel:  <EOI>  [<ffffffff80260f3e>]
__sched_text_start+0xafe/0xb36
Jan 13 18:05:32 agape kernel:  [<ffffffff8026a6ce>] default_idle+0x35/0x51
Jan 13 18:05:32 agape kernel:  [<ffffffff8026a6d0>] default_idle+0x37/0x51
Jan 13 18:05:32 agape kernel:  [<ffffffff8026a6ce>] default_idle+0x35/0x51
Jan 13 18:05:33 agape kernel:  [<ffffffff8024833c>] cpu_idle+0xa1/0xc4
Jan 13 18:05:33 agape kernel:  [<ffffffff80269a3a>] rest_init+0x2b/0x2d
Jan 13 18:05:33 agape kernel:  [<ffffffff80663811>] start_kernel+0x260/0x26f
Jan 13 18:05:33 agape kernel:  [<ffffffff80663158>] _sinittext+0x158/0x160
Jan 13 18:05:33 agape kernel: 
Jan 13 18:05:33 agape kernel: fw_core: created new fw device fw0 (5 config rom
retries)
Jan 13 18:05:33 agape kernel: fw_sbp2: logged in to sbp2 unit fw0.0
Jan 13 18:05:33 agape kernel: fw_sbp2:  - management_agent_address: 0xfffff0010000
Jan 13 18:05:33 agape kernel: fw_sbp2:  - command_block_agent_address:
0xffc1fffff0010020
Jan 13 18:05:33 agape kernel: fw_sbp2:  - status write address: 0x000100000000
Jan 13 18:05:33 agape kernel: fw_sbp2: agent reset write rcode=0
Jan 13 18:05:33 agape kernel: scsi4 : SBP-2 IEEE-1394
Jan 13 18:05:33 agape kernel: scsi 4:0:0:0: Direct-Access-RBC ST330083 1A      
             PQ: 0 ANSI: 4
Jan 13 18:05:33 agape kernel: SCSI device sdb: 586072368 512-byte hdwr sectors
(300069 MB)
Jan 13 18:05:33 agape kernel: sdb: Write Protect is off
Jan 13 18:05:33 agape kernel: SCSI device sdb: write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
Jan 13 18:05:33 agape kernel: SCSI device sdb: 586072368 512-byte hdwr sectors
(300069 MB)
Jan 13 18:05:33 agape kernel: sdb: Write Protect is off
Jan 13 18:05:33 agape kernel: SCSI device sdb: write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
Jan 13 18:05:33 agape kernel:  sdb: sdb1 sdb2
Jan 13 18:05:33 agape kernel: sd 4:0:0:0: Attached scsi disk sdb
<<


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Bernie Innocenti 2007-01-26 06:17:07 UTC
Confirmed on 2.6.19-1.2913.fc7, x86_64.


Comment 2 Michel Alexandre Salim 2007-02-09 03:45:48 UTC
Created attachment 147724 [details]
Firewire dmesg output from before today's HAL update

Comment 3 Michel Alexandre Salim 2007-02-09 03:46:55 UTC
Confirmed on 2.6.19-1.2919.fc7 and -1.2922.fc7, x86_64. The problem is getting
worse - before today's HAL update it would lock the machine up when udev is
started, but plugging the disk after the boot process is complete would spew the
lock warnings but then things work after I unloaded and reloaded fw_sbp2.

Right now, once I plug the FW hard drive in (Western Digital MyBook), on the
current terminal I don't get the shell prompt back. I could still switch to a
different terminal but after entering my login informations it, again, does not
give me a login prompt, so it seems the kernel cannot spawn new processes at
that point.


Comment 4 Kristian Høgsberg 2007-04-04 15:43:26 UTC
This should be fixed in the most recent kernels, please give that a try.

Comment 5 Deji Akingunola 2007-04-14 18:10:16 UTC
Finally got the time to test this, and happy to report there are no  more lock
messages in the log. However, though the kernel seems to activate the drive, the
partition on it was not mounted properly. The messages in the log is reproduced
below;

Apr 14 13:45:50 agape kernel: usb 2-5: USB disconnect, address 6
Apr 14 13:46:36 agape kernel: fw_core: phy config: card 0, new root=ffc1,
gap_count=5
Apr 14 13:46:45 agape kernel: fw_core: phy config: card 0, new root=ffc1,
gap_count=5
Apr 14 13:46:46 agape kernel: fw_core: created new fw device fw1 (3 config rom
retries)
Apr 14 13:46:46 agape kernel: fw_sbp2: logged in to sbp2 unit fw1.0 (0 retries)
Apr 14 13:46:46 agape kernel: fw_sbp2:  - management_agent_address:   
0xfffff0010000
Apr 14 13:46:46 agape kernel: fw_sbp2:  - command_block_agent_address:
0xfffff0010020
Apr 14 13:46:46 agape kernel: fw_sbp2:  - status write address:       
0x000100000000
Apr 14 13:46:46 agape kernel: scsi6 : SBP-2 IEEE-1394
Apr 14 13:46:46 agape kernel: scsi 6:0:0:0: Direct-Access-RBC Maxtor 6 L300R0  
             PQ: 0 ANSI: 4
Apr 14 13:46:46 agape kernel: SCSI device sdc: 586114704 512-byte hdwr sectors
(300091 MB)
Apr 14 13:46:46 agape kernel: sdc: Write Protect is off
Apr 14 13:46:46 agape kernel: SCSI device sdc: write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
Apr 14 13:46:46 agape kernel: SCSI device sdc: 586114704 512-byte hdwr sectors
(300091 MB)
Apr 14 13:46:46 agape kernel: sdc: Write Protect is off
Apr 14 13:46:46 agape kernel: SCSI device sdc: write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
Apr 14 13:46:46 agape kernel:  sdc: sdc1
Apr 14 13:46:46 agape kernel: sd 6:0:0:0: Attached scsi disk sdc
Apr 14 13:46:46 agape kernel: sd 6:0:0:0: Attached scsi geneApr 14 13:46:51
agape kernel: fw_sbp2: sbp2_scsi_abort
Apr 14 13:46:56 agape kernel: fw_sbp2: sbp2_scsi_abort
Apr 14 13:47:22 agape gconfd (root-23627): starting (version 2.18.0.1), pid
23627 user 'root'
Apr 14 13:47:22 agape gconfd (root-23627): Resolved address
"xml:readonly:/etc/gconf/gconf.xml.mandatory" to a read-only configuration
source at position 0
Apr 14 13:47:22 agape gconfd (root-23627): Resolved address
"xml:readwrite:/root/.gconf" to a writable configuration source at position 1
Apr 14 13:47:22 agape gconfd (root-23627): Resolved address
"xml:readonly:/etc/gconf/gconf.xml.defaults" to a read-only configuration source
at position 2
Apr 14 13:50:10 agape kernel: hfs: unable to find HFS+ superblock
Apr 14 13:51:34 agape kernel: fw_sbp2: management write failed, rcode 0x13
Apr 14 13:51:34 agape kernel: Synchronizing SCSI cache for disk sdc:
Apr 14 13:51:34 agape kernel: FAILED
Apr 14 13:51:34 agape kernel:   status = 0, message = 00, host = 2, driver = 00
Apr 14 13:51:34 agape kernel:   <5>fw_sbp2: removed sbp2 unit fw1.0
ric sg3 type 14

I got the 'hfs' message when i tried to check the drive with gparted. The
partition on it was created as ext3.


Comment 6 Kristian Høgsberg 2007-04-14 22:21:52 UTC
Thanks for testing, good to hear the lock warning went away.

I see a sbp2_scsi_abort() in the log - this probably shouldn't trigger in the
first place, but on top of this, in earlier kernel we didn't properly reset the
device in this case.  What is the exact kernel version you're testing with?  

If you haven't tried at least version 2.6.20-1.3064.fc7 of the kernel could you
give that a try?

Thanks.

Comment 7 Deji Akingunola 2007-04-15 02:04:35 UTC
Sorry I forgot to mention I'm running rawhide fully updated to this morning's
state, i.e. kernel-2.6.20-1.3066.fc7 (i've rebooted into this kernel)

Comment 8 Deji Akingunola 2007-06-13 05:37:14 UTC
My drive now mounts its partition using the new driver for the first time, this
is with kernel-2.6.21-1.3218.fc8.
Thanks.


Note You need to log in before you can comment on or make changes to this bug.