Red Hat Bugzilla – Bug 212122
2.6.9-42.19.ELsmp kernel deadlock when IPv6 address is configured on an unplugged interface
Last modified: 2007-11-30 17:07:27 EST
Description of problem:
When using the 2.6.9-42.19.ELsmp kernel from
http://people.redhat.com/~jbaron/rhel4/SRPMS.kernel/, the kernel deadlocks when
an IPv6 address is configured on an interface that has its cable unplugged.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
[root@jalmari ~]# ip link set eth1 down
[root@jalmari ~]# ip link set eth1 up
[root@jalmari ~]# mii-tool eth1
eth1: no link
[root@jalmari ~]# ip address add 2000::11/64 dev eth1
SysRq : Show Regs
Pid: 3837, comm: ip
EIP: 0060:[<c02d39c5>] CPU: 0
EIP is at _spin_lock_bh+0x3c/0x42
EFLAGS: 00000286 Not tainted (2.6.9-42.19.ELsmp)
EAX: cb552000 EBX: cc82f708 ECX: 9b914cf4 EDX: 0008e365
ESI: cc82f6e0 EDI: cc82f708 EBP: cfe80800 DS: 007b ES: 007b
CR0: 8005003b CR2: 09223004 CR3: 0fd17560 CR4: 000006f0
[<d0acb1e6>] addrconf_dad_stop+0x17/0x90 [ipv6]
[<d0accd6c>] addrconf_dad_start+0x84/0x90 [ipv6]
[<d0acc0e3>] inet6_addr_add+0xa6/0xc0 [ipv6]
[<d0acd3a1>] inet6_rtm_newaddr+0x0/0x5b [ipv6]
If you still have the system available, could you please provide a sysrq-t from
when the system is hung? It would be helpful to know which process is
holding the semaphore that the above backtrace is blocked on. Thanks!
Sorry, I didn't even look at the code; I assumed that another process was
holding the lock. For the record, though, there is no patch posted to this
bug. I'll fix it shortly.
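The backtrace in the description ends in _spin_lock_bh, called from addrconf_dad_stop, which was itself called from addrconf_dad_start. One reading consistent with the comment above is that the caller already holds the lock that the callee then tries to take. The C sketch below models that pattern in user space. It is not kernel code and not the attached patch: toy_lock, model_dad_stop, model_dad_start_buggy and model_dad_start_fixed are hypothetical names, and the "fixed" variant is only one possible shape of a fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: a single "held" flag, no owner tracking. */
struct toy_lock { bool held; };

/* Returns true if the lock was taken, false where a real spinlock would spin. */
bool toy_trylock(struct toy_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

void toy_unlock(struct toy_lock *l)
{
    assert(l->held);    /* unlocking a free lock would itself be a bug */
    l->held = false;
}

/* Hypothetical stand-in for the inner function: it takes the lock itself. */
bool model_dad_stop(struct toy_lock *l)
{
    if (!toy_trylock(l))
        return false;   /* a real spin_lock_bh() would spin forever here */
    toy_unlock(l);
    return true;
}

/* Assumed buggy pattern: call the inner function with the lock still held. */
bool model_dad_start_buggy(struct toy_lock *l)
{
    bool ok;

    toy_trylock(l);
    ok = model_dad_stop(l);     /* cannot succeed: we hold the lock */
    toy_unlock(l);
    return ok;
}

/* One possible fix shape: drop the lock before calling the inner function. */
bool model_dad_start_fixed(struct toy_lock *l)
{
    toy_trylock(l);
    toy_unlock(l);
    return model_dad_stop(l);
}
```

Calling model_dad_start_buggy on a free lock returns false, which marks the point where a real spinlock would spin forever on the same CPU. model_dad_start_fixed returns true and leaves the lock free.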
Created attachment 139597 [details]
patch to fix addrconf deadlock
committed in stream U5 build 42.23. A test kernel with this patch is available
Although based on comment #7, perhaps we need to revisit this further....
No further movement is needed, really. The patch has been committed. Nokia's
observation regarding the double unlock will cause a minor gripe from the lock
validator, but no real problems. I'm going to clean that up shortly, but as far
as this bug is concerned, the fix is in place.
I think the double unlock, if it's there, is a real bug.
If we do the first unlock, another cpu grabs the lock, then we do that
second bogus unlock, this allows a third cpu into the critical section
erroneously which will corrupt data.
It's a bug, and it can very well cause corruption, so we should fix it.
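The interleaving described above can be replayed deterministically in user space. The sketch below is a toy model, not kernel code: toy_spinlock and replay_double_unlock are hypothetical names, and the three "CPUs" are simply sequential steps in one function. It assumes, as with a plain spinlock, that the lock records no owner, so an unlock cannot tell whether the caller actually holds it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy spinlock: a single flag with no owner tracking. */
struct toy_spinlock { bool held; };

/* Returns true if the lock was taken, false where a real spinlock would spin. */
bool toy_spin_trylock(struct toy_spinlock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

/* Blindly releases the lock, whoever calls it. */
void toy_spin_unlock(struct toy_spinlock *l)
{
    l->held = false;
}

/*
 * Replays the sequence from the comment above and returns how many CPUs
 * are inside the critical section at the end. Pass true to include the
 * bogus second unlock, false to leave it out.
 */
int replay_double_unlock(bool second_unlock)
{
    struct toy_spinlock lock = { false };
    int in_critical = 0;

    /* CPU A takes the lock and enters the critical section. */
    if (toy_spin_trylock(&lock))
        in_critical++;

    /* CPU A leaves and performs the first, legitimate unlock. */
    in_critical--;
    toy_spin_unlock(&lock);

    /* CPU B grabs the lock and enters. */
    if (toy_spin_trylock(&lock))
        in_critical++;
    assert(in_critical == 1);

    /* CPU A performs the second, bogus unlock while B is still inside. */
    if (second_unlock)
        toy_spin_unlock(&lock);

    /* CPU C tries to enter; it gets in only if the lock looks free. */
    if (toy_spin_trylock(&lock))
        in_critical++;

    return in_critical;
}
```

With the bogus unlock the function returns 2, meaning two CPUs are in the critical section at once. Without it the function returns 1.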
Yeah, after considering it further, I agree. I'll look at it and post a repo
patch if the double unlock exists.
Created attachment 142170 [details]
patch to fix double unlock
I've submitted this patch to fix the double unlock condition
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release. Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update
ok, i've integrated the patch from comment #17. A test kernel with this patch is
available from http://people.redhat.com/~jbaron/rhel4/
This bugzilla has Keywords: Regression.
Since no regressions are allowed between releases,
it is also being proposed as a blocker for this release.
Please resolve ASAP.
QE ack for RHEL4.5.
Both patches are in the -51 kernel and I set an ipv6 address with the network
down with no hang.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.