Description of problem:
When using the 2.6.9-42.19.ELsmp kernel from http://people.redhat.com/~jbaron/rhel4/SRPMS.kernel/ the kernel hangs in a deadlock when an IPv6 address is configured on an interface that has its cable unplugged.

Version-Release number of selected component (if applicable):
2.6.9-42.19.ELsmp

How reproducible:

Steps to Reproduce:
[root@jalmari ~]# ip link set eth1 down
[root@jalmari ~]# ip link set eth1 up
[root@jalmari ~]# mii-tool eth1
eth1: no link
[root@jalmari ~]# ip address add 2000::11/64 dev eth1

Actual results:
[halt sent]
SysRq : Show Regs
Pid: 3837, comm: ip
EIP: 0060:[<c02d39c5>] CPU: 0
EIP is at _spin_lock_bh+0x3c/0x42
EFLAGS: 00000286 Not tainted (2.6.9-42.19.ELsmp)
EAX: cb552000 EBX: cc82f708 ECX: 9b914cf4 EDX: 0008e365
ESI: cc82f6e0 EDI: cc82f708 EBP: cfe80800 DS: 007b ES: 007b
CR0: 8005003b CR2: 09223004 CR3: 0fd17560 CR4: 000006f0
[<d0acb1e6>] addrconf_dad_stop+0x17/0x90 [ipv6]
[<d0accd6c>] addrconf_dad_start+0x84/0x90 [ipv6]
[<d0acc0e3>] inet6_addr_add+0xa6/0xc0 [ipv6]
[<d0acd3a1>] inet6_rtm_newaddr+0x0/0x5b [ipv6]
[<c02880e3>] rtnetlink_rcv+0x226/0x327
[<c0292b56>] netlink_data_ready+0x14/0x44
[<c0292263>] netlink_sendskb+0x52/0x6c
[<c0292971>] netlink_sendmsg+0x271/0x280
[<c027823d>] sock_sendmsg+0xdb/0xf7
[<c0120519>] autoremove_wake_function+0x0/0x2d
[<c027d3c2>] verify_iovec+0x76/0xc2
[<c0279988>] sys_sendmsg+0x1ee/0x23b
[<c014e82e>] handle_mm_fault+0xdc/0x193
[<c014f55e>] vma_link+0x44/0xbc
[<c0150eca>] do_brk+0x1f0/0x22a
[<c0279d8f>] sys_socketcall+0x1df/0x1fb
[<c02d4cf7>] syscall_call+0x7/0xb

Expected results:

Additional info:
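The backtrace above shows addrconf_dad_start calling addrconf_dad_stop, which then spins in _spin_lock_bh. That pattern is consistent with a call path trying to take a non-recursive lock it already holds. The following is only a user-space toy model of that pattern, not the kernel code: toy_lock, toy_stop, toy_start_buggy and toy_start_fixed are hypothetical names, and the lock is tried once rather than spun on so that the demo returns instead of hanging. Dropping the lock before calling the helper is one generic way out and is not necessarily what the attached patch does.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy non-recursive lock (hypothetical; NOT the kernel's spinlock_t). */
typedef struct { atomic_bool held; } toy_lock;

/* Try the lock once. A real spin lock would loop here until it succeeds. */
static bool toy_trylock(toy_lock *l)
{
    return !atomic_exchange(&l->held, true);
}

static void toy_unlock(toy_lock *l)
{
    atomic_store(&l->held, false);
}

/* Hypothetical helper that takes the lock itself. */
static bool toy_stop(toy_lock *l)
{
    if (!toy_trylock(l))
        return false;      /* with a spinning lock this would never return */
    toy_unlock(l);
    return true;
}

/* Buggy caller: invokes the helper while still holding the same lock. */
static bool toy_start_buggy(toy_lock *l)
{
    bool ok;

    if (!toy_trylock(l))
        return false;
    ok = toy_stop(l);      /* helper cannot get the lock: self-deadlock */
    toy_unlock(l);
    return ok;
}

/* Caller that releases the lock before invoking the helper. */
static bool toy_start_fixed(toy_lock *l)
{
    if (!toy_trylock(l))
        return false;
    toy_unlock(l);
    return toy_stop(l);
}
```

With a lock initialised as `toy_lock l = { false };`, toy_start_buggy(&l) reports failure (the point where a spinning lock would hang) and toy_start_fixed(&l) succeeds.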
If you still have the system available, could you please provide a sysrq-t from when the system is hung? It would be helpful to know which process is holding the semaphore that the above backtrace is blocked on. Thanks!
Sorry, I didn't even look at the code, I assumed that another process was holding the lock, although, for the record, there is no patch posted to this bug. I'll fix it shortly though.
Created attachment 139597 [details] patch to fix addrconf deadlock
Committed in stream U5 build 42.23. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/ Based on comment #7, though, perhaps we need to revisit this further.
No further movement is needed, really. The patch has been committed. Nokia's observation regarding the double unlock will cause a minor gripe from the lock validator, but no real problems. I'm going to clean that up shortly, but as far as this bug is concerned, the fix is in place.
I think the double unlock, if it's there, is a real bug. If we do the first unlock and another CPU grabs the lock, and we then do the second, bogus unlock, a third CPU can erroneously enter the critical section and corrupt data. It's a bug, and it can very well cause corruption, so we should fix it.
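The interleaving described in this comment can be replayed deterministically in user space. The sketch below is a toy model under stated assumptions: flag_lock is a hypothetical one-bit lock, not the kernel's spinlock implementation, and the three "CPUs" are simply steps executed in order by one thread. replay() returns how many holders are inside the critical section at the end.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical one-bit lock used only for this illustration. */
typedef struct { atomic_bool held; } flag_lock;

static bool flag_trylock(flag_lock *l)
{
    return !atomic_exchange(&l->held, true);
}

static void flag_unlock(flag_lock *l)
{
    atomic_store(&l->held, false);
}

/*
 * Replay the sequence from the comment and return the number of
 * "CPUs" that believe they hold the lock at the end.
 */
static int replay(bool double_unlock)
{
    flag_lock l = { false };
    int inside = 0;

    /* CPU A: takes the lock, does its work, releases it (legitimate). */
    if (flag_trylock(&l))
        inside++;
    flag_unlock(&l);
    inside--;

    /* CPU B: grabs the now-free lock and stays in the critical section. */
    if (flag_trylock(&l))
        inside++;

    /* CPU A: second, bogus unlock of a lock it no longer owns. */
    if (double_unlock)
        flag_unlock(&l);

    /* CPU C: tries to enter while B is still inside. */
    if (flag_trylock(&l))
        inside++;

    return inside;
}
```

In this model replay(false) leaves one holder inside, while replay(true) leaves two, which is the loss of mutual exclusion the comment warns about.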
Yeah, after considering it further, I agree. I'll look at it and post a repo patch if the double unlock exists.
Created attachment 142170 [details] patch to fix double unlock
I've submitted this patch to fix the double unlock condition.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
OK, I've integrated the patch from comment #17. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/
This bugzilla has Keywords: Regression. Since no regressions are allowed between releases, it is also being proposed as a blocker for this release. Please resolve ASAP.
QE ack for RHEL4.5.
Both patches are in the -51 kernel. I set an IPv6 address with the network down and saw no hang.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0304.html