Bug 140328
| Summary: | Kernel oops (non-terminal) on removal of "blackhole" or "unreachable" type routes | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Timothy Hinchcliffe <tim> | ||||
| Component: | kernel | Assignee: | Dave Jones <davej> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Brian Brock <bbrock> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 3 | CC: | pfrields, wtogami | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | i386 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | 1.681_FC3 | Doc Type: | Bug Fix | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2004-11-23 15:22:57 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
Description
Timothy Hinchcliffe
2004-11-22 12:13:06 UTC
Created attachment 107176
Copy of the output of demonstration session.
Here I go through the steps that cause the bug, and record the kernel oops as
logged to syslog.
This is identical to what I see on the console in run level 1.
Line numbers are from the source rpm of kernel 2.6.9-678_FC3.
I believe that the oops is occurring at line 526 of include/linux/list.h: "*pprev = next;" because pprev is null. This was inlined at line 166 of net/ipv4/fib_semantics.c: "hlist_del(&nh->nh_hash);", which is releasing the next hop hash lists.
I am guessing that a blackhole route somehow manages to inject an incomplete hash entry into the nexthops list with pprev set to null. Waiting for a kernel to compile on a very slow machine to confirm this through printk...
I think the cause of the problem is lines 742-743 of
net/ipv4/fib_semantics.c in fib_create_info():
>if (!nh->nh_dev)
> continue;
Basically, if there is no nh_dev part of the next_hop structure, then the
nh_hash is never initialised, so it still has pprev set to null.
If this is a valid scenario, hlist_del() needs to check nh_dev and only
run __hlist_del() if it is non-null. Otherwise the continue should
become some sort of error, and the cause of the invalid nh_dev should be
tracked down. Alternatively, the nh_hash needs to be initialized into a
no-device type chain.
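The failure mode described above can be reproduced in userspace. What follows is only a minimal sketch, not the kernel code and not the actual fix: the hlist types are cut-down stand-ins for the ones in include/linux/list.h, and struct nexthop, hlist_del_unchecked(), hlist_del_checked() and release_nexthops() are hypothetical names made up for illustration. It shows the first option (skip the unlink when the node was never hashed), keyed on pprev rather than nh_dev, which is the same test the kernel's hlist_unhashed() helper performs.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's hlist types (include/linux/list.h). */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* Hypothetical, cut-down next hop: only the fields named in the report. */
struct nexthop {
    void *nh_dev;                 /* NULL for blackhole/unreachable routes */
    struct hlist_node nh_hash;    /* stays zeroed if never hashed */
};

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    struct hlist_node *first = h->first;

    n->next = first;
    if (first)
        first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* A node that was never added to a chain has pprev == NULL. */
static int hlist_unhashed(const struct hlist_node *n)
{
    return !n->pprev;
}

/* Unconditional unlink. "*pprev = next" is the statement that oopses in the
 * kernel when pprev is NULL; here an assert stands in for the crash. */
static void hlist_del_unchecked(struct hlist_node *n)
{
    struct hlist_node *next = n->next;
    struct hlist_node **pprev = n->pprev;

    assert(pprev != NULL);
    *pprev = next;
    if (next)
        next->pprev = pprev;
}

/* Guarded unlink: returns 1 if the node was removed, 0 if it was never hashed. */
static int hlist_del_checked(struct hlist_node *n)
{
    if (hlist_unhashed(n))
        return 0;
    hlist_del_unchecked(n);
    n->next = NULL;
    n->pprev = NULL;
    return 1;
}

/* Mirrors the reported pattern: only next hops with a device are hashed,
 * but every next hop is visited on release. Returns the number unlinked. */
int release_nexthops(struct nexthop *nh, size_t count, struct hlist_head *head)
{
    int removed = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (!nh[i].nh_dev)
            continue;             /* nh_hash is left with pprev == NULL */
        hlist_add_head(&nh[i].nh_hash, head);
    }
    for (i = 0; i < count; i++)
        removed += hlist_del_checked(&nh[i].nh_hash);
    return removed;
}
```

Replacing hlist_del_checked() with hlist_del_unchecked() in the release loop would trip the assert on any entry whose nh_dev is NULL, which corresponds to the null pprev dereference seen in the oops.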
I *think* this affects the stock 2.6.9 kernel as well, but I am unable to
verify that.
Fixed in kernel-2.6.9-1.681_FC3! I had just figured out the patch as well!