Al Viro talked to me earlier this week about a sysctl flaw. Below is an edited extract from his messages. I'm going to assign this "moderate" because it requires an interface to go away and most likely would lead to DoS. Not sent upstream. We have an exploitable hole in sysctl unregistration. There is no exclusion whatsoever between procfs or sysctl(2) access to ctl_table instances, methods *and* data and unregister_sysctl_table(). IOW, there's nothing to stop inetdev_destroy() in the middle of access to /proc/sys/net/ipv4/conf/<interface>/<whatever> or in the middle of corresponding sysctl. In the best case that's an oopsable hole. In the worst... You can open such file, wait for interface to go away, try to grab as much memory as possible in hope to hit the (kfreed) ctl_table. Then fill it with pointers to your function. Then do read from file you've opened and if you are lucky, you'll get it called as ->proc_handler() in kernel mode. Realistically, I see only two ways to deal with that: have unregister_sysctl_table() block until accesses in progress are finished (after making sure there will be no new ones - we can do that, at least) or have it return -EBUSY and let callers retry in whatever ways they like. IMO the latter is only going to cause more bugs. Note that unregister_sysctl_table() _already_ is process-synchronous since it removes procfs entries. Moreover, on a preempt kernel it _already_ blocks. What I would like to do is the same kind of extremely unfair rwsem we are using for net_device removal; in case of sysctl we _do_ have dynamic-only structure to work with - table headers. So it's nowhere near the horror level of net_device sweep; we can get away with all changes local to kernel/sysctl.c and the data structure private to it. The real question is whether anything with dynamic sysctls wants to do them under a spinlock. Note that we'll only block in the cases that right now would lead to access to kfreed objects, so we don't get any regressions. Opened procfs files will *not* block that sucker - only access itself would (I'm going to stick pointer to table header into ->private_data, have it pin down the header and have it check the "going away" flag before access). Fortunately, very few places really want unregister_sysctl_table() on dynamic data structures. Most of them are in net/*. We do have one done under a spinlock in arch/s390/appldata/appldata_base.c; that sucker tries to protect itself, but still doesn't get it right (moreover, it can't be fixed that way - the fundamental problem is that driver doesn't know that ctl_table is about to get used and it's freeing the table itself that gets you). We also have a lot of sloppy parport code (another source of dynamic sysctls) third-party code might be affected - trading triggerable oops (if not worse) for potential deadlock in narrow subset of previously vulnerable scenarios.
Created attachment 119069 [details] First draft proposed patch
*** Bug 170243 has been marked as a duplicate of this bug. ***
public, removing embargo
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0101.html