848017 – cgdcbxd: does not prevent netlink spoofing

Bug 848017 - cgdcbxd: does not prevent netlink spoofing

Summary: cgdcbxd: does not prevent netlink spoofing

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	cgdcbxd
Sub Component:
Version:	19
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Neil Horman
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:	851148
Blocks:	848015
TreeView+	depends on / blocked

Reported:	2012-08-14 10:34 UTC by Florian Weimer
Modified:	2015-02-20 11:49 UTC (History)
CC List:	3 users (show)
Fixed In Version:
Clone Of:
Clones:	851148 (view as bug list)
Environment:
Last Closed:	2013-07-12 13:52:09 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
[PATCH] cgdcbxd: add filtering support for netlink sockets (2.73 KB, patch) 2013-01-17 20:29 UTC, Neil Horman	no flags	Details \| Diff
View All

Description Florian Weimer 2012-08-14 10:34:12 UTC

cgdcbxd does not check if the DCB packets have PID/port ID 0 and originated in the kernel.  As a result, unprivileged local users can send messages to cgdcbxd, causing it to alter the cgroup configuration.

In addition, mnl_attr_get_str is used as if the resulting string is NUL-terminated.  This isn't always the case (at least when processing crafted messages).

Comment 1 Florian Weimer 2012-08-14 11:39:03 UTC

(In reply to comment #0)
> In addition, mnl_attr_get_str is used as if the resulting string is
> NUL-terminated.  This isn't always the case (at least when processing
> crafted messages).

Based on a discussion with the libmnl folks, mnl_attr_validate(attr, MNL_TYPE_NUL_STRING) should be used instead of mnl_attr_validate(attr, MNL_TYPE_STRING).  This is compatible with what the kernel sends.

Comment 2 Neil Horman 2012-08-14 15:29:44 UTC

I'm not sure thats even necessecary.  Every received message from the kernel is handled in cgdcbxd by mnl_cb_run->__mnl_cb_run.  Internal to that function, we validate the pid and port id in mnl_nlmsg_portid_ok, which validates:

1) that nlmsg_pid is 0
2) That portid is either 0 or matches the expected port id

So it seems, unless I'm missing something that all of your concerns are addressed there

Comment 3 Florian Weimer 2012-08-14 15:39:57 UTC

(In reply to comment #2)
> I'm not sure thats even necessecary.  Every received message from the kernel
> is handled in cgdcbxd by mnl_cb_run->__mnl_cb_run.  Internal to that
> function, we validate the pid and port id in mnl_nlmsg_portid_ok, which
> validates:
> 
> 1) that nlmsg_pid is 0
> 2) That portid is either 0 or matches the expected port id
> 
> So it seems, unless I'm missing something that all of your concerns are
> addressed there

Here's an excerpt from a GDB session when processing a packet which was generated by the nobody user:

Breakpoint 2, mnl_nlmsg_portid_ok (nlh=nlh@entry=0x7fffffffcec0, portid=portid@entry=0) at nlmsg.c:238
238		return nlh->nlmsg_pid && portid ? nlh->nlmsg_pid == portid : true;
(gdb) print nlh->nlmsg_pid
$1 = 5971
(gdb) finish
Run till exit from #0  mnl_nlmsg_portid_ok (nlh=nlh@entry=0x7fffffffcec0, portid=portid@entry=0) at nlmsg.c:238
0x00000037f1a01fff in __mnl_cb_run (cb_ctl_array_len=0, cb_ctl_array=0x0, data=0x0, 
    cb_data=0x403182 <cgdcbx_dcb_cb>, portid=0, seq=0, numbytes=<optimized out>, buf=<optimized out>)
    at callback.c:58
58			if (!mnl_nlmsg_portid_ok(nlh, portid)) {
Value returned is $2 = true
(gdb) 

This is actually the documented behavior, "We skip the tracking for netlink message whose portID is zero since it is reserved for event-based kernel notifications.".  I think this function is not suitable for checking that the packet originated in the kernel, and you have to do this check on your own, prior to calling mnl_cb_run.

Perhaps we should ask upstream if this is a bug in libmnl, just to be sure.

Comment 4 Neil Horman 2012-08-14 16:28:44 UTC

agreed, it seems that a non-zero pid and a valid port id should result in a failed check, as that would still allow kernel notifications to go through (as they would have a 0 pid), but catch users.

CCing hushan jia, the libmnl maintainer.  What are your thoughts on the issue?

Comment 5 Florian Weimer 2012-08-15 07:44:31 UTC

(In reply to comment #4)
> agreed, it seems that a non-zero pid and a valid port id should result in a
> failed check, as that would still allow kernel notifications to go through
> (as they would have a 0 pid), but catch users.
> 
> CCing hushan jia, the libmnl maintainer.  What are your thoughts on the
> issue?

Please note that this issue is still under embargo.

I've asked Pablo Neira Ayuso (libmnl upstream) for his opinion as well.

Comment 6 Florian Weimer 2012-08-15 12:59:52 UTC

(In reply to comment #5)

> I've asked Pablo Neira Ayuso (libmnl upstream) for his opinion as well.

He agrees that this is a bug in libmnl.  We're working on a fix.

Comment 7 Neil Horman 2012-08-15 13:16:31 UTC

copy that, reassigning the component on this bz to libmnl.  Thanks!

Comment 8 Hushan Jia 2012-08-16 01:57:06 UTC

(In reply to comment #6)
> (In reply to comment #5)
> 
> > I've asked Pablo Neira Ayuso (libmnl upstream) for his opinion as well.
> 
> He agrees that this is a bug in libmnl.  We're working on a fix.
Can you copy me on the email thread? thanks.

Comment 9 Florian Weimer 2012-08-23 09:37:29 UTC

(In reply to comment #8)
> Can you copy me on the email thread? thanks.

The discussion resulted in this patch proposal for the kernel:

http://marc.info/?l=linux-netdev&m=134522422125983

This will fix the issue for cgdcbxd only very indirectly, if it is ever accepted (it might break existing userspace software).

I think the current libmnl API does not really provide a way to check message origin.  The source address of a packet is discarded in mnl_socket_recvfrom, and that would be the most natural place to check.  Looking at the port ID in the nlmsghdr header (which is what mnl_nlmsg_portid_ok does) is not actually helpful.

Comment 10 Florian Weimer 2012-08-23 11:10:01 UTC

Changing the component back to cgdcbxd because a code change in cgdcbxd will be needed (no transparent fix in libmnl is planned).

Comment 11 Neil Horman 2012-08-23 11:34:12 UTC

None of what you said in comment 9 really makes sense to me at all.  If the check needs to occur in mnl_socket_recvfrom, that seems fine to me.  Every example usage I can find online of using mnl indicates that mnl_cb_run can be called immediately after mnl_socket_recvfrom.  Theres an expectation that the library is validating packet, which is does for the most part.  Whats wrong with validating these packets in mnl_socket_recvfrom, and dropping them if they fail there?

Comment 12 Florian Weimer 2012-08-23 11:39:04 UTC

(In reply to comment #11)
> None of what you said in comment 9 really makes sense to me at all.  If the
> check needs to occur in mnl_socket_recvfrom, that seems fine to me.  Every
> example usage I can find online of using mnl indicates that mnl_cb_run can
> be called immediately after mnl_socket_recvfrom.  Theres an expectation that
> the library is validating packet, which is does for the most part.  Whats
> wrong with validating these packets in mnl_socket_recvfrom, and dropping
> them if they fail there?

First part: If the kernel is changed spoofing will go away completely (because only root can spoof, and that's acceptable).

libmnl is intended to be used for applications which process messages from both the kernel and other userspace processes.  Therefore, non-kernel messages cannot be dropped unconditionally in mnl_socket_recvfrom.  Upstream made it clear that applications which need filtering would have to update their code.

Upstream plans to release a pach for libmnl soon, let's discuss this further once it's available.

Comment 13 Neil Horman 2012-08-23 13:26:04 UTC

Well, if the kernel bits are accepted, then there was no need to give this back to me.

And I understand that we can't unconditionally filter in libmnl, but if we're going to filter at all, perhaps we should allow for an api code point to enable conditional filtering?  Unless there are a huge number of conditions to check for it seems simple enough to add an api function mnl_filter_non_kernel_messages(bool)

to allow an application to filter non kernel messages on streams that only expect to receive information from the kernel.  That seems to beat the heck out of every application having to parse a netlink header to determine if its from a valid source or not.

Can you provide a link to your upstream conversation on the subject?

Comment 14 Neil Horman 2012-08-23 14:01:04 UTC

scratch my last comment.  Florian and I discussed this on irc and libmnl will add something to enable option filtering to support this, and when thats done, I'll turn it on in cgdcbxd

Comment 15 Neil Horman 2013-01-16 21:25:44 UTC

Florian, is there any need to keep this open?  I understand its still an issue, but the libmnl maintainer hasn't really made any movement with the bug that this depends on.  Are there plans to fix this, or shall we just close this as a WONTFIX?

Comment 16 Florian Weimer 2013-01-17 06:17:54 UTC

(In reply to comment #15)
> Florian, is there any need to keep this open?  I understand its still an
> issue, but the libmnl maintainer hasn't really made any movement with the
> bug that this depends on.  Are there plans to fix this, or shall we just
> close this as a WONTFIX?

Not really, the bug is still there in cgdcbxd.  If the libmnl API is not extended, we need to fix it in another way.  Not all the kernels we ship include generic Netlink spoofing filters (see bug 851968).

Comment 17 Neil Horman 2013-01-17 20:29:14 UTC

Created attachment 680487 [details]
[PATCH] cgdcbxd: add filtering support for netlink sockets


To avoid allowing cgdcbxd to be spoofed into accepting netlink messages from
user processes that have misconstructed headers indicating they're from the
kernel, switch to using the new libmnl filtering interface that allows us to
drop all frames originating from user space processes.

Signed-off-by: Neil Horman <nhorman>
---
 cgdcbxd.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

Comment 18 Neil Horman 2013-01-17 20:31:59 UTC

Ok, I've uploaded patches for libmnl to bz 851148 and here, to add fitlering capabilities to libmnl and take advantage of it in cgdcbxd.  Does this look like a reasonable solution to you?  If so, I'll push both the libmnl and cgdcbxd solutions upstream

Comment 20 Neil Horman 2013-02-07 15:15:24 UTC

I've not, I've just built it locally, and tested that it doesn't break in the nominal case, but I've not tried to spoof kernel netlink messages from user space.  Feel free to brew it up and give it a try though.

Comment 21 Jaroslav Reznik 2013-04-05 12:24:08 UTC

This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle.
Changing version to '19'.

(As we did not run this process for some time, it could affect also pre-Fedora 19 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19

Comment 22 Neil Horman 2013-04-15 16:43:23 UTC

Florian, Response from upstream is that kernel commit 20e1db19db5d6b9e4e83021595eab0dc8f107bef
 should take care of this issue.  Is there any need to keep this open?

Note You need to log in before you can comment on or make changes to this bug.