Bug 528134

Summary: qemu should set O_CLOEXEC on all file descriptors
Product: [Fedora] Fedora Reporter: Mark McLoughlin <markmc>
Component: qemuAssignee: Justin M. Forbes <jforbes>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: medium    
Version: rawhideCC: armbru, berrange, drepper, dwalsh, dwmw2, gcosta, itamar, jakub, jaswinder, jforbes, kwolf, markmc, quintela, schwab, tburke, virt-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-03-11 19:33:34 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 498969    

Description Mark McLoughlin 2009-10-09 10:10:25 UTC
See bug #515521, comments #48, #49, #51 and #52

glibc is now forking a ptchown program under certain circumstances

qemu is leaking file descriptors to that process because we do not set O_CLOEXEC on all file descriptors

In bug #515521, looking at the selinux policy added, we're leaking a tap fds, the pidfile fd and image file fds

We should go through qemu code and make sure O_CLOEXEC gets set

Markus, Juan, Kevin, Justin - any of you fancy taking this?

Comment 1 Kevin Wolf 2009-10-13 13:43:58 UTC
Hm, I'm not sure I agree with what glibc is doing here. Basically they want to forbid non-O_CLOEXEC files for programs - and what if there are descriptors that we do want to pass to processes that we fork ourselves, just not to glibc's silently forked processes?

I think we can work around it in some places, but for example the Linux userspace emulator looks like something where we simply _can't_ add O_CLOEXEC without breaking things.

Comment 2 Mark McLoughlin 2009-10-13 13:56:04 UTC
(In reply to comment #1)
> Hm, I'm not sure I agree with what glibc is doing here. Basically they want to
> forbid non-O_CLOEXEC files for programs - and what if there are descriptors
> that we do want to pass to processes that we fork ourselves, just not to
> glibc's silently forked processes?

Sounds reasonable to me. Moving back to glibc

Comment 3 Daniel Walsh 2009-10-13 15:42:51 UTC
Well if we know of one file descriptor that can not be closed, we can dontaudit the leak.  from svirt_t to ptchown_t

Comment 4 Ulrich Drepper 2009-10-13 15:57:24 UTC
You can reset the CLOEXEC flag for those cases when it should really matter.

The problem is simple: the default of all file descriptors from the first days of unix should have been to set CLOEXEC.  Then is the very, very few places where this flag is needed it can be set.  These places are trivially to find because the child won't have the descriptor and won't work.

Therefore, in code like qemu, the default should implicitly be changed to always set CLOEXEC (through a wrapper function to open etc or whatever).

Again, there are unlikely any places where this flag has to be unset and if there are, they are trivially to find.

Comment 5 Markus Armbruster 2009-10-13 16:40:21 UTC
Regardless of what the default for CLOEXEC is or should be, and without disputing the thesis that many, perhaps most applications would be better off have all their file descriptors CLOEXEC at all times, it is entirely legitimate for an application to have CLOEXEC off on some of its file descriptors.  By keeping it off, the application assumes responsibility for controlling its flow across exec() manually.

However, it can do so only for the exec()s it actually sees.  If a library exec()s behind the application's back, then the library is responsible for controlling the flow of file descriptors across that exec().  The library cannot know which file descriptors are CLOEXEC, because that's an application decision (regardless of what the default for CLOEXEC is).  Thus, the library must assume the application's file descriptors are not CLOEXEC, and close all of them.

If a library doesn't do that, how's the application supposed to keep a file descriptor open safely across one of its own exec()s?  The only way I can see is this: somehow lock out the library to prevent it from execing behind the application's back, clear CLOEXEC, exec().  Doesn't look practical to me.

Once again, I'm all for applications CLOEXECing all the file descriptors they don't need to keep open across an exec().  I just want to know how to deal with the ones we do want to keep open, safely, and in presence of libraries that do their own exec()ing.

Comment 6 Ulrich Drepper 2009-10-14 07:21:40 UTC
(In reply to comment #5)
> However, it can do so only for the exec()s it actually sees.  If a library
> exec()s behind the application's back, then the library is responsible for
> controlling the flow of file descriptors across that exec().

Do you think it is at all likely that any reasonable code will call exec() somewhere in a library and expect the child to inherit some file descriptors without controlling them?  I think this is highly unlikely at best and broken in all cases.

How would such a child process learn about the file descriptor is the first place?  We are not talking about 0 to 2, which are special and usually shouldn't be CLOEXEC.  Any other descriptor can and often will have a different ID in different runs.  The runtime can implicitly open descriptors and shift the values for the descriptors available to the application.  Therefore there has to be somehow a reference to the inherited descriptor being passed to the library (parameter, static var, ...) and then the library either dup2()s the descriptor or passes a parameter (textual or so) describing the descriptor value to the child).  In that case the library can and must make sure the descriptor is actually inherited.  Nothing else is save.  And in any case, the application knows that the file descriptor is used this way.

I recognize that there is the _potential_ for such a problem which requires code changes.  But a) it is unlikely and b) confining a security critical program like qemu as tightly as possible is crucial for system security.

Comment 7 Kevin Wolf 2009-10-14 07:34:33 UTC
(In reply to comment #6)
> Do you think it is at all likely that any reasonable code will call exec()
> somewhere in a library and expect the child to inherit some file descriptors
> without controlling them?

No, definitely not. The point is that the library assumes that the application doesn't need any non-O_CLOEXEC file descriptor while doing an exec() that the application didn't explicitly ask for and is probably unaware of it. Which in my opinion is a wrong assumption. I'm not saying that I know the perfect solution, but I do see a problem here.

That said, I see no reason why, say, image files shouldn't be O_CLOEXEC in qemu. We can do better here. But even if we can use O_CLOEXEC here, I think the fundamental problem still remains.

Comment 8 Mark McLoughlin 2009-10-14 08:22:17 UTC
Given that the application may have good reason to have some file descriptors not marked with O_CLOEXEC - and that the application knows nothing of pt_chown or any other glibc helper programs - it is only possible for glibc itself to ensure that no file descriptors are leaked to the pt_chown process.

Seems clear to me that glibc really needs to do the "close every file descriptor I'm not interested in to pass on" dance before execing a helper program like pt_chown.

Comment 9 Mark McLoughlin 2009-10-14 08:26:04 UTC
(In reply to comment #7)

> That said, I see no reason why, say, image files shouldn't be O_CLOEXEC in
> qemu. We can do better here.

Right, let's go ahead and do that.

> But even if we can use O_CLOEXEC here, I think the fundamental problem still 
> remains.

Indeed, I think we should have a separate "close all file descriptors before execing pt_chown" bug filed against glibc.

Comment 10 David Woodhouse 2009-10-14 12:31:09 UTC
Perhaps what we actually need is a new clone flag which says 'close all fds', for library helpers to use.

Or if these helpers actually _need_ one or more fds to be cloned, then maybe something more complex -- like a way to specify which fd(s) are to be kept open.

Comment 11 Markus Armbruster 2009-10-14 13:12:31 UTC
I quite agree that QEMU should make sure not to leak fds across its exec()s, and I also agree that CLOEXEC is a useful tool for that.  I don't know whether it actually leaks any right now.  Let's find out.  We can use this bug to track.  Even if it doesn't leak, preventive use of CLOEXEC makes sense.  Upstream job.

Independent of that, libraries should also make sure not leak fds across their exec()s, glibc for pt_chown in particular.  I think Uli made the case for that quite convincingly in comment#6.  Do we have a bug for tracking the glibc/pt_chown issue already?

Comment 12 Mark McLoughlin 2009-10-23 13:06:07 UTC
(In reply to comment #11)
> Independent of that, libraries should also make sure not leak fds across their
> exec()s, glibc for pt_chown in particular.  I think Uli made the case for that
> quite convincingly in comment#6.  Do we have a bug for tracking the
> glibc/pt_chown issue already?

Filed bug #530558 now

Comment 13 Kevin Wolf 2009-12-07 09:44:48 UTC
A patch to add FD_CLOEXEC was committed upstream as 40ff6d7e8dceca227e7f8a3e8e0d58b2c66d19b4. Mark, I'll leave it to you if you want to cherry-pick the patch from there or wait for the next rebase.

Comment 14 Fedora Admin XMLRPC Client 2010-03-09 16:54:23 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 15 Fedora Admin XMLRPC Client 2010-03-09 17:20:08 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 16 Justin M. Forbes 2010-03-11 19:33:34 UTC
This patch is in 0.12.x which is currently available in rawhide and in updates-testing for Fedora 12.