Bug 467999

Summary: GFS2: Please reduce races when accessing the gfs2 metafs
Product: [Fedora] Fedora Reporter: Steve Whitehouse <swhiteho>
Component: gfs2-utilsAssignee: Steve Whitehouse <swhiteho>
Status: CLOSED WONTFIX QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: medium    
Version: 19CC: adas, agk, bmarzins, cfeist, fdinitto, swhiteho
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-05-24 14:51:28 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
proposed fix
none
fix for part of the issues none

Description Steve Whitehouse 2008-10-22 10:01:57 UTC
There are a number of races which occur when using the gfs2meta fs from the various utilities which need to be fixed. I'd suggest something along the lines of the following:

1. Keep an open fd on the root of the metafs for the whole time that the tool is running (to avoid someone else doing a umount in the middle of the operation)

2. All accesses to files within the metafs are to be done via the various -at syscalls only. Preferably openat and then whatever operations are required.

3. Once the root dir of the metafs has been opened, do a stat and _check_ that we really did get the correct metafs for the filesystem that we intended to operate on, just in case we raced with someone doing mount --move etc.

4. It would be nice to do something about the mount point in /tmp if possible. Ideally we'd want to make that mount private to the tool's namespace. This is a more tricky problem though, so it might need some more thought, but shouldn't stop us doing the first three items anyway. At the very least we should audit it against all the usual security problems one might cause by using /tmp files/directories to ensure that this is secure.

Comment 1 Fabio Massimo Di Nitto 2008-10-30 09:44:36 UTC
I am taking this one since i need to rework some of the original patch that lead to this problem in the first place.

Comment 2 Fabio Massimo Di Nitto 2008-10-30 10:06:38 UTC
Created attachment 321904 [details]
proposed fix

The patch in attachment addresses differnt issues at the same time:

- it introduces a better randomness in mount_gfs2_meta by using mkdtemp (this is a required change for security reason).
- the use of mkdtemp makes the whole dir_exists code unrequired (hence removed by the patch).
- we force each tool to create its own meta mount. This fixes any possible race conditions between tools (and thanks to the use of mkdtemp that will guarantee the creation of a unique mount point). It also makes find_gfs2_meta function unrequired (hence removed by the patch).
- cleanup struct gfs2_sbd of unrequired fields.
- cleanup the cleanup_metafs code path.
- cleanup exit path from mount_gfs2_meta.
- simplify code around different tools.

The only con of this patch is that it will mount gfs2_meta a few times during tools execution, but given the amount of benefits, i don't see it as a problem at all.

Fabio

Comment 3 Steve Whitehouse 2008-10-30 10:21:03 UTC
The proposed patch looks good, but it doesn't actually do what was requested in the opening comment.

I think you need to add to mount_gfs2_meta() code to open the metafs root directory and keep that fd somewhere (gfs2_sbd ?) until the metafs is umounted. Also we need a check right after the open to ensure that we didn't race with some other mount event, as per items #1 and #3 in the opening comment. It shouldn't be too tricky to add that in, and it will prevent any other process fiddling with the mount while its in use.

It looks like only the quota tool actually wants to try and open the metafs more than once during its operation. There is no need for that. It should be ok to remove all the existing mount/umount of the metafs and just do it once when the tool starts.

I also see no sign of the -at() syscalls mentioned in item #2 of the opening comment. So I do like the patch, its just that it doesn't seem to address this particular bz.

Comment 4 Fabio Massimo Di Nitto 2008-10-30 10:41:10 UTC
(In reply to comment #3)
> The proposed patch looks good, but it doesn't actually do what was requested in
> the opening comment.
> 
> I think you need to add to mount_gfs2_meta() code to open the metafs root
> directory and keep that fd somewhere (gfs2_sbd ?) until the metafs is umounted.

Each mount_gfs2_meta will create a unique mount point. meta_fd is filled by lock_for_admin that each tool invoke right after mount_gfs2_meta.
lock_for_admin could be invoked directly by mount_gfs2_meta and turned static. This could avoid wrong usage of mount_gfs2_meta in future.

> Also we need a check right after the open to ensure that we didn't race with
> some other mount event,

You cannot race for mount anymore. Each time you invoke mount_gfs2_meta, a new mount point is created (via mkdtemp) that is unique to that session. mkdtemp will guarantee that there is no overlapping.

> as per items #1 and #3 in the opening comment. It
> shouldn't be too tricky to add that in, and it will prevent any other process
> fiddling with the mount while its in use.

As above, there is no need to anymore. Note that we don't call find_gfs2_meta like before (it doesn't exists).

> It looks like only the quota tool actually wants to try and open the metafs
> more than once during its operation. There is no need for that. It should be ok
> to remove all the existing mount/umount of the metafs and just do it once when
> the tool starts.

I didn't look at the whole quota code. This is another optimization outside the scope of this patch.

> I also see no sign of the -at() syscalls mentioned in item #2 of the opening
> comment. So I do like the patch, its just that it doesn't seem to address this
> particular bz.

I don't think we need -at() anymore for the same reasons as above (unless I am still missing something, but I am happy to be proven wrong).

Fabio

Comment 5 Steve Whitehouse 2008-10-30 11:20:07 UTC
> > The proposed patch looks good, but it doesn't actually do what was requested in
> > the opening comment.
> > 
> > I think you need to add to mount_gfs2_meta() code to open the metafs root
> > directory and keep that fd somewhere (gfs2_sbd ?) until the metafs is umounted.

> Each mount_gfs2_meta will create a unique mount point. meta_fd is filled by
> lock_for_admin that each tool invoke right after mount_gfs2_meta.
> lock_for_admin could be invoked directly by mount_gfs2_meta and turned static.
> This could avoid wrong usage of mount_gfs2_meta in future.

Yes, I think the making lock_for_admin static is a good plan as you suggest. However it doesn't check that its opened the correct directory, so the race still exists. I'm not talking about clashing mount point names with other tools, I know that isn't a problem, I'm talking about the possibility that a user tries to do something (e.g. mount --move /tmp/.something /somewhereelse) which results in the tool picking the wrong directory. So it still needs to check that it opened the right thing, which is not that tricky to do. Just stat it and compare the device number with the original fs.

> > Also we need a check right after the open to ensure that we didn't race with
> > some other mount event,

> You cannot race for mount anymore. Each time you invoke mount_gfs2_meta, a new
> mount point is created (via mkdtemp) that is unique to that session. mkdtemp
> will guarantee that there is no overlapping.
>
See above. There is still a race, its just not related to the other tools, and the window is small.

> > as per items #1 and #3 in the opening comment. It
> > shouldn't be too tricky to add that in, and it will prevent any other process
> > fiddling with the mount while its in use.
>
> As above, there is no need to anymore. Note that we don't call find_gfs2_meta
> like before (it doesn't exists).

Yes, I understand that that, and thats ok, it seems a reasonable thing to do.

> > It looks like only the quota tool actually wants to try and open the metafs
> > more than once during its operation. There is no need for that. It should be ok
> > to remove all the existing mount/umount of the metafs and just do it once when
> > the tool starts.
>
> I didn't look at the whole quota code. This is another optimization outside the
> scope of this patch.

Yes, we can do that later its no big deal. On the other hand if we are not going to do it now, we should open a new bz for it, otherwise it will get forgotten.

> > I also see no sign of the -at() syscalls mentioned in item #2 of the opening
> > comment. So I do like the patch, its just that it doesn't seem to address this
> > particular bz.

> I don't think we need -at() anymore for the same reasons as above (unless I am
> still missing something, but I am happy to be proven wrong).

Yes, we do. Someone can come along and move the mount, or mount something else over the top of it, or ...... so its an extra safety net since we can then be 100% sure that nobody can do anything nasty while the tool is working. It also saves having to pass the mount point as a string to all the other functions that access the metafs files, we can just pass the fd of the root instead.

Comment 6 Fabio Massimo Di Nitto 2008-10-30 11:42:43 UTC
(In reply to comment #5)
> > > The proposed patch looks good, but it doesn't actually do what was requested in
> > > the opening comment.
> > > 
> > > I think you need to add to mount_gfs2_meta() code to open the metafs root
> > > directory and keep that fd somewhere (gfs2_sbd ?) until the metafs is umounted.
> 
> > Each mount_gfs2_meta will create a unique mount point. meta_fd is filled by
> > lock_for_admin that each tool invoke right after mount_gfs2_meta.
> > lock_for_admin could be invoked directly by mount_gfs2_meta and turned static.
> > This could avoid wrong usage of mount_gfs2_meta in future.
> 
> Yes, I think the making lock_for_admin static is a good plan as you suggest.

Ok, updating the patch now.

> However it doesn't check that its opened the correct directory, so the race
> still exists. I'm not talking about clashing mount point names with other
> tools, I know that isn't a problem, I'm talking about the possibility that a
> user tries to do something (e.g. mount --move /tmp/.something /somewhereelse)
> which results in the tool picking the wrong directory. So it still needs to
> check that it opened the right thing, which is not that tricky to do. Just stat
> it and compare the device number with the original fs.

Well we are mounting as root here. In order to do those operations you need to be root. At that point you can do much more damage in other way than trying to exploit this corner case.
Should we really worry about it?

> 
> > > Also we need a check right after the open to ensure that we didn't race with
> > > some other mount event,
> 
> > You cannot race for mount anymore. Each time you invoke mount_gfs2_meta, a new
> > mount point is created (via mkdtemp) that is unique to that session. mkdtemp
> > will guarantee that there is no overlapping.
> >
> See above. There is still a race, its just not related to the other tools, and
> the window is small.
> 

same comment about root as above :)

> 
> > > It looks like only the quota tool actually wants to try and open the metafs
> > > more than once during its operation. There is no need for that. It should be ok
> > > to remove all the existing mount/umount of the metafs and just do it once when
> > > the tool starts.
> >
> > I didn't look at the whole quota code. This is another optimization outside the
> > scope of this patch.
> 
> Yes, we can do that later its no big deal. On the other hand if we are not
> going to do it now, we should open a new bz for it, otherwise it will get
> forgotten.

Please file another bug otherwise this one will be messy.

> 
> > > I also see no sign of the -at() syscalls mentioned in item #2 of the opening
> > > comment. So I do like the patch, its just that it doesn't seem to address this
> > > particular bz.
> 
> > I don't think we need -at() anymore for the same reasons as above (unless I am
> > still missing something, but I am happy to be proven wrong).
> 
> Yes, we do. Someone can come along and move the mount, or mount something else
> over the top of it, or ...... so its an extra safety net since we can then be
> 100% sure that nobody can do anything nasty while the tool is working. It also
> saves having to pass the mount point as a string to all the other functions
> that access the metafs files, we can just pass the fd of the root instead.

Ok, now I understand what you mean, but then.. to perform those operations you already have enough privileges to make much more damage to the system. Is it worth headache to fix this issue at the end?

Fabio

Comment 7 Fabio Massimo Di Nitto 2008-10-30 11:53:47 UTC
Created attachment 321920 [details]
fix for part of the issues

This patch makes lock_admin static and always invoked by mount_gfs2_meta as discussed.

Fabio

Comment 8 Fabio Massimo Di Nitto 2008-10-30 13:16:56 UTC
14:16 < bob_home> Yes.  It's okay.  Nomine Padre et Fili et Spiritus Sancti, Amen.  There.  I've blessed it.   :7)

just for the record, also bob is ok with the patch :)

Comment 9 Bug Zapper 2008-11-26 04:06:57 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 10 Bug Zapper 2009-06-09 09:49:36 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 11 Bug Zapper 2010-04-27 12:17:29 UTC
This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 12 Bug Zapper 2010-07-30 10:33:18 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 14 development cycle.
Changing version to '14'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 13 Steve Whitehouse 2010-08-19 13:18:27 UTC
Plan is to drop gfs2_quota going forward, so we only need to worry about the grow and jadd utilities since those will be the only ones left using the meta filesystem at that stage.

Comment 14 Fedora Admin XMLRPC Client 2010-12-02 17:06:07 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 15 Fedora End Of Life 2013-04-03 18:22:33 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle.
Changing version to '19'.

(As we did not run this process for some time, it could affect also pre-Fedora 19 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19

Comment 16 Steve Whitehouse 2013-05-24 14:51:28 UTC
This is basically complete now. Very little uses the metafs in upstream - only really the grow/jadd process. Everything else is done via a proper interface.

The correct way to resolve the remaining issues is really just to remove the metafs interface entirely and do everything via normal kernel interfaces. That may require a little bit of work for grow/jadd, but is not impossible.

Either way, I think we can reasonable close this one at this stage.