Bug 467999
Summary: | GFS2: Please reduce races when accessing the gfs2 metafs | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Steve Whitehouse <swhiteho> | ||||||
Component: | gfs2-utils | Assignee: | Steve Whitehouse <swhiteho> | ||||||
Status: | CLOSED WONTFIX | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
Severity: | medium | Docs Contact: | |||||||
Priority: | medium | ||||||||
Version: | 19 | CC: | adas, agk, bmarzins, cfeist, fdinitto, swhiteho | ||||||
Target Milestone: | --- | ||||||||
Target Release: | --- | ||||||||
Hardware: | All | ||||||||
OS: | Linux | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2013-05-24 14:51:28 UTC | Type: | --- | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Attachments: |
|
Description
Steve Whitehouse
2008-10-22 10:01:57 UTC
I am taking this one since i need to rework some of the original patch that lead to this problem in the first place. Created attachment 321904 [details]
proposed fix
The patch in attachment addresses differnt issues at the same time:
- it introduces a better randomness in mount_gfs2_meta by using mkdtemp (this is a required change for security reason).
- the use of mkdtemp makes the whole dir_exists code unrequired (hence removed by the patch).
- we force each tool to create its own meta mount. This fixes any possible race conditions between tools (and thanks to the use of mkdtemp that will guarantee the creation of a unique mount point). It also makes find_gfs2_meta function unrequired (hence removed by the patch).
- cleanup struct gfs2_sbd of unrequired fields.
- cleanup the cleanup_metafs code path.
- cleanup exit path from mount_gfs2_meta.
- simplify code around different tools.
The only con of this patch is that it will mount gfs2_meta a few times during tools execution, but given the amount of benefits, i don't see it as a problem at all.
Fabio
The proposed patch looks good, but it doesn't actually do what was requested in the opening comment. I think you need to add to mount_gfs2_meta() code to open the metafs root directory and keep that fd somewhere (gfs2_sbd ?) until the metafs is umounted. Also we need a check right after the open to ensure that we didn't race with some other mount event, as per items #1 and #3 in the opening comment. It shouldn't be too tricky to add that in, and it will prevent any other process fiddling with the mount while its in use. It looks like only the quota tool actually wants to try and open the metafs more than once during its operation. There is no need for that. It should be ok to remove all the existing mount/umount of the metafs and just do it once when the tool starts. I also see no sign of the -at() syscalls mentioned in item #2 of the opening comment. So I do like the patch, its just that it doesn't seem to address this particular bz. (In reply to comment #3) > The proposed patch looks good, but it doesn't actually do what was requested in > the opening comment. > > I think you need to add to mount_gfs2_meta() code to open the metafs root > directory and keep that fd somewhere (gfs2_sbd ?) until the metafs is umounted. Each mount_gfs2_meta will create a unique mount point. meta_fd is filled by lock_for_admin that each tool invoke right after mount_gfs2_meta. lock_for_admin could be invoked directly by mount_gfs2_meta and turned static. This could avoid wrong usage of mount_gfs2_meta in future. > Also we need a check right after the open to ensure that we didn't race with > some other mount event, You cannot race for mount anymore. Each time you invoke mount_gfs2_meta, a new mount point is created (via mkdtemp) that is unique to that session. mkdtemp will guarantee that there is no overlapping. > as per items #1 and #3 in the opening comment. It > shouldn't be too tricky to add that in, and it will prevent any other process > fiddling with the mount while its in use. As above, there is no need to anymore. Note that we don't call find_gfs2_meta like before (it doesn't exists). > It looks like only the quota tool actually wants to try and open the metafs > more than once during its operation. There is no need for that. It should be ok > to remove all the existing mount/umount of the metafs and just do it once when > the tool starts. I didn't look at the whole quota code. This is another optimization outside the scope of this patch. > I also see no sign of the -at() syscalls mentioned in item #2 of the opening > comment. So I do like the patch, its just that it doesn't seem to address this > particular bz. I don't think we need -at() anymore for the same reasons as above (unless I am still missing something, but I am happy to be proven wrong). Fabio > > The proposed patch looks good, but it doesn't actually do what was requested in > > the opening comment. > > > > I think you need to add to mount_gfs2_meta() code to open the metafs root > > directory and keep that fd somewhere (gfs2_sbd ?) until the metafs is umounted. > Each mount_gfs2_meta will create a unique mount point. meta_fd is filled by > lock_for_admin that each tool invoke right after mount_gfs2_meta. > lock_for_admin could be invoked directly by mount_gfs2_meta and turned static. > This could avoid wrong usage of mount_gfs2_meta in future. Yes, I think the making lock_for_admin static is a good plan as you suggest. However it doesn't check that its opened the correct directory, so the race still exists. I'm not talking about clashing mount point names with other tools, I know that isn't a problem, I'm talking about the possibility that a user tries to do something (e.g. mount --move /tmp/.something /somewhereelse) which results in the tool picking the wrong directory. So it still needs to check that it opened the right thing, which is not that tricky to do. Just stat it and compare the device number with the original fs. > > Also we need a check right after the open to ensure that we didn't race with > > some other mount event, > You cannot race for mount anymore. Each time you invoke mount_gfs2_meta, a new > mount point is created (via mkdtemp) that is unique to that session. mkdtemp > will guarantee that there is no overlapping. > See above. There is still a race, its just not related to the other tools, and the window is small. > > as per items #1 and #3 in the opening comment. It > > shouldn't be too tricky to add that in, and it will prevent any other process > > fiddling with the mount while its in use. > > As above, there is no need to anymore. Note that we don't call find_gfs2_meta > like before (it doesn't exists). Yes, I understand that that, and thats ok, it seems a reasonable thing to do. > > It looks like only the quota tool actually wants to try and open the metafs > > more than once during its operation. There is no need for that. It should be ok > > to remove all the existing mount/umount of the metafs and just do it once when > > the tool starts. > > I didn't look at the whole quota code. This is another optimization outside the > scope of this patch. Yes, we can do that later its no big deal. On the other hand if we are not going to do it now, we should open a new bz for it, otherwise it will get forgotten. > > I also see no sign of the -at() syscalls mentioned in item #2 of the opening > > comment. So I do like the patch, its just that it doesn't seem to address this > > particular bz. > I don't think we need -at() anymore for the same reasons as above (unless I am > still missing something, but I am happy to be proven wrong). Yes, we do. Someone can come along and move the mount, or mount something else over the top of it, or ...... so its an extra safety net since we can then be 100% sure that nobody can do anything nasty while the tool is working. It also saves having to pass the mount point as a string to all the other functions that access the metafs files, we can just pass the fd of the root instead. (In reply to comment #5) > > > The proposed patch looks good, but it doesn't actually do what was requested in > > > the opening comment. > > > > > > I think you need to add to mount_gfs2_meta() code to open the metafs root > > > directory and keep that fd somewhere (gfs2_sbd ?) until the metafs is umounted. > > > Each mount_gfs2_meta will create a unique mount point. meta_fd is filled by > > lock_for_admin that each tool invoke right after mount_gfs2_meta. > > lock_for_admin could be invoked directly by mount_gfs2_meta and turned static. > > This could avoid wrong usage of mount_gfs2_meta in future. > > Yes, I think the making lock_for_admin static is a good plan as you suggest. Ok, updating the patch now. > However it doesn't check that its opened the correct directory, so the race > still exists. I'm not talking about clashing mount point names with other > tools, I know that isn't a problem, I'm talking about the possibility that a > user tries to do something (e.g. mount --move /tmp/.something /somewhereelse) > which results in the tool picking the wrong directory. So it still needs to > check that it opened the right thing, which is not that tricky to do. Just stat > it and compare the device number with the original fs. Well we are mounting as root here. In order to do those operations you need to be root. At that point you can do much more damage in other way than trying to exploit this corner case. Should we really worry about it? > > > > Also we need a check right after the open to ensure that we didn't race with > > > some other mount event, > > > You cannot race for mount anymore. Each time you invoke mount_gfs2_meta, a new > > mount point is created (via mkdtemp) that is unique to that session. mkdtemp > > will guarantee that there is no overlapping. > > > See above. There is still a race, its just not related to the other tools, and > the window is small. > same comment about root as above :) > > > > It looks like only the quota tool actually wants to try and open the metafs > > > more than once during its operation. There is no need for that. It should be ok > > > to remove all the existing mount/umount of the metafs and just do it once when > > > the tool starts. > > > > I didn't look at the whole quota code. This is another optimization outside the > > scope of this patch. > > Yes, we can do that later its no big deal. On the other hand if we are not > going to do it now, we should open a new bz for it, otherwise it will get > forgotten. Please file another bug otherwise this one will be messy. > > > > I also see no sign of the -at() syscalls mentioned in item #2 of the opening > > > comment. So I do like the patch, its just that it doesn't seem to address this > > > particular bz. > > > I don't think we need -at() anymore for the same reasons as above (unless I am > > still missing something, but I am happy to be proven wrong). > > Yes, we do. Someone can come along and move the mount, or mount something else > over the top of it, or ...... so its an extra safety net since we can then be > 100% sure that nobody can do anything nasty while the tool is working. It also > saves having to pass the mount point as a string to all the other functions > that access the metafs files, we can just pass the fd of the root instead. Ok, now I understand what you mean, but then.. to perform those operations you already have enough privileges to make much more damage to the system. Is it worth headache to fix this issue at the end? Fabio Created attachment 321920 [details]
fix for part of the issues
This patch makes lock_admin static and always invoked by mount_gfs2_meta as discussed.
Fabio
14:16 < bob_home> Yes. It's okay. Nomine Padre et Fili et Spiritus Sancti, Amen. There. I've blessed it. :7) just for the record, also bob is ok with the patch :) This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle. Changing version to '10'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle. Changing version to '11'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping This message is a reminder that Fedora 11 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 11. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '11'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 11's end of life. Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 11 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping This bug appears to have been reported against 'rawhide' during the Fedora 14 development cycle. Changing version to '14'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping Plan is to drop gfs2_quota going forward, so we only need to worry about the grow and jadd utilities since those will be the only ones left using the meta filesystem at that stage. This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component. This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle. Changing version to '19'. (As we did not run this process for some time, it could affect also pre-Fedora 19 development cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.) More information and reason for this action is here: https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19 This is basically complete now. Very little uses the metafs in upstream - only really the grow/jadd process. Everything else is done via a proper interface. The correct way to resolve the remaining issues is really just to remove the metafs interface entirely and do everything via normal kernel interfaces. That may require a little bit of work for grow/jadd, but is not impossible. Either way, I think we can reasonable close this one at this stage. |