autofs5 direct map and other enhancements requested in RHEL4.
Ian and I think that this request should wait until autofs v5 is stabilized in RHEL 5 before an attempt is made to backport it to RHEL 4. We will evaluate this request after RHEL 5 GA.
(In reply to comment #4)
> Jeff, this request is for RHEL 4.5 which is planned to be released after 5 GA.
> Waiting till 4.6 would mean waiting for roughly one year from now (unless we
> find a different release vehicle).
>
> Perhaps we can keep it in the planning, do the work if feasible and decide about
> support vs. tech preview later?

Fact is that I've done some of the work on porting the patches already. I took the opportunity to check how much effort would be needed while preparing a patch for another issue. This went quite well, but quite a bit of time will be needed to verify correctness and to test functionality.

I'll put a little effort into this as time permits so the patches will be ready in "short order" when the time comes. But I must stress that I'm having some difficulty resolving issues exposed by running the Connectathon tests. Until autofs performs solidly under these stress conditions these issues are the priority.

When I have something, what can we do for testing, keeping in mind that this task is low priority?

Ian
(In reply to comment #7)
> > Does autofs4 pass the connectathon tests?
> No.

Is there a BZ open on this?
(In reply to comment #7)
> (In reply to comment #6)
> > Does autofs4 pass the connectathon tests?
>
> No.

v4, no it doesn't. v5 basically does but that was the easy bit.

> > Lehman is very concerned about stability here. They are willing to go to
> > autofs5 to get failover r/o mounts but their biggest driver is stability /
> > reliable restartability.

I'm focused on this now, using the Connectathon test bed to stress autofs, and I'm finding quite a bit to fix. It's very difficult but it's very useful in terms of improving stability and reliability of autofs. I'll be stressing it in as many ways as I can think of.

> Autofs v5 does not give them failover of readonly mounts. If their concern is
> stability, then I'd question making such a large change in their production
> environment. Autofs v5 is a quite new and unproven code base. Autofs v5 does
> not guarantee /reliable/ restartability yet, either (Ian, correct me if I'm wrong).

Basically yes. v5 will allow you to shut down and start up the daemon when mounts are active but we don't have evidence of the possible effects of this in a production setting yet.

The method used at the moment is to unlink umount active mounts that have been left at shutdown and then mount the autofs mounts. This actually appears to work well. There are a couple of concerns:

1) Applications are not able to access new mounts during the restart time window. There is basically no way around this, as when there is no daemon to answer mount requests they simply can't happen. Existing open files on mounts left at shutdown remain until not in use and new requests go to the new mounts as expected.
2) There is a finite window between when the filesystem is unlink umounted and when the new filesystem is mounted. Similar to above but the daemon is running. Ideally we would add a remount option to the kernel module and this window would disappear but I haven't yet worked out how to do this in the userspace daemon. Basically I can't get hold of what I need when the NFS (or other) mount is atop the autofs mount.
3) As Jeff has pointed out recently, we don't know the effects of unlink umounting on file operations such as locking, but this should be ok.

So when autofs has finished the test cycle we are going to have to give it some very specific testing to clarify the impact of these possible problem areas.

I'd like to think that v5 will be more stable and reliable than v4 from the outset but the fact is that a lot of work has been done on this in v4 and there's a lot of new code in v5.

Ian
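As a rough illustration of the restart approach described in the comment above (detach whatever is still mounted at or below the autofs mount points, then mount autofs afresh), the following Python sketch picks the mounts that would need detaching out of a `/proc/mounts`-style listing. It is not the daemon's code. The function name `mounts_to_detach` and the sample listing are made up for the example, and "unlink umount" is assumed to mean something like a lazy `umount -l`.

```python
def mounts_to_detach(proc_mounts_text):
    """Return mount points at or below an autofs mount, deepest first."""
    entries = []
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        entries.append((fields[1], fields[2]))  # (mount point, fs type)

    autofs_roots = [mp for mp, fstype in entries if fstype == "autofs"]

    def under(path, root):
        # True if path is the root itself or lies beneath it.
        return path == root or path.startswith(root.rstrip("/") + "/")

    stale = {mp for mp, _ in entries if any(under(mp, r) for r in autofs_roots)}
    # Deepest paths first so children are detached before their parents.
    return sorted(stale, key=lambda p: (-p.count("/"), p))


# Hypothetical listing: one indirect autofs mount with an NFS mount on top,
# plus an unrelated NFS mount that must be left alone.
SAMPLE = """\
/dev/sda1 / ext3 rw 0 0
automount(pid2101) /misc autofs rw,fd=4,pgrp=2101,minproto=2,maxproto=4 0 0
server:/export/data /misc/data nfs rw,addr=192.0.2.10 0 0
server:/export/home /home nfs rw,addr=192.0.2.10 0 0
"""

print(mounts_to_detach(SAMPLE))
```

The ordering matters for the second concern listed above: the window in which nothing is mounted starts when the first entry is detached and ends only when the new autofs mount is in place.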
(In reply to comment #9)
> (In reply to comment #7)
> > > Does autofs4 pass the connectathon tests?
> > No.
>
> Is there a BZ open on this?

No. The connectathon test suite does not give just one PASS/FAIL result. It is a set of tests that stresses different parts of the automounter. Much of this has to do with the parsing of maps. As the parser has been historically fragile, I've been hesitant to make changes to it as it can have unforeseen ill effects on real customer installations.

The fact of the matter is that we don't 100% comply with the Sun map format. We don't claim to (this is something that is hopefully being addressed in v5). Changing v4 to be compliant *will* break existing installations. It's simply not a good idea to fix things just so the tests pass.
(In reply to comment #8)
> Lehman doesn't care about direct map support short term. They have already
> created two distinct maps worlds within lehman. They would like to get back to
> one for solaris and linux, but that can't happen for a year or so anyway, so
> it's a soft desire.
>
> The most important piece is stability. They have issues restarting autofs (big
> regression from autofs3 to autofs4 here). They also have issues with program
> maps failing (suddenly autofs starts passing the entire path not the
> mountpoint). If a daemon is hung in the "D" state they need to be able to have
> a new one start up. Of course if autofs never failed they would not be so
> concerned about how to handle failures.

I wish!

The issue regarding the paths is a bad one. It could still happen in v5 if a mount(8) fails but returns a success code back to the daemon. So we may have to remove the "sloppy" mount option to make sure bad mounts do return a fail to guard against this. Virtually all the NFS options that are needed are now supported by Linux mount so the "sloppy" mount option may not be so crucial any more.

The daemon getting into a "D" state is much more of a worry and I'd really like to get more information on it. I've heard about it from time to time over the years but no one has been willing to help gather information on it so I can try to resolve it. To be honest I thought it had gone away at some point with bug fixes made to the 2.6 kernel module. So, more info please.

> They are very interested in having failover work (not just at initial mount
> time, but at server failure). When did our plans change? I have email from Ian
> on 6/14 saying that he was working on it...

Indeed I did. And I have put time into it and I believe I can implement it, but this is an NFS issue and autofs proper has the priority. When I'm happy with the stability of autofs I'll be able to spend more time on it. I must point out that while I'm familiar with the NFS code I've not spent time on enhancements before so it will be a fairly slow process. Very sorry.

> So if the failover is not in the plans for autofs5 I will close this ticket as
> all Lehman wants from what we're giving is reliability in autofs4. I can then
> open a new ticket against autofs4 for each issue they have.

It's on the plans but it's not possible for autofs to provide this function. It has to be done in the NFS client kernel module. Once again, sorry, I wish I could get through this stuff more quickly so I could do this but that's really almost always the way.

Ian
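For readers unfamiliar with program maps: autofs runs the map executable with the lookup key as its argument and reads the map entry from its standard output. The sketch below, in Python, shows one way a program map could defend itself against the failure described above, where a full path arrives instead of a key. It is only an illustration under stated assumptions: the thread does not say how the affected program maps are written, and the table `ENTRIES`, the host `server.example.com`, and the function names `lookup` and `run` are all hypothetical. Printing nothing and returning a non-zero status is assumed here to be treated as a failed lookup.

```python
import io
import sys

# Hypothetical key -> map entry table; a real program map would usually
# consult LDAP, NIS, a database, or similar.
ENTRIES = {
    "data": "-ro,hard server.example.com:/export/data",
    "tools": "-rw,hard server.example.com:/export/tools",
}


def lookup(key):
    """Return the map entry for key, or None if the key is not acceptable."""
    # A key for an indirect map is a single path component. Anything that
    # looks like a path is rejected rather than guessed at.
    if not key or "/" in key:
        return None
    return ENTRIES.get(key)


def run(argv, out=sys.stdout):
    """Act as the program map's entry point and return its exit status."""
    if len(argv) != 2:
        return 1
    entry = lookup(argv[1])
    if entry is None:
        return 1  # no output: the lookup fails
    out.write(entry + "\n")
    return 0


# Demonstration: a proper key succeeds, a full path is refused.
buf = io.StringIO()
print(run(["auto.example", "data"], buf), buf.getvalue().strip())
print(run(["auto.example", "/misc/data"], buf))
```

Rejecting path-like keys does not fix the underlying daemon problem; it only keeps a bad request from turning into a bad mount.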
Downgrading priority to "high" in order to better reflect priorities which are:
1) make autofs5 stable in RHEL5
2) investigate a backport to RHEL4
(In reply to comment #14)
> Downgrading priority to "high" in order to better reflect priorities which are:
> 1) make autofs5 stable in RHEL5
> 2) investigate a backport to RHEL4

I now have a set of kernel patches for RHEL4 2.6.9-42.2.EL. As yet untested.

Ian
(In reply to comment #15)
> (In reply to comment #14)
> > Downgrading priority to "high" in order to better reflect priorities which are:
> > 1) make autofs5 stable in RHEL5
> > 2) investigate a backport to RHEL4
>
> I now have a set of kernel patches for RHEL4 2.6.9-42.2.EL.
> As yet untested.

I've updated my RHEL4 kernel (2.6.9-42.20.EL) and run the connectathon tests with autofs-4.1.4-197 against the standard and patched kernels.

The results were the same for both, which is an indication that adding the autofs version 5 kernel patches won't introduce regressions for autofs version 4.

More testing needs to be done with a patched kernel against version 4 in an actual usage environment before we can really be confident that we won't be introducing regressions though.

Any thoughts as to who would be able to help with this testing?

Ian
Created attachment 139707: Connectathon test log from unpatched RHEL4 kernel.
Created attachment 139708: Connectathon test log from autofs v5 patched RHEL4 kernel.
Created attachment 139711: Connectathon test log from autofs-5.0.1-0.rc2.20 against v5 patched RHEL4 kernel.
These test results are as expected from autofs version 5.
(In reply to comment #16)
> (In reply to comment #15)
> > (In reply to comment #14)
> > > Downgrading priority to "high" in order to better reflect priorities which are:
> > > 1) make autofs5 stable in RHEL5
> > > 2) investigate a backport to RHEL4
> >
> > I now have a set of kernel patches for RHEL4 2.6.9-42.2.EL.
> > As yet untested.
>
> I've updated my RHEL4 kernel (2.6.9-42.20.EL) and run the
> connectathon tests with autofs-4.1.4-197 against the standard
> and patched kernels.
>
> The results were the same for both which is an indication
> that adding the autofs version 5 kernel patches won't introduce
> regressions for autofs version 4.
>
> More testing needs to be done with a patched kernel against
> version 4 in an actual usage environment before we can really
> be confident that we won't be introducing regressions though.

Hi all,

I have created a CVS private kernel branch for this and built a test kernel into dust-4E-scratch. I've tested against autofs versions 4 and 5 and all appears fine.

So, initially we need those interested in testing version 5 to install this kernel and verify that their test machines still function as expected. While this is done I will put together a test plan.

For the impatient who wish to install autofs version 5 prior to receiving the test plan I recommend autofs-5.0.1-0.rc2.23 or above.

Feedback is welcome.

Ian
(In reply to comment #25)
> Ian, the 4.5 beta kernel has been built.
> Why is this bugzilla still in POST?

Sorry. I didn't merge the patches but I did check the merge. Setting to MODIFIED.

Ian
It doesn't look like autofs 5 has made it into 4.5 at least as of 49.EL. I see autofs 4 and another autofs directory in the kernel source which appears to be the old autofs which we do not build.
(In reply to comment #28)
> It doesn't look like autofs 5 has made it into 4.5 at least as of 49.EL. I see
> autofs 4 and another autofs directory in the kernel source which appears to be
> the old autofs which we do not build.

You're mistaken. The kernel module "autofs4" supports "autofs kernel protocols 3, 4 and 5", not to be confused with application version 4 or 5.

The reason it was done this way is, first, that there are already one too many autofs modules in the kernel. I think the autofs4 module should be renamed to autofs and the existing autofs module removed. While I'd like to start pushing that, I'm still reluctant to do so because of the potential for disruption and because, until recently, there was a long-standing unresolved bug. I'll have to wait for a while and see if that known problem has gone away, as the first question will almost certainly be about stability.

Ian
Ian, I see that the draft release notes for RHEL 4.5 do not mention Autofs v5. Would you kindly draft something for the release notes that describes what is new in RHEL 4.5 autofs, notes anything users should be aware of, and answers the questions about nomenclature and version numbers mentioned above? You can post a draft here. I've set requires_release_note=? to get Don involved. Tom
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0304.html