Description of problem:
If I try to cd to an automounted directory that doesn't exist, the automount daemon makes 4 queries to the LDAP server to determine that it really doesn't exist. (This is better than autofs-4.x, which took 9 queries!)

Running wireshark, I see it makes 4 queries and the only thing it changes is the filter:

1. (&(objectclass=automount)(automountKey=qwerty))
2. (&(objectclass=automount)(automountKey=/))
3. (&(objectclass=automount)(automountKey=qwerty))
4. (&(objectclass=automount)(automountKey=/))

Why does it query automountKey=/ when the search for the normal key fails? And why does it repeat both queries? It should just quit after the 1st query fails.

Version-Release number of selected component (if applicable):
autofs-5.0.0_beta6-2

How reproducible:
Every time

Steps to Reproduce:
1. Run wireshark and watch traffic to the LDAP server
2. In a shell, run 'cd /automnt/qwerty', where /automnt/qwerty doesn't exist
3. Watch wireshark capture 4 LDAP queries before the automount daemon returns "No such file or directory"

Actual results:
It took the automounter 4 LDAP queries, 2 of which were redundant, to determine that a certain mount point didn't exist.

Expected results:
It should only take 1 query.
(In reply to comment #0)
> Description of problem:
> If I try to cd to an automounted directory that doesn't exist, the automount
> daemon makes 4 queries to the LDAP server to determine that it really doesn't
> exist. (This is better than autofs-4.x which took 9 queries!)
>
> Running wireshark, I see it makes 4 queries and the only thing it changes is
> the filter:
>
> 1. (&(objectclass=automount)(automountKey=qwerty))
> 2. (&(objectclass=automount)(automountKey=/))
> 3. (&(objectclass=automount)(automountKey=qwerty))
> 4. (&(objectclass=automount)(automountKey=/))
>
> Why does it query automountKey=/ when the search for the normal key fails? And
> why does it repeat both queries? It should just quit after the 1st query fails.

Yes. I've seen this behaviour in several different forms over time. I'm sure there are still some opportunities for improvement in all the lookup modules.

First, the wildcard. When we look up a key and don't find a match in the map, there's still the possibility that there's a wildcard entry that will match any key. There's also the possibility that the map has changed since the last lookup, so we're stuck not knowing. So we need to try both.

It's done this way because I believe, like NIS, there is no guarantee as to the order query entries are returned, and we require matching a lookup key against a map key before matching against the wildcard key.

Perhaps you're thinking I could construct a search with an "|". Possibly I can, but it's not that straightforward, as there are a couple of other cases to allow for. I'll think about it.

The second issue you point out is far more difficult to deal with. Basically, autofs is at the mercy of applications making system calls that cause the kernel module to trigger a lookup. The kernel module attempted to cache negative lookups at one time but that didn't work properly.
I plan on implementing this in the kernel module in time to come, but because of the huge changes with v5 I want to let that stabilize first. Also, I need to think more about how I'll do it.

Ian
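The key-then-wildcard fallback described above can be sketched as follows. This is a minimal illustration, not the autofs source: the helper names (make_filter, lookup_key) and the injected search() callable are hypothetical, and it assumes the wildcard entry is stored under the key "/" as discussed later in this report.

```python
def make_filter(key):
    # Build the per-key LDAP filter seen in the wireshark capture.
    return "(&(objectclass=automount)(automountKey=%s))" % key

def lookup_key(search, key):
    # search() is assumed to take a filter string and return a list of
    # matching entries (an empty list when nothing matches).
    entries = search(make_filter(key))
    if entries:
        return entries[0]
    # No exact match: the map may still contain a wildcard entry that
    # matches any key, so issue the second query for automountKey=/.
    entries = search(make_filter("/"))
    return entries[0] if entries else None
```

This makes it clear why a miss always costs two queries: the exact-key search and the wildcard search are separate round trips.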
(In reply to comment #1)
> (In reply to comment #0)
> > Running wireshark, I see it makes 4 queries and the only thing it changes is
> > the filter:
> >
> > 1. (&(objectclass=automount)(automountKey=qwerty))
> > 2. (&(objectclass=automount)(automountKey=/))
> > 3. (&(objectclass=automount)(automountKey=qwerty))
> > 4. (&(objectclass=automount)(automountKey=/))
> >
> > Why does it query automountKey=/ when the search for the normal key fails?
> > And why does it repeat both queries? It should just quit after the 1st query
> > fails.

snip ...

> First the wildcard.
> When we look up a key and don't find a match in the map, there's
> still the possibility that there's a wildcard entry that will match
> any key. There's also the possibility that the map has changed
> since the last lookup, so we're stuck not knowing. So we need to try
> both.
>
> It's done this way because I believe, like NIS, there is no guarantee
> as to the order query entries are returned, and we require matching a
> lookup key against a map key before matching against the wildcard key.
>
> Perhaps you're thinking I could construct a search with an "|". Possibly
> I can, but it's not that straightforward, as there are a couple of other
> cases to allow for. I'll think about it.

Just to keep you updated: today I finally started work to try and combine these two lookups into one. It would be good if we can reduce the number of queries by half, at least to start with.

Ian
(In reply to comment #1)
> > 2. (&(objectclass=automount)(automountKey=/))
<snip>
> First the wildcard.
> When we look up a key and don't find a match in the map, there's
> still the possibility that there's a wildcard entry that will match
> any key.

I wasn't aware that '/' was a wildcard in LDAP. If I manually run

 ldapsearch ... '(&(objectclass=automount)(automountKey=/))'

it doesn't return anything, so it doesn't appear to work as a wildcard. If, however, I use '*' instead of '/', I get the entire map back:

 ldapsearch ... '(&(objectclass=automount)(automountKey=*))'

Should the '/' be a '*'?

> There's also the possibility that the map has changed
> since the last lookup, so we're stuck not knowing. So we need to try
> both.

If the search for the actual key doesn't return anything, though, then how will using a wildcard to get the map help? If the key doesn't exist when you ask the LDAP server for it directly, then downloading the entire map and manually searching for the key isn't going to help - it's still not going to exist.

Or are you looking for a wildcard in the map itself? For example, an entry like

 * server:/home/&

I can see why you might want to download the entire map to look for an entry like this; however, as I mentioned above, the '/' doesn't work as a wildcard if you're trying to get the entire map.

> The second issue you point out is far more difficult to deal with.
> Basically, autofs is at the mercy of applications making system
> calls that cause the kernel module to trigger a lookup. The kernel
> module attempted to cache negative lookups at one time but that
> didn't work properly. I plan on implementing this in the kernel
> module in time to come, but because of the huge changes with v5 I
> want to let that stabilize first. Also, I need to think more about
> how I'll do it.

I'm not following you here.
Are you saying that if I try

 cd /automnt/qwerty

the shell is somehow going to cause the automounter to query the LDAP server twice if the first attempt doesn't work?
(In reply to comment #3)
> (In reply to comment #1)
> > > 2. (&(objectclass=automount)(automountKey=/))
> <snip>
> > First the wildcard.
> > When we look up a key and don't find a match in the map, there's
> > still the possibility that there's a wildcard entry that will match
> > any key.
>
> I wasn't aware that '/' was a wildcard in LDAP. If I manually run
>  ldapsearch ... '(&(objectclass=automount)(automountKey=/))'
> it doesn't return anything, so it doesn't appear to work as a wildcard. If,
> however, I use '*' instead of '/', I get the entire map back:
>  ldapsearch ... '(&(objectclass=automount)(automountKey=*))'
>
> Should the '/' be a '*'?

Nope. The '/' is used as the autofs map wildcard within LDAP autofs maps. The '*' is the LDAP match-anything (LDAP wildcard), so it can't be used as an autofs map wildcard, as we need to match a specific LDAP map (autofs wildcard) entry.

> > There's also the possibility that the map has changed
> > since the last lookup, so we're stuck not knowing. So we need to try
> > both.
>
> If the search for the actual key doesn't return anything, though, then how
> will using a wildcard to get the map help? If the key doesn't exist when you
> ask the LDAP server for it directly, then downloading the entire map and
> manually searching for the key isn't going to help - it's still not going to
> exist.
>
> Or are you looking for a wildcard in the map itself? For example, an entry
> like
>  * server:/home/&
> I can see why you might want to download the entire map to look for an entry
> like this; however, as I mentioned above, the '/' doesn't work as a wildcard
> if you're trying to get the entire map.

We don't want to download the entire map for this. I can see your confusion over this, but I think the bit you're missing is that autofs treats the '/' from an LDAP map key to mean '*' internally within the autofs LDAP lookup module.
So the map entry

 * server:/home/&

is stored in LDAP as

 automountKey: /
 automountInformation: server:/home/&

so we can recognise it as the autofs wildcard map entry.

> > The second issue you point out is far more difficult to deal with.
> > Basically, autofs is at the mercy of applications making system
> > calls that cause the kernel module to trigger a lookup. The kernel
> > module attempted to cache negative lookups at one time but that
> > didn't work properly. I plan on implementing this in the kernel
> > module in time to come, but because of the huge changes with v5 I
> > want to let that stabilize first. Also, I need to think more about
> > how I'll do it.
>
> I'm not following you here. Are you saying that if I try
>  cd /automnt/qwerty
> the shell is somehow going to cause the automounter to query the LDAP server
> twice if the first attempt doesn't work?

Not quite what I meant, but that does appear to be what happens, at least for "cd". "ls", on the other hand, appears to be not so persistent.

Remember that what triggers a mount is a system call like open(2) or opendir(3) or some other such call, which causes a path lookup in the VFS, which calls the autofs4 filesystem methods. This then leads to an upcall to the userspace daemon. What I was trying to describe is that the calls the VFS makes to autofs4 are very specific, and autofs4 has very little information about what system call caused the lookup, so all it can do is react by making an upcall to the daemon.

There is at least one other case in the VFS lookup that can lead to a second call to the autofs4 module within the same lookup, and I have recently submitted a kernel patch that I hope will remedy this (though it's probably not the case here, as it relates to browsable maps). Unfortunately it took some time to come up with a solution to this case.
Once I'm sure that this (fairly straightforward, in the end) patch is functioning correctly, I will start thinking about caching failed mount callbacks to the daemon for some brief time, probably about 5 seconds, so that multiple system calls resulting in a failure will not generate multiple upcalls.

Maybe it appears I'm a bit paranoid, leaving the caching until I'm happy that the version 5 changes are sound. The kernel changes for version 5 are quite significant (about 40 small patches in all, including bug fixes), so I didn't want to obscure the base function with failure caching in the initial implementation.

Clear as mud, yes! So I know it's a problem, but I'm working on it. In the meantime, if I can merge the two lookups in the userspace LDAP module we can reduce the number of queries by half.

Ian
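The '/' ↔ '*' translation described in this comment can be sketched in a few lines. This is purely illustrative (the function names are made up, not from the autofs lookup module), but it captures the rule: '*' can't be stored literally as an automountKey because it is the LDAP match-anything wildcard, so the autofs map wildcard goes on the wire as '/'.

```python
def to_ldap_key(map_key):
    # Store the autofs map wildcard "*" under the key "/" in LDAP,
    # since a literal "*" would be interpreted as an LDAP wildcard.
    return "/" if map_key == "*" else map_key

def from_ldap_key(ldap_key):
    # Recognise "/" coming back from LDAP as the autofs map wildcard.
    return "*" if ldap_key == "/" else ldap_key
```

Ordinary keys pass through unchanged in both directions; only the wildcard entry is rewritten.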
(In reply to comment #2)
> (In reply to comment #1)
> > (In reply to comment #0)
> > > Running wireshark, I see it makes 4 queries and the only thing it changes
> > > is the filter:
> > >
> > > 1. (&(objectclass=automount)(automountKey=qwerty))
> > > 2. (&(objectclass=automount)(automountKey=/))
> > > 3. (&(objectclass=automount)(automountKey=qwerty))
> > > 4. (&(objectclass=automount)(automountKey=/))
> > >
> > > Why does it query automountKey=/ when the search for the normal key fails?
> > > And why does it repeat both queries? It should just quit after the 1st
> > > query fails.

Very interesting. After my first attempt to combine the wildcard and key lookup, I'm getting 2 queries for "ls" on an invalid key. That shouldn't be happening for this case. Working on it.

Ian
(In reply to comment #5)
> (In reply to comment #2)
> > > why does it repeat both queries? It should just quit after the 1st query
> > > fails.
>
> Very interesting.
> After my first attempt to combine the wildcard and key lookup,
> I'm getting 2 queries for "ls" on an invalid key. That
> shouldn't be happening for this case.

First, I've combined the LDAP query to look up the key and check for the wildcard entry into one, so that's done. It is included in autofs-5.0.0_beta6-7, which I've just now built. So if you could give this a try when it's available and you have time, that would be great.

I've investigated the multiple daemon upcalls again and have come to the same conclusion as previously.

An strace of "ls" shows

 stat("/ldap/ddddddd", 0x616b68) = -1 ENOENT (No such file or directory)
 lstat("/ldap/ddddddd", 0x616b68) = -1 ENOENT (No such file or directory)

which, after going through the kernel code path, results in two callbacks to the daemon.

An strace of "cd" shows

 stat("/ldap", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
 stat("/ldap/dbuast", 0x7fffff80f250) = -1 ENOENT (No such file or directory)
 chdir("/ldap/dbuast") = -1 ENOENT (No such file or directory)
 chdir("/ldap/dbuast") = -1 ENOENT (No such file or directory)

which results in three callbacks to the daemon. Why it tries to chdir a second time after the first one fails is a mystery to me.

At one time stat calls would not cause a callback, but things have changed a fair bit in the kernel and it's probably better that way anyway.

But the upshot of this is that implementing the caching of mount fails in the kernel module needs to be done as soon as possible. This functionality will be dependent on a patch that is currently pending in the -mm kernel, which seems to have attracted some reluctance at this stage. The patch is, however, making its way into the Rawhide kernel thanks to the efforts of Dave Jones. I'll let you know how this goes and how the caching of negative callbacks goes.

Ian
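The combined query mentioned above can be illustrated as a simple filter builder: one LDAP search that matches either the literal key or the wildcard entry ("/") via an OR. This is a sketch, not the module's actual code, and it assumes the key needs no filter escaping; a real implementation must escape special characters per RFC 4515.

```python
def combined_filter(key):
    # One round trip instead of two: match the exact key OR the
    # autofs wildcard entry, which is stored under automountKey=/.
    return ("(&(objectclass=automount)"
            "(|(automountKey=%s)(automountKey=/)))" % key)
```

The caller still has to prefer an exact-key match over the wildcard entry when both come back, since LDAP makes no guarantee about result ordering.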
(In reply to comment #4)
> So the map entry
>  * server:/home/&
> is stored in LDAP as
>  automountKey: /
>  automountInformation: server:/home/&
> so we can recognise it as the autofs wildcard map entry.

Ah-hah! I wasn't familiar with this detail of LDAP automount maps because we don't use wildcards in our maps. We have many different NFS servers and paths, so a wildcard entry wouldn't work for us, and I never even tried to create one. I can see, though, that trying to create an automountKey of '*' would be a problem since that's an LDAP wildcard.

Thanks! I guess I should go study the RFCs some more. :)

Jeff
(In reply to comment #6)
> I've investigated the multiple daemon upcalls again and have
> come to the same conclusion as previously.
>
> An strace of "ls" shows
>  stat("/ldap/ddddddd", 0x616b68) = -1 ENOENT (No such file or directory)
>  lstat("/ldap/ddddddd", 0x616b68) = -1 ENOENT (No such file or directory)
>
> which, after going through the kernel code path, results in two
> callbacks to the daemon.
>
> An strace of "cd" shows
>  stat("/ldap", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
>  stat("/ldap/dbuast", 0x7fffff80f250) = -1 ENOENT (No such file or directory)
>  chdir("/ldap/dbuast") = -1 ENOENT (No such file or directory)
>  chdir("/ldap/dbuast") = -1 ENOENT (No such file or directory)
>
> which results in three callbacks to the daemon. Why it tries to
> chdir a second time after the first one fails is a mystery to me.
>
> At one time stat calls would not cause a callback, but things have
> changed a fair bit in the kernel and it's probably better that way
> anyway.
>
> But the upshot of this is that implementing the caching of mount
> fails in the kernel module needs to be done as soon as possible.
> This functionality will be dependent on a patch that is currently
> pending in the -mm kernel, which seems to have attracted some
> reluctance at this stage. The patch is, however, making its way
> into the Rawhide kernel thanks to the efforts of Dave Jones. I'll
> let you know how this goes and how the caching of negative
> callbacks goes.

I've done the caching of failed lookups. Now you should see just one query to the LDAP server. We may need to tune the time the failure remains negative; I set it to 10 seconds to start with, so please let me know how it goes.

Unfortunately I couldn't do this in the kernel module, which would have been the best place. I've had to do it in the userspace daemon. Not quite as efficient, but the result is the same.

The change is available in version autofs-5.0.0_beta6-8.
Note that there has been a version change to avoid upgrade problems as we go forward. No doubt you'll notice, as the next version after the one above is autofs-5.0.1-0.rc1.1.

Ian
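A negative cache like the one described in the previous comment can be sketched as a key-to-timestamp map with a fixed TTL. This is a minimal illustration in the spirit of the 10-second window mentioned above; the class, its methods, and the injectable clock are all hypothetical, not the daemon's actual implementation.

```python
import time

class NegativeCache:
    def __init__(self, ttl=10.0, clock=time.monotonic):
        self.ttl = ttl          # seconds a failed lookup stays negative
        self.clock = clock      # injectable clock, handy for testing
        self.failed = {}        # key -> time the lookup last failed

    def record_failure(self, key):
        # Remember when this key failed so repeat triggers within the
        # TTL can be answered without another LDAP query.
        self.failed[key] = self.clock()

    def is_negative(self, key):
        # True while the key's last failure is younger than the TTL;
        # expired entries are dropped so a fresh query can be made,
        # covering the case where the map has changed in the meantime.
        t = self.failed.get(key)
        if t is None:
            return False
        if self.clock() - t > self.ttl:
            del self.failed[key]
            return False
        return True
```

The TTL is the tuning knob mentioned above: too short and repeated system calls still fan out into multiple queries; too long and a newly added map entry takes that long to become mountable after a failed attempt.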
I've upgraded to autofs-5.0.1-0.rc1.1 and kernel-2.6.17-1.2405.fc6, and I ran wireshark while running 'ls /automnt/qwerty'. The combined search filter is working; it's now looking for

 (&(objectclass=automount)(|(automountKey=qwerty)(automountKey=/)))

However, I'm still seeing two LDAP searches go out on the wire (the 2nd search is about 0.7 seconds after the 1st). It's getting better!
I upgraded my FC6 boxes to

 autofs-5.0.1-0.rc2.8
 kernel-2.6.18-1.2726.fc6

(among other packages) and I tested the LDAP queries again. Today I'm only seeing one LDAP search request for non-existent mount points. It looks like this BZ can be closed!

Thanks!
Jeff
(In reply to comment #10)
> I upgraded my FC6 boxes to
>  autofs-5.0.1-0.rc2.8
>  kernel-2.6.18-1.2726.fc6
> (among other packages) and I tested the LDAP queries again. Today I'm only
> seeing one LDAP search request for non-existent mount points. It looks like
> this BZ can be closed!
>
> Thanks!
> Jeff

Thanks Jeff. To be honest, it should have been fixed when you last tested it, and no matter how hard I tried I couldn't see why it didn't function as required.

Ian