Description of problem:
After loading the 5.4 beta, autofs performs drastically faster with file maps, but with large maps it still uses up ~10% CPU on my VMs. That still won't scale well for our virtualization efforts.

Version-Release number of selected component (if applicable):
5.4 with autofs-5.0.1-0.rc2.129.bz510530.1

How reproducible:
Very.

Steps to Reproduce:
1. Load a large (9k+ entry) direct map and, if necessary, a 33k-entry indirect map.
2. Use the newest autofs, autofs-5.0.1-0.rc2.129.bz510530.1.

Actual results:
autofs consumes between 8-16% CPU on average.

Expected results:
Much less CPU consumed (less than 1%, as with LDAP).

Additional info:
We're using LDAP at our company now, so the urgency is lower, but LDAP comes with its own set of problems (network/service dependencies). We would prefer to use files.

We (Deke, Bob, Dave, Ducky, Mike) had a discussion with you at the beginning of 2009 where we talked about possibly building a binary map of the map table in memory instead of doing serialized searches. I haven't checked the source lately to see whether you were working on this, but it sounded like a good idea at the time. I think we had an action to open the FR on your site but hadn't done that yet.

Anyway -- in reference to bug 510530 -- there was speculation that the CPU utilization might be attributable to things other than the map re-read every 4 seconds. I've included the output of a batch top command:

top -b -n 1500 -d 0.2 -p 12157

This should give you a view into what it's doing CPU-wise. Please let me know what other information you'd like to have. Thanks
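In case it helps with reproduction: a synthetic map of comparable size can be thrown together quickly. A minimal sketch -- the server name and paths here are made up, not our real map:

$ for i in $(seq 1 9000); do
>   echo "/direct/test$i  nfssrv:/export/test$i"
> done > /etc/auto.direct_test
$ echo "/-  /etc/auto.direct_test" >> /etc/auto.master

Restarting autofs after that should show the same CPU pattern we're seeing.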
Created attachment 351396 [details]
batch top process to monitor automount over 5 minutes

top -b -n 1500 -d 0.2 -p 12157
(In reply to comment #0)
> Description of problem:
> After loading the 5.4 beta, autofs performs drastically faster with file
> maps, but with large maps it still uses up ~10% CPU on my VMs. That still
> won't scale well for our virtualization efforts.
>
> Version-Release number of selected component (if applicable):
> 5.4 with autofs-5.0.1-0.rc2.129.bz510530.1
>
> How reproducible:
> Very.
>
> Steps to Reproduce:
> 1. Load a large (9k+ entry) direct map and, if necessary, a 33k-entry
>    indirect map.
> 2. Use the newest autofs, autofs-5.0.1-0.rc2.129.bz510530.1.

Yes, I should be able to reproduce this.

> Actual results:
> autofs consumes between 8-16% CPU on average.

But there are frequent examples where the CPU pegs at 100%. That shouldn't be happening, so I must have missed something.

> Expected results:
> Much less CPU consumed (less than 1%, as with LDAP).
>
> Additional info:
> We're using LDAP at our company now, so the urgency is lower, but LDAP
> comes with its own set of problems (network/service dependencies). We
> would prefer to use files.
>
> We (Deke, Bob, Dave, Ducky, Mike) had a discussion with you at the
> beginning of 2009 where we talked about possibly building a binary map of
> the map table in memory instead of doing serialized searches. I haven't
> checked the source lately to see whether you were working on this, but it
> sounded like a good idea at the time. I think we had an action to open
> the FR on your site but hadn't done that yet.

Indeed, I remember it well, and I was hoping for a better result than this. Then again, there was a lot of change aimed at large-map improvements, and it clearly needs a bit more work.

We have had an internal cache in autofs for a long time, but it was ineffective with file maps, and lookups were performing poorly for large maps. It turned out that Valerie Aurora Henson (from our file systems group) posted several patches, and one of the things they covered was replacing the hashing algorithm for the internal cache (before I managed to get to this myself). This led to an investigation of how well (or badly, in this case) the cache worked for large maps. The results clearly showed:

1) The cache size was way too small for large maps.

2) The distribution of entries in the cache was very poor with the original, simple-minded algorithm when the cache size (number of hash buckets) was increased.

3) The overhead of entry lookup is quite sensitive to hash chain length. This is almost obvious, but we may also get an uneven distribution of entries depending on key values, so, again, the cache size becomes important.

These changes, together with the changes to make file maps always use the cache and only go to the map file when it has changed, should have resolved this issue. So either I've missed something or some other part of autofs needs attention.

Before we try to identify exactly what is causing the excessive CPU usage, we need to eliminate the possibility that it is due to some of the hash chains being a bit long. If that is the case, but is not the only problem source, it will obscure the usage monitoring data and make it harder to identify the actual problem.

Because of the cache's sensitivity to hash chain length, the cache size has been made tunable in the autofs configuration: the MAP_HASH_TABLE_SIZE option can be used to set an appropriate cache size. The default was chosen as a trade-off between memory consumption and the map size at which the table size begins to make a significant difference, which is around 8000 entries.
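To give a rough feel for why the table size matters, here's a back-of-the-envelope calculation. It assumes an even key distribution (which, as noted above, we don't actually get) and a default table size of 1024 buckets -- that's my recollection of the shipped default; the comments in the configuration file have the authoritative value:

9000 entries / 1024 buckets = ~8.8 entries per chain
9000 entries / 3584 buckets = ~2.5 entries per chain

Every lookup has to walk a chain, so shortening the average chain cuts directly into per-lookup overhead, and a skewed distribution only widens the gap. The tunable lives in /etc/sysconfig/autofs, e.g.:

MAP_HASH_TABLE_SIZE="3584"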
So, for your maps, setting this to between 3000 and 4000 should make a difference and minimize the CPU usage spikes caused by long hash chains. You can probably ignore the reference in the configuration file to this value needing to be a power of 2. There was some discussion indicating that the algorithm doesn't require it, but I left the comment in since, apparently, it shouldn't make a difference either way.

> Anyway -- in reference to bug 510530 -- there was speculation that the CPU
> utilization might be attributable to things other than the map re-read
> every 4 seconds. I've included the output of a batch top command:
>
> top -b -n 1500 -d 0.2 -p 12157
>
> This should give you a view into what it's doing CPU-wise. Please let me
> know what other information you'd like to have.

This does look like something besides the cache lookup is causing excessive CPU usage. I will investigate.

Ian
(In reply to comment #2)
> (In reply to comment #0)
> > Steps to Reproduce:
> > 1. Load a large (9k+ entry) direct map and, if necessary, a 33k-entry
> >    indirect map.
> > 2. Use the newest autofs, autofs-5.0.1-0.rc2.129.bz510530.1.
>
> Yes, I should be able to reproduce this.

That was just wishful thinking. Testing this is quite difficult for me, as several Gnome processes go bananas when using large maps. While that's good for testing expire-to-mount races, it makes getting realistic results quite hard.

I don't know what to do about the Gnome problem. We have often reported this type of thing before (though not specifically in relation to large maps), and the usual fix is for the offending application to ignore autofs mounts in /proc/mounts. But when /proc/mounts is large, the aggressive scanning that Gnome-related utilities do is a real problem.

Anyway, the top data from comment #1 looks like what I would expect for an indirect mount that uses the browse (or ghost) option, or one where the BROWSE_MODE="no" option in the autofs configuration is either commented out or set to "yes" rather than "no". The high system CPU usage is the main reason I think this. Is that the case?

Ian
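For reference, the quick way to check, using the variable name as it appears in this package's sysconfig file (an uncommented "no" is what we're after):

$ grep -i browse /etc/sysconfig/autofs
DEFAULT_BROWSE_MODE="no"

If the grep returns nothing, the line has been commented out, browsing is then in effect by default, and that would explain the high system CPU.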
Making the change to MAP_HASH_TABLE_SIZE in the config file made another giant difference in CPU utilization and behavior.

$ ps aux |grep automo
root 20705 2.7 2.0 158756 18568 ? Ssl Jul12 3:44 automount

I'm sorry I missed this feature before (even though it came out in 5.0.4) -- I'm certainly eager to implement it. Memory utilization is low, and the 2.7% CPU was after load testing hundreds of mounts (I ran a du --max-depth=1 on our project tree). When the system is idle, automount behaves well with this new setting. It still has its moments, but I'm certainly encouraged by the results.

My /etc/sysconfig/autofs file now has this value in it:

MAP_HASH_TABLE_SIZE="3584"

and it appears to be working wonders. I'd like to let this run for another 12 hours or so. I think the CPU utilization is actually much lower than 2.7% without the silliness I put it through to test it.

This is turning out to be a very promising release, and I'm definitely motivated to get it into our dev/test arenas. I really like seeing order-of-magnitude improvements in performance -- I think one more ought to just about do it ;)

There is another attachment with the latest CPU numbers from the automount process. I hope they're helpful. Thanks very much for the tip on that feature.
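For the record, the load test was nothing fancier than a du walk over the automounted project space, along these lines (the map file name is assumed from our auto.master, and the head count is arbitrary):

$ for u in $(awk '!/^#/ {print $1}' /etc/auto.home | head -500); do
>   du --max-depth=1 "/usr2/$u" > /dev/null
> done

Each key that gets touched triggers a mount, so this drives hundreds of mounts and expires through the daemon in one pass.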
Created attachment 351439 [details]
batch top process to monitor automount over 5 minutes, after the hash table change

Here's the CPU utilization from after adding the hash table setting to the autofs config file.
You've asked a few times and I'm sorry I've not answered yet:

$ cat /etc/auto.master
# auto.master for autofs5 machines
/usr2 auto.home --timeout 60

$ cat /etc/sysconfig/autofs |grep -v ^#
MAP_HASH_TABLE_SIZE="3584"
$

^- Ack! It looks like I was a bit too aggressive with my commenting-out in this section. I'll turn it back on and retest.
Well, putting DEFAULT_BROWSE_MODE="no" back into play definitely made a difference. I'm sorry it was accidentally removed during our testing. This is really positive.

Here's another batch top if you want to see it, but my overall CPU utilization is at or below 1%.

$ ps aux |grep automou
root 31647 1.2 1.5 81764 14200 ? Ssl 01:00 0:15 automount

It says 1.2, but that's again because I was running du walks up and down our project space to stress it out a bit. I waited for all the mounts to clear themselves before starting the batch top, so you should see an idle system much like in the other batch tops.

I think -- if this holds up overnight -- that I'm content with this level of CPU utilization. Another order-of-magnitude improvement would be nice, but there's only so much 99% perfect buys you before the cost of the next 9 is too high.

You rock, Ian!
Created attachment 351441 [details]
batch top with browse mode "no" and the hash table change

$ cat autofs |grep -v ^#
DEFAULT_BROWSE_MODE="no"
DEFAULT_APPEND_OPTIONS="yes"
DEFAULT_LOGGING="verbose"
MAP_HASH_TABLE_SIZE="3584"

Looks a ton better!
(In reply to comment #7)
> Well, putting DEFAULT_BROWSE_MODE="no" back into play definitely made a
> difference. I'm sorry it was accidentally removed during our testing.
>
> This is really positive.
>
> Here's another batch top if you want to see it, but my overall CPU
> utilization is at or below 1%.
>
> $ ps aux |grep automou
> root 31647 1.2 1.5 81764 14200 ? Ssl 01:00 0:15 automount
>
> It says 1.2, but that's again because I was running du walks up and down
> our project space to stress it out a bit. I waited for all the mounts to
> clear themselves before starting the batch top, so you should see an idle
> system much like in the other batch tops.

There are still CPU spikes in the top output, which I suspect are the expire runs. The user space expire is not terribly efficient, but the reason for that is a rather long VFS-related story which I have only partially worked out so far. It will be hard to improve, but the possibility is there.

I think the next most important issue to tackle is the browse mount expire overhead. The problem there is that I can't safely walk just the mounts in the directory within the kernel, so I have to scan all the directories for expire candidates. There will be a way to do it, but it may mean VFS changes, which will be a challenge, first to do and then to get accepted. But that's what we're here for!

> I think -- if this holds up overnight -- that I'm content with this level
> of CPU utilization. Another order-of-magnitude improvement would be nice,
> but there's only so much 99% perfect buys you before the cost of the next
> 9 is too high.
>
> You rock, Ian!

Hahahaha, great, so it looks like I got this mostly right for once!

Ian
I believe that, in the end, we established that the issue here is addressed by configuration options added for this purpose in the latest autofs package. Is that correct? Can we close this bug?
My understanding from the customer is that this is still an issue. The current autofs patches work when using an LDAP back-end, but when large file-based autofs maps are used automount still consumes a lot of CPU.

The customer would still prefer to be using files so they can stand down the LDAP infrastructure.
(In reply to comment #13)
> My understanding from the customer is that this is still an issue. The
> current autofs patches work when using an LDAP back-end, but when large
> file-based autofs maps are used automount still consumes a lot of CPU.
>
> The customer would still prefer to be using files so they can stand down
> the LDAP infrastructure.

That's not the way the comments in this bug read. Could you look a bit closer at the comments here and follow up with the customer?

The changes that went into autofs and the autofs4 kernel module for RHEL-5.4 should have largely overcome the CPU issue when using large file maps. Note that to realize the full resource usage benefit, the updated package and kernel must be used together and the configuration must be set up as discussed in this bug.

Also discussed, in comments #3 and #4, is the use of the "browse" (or --ghost) option with large file maps. We recommend not using that option for large maps because of the potential CPU overhead. That hasn't changed, and it's likely to be quite a while before it does. Improvement in that area may require a total re-write of the expire sub-system. I'm still not sure what approach I will use to overcome the limitation; I'm still thinking through what would need to be done for each of the two possible approaches I have in mind and haven't worked out which will be the best to use.

The other issue mentioned, although not in this bug, is the load time when using large maps. For a direct map there is no way to avoid reading the entire map at start-up, because each direct map entry needs to be mounted as a mount trigger. That's no different to other vendor implementations; it's just the way direct mounts are. For indirect maps, the changes in 5.4 mean the map must also be read in at startup, to avoid constant, lengthy file scans at mount key lookup. Note that if the map is modified it will be read again, which can cause a brief hiccup. Slow startup is the unavoidable price we pay for interactive responsiveness; I can't see any way around that at all.

Obviously, as is always the case with software, small efficiency improvements are always possible, but the return on time invested is often not acceptable because the improvement is frequently not perceptible in use.

Please, can we get some data using a 5.4 autofs and kernel, appropriately configured, so we can identify whether there are other large improvement opportunities I've missed?

Ian
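For anyone hitting this later, a small illustration of the direct versus indirect startup cost described above (the map names, server, and paths below are made up):

# /etc/auto.master
# direct map: keys are absolute paths anywhere in the tree
/-     /etc/auto.direct
# indirect map: keys all live under the /usr2 directory
/usr2  /etc/auto.home

# /etc/auto.direct -- every entry (x9000) must become a mount trigger
# at startup, hence the unavoidable full read of the map
/proj/tools/gcc    nfssrv:/export/tools/gcc
/proj/tools/make   nfssrv:/export/tools/make

# /etc/auto.home -- only the /usr2 mount point itself exists up front;
# entries are mounted when their key is looked up
alice   nfssrv:/export/home/alice
bob     nfssrv:/export/home/bob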
Is there more that needs to be discussed regarding this? If so, please re-open the bug.