Bug 510941 (autofs-high-cpu) - high cpu utilization from autofs with large maps
Summary: high cpu utilization from autofs with large maps
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: autofs-high-cpu
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: autofs
Version: 5.4
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Assignee: Ian Kent
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-07-12 17:31 UTC by bg
Modified: 2011-03-28 09:47 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-03-28 09:47:39 UTC
Target Upstream Version:
Embargoed:


Attachments
batch top process to monitor automount over 5 minutes (8.32 KB, application/octet-stream) -- 2009-07-12 17:34 UTC, bg
batch top process to monitor automount over 5 minutes after hash table change (6.70 KB, application/octet-stream) -- 2009-07-13 07:54 UTC, bg
batch top with browse_mode-no and hash table change. (6.69 KB, application/octet-stream) -- 2009-07-13 08:26 UTC, bg

Description bg 2009-07-12 17:31:53 UTC
Description of problem:
After loading the 5.4 beta, autofs performs drastically faster with file maps, but with large maps it still uses up ~10% CPU on my VMs.  That still won't scale well for our virtualization efforts.

Version-Release number of selected component (if applicable):
5.4 with autofs-5.0.1-0.rc2.129.bz510530.1

How reproducible:
Very. 

Steps to Reproduce:
1.  Load a large map (a 9k+ entry direct map and, if necessary, a 33k entry indirect map).
2.  Use the newest autofs, autofs-5.0.1-0.rc2.129.bz510530.1.

  
Actual results:
autofs consumes between 8% and 16% CPU on average

Expected results:
much less CPU consumed (less than 1%, as with LDAP)

Additional info:
We're using LDAP at our company now, so the urgency is lower, but LDAP comes with its own set of problems (like network/service dependencies).  We would prefer to use files.

We (Deke, Bob, Dave, Ducky, Mike) had a discussion with you at the beginning of 2009 where we discussed possibly creating a binary map of the map table in memory instead of serialized searches.  I haven't checked the source lately to see if you were working on this, but it sounded like a good idea at the time.  I think we had an action to open the FR on your site but hadn't done that yet.

Anyway -- in reference to bug 510530 -- there was speculation that the CPU utilization might be attributed to things other than the map re-read every 4 seconds.  I've included the output of a batch top command:
 top -b -n 1500 -d 0.2 -p 12157
This should give you a view into what it's doing CPU-wise.  Please let me know what other information you'd like to have.

Thanks

Comment 1 bg 2009-07-12 17:34:03 UTC
Created attachment 351396 [details]
batch top process to monitor automount over 5 minutes

top -b -n 1500 -d 0.2 -p 12157

Comment 2 Ian Kent 2009-07-13 03:53:22 UTC
(In reply to comment #0)
> Description of problem:
> After loading 5.4 beta the autofs performs drastically faster with files but
> still with large maps it uses up ~10% cpu on my vms.  That still won't scale
> well for our virtualization efforts.  
> 
> Version-Release number of selected component (if applicable):
> 5.4 with autofs-5.0.1-0.rc2.129.bz510530.1
> 
> How reproducible:
> Very. 
> 
> Steps to Reproduce:
> 1.  Just load a large (9k+ direct map and if necessary a 33k indirect map
> 2.  Use the newest autofs autofs-5.0.1-0.rc2.129.bz510530.1

Yes, I should be able to reproduce this.

> 
> 
> Actual results:
> autofs consume between 8-16% cpu on average

But there are frequent examples where the CPU pegs at 100%.
That shouldn't be happening, so I must have missed something.

> 
> Expected results:
> much less cpu consumed (less than 1% like with ldap)
> 
> Additional info:
> We're using ldap at our company now so the urgency is lower but using ldap
> comes with its own set of problems (like network/service dependencies).  We
> would prefer to use files.
> 
> We (Deke, Bob, Dave, Ducky, Mike) had a discussion with you at the beginning of
> 2009 where we discussed possibly creating a binary map of the map table in
> memory vs serialized searches.  I haven't checked the source lately to see if
> you were working on this but it sounded like a good idea at the time.  I think
> we had an action to open the FR on your site but hadn't done that yet.

Indeed, I remember it well and I was hoping for a better result
than this, but then again, there was a lot of change aimed at
large map improvements and it clearly needs a bit more work.

We have had an internal cache in autofs for a long time but it
was ineffective with file maps and lookups were performing
poorly for large maps.

It turned out that Valerie Aurora Henson (from our file systems
group) posted several patches, and one of the things they covered
was replacing the hashing algorithm for the internal cache (before
I managed to get to this myself). This led to an investigation of
how well (or badly, in this case) the cache worked for large maps.
The results clearly showed:

1) The cache size was way too small for large maps.

2) The distribution of entries in the cache was very poor with
   the original, simple-minded algorithm when the cache size
   (number of hash buckets) was increased.

3) The overhead of entry lookup is quite sensitive to hash chain
   length. This is almost obvious, but we may also get an uneven
   distribution of entries, depending on key values. So, again,
   the cache size becomes important.

These changes, together with the changes to make file maps always
use the cache and only go to the map when the file has changed,
should have resolved this issue. So either I've missed something
or some other process in autofs needs attention.

Before we try to identify exactly what is causing the excessive
CPU usage we need to eliminate the possibility that it is due to
some of the hash chains being a bit long. If this is the case,
but is not the only problem source, it will obscure the usage
monitoring data and make it harder to identify the actual problem.

Because of the cache's sensitivity to hash chain length, the
cache size has been made tunable in the autofs configuration.
The MAP_HASH_TABLE_SIZE option can be used to set an appropriate
cache size. The default has been chosen as a trade-off between
memory consumption and the map size at which this begins to make
a significant difference, which is around 8000 entries.

So, for your maps, setting this to between 3000 and 4000 should
make a difference and minimize CPU usage spikes caused by long
hash chains. You may be able to ignore the reference to this
value being a power of 2 in the configuration file. There was
some discussion indicating that the algorithm doesn't require
this, but I left the reference in as, apparently, it shouldn't
make a difference.
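
In practice this is a one-line change in the configuration plus
a daemon restart. A minimal sketch (3584 is just an example
value; the right number depends on your map sizes, and I'm
assuming the automount daemon needs a restart for the new table
size to take effect):

 # grep MAP_HASH_TABLE_SIZE /etc/sysconfig/autofs
 MAP_HASH_TABLE_SIZE="3584"
 # service autofs restart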

> 
> Anyway -- in reference to bz bug 510530 -- there was speculation that the cpu
> utilization might be attributed to things other than the map re-read every 4
> seconds.  I've included the output of a batch top command:
>  top -b -n 1500 -d 0.2 -p 12157
> This should give you a view into what it's doing cpu-wise.  Please let me know
> what other information you'd like to have.

This does look like something besides the cache lookup
is causing excessive CPU usage. I will investigate.

Ian

Comment 3 Ian Kent 2009-07-13 07:35:57 UTC
(In reply to comment #2)
> (In reply to comment #0)
> > Description of problem:
> > After loading 5.4 beta the autofs performs drastically faster with files but
> > still with large maps it uses up ~10% cpu on my vms.  That still won't scale
> > well for our virtualization efforts.  
> > 
> > Version-Release number of selected component (if applicable):
> > 5.4 with autofs-5.0.1-0.rc2.129.bz510530.1
> > 
> > How reproducible:
> > Very. 
> > 
> > Steps to Reproduce:
> > 1.  Just load a large (9k+ direct map and if necessary a 33k indirect map
> > 2.  Use the newest autofs autofs-5.0.1-0.rc2.129.bz510530.1
> 
> Yes, I should be able to reproduce this.

That was just wishful thinking.

Testing this is quite difficult for me as several GNOME processes
go bananas when using large maps. While that's good for testing
expire-to-mount races, it makes getting realistic results quite
hard. I don't know what to do about the GNOME problem as we have
often reported this type of thing before (but not specifically
in relation to large maps), and the usual fix is to ignore autofs
mounts in /proc/mounts. But when /proc/mounts is large, the
aggressive scanning that GNOME-related utilities do is a real
problem.

Anyway, the top data from comment #1 looks like what I would
expect for an indirect mount that uses the browse (or ghost)
option, or if the BROWSE_MODE="no" option in the autofs
configuration is either commented out or set to "yes" rather
than "no". The high system CPU usage is the main reason I think
this. Is that the case?
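
A quick way to check, treating this as a sketch since the exact
variable name can differ between autofs builds (the paths are
the standard RHEL 5 ones):

 # grep -i browse /etc/sysconfig/autofs
 DEFAULT_BROWSE_MODE="no"
 # grep -E -e '--ghost|browse' /etc/auto.master

The first grep should show the browse mode explicitly set to
"no" and not commented out; the second should return nothing if
no master map entry forces browsing.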

Ian

Comment 4 bg 2009-07-13 07:40:58 UTC
Making the change to MAP_HASH_TABLE_SIZE in the config file made another giant difference in CPU utilization and behavior.

$ ps aux |grep automo
root     20705  2.7  2.0 158756 18568 ?        Ssl  Jul12   3:44 automount

I'm sorry I missed this feature before (even though it came out in 5.0.4) -- I'm certainly hot to implement it.  Memory utilization is low, and the 2.7% CPU was after doing load testing of hundreds of mounts (I ran a du --max-depth=1 on our project tree).  When the system is idle, automount behaves well with this new setting.  It still has its moments, but I'm certainly encouraged by the results.
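
(For reference, the load test was just a tree walk that forces a burst of mounts, along the lines of the following; /usr2/projects is a made-up path standing in for the real project tree:)

 $ du --max-depth=1 /usr2/projects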

my /etc/sysconfig/autofs file now has this value in it:
MAP_HASH_TABLE_SIZE="3584"

and it appears to be working wonders.  I'd like to let this run for another 12 hours or so.  I think the CPU utilization is actually much lower than 2.7% without the silliness I put it through to test it.

This is turning out to be a very promising release.  I'm definitely motivated to get this release into our dev/test arenas.  

I really like seeing orders-of-magnitude improvements in performance.  I think one more ought to just about do it ;)

There is another attachment with the latest CPU numbers from the automount process.  I hope they're helpful.

Thanks very much for the tip on that feature.

Comment 5 bg 2009-07-13 07:54:13 UTC
Created attachment 351439 [details]
batch top process to monitor automount over 5 minutes after hash table change

Here's the CPU utilization after adding the hash table setting to the autofs config file.

Comment 6 bg 2009-07-13 07:55:02 UTC
You've asked a few times and I'm sorry I've not answered yet:

$ cat /etc/auto.master
# auto.master for autofs5 machines
/usr2   auto.home          --timeout 60

$ cat /etc/sysconfig/autofs |grep -v ^#
MAP_HASH_TABLE_SIZE="3584"
$
^- Ack!  It looks like I was a bit too aggressive with my commenting-out in this section.  I'll turn it back on and retest.

Comment 7 bg 2009-07-13 08:23:08 UTC
Well, putting the DEFAULT_BROWSE_MODE="no" back into play definitely made a difference.  I'm sorry it was accidentally removed for our testing purposes.

This is really positive.  

Here's another batch top if you want to see it, but my overall CPU utilization is at or below 1%.

$ ps aux |grep automou
root     31647  1.2  1.5  81764 14200 ?        Ssl  01:00   0:15 automount

It says it's at 1.2%, but that's again because I was running du walks up and down our project space to stress it out a bit.  I waited for all the mounts to clear before starting the batch top, so you should see an idle system much like the other batch tops.

I think -- if this holds up overnight -- that I'm content with this level of CPU utilization.

Another order-of-magnitude improvement would be nice, but there's only so much 99% perfect buys you before the cost of the next 9 is too high.

You rock, Ian!

Comment 8 bg 2009-07-13 08:26:43 UTC
Created attachment 351441 [details]
batch top with browse_mode-no and hash table change. 

$ cat autofs |grep -v ^#
DEFAULT_BROWSE_MODE="no"
DEFAULT_APPEND_OPTIONS="yes"
DEFAULT_LOGGING="verbose"
MAP_HASH_TABLE_SIZE="3584"

Looks a ton better!

Comment 9 Ian Kent 2009-07-13 14:24:00 UTC
(In reply to comment #7)
> Well that definitely made a difference putting the DEFAULT_BROWSE_MODE="no"
>  back into play.  I'm sorry it accidentally was removed for our testing
> purposes.  
> 
> This is really positive.  
> 
> Here's another batchtop if you want to see it but my overall cpu ute is at or
> below 1%.
> 
> $ ps aux |grep automou
> root     31647  1.2  1.5  81764 14200 ?        Ssl  01:00   0:15 automount
> 
> It says it's at 1.2 but that's again because I was running du walks up and down
> our project space to stress it out a bit.  I waited for all the mounts to clear
> themselves to start the batchtop so you should see an idle system much like the
> other batchtops.

There are still CPU spikes in the top output, which I suspect
are the expire runs. The user space expire is not terribly
efficient, but the reason for that is a rather long VFS-related
story which I have only partially worked out so far. It will be
hard to improve, but the possibility is there.

I think the next most important issue to tackle is the browse
mount expire overhead. The problem with that is that I can't
safely walk just the mounts in the directory within the kernel,
so I have to scan all the directories for expire candidates.
There will be a way to do it, but it may mean VFS changes,
which will be a challenge, first to do and then to get accepted.
But, that's what we're here for!

> 
> I think -- if this holds up overnight -- that I'm content with this level of
> cpu utilization.  
> 
> Another order magnitude improvement would be nice but there's only so much 99%
> perfect buys you before the cost for the next 9 is too high.
> 
> You rock, Ian!  

Hahahaha, great, so it looks like I got this mostly right for
once!

Ian

Comment 12 Ian Kent 2009-11-25 05:56:40 UTC
I believe that, in the end, we established that the issue here
is addressed by configuration options added for this purpose in
the latest autofs package.

Is that correct?
Can we close this bug?

Comment 13 Zak Berrie 2009-11-25 18:17:15 UTC
My understanding from the customer is that this is still an issue.  The current patches for autofs work when using an LDAP back-end, but when large file-based autofs maps are used they still consume a lot of CPU.

The customer would still prefer to be using files so they can stand-down the LDAP infrastructure.

Comment 14 Ian Kent 2009-11-26 02:31:43 UTC
(In reply to comment #13)
> My understanding from the customer is that this is still an issue.  The current
> patches for autofs work when using an LDAP back-end.  But when large file-based
> autofs maps are used they still consume a lot of CPU.  
> 
> The customer would still prefer to be using files so they can stand-down the
> LDAP infrastructure.  

That's not the way the comments in the bug read.
Could you look a bit closer at the comments in this bug and follow
up with the customer?

The changes that went into autofs and the autofs4 kernel module
for RHEL-5.4 should have largely overcome the CPU issue when
using large file maps. Note that to realize the full resource
usage benefit, the updated package and kernel must be used
together and the configuration must be set up as discussed
in this bug.
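
For instance, a quick sanity check on an affected host might look
like the following; the autofs build shown is the one mentioned
earlier in this bug, and the kernel just needs to be the matching
RHEL 5.4 kernel:

 # rpm -q autofs
 autofs-5.0.1-0.rc2.129.bz510530.1
 # uname -r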

Also discussed in comments #3 and #4 is the use of the "browse"
(or --ghost) option with large file maps. We recommend not using
that option for large maps because of the potential CPU overhead.
That hasn't changed, and it's likely to be quite a while before
it does. Improvement in that area may require a total re-write
of the expire sub-system. I'm still not sure what approach I
will use to overcome the limitation; in fact, I'm still working
out what would need to be done for each of the two possible
approaches I have in mind and haven't decided which will be the
best to use.

The other issue mentioned, although not in this bug, is the load
time when using large maps. For a direct map there is no way to
avoid reading the entire map at start, because each direct map
entry needs to be mounted as a mount trigger. That's no different
from other vendor implementations and is just the way direct
mounts are. For indirect maps, the changes in 5.4 mean the map
must also be read in at startup, to avoid constant lengthy file
scans at mount key lookup. Note that if the map is modified it
will be read again, which can cause a brief hiccup. Slow startup
is the unavoidable price we pay for interactive responsiveness,
and I can't see any way around that at all.
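
For illustration, the two map styles look roughly like this; the
/usr2 indirect mount is the one from comment #6, while the direct
map, its paths and the server names are made-up examples, not
taken from this report:

 /etc/auto.master:
   /-      /etc/auto.direct   --timeout 60
   /usr2   auto.home          --timeout 60

 /etc/auto.direct (direct map; every key is an absolute path and
 gets its own trigger mount at startup):
   /proj/tools    server1:/export/tools

 auto.home (indirect map; keys are relative and resolved under
 /usr2 on demand):
   tools          server1:/export/tools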

Obviously, as is always the case with software, small efficiency
improvements are always possible, but the return on the time
invested is often not acceptable because the improvement is
rarely perceptible during use.

Please, can we get some data using a 5.4 autofs and kernel,
appropriately configured, and identify whether there are perhaps
some other large improvement opportunities that I've missed?

Ian

Comment 15 Ian Kent 2011-03-28 09:47:39 UTC
Is there more that needs to be discussed regarding this?
If so, please re-open the bug.

