Bug 510530 (R5.4)

Summary: autofs-5.0.1-0.rc2.129 (RHEL 5.4[beta] automounter) has memory leak
Product: Red Hat Enterprise Linux 5
Component: autofs
Version: 5.3
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Milestone: rc
Target Release: ---
Reporter: bg <bgbugzilla>
Assignee: Ian Kent <ikent>
QA Contact: BaseOS QE <qe-baseos-auto>
Docs Contact:
CC: cward, dkovalsk, ikent, jmoyer, kvolny, rlerch, sghosh, sprabhu, tao, ykopkova, zbrown
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, the method used by autofs to clean up pthreads was not reliable and could result in a memory leak. If the memory leak occurred, autofs would gradually consume all available memory and then crash. A small semantic change in the code prevents this memory leak from occurring now.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-09-02 11:58:53 UTC
Type: ---
Attachments:
  Patch that attempts to fix suspected semantic problem with pthread_cleanup_push() (flags: none)

Description bg 2009-07-09 16:45:26 UTC
Description of problem:
autofs-5.0.1-0.rc2.129 (RHEL 5.4[beta] automounter) has memory leak
I ran rpm -Fvh to upgrade to autofs-5.0.1-0.rc2.129 on my RHEL 5.3 box.  The system runs out of memory in just a few hours.  Using files (nsswitch) as a back end it only took 2 hours or so; using LDAP as a back end it takes around 6 hours to fill up.  Our /etc/auto.projects file is about 8,500 entries long and auto.home is about 33,000 entries long.

Version-Release number of selected component (if applicable):
RHEL 5.3 with the 5.4 autofs package.

How reproducible:
Very

Steps to Reproduce:
1.  Start autofs
2.  Monitor autofs
3.  Wait for the crash.  Memory consumption starts climbing immediately; the crash takes between 2 and 8 hours.
  
Actual results:
automounter crashes

Expected results:
automounter works

Additional info:
We have a very large direct map (8,500+ entries) and a large indirect user map (33,000+ entries).  I was able to reproduce the issue with both files and LDAP as a backend for the auto.master and associated files.

Comment 1 Ian Kent 2009-07-09 17:28:14 UTC
(In reply to comment #0)
> 
> Expected results:
> automounter works
> 
> Additional info:
> We have a very large direct map (8500+ entries) and a large indirect user map
> (33,000+).  I was able to reproduce issue with both files and ldap as a backend
> for the auto.master + associated files.  

Are your maps simple direct and indirect maps?

I've had a report of this upstream but haven't been able to
get anywhere with it yet. If you can tell me more about the
structure of your maps I'll have another try at reproducing
the problem.

Comment 2 bg 2009-07-09 21:31:57 UTC
Here's our auto.master and associated files:

$ cat /etc/auto.master
# auto.master for autofs5 machines
/usr2   auto.home          --timeout 60
/-      auto.direct        --timeout 60
/net    -hosts

$ cat /etc/auto.direct
+/etc/auto.projects
/opt/random/src -rw,noquota       filerx:/vol/vol0/src
/opt/random/doc -rw,noquota       filerx:/vol/vol0/doc

The direct maps are there because we have multiple layers of nested mounts, so auto.projects might look like the following:

/mnt/dir1     -rw,noquota,intr       filer1:/vol/subdir/dir1
/mnt/dir1/dir2    -rw,noquota,intr        filer2:/vol/subdir1/dir2
/mnt/dir1/dir2/dir3   -rw,noquota,intr        filer3:/vol/subdir1/dir2_dir3
/mnt/dir1/dir2/dir3/dir4      -rw,noquota,intr        filer4:/vol/subdir1/dir2_dir3_dir4
/mnt/dir1/dir2/dir3/dir4/include  -rw,noquota,intr        filer5:/vol/subdir1/dir2_dir3_dir4_include

Home directories as indirect maps (auto.home) might look like:

user1        -rw,noquota,intr        fileru:/vol/vol2/usr2/user1
user2        -rw,noquota,intr        fileru:/vol/vol0/usr2/user2
user3        -rw,noquota,intr        fileru:/vol/vol2/usr2/user3
.
.
.
user33000   -rw,noquota,intr        fileru:/vol/vol2/usr2/user33000

Comment 3 Ian Kent 2009-07-10 05:40:37 UTC
Created attachment 351216 [details]
Patch that attempts to fix suspected semantic problem with pthread_cleanup_push()

I'm not sure this fixes the memory leak issue but testing
looked promising. I'll get a scratch build done with this
patch included.
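
For illustration only, here is a minimal sketch of the kind of call-order
hazard with pthread_cleanup_push() that the patch is aimed at (the structure,
the helper names and the use of sleep() as the blocking call are made up for
this example and are not taken from the autofs source, and whether this
matches the exact change in the attached patch is an assumption): if the
cleanup handler is registered after a cancellation point rather than before
it, a cancelled worker unwinds with an empty cleanup stack and its
per-request allocation is never freed.

#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-request context; not an autofs structure. */
struct ctx {
        char *buf;
};

static void ctx_free(void *arg)
{
        struct ctx *c = arg;

        free(c->buf);
        free(c);
}

static void *worker(void *arg)
{
        struct ctx *c;

        (void) arg;

        c = malloc(sizeof(*c));
        if (!c)
                return NULL;
        c->buf = malloc(4096);
        if (!c->buf) {
                free(c);
                return NULL;
        }

        /*
         * The handler must be pushed BEFORE the first cancellation
         * point.  If this push were moved below the blocking call,
         * a cancellation delivered while the thread is blocked would
         * unwind it with an empty cleanup stack and both allocations
         * would leak, a little more for every cancelled worker.
         */
        pthread_cleanup_push(ctx_free, c);

        sleep(60);              /* stand-in for a blocking cancellation point */

        pthread_cleanup_pop(1); /* run ctx_free() on the normal path too */
        return NULL;
}

int main(void)
{
        pthread_t t;

        if (pthread_create(&t, NULL, worker, NULL) == 0) {
                sleep(1);          /* let the worker reach the blocking call */
                pthread_cancel(t); /* the handler still frees the context */
                pthread_join(t, NULL);
        }
        return 0;
}

(Build with gcc -pthread.)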

Comment 4 Ian Kent 2009-07-10 05:45:23 UTC
Please try the scratch build at:
http://people.redhat.com/~ikent/autofs-5.0.1-0.rc2.129.bz510530.1.

Comment 6 bg 2009-07-10 15:03:21 UTC
On my ldap-sourced machine -- it looks really great right now.  cpu cycles seem under control and the ram is looking good after 10 minutes.

$ ps aux |grep auto
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     30586  0.4  1.2 120216 11388 ?        Ssl  07:46   0:02 automount


Something I hadn't mentioned before was the extreme amount of cpu it was taking up on the files-based map.  This is still happening after this release.  (I can open a separate bz if you'd like)

$ date
Fri Jul 10 07:57:48 PDT 2009
$ ps aux |grep auto
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      2114 98.6  2.2 170308 20424 ?        Ssl  07:41   6:23 automount
$ date
Fri Jul 10 07:57:51 PDT 2009

In just ~16 minutes of running it's already used over 6 minutes of cpu.  This is with the new patched version as well.  

The memory has stayed at 2.3% or below, however.  That's looking MUCH better so far.  I'll let this run today and continue to monitor it.

top - 08:00:14 up 1 day,  5:25,  1 user,  load average: 0.92, 0.89, 0.68
Tasks: 103 total,   3 running, 100 sleeping,   0 stopped,   0 zombie
Cpu(s): 30.0%us, 70.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    895464k total,   842456k used,    53008k free,    59220k buffers
Swap:   524280k total,    15024k used,   509256k free,   566516k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2114 root      15   0  166m  19m 1328 S 99.2  2.3   7:28.69 automount

The way the cpu spikes it will shoot up for a few seconds then rest for between 2 and 5 seconds -- then shoot up to 99% or so for a few seconds and then rest again.  Rinse, repeat.

Thanks for the super quick responses, Ian.

Comment 8 Ian Kent 2009-07-10 18:05:06 UTC
(In reply to comment #6)
> On my ldap-sourced machine -- it looks really great right now.  cpu cycles seem
> under control and the ram is looking good after 10 minutes.

OK, that's good.
I'll go over the code as there were a couple of other places
where this might be happening. I'll just change them to use
a slightly different sequence of events to eliminate the potential
for the problem just in case. I'll sort that out by Monday, in
time for the exception deadline. 

> 
> $ ps aux |grep auto
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root     30586  0.4  1.2 120216 11388 ?        Ssl  07:46   0:02 automount
> 
> 
> Something I hadn't mentioned before was the extreme amount of cpu it was taking
> up on the files-based map.  This is still happening after this release.  (I can
> open a separate bz if you'd like)

We'll have to open a separate bug for that because the change to
fix this issue really needs to get into the 5.4 release.

The scanning of file maps should be significantly less frequent,
but clearly I've got that wrong somehow.  CPU will spike when the
map is read, but the daemon shouldn't consult the file map again
until the map file is modified.

Or maybe it isn't actually the reading of the file maps that's
causing the spike?

> 
> $ date
> Fri Jul 10 07:57:48 PDT 2009
> $ ps aux |grep auto
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root      2114 98.6  2.2 170308 20424 ?        Ssl  07:41   6:23 automount
> $ date
> Fri Jul 10 07:57:51 PDT 2009
> 
> In just ~16 minutes of running it's already used over 6 minutes of cpu.  This
> is with the new patched version as well.  

With your indirect map, do you use the browse option (or --ghost),
or have you either commented out BROWSE_MODE="no" or set
BROWSE_MODE="yes"?  That will result in significant CPU usage
for a map of that size.  This is a known problem and has been
with us for a long time (it's about the last really big problem)
and, although I have thought about it many times, I still don't
have a way to resolve it.  But, as it is so difficult, I have
left it till last, so it will be getting some close attention
soon.

> 
> The memory has stayed put at 2.3 or below% however.  That's looking MUCH better
> so far.  I'll let this run today and continue to monitor it.
> 
> top - 08:00:14 up 1 day,  5:25,  1 user,  load average: 0.92, 0.89, 0.68
> Tasks: 103 total,   3 running, 100 sleeping,   0 stopped,   0 zombie
> Cpu(s): 30.0%us, 70.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:    895464k total,   842456k used,    53008k free,    59220k buffers
> Swap:   524280k total,    15024k used,   509256k free,   566516k cached
> 
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  2114 root      15   0  166m  19m 1328 S 99.2  2.3   7:28.69 automount
> 
> The way the cpu spikes it will shoot up for a few seconds then rest for between
> 2 and 5 seconds -- then shoot up to 99% or so for a few seconds and then rest
> again.  Rinse, repeat.

The question then is, does this correspond to expire events or
to mount events?  Expire events occur every timeout/4 seconds,
so with --timeout 60 that is roughly every 15 seconds.

> 
> Thanks for the super quick responses, Ian.  

My pleasure.
Ian

Comment 9 Ian Kent 2009-07-10 18:06:51 UTC
One other thing.
The kernel, is it the RHEL-5.4 kernel?

Comment 10 bg 2009-07-11 03:44:38 UTC
In this case the kernel is the RHEL 5.3 kernel.  It will take me a few days to get the 5.4 kernel spun up but I can do that if you'd like me to test it there as well.  If I can find time this weekend I will attempt it.  

Everything is still working great and the memory utilization actually seems to have dropped a bit.  

Nice work! 

I'll open a new BZ about the processor utilization as soon as I can.

Thanks again

Linux myhost 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

Also tested on:

Linux myotherhost #1 SMP Wed Dec 17 11:42:39 EST 2008 i686 i686 i386 GNU/Linux

Comment 11 Ian Kent 2009-07-11 06:07:55 UTC
(In reply to comment #10)
> In this case the kernel is the RHEL 5.3 kernel.  It will take me a few days to
> get the 5.4 kernel spun up but I can do that if you'd like me to test it there
> as well.  If I can find time this weekend I will attempt it.  

OK, the reason I asked is that we can't realize all the CPU
improvements without the 5.4 kernel.

The 5.4 kernel includes the new autofs control ioctl interface
and, while the primary reason for this implementation wasn't to
reduce CPU utilisation, a feature was added that can help quite
a bit.

The source of the improvement is the is_mounted() function, which
checks whether a path is mounted and whether it is an autofs or
some other file system.  We use is_mounted() a lot, and it scans
either /etc/mtab or /proc/mounts as required; but when the new
ioctl interface is in use we can ask the kernel directly for this,
avoiding the scan altogether.

With a large number of direct mounts this can give a significant
improvement and even without a large direct map the improvement
is quite noticeable.
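
To make the cost of that scan concrete, here is a minimal sketch
(not the autofs source; the helper name path_is_mounted() is made
up for this example) of the slow path: deciding whether a path is
currently a mount point by walking /proc/mounts with getmntent(3).
The real is_mounted() does more, such as distinguishing autofs
mounts from other file systems, and with the new ioctl interface
the kernel answers the question directly with no scan at all.

#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Slow path: every call re-reads the whole mount table, so the
 * cost grows with the number of mounted file systems and with how
 * often the daemon needs to ask.
 */
static int path_is_mounted(const char *path)
{
        struct mntent *ent;
        FILE *tab;
        int found = 0;

        tab = setmntent("/proc/mounts", "r");
        if (!tab)
                return 0;

        while ((ent = getmntent(tab)) != NULL) {
                if (strcmp(ent->mnt_dir, path) == 0) {
                        found = 1;
                        break;
                }
        }
        endmntent(tab);

        return found;
}

int main(int argc, char **argv)
{
        const char *path = argc > 1 ? argv[1] : "/proc";

        printf("%s is %smounted\n", path,
               path_is_mounted(path) ? "" : "not ");
        return 0;
}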

Another thing, if you're using a 5.3 base system and just upgrade
the kernel and autofs then you need to tell autofs you want to use
the new interface by adding a configuration option, as can be seen
in the configuration of a fresh install:

#
# If the kernel supports using the autofs miscellanous device
# and you wish to use it you must set this configuration option
# to "yes" otherwise it will not be used.
USE_MISC_DEVICE="yes"
#

> 
> Everything is still working great and the memory utilization actually seems to
> have dropped a bit.  

Great.

I've done a little more testing and have found no evidence
that the other cases I mentioned are affected by this call
order mistake, so the patch here may be all we need.
 
> 
> Nice work! 

Thanks, but I had already been working on this; your report,
together with the evidence I had collected from the upstream
reporter, caused the penny to drop as to the cause.  So thanks
for reporting it.

> 
> I'll open a new BZ about the processor utilization as soon as I can.

OK, but perhaps we should check with the 5.4 kernel before
going ahead with that.

> 
> Thanks again
> 
> Linux myhost 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64
> x86_64 GNU/Linux

Right, the new ioctl interface went into rev 137 but a couple
of other bug fixes went in since then also.

> 
> Also tested on:
> 
> Linux myotherhost #1 SMP Wed Dec 17 11:42:39 EST 2008 i686 i686 i386 GNU/Linux  

Mmmm ... no kernel version, ;)

Ian

Comment 12 Ian Kent 2009-07-12 04:55:16 UTC
I've gone through all the code and inspected the locations
where this might potentially be a problem (twice) and have
not found any other places where this issue is present.

So the patch we have should be all that is needed.
Could I have the needed acks to commit this to CVS please.

Ian

Comment 13 bg 2009-07-12 07:01:27 UTC
I've tested this in RHEL 5.3 on both i686 and x86_64 using ldap as a back-end for the automount maps with perfect results.  Memory usage has stayed low since it was installed and there have been no abnormal terminations.  Start up and shutdown remain slow (due to the large amount of maps) but the results are no different than we experienced before.  

I approve!

I also did manage to get a full beta 5.4 installation going with our image.  You are right -- those kernel mods made a huuuuge difference.  Now instead of 60-90% system utilization due to the automount process -- it's down to less than 10%.  It's still too high but you're on the right track it seems.  

(this is my 5.4 box)
$ date
Sat Jul 11 23:54:46 PDT 2009
$ ps aux |grep automount
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     12157 14.2  2.1 104740 19456 ?        Ssl  23:41   1:57 automount
$ date
Sat Jul 11 23:54:48 PDT 2009
$ uname -a
Linux yetanothermyhost 2.6.18-155.el5 #1 SMP Fri Jun 19 17:06:31 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
(with kernel version this time!)

So with appx 14 minutes of run-time it's used up 1.57 minutes of cpu time.  This is certainly a HUGE improvement vs a kernel without the enhancements but it's still not quite good enough for us to switch back to files yet.  LDAP will have to do for now.  However, memory usage is great, and that's what this ticket was to solve.

Really though, thank you VERY much for your work on this ticket.  It's going to make a big difference for our RedHat implementation.  Please deploy your updates asap.  

If other changes get rolled in along with this one on top of .129, send me the updated release and I will install and test it with the same expediency.

Comment 14 Ian Kent 2009-07-12 14:33:37 UTC
(In reply to comment #13)
> I've tested this in RHEL 5.3 on both i686 and x86_64 using ldap as a back-end
> for the automount maps with perfect results.  Memory usage has stayed low since
> it was installed and there have been no abnormal terminations.  Start up and
> shutdown remain slow (due to the large amount of maps) but the results are no
> different than we experienced before.  

Yes, there really isn't anything we can do about startup time.
The simple fact is that we need to read the entire direct map in
when we start. Indirect maps that do not use the browse option
don't need to be read at start and they aren't in version 5 so
the slowness must be due to the direct map.

> 
> I approve!
> 
> I also did manage to get a full beta 5.4 installation going with our image. 
> You are right -- those kernel mods made a huuuuge difference.  Now instead of
> 60-90% system utilzation due to automount process -- it's down to less than
> 10%.  It's still too high but you're on the right track it seems.  

That's a little disappointing although not entirely unexpected.

Logging a bug to investigate this would be useful as I need to
identify exactly where the remaining bottlenecks are, to make
sure my suspicions are correct.

> 
> (this is my 5.4 box)
> $ date
> Sat Jul 11 23:54:46 PDT 2009
> $ ps aux |grep automount
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root     12157 14.2  2.1 104740 19456 ?        Ssl  23:41   1:57 automount
> $ date
> Sat Jul 11 23:54:48 PDT 2009
> $ uname -a
> Linux yetanothermyhost 2.6.18-155.el5 #1 SMP Fri Jun 19 17:06:31 EDT 2009
> x86_64 x86_64 x86_64 GNU/Linux
> (with kernel version this time!)
> 
> So with appx 14 minutes of run-time it's used up 1.57 minutes of cpu time. 
> This is certainly a HUGE improvement vs a kernel without the enhancements but
> it's still not quite good enough for us to switch back to files yet.  LDAP will
> have to do for now.  However, memory usage is great, and that's what this
> ticket was to solve.

Yep.

> 
> Really though, thank you VERY much for your work on this ticket.  It's going to
> make a big difference for our RedHat implementation.  Please deploy your
> updates asap.  

Will do.

Getting the kernel update to a point suitable for upstream
acceptance took a lot longer than I had hoped but that is behind
us now and I can focus on further resource improvements. Having
the current improvements in place will allow us to identify exactly
where the remaining resource intensive code is (I suspect a couple
of places). Further improvements will get harder from here but
that's what development is about. And we need to be sure that we
have targeted the right places.

Ian

Comment 15 Ian Kent 2009-07-13 02:18:30 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > I've tested this in RHEL 5.3 on both i686 and x86_64 using ldap as a back-end
> > for the automount maps with perfect results.  Memory usage has stayed low since
> > it was installed and there have been no abnormal terminations.  Start up and
> > shutdown remain slow (due to the large amount of maps) but the results are no
> > different than we experienced before.  
> 
> Yes, there really isn't anything we can do about startup time.
> The simple fact is that we need to read the entire direct map in
> when we start. Indirect maps that do not use the browse option
> don't need to be read at start and they aren't in version 5 so
> the slowness must be due to the direct map.

Sorry, what I've said here isn't correct any more.

We'll pick this up in bug 510941 but we need to correct this
statement here to avoid confusion if we refer back to this bug
later.

With the latest changes file maps should always be read at
startup.  Since we have to read the map at some point, this was
a trade-off between spending the time at startup and spending it
upon the first lookup.  So, if anything, startup should be even
slower than previously.

Ian

Comment 17 Ian Kent 2009-07-14 13:17:25 UTC
The correction identified in this bug is available in
package autofs-5.0.1-0.rc2.130.

Comment 19 Ian Kent 2009-07-14 13:55:19 UTC
(In reply to comment #17)
> The correction identified in this bug is available in
> package autofs-5.0.1-0.rc2.130.  

This package is also available for further testing at:
http://people.redhat.com/~ikent/autofs-5.0.1-0.rc2.130

Comment 22 bg 2009-07-15 16:44:09 UTC
Hmmm I'm still using the original BZ release and have not tried 130 yet -- but autofs has been running for some time now and it doesn't seem to be responding to directory change requests.

I've let a cd run for about 20 minutes now and haven't seen it actually work.  Should I open a different BZ for this?  I'm not sure why it's hanging like this.  I've tried a few directories and haven't gotten into any automounted nfs dir yet.

Comment 23 bg 2009-07-15 17:13:58 UTC
$ time cd /mnt/nfs





bash: cd: /mnt/nfs: Interrupted system call

real    30m53.465s
user    0m0.000s
sys     0m0.000s

$

Comment 24 Ian Kent 2009-07-16 01:24:53 UTC
(In reply to comment #22)
> Hmmm I'm still using the original BZ release and have not tried 130 yet -- but
> autofs has been running for some time now and it doesn't seem to be responding
> to directory change requests.
> 
> I've let a cd run for about 20 minutes now and haven't seen it actually work. 
> Should I open a different BZ for this?  I'm not sure why it's hanging like
> this.  I've tried a few directories and haven't gotten into any automounted nfs
> dir yet.  

That's a big surprise given the testing I've done.

Please open a new bug and post a sysreq-t dump if possible.
If you can duplicate it with debug logging enabled that log would
also be useful.

Ian

Comment 25 Ian Kent 2009-07-16 01:26:59 UTC
(In reply to comment #24)
> (In reply to comment #22)
> > Hmmm I'm still using the original BZ release and have not tried 130 yet -- but
> > autofs has been running for some time now and it doesn't seem to be responding
> > to directory change requests.
> > 
> > I've let a cd run for about 20 minutes now and haven't seen it actually work. 
> > Should I open a different BZ for this?  I'm not sure why it's hanging like
> > this.  I've tried a few directories and haven't gotten into any automounted nfs
> > dir yet.  
> 
> That's a big surprise given the testing I've done.
> 
> Please open a new bug and post a sysreq-t dump if possible.
> If you can duplicate it with debug logging enabled that log would
> also be useful.
> 

Also, are there any messages in the log?
Ian

Comment 26 Ian Kent 2009-07-16 01:56:29 UTC
(In reply to comment #2)
> Here's our auto.master and associated files:
> 
> $ cat /etc/auto.master
> # auto.master for autofs5 machines
> /usr2   auto.home          --timeout 60
> /-      auto.direct        --timeout 60
> /net    -hosts
> 
> $ cat /etc/auto.direct
> +/etc/auto.projects
> /opt/random/src -rw,noquota       filerx:/vol/vol0/src
> /opt/random/doc -rw,noquota       filerx:/vol/vol0/doc
> 
> the direct maps are there because we have multiple layers of nested mounts.  so
> auto.projects might look like the following:
> 
> /mnt/dir1     -rw,noquota,intr       filer1:/vol/subdir/dir1
> /mnt/dir1/dir2    -rw,noquota,intr        filer2:/vol/subdir1/dir2
> /mnt/dir1/dir2/dir3   -rw,noquota,intr        filer3:/vol/subdir1/dir2_dir3
> /mnt/dir1/dir2/dir3/dir4      -rw,noquota,intr       
> filer4:/vol/subdir1/dir2_dir3_dir4
> /mnt/dir1/dir2/dir3/dir4/include  -rw,noquota,intr       
> filer5:/vol/subdir1/dir2_dir3_dir4_include

Sorry, I didn't notice these nested direct mounts before.
Are you sure this ever worked with version 5?

Although I don't explicitly check for nesting in direct mount
map entries they can't work and basically aren't supported. If
they have worked previously then they were a problem waiting to
happen. To do this you need to use submounts with either a
direct mount or an indirect mount at the base of each tree of
offset mounts.

Once again, sorry I missed this before, and sorry if the change
to using strict direct mount semantics in version 5 is causing
inconvenience, but, as far as I know, this is the way it is with
other industry-standard automount implementations.

Ian

Comment 27 Ian Kent 2009-07-16 02:45:28 UTC
(In reply to comment #26)
> > 
> > the direct maps are there because we have multiple layers of nested mounts.  so
> > auto.projects might look like the following:
> > 
> > /mnt/dir1     -rw,noquota,intr       filer1:/vol/subdir/dir1
> > /mnt/dir1/dir2    -rw,noquota,intr        filer2:/vol/subdir1/dir2
> > /mnt/dir1/dir2/dir3   -rw,noquota,intr        filer3:/vol/subdir1/dir2_dir3
> > /mnt/dir1/dir2/dir3/dir4      -rw,noquota,intr       
> > filer4:/vol/subdir1/dir2_dir3_dir4
> > /mnt/dir1/dir2/dir3/dir4/include  -rw,noquota,intr       
> > filer5:/vol/subdir1/dir2_dir3_dir4_include
> 
> Sorry, I didn't notice these nested direct mounts before.
> Are you sure this ever worked with version 5?
> 
> Although I don't explicitly check for nesting in direct mount
> map entries they can't work and basically aren't supported. If
> they have worked previously then they were a problem waiting to
> happen. To do this you need to use submounts with either a
> direct mount or an indirect mount at the base of each tree of
> offset mounts.

Actually, that's not correct.

Multi-mount map entries are the way nested mount trees must be
done even if submount maps are used to organize groups of
entries.

For example, the above direct mounts would need to be converted
to a direct mount at the base of the tree, with offsets from the
first nesting point in the tree, and would look something like:

/mnt/dir1 \
  /          -rw,noquota,intr  filer1:/vol/subdir/dir1 \
  /dir2      -rw,noquota,intr  filer2:/vol/subdir1/dir2 \
  /dir2/dir3 -rw,noquota,intr  filer3:/vol/subdir1/dir2_dir3 \
  /dir2/dir3/dir4 -rw,noquota,intr filer4:/vol/subdir1/dir2_dir3_dir4 \
  /dir2/dir3/dir4/include -rw,noquota,intr \
              filer5:/vol/subdir1/dir2_dir3_dir4_include

Clearly, the path /mnt/dir1 could be a direct or indirect mount
entry although I believe other implementations don't allow this
for direct mount entries.

The semantic behaviour of this type of map entry is specifically
designed to handle nested trees of mounts and is the only way
that nesting of mounts can be done. In version 4, multi-mount
entries were a problem because every mount in the tree of
offsets had to be mounted (and expired) as a single unit upon
accessing the directory at the top of the tree. In version 5
entries in the nested tree are mounted and expired as you go
to avoid this problem but the limitation below still exists.

The limitation you need to be aware of with these entries is
that changes to the multi-mount map entries cannot be seen
until the entire tree is expired away and a mount triggered
again. This is specifically because of possible dependencies
due to the nesting.

Ian

Comment 28 bg 2009-07-20 03:41:05 UTC
This is not in direct response to your last post -- I can elaborate more on that in a few days.

However, to address the problem I'm having with the BZ (and 130) -- the automounter WILL cease functioning after about 24-72 hours.  The daemon is still running but it's non-responsive.

After talking with Mike and Deke -- they suggested I add this to /etc/init.d/autofs

ulimit -n 20480
ulimit -s 65535


This seems to have fixed the final issue I'm having.  I've had several .130 servers running on both RHEL 5.4 and 5.3, in 32-bit and 64-bit, for 72 hours now and I've not had any crashes since.  This is a positive development.  I'll check back on this ticket in a few days to let you know if it is still stable.

I will open a new BZ (and an RH ticket) at that time (or perhaps sooner), however, because I don't want to hand-edit /etc/init.d/autofs on all of my hosts.  It would be nice to have that controlled through /etc/sysconfig/autofs or even bump the default value too if that doesn't create a very negative effect on a system.

I hope this helps you some.  If you want to dive deeper into my specific issue -- I can easily reproduce the unresponsiveness without those settings.  

This last week+ has been excellent for our autofs environment.  Thanks again.

Comment 29 Ian Kent 2009-07-20 05:18:43 UTC
(In reply to comment #28)
> This is not in direct response to your last post -- I can elaborate more on
> that in a few days.
> 
> However, to address the problem I'm having with the BZ (and 130) -- the
> automounter WILL cease functioning after about 24-72 hours.  The daemon is
> still running but it's non-responsive.
> 
> After talking with Mike and Deke -- they suggested I add this to
> /etc/init.d/autofs
> 
> ulimit -n 20480
> ulimit -s 65535

Mmmm, interesting.

> 
> 
> This seems to have fixed the final issue I'm having.  I've had sever .130
> servers running at both rhel 5.4 and 5.3 in 32bit and 64bit for 72 hours now
> and I've not had any crashes since.  This is a positive development.  I'll
> check back on this ticket in a few days to let you know if it is still stable.

I'm not sure that setting the open file limit higher will have
any effect, as in the daemon we do:

...

#define MAX_OPEN_FILES          10240

...
        rlim.rlim_cur = MAX_OPEN_FILES;
        rlim.rlim_max = MAX_OPEN_FILES;
        res = setrlimit(RLIMIT_NOFILE, &rlim);
        if (res)
                warn(LOGOPT_NONE,
                     "can't increase open file limit - continuing");
...

but changing the maximum open file limit in the daemon should
be no big deal and, given the number of mounts I expect to be
able to deal with, is probably a good idea.

As far as the maximum stack size goes, that's a bit more of a
question. I explicitly set a stack size for worker threads
and the call to create the stack attributes succeeds and
subsequent calls to create the threads work also. I use
the default stack size for the mount handling thread but
its main job is to create worker threads. However, I don't
call setrlimit(2) to increase the process maximum stack size
so maybe that is the source of the difficulty. Again, not a
big deal to add that and it seems like something I should
have done at the outset.

> 
> I will open a new BZ (and an RH ticket) at that time (or perhaps sooner),
> however, because I don't want to hand-edit /etc/init.d/autofs on all of my
> hosts.  It would be nice to have that controlled through /etc/sysconfig/autofs
> or even bump the default value too if that doesn't create a very negative
> effect on a system.

I'd rather handle as much as possible in the daemon itself so
that it is automount specific, in a single place, and automatic.

In time (but not any time soon) I'd like to eliminate the need
to hold open file handles, particularly for direct and offset
mounts, and only open them when needed for specific operations.
This has only now become a possibility with the new ioctl
implementation so I'm not in a rush to do this as too many
changes at once will be a recipe for disaster and we already
have a lot of changes.

> 
> I hope this helps you some.  If you want to dive deeper into my specific issue
> -- I can easily reproduce the unresponsiveness without those settings.  

I'm not sure we need to go deeper into this, other than to work
out if it is in fact the lack of a setrlimit(2) call to increase
the allowable stack size.

We probably need to discuss the nesting of direct mounts further
in order for you to fully understand the reasons for my comments
and why things are done the way they are.  That's bound to be
difficult in itself.

> 
> This last week+ has been excellent for our autofs environment.  Thanks again.  

That's good to hear, at least we're getting there.

As I said before, some of the recent changes were a long time
coming and a bunch of initiatives have landed, coincidentally,
all at once in the 5.4 release.

Ian

Comment 30 Ian Kent 2009-07-20 05:54:09 UTC
(In reply to comment #29)
> 
> As far as the maximum stack size goes, that's a bit more of a
> question. I explicitly set a stack size for worker threads
> and the call to create the stack attributes succeeds and
> subsequent calls to create the threads work also. I use
> the default stack size for the mount handling thread but
> its main job is to create worker threads. However, I don't
> call setrlimit(2) to increase the process maximum stack size
> so maybe that is the source of the difficulty. Again, not a
> big deal to add that and it seems like something I should
> have done at the outset.

Mmmm, setting the stack size in the pthread thread creation
attributes only sets the minimum stack size allowed so it
probably doesn't actually do much. So using setrlimit(2)
looks like the way to go.

Ian
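
As a concrete sketch of handling both limits inside the daemon
rather than in the init script, along the lines discussed in the
two comments above (illustrative only: the helper name
raise_limits() and the chosen values are made up; only the
RLIMIT_NOFILE part mirrors the snippet quoted in comment 29):

#include <stdio.h>
#include <sys/resource.h>

#define MAX_OPEN_FILES  20480                   /* example value only */
#define MAX_STACK_SIZE  (64 * 1024 * 1024)      /* 64 MiB, arbitrary example */

static void raise_limits(void)
{
        struct rlimit rlim;
        int res;

        /* Same pattern as the existing RLIMIT_NOFILE code. */
        rlim.rlim_cur = MAX_OPEN_FILES;
        rlim.rlim_max = MAX_OPEN_FILES;
        res = setrlimit(RLIMIT_NOFILE, &rlim);
        if (res)
                fprintf(stderr,
                        "can't increase open file limit - continuing\n");

        /* The addition discussed above: raise the process stack limit. */
        rlim.rlim_cur = MAX_STACK_SIZE;
        rlim.rlim_max = MAX_STACK_SIZE;
        res = setrlimit(RLIMIT_STACK, &rlim);
        if (res)
                fprintf(stderr,
                        "can't increase stack size limit - continuing\n");
}

int main(void)
{
        raise_limits();         /* a daemon would do this early in startup */
        return 0;
}

Since automount runs as root it can raise the hard limits itself,
which is what makes doing this in the daemon, rather than with
ulimit in /etc/init.d/autofs, workable.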

Comment 33 Ruediger Landmann 2009-08-31 17:53:21 UTC
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
Previously, the method used by autofs to clean up pthreads was not reliable and could result in a memory leak. If the memory leak occurred, autofs would gradually consume all available memory and then crash. A small semantic change in the code prevents this memory leak from occurring now.

Comment 34 errata-xmlrpc 2009-09-02 11:58:53 UTC
An advisory has been issued which should help the problem
described in this bug report.  This report is therefore being
closed with a resolution of ERRATA.  For more information
on the solution and/or where to find the updated files,
please follow the link below.  You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1397.html