Bug 589573 - automount hangs on startup when started with an already mounted cifs share
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: autofs
Version: 5.4
Hardware: All Linux
Priority: low  Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Ian Kent
QA Contact: Jian Li
Docs Contact:
Depends On:
Blocks: 650009 650010
Reported: 2010-05-06 09:12 EDT by Sachin Prabhu
Modified: 2014-03-03 19:06 EST (History)

See Also:
Fixed In Version: autofs-5.0.1-0.rc2.149.el5
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 650009 (view as bug list)
Environment:
Last Closed: 2011-07-21 04:44:18 EDT

Attachments
gdb session (52.68 KB, text/plain)
2010-05-06 09:24 EDT, Sachin Prabhu
Patch - fix remount locking (9.30 KB, patch)
2010-05-10 00:23 EDT, Ian Kent

Description Sachin Prabhu 2010-05-06 09:12:13 EDT
Reproducer:

(1). Configure automount for a Samba filesystem as follows.
 1.1. Create two directories and a text file for automount as follows.
  # mkdir /testdir_A
  # mkdir /testdir_B
  # echo "This is testB" > /testdir_B/testB
 1.2. Create a map file as follows.
  # echo "test1   --bind     :/testdir_B" > /etc/auto.testmap
 1.3. Create two directories and a text file for samba as follows.
  # mkdir /testdir_samba_A
  # mkdir /testdir_samba_B
  # echo "This is sambaB" > /testdir_samba_B/sambaB
 1.4. Edit /etc/auto.master file as follows.
  # vi /etc/auto.master
  ...
  /testdir_A /etc/auto.testmap
  /testdir_samba_A /etc/auto.smb
  ...
 1.5. Edit /etc/samba/smb.conf file as follows.
  # vi /etc/samba/smb.conf
  [global]
  ...
  security = share
  ...
  hosts allow = 127.
  ...
  guest account = root
  ...
  [automounttest]
       comment = automounttest
       public = yes
       null passwords = yes
       browseable = yes
       writable = yes
       guest ok = yes
       path = /testdir_samba_B
 1.6. Restart both smb and autofs service
  # service smb restart
  # service autofs restart

(2). Verify that /testdir_B/testB can be opened with automount.
# cat /testdir_A/test1/testB
This is testB
# cat /testdir_samba_A/127.0.0.1/automounttest/sambaB
This is sambaB

(3). Create a file 'hoge' on the Samba filesystem and keep it open for writing.
# sleep 1d > /testdir_samba_A/127.0.0.1/automounttest/hoge &

(4). Stop the automount service.
# service autofs stop

(5). Verify that the Samba filesystem is still mounted.
# mount
...
//127.0.0.1/automounttest on /testdir_samba_A/127.0.0.1/automounttest type cifs (rw,mand)

(6). Kill the sleep process that is holding the file open.
# pkill sleep

(7). Start the automount service.
# service autofs start
 <This operation causes automount to hang.>

(8). From another terminal, check whether automount has started.
# cat /testdir_A/test1/testB
 <The prompt never returns.>

Tracing with gdb shows that the automount userland process hangs while waiting for a write lock on the cache.

Thread 2 (Thread 0x431bc940 (LWP 2954)):
#0  0x00002b594cfc24c0 in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
#1  0x00002b594cb6c636 in cache_writelock (mc=0x2b59576060fc) at cache.c:74
#2  0x00002aaaab766332 in lookup_mount (ap=0x2b5957605f40, name=0x2b5957615620 "/testdir_samba_A/127.0.0.1/automounttest", name_len=40, context=0x2b59576142f0) at lookup_program.c:149
#3  0x00002b594cb68b29 in lookup_name_file_source_instance (ap=0x2b5957605f40, map=0x2b5957606040, name=0x2b5957615620 "/testdir_samba_A/127.0.0.1/automounttest", name_len=40) at lookup.c:704
#4  0x00002b594cb69876 in lookup_nss_mount (ap=0x2b5957605f40, source=0x0, name=0x2b5957615620 "/testdir_samba_A/127.0.0.1/automounttest", name_len=40) at lookup.c:892
#5  0x00002b594cb6e9e3 in remount_active_mount (ap=0x2b5957605f40, me=0x2b5957615560, type=4) at mounts.c:1153
#6  try_remount (ap=0x2b5957605f40, me=0x2b5957615560, type=4) at mounts.c:1351
#7  0x00002b594cb65e32 in mount_autofs_offset (ap=0x2b5957605f40, me=0x2b5957615560, root=0x431b7030 "/testdir_samba_A/127.0.0.1", offset=0x431b5fd0 "/automounttest") at direct.c:652
#8  0x00002b594cb6dfde in mount_multi_triggers (ap=0x2b5957605f40, me=<value optimized out>, root=0x431b7030 "/testdir_samba_A/127.0.0.1", start=26, base=0x2aaaaaf0f04b "/") at mounts.c:1461
#9  0x00002aaaaaf002d0 in mount_subtree (ap=0x2b5957605f40, me=0x2b5957615380, name=0x2b595761c463 "127.0.0.1", loc=0x0, options=0x2b59576154f0 "fstype=cifs", ctxt=0x2b5957614340) at parse_sun.c:1221
#10 0x00002aaaaaf00b59 in parse_mount (ap=0x2b5957605f40, name=0x2b595761c463 "127.0.0.1", name_len=9, mapent=0x2b5957614370 "-fstype=cifs  \t \"/automounttest\" \"://127.0.0.1/automounttest\"", context=0x2b5957614340)
   at parse_sun.c:1594
#11 0x00002aaaab766a75 in lookup_mount (ap=0x2b5957605f40, name=0x2b595761c463 "127.0.0.1", name_len=9, context=0x2b59576142f0) at lookup_program.c:408
#12 0x00002b594cb68b29 in lookup_name_file_source_instance (ap=0x2b5957605f40, map=0x2b5957606040, name=0x2b595761c463 "127.0.0.1", name_len=9) at lookup.c:704
#13 0x00002b594cb69876 in lookup_nss_mount (ap=0x2b5957605f40, source=0x0, name=0x2b595761c463 "127.0.0.1", name_len=9) at lookup.c:892
#14 0x00002b594cb6ecc6 in remount_active_mount (ap=0x2b5957605f40, me=0x0, type=1) at mounts.c:1220
#15 try_remount (ap=0x2b5957605f40, me=0x0, type=1) at mounts.c:1351
#16 0x00002b594cb61757 in do_mount_autofs_indirect (ap=0x2b5957605f40, root=0x2b59575f3330 "/testdir_samba_A") at indirect.c:104
#17 0x00002b594cb619da in mount_autofs_indirect (ap=0x2b5957605f40, root=0x2b59575f3330 "/testdir_samba_A") at indirect.c:221
#18 0x00002b594cb6015e in handle_mounts (arg=0x7fffc908de80) at automount.c:1012
#19 0x00002b594cfbe73d in start_thread (arg=<value optimized out>) at pthread_create.c:301
#20 0x00002b594de8ed1d in clone () from /lib64/libc.so.6
Comment 1 Sachin Prabhu 2010-05-06 09:17:49 EDT
The issue appears to be a cache readlock which is held when calling mount_subtree() from parse_mount() in parse_sun.c.

gdb was used to set the following breakpoints:

break cache_readlock
break cache_writelock
break cache_try_writelock
break cache_unlock

At each breakpoint I ran "bt" to get the backtrace and then "c" to continue execution.

The following readlock doesn't have a corresponding unlock:

Breakpoint 1, cache_readlock (mc=0x2aaaaad37110) at cache.c:59
59      {
(gdb) bt
#0  cache_readlock (mc=0x2aaaaad37110) at cache.c:59
#1  0x00002aaaac4bf801 in parse_mount (ap=0x2aaaaad36f60, name=0x2aaaaad46a73 "127.0.0.1", name_len=9, mapent=0x2aaaaad3e980 "-fstype=cifs  \t \"/automounttest\" \"://127.0.0.1/automounttest\"", context=0x2aaaaad3e950)
   at parse_sun.c:1501
#2  0x00002aaaacd25a75 in lookup_mount (ap=0x2aaaaad36f60, name=0x2aaaaad46a73 "127.0.0.1", name_len=9, context=0x2aaaaad3e900) at lookup_program.c:408
#3  0x00002aaaaaabeb29 in lookup_name_file_source_instance (ap=0x2aaaaad36f60, map=0x2aaaaad37060, name=0x2aaaaad46a73 "127.0.0.1", name_len=9) at lookup.c:704
#4  0x00002aaaaaabf876 in lookup_nss_mount (ap=0x2aaaaad36f60, source=0x0, name=0x2aaaaad46a73 "127.0.0.1", name_len=9) at lookup.c:892
#5  0x00002aaaaaac4cc6 in remount_active_mount (ap=0x2aaaaad36f60, me=0x0, type=1) at mounts.c:1220
#6  try_remount (ap=0x2aaaaad36f60, me=0x0, type=1) at mounts.c:1351
#7  0x00002aaaaaab7757 in do_mount_autofs_indirect (ap=0x2aaaaad36f60, root=0x2aaaaad1e2e0 "/testdir_samba_A") at indirect.c:104
#8  0x00002aaaaaab79da in mount_autofs_indirect (ap=0x2aaaaad36f60, root=0x2aaaaad1e2e0 "/testdir_samba_A") at indirect.c:221
#9  0x00002aaaaaab615e in handle_mounts (arg=0x7fffffffbd80) at automount.c:1012
#10 0x00002aaaaaf1473d in start_thread (arg=<value optimized out>) at pthread_create.c:301
#11 0x00002aaaabde4d1d in clone () from /lib64/libc.so.6
(gdb) c
Continuing.


As a result, a later call for a writelock on the same cache blocks:

(gdb) bt
#0  cache_writelock (mc=0x2aaaaad37110) at cache.c:71
#1  0x00002aaaacd25332 in lookup_mount (ap=0x2aaaaad36f60, name=0x2aaaaad3fc30 "/testdir_samba_A/127.0.0.1/automounttest", name_len=40, context=0x2aaaaad3e900) at lookup_program.c:149
#2  0x00002aaaaaabeb29 in lookup_name_file_source_instance (ap=0x2aaaaad36f60, map=0x2aaaaad37060, name=0x2aaaaad3fc30 "/testdir_samba_A/127.0.0.1/automounttest", name_len=40) at lookup.c:704
#3  0x00002aaaaaabf876 in lookup_nss_mount (ap=0x2aaaaad36f60, source=0x0, name=0x2aaaaad3fc30 "/testdir_samba_A/127.0.0.1/automounttest", name_len=40) at lookup.c:892
#4  0x00002aaaaaac49e3 in remount_active_mount (ap=0x2aaaaad36f60, me=0x2aaaaad3fb70, type=4) at mounts.c:1153
#5  try_remount (ap=0x2aaaaad36f60, me=0x2aaaaad3fb70, type=4) at mounts.c:1351
#6  0x00002aaaaaabbe32 in mount_autofs_offset (ap=0x2aaaaad36f60, me=0x2aaaaad3fb70, root=0x42820030 "/testdir_samba_A/127.0.0.1", offset=0x4281efd0 "/automounttest") at direct.c:652
#7  0x00002aaaaaac3fde in mount_multi_triggers (ap=0x2aaaaad36f60, me=<value optimized out>, root=0x42820030 "/testdir_samba_A/127.0.0.1", start=26, base=0x2aaaac4ce04b "/") at mounts.c:1461
#8  0x00002aaaac4bf2d0 in mount_subtree (ap=0x2aaaaad36f60, me=0x2aaaaad3f990, name=0x2aaaaad46a73 "127.0.0.1", loc=0x0, options=0x2aaaaad3fb00 "fstype=cifs", ctxt=0x2aaaaad3e950) at parse_sun.c:1221
#9  0x00002aaaac4bfb59 in parse_mount (ap=0x2aaaaad36f60, name=0x2aaaaad46a73 "127.0.0.1", name_len=9, mapent=0x2aaaaad3e980 "-fstype=cifs  \t \"/automounttest\" \"://127.0.0.1/automounttest\"", context=0x2aaaaad3e950)
   at parse_sun.c:1594
#10 0x00002aaaacd25a75 in lookup_mount (ap=0x2aaaaad36f60, name=0x2aaaaad46a73 "127.0.0.1", name_len=9, context=0x2aaaaad3e900) at lookup_program.c:408
#11 0x00002aaaaaabeb29 in lookup_name_file_source_instance (ap=0x2aaaaad36f60, map=0x2aaaaad37060, name=0x2aaaaad46a73 "127.0.0.1", name_len=9) at lookup.c:704
#12 0x00002aaaaaabf876 in lookup_nss_mount (ap=0x2aaaaad36f60, source=0x0, name=0x2aaaaad46a73 "127.0.0.1", name_len=9) at lookup.c:892
#13 0x00002aaaaaac4cc6 in remount_active_mount (ap=0x2aaaaad36f60, me=0x0, type=1) at mounts.c:1220
#14 try_remount (ap=0x2aaaaad36f60, me=0x0, type=1) at mounts.c:1351
#15 0x00002aaaaaab7757 in do_mount_autofs_indirect (ap=0x2aaaaad36f60, root=0x2aaaaad1e2e0 "/testdir_samba_A") at indirect.c:104
#16 0x00002aaaaaab79da in mount_autofs_indirect (ap=0x2aaaaad36f60, root=0x2aaaaad1e2e0 "/testdir_samba_A") at indirect.c:221
#17 0x00002aaaaaab615e in handle_mounts (arg=0x7fffffffbd80) at automount.c:1012
#18 0x00002aaaaaf1473d in start_thread (arg=<value optimized out>) at pthread_create.c:301
#19 0x00002aaaabde4d1d in clone () from /lib64/libc.so.6
(gdb) c
Comment 2 Sachin Prabhu 2010-05-06 09:24:31 EDT
Created attachment 412036 [details]
gdb session

gdb session with backtraces taken each time there was a readlock/writelock or unlock.

The process hangs waiting for a writelock on the lock at address 0x2aaaaad37110.

Grepping for all calls on this lock:
$ grep 0x2aaaaad37110 717233.gdb |grep -v Breakpoint
#0  cache_writelock (mc=0x2aaaaad37110) at cache.c:71
#0  cache_unlock (mc=0x2aaaaad37110) at cache.c:95
#0  cache_readlock (mc=0x2aaaaad37110) at cache.c:59
#0  cache_unlock (mc=0x2aaaaad37110) at cache.c:95
#0  cache_readlock (mc=0x2aaaaad37110) at cache.c:59
#0  cache_unlock (mc=0x2aaaaad37110) at cache.c:95
#0  cache_writelock (mc=0x2aaaaad37110) at cache.c:71
#0  cache_unlock (mc=0x2aaaaad37110) at cache.c:95
#0  cache_writelock (mc=0x2aaaaad37110) at cache.c:71
#0  cache_unlock (mc=0x2aaaaad37110) at cache.c:95
#0  cache_writelock (mc=0x2aaaaad37110) at cache.c:71
#0  cache_unlock (mc=0x2aaaaad37110) at cache.c:95
#0  cache_readlock (mc=0x2aaaaad37110) at cache.c:59 <-- *
#0  cache_readlock (mc=0x2aaaaad37110) at cache.c:59
#0  cache_unlock (mc=0x2aaaaad37110) at cache.c:95
#0  cache_writelock (mc=0x2aaaaad37110) at cache.c:71

* - The readlock doesn't have a corresponding unlock. This process then tries to obtain a writelock on the cache with the readlock held. This results in the process hanging.
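
The same conclusion can be checked mechanically by replaying the grep output as a sequence of events. The sketch below is an illustration under stated assumptions: the enum, the function name and the bug_trace array are hypothetical and not part of autofs, all events are assumed to come from a single thread, and the event order is copied from the listing above.

```c
#include <stddef.h>

enum lock_event { EV_READLOCK, EV_WRITELOCK, EV_UNLOCK };

/* Event order copied from the grep output for lock 0x2aaaaad37110. */
static const enum lock_event bug_trace[] = {
    EV_WRITELOCK, EV_UNLOCK, EV_READLOCK,  EV_UNLOCK,
    EV_READLOCK,  EV_UNLOCK, EV_WRITELOCK, EV_UNLOCK,
    EV_WRITELOCK, EV_UNLOCK, EV_WRITELOCK, EV_UNLOCK,
    EV_READLOCK,  EV_READLOCK, EV_UNLOCK,  EV_WRITELOCK
};

/*
 * Replay one thread's lock events in order. Returns the index of the
 * first request that cannot be granted because the thread still holds
 * the lock (a write request while readers are held, or any request
 * while the write lock is held), or -1 if no request would block.
 * On return, *readers_held is the number of read locks outstanding.
 */
int first_blocking_request(const enum lock_event *ev, size_t n,
                           int *readers_held)
{
    int readers = 0;   /* read locks currently held */
    int writer = 0;    /* 1 while the write lock is held */
    int blocked = -1;
    size_t i;

    for (i = 0; i < n && blocked < 0; i++) {
        switch (ev[i]) {
        case EV_READLOCK:
            if (writer)
                blocked = (int)i;
            else
                readers++;          /* nested read locks are allowed */
            break;
        case EV_WRITELOCK:
            if (writer || readers > 0)
                blocked = (int)i;
            else
                writer = 1;
            break;
        case EV_UNLOCK:
            if (writer)
                writer = 0;
            else if (readers > 0)
                readers--;
            break;
        }
    }
    if (readers_held)
        *readers_held = readers;
    return blocked;
}
```

Replaying bug_trace with this function stops at index 15, the final cache_writelock, with one read lock still outstanding, which matches the readlock marked with * above.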
Comment 3 Ian Kent 2010-05-06 10:58:54 EDT
What revision of autofs is being used here?
AFAICS it isn't rev 131 from 5.4.
Comment 4 Ian Kent 2010-05-06 11:03:06 EDT
(In reply to comment #3)
> What revision of autofs is being used here?
> AFAICS it isn't rev 131 from 5.4.    

OK, my mistake, I'm with it now.
Comment 5 Ian Kent 2010-05-10 00:23:40 EDT
Created attachment 412707 [details]
Patch - fix remount locking
Comment 6 Ian Kent 2010-05-10 01:05:56 EDT
A test package which includes the change of comment #5 has been
made. It can be found at:
http://people.redhat.com/~ikent/autofs-5.0.1-0.rc2.143.bz589573.1.el5

Please test this package and report your results.
Comment 11 RHEL Product and Program Management 2010-08-09 14:27:46 EDT
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.
Comment 13 RHEL Product and Program Management 2011-01-11 15:42:31 EST
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.
Comment 14 RHEL Product and Program Management 2011-01-11 17:17:49 EST
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.
Comment 16 Jian Li 2011-04-28 01:19:35 EDT
This bug is verified. The test was performed as follows:

The environment:
#/etc/samba/smb.conf
[global]
	workgroup = MYGROUP
	security = SHARE
	guest account = root
	hosts allow = 127.
[automounttest]
	comment = automounttest
	path = /testdir_samba_B
	read only = No
	guest ok = Yes
	public = yes

[root@hp-xw6400-02 ~]# rpm -qa | grep samba
samba-common-3.0.33-3.29.el5_6.2
samba-client-3.0.33-3.29.el5_6.2
samba-3.0.33-3.29.el5_6.2
[root@hp-xw6400-02 ~]# rpm -q autofs
autofs-5.0.1-0.rc2.155.el5
[root@hp-xw6400-02 ~]# uname -a
Linux hp-xw6400-02.lab.bos.redhat.com 2.6.18-256.el5 #1 SMP Thu Apr 7 19:59:40 EDT 2011 i686 i686 i386 GNU/Linux

The preparation steps were performed as described in comment 0.
Steps and output:

[root@hp-xw6400-02 ~]# cat /testdir_A/test1/testB
This is testB
[root@hp-xw6400-02 ~]# cat /testdir_samba_A/127.0.0.1/automounttest/sambaB
This is sambaB
[root@hp-xw6400-02 ~]# sleep 1d > /testdir_samba_A/127.0.0.1/automounttest/hoge &
[1] 5327
[root@hp-xw6400-02 ~]# 
[root@hp-xw6400-02 ~]# service autofs stop
Stopping automount: [  OK  ]
[root@hp-xw6400-02 ~]# mount
........
//127.0.0.1/automounttest on /testdir_samba_A/127.0.0.1/automounttest type cifs (rw,mand)
[root@hp-xw6400-02 ~]# pkill sleep
[root@hp-xw6400-02 ~]# service autofs start
Starting automount: [  OK  ]
[1]+  Terminated              sleep 1d > /testdir_samba_A/127.0.0.1/automounttest/hoge
[root@hp-xw6400-02 ~]# cat /testdir_A/test1/testB
This is testB
[root@hp-xw6400-02 ~]# cat /testdir_samba_A/127.0.0.1/automounttest/sambaB
This is sambaB
Comment 17 errata-xmlrpc 2011-07-21 04:44:18 EDT
An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1079.html
Comment 18 errata-xmlrpc 2011-07-21 08:33:31 EDT
An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1079.html
