Bug 241780 - Autofs hangs on some mountpoints
Summary: Autofs hangs on some mountpoints
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: autofs
Version: 6
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Ian Kent
QA Contact: Brock Organ
URL:
Whiteboard: bzcl34nup
Depends On:
Blocks: 243717
 
Reported: 2007-05-30 15:31 UTC by Anders Blomdell
Modified: 2008-05-06 19:38 UTC
CC: 3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-05-06 19:38:44 UTC
Type: ---
Embargoed:


Attachments
Possibly relevant 'ps -eLf', 'gdb' and 'strace' data (11.52 KB, text/plain)
2007-05-30 15:31 UTC, Anders Blomdell
/proc/kallsyms (1.69 MB, text/plain)
2007-05-31 07:11 UTC, Anders Blomdell
strace of good mount (13.27 KB, application/octet-stream)
2007-05-31 10:51 UTC, Anders Blomdell
strace of failed mount (27.95 KB, application/octet-stream)
2007-05-31 10:52 UTC, Anders Blomdell
gdb + strace info of rc3.30 hung machine (6.13 KB, text/plain)
2007-06-04 08:47 UTC, Anders Blomdell
Better gdb trace (with symbols) (5.25 KB, text/plain)
2007-06-04 12:22 UTC, Anders Blomdell
Test patch for the alarm deadlock problem (1.39 KB, patch)
2007-06-04 15:58 UTC, Ian Kent
.tar file of various pieces to show presence of [possible] deadlock (90.00 KB, application/x-tar)
2007-06-05 14:59 UTC, Anders Blomdell
lock reorder patch (1.00 KB, patch)
2007-06-05 15:57 UTC, Anders Blomdell
new autofs_analyze (4.40 KB, text/plain)
2007-06-05 16:31 UTC, Anders Blomdell
alternate implementation of alarm_handler (993 bytes, text/plain)
2007-06-07 07:58 UTC, Anders Blomdell
new try on alarm_handler (1013 bytes, text/plain)
2007-06-07 09:22 UTC, Anders Blomdell

Description Anders Blomdell 2007-05-30 15:31:43 UTC
Description of problem:

After upgrading to autofs-5.0.1-0.rc3.29 (and lots of other packages as
well :-( ), our systems tend to stop automounting some of the mountpoints
specified by NIS maps after some variable time (hours to several days).

Version-Release number of selected component (if applicable):

autofs-5.0.1-0.rc3.29

How reproducible:

Often 

Steps to Reproduce:
1. Use system for some time in such a way that automounting is triggered.
  
Actual results:

Automounter stops responding for some mountpoints

Expected results:

Automounting keeps responding.


Additional info:

Comment 1 Anders Blomdell 2007-05-30 15:31:43 UTC
Created attachment 155705 [details]
Possibly relevant 'ps -eLf', 'gdb' and 'strace' data

Comment 2 Ian Kent 2007-05-31 02:53:10 UTC
Kernel version?
Could you post your maps?
Could we have a debug log as well please?

Ian


Comment 3 Anders Blomdell 2007-05-31 07:11:43 UTC
Created attachment 155783 [details]
/proc/kallsyms

Comment 4 Anders Blomdell 2007-05-31 07:12:38 UTC
On this particular machine (my development machine):

2.6.21-rc7 

Problem most often has manifested itself on:

2.6.20-1.2948.fc6

I have enclosed /proc/kallsyms

What do you mean by a debug log? (Remember, the problem is not easily
reproducible, and I would hate to destroy a specimen of it that I can keep alive
for as long as I see fit.)

Comment 5 Ian Kent 2007-05-31 09:10:23 UTC
(In reply to comment #4)
> On this particular machine (my development machine):
> 
> 2.6.21-rc7 

Oh really, and you've seen it using this kernel?

> 
> Problem most often has manifested itself on:
> 
> 2.6.20-1.2948.fc6
> 
> I have enclosed /proc/kallsyms
> 
> What do you mean by a debug log (remember, the problem is not easily
> reproducible, and I would hate to destroy a specimen of it that I can keep alive
> as long as I see fit).

Yep, I know it's a big ask, and setting debug logging will most
likely prevent the bug from occurring, but if you have enough
disk space, give it a try if you can.

http://people.redhat.com/jmoyer describes what needs to be set.

btw, autofs-5.0.1-0.rc3.30 should be on its way to the testing
directory on the mirrors. I'm not sure this will help but please
give it a try when it arrives. Or grab the source rpm from Rawhide
or F7, build it and give that a try.

Ian




Comment 6 Anders Blomdell 2007-05-31 10:51:39 UTC
Created attachment 155790 [details]
strace of good mount

I'm still dodging the 'restart automounter' request, since I want to gather as
much info from the current hang (machine has been up for 35 days, automounter
for 18 days before the bug manifested itself). Usually this bug hits various
desktop machines for users who don't like me to debug their machines for hours
:-(.

So I have submitted 2 straces, one for a request that works (strace.ok) and
one for a failed request (strace.bad). Since the strace of the failed request
is totally silent, it seems like there is a pending mount request somewhere
deep down in the kernel that is stopping the request from reaching the automounter.

Comment 7 Anders Blomdell 2007-05-31 10:52:36 UTC
Created attachment 155792 [details]
strace of failed mount

Comment 8 Anders Blomdell 2007-05-31 11:22:58 UTC
Another point of possible interest (assuming it's a kernel bug):

An affected system stops auto-unmounting unused filesystems. I'll continue to
try to understand where that behaviour intersects with a stuck mount.


Comment 9 Ian Kent 2007-05-31 12:15:46 UTC
(In reply to comment #6)
> Created an attachment (id=155790)
> strace of good mount
> 
> I'm still dodging the 'restart automounter' request, since I want to gather as
> much info from the current hang (machine has been up for 35 days, automounter
> for 18 days before the bug manifested itself). Usually this bug hits various
> desktop machines for users who don't like me to debug their machines for hours
> :-(.

We probably won't get anything more that's useful from the
broken machine, thanks for getting the stuff so far.

We need to record the kernel version of machines that show
this problem. We need to confirm whether you see it on
machines with 2.6.21 or not. A patch for a race between
mount and expire is included in it. It may not be the
problem you're seeing but it seems to have come up a
few times lately (actually suddenly, I don't know why,
it was really hard to reproduce when we were debugging
it). The patch was present from 2.6.21-rc1.

> 
> So I have submitted 2 straces, one for a request that works (strace.ok) and
> one for a failed request (strace.bad). Since the strace of the failed request
> is totally silent, it seems like there is a pending mount request somewhere
> deep down in the kernel that is stopping the request from reaching the automounter.

Not sure if we got any extra evidence with these.
The symptoms of the race above are that you don't see mount
requests coming in when there should be. The straces do appear
to indicate that, but this should only happen with specific
configurations and specific usage patterns. In some cases it
will recover several seconds after it happens (maybe 10) and
just start working again. OTOH the gdb trace looks like we
have a new problem, a deadlock, which would have symptoms of
stopping any further mounts to that mount point and possibly
(it looks probable from the gdb output) other mount points
as well.

The package I mentioned above (0.rc3.30) has some significant
changes to the map re-reading and identification of changed
maps. The gdb trace shows a map re-read happening and that's
why I wanted you to use that for further testing. I'd hate for
us to spend heaps of time solving something that's fixed already.

btw, a sysrq-t log dump is often useful.

Lastly, if this is a new race then things get more difficult.
Installing the corresponding autofs-debuginfo package so we can
get slightly more useful gdb backtrace info would be good. I'll
dig up my instructions for what to get. Basically, list the
threads and then a stack trace (bt) for threads that don't look
as they should (much like what you did already), essentially
ignoring anything waiting in poll and a couple of other worker
threads (if they look ok).

What's really important though is your setup: maps, nsswitch
and particularly usage patterns. Any of this info is really
useful.

So bummer, having this bug, sorry.
Ian

Comment 10 Ian Kent 2007-05-31 12:21:21 UTC
(In reply to comment #8)
> Another point of possible interest (assuming it's a kernel bug):
> 
> An affected system stops auto-unmounting unused filesystems. I'll continue to
> try to understand where that behaviour intersects with a stuck mount.
> 

Mmm .. I hadn't noticed that with the race I mentioned
above but we were more interested in fixing the dead mount
problem, so it could be related to the same problem. But let's
not make assumptions, the gdb output looks like a pthreads
deadlock in the user space daemon.

Ian


Comment 11 Anders Blomdell 2007-05-31 14:54:54 UTC
Ahh, it was the automount maps (silly me)

# Master map for automounter
/home   auto.home --ghost
/work   auto.work --ghost
/auto.direct  /etc/auto.direct


# /etc/nsswitch.conf
passwd:     files nis
shadow:     files nis
group:      files nis
hosts:      files dns nis
ethers:     files nis
netmasks:   files 
networks:   files
protocols:  files nis
rpc:        files
services:   files nis
netgroup:   files nis
automount:  files nis
aliases:    files

Some more stray thoughts:

Restarting the automounter makes the problem go away (kill -9 required). This indicates
that the kernel is waiting for the stuck mount, blocking further requests to
that mountpoint from succeeding (back to pointing fingers at automount, I guess).

If I understand you correctly, the problem (probably) resides in automount,
since 2.6.21-rc7 [SMP PREEMPT] exhibits the problem. I'll slowly bring all our
computers over to autofs-5.0.1-0.rc3.30.

Comment 12 Ian Kent 2007-05-31 15:10:27 UTC
(In reply to comment #11)
> Ahh, it was the automount maps (silly me)
> 
> # Master map for automounter
> /home   auto.home --ghost
> /work   auto.work --ghost
> /auto.direct  /etc/auto.direct

Are the maps themselves simple maps, ie no multi-mount,
submount entries or "+" map inclusion?

> 
> 
> # /etc/nsswitch.conf
> passwd:     files nis
> shadow:     files nis
> group:      files nis
> hosts:      files dns nis
> ethers:     files nis
> netmasks:   files 
> networks:   files
> protocols:  files nis
> rpc:        files
> services:   files nis
> netgroup:   files nis
> automount:  files nis
> aliases:    files

All normal.

> 
> Some more stray thoughts:
> 
> Restarting the automounter makes the problem go away (kill -9 required). This indicates
> that the kernel is waiting for the stuck mount, blocking further requests to
> that mountpoint from succeeding (back to pointing fingers at automount, I guess).
> 
> If I understand you correctly, the problem (probably) resides in automount,
> since 2.6.21-rc7 [SMP PREEMPT] exhibits the problem. I'll slowly bring all our
> computers over to autofs-5.0.1-0.rc3.30.

Yep, there are no other known problems with the kernel module.
I'm keen to see how the update goes.
It's been pushed so it should find its way to the testing
directory on the mirrors soonish.

Ian




Comment 13 Ian Kent 2007-05-31 15:15:12 UTC
(In reply to comment #11)
> Restarting the automounter makes the problem go away (kill -9 required). This indicates
> that the kernel is waiting for the stuck mount, blocking further requests to
> that mountpoint from succeeding (back to pointing fingers at automount, I guess).

OK, that sounds more like a daemon problem, yes.

> 
> If I understand you correctly, the problem (probably) resides in automount,
> since 2.6.21-rc7 [SMP PREEMPT] exhibits the problem. I'll slowly bring all our
> computers over to autofs-5.0.1-0.rc3.30.

It would be good to understand the workload better.
The best way for me to work out what's wrong is to be able
to duplicate it. So understanding what's wrong is the first
step. I've got a dual core notebook here so I may be able to
simulate the workload and just leave it running. That also
involves using maps similar to what you have in the same way.

Ian


Comment 14 Anders Blomdell 2007-05-31 16:07:35 UTC
> Are the maps themselves simple maps, ie no multi-mount,
> submount entries or "+" map inclusion?
Yes, mainly people's /home directories (coming from 2 different servers) and
/work directories residing on people's workstations (around 50), usually
mounted locally but accessible from all machines.

> Yep, there are no other known problems with the kernel module.
> I'm keen to see how the update goes.
> It's been pushed so it should find its way to the testing
> directory on the mirrors soonish.
Installed on my development machine :-)

> It would be good to understand the workload better.
Basically random mounts with a huge preference to one /home and one /work
directory per machine.

I have started the following script on the upgraded and one other machine (I
will report back if I get any interesting results):

#!/usr/bin/python
# Stress test: stat a random automounted /home or /work entry every
# 30 seconds (the trailing "/." forces the mount to be triggered).

import nis
import os
import random
import time

if __name__ == "__main__":
    HOME = nis.cat("auto.home").keys()
    WORK = nis.cat("auto.work").keys()
    while True:
        dir = random.choice(["/home/%s/." % random.choice(HOME),
                             "/work/%s/." % random.choice(WORK)])
        try:
            print dir, os.stat(dir).st_mtime
        except OSError, e:
            print e

        time.sleep(30)





Comment 15 Anders Blomdell 2007-06-04 08:47:30 UTC
Created attachment 156053 [details]
gdb + strace info of rc3.30 hung machine

A conclusive answer was reached after this weekend's stress test:

6 out of 39 computers running autofs-5.0.1-0.rc3.30 are stuck!

Comment 16 Anders Blomdell 2007-06-04 12:22:27 UTC
Created attachment 156069 [details]
Better gdb trace (with symbols)

Finally managed to get my hands on the debug symbols (despite the
recent release of Fedora 7)

Deadlock:

Thread 11:
  alarm_handler(void *arg):
    alarm_lock()
    state_mutex_lock(ap)

Thread 5:
  expire_cleanup(void *arg)
    state_mutex_lock(ap);
    alarm_add(struct autofs_point *ap, time_t seconds):
      alarm_lock()

And the following lock chain evolves:

  11 <-+- 4
  ^    |
  |    +- 6 <-+- 2
  v           |
  5           +- 3 <- 1


To be honest, there is a lot of locking going on, and it seems (to me)
like one big lock and condition waits could do the job; if that is not the case,
a strict locking hierarchy should be observed!
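
In essence, threads 11 and 5 above form the textbook AB-BA cycle. A minimal,
self-contained sketch of that pattern (hypothetical function names, not the
actual autofs code):

#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t alarm_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t state_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Thread 11's ordering: alarm lock first, then the state lock. */
static void *alarm_path(void *arg)
{
  pthread_mutex_lock(&alarm_mutex);
  pthread_mutex_lock(&state_mutex);  /* blocks if thread 5 got here first */
  /* ... */
  pthread_mutex_unlock(&state_mutex);
  pthread_mutex_unlock(&alarm_mutex);
  return NULL;
}

/* Thread 5's ordering: state lock first, then the alarm lock (reversed). */
static void *expire_path(void *arg)
{
  pthread_mutex_lock(&state_mutex);
  pthread_mutex_lock(&alarm_mutex);  /* blocks if thread 11 got here first */
  /* ... */
  pthread_mutex_unlock(&alarm_mutex);
  pthread_mutex_unlock(&state_mutex);
  return NULL;
}

/* Once each thread holds its first mutex, both block forever on the second
 * acquisition: neither can ever release what the other needs. */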

Comment 17 Ian Kent 2007-06-04 13:34:46 UTC
(In reply to comment #16)
> Created an attachment (id=156069)
> Better gdb trace (with symbols)
> 
> Finally managed to get my hands on the debug symbols (despite the
> recent release of Fedora 7)
> 
> Deadlock:
> 
> Thread 11:
>   alarm_handler(void *arg):
>     alarm_lock()
>     state_mutex_lock(ap)
> 
> Thread 5:
>   expire_cleanup(void *arg)
>     state_mutex_lock(ap);
>     alarm_add(struct autofs_point *ap, time_t seconds):
>       alarm_lock()

Mmm .. I thought I got rid of that out-of-order locking.
Thanks for the detective work, I'll go through the code
and see what I can find. I'm not sure why I thought I
needed to add the alarm under the state mutex.

> 
> And the following lock chain evolves:
> 
>   11 <-+- 4
>   ^    |
>   |    +- 6 <-+- 2
>   v           |
>   5           +- 3 <- 1
> 
> 
> To be honest, there is a lot of locking going on, and it seems (to me)
> like one big lock and condition waits could do the job; if that is not the case,
> a strict locking hierarchy should be observed!

True, it is too complex and there is plenty of room for
improvement. It may be possible to use that approach,
I certainly thought about it and started down that path,
but it rapidly got very difficult. Perhaps in the next
version when what we have settles down and fewer changes
are being made.

Ian




Comment 18 Anders Blomdell 2007-06-04 13:48:44 UTC
Please send the result (patch or similar) as soon as you have something to test,
and I'll put it under stress...


Comment 19 Anders Blomdell 2007-06-04 15:00:51 UTC
As an extra layer of paranoia, how about adding something like this (to detect
locks that are not granted within TIMEOUT seconds)? If you think it's a good
idea, I could create an actual patch with something that even might compile :-)


#include <pthread.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>
#include "list.h"   /* kernel-style list helpers, as used in autofs */

#define TIMEOUT 60  /* seconds a lock may stay ungranted before we complain */

extern void report_or_die(void);

static pthread_mutex_t watch_mutex = PTHREAD_MUTEX_INITIALIZER;

struct watch_queue {
  struct list_head list;
  pthread_mutex_t *mutex;
  pthread_t thread;
  time_t time;
};

static LIST_HEAD(watch_list);

/* Use this to monitor deadlocks */
int watch_pthread_mutex_lock(pthread_mutex_t *m)
{
  int result;
  struct watch_queue w;

  w.mutex = m;
  w.time = time(NULL);
  w.thread = pthread_self();

  /* register the pending acquisition */
  pthread_mutex_lock(&watch_mutex);
  list_add(&w.list, &watch_list);
  pthread_mutex_unlock(&watch_mutex);

  result = pthread_mutex_lock(m);

  /* lock granted: deregister it */
  pthread_mutex_lock(&watch_mutex);
  list_del(&w.list);
  pthread_mutex_unlock(&watch_mutex);

  return result;
}

/* To be run in a separate thread */
void *watch(void *arg)
{
  while (1) {
    struct list_head *p;
    time_t now;

    sleep(10);

    now = time(NULL);
    pthread_mutex_lock(&watch_mutex);
    list_for_each(p, &watch_list) {
      struct watch_queue *w = list_entry(p, struct watch_queue, list);
      if (now - w->time > TIMEOUT)
        report_or_die();  /* this acquisition has been stuck too long */
    }
    pthread_mutex_unlock(&watch_mutex);
  }
  return NULL;
}

Comment 20 Ian Kent 2007-06-04 15:08:52 UTC
(In reply to comment #19)
> As an extra layer of paranoia, how about adding something like this (to detect
> locks that are not granted within TIMEOUT seconds)? If you think it's a good
> idea, I could create an actual patch with something that even might compile :-)

Ha .. thanks, good idea but I'd really rather not add to the
complexity just now.

I've had a quick look and I have a simple patch, I haven't
even checked to see if it compiles. I do want to be able
to add alarms while holding the state mutex, at least for
now.

Bearing in mind that I still need to think more about this,
would you be willing to give the patch a try anyway? I'll
check it compiles and give it a quick run against the
autofs connectathon suite first.

Ian


Comment 21 Anders Blomdell 2007-06-04 15:17:38 UTC
Of course, but I'll be gone for today within 35 minutes, so it will probably be
tomorrow!

Comment 22 Ian Kent 2007-06-04 15:23:56 UTC
(In reply to comment #21)
> Of course, but I'll be gone for today within 35 minutes, so it will probably be
> tomorrow!

And I'm out much of tomorrow.
I'll post the patch after testing and catch up tomorrow
afternoon (probably near your lunch time).

Ian


Comment 23 Anders Blomdell 2007-06-04 15:32:49 UTC
Probably not, I'm on CET (GMT + 1), but since I don't know your timezone, what
do I know...

Comment 24 Ian Kent 2007-06-04 15:52:31 UTC
(In reply to comment #23)
> Probably not, I'm on CET (GMT + 1), but since I don't know your timezone, what
> do I know...

WST (GMT + 8).

Comment 25 Ian Kent 2007-06-04 15:58:08 UTC
Created attachment 156089 [details]
Test patch for the alarm deadlock problem

Can you give this patch a try when you get a chance, please?
I've run it against the autofs connectathon test suite, but
only on a single-CPU machine.

Comment 26 Anders Blomdell 2007-06-05 14:59:26 UTC
Created attachment 156232 [details]
.tar file of various pieces to show presence of [possible] deadlock

I feel confident that it will deadlock (eventually). This is based on actual
traces of a short execution on my machine, and what I hope is a correct
analysis of the output of that trace. The following output from the
autofs_analyze tool on the automount.log file illustrates my point (last entry
in analyze.output, which is generated by './autofs_analyze automount.log'):

['0x8002fa68', '0x8004c2a4', '0x8002f740', '0x8004c2a4']
0xb7e886c0 :
   0x8002fa68                          0x8004c2a4
   &master_mutex                       &ap->state_mutex
   master.c:1149 master_mount_mounts   master.c:1192 master_mount_mounts
0xb7e87b90 :
   0x8002f740                          0x8004c2a4
   &mutex                              &ap->state_mutex
   state.c:899 st_queue_handler        state.c:819 run_state_task
0xb7ab6b90 :
   0x8004c2a4                          0x8002f740
   &ap->state_mutex                    &mutex
   state.c:108 expire_cleanup          state.c:876 st_set_done

The deadlock will occur because tasks 0xb7e87b90 and 0xb7ab6b90 take the
mutexes in reverse order. There are NO shortcuts to a proper lock ordering
(that's my experience after more than 20 years of realtime programming),
but it can take years for it to actually come and bite you...

It might be that some of the possible deadlocks are impossible due to lifetime
of tasks, but I wouldn't bet on it...
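
For illustration, one conventional way to honour such a hierarchy when a
thread occasionally needs the locks in the "wrong" order is trylock with
full back-off; a sketch under assumed names, not the autofs implementation:

#include <pthread.h>
#include <sched.h>

/* Hierarchy rule (assumed for this sketch): state_mutex is always taken
 * before queue_mutex. */
static pthread_mutex_t state_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t queue_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Caller already holds queue_mutex but now also needs state_mutex,
 * i.e. the reverse of the hierarchy. */
static void lock_state_from_queue_context(void)
{
  while (pthread_mutex_trylock(&state_mutex) != 0) {
    /* Blocking here would risk exactly the AB-BA cycle above, so back
     * off completely and let the holder of state_mutex finish first. */
    pthread_mutex_unlock(&queue_mutex);
    sched_yield();
    pthread_mutex_lock(&queue_mutex);
  }
  /* Both locks now held, without ever blocking in reverse order. */
}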

Comment 27 Ian Kent 2007-06-05 15:44:33 UTC
(In reply to comment #26)
> Created an attachment (id=156232)
> .tar file of various pieces to show presence of [possible] deadlock
> 
> I feel confident that it will deadlock (eventually). This is based on actual
> traces of a short execution on my machine, and what I hope is a correct
> analysis of the output of that trace. The following output from the
> autofs_analyze tool on the automount.log file illustrates my point (last entry
> in analyze.output, which is generated by './autofs_analyze automount.log'):
> 
> ['0x8002fa68', '0x8004c2a4', '0x8002f740', '0x8004c2a4']
> 0xb7e886c0 :
>    0x8002fa68                          0x8004c2a4
>    &master_mutex                       &ap->state_mutex
>    master.c:1149 master_mount_mounts   master.c:1192 master_mount_mounts
> 0xb7e87b90 :
>    0x8002f740                          0x8004c2a4
>    &mutex                              &ap->state_mutex
>    state.c:899 st_queue_handler        state.c:819 run_state_task
> 0xb7ab6b90 :
>    0x8004c2a4                          0x8002f740
>    &ap->state_mutex                    &mutex
>    state.c:108 expire_cleanup          state.c:876 st_set_done
> 
> The deadlock will occur because tasks 0xb7e87b90 and 0xb7ab6b90 take the
> mutexes in reverse order. There are NO shortcuts to a proper lock ordering
> (that's my experience after more than 20 years of realtime programming),
> but it can take years for it to actually come and bite you...
> 
> It might be that some of the possible deadlocks are impossible due to lifetime
> of tasks, but I wouldn't bet on it...
> 

Good work Anders, give me a bit to work through this but,
on the face of it, it looks like your analyzer is a smart
little cookie!

Ian




Comment 28 Anders Blomdell 2007-06-05 15:57:07 UTC
Created attachment 156238 [details]
lock reorder patch

Not smart enough :-(. With the attached patch I get a lot of warnings like:

['0x8004b6a4', '0x8002f660', '0x8002fa80', '0x8004c2a4', '0x8002f660']
0xb7df9b90 :
   0x8004b6a4                       0x8002f660                       0x8002fa80
   &ap->state_mutex                 &suc.mutex                       &instance_mutex
   automount.c:1165 handle_mounts   automount.c:1167 handle_mounts   master.c:353 master_find_source_instance
0xb7b64b90 :
   0x8004c2a4                       0x8002f660                       0x8002fa80
   &ap->state_mutex                 &suc.mutex                       &instance_mutex
   automount.c:1165 handle_mounts   automount.c:1167 handle_mounts   master.c:353 master_find_source_instance

Which I think are benign (even if I feel a bit queasy about them).
But I don't know if it is correct to temporarily drop the ap_lock and then
retake it in st_set_done (but this time after st_mutex_lock has been taken).

Comment 29 Anders Blomdell 2007-06-05 16:31:34 UTC
Created attachment 156240 [details]
new autofs_analyze

OK, modified autofs_analyze tool. It reports the obvious deadlock problems and
is quiet about the ones I think are benign. This really needs to be cleaned up
and scrutinized if we are to rely on it...

Comment 30 Ian Kent 2007-06-05 16:50:23 UTC
(In reply to comment #28)
> Created an attachment (id=156238)
> lock reorder patch
> 
> Not smart enough :-(. With the attached patch I get a lot of warnings like:
> 
> ['0x8004b6a4', '0x8002f660', '0x8002fa80', '0x8004c2a4', '0x8002f660']
> 0xb7df9b90 :
>    0x8004b6a4                       0x8002f660                       0x8002fa80
>    &ap->state_mutex                 &suc.mutex                       &instance_mutex
>    automount.c:1165 handle_mounts   automount.c:1167 handle_mounts   master.c:353 master_find_source_instance
> 0xb7b64b90 :
>    0x8004c2a4                       0x8002f660                       0x8002fa80
>    &ap->state_mutex                 &suc.mutex                       &instance_mutex
>    automount.c:1165 handle_mounts   automount.c:1167 handle_mounts   master.c:353 master_find_source_instance
> 
> Which I think are benign (even if I feel a bit queasy about them).
> But I don't know if it is correct to temporarily drop the ap_lock and then
> retake it in st_set_done (but this time after st_mutex_lock has been taken).


OK, I'll have a look there as well.

The other trace is quite interesting but I'm not sure that
deadlock is possible. The only thing I could see (assuming
output from the analyzer is correct, and it looks to be) is
a suggested deadlock between the st_queue_handler->st_run_task
and expire_cleanup->st_set_done. For a given ap (struct 
autofs_point, one for each mount) the st_run_task is run only
when a new task is present and doesn't dequeue another until that
one has finished, by calling st_set_done. At least that's the
way it's supposed to work. OTOH this assumption may be violated
at shutdown so I'll go through that and check. OTOOH I think
there really isn't any need to call st_task_done from under
the ap state_mutex so I'm going to move it out anyway.

So I guess what I'm saying is that maybe the analyzer picked
invocations of st_run_task and expire_cleanup that each
correspond to a different ap and hence different state_mutex.
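
To illustrate the point, a compilable sketch of that discipline (assumed
field names; the real struct state_queue differs):

#include <pthread.h>
#include <stddef.h>

struct task {
  struct task *next;
  int busy;  /* set while st_run_task is executing this task */
  int done;  /* set by st_set_done when the task completes */
};

struct ap_state_queue {
  pthread_mutex_t state_mutex;  /* one per autofs_point */
  struct task *head;
};

/* Called by the queue handler under its own lock (not shown): start the
 * head task only if no task is currently active for this autofs_point.
 * The next task is not dequeued until st_set_done marks this one done,
 * so at most one task per ap ever runs under that ap's state_mutex. */
static struct task *next_task(struct ap_state_queue *q)
{
  if (q->head && !q->head->busy && !q->head->done) {
    q->head->busy = 1;
    return q->head;
  }
  return NULL;  /* an active task exists; wait for st_set_done */
}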

Thanks for your effort, I appreciate it very much.
btw, I have no idea how you produced the automount.log, what
did you use for that?

Ian

Comment 31 Anders Blomdell 2007-06-05 17:03:50 UTC
> The other trace is quite interesting but I'm not sure that
> deadlock is possible. 
Neither am I, but the risk is that some very fast multiprocessor manages to
stumble over it :-)

> btw, I have no idea how you produced the automount.log, what
> did you use for that?
The hack.patch from the earlier .tar file (don't tell anyone I'm responsible for
that one :-) )

Good that you pointed to shutdown as an important thing to check, a shutdown
that exposed the following possible problems:

['0x8002fa68', '0x8002f660', '0x8004c2a4', '0x8002fa68']
0xb7e0b6c0 :
   0x8002fa68                          0x8002f660
   &master_mutex                       &suc.mutex
   master.c:1149 master_mount_mounts   master.c:999 master_do_mount
0xb7b64b90 :
   0x8004c2a4                       0x8002f660                       0x8002fa80
   &ap->state_mutex                 &suc.mutex                       &instance_mutex
   automount.c:1165 handle_mounts   automount.c:1167 handle_mounts   master.c:353 master_find_source_instance
0xb7874b90 :
   0x8002fa68                                 0x8004c2a4
   &master_mutex                              &ap->state_mutex
   master.c:939 master_notify_state_change    master.c:948 master_notify_state_change

['0x8002fa68', '0x8002f660', '0x8004b6a4', '0x8002fa68']
0xb7e0b6c0 :
   0x8002fa68                          0x8002f660
   &master_mutex                       &suc.mutex
   master.c:1149 master_mount_mounts   master.c:999 master_do_mount
0xb7df9b90 :
   0x8004b6a4                       0x8002f660                       0x8002f7ac
   &ap->state_mutex                 &suc.mutex                       &parse_mutex
   automount.c:1165 handle_mounts   automount.c:1167 handle_mounts   nss_parse.y:142 parse_mutex_lock
0xb7874b90 :
   0x8002fa68                                 0x8004b6a4
   &master_mutex                              &ap->state_mutex
   master.c:939 master_notify_state_change    master.c:948 master_notify_state_change

['0x8002fa68', '0x8002f660', '0x8004bc74', '0x8002fa68']
0xb7e0b6c0 :
   0x8002fa68                          0x8002f660
   &master_mutex                       &suc.mutex
   master.c:1149 master_mount_mounts   master.c:999 master_do_mount
0xb7c65b90 :
   0x8004bc74                       0x8002f660                       0x8002f7ac
   &ap->state_mutex                 &suc.mutex                       &parse_mutex
   automount.c:1165 handle_mounts   automount.c:1167 handle_mounts   nss_parse.y:142 parse_mutex_lock
0xb7874b90 :
   0x8002fa68                                 0x8004bc74
   &master_mutex                              &ap->state_mutex
   master.c:939 master_notify_state_change    master.c:948 master_notify_state_change



Comment 32 Ian Kent 2007-06-05 17:21:33 UTC
(In reply to comment #31)
> > The other trace is quite interesting but I'm not sure that
> > deadlock is possible. 
> Neither am I, but the risk is that some very fast multiprocessor manages to
> stumble over it :-)

Maybe; you'd be talking about the window between the call to st_set_done and
state_mutex_unlock in expire_cleanup, since I've set the next alarm
above it. But it's worse because the user can send a prune event with
a signal, so yes, I'll move the st_set_done out from under the lock.

I can't see any reason not to move it out but I'm still thinking
about it.

> 
> > btw, I have no idea how you produced the automount.log, what
> > did you use for that?
> The hack.patch from the earlier .tar file (don't tell anyone I'm responsible for
> that one :-) )

Oh yes, I missed that.

> 
> Good that you pointed to shutdown as an important thing to check, a shutdown
> that exposed the following possible problems:

And I'll have a look at these below.

snip ...

Ian


Comment 33 Ian Kent 2007-06-06 03:23:37 UTC
(In reply to comment #32)
> (In reply to comment #31)
> > > The other trace is quite interesting but I'm not sure that
> > > deadlock is possible. 
> > Neither am I, but the risk is that some very fast multiprocessor manages to
> > stumble over it :-)
> 
> Maybe, you'd be talking about between the call to st_set_done and
> state_mutex_unlock in expire_cleanup since I've set the next alarm
> above it. But it's worse because the user can send a prune event with
> a signal, so yes, I'll move the st_set_done out from under the lock.

Nope, this deadlock is not possible. As it stands the code is
independent of execution order and timing. There's no need
to move the st_set_done or add additional state mutex locking.

I think the reason for the false positive is that the analyzer
doesn't take account of the state queue handler
predicate (struct state_queue) control variables, task->busy
and task->done, and it assumes that the ap state_mutex
referred to in st_run_task is the same ap state_mutex
that is currently held. When st_run_task is called, the
ap can be any ap except one that has a currently
active task, and the one that holds the state mutex is
clearly an active task.

Now, onto the shutdown.

Ian

Comment 34 Ian Kent 2007-06-06 06:37:08 UTC
(In reply to comment #31)
> Good that you pointed to shutdown as an important thing to check, a shutdown
> that exposed the following possible problems:
> 
> ['0x8002fa68', '0x8002f660', '0x8004c2a4', '0x8002fa68']
> 0xb7e0b6c0 :
>    0x8002fa68                          0x8002f660
>    &master_mutex                       &suc.mutex
>    master.c:1149 master_mount_mounts   master.c:999 master_do_mount
> 0xb7b64b90 :
>    0x8004c2a4                       0x8002f660                       0x8002fa80
>    &ap->state_mutex                 &suc.mutex                       &instance_mutex
>    automount.c:1165 handle_mounts   automount.c:1167 handle_mounts   master.c:353 master_find_source_instance
> 0xb7874b90 :
>    0x8002fa68                                 0x8004c2a4
>    &master_mutex                              &ap->state_mutex
>    master.c:939 master_notify_state_change    master.c:948 master_notify_state_change

I'm having a lot of trouble understanding how this sequence
could be a problem.

In order for master_notify_state_change to acquire the
master_mutex, master_mount_mounts must have completed,
in which case the master_mutex will have been released.
It is possible for the last mount (handle_mounts) to
not have completed its startup at this time, and if
there is an event which calls master_notify_state_change
and it is able to take the master_mutex, it may need to
wait on an ap state_mutex if it's held.

It looks like the analysis tool doesn't take account of
the passage of time and so is a little inaccurate in its
reporting.

Ian


Comment 35 Anders Blomdell 2007-06-06 18:21:02 UTC
> I'm having a lot of trouble understanding how this sequence
> could be a problem.
So have I, on closer inspection; once more barking up the wrong tree...

Let's hope that all deadlock chances are removed.

Sorry to bother you with non-problems...


Comment 36 Ian Kent 2007-06-07 05:10:38 UTC
(In reply to comment #35)
> > I'm having a lot of trouble understanding how this sequence
> > could be a problem.
> So have I at a closer inspection, once more barking up the wrong tree...
> 
> Let's hope that all deadlock chances are removed.
> 
> Sorry to bother you with non-problems...
> 

Absolutely no problem.
There's no need to be sorry.
I welcome your efforts to find problems and it gives
me the opportunity to focus on validating specific
possible problem areas of the code.

I've tried to identify all the races and deadlocks,
but Murphy's law rules, so time will tell.

Thanks
Ian


Comment 37 Ian Kent 2007-06-07 05:18:24 UTC
btw, I'm going to go ahead with the patch we have for
this so far as it's certainly a bug that needs to be
fixed.

Ian

Comment 38 Anders Blomdell 2007-06-07 07:58:09 UTC
Created attachment 156433 [details]
alternate implementation of alarm_handler

(In reply to comment #37)
> btw, I'm going to go ahead with the patch we have for
> this so far as it's certainly a bug that needs to be
> fixed.
If it's the patch to alarm.c, I have attached an alternate implementation of
alarm_handler:
  * no code duplication (less risk of reintroducing a bug in one branch :-)
  * one extra call to pthread_cond_timedwait per expired alarm,
    but no call to time(), so I believe that performance would be
    on par with the current implementation

Comment 39 Ian Kent 2007-06-07 08:56:54 UTC
(In reply to comment #38)
> Created an attachment (id=156433)
> alternate implementation of alarm_handler
> 
> (In reply to comment #37)
> > btw, I'm going to go ahead with the patch we have for
> > this so far as it's certainly a bug that needs to be
> > fixed.
> If it's the patch to alarm.c, I have attached an alternate implementation of
> alarm_handler:
>   * no code duplication (less risk of reintroducing a bug in one branch :-)
>   * one extra call to pthread_cond_timedwait per expired alarm,
>     but no call to time(), so I believe that performance would be
>     on par with the current implementation

I'm not sure about this alternate implementation.

Why incur the overhead of possibly unlocking the mutex,
waiting (I hope it's smarter than that) and locking the
mutex again for an alarm that's already passed?

We have to assume that the mutex will be unlocked
and then locked, even if the time has passed, since
that's the way condition waits are defined, so it's
possible for current to be invalid on return.

So it's not quite right as is.

Ian


Comment 40 Ian Kent 2007-06-07 09:09:09 UTC
(In reply to comment #39)
> (In reply to comment #38)
> > Created an attachment (id=156433)
> > alternate implementation of alarm_handler
> > 
> > (In reply to comment #37)
> > > btw, I'm going to go ahead with the patch we have for
> > > this so far as it's certainly a bug that needs to be
> > > fixed.
> > If it's the patch to alarm.c, I have attached an alternate implementation of
> > alarm_handler:
> >   * no code duplication (less risk of reintroducing a bug in one branch :-)
> >   * one extra call to pthread_cond_timedwait per expired alarm,
> >     but no call to time(), so I believe that performance would be
> >     on par with the current implementation
> 
> I'm not sure about this alternate implementation.
> 
> Why incur the overhead of possibly unlocking the mutex,
> waiting (I hope it's smarter than that) and locking the
> mutex again for an alarm that's already passed?
> 
> We have to assume that the mutex will be unlocked
> and then locked, even if the time has passed, since
> that's the way condition waits are defined, so it's
> possible for current to be invalid on return.
> 
> So it's not quite right as is.

Also, what happens if the pthread_cond_timedwait is
signaled, not timed out and an alarm has been added
with a timeout before the alarm we're waiting for?

Ian


Comment 41 Anders Blomdell 2007-06-07 09:22:11 UTC
Created attachment 156437 [details]
new try on alarm_handler

In reply to comment #40)
> Also, what happens if the pthread_cond_timedwait is
> signaled, not timed out and an alarm has been added
> with a timeout before the alarm we're waiting for?
We start over from the top of the while loop. New version of alarm_handler,
addressing the [VERY] valid point that current must still be the first element
in the alarm list (otherwise alarms fire out of sequence).
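
For clarity, the restart-from-the-top pattern under discussion, as a
minimal sketch (hypothetical names, not the code in the attachment; it
assumes entries are only ever removed by this handler, so 'first' stays
valid across the wait):

#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>
#include "list.h"  /* kernel-style list, as elsewhere in autofs */

struct alarm {
  struct list_head list;
  time_t expires;
};

static pthread_mutex_t alarm_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t alarm_cond = PTHREAD_COND_INITIALIZER;
static LIST_HEAD(alarms);  /* kept sorted, earliest expiry first */

static void *alarm_handler(void *arg)
{
  pthread_mutex_lock(&alarm_mutex);
  while (1) {
    struct alarm *first;
    struct timespec ts;

    if (list_empty(&alarms)) {
      /* Nothing queued: sleep until an alarm is added. */
      pthread_cond_wait(&alarm_cond, &alarm_mutex);
      continue;
    }

    first = list_entry(alarms.next, struct alarm, list);
    ts.tv_sec = first->expires;
    ts.tv_nsec = 0;

    /* Wait for the head alarm to expire.  If we are signaled instead,
     * the list may have changed (an earlier alarm may have been added),
     * so start over from the top of the loop. */
    if (pthread_cond_timedwait(&alarm_cond, &alarm_mutex, &ts) != ETIMEDOUT)
      continue;

    /* Timed out, so the head has expired.  Re-check that it is still
     * the first entry before firing, to avoid out-of-sequence firing. */
    if (list_entry(alarms.next, struct alarm, list) != first)
      continue;

    list_del(&first->list);
    /* ... fire 'first' here, dropping alarm_mutex if necessary ... */
  }
  return NULL;
}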

I probably should try to keep my mouth shut :-)

Comment 42 Ian Kent 2007-06-07 12:20:11 UTC
(In reply to comment #41)
> Created an attachment (id=156437)
> new try on alarm_handler
> 
> In reply to comment #40)
> > Also, what happens if the pthread_cond_timedwait is
> > signaled, not timed out and an alarm has been added
> > with a timeout before the alarm we're waiting for?
> We start over from the top of the while loop. New version of alarm_handler,
> addressing the [VERY] valid point that current must still be the first element
> in the alarm list (otherwise alarms fire out of sequence).
> 

It looks like it's still not quite right.
But we're getting away from fixing a bug here now.

If you'd like to contribute to autofs please feel
welcome to join the autofs mailing list at
http://linux.kernel.org/mailman/listinfo/autofs

I prefer code submissions as unified diffs against
either the current release tarball (with all patches
from kernel.org applied) or the development git
repository.

Tarball:
http://www.kernel.org/pub/linux/daemons/autofs/v5/autofs-5.0.1.tar.gz

git:
git://git.kernel.org/pub/scm/linux/storage/autofs/autofs.git

Ian


Comment 43 Anders Blomdell 2007-06-07 13:12:30 UTC
(In reply to comment #42)
> I prefer code submissions as unified diffs against
> either the current release tarball (with all patches
> from kernel.org applied) or the development git
> repository
OK, will do. 

Comment 44 Bug Zapper 2008-04-04 07:19:38 UTC
Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.
http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6 thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change the bug to the respective version. If
you are unable to do this, please add a comment to this bug requesting
the change.

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we are following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers

Comment 45 Bug Zapper 2008-05-06 19:38:42 UTC
This bug is open for a Fedora version that is no longer maintained and
will not be fixed by Fedora. Therefore we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora, please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

