Bug 204309 - kernel retries portmap query indefinitely when statd is down
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: i386 Linux
Priority: medium  Severity: medium
Assigned To: Jeff Layton
QA Contact: Brian Brock
Duplicates: 448479
Depends On:
Blocks: 430698
Reported: 2006-08-28 10:15 EDT by Alexey Roytman
Modified: 2008-07-24 15:11 EDT (History)
5 users (show)

See Also:
Fixed In Version: RHSA-2008-0665
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-07-24 15:11:47 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---


Attachments
Output of "tethereal -i any -w <the-file>" during login to KDE on problematic kernel (68.38 KB, application/octet-stream)
2007-01-07 04:46 EST, Alexey Roytman
fcntl() simulation that hangs on "bad" kernels (788 bytes, text/plain)
2007-09-17 04:34 EDT, Alexey Roytman
proposed patch (944 bytes, patch)
2007-09-18 21:05 EDT, Jeff Layton
patch -- clean up soft task error handling (3.34 KB, patch)
2008-01-07 15:05 EST, Jeff Layton

Description Alexey Roytman 2006-08-28 10:15:17 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.5) Gecko/20060726 Red Hat/1.5.0.5-0.el4.1 Firefox/1.5.0.5 pango-text

Description of problem:
After a massive upgrade of RHEL4 to Update 4 and a reboot, I was unable to log in to the KDE environment. The ksplash screen stayed for some time at "Initializing peripherals", then exited, leaving a blue screen with no further progress.

My credentials are stored on a NIS server, and my home directory is on an NFS server:
  mount -v
  FileServer:/users on /users type nfs (rw,soft,intr,timeo=100,rsize=32768,wsize=32768,nfsvers=3,addr=XXX.YYY.ZZZ.TTT)

When I created a new user with a local /etc/passwd entry and a home directory on the NFS server, that user could not log in either. A user with local credentials and a local home directory could log in, so the problem is in the NFS layer.

I rebooted with an older kernel (the one I had before the upgrade) and everything worked as it should.



Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-34.0.1.EL

How reproducible:
Always


Steps to Reproduce:
1. Boot with kernel kernel-smp-2.6.9-42.0.2.EL
2. Log in to the KDE environment as a user whose home directory is NFS-mounted


Actual Results:
A blue/cyan screen is left with no activity.


Expected Results:
The user should be logged in to their (new) environment.

Additional info:
$ uname -a
Linux frog 2.6.9-34.0.1.ELsmp #1 SMP Wed May 17 17:05:24 EDT 2006 i686 i686 i386 GNU/Linux

$ rpm -qa | egrep 'glibc|kdebase|kdelibs|kernel' | sort
glibc-2.3.4-2.25
glibc-common-2.3.4-2.25
glibc-devel-2.3.4-2.25
glibc-headers-2.3.4-2.25
glibc-kernheaders-2.4-9.1.98.EL
kdebase-3.3.1-5.13
kdelibs-3.3.1-3.14
kernel-smp-2.6.9-34.0.1.EL
kernel-smp-2.6.9-42.0.2.EL
kernel-utils-2.4-13.1.83

$ cat /etc/redhat-release
Red Hat Enterprise Linux ES release 4 (Nahant Update 4)

NFS server is Fedora Core 3:
   $ rpm -qa | egrep 'kernel|nfs|rpc'|sort
   kernel-smp-2.6.12-1.1381_FC3
   kernel-utils-2.4-13.1.49_FC3
   nfs-utils-1.0.6-52

NIS server is RedHat 7.2 (Enigma):
   $ rpm -qa | egrep 'kernel|nis|yp'|sort
   kernel-smp-2.4.7-10
   ypbind-1.8-1
   ypserv-1.3.12-2
   yp-tools-2.5-1
Comment 1 Hansjoerg Maurer 2006-09-08 05:17:12 EDT
Hi

we experienced a similar problem after an upgrade to RHEL4U4.
After the update, a Nastran run on the RHEL4U4 client fails to
run the analysis program. The reason is an I/O error:
Nastran complains that the data it reads is not reasonable.

We ran strace on Nastran in both cases and noticed that after the
upgrade, a read operation reads only zero bytes, whereas before the
upgrade it reads 32768 bytes at the same point of the Nastran run.

The interesting part after the upgrade is here:
_llseek(12, 131072, [131072], SEEK_SET) = 0
read(12, "", 32768)               = 0

before the upgrade it looks like:
_llseek(12, 131072, [131072], SEEK_SET) = 0
read(12, "\1\0\0\0\30\0\0\0\0\0\0\0\22\0\0\21\0\0\0\0\0\0\0\0\0\0"..., 32768) =
32768

(longer parts are attached)

As you can see, that read returns zero bytes.
If we mount the filesystem with -o sync or -o noac,
the application starts working again.

Can you try this client-side mount option to test whether it solves
your problem? You can put it in an automount map...

cat /proc/mounts
rmcs33:/export /net/rmcs33/export nfs
rw,sync,nosuid,nodev,v3,rsize=8192,wsize=8192,hard,intr,tcp,lock,addr=rmcs33 0 0

But this option slows down all NFS operations.

There must have been a significant change in the NFS client caching
behaviour that makes a write with a subsequent read fail in some cases.
My understanding of POSIX (according to the German read manpage) is
that a read operation after a write operation inside one program has
to return the new (written) data.



Greetings

Hansjörg




strace after upgrade
5602  write(13, "\0\0\0\0\0\0\0\0\0\0\0\0\n\0\0\0\0 \0\0s\0\0\0\n\0\0\0"...,
32768) = 32768
5602  _llseek(14, 0, [0], SEEK_SET)     = 0
5602  write(14, "\0\0\0\0\0\0\0\0\0\0\0\0\n\0\0\0\0 \0\0s\0\0\0\n\0\0\0"...,
32768) = 32768
5602  _llseek(15, 0, [0], SEEK_SET)     = 0
5602  write(15, "\0\0\0\0\0\0\0\0\0\0\0\0\n\0\0\0\0 \0\0s\0\0\0\n\0\0\0"...,
32768) = 32768
5602  _llseek(16, 0, [0], SEEK_SET)     = 0
5602  write(16, "\0\0\0\0\0\0\0\0\0\0\0\0\n\0\0\0\0 \0\0s\0\0\0\n\0\0\0"...,
32768) = 32768
5602  _llseek(12, 98304, [98304], SEEK_SET) = 0
5602  write(12, "\1\0\0\0\35\0\0\0\0\0\0\0\2\0\0\21PROJVERS\4\0\0001\f\0"...,
32768) = 32768
5602  write(12, "\1\0\0\0\30\0\0\0\0\0\0\0\22\0\0\21\0\0\0\0\0\0\0\0\0\0"...,
32768) = 32768
5602  _llseek(12, 131072, [131072], SEEK_SET) = 0
5602  read(12, "", 32768)               = 0


strace before upgrade
29552 write(13, "\0\0\0\0\0\0\0\0\0\0\0\0\n\0\0\0\0 \0\0s\0\0\0\n\0\0\0"...,
32768) = 32768
29552 _llseek(14, 0, [0], SEEK_SET)     = 0
29552 write(14, "\0\0\0\0\0\0\0\0\0\0\0\0\n\0\0\0\0 \0\0s\0\0\0\n\0\0\0"...,
32768) = 32768
29552 _llseek(15, 0, [0], SEEK_SET)     = 0
29552 write(15, "\0\0\0\0\0\0\0\0\0\0\0\0\n\0\0\0\0 \0\0s\0\0\0\n\0\0\0"...,
32768) = 32768
29552 _llseek(16, 0, [0], SEEK_SET)     = 0
29552 write(16, "\0\0\0\0\0\0\0\0\0\0\0\0\n\0\0\0\0 \0\0s\0\0\0\n\0\0\0"...,
32768) = 32768
29552 _llseek(12, 98304, [98304], SEEK_SET) = 0
29552 write(12, "\1\0\0\0\35\0\0\0\0\0\0\0\2\0\0\21PROJVERS\4\0\0001\f\0"...,
32768) = 32768
29552 write(12, "\1\0\0\0\30\0\0\0\0\0\0\0\22\0\0\21\0\0\0\0\0\0\0\0\0\0"...,
32768) = 32768
29552 _llseek(12, 131072, [131072], SEEK_SET) = 0
29552 read(12, "\1\0\0\0\30\0\0\0\0\0\0\0\22\0\0\21\0\0\0\0\0\0\0\0\0\0"...,
32768) = 32768
29552 _llseek(12, 131072, [131072], SEEK_SET) = 0
29552 write(12, "\1\0\0\0\370\37\0\0\0\0\0\0\22\0\0\21\0\0\0\0\0\0\0\0\0"...,
32768) = 32768
29552 _llseek(12, 131072, [131072], SEEK_SET) = 0
29552 read(12, "\1\0\0\0\370\37\0\0\0\0\0\0\22\0\0\21\0\0\0\0\0\0\0\0\0"...,
32768) = 32768
29552 write(12, "\1\0\0\0\256\0\0\0\0\0\0\0\2\0\0\21DBSPACE \4\0\0001\244"...,
32768) = 32768
Comment 3 Steve Dickson 2007-01-04 20:32:15 EST
Would it be possible to get a bzip2-ed binary tethereal network trace of this?
Something similar to:
    ethereal -w /tmp/data.pcap host <server> ; bzip2 /tmp/data.pcap
Comment 4 Alexey Roytman 2007-01-07 04:46:52 EST
Created attachment 144999 [details]
Output of "tethereal -i any -w <the-file>" during login to KDE on problematic kernel
Comment 5 Alexey Roytman 2007-01-07 04:50:05 EST
The current configuration was updated:
$ rpm -qa | egrep 'glibc|kdebase|kdelibs|kernel' | sort
glibc-2.3.4-2.25
glibc-common-2.3.4-2.25
glibc-devel-2.3.4-2.25
glibc-headers-2.3.4-2.25
glibc-kernheaders-2.4-9.1.98.EL
kdebase-3.3.1-5.13
kdelibs-3.3.1-6.RHEL4
kernel-smp-2.6.9-34.0.1.EL
kernel-smp-2.6.9-42.0.2.EL
kernel-smp-2.6.9-42.0.3.EL
kernel-utils-2.4-13.1.83

Still, I've tested the issue on kernel-smp-2.6.9-42.0.2.EL, and I attach
the output of "tethereal -i any -w tethereal.-i_any.dump" taken during the
login to the KDE environment. For security, I've replaced internal info
(like encrypted passwords and additional user and private names) with 'X'-es.

The server's IP is 10.1.1.137, local machine's IP is 10.1.1.124.
Comment 6 Alexey Roytman 2007-01-07 04:56:27 EST
Is this enough, or did you want something different?
Comment 7 Hansjoerg Maurer 2007-01-08 01:27:11 EST
Hi

we had a problem which seems to be related to this one.
Due to our business contract with Red Hat, they provided a test kernel
which solved the problem.

The statement from tech support concerning the cause of the problem was:

This is an issue related with caching. This has been fixed in the test kernel
which I have provided. I am reporting the same to Engineering Team. This patch
should be released with RHEL 4 update 5. I would let you know when this would be
released

Changelog

-fix NFS cache consistency when a file is mmap'ed (Jeff Layton) [186142]

Unfortunately, bug 186142 is a private bug, not visible in Bugzilla.

The test kernel was:

kernel-smp-2.6.9-42.22.EL.i686.rpm

Maybe Red Hat can provide a link to download this, or a newer version
which includes the fix.

Then you could test whether it is the same problem.

Greeting

Hansjörg
Comment 8 Alexey Roytman 2007-01-08 02:49:04 EST
Hansjörg, thanks for the information.
The issue is not urgent for me; I'm using the older kernel.
But I'll note that I need either a kernel later than kernel-smp-2.6.9-42.22,
or one that fixes bug 186142.

Thank you. 
Comment 9 Steve Dickson 2007-01-08 10:11:17 EST
Jason,

Can you post a pointer to a newer kernel for these guys to test
against?
Comment 11 RHEL Product and Program Management 2007-05-09 05:46:24 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 12 RHEL Product and Program Management 2007-09-07 15:43:50 EDT
This request was previously evaluated by Red Hat Product Management
for inclusion in the current Red Hat Enterprise Linux release, but
Red Hat was unable to resolve it in time.  This request will be
reviewed for a future Red Hat Enterprise Linux release.
Comment 13 Jeff Layton 2007-09-10 07:44:35 EDT
From the comments above, it sounds like this was fixed in RHEL4.5. Could you
install kernel -55.EL or later and verify whether that is the case?
Comment 14 Alexey Roytman 2007-09-16 03:57:32 EDT
I've updated all packages and tried kernel-smp-2.6.9-55.0.6.EL, but it
still has the problem.
Comment 15 Jeff Layton 2007-09-16 06:57:13 EDT
I see a lot of these errors in the capture:

1398 147.740090    127.0.0.1 -> 127.0.0.1    Portmap V2 GETPORT Call
STAT(100024) V:1 UDP
1399 147.740633    127.0.0.1 -> 127.0.0.1    Portmap V2 GETPORT Reply (Call In
1398) PROGRAM_NOT_AVAILABLE

Can you run the following on the client and post the output?

# ps -ef | grep rpc.statd
# rpcinfo -p

...it seems like a missing statd shouldn't make things hang, but it's the only
thing NFS-related that I see that's amiss...

From an NFS standpoint, the last calls I see concern this filehandle:

919  30.276934   10.1.1.124 -> 10.1.1.137   NFS V3 LOOKUP Call,
DH:0xde4723bf/.qtrc.lock
920  30.277132   10.1.1.137 -> 10.1.1.124   NFS V3 LOOKUP Reply (Call In 919),
FH:0x37843785
...
936  30.973065   10.1.1.124 -> 10.1.1.137   NFS V3 GETATTR Call, FH:0x37843785
937  30.973255   10.1.1.137 -> 10.1.1.124   NFS V3 GETATTR Reply (Call In 936) 
Regular File mode:0600 uid:1001 gid:10
938  30.973335   10.1.1.124 -> 10.1.1.137   TCP mdbs_daemon > nfs [ACK]
Seq=39888 Ack=127164 Win=18876 Len=0 TSV=42005 TSER=480933153

...after that, there are no more NFS calls. A minute and a half or so later, the
sockets get closed:

1279 124.428741   10.1.1.124 -> 10.1.1.137   TCP 799 > nfs [FIN, ACK] Seq=0
Ack=0 Win=1460 Len=0 TSV=135470 TSER=480726606
1280 124.428840   10.1.1.137 -> 10.1.1.124   TCP nfs > 799 [FIN, ACK] Seq=0
Ack=1 Win=1448 Len=0 TSV=481026623 TSER=135470
1281 124.428895   10.1.1.124 -> 10.1.1.137   TCP 799 > nfs [ACK] Seq=1 Ack=1
Win=1460 Len=0 TSV=135470 TSER=481026623
1282 124.467721   10.1.1.124 -> 10.1.1.125   TCP 798 > nfs [FIN, ACK] Seq=0
Ack=0 Win=1460 Len=0 TSV=135509 TSER=34303508
1283 124.467904   10.1.1.125 -> 10.1.1.124   TCP nfs > 798 [ACK] Seq=0 Ack=1
Win=10136 Len=0 TSV=34333504 TSER=135509
1284 124.468150   10.1.1.125 -> 10.1.1.124   TCP nfs > 798 [FIN, ACK] Seq=0
Ack=1 Win=10136 Len=0 TSV=34333504 TSER=135509
1285 124.468199   10.1.1.124 -> 10.1.1.125   TCP 798 > nfs [ACK] Seq=1 Ack=1
Win=1460 Len=0 TSV=135509 TSER=34333504
Comment 16 Alexey Roytman 2007-09-16 07:19:18 EDT
I cannot reboot the machine now, so I've done it on the good kernel:
kernel-smp-2.6.9-34.0.1.EL

1. "ps -ef | grep rpc.statd | grep -v grep" gives an empty output
2. "rpcinfo -p" gives:
   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100021    1   udp  32768  nlockmgr
    100021    3   udp  32768  nlockmgr
    100021    4   udp  32768  nlockmgr
    100021    1   tcp  32772  nlockmgr
    100021    3   tcp  32772  nlockmgr
    100021    4   tcp  32772  nlockmgr
    100007    2   udp    970  ypbind
    100007    1   udp    970  ypbind
    100007    2   tcp    973  ypbind
    100007    1   tcp    973  ypbind

If you strictly need the output from the problematic kernel, reply and I'll
post the data later, after a reboot.
Comment 17 Alexey Roytman 2007-09-16 07:20:56 EDT
Also, even on the good kernel, /var/log/messages has plenty of:
  Sep 16 10:47:56 frog kernel: lockd: cannot monitor 10.1.1.137
  Sep 16 10:47:56 frog kernel: lockd: failed to monitor 10.1.1.137
(The 10.1.1.137 is the NFS server.)
Comment 18 Jeff Layton 2007-09-16 07:42:41 EDT
What version of nfs-utils are you using?

# rpm -q nfs-utils
Comment 19 Alexey Roytman 2007-09-16 07:58:14 EDT
There is no nfs-utils package installed.

  $ rpm -qa | grep -i nfs
  nfs-utils-lib-1.0.6-8
  system-config-nfs-1.2.8-1
Comment 20 Jeff Layton 2007-09-16 10:38:53 EDT
Ok. The statd thing is probably a red herring anyway...

From the capture, it looks like the server is doing everything it should, so
the problem is likely on the client. Unfortunately, a KDE login is a pretty
complex process. We'll need to narrow it down to a single process (or
processes) that's actually hanging, and understand what it's waiting for...

Once you do narrow it down to a single process, collect several snapshots of
sysrq-t output from the kernel and get a stack trace of the hung processes.
That should help us determine what they're doing (presuming they're in kernel
space) and whether they're actually hung...

If you need help doing this, I'd suggest opening a support case. Our support
people should be able to help you narrow this down. If you do, be sure to
mention that you've opened this BZ as well.
Comment 21 Alexey Roytman 2007-09-17 04:33:19 EDT
Well, I've traced the KDE login and found that ksplash was using a lock file
and hanging on it. I then replaced the ksplash binary with a wrapper that
runs it under strace, and found the problematic call:
    open("/users/alexey/.qt/.qtrc.lock", O_RDWR|O_CREAT|O_LARGEFILE, 0600) = 4
    fcntl64(4, F_SETLKW64, {type=F_RDLCK, whence=SEEK_SET, start=0, len=0}
The latter call hangs.

I've written a small C program that confirms this; it also hangs on the
fcntl() call. On a "good" kernel it shows:
   ./a.out: pid=3600
   testing /users/alexey/lockfile...
   fcntl (may hang!)...
   fcntl (did not hang, no bug) ==> -1 (37:No locks available)
On a "bad" kernel it hangs:
   ./a.out: pid=3600
   testing /users/alexey/lockfile...
   fcntl (may hang!)...

I pressed Ctrl-ScrollLock (to dump task states) and obtained the following information:

  a.out         D 00000000  1628  5383   5222                     (NOTLB)
  d53b0b40 00000082 00000003 00000000 d72b0280 00000247 00000000 e0b677eb
       00000000 d7bb1180 00000246 c140fde0 00000001 0000b041 37cb1193 000001ec
       dfe310b0 decfc1d0 decfc33c d72b0300 c0120458 d72b0280 d72b0300 d53b0b6c
  Call Trace:
   [<e0b677eb>] rpc_run_child+0xd4/0xe2 [sunrpc]
   [<c0120458>] prepare_to_wait+0x12/0x4c
   [<e0b67005>] __rpc_execute+0x1a0/0x326 [sunrpc]
   [<c012052d>] autoremove_wake_function+0x0/0x2d
   [<c012052d>] autoremove_wake_function+0x0/0x2d
   [<c012052d>] autoremove_wake_function+0x0/0x2d
   [<e0b635de>] rpc_call_sync+0x68/0x8c [sunrpc]
   [<e0b3f507>] nsm_mon_unmon+0xaf/0xd0 [lockd]
   [<e0b3f555>] nsm_monitor+0x2d/0x56 [lockd]
   [<e0b3beaf>] nlmclnt_lock+0x22/0xc7 [lockd]
   [<e0b3b925>] nlmclnt_proc+0x217/0x2bf [lockd]
   [<e0bcc4f8>] do_setlk+0xc3/0x186 [nfs]
   [<e0bcc5bb>] nfs_lock+0x0/0xe2 [nfs]
   [<c016ec06>] fcntl_setlk+0x169/0x2ec
   [<c01fdc61>] tty_ldisc_deref+0x50/0x5f
   [<c01fea00>] tty_write+0x258/0x262
   [<c020322d>] write_chan+0x0/0x1d0

Hope this helps.
Comment 22 Alexey Roytman 2007-09-17 04:34:23 EDT
Created attachment 197201 [details]
fcntl() simulation that hangs on "bad" kernels
Comment 23 Jeff Layton 2007-09-18 09:07:34 EDT
Thanks for the reproducer. I'll give it a go and let you know what I find.
Comment 24 Jeff Layton 2007-09-18 10:15:41 EDT
I'm seeing a similar hang, but the stack trace looks a bit different. That's
probably attributable to the architecture difference (x86_64 in my case):

lock-nfs-bug  D 0000000000000000     0 11168  11069                     (NOTLB)
ffffff8003c17918 0000000000000286 ffffff801f5777f0 0000007300000000 
       ffffff801ecde7f0 0000000000002062 00011bae13161699 ffffffff80322140 
       ffffff801ecdea88 0000000000000000 
Call Trace:<ffffffffa00a4718>{:sunrpc:__rpc_atrun+0}
<ffffffffa00a4718>{:sunrpc:__rpc_atrun+0} 
       <ffffffffa00a421b>{:sunrpc:rpc_sleep_on+50}
<ffffffffa00a4db6>{:sunrpc:__rpc_execute+462} 
       <ffffffff8012dadc>{autoremove_wake_function+0}
<ffffffff8012dadc>{autoremove_wake_function+0} 
       <ffffffffa00a079c>{:sunrpc:rpc_call_sync+114}
<ffffffffa013ee13>{:lockd:nsm_mon_unmon+299} 
       <ffffffffa00a0224>{:sunrpc:rpc_create_client+548}
<ffffffffa013eec3>{:lockd:nsm_monitor+50} 
       <ffffffffa013afb6>{:lockd:nlmclnt_proc+900}
<ffffffffa015307c>{:nfs:nfs_lock+980} 
       <ffffffff8018a066>{fcntl_setlk+305} <ffffffff8029454a>{thread_return+0} 
       <ffffffff802945a2>{thread_return+88} <ffffffff801863cb>{sys_fcntl+736} 
       <ffffffff8010d636>{system_call+134} <ffffffff8010d5b0>{system_call+0} 
Comment 25 Jeff Layton 2007-09-18 10:33:52 EDT
The stack trace seems to indicate that the client is getting hung up while
trying to set up statd monitoring. Starting statd works around it (i.e.,
install the nfs-utils package and run "service nfslock start"). I don't think
statd is considered to be required, but using it is highly recommended.
Without it, locking won't be reliable if the server reboots.

Regardless though, the fcntl call shouldn't hang indefinitely like this.
Comment 26 Peter Staubach 2007-09-18 11:04:02 EDT
I might suggest that statd is required for proper locking support; it
is a little more than just recommended. Applications use locking for
consistency purposes, and invisibly losing a lock, because statd was
not there to recover it, would be considered a very bad thing,
tantamount to data corruption.
Comment 27 Jeff Layton 2007-09-18 11:10:25 EDT
Good point. Perhaps we should just shoot for handling a missing statd
more gracefully. Maybe when statd isn't up, locking calls should return
an error of some sort?

I'm also checking this against rawhide to see what the behavior is like there...
Comment 28 Alexey Roytman 2007-09-18 11:21:23 EDT
Quoting myself:
   fcntl (did not hang, no bug) ==> -1 (37:No locks available)
Comment 29 Jeff Layton 2007-09-18 14:59:42 EDT
I think the issue centers on some changes that went into
pmap_getport_done(). In the -34-ish timeframe, when the client wasn't able
to get a port, the call would be aborted and an error returned. With the
introduction of that patch, the portmap call retries indefinitely. I'm
still tracking down the BZ that made that change.
Comment 30 Jeff Layton 2007-09-18 15:17:39 EDT
I think the patch that introduced this is the one for bug #172082.

The main issue is this part in pmap_getport_done:

 	} else if (clnt->cl_port == 0) {
 		/* Program not registered */
 		task->tk_status = -EACCES;
-		task->tk_action = NULL;

...taking that tk_action = NULL out seems to be what causes the nsm RPC call
to retry talking to the portmapper indefinitely. I think this part of that
patch may not actually be needed to fix the problem originally reported in
that BZ...


When I replaced both of those lines with:

         rpc_exit(task, EACCES);

...the test program errored out with ENOLCK immediately when statd was down.
After shutting down portmapper, it hung (which seems to be what we want to
happen, according to bug #172082).
Comment 31 Jeff Layton 2007-09-18 21:05:08 EDT
Created attachment 198961 [details]
proposed patch

This patch seems to fix it, but I'm not clear on whether it might cause the
earlier BZ to regress somehow. It doesn't seem to, but it needs a bit more
testing before I'm comfortable with it.

In any case, I've built some test kernels with this patch and put them on my
people page:

http://people.redhat.com/jlayton

...please test them and let me know if that fixes the issue for you.

Note that, as Peter pointed out, statd is really not an optional component
here. So while this patch may make lock requests error out when statd isn't
running, you most certainly don't want to run your hosts without it...
Comment 32 Alexey Roytman 2007-09-23 04:16:30 EDT
The patch fixes the problem I've reported.

$ uname -a
Linux frog 2.6.9-59.EL.jtltest.18smp #1 SMP Tue Sep 18 16:16:25 EDT 2007 i686
i686 i386 GNU/Linux
Comment 35 Jeff Layton 2007-09-24 13:40:41 EDT
Steve had a question about why/how this works...

When setting up the statd client, the code does a call_allocate, which calls
call_bind. call_bind does this:

        task->tk_action = call_connect;
        if (!clnt->cl_port) {
                task->tk_action = call_bind_status;
                task->tk_timeout = RPC_CONNECT_TIMEOUT;
                rpc_getport(task, clnt);
        }

If cl_port is non-zero (indicating a successful portmap query), the next
call is call_connect. If cl_port is 0, we fall into the if statement here:
rpc_getport is called, and rpciod then calls call_bind_status to handle the
result. For most error conditions (including -EACCES), call_bind_status does
a "goto retry_bind;", which does:

retry_bind:
        task->tk_status = 0;
        task->tk_action = call_bind;
        return;


...and so we fall back into call_bind again. So in this situation, where the
portmapper is responding but statd isn't up, we continuously loop through:

call_bind
rpc_getport
pmap_getport_done
call_bind_status

(...there are some other calls in between, but you get the idea...)

...until statd comes up. The patch breaks that cycle by having
pmap_getport_done set task->tk_action = NULL, which makes rpciod return
-EACCES back to the caller.
Comment 36 Steve Dickson 2007-10-03 09:50:28 EDT
> This patch breaks that cycle by having rpc_getport_done
> set task->tk_action=NULL which makes rpciod return -EACCES back to the caller.
Ok, and why is this a good thing? I would think that as long as the call is
interruptible (when using -o intr), the bind loop should continue.

What happens if the portmapper is just slow to respond (because it's
single-threaded and in the middle of a mount storm)?



Comment 37 Jeff Layton 2007-10-03 09:58:54 EDT
> Ok, and why is this a good thing? I would think that as long as the call
> is interruptible (when using -o intr), the bind loop should continue.

If the situation were one likely to resolve itself in time, I'd agree with
you. But I'd submit that statd not being registered with the portmapper is
not a situation likely to be resolved by just waiting long enough; it most
likely means that statd simply wasn't started. Returning an error in that
situation seems OK: it's what upstream does, and it should cue the user that
something is wrong with locking.

> What happens if the portmapper is just slow to respond
> (because its single threaded and its in the middle of a
> mount storm)?

That situation won't be affected by this patch. It only changes the case
where the portmapper has responded and said "no such service". If the
portmapper isn't responding at all, the client will continue to loop as it
does today.
Comment 39 Jeff Layton 2007-11-05 10:55:30 EST
I've rescinded this patch, as it could cause lock reclamation to fail (à la
the RHEL5 bug 240976). I'm currently looking at upstream behavior -- perhaps
we can turn these situations into an RPC timeout event and thereby make the
behavior contingent on the soft/hard mount settings.
Comment 40 Jeff Layton 2008-01-07 15:05:14 EST
Created attachment 291005 [details]
patch -- clean up soft task error handling

Backported upstream patch to clean up soft task error handling.
Comment 41 Jeff Layton 2008-01-07 16:19:23 EST
Patch seems to make RHEL4's behavior consistent with how RHEL5 now behaves with
the patch for bug 240976. The lock attempt times out after ~30 seconds of trying
to query statd.

I want to do a bit more testing, but it looks good so far...
Comment 43 Vivek Goyal 2008-02-26 17:09:28 EST
Committed in 68.13. RPMS are available at http://people.redhat.com/vgoyal/rhel4/
Comment 45 Jeff Layton 2008-06-10 06:19:45 EDT
*** Bug 448479 has been marked as a duplicate of this bug. ***
Comment 48 errata-xmlrpc 2008-07-24 15:11:47 EDT
An advisory has been issued which should help the problem described in this
bug report. This report is therefore being closed with a resolution of
ERRATA. For more information on the solution and/or where to find the
updated files, please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2008-0665.html
