Bug 517629 - Sequence id issue with nfs4/kerberos between RHEL kernel and Fedora 11
Summary: Sequence id issue with nfs4/kerberos between RHEL kernel and Fedora 11
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: x86_64
OS: Linux
medium
medium
Target Milestone: ---
: ---
Assignee: Jeff Layton
QA Contact: Jian Li
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2009-08-15 10:59 UTC by Louis Lagendijk
Modified: 2014-03-04 00:06 UTC (History)
18 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-07-21 10:19:41 UTC


Attachments (Terms of Use)


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2011:1065 normal SHIPPED_LIVE Important: Red Hat Enterprise Linux 5.7 kernel security and bug fix update 2011-07-21 09:21:37 UTC

Description Louis Lagendijk 2009-08-15 10:59:31 UTC
Description of problem:
I am using nfsv4 from Fedora 11 towards a Centos-5 server. The server is mounted using NFSv4 with kerberos. The relevant part of the fstab looks as follows:
nest.pheasant:/home1    /home/home1             nfs4    sec=krb5        0 0
The problem appears since the Fedora 10 kernels (see BZ 476084) 

Version-Release number of selected component (if applicable):

kernel 2.6.18-162.el5xen (dzikus) but the same happens with stock kernels

How reproducible:
I have not found a way to reproduce the problem. It appears most often with Evolution

Actual results:
Fedora dmesg shows many lines like:
NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff8801afdd4e28!
NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff8801aedac628!
NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff8801aedac628!
NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff8801aedac228!
NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff8801afe72828!
NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff8801afe72e28!

and the Centos dmesg says:
NFSD: preprocess_seqid_op: bad seqid (expected 5764, got 5765)
NFSD: preprocess_seqid_op: bad seqid (expected 5764, got 5765)
NFSD: preprocess_seqid_op: bad seqid (expected 5764, got 5765)
NFSD: preprocess_seqid_op: bad seqid (expected 5764, got 5765)

Evolution hangs and needs a --force-shutdown to recover

Expected results:
No errors

Additional info:
On Fedora 9 the error also did show up, but the system recoved.
I have for a long time been using a self compiled 2.6.28 kernel on the Centos box for unrelated reasons. This kernel did no exhibit the problem. The issue seems to be rare: when I first switched to dzikus kernel a week ago, the problem seemed to have vanished. I now saw it happen twice within a day (on light personal use).

Comment 1 Jeff Layton 2009-08-17 13:01:45 UTC
Looks like this error is a dprintk on recent upstream kernels. Backporting 2fdada03b3876ab9f84ede160f187e888cafefb4 should silence it.

Still, would be nice to know what triggers this so we can determine if whether we have a client or server bug someplace that needs to be fixed.

Comment 2 Louis Lagendijk 2009-08-18 21:01:07 UTC
Hello Jeff,
- the problem does not appear when my server runs the 2.6.28 kernel
- the error message itself does not worry me, the hang on the client does. It happens primarily when evolution does a lot of disk IO, something I fail to understand as it uses dovecot, so I assume (no proof) that Evo updates it indexes etc., probably doing a lot of locking
- As said above, with Fedora 9 I saw the error messages, but no hangs (maybe Evo handled the situation more gracefully?). The systems DO recover, but more often than not, I need to force an Evo shutdown, everything else continues to work.
- I use a somewhat complex setup: NFSv4 + Kerberos

The first point above makes me believe this is a RHEL kernel bug, as 2.6.28 does not have the issue. Could the issue be related to Kerberos?

Comment 3 Louis Lagendijk 2009-10-31 14:29:54 UTC
Recent stock kernels (2.6.31.[45] and 2.6.31.5 show the problem as well. The hangings are there, but due to the demotion of the errors to dprintk() I do not see the error messages. I have no clue how to debug this....

Comment 4 Miguel Di Ciurcio Filho 2009-11-18 10:35:06 UTC
I've faced the same errors using a very similar setup. The server is CentOS 5.4 using NFS4+kerberos and clients are Fedora 9 up to 11 and Debian Lenny (2.6.26). We have our home directories on NFS using automount. It's very noticeable serious slowdowns when using Firefox.

Comment 5 Tim Nelson 2010-03-27 10:59:49 UTC
Just a note to let you know that I'm experiencing a similar problem.  

Client:

Linux blaise.nelson.org.au 2.6.29.6-1.rt23.4.fc11.ccrma.x86_64.rt #1 SMP PREEMPT RT Thu Sep 3 18:59:28 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

Server:

kernel 2.6.18-28 or so.  

I'm using LDAP instead of kerberos, though.  It prevents me from saving documents in OpenOffice.  Admittedly one of my documents is over 60,000 words.  

Thanks to whoever pointed out that the systems recover eventually.  I just need to practise my patience :).  

You'll note that my Fedora/client kernel is a ccrma realtime one, but unless others are doing that, that's probably not a relevant factor either.

Comment 6 Simone Caldana 2010-06-04 09:31:25 UTC
I am experiencing this as well, after upgrading to 2.6.18-200.el5 which I deployed trying to fix https://bugzilla.redhat.com/show_bug.cgi?id=551028

it is followed by (admittedly one hour later) 
Jun  4 09:18:07 fe01 kernel: NFS: v4 server returned a bad sequence-id error!
Jun  4 10:18:54 fe01 kernel: INFO: task java:28498 blocked for more than 120 seconds.
Jun  4 10:18:54 fe01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  4 10:18:54 fe01 kernel: java          D 000036AF  1892 28498      1         28499 28497 (NOTLB)
Jun  4 10:18:54 fe01 kernel:        d66e2e84 00000086 4f174382 000036af f0ac4ac0 f8c4e734 00001000 00000008 
Jun  4 10:18:54 fe01 kernel:        d58ddaa0 4f4207d1 000036af 002ac44f 00000000 d58ddbac c1c09900 f66d9200 
Jun  4 10:18:54 fe01 kernel:        d66e2e90 00000000 0000000f d66e2e90 00aafb06 c042c9d5 f65b784c ffffffff 
Jun  4 10:18:54 fe01 kernel: Call Trace:
Jun  4 10:18:54 fe01 kernel:  [<f8c4e734>] nfs_pagein_one+0x224/0x243 [nfs]
Jun  4 10:18:54 fe01 kernel:  [<c042c9d5>] getnstimeofday+0x30/0xb6
Jun  4 10:18:54 fe01 kernel:  [<c061d9d6>] io_schedule+0x36/0x59
Jun  4 10:18:54 fe01 kernel:  [<c0459568>] sync_page+0x0/0x3b
Jun  4 10:18:54 fe01 kernel:  [<c04595a0>] sync_page+0x38/0x3b
Jun  4 10:18:54 fe01 kernel:  [<c061dae8>] __wait_on_bit_lock+0x2a/0x52
Jun  4 10:18:54 fe01 kernel:  [<c04594e3>] __lock_page+0x52/0x59
Jun  4 10:18:54 fe01 kernel:  [<c04362b8>] wake_bit_function+0x0/0x3c
Jun  4 10:18:54 fe01 kernel:  [<c045bf27>] filemap_nopage+0x265/0x34a
Jun  4 10:18:54 fe01 kernel:  [<c046527f>] __handle_mm_fault+0x178/0xa25
Jun  4 10:18:54 fe01 kernel:  [<c047efbf>] sys_stat64+0x1e/0x23
Jun  4 10:18:54 fe01 kernel:  [<c061fba6>] do_page_fault+0x23a/0x52d
Jun  4 10:18:54 fe01 kernel:  [<c061f96c>] do_page_fault+0x0/0x52d
Jun  4 10:18:54 fe01 kernel:  [<c0405a89>] error_code+0x39/0x40
Jun  4 10:18:54 fe01 kernel:  =======================

the java process is a tomcat, and this is a dev server, so not much heavy traffic.

Comment 7 Andrew Bartlett 2010-06-13 03:24:21 UTC
I get this from Fedora 13: 2.6.33.5-112.fc13.x86_64 (but it's been a problem ever since I moved to NFSv4).  It is reliably triggered by Evolution. 

Server is likewise CentOS 5: 2.6.18-194.3.1.el5

Is there anything I can usefully provide to assist in finding this?

Comment 8 Christopher Heiny 2010-07-28 20:26:06 UTC
Also occurring on Fedora 13 2.6.33.6-147.fc13.x86_64.  Should I file a separate bug for that?

Comment 9 Matt Bernstein 2010-10-04 10:35:06 UTC
We're seeing this too, on a 2.6.18-194.11.4.el5.x86_64 file server with about 200 up-to-date (64-bit) F13 clients. It only happens under heavy load, like at the beginning of the academic year.

We're not using kerberos.

I don't think it's locking-related since the errors seem to persist after trying "echo 0 > /proc/sys/fs/leases-enable" on the server. (This apparently disables locking and delegation.)

Comment 10 Jeff Layton 2010-10-20 13:08:08 UTC
What would probably be a good start is a wire capture of this. Preferably, start the capture, mount up the filesystem and do whatever is needed to make the problem happen and then kill it off.

If the capture is large or you're not comfortable attaching it to the bug, send me an email and I'll let you know a location where you can FTP it.

Comment 11 Jeff Layton 2010-10-20 13:34:35 UTC
Oh and if you use tcpdump to do the capture, please make sure you're using a snarflen that's large enough to capture most of the compound RPCs. '-s512' should be sufficient.

Comment 12 Dr J Austin 2010-10-22 14:22:04 UTC
I have had very similar problems (F13 to Centos 5.5 server, NFS4, Evolution, iptables enabled on the client. All fully updated)

Has everyone opened up the call back port on the client ?
Just a thought - it fixed it for me

John

Force the client to tell the server which port to use for call back
ja@milos ~ 1$ cat /etc/modprobe.d/nfs.conf 
options nfs callback_tcpport=801

Open up the port on the client
[root@milos ~]# cat /etc/sysconfig/iptables
...
-A INPUT -m state --state NEW -m tcp -p tcp --dport 801 -s "server_ip_addr" -j ACCEPT
...

Reboot client

Comment 13 Jeff Layton 2010-10-25 19:35:22 UTC
Thanks, that's an interesting point. I wouldn't think that blocking the callback port would cause this. It should just make it so that the server won't hand you delegations.

It may be that opening the callback port simply papered over the bug by allowing delegations to work and thereby reducing the number of state-morphing operations.

I'd still like to see a capture of this in action if anyone can provide one. Note that I'll need a binary one (i.e.: tcpdump -w foo.pcap), as I'll likely need to sift through it with wireshark.

Comment 14 Dr J Austin 2010-10-26 08:56:19 UTC
I should have added some more detail but I wasn't convinced my problem was
relevant

My Symptoms were

Following a clean boot of the client and only after login
of a user with home directory on the NFS4 server the server thows

maui kernel: nfs4_cb: server "client_ip_addr" not responding, timed out

No apparent problem until the user uses evolution
Then the directories "can" become unreadable and evolution is unusable
A totally repeatable method of causing evo to fail was to start creating
a new email, no entries in the header required, start typing into the
text window and when RETURN is hit the "bad seqid" error is given by the server.

The server shows errors of the form
Oct 13 11:14:55 maui kernel: NFSD: preprocess_seqid_op: bad seqid (expected 2787, got 2788)
Oct 13 11:14:55 maui kernel: NFSD: preprocess_seqid_op: bad seqid (expected 2, got 3)
Oct 13 11:15:45 maui kernel: NFSD: preprocess_seqid_op: bad seqid (expected 2787, got 2788

Disabling iptables solved the problem ...
Having fixed the callback port (which made it easy to monitor traffic)
wireshark shows the server talking to the client just after an NFS4 login
Is it this transaction that is of interest?

Comment 15 Jeff Layton 2010-10-26 10:45:07 UTC
(In reply to comment #14)

> Disabling iptables solved the problem ...
> Having fixed the callback port (which made it easy to monitor traffic)
> wireshark shows the server talking to the client just after an NFS4 login
> Is it this transaction that is of interest?

It sounds sort of like the server or client may be incrementing or not incrementing a seqid when it should.

It would be ideal to have a capture of the entire problem taking place, preferably from mount time until it manifests itself. That would include all traffic between client and server on the nfs port (2049) and the client's callback port.

What we'd probably need you to do is re-enable iptables so that the callback port is blocked, start up a capture prior to mounting and then reproduce the problem. Once the problem manifests, kill the capture.

Compress the resulting capture file and either attach it here if it's small enough or email me and I'll give you a location to FTP it if it's large.

Comment 16 Dr J Austin 2010-10-27 10:29:12 UTC
I have captured the transactions during a failure

http://www.vigla.eclipse.co.uk/wireshark_capture_files.tar.gz

Hopefully they are useful (and this is the same problem as the bugzilla!)

I will remove the file as soon as you have down loaded it

Comment 17 Jeff Layton 2010-10-27 16:26:05 UTC
(In reply to comment #16)

> I will remove the file as soon as you have down loaded it

Thanks, got them. Let me look over them and I'll see what I can find.

Comment 18 Jeff Layton 2010-10-27 16:49:21 UTC
I had a look and I'm afraid I don't see any BAD_SEQID errors in that capture. It looks like it may have been stopped too quickly?

The last packet in the port 2049 capture is at 10:12:44, but the first bad-seqid printk pop was at 10:13:04. If you can provide a more complete capture that may be helpful.

Comment 19 Dr J Austin 2010-10-28 11:22:36 UTC
Second attempt !

http://www.vigla.eclipse.co.uk/wireshark_mk2.tar.gz

Hopefully its OK now

I started the capture before the client booted
The file contains all the data but for the ypserv traffic

I do not see anything in wireshark that says "BAD_SEQID" but there
are quite a few re-transmission, lost segments, RCP dup ...

Comment 20 Jeff Layton 2010-10-29 14:42:52 UTC
Thanks, that helps...

Ok, so I see 2 BAD_SEQID errors in the capture in frames 16391 and 16405. Here's the whole sequence from open-to-close (which is where the error comes in):

...file is opened read-only:

16386  84.500290 148.197.29.246 -> 148.197.29.5 NFS V4 COMP Call <EMPTY> PUTFH;SAVEFH SAVEFH;OPEN OPEN;GETFH GETFH;GETATTR GETATTR;RESTOREFH RESTOREFH;GETATTR GETATTR
16387  84.500382 148.197.29.5 -> 148.197.29.246 NFS V4 COMP Reply (Call In 16386) <EMPTY> PUTFH;SAVEFH SAVEFH;OPEN OPEN;GETFH GETFH;GETATTR GETATTR;RESTOREFH RESTOREFH;GETATTR GETATTR

...attempt to get a write lock on the file, which fails with NFS4ERR_OPENMODE (seems like correct behavior here):

16388  84.500513 148.197.29.246 -> 148.197.29.5 NFS V4 COMP Call <EMPTY> PUTFH;LOCK LOCK
16389  84.500580 148.197.29.5 -> 148.197.29.246 NFS V4 COMP Reply (Call In 16388) <EMPTY> PUTFH;LOCK LOCK(10038)

...then we try to close the file and get a NFS4ERR_BAD_SEQID:

16390  84.500700 148.197.29.246 -> 148.197.29.5 NFS V4 COMP Call <EMPTY> PUTFH;CLOSE CLOSE;GETATTR GETATTR
16391  84.500778 148.197.29.5 -> 148.197.29.246 NFS V4 COMP Reply (Call In 16390) <EMPTY> PUTFH;CLOSE CLOSE(10026)

...it seems somewhat likely that the problem is related to bad lock state handling here, but I'll need to look over it more closely to understand whether it's a client or server problem.

Comment 21 Jeff Layton 2010-10-29 15:00:45 UTC
From RFC3530:

   The client MUST monotonically increment the sequence number for the
   CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE
   operations.  This is true even in the event that the previous
   operation that used the sequence number received an error.  The only
   exception to this rule is if the previous operation received one of
   the following errors: NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID,
   NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID, NFS4ERR_BADXDR,
   NFS4ERR_RESOURCE, NFS4ERR_NOFILEHANDLE.

The seqid's from the client look correct:

    OPEN: 0x000003be
    LOCK: 0x000003bf
    CLOSE: 0x000003c0

...so this does look like a server bug. I'll have to look over the cause, but I suspect that the LOCK failure didn't cause the seqid on the server to get incremented like it should have been.

Comment 22 Jeff Layton 2010-10-29 15:16:12 UTC
Looks like we might need this patch:

commit 39325bd03fc16d903f1e0f51104436d939899c8c
Author: J. Bruce Fields <bfields@citi.umich.edu>
Date:   Mon Nov 26 17:06:39 2007 -0500

    nfsd4: fix bad seqid on lock request incompatible with open mode
    
    The failure to return a stateowner from nfs4_preprocess_seqid_op() means
    in the case where a lock request is of a type incompatible with an open
    (due to, e.g., an application attempting a write lock on a file open for
    read), means that fs/nfsd/nfs4xdr.c:ENCODE_SEQID_OP_TAIL() never bumps
    the seqid as it should.  The client, attempting to close the file
    afterwards, then gets an (incorrect) bad sequence id error.  Worse, this
    prevents the open file from ever being closed, so we leak state.
    
    Thanks to Benny Halevy and Trond Myklebust for analysis, and to Steven
    Wilton for the report and extensive data-gathering.
    
    Cc: Benny Halevy <bhalevy@panasas.com>
    Cc: Steven Wilton <steven.wilton@team.eftel.com.au>
    Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
    Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

I'll plan to add this to my test kernels in the near future.

Comment 23 Jeff Layton 2010-10-29 20:10:54 UTC
Ok, the RHEL5 kernels on this page (jtltest.123) have this patch:

     http://people.redhat.com/jlayton/

...if anyone here is able to test these on their NFS server, please report the results.

Comment 24 Dr J Austin 2010-11-01 10:17:31 UTC
Server maui
ja@maui ~ 8$ uname -a
Linux maui.jaa.org.uk 2.6.18-230.el5.jtltest.123 #1 SMP Fri Oct 29 ...

I have run with above over the weekend and have not seen any
NFSD: preprocess_seqid_op: bad seqid errors(with client call back port closed)
nor any other ill effects !

Looking very good - but I am a very light, home user.

How long will it be before the fix is included in a "released" kernel?
(I'm lazy and I am using things such as
nvidia-graphics260.19.12-kmdl-2.6.18-194.17.4.el5-260.19.12-129.el5.x86_64 ATrpms.net on the server and currently have no "X").

Many thanks
John

Comment 25 Jeff Layton 2010-11-01 11:17:51 UTC
(In reply to comment #24)
> Server maui
> ja@maui ~ 8$ uname -a
> Linux maui.jaa.org.uk 2.6.18-230.el5.jtltest.123 #1 SMP Fri Oct 29 ...
> 
> I have run with above over the weekend and have not seen any
> NFSD: preprocess_seqid_op: bad seqid errors(with client call back port closed)
> nor any other ill effects !
> 
> Looking very good - but I am a very light, home user.
> 

Thanks for testing it. You'll probably want to open that port back up. Otherwise you don't get one of the main benefits of NFSv4 (delegations).

> How long will it be before the fix is included in a "released" kernel?

5.7 unless someone with a support contract wants to open a support case and agitate for earlier inclusion.

Comment 26 Klaus Steinberger 2010-12-01 16:13:50 UTC
This bug hits us too very hard, we have a large filerserver with NFSv4 and currently kernel-xen-2.6.18-194.26.1.el5.x86_64 and more than 100 clients.

There daily at least some clients which need a reboot after NFSD: preprocess_seqid_op: bad seqid errors

So a backport to 5.5 would be really nice.

Sincerly,
Klaus Steinberger

Comment 27 Jeff Layton 2010-12-01 16:39:43 UTC
(In reply to comment #26)
> This bug hits us too very hard, we have a large filerserver with NFSv4 and
> currently kernel-xen-2.6.18-194.26.1.el5.x86_64 and more than 100 clients.
> 
> There daily at least some clients which need a reboot after NFSD:
> preprocess_seqid_op: bad seqid errors
> 
> So a backport to 5.5 would be really nice.
> 

The patch should apply to 5.5-ish kernels without too much trouble. Currently this is on track for inclusion in 5.7.

If you want to see it earlier than that, then you'll need to open a support case, reference this bug and make a case for its inclusion into z-stream. It's a fairly small, targeted patch so it seems reasonable for inclusion there.

Comment 29 RHEL Product and Program Management 2011-02-01 16:54:55 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 33 yanfu,wang 2011-02-16 03:58:26 UTC
hi Jeff,
could customer help to test it since he hold the environment or give me some advices for how to verify the bug? thanks.

Comment 35 Jeff Layton 2011-02-16 11:41:38 UTC
(In reply to comment #33)
> hi Jeff,
> could customer help to test it since he hold the environment or give me some
> advices for how to verify the bug? thanks.

You certainly can ask them to help verify it (see comment #24 where he already did). I suspect that we can also write a reproducer for this that opens a file read-only on the mount, tries to get a write lock (which should then fail) and then closes the file.

Comment 37 Jarod Wilson 2011-02-21 20:55:57 UTC
in kernel-2.6.18-245.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 39 Jian Li 2011-06-13 06:58:40 UTC
Source is checked:
[root@beaker rhel5]# git log --oneline 2.6.18-244.el5...2.6.18-245.el5 | grep "nfsd4: fix"
33b9e69 [fs] nfsd4: fix seqid on lock req incompat w/open mode

This test use tips of comment 35, flock() instead of fcntl() would reproduce the bug. Details is listed below, reproduced in 2.6.18-126. This bug is caused by server.

[[client]]
** mount:
10.66.86.226:/ on /mnt/test type nfs4 (rw,addr=10.66.86.226)
** steps
cd /mnt/test
touch test
./flock test

[[server]]
** /etc/exports
/test *(rw,no_root_squash,fsid=0)
** dmesg
NFSD: preprocess_seqid_op: bad seqid (expected 3, got 4)
** other message appears in console, which is not in dmesg, maybe in syslog.

flock.c used in test:
** ** **
#include <stdio.h>
#include <fcntl.h>

int main(int argc, char **argv) {
	if (argc < 2 ) {
		printf("./flock file\n");
		return 0;
	}

	int fd = open(argv[1], O_RDONLY);
	if (fd <= 0) {
		perror("open");
		return 0;
	}

	if (flock(fd, LOCK_EX) == -1) {
		perror("flock");
		close(fd);
		return 0;
	}

	printf("test finish\n");
	close(fd);
	return 0;
}
** ** **

In 2.6.18-262, this message doesn't appear in dmesg. With comment 24, this bug is verified.

Comment 40 errata-xmlrpc 2011-07-21 10:19:41 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html


Note You need to log in before you can comment on or make changes to this bug.