Bug 476084 - kernel: NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff880052c57228!
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: kernel
Version: 10
Hardware: x86_64 Linux
Priority: low
Severity: high
Assigned To: Steve Dickson
QA Contact: Fedora Extras Quality Assurance
Reported: 2008-12-11 15:29 EST by Louis Lagendijk
Modified: 2009-08-15 07:21 EDT (History)
Doc Type: Bug Fix
Last Closed: 2009-08-12 15:40:47 EDT
Attachments: None
Description Louis Lagendijk 2008-12-11 15:29:51 EST
Description of problem:
I am using NFSv4 against a CentOS 5 server. The share is mounted with Kerberos. The relevant line of the fstab is:
nest.pheasant:/home1    /home/home1             nfs4    sec=krb5        0 0

At some point (mainly after heavy NFS activity) the kernel starts repeating "NFS: v4 server returned a bad sequence-id error" and does not recover. A reboot seems to be the only way to recover.

The Fedora 9 kernel occasionally shows the same kind of error, but it recovers from it; the Fedora 10 kernel does not.

Version-Release number of selected component (if applicable):
kernel-2.6.27.7-134.fc10.x86_64
nfs-utils-1.1.4-4.fc10.x86_64

How reproducible:
Under heavy NFS activity. I noticed the error especially while installing a virtual machine in VirtualBox, but I have also seen it while using Evolution.

Steps to Reproduce:
1. Generate heavy NFS activity on the mount.
  
Actual results:
The errors keep repeating in /var/log/messages.

Expected results:
The system recovers (or, even better, no such errors occur at all).

Additional info:
Comment 1 Louis Lagendijk 2008-12-11 16:40:58 EST
I forgot to show the error messages on the server side. Here is what the CentOS 5 server reports (from a recent Fedora 9 client; I am back on Fedora 9 for the time being because Fedora 10 is not usable):

NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 2, got 3)
NFSD: preprocess_seqid_op: bad seqid (expected 2, got 3)
NFSD: preprocess_seqid_op: bad seqid (expected 2, got 3)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)
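
A long run of near-identical lines like the one above is easier to read once it is tallied. The following is only a rough Python sketch, not part of the original report: the function name `tally_bad_seqids` and the regular expression are illustrative assumptions, and the sample lines are copied from the server output above. It counts each (expected, got) pair and prints the difference between the two values.

```python
import re
from collections import Counter

# Matches the NFSD message format quoted in this report.
# The pattern is an assumption based only on those quoted lines.
SEQID_RE = re.compile(r"bad seqid \(expected (\d+), got (\d+)\)")


def tally_bad_seqids(lines):
    """Count how often each (expected, got) seqid pair appears."""
    counts = Counter()
    for line in lines:
        match = SEQID_RE.search(line)
        if match:
            expected, got = int(match.group(1)), int(match.group(2))
            counts[(expected, got)] += 1
    return counts


# Sample lines copied from the server log in this comment.
sample = [
    "NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)",
    "NFSD: preprocess_seqid_op: bad seqid (expected 2, got 3)",
    "NFSD: preprocess_seqid_op: bad seqid (expected 8305, got 8306)",
]

for (expected, got), count in sorted(tally_bad_seqids(sample).items()):
    # delta shows how far the client's seqid is from what the server expected
    print(f"expected={expected} got={got} delta={got - expected:+d} count={count}")
```

To run it against a real log, pass an open file instead of `sample`, for example `tally_bad_seqids(open("/var/log/messages"))`. In every line quoted in this report the received value is exactly one higher than the expected one.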
Comment 2 Louis Lagendijk 2008-12-27 07:54:03 EST
With the latest kernel:
Linux travel.pheasant 2.6.27.9-159.fc10.x86_64 #1 SMP Tue Dec 16 14:47:52 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
the hangs have disappeared. This may be related to the kernel update on my CentOS server, which now runs:
Linux nest.pheasant 2.6.18-92.1.22.el5 #1 SMP Tue Dec 16 11:57:43 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
Comment 3 Steve Dickson 2008-12-29 10:25:37 EST
So we can close this bug?
Comment 4 Louis Lagendijk 2008-12-29 10:59:05 EST
The bug hit again today. A reboot solved the issue. I still do not know exactly how to trigger it. Here is an excerpt from the Fedora 10 messages file:
Dec 29 10:27:13 travel ntpd[3052]: synchronized to 81.171.44.131, stratum 2
Dec 29 10:27:11 travel ntpd[3052]: time reset -1.518701 s
Dec 29 10:27:11 travel ntpd[3052]: kernel time sync status change 0001
Dec 29 10:30:51 travel ntpd[3052]: synchronized to 81.171.44.131, stratum 2
Dec 29 10:58:01 travel yum: Updated: gstreamer-plugins-ugly-0.10.10-1.fc10.x86_64
Dec 29 12:01:04 travel kernel: NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff880053409a28!
Dec 29 12:01:04 travel kernel: NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff880053409628!
Dec 29 12:01:04 travel kernel: NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff880053409a28!
Dec 29 12:01:04 travel kernel: NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff880053409628!
Dec 29 12:01:04 travel kernel: NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff880053409a28!
and then it goes on and on.

This is what the CentOS log file has:
Dec 29 11:48:27 nest nmbd[4557]:   Cannot get workgroup name.
Dec 29 11:56:25 nest kernel: NFSD: preprocess_seqid_op: bad seqid (expected 27286, got 27287)
Dec 29 11:56:25 nest kernel: NFSD: preprocess_seqid_op: bad seqid (expected 2, got 3)
Dec 29 12:01:02 nest xinetd[4160]: START: hotwayd pid=8583 from=10.0.0.1
The following line then repeats many times:

Dec 29 12:01:03 nest kernel: NFSD: preprocess_seqid_op: bad seqid (expected 27286, got 27287)

CentOS is now running the Xen kernel:
Linux nest.pheasant 2.6.18-92.1.22.el5xen #1 SMP Tue Dec 16 12:26:32 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
Apart from this, nothing has changed.
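
The client-side messages quoted earlier in this comment can be grouped in a similar way, to see how many distinct sequences are involved. This is a minimal sketch under stated assumptions: the name `count_by_sequence` and the regular expression are hypothetical, and treating the trailing hexadecimal value as an identifier for one sequence is an assumption drawn from the message text, not something confirmed in this report.

```python
import re
from collections import Counter

# Captures the hexadecimal value at the end of the client message.
# Assumption: that value identifies one sequence on the client.
UNCONFIRMED_RE = re.compile(
    r"bad sequence-id error on an unconfirmed sequence ([0-9a-fA-F]+)!"
)


def count_by_sequence(lines):
    """Count client error messages per reported sequence value."""
    counts = Counter()
    for line in lines:
        match = UNCONFIRMED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the Fedora 10 messages excerpt in this comment.
sample = [
    "Dec 29 12:01:04 travel kernel: NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff880053409a28!",
    "Dec 29 12:01:04 travel kernel: NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff880053409628!",
    "Dec 29 12:01:04 travel kernel: NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff880053409a28!",
]

for sequence, count in count_by_sequence(sample).most_common():
    print(f"{sequence}: {count} message(s)")
```

In the excerpt above only two distinct values alternate, which matches the two (expected, got) pairs in the server log of this comment.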
Comment 5 Louis Lagendijk 2009-08-03 14:54:11 EDT
For a long time I ran a standard 2.6.28 kernel on my CentOS box and no longer had the problem. I have now booted the stock CentOS kernel 2.6.18-128.2.1.el5xen (x86_64) to play around with Xen, and the problem reappeared. It appears to be a CentOS/RHEL5 problem, not a Fedora one. Should I file a new BZ against RHEL5?
Comment 6 Steve Dickson 2009-08-04 07:49:39 EDT
yes... Close this one and please open a RHEL5 bz...
Comment 7 Danilo Câmara 2009-08-06 16:29:44 EDT
Today I installed a Fedora 11 client on my network and noticed this bug. My server also runs NFSv4 with Kerberos:

Server CentOS: kernel-2.6.18-128.2.1.el5
Client Fedora: kernel-2.6.29.6-217.2.3.fc11.x86_64
Comment 8 Louis Lagendijk 2009-08-12 15:40:06 EDT
Closing this BZ. I have not seen the bug since installing dzickus' experimental RHEL kernel 2.6.18-162.el5xen. I will report it against the RHEL kernel only if the problem reappears.
Comment 9 Louis Lagendijk 2009-08-15 07:21:33 EDT
Created a BZ for RHEL5:
https://bugzilla.redhat.com/show_bug.cgi?id=517629
