Bug 833746 - Wrong byte order in comparison causes a wrong return value from function nlmclnt_cancel
Status: CLOSED DUPLICATE of bug 526829
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.8
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: nfs-maint
QA Contact: Red Hat Kernel QE team
Depends On:
Blocks:
Reported: 2012-06-20 05:20 EDT by Menny Hamburger
Modified: 2012-11-27 10:16 EST
CC: 2 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-11-27 10:16:16 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Menny Hamburger 2012-06-20 05:20:44 EDT
When the NLM client sends a lock request, it waits a certain amount of time, after which it gives up and sends a cancel request.
In some situations the lock may actually have been granted but the reply delayed, so the cancel fails, which causes the kernel to send an unlock.

This situation is tested in nlmclnt_cancel by the check:
(status == 0 && req->a_res.status == nlm_lck_denied)
The problem here is that req->a_res.status is returned in host byte order, while nlm_lck_denied is defined in network byte order, so the comparison fails.
Comment 2 J. Bruce Fields 2012-11-26 12:14:11 EST
(In reply to comment #0)
> The problem here is that req->a_res.status is returned in host byte order

Are you sure about that?  Looking at the code, I see a_res.status is defined in include/linux/lockd/xdr.h as __be32 (hence network order), and fs/lockd/xdr.c:nlmsvc_decode_res() does indeed read it without any byte-swapping.

What kernel exactly are you testing?  It looks like some fixes in this area went into 2.6.18-247.el5.
Comment 3 Menny Hamburger 2012-11-27 06:05:58 EST
I see the fix went in between the RHEL 5.6 and RHEL 5.7 kernels.
I think this is a case of misinformation on our side about which kernel the problem was witnessed on. I usually detect these issues while going through the set of patches, but RHEL 5.7 no longer has the specific patch organization that exists in RHEL 5.6.

Sorry for the hassle.
Comment 4 J. Bruce Fields 2012-11-27 10:16:16 EST
OK, thanks.  Then it looks like the correct thing is to close this as a duplicate of the (already closed) bug 526829.

*** This bug has been marked as a duplicate of bug 526829 ***
