Bug 661730
Summary: | NFS4 clients cannot reclaim locks after server reboot [rhel-6.0.z]
---|---
Product: | Red Hat Enterprise Linux 6
Reporter: | RHEL Program Management <pm-rhel>
Component: | kernel
Assignee: | Frantisek Hrbata <fhrbata>
Status: | CLOSED ERRATA
QA Contact: | Red Hat Kernel QE team <kernel-qe>
Severity: | high
Docs Contact: |
Priority: | high
Version: | 6.0
CC: | bfields, dhoward, jlayton, jmalanik, pbenas, pm-eus, rwheeler, sprabhu, steved, tscofield, yanwang
Target Milestone: | rc
Keywords: | ZStream
Target Release: | ---
Hardware: | All
OS: | Linux
Whiteboard: |
Fixed In Version: | kernel-2.6.32-71.16.1.el6
Doc Type: | Bug Fix
Doc Text: | The lock reclaim operation on a Red Hat Enterprise Linux 6 NFSv4 client did not work properly when, after a server reboot, an I/O operation that resulted in a STALE_STATEID response was performed before the RENEW call was sent to the server. This behavior was caused by improper use of the state flags. While investigating this bug, a different bug was discovered in the state recovery operation, which resulted in a reclaim thread looping in the nfs4_reclaim_open_state() function. With this update, both operations have been fixed and work as expected.
Story Points: | ---
Clone Of: |
Environment: |
Last Closed: | 2011-02-22 17:40:02 UTC
Type: | ---
Regression: | ---
Mount Type: | ---
Documentation: | ---
CRM: |
Verified Versions: |
Category: | ---
oVirt Team: | ---
RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | ---
Target Upstream Version: |
Embargoed: |
Bug Depends On: | 638269
Bug Blocks: |
Attachments: |
Description
RHEL Program Management
2010-12-09 14:35:25 UTC
Created attachment 478366
tcpdump on kernel-2.6.32-71.18.1.el6

Hi, I encountered the same problem as in comment #5: the reproducer process continued writing after the server reboot. What do you think about the test result? I've attached the tcpdump log; please refer to the attachment above. Thanks.

Created attachment 479031
full log on kernel 2.6.32-71.18.1.el6

Using the above test steps, and crashing the NFS server with

# virsh destroy rhel6.0; virsh start rhel6.0

I finally get these results:

57.141223 10.16.42.210 -> 10.66.65.95 NFS [RPC retransmission of #394][TCP Retransmission] V4 COMP Call <EMPTY> PUTFH;WRITE WRITE;GETATTR GETATTR
60.717421 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> PUTFH;WRITE WRITE;GETATTR GETATTR
61.005714 10.66.65.95 -> 10.16.42.210 NFS V4 COMP Reply (Call In 740) <EMPTY> PUTFH;WRITE WRITE(10023)
61.005837 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> RENEW RENEW
61.289059 10.66.65.95 -> 10.16.42.210 NFS V4 COMP Reply (Call In 745) <EMPTY> RENEW RENEW(10022)
61.289117 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> SETCLIENTID SETCLIENTID
61.572563 10.66.65.95 -> 10.16.42.210 NFS V4 COMP Reply (Call In 752) <EMPTY> SETCLIENTID SETCLIENTID
61.572591 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> SETCLIENTID_CONFIRM SETCLIENTID_CONFIRM;PUTROOTFH PUTROOTFH;GETATTR GETATTR
61.856146 10.66.65.95 -> 10.16.42.210 NFS V4 COMP Reply (Call In 756) <EMPTY> SETCLIENTID_CONFIRM SETCLIENTID_CONFIRM;PUTROOTFH PUTROOTFH;GETATTR GETATTR
61.856182 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> PUTFH;OPEN OPEN;GETATTR GETATTR
62.137404 10.66.65.95 -> 10.16.42.210 NFS V1 CB_NULL Call
62.137444 10.16.42.210 -> 10.66.65.95 NFS V1 CB_NULL Reply (Call In 764)
62.142706 10.66.65.95 -> 10.16.42.210 NFS V4 COMP Reply (Call In 760) <EMPTY> PUTFH;OPEN OPEN;GETATTR GETATTR
62.142760 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> PUTFH;LOCK LOCK
62.427752 10.66.65.95 -> 10.16.42.210 NFS V4 COMP Reply (Call In 768) <EMPTY> PUTFH;LOCK LOCK
62.427803 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> PUTFH;WRITE WRITE;GETATTR GETATTR
62.711675 10.66.65.95 -> 10.16.42.210 NFS V4 COMP Reply (Call In 777) <EMPTY> PUTFH;WRITE WRITE;GETATTR GETATTR
62.711751 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> PUTFH;COMMIT COMMIT;GETATTR GETATTR
63.007462 10.66.65.95 -> 10.16.42.210 NFS V4 COMP Reply (Call In 779) <EMPTY> PUTFH;COMMIT COMMIT;GETATTR GETATTR
64.007684 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> PUTFH;WRITE WRITE;GETATTR GETATTR
64.291681 10.66.65.95 -> 10.16.42.210 NFS V4 COMP Reply (Call In 791) <EMPTY> PUTFH;WRITE WRITE;GETATTR GETATTR
64.291754 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> PUTFH;COMMIT COMMIT;GETATTR GETATTR
64.587636 10.66.65.95 -> 10.16.42.210 NFS V4 COMP Reply (Call In 798) <EMPTY> PUTFH;COMMIT COMMIT;GETATTR GETATTR
65.587782 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> PUTFH;WRITE WRITE;GETATTR GETATTR
65.870717 10.66.65.95 -> 10.16.42.210 NFS V4 COMP Reply (Call In 811) <EMPTY> PUTFH;WRITE WRITE;GETATTR GETATTR
65.870786 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> PUTFH;COMMIT COMMIT;GETATTR GETATTR

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0283.html

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
The lock reclaim operation on a Red Hat Enterprise Linux 6 NFSv4 client did not work properly when, after a server reboot, an I/O operation that resulted in a STALE_STATEID response was performed before the RENEW call was sent to the server. This behavior was caused by improper use of the state flags. While investigating this bug, a different bug was discovered in the state recovery operation, which resulted in a reclaim thread looping in the nfs4_reclaim_open_state() function. With this update, both operations have been fixed and work as expected.
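
For anyone re-checking a capture like the one in this report, the recovery ordering can be read mechanically from the tshark summary lines. The following is only a minimal sketch, not part of the original test procedure: the helper names (parse_line, recovery_calls, is_subsequence) and the abridged SAMPLE_TRACE are hypothetical. The two status codes are the RFC 3530 values NFS4ERR_STALE_CLIENTID (10022) and NFS4ERR_STALE_STATEID (10023). Note that the summary lines do not show whether OPEN and LOCK were sent as reclaims, so the sketch only checks the order of operations.

```python
import re

# NFSv4 status codes (RFC 3530) that appear in the trace.
NFS4_STATUS = {
    10022: "NFS4ERR_STALE_CLIENTID",
    10023: "NFS4ERR_STALE_STATEID",
}

# Order of client calls expected after the server has rebooted.
EXPECTED = ["SETCLIENTID", "SETCLIENTID_CONFIRM", "OPEN", "LOCK"]

# One tshark summary line for an NFSv4 COMPOUND call or reply.
LINE_RE = re.compile(
    r"^(?P<time>\d+\.\d+)\s+(?P<src>\S+)\s+->\s+(?P<dst>\S+)\s+NFS\s+"
    r".*?V4 COMP (?P<kind>Call|Reply)\b(?P<rest>.*)$"
)

# Abridged copy of the capture from this report (some replies omitted).
SAMPLE_TRACE = [
    "60.717421 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> PUTFH;WRITE WRITE;GETATTR GETATTR",
    "61.005714 10.66.65.95 -> 10.16.42.210 NFS V4 COMP Reply (Call In 740) <EMPTY> PUTFH;WRITE WRITE(10023)",
    "61.005837 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> RENEW RENEW",
    "61.289059 10.66.65.95 -> 10.16.42.210 NFS V4 COMP Reply (Call In 745) <EMPTY> RENEW RENEW(10022)",
    "61.289117 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> SETCLIENTID SETCLIENTID",
    "61.572591 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> SETCLIENTID_CONFIRM SETCLIENTID_CONFIRM;PUTROOTFH PUTROOTFH;GETATTR GETATTR",
    "61.856182 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> PUTFH;OPEN OPEN;GETATTR GETATTR",
    "62.137404 10.66.65.95 -> 10.16.42.210 NFS V1 CB_NULL Call",
    "62.142760 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> PUTFH;LOCK LOCK",
    "62.427803 10.16.42.210 -> 10.66.65.95 NFS V4 COMP Call <EMPTY> PUTFH;WRITE WRITE;GETATTR GETATTR",
]


def parse_line(line):
    """Parse one summary line; return None for anything that is not a V4 COMPOUND."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    # Drop the "(Call In N)" cross-reference and the "<EMPTY>" tag placeholder.
    rest = re.sub(r"\(Call In \d+\)|<EMPTY>", " ", m.group("rest"))
    ops, status = [], None
    for token in re.split(r"[;\s]+", rest.strip()):
        t = re.fullmatch(r"([A-Z_]+)(?:\((\d+)\))?", token)
        if t is None:
            continue
        if t.group(1) not in ops:  # tshark prints most op names twice
            ops.append(t.group(1))
        if t.group(2):
            status = int(t.group(2))
    return {"time": float(m.group("time")), "kind": m.group("kind"),
            "ops": ops, "status": status}


def recovery_calls(lines):
    """Main op of each client call between the first stale reply and the next WRITE."""
    seen_stale, calls = False, []
    for line in lines:
        pkt = parse_line(line)
        if pkt is None:
            continue
        if not seen_stale:
            seen_stale = pkt["kind"] == "Reply" and pkt["status"] in NFS4_STATUS
            continue
        if pkt["kind"] != "Call":
            continue
        main_op = next((op for op in pkt["ops"] if op != "PUTFH"), None)
        if main_op == "WRITE":
            break
        if main_op is not None:
            calls.append(main_op)
    return calls


def is_subsequence(expected, calls):
    """True if every op in expected occurs in calls, in the same order."""
    it = iter(calls)
    return all(op in it for op in expected)


if __name__ == "__main__":
    calls = recovery_calls(SAMPLE_TRACE)
    print("calls during recovery:", calls)
    print("expected order seen before next WRITE:", is_subsequence(EXPECTED, calls))
```

If LOCK (or OPEN) were missing from the list returned by recovery_calls, the client would have resumed writing without re-establishing its lock state, which is the symptom this bug describes.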