Bug 1471869
Summary: | cthon04 can cause segfault in gNFS/NLM | | |
---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | Niels de Vos <ndevos> |
Component: | nfs | Assignee: | Niels de Vos <ndevos> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | |
Severity: | urgent | Docs Contact: | |
Priority: | medium | | |
Version: | 3.11 | CC: | bugs |
Target Milestone: | --- | Keywords: | Triaged |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | glusterfs-3.11.2 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | 1467313 | Environment: | |
Last Closed: | 2017-08-12 13:08:02 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | 1467313 | | |
Bug Blocks: | | | |
Description
Niels de Vos
2017-07-17 14:49:48 UTC
REVIEW: https://review.gluster.org/17797 (nfs: make nfs3_call_state_t refcounted) posted (#1) for review on release-3.11 by Niels de Vos (ndevos)

REVIEW: https://review.gluster.org/17798 (nfs/nlm: unref fds in nlm_client_free()) posted (#1) for review on release-3.11 by Niels de Vos (ndevos)

REVIEW: https://review.gluster.org/17799 (nfs/nlm: handle reconnect for non-NLM4_LOCK requests) posted (#1) for review on release-3.11 by Niels de Vos (ndevos)

REVIEW: https://review.gluster.org/17800 (nfs/nlm: use refcounting for nfs3_call_state_t) posted (#1) for review on release-3.11 by Niels de Vos (ndevos)

REVIEW: https://review.gluster.org/17801 (nfs/nlm: keep track of the call-state and frame for notifications) posted (#1) for review on release-3.11 by Niels de Vos (ndevos)

COMMIT: https://review.gluster.org/17797 committed in release-3.11 by Shyamsundar Ranganathan (srangana)
------
commit b5d0c9b48e87455b961a3e0022de4091d9a4cdf8
Author: Niels de Vos <ndevos>
Date: Mon Jul 17 16:43:30 2017 +0200

nfs: make nfs3_call_state_t refcounted

There is no refcounting done of the nfs3_call_state_t structure, which seems to result in use-after-free problems in the NLM part of Gluster/NFS.

The structure is initialized with two different functions, it is easier to have a single place to do this.

The Gluster/NFS part will not use the refcounting, for now. This is being added to make the NLM code more stable. nfs3_call_state_wipe() will behave as before for Gluster/NFS, but cleanup is triggered through the refcounting now. This prevents major changes to the stable part of the NFS-server, and makes it possible to improve the NLM component separately.
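Editorial illustration (not part of the patch): the commit message above describes cleanup of the call-state being triggered through a reference count, with a single place for initialization. A minimal sketch of that pattern could look like the following. All names here (sketch_call_state_t, sketch_call_state_new, sketch_call_state_ref, sketch_call_state_unref, sketch_call_state_wipe, the wiped hook) are hypothetical stand-ins and not the actual GlusterFS types or functions; the counter is a plain int, so the sketch is not thread-safe.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for nfs3_call_state_t; the real structure
 * carries many more fields. */
typedef struct sketch_call_state {
    int  refcount;  /* number of holders; plain int, NOT thread-safe */
    int *wiped;     /* illustration-only hook, set to 1 when cleanup runs */
} sketch_call_state_t;

/* Single initialization point: the creator receives the first reference. */
static sketch_call_state_t *sketch_call_state_new(int *wiped)
{
    sketch_call_state_t *cs = calloc(1, sizeof(*cs));
    if (cs == NULL)
        return NULL;
    cs->refcount = 1;
    cs->wiped = wiped;
    return cs;
}

/* Cleanup, reached only through the last unref. */
static void sketch_call_state_wipe(sketch_call_state_t *cs)
{
    if (cs->wiped != NULL)
        *cs->wiped = 1;
    free(cs);
}

/* Take an additional reference before handing the state to another user. */
static sketch_call_state_t *sketch_call_state_ref(sketch_call_state_t *cs)
{
    cs->refcount++;
    return cs;
}

/* Drop a reference; returns 1 if this was the last one and the state
 * has been cleaned up, 0 otherwise. */
static int sketch_call_state_unref(sketch_call_state_t *cs)
{
    if (--cs->refcount > 0)
        return 0;
    sketch_call_state_wipe(cs);
    return 1;
}
```

A caller that never takes an extra reference sees the old behaviour: its single unref cleans up immediately. Only code paths that call the ref function extend the lifetime, which matches the intent of leaving the Gluster/NFS part unchanged while the NLM part opts in.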
Cherry picked from commit daed52b8ebcac7ef36f11e944f83826f46593867:
> Change-Id: I2e15bcf12af74e8a46c2727e4a160e9444d29ece
> BUG: 1467313
> Signed-off-by: Niels de Vos <ndevos>
> Reviewed-on: https://review.gluster.org/17696
> Smoke: Gluster Build System <jenkins.org>
> Reviewed-by: Amar Tumballi <amarts>
> CentOS-regression: Gluster Build System <jenkins.org>
> Reviewed-by: Kaleb KEITHLEY <kkeithle>
> Reviewed-by: jiffin tony Thottan <jthottan>

Change-Id: I2e15bcf12af74e8a46c2727e4a160e9444d29ece
BUG: 1471869
Signed-off-by: Niels de Vos <ndevos>
Reviewed-on: https://review.gluster.org/17797
Smoke: Gluster Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Reviewed-by: Shyamsundar Ranganathan <srangana>

COMMIT: https://review.gluster.org/17798 committed in release-3.11 by Shyamsundar Ranganathan (srangana)
------
commit b6675a718d5074043fcc81a3a041d0202e39024f
Author: Niels de Vos <ndevos>
Date: Mon Jul 17 16:43:43 2017 +0200

nfs/nlm: unref fds in nlm_client_free()

When a nlm_clnt is getting free'd, the FDs associated with this client should be unref'd as well.
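Editorial illustration (not part of the patch): the idea of dropping the client's fd references when the client itself is freed can be sketched as below. The types and helpers (sketch_fd_t, sketch_fd_entry_t, sketch_client_t, sketch_client_add_fd, sketch_client_free, the fds_released counter) are assumptions made up for this example and do not mirror the real nlm_client or fd_t definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted file descriptor object. */
typedef struct sketch_fd {
    int refcount;
} sketch_fd_t;

/* One list entry per fd the client holds a reference on. */
typedef struct sketch_fd_entry {
    sketch_fd_t            *fd;
    struct sketch_fd_entry *next;
} sketch_fd_entry_t;

/* Hypothetical stand-in for the NLM client record. */
typedef struct sketch_client {
    sketch_fd_entry_t *fds;
} sketch_client_t;

/* Illustration-only counter of fds that were actually released. */
static int fds_released = 0;

static void sketch_fd_unref(sketch_fd_t *fd)
{
    if (--fd->refcount == 0) {
        fds_released++;
        free(fd);
    }
}

/* The client takes its own reference on every fd it tracks. */
static int sketch_client_add_fd(sketch_client_t *c, sketch_fd_t *fd)
{
    sketch_fd_entry_t *e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    fd->refcount++;
    e->fd = fd;
    e->next = c->fds;
    c->fds = e;
    return 0;
}

/* Freeing the client walks the list and drops each reference, so no fd
 * stays pinned by a client that no longer exists. */
static void sketch_client_free(sketch_client_t *c)
{
    sketch_fd_entry_t *e = c->fds;
    while (e != NULL) {
        sketch_fd_entry_t *next = e->next;
        sketch_fd_unref(e->fd);
        free(e);
        e = next;
    }
    free(c);
}
```

Without the loop in the free function, every fd the client had referenced would keep a nonzero count forever and never be released.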
Cherry picked from commit e9a482f94e748ea12e73ddd2e275bad9aa314b4c:
> Change-Id: Ifa4ea4b7ed45a454413cfc0c820f2516c534a9aa
> BUG: 1467313
> Signed-off-by: Niels de Vos <ndevos>
> Reviewed-on: https://review.gluster.org/17697
> Smoke: Gluster Build System <jenkins.org>
> Reviewed-by: Amar Tumballi <amarts>
> CentOS-regression: Gluster Build System <jenkins.org>
> Reviewed-by: jiffin tony Thottan <jthottan>
> Reviewed-by: Kaleb KEITHLEY <kkeithle>

Change-Id: Ifa4ea4b7ed45a454413cfc0c820f2516c534a9aa
BUG: 1471869
Signed-off-by: Niels de Vos <ndevos>
Reviewed-on: https://review.gluster.org/17798
Smoke: Gluster Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Reviewed-by: Shyamsundar Ranganathan <srangana>

COMMIT: https://review.gluster.org/17799 committed in release-3.11 by Shyamsundar Ranganathan (srangana)
------
commit 3da8ba1f2b5275bf60373531d91aebf7d3cbf154
Author: Niels de Vos <ndevos>
Date: Mon Jul 17 16:44:38 2017 +0200

nfs/nlm: handle reconnect for non-NLM4_LOCK requests

When a reply on an NLM-procedure gets stuck, the NFS-client will resend the request. This can happen through a re-connect in case the connection was terminated (long delay in the reply on the initial request). Once that happens, not all NLM-procedures are handled correctly.

Testing this is difficult and time-consuming. There still may be problems with certain operations, but this definitely makes it behave much better than before.

The problem occurred due to a problem in EC; change-id I18a782903ba addressed the root cause.
Cherry picked from commit fafe1491ead527ba1024c521013aa90d2ee2b355:
> Change-Id: I23b385568e27232951fa3fbd7198a0e5d775a8c2
> BUG: 1467313
> Signed-off-by: Niels de Vos <ndevos>
> Reviewed-on: https://review.gluster.org/17698
> Smoke: Gluster Build System <jenkins.org>
> CentOS-regression: Gluster Build System <jenkins.org>

Change-Id: I23b385568e27232951fa3fbd7198a0e5d775a8c2
BUG: 1471869
Signed-off-by: Niels de Vos <ndevos>
Reviewed-on: https://review.gluster.org/17799
Smoke: Gluster Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Reviewed-by: Shyamsundar Ranganathan <srangana>

REVIEW: https://review.gluster.org/17800 (nfs/nlm: use refcounting for nfs3_call_state_t) posted (#2) for review on release-3.11 by Shyamsundar Ranganathan (srangana)

REVIEW: https://review.gluster.org/17801 (nfs/nlm: keep track of the call-state and frame for notifications) posted (#2) for review on release-3.11 by Shyamsundar Ranganathan (srangana)

REVIEW: https://review.gluster.org/17801 (nfs/nlm: keep track of the call-state and frame for notifications) posted (#3) for review on release-3.11 by Shyamsundar Ranganathan (srangana)

COMMIT: https://review.gluster.org/17800 committed in release-3.11 by Shyamsundar Ranganathan (srangana)
------
commit 52725e305d39296a7d01944c62c3166a3cad06bc
Author: Niels de Vos <ndevos>
Date: Mon Jul 17 16:45:22 2017 +0200

nfs/nlm: use refcounting for nfs3_call_state_t

In order to track down a potential use-after-free of the nfs3_call_state_t structure in the NLM component, add reference counting where the structure is used. This should prevent premature free'ing of the structure.
Cherry picked from commit 01bfdd4d1759423681d311da33f4ac2346ace445:
> Change-Id: Ib1f13b0463ab1e012b7b49a623c91f0f3e73e1fb
> BUG: 1467313
> Signed-off-by: Niels de Vos <ndevos>
> Reviewed-on: https://review.gluster.org/17699
> Reviewed-by: jiffin tony Thottan <jthottan>
> Smoke: Gluster Build System <jenkins.org>
> CentOS-regression: Gluster Build System <jenkins.org>

Change-Id: Ib1f13b0463ab1e012b7b49a623c91f0f3e73e1fb
BUG: 1471869
Signed-off-by: Niels de Vos <ndevos>
Reviewed-on: https://review.gluster.org/17800
CentOS-regression: Gluster Build System <jenkins.org>
Smoke: Gluster Build System <jenkins.org>
Reviewed-by: Shyamsundar Ranganathan <srangana>

COMMIT: https://review.gluster.org/17801 committed in release-3.11 by Shyamsundar Ranganathan (srangana)
------
commit bec1c177d7fccaa6dbe353cb06064256bc997895
Author: Niels de Vos <ndevos>
Date: Mon Jul 17 16:45:47 2017 +0200

nfs/nlm: keep track of the call-state and frame for notifications

When blocking locks are used, a new frame is allocated that is used to send the notification to the client once the lock becomes available. In all other cases, the frame that contains the request from the client will be used for the reply.

Because there was no way to track the different clients with their requests (captured in the call-state), the call-state could be free'd before the notification was sent to the client. This caused a use-after-free of the call-state and could trigger segfaults of the Gluster/NFS server or incorrect replies on (un)lock requests.

By introducing a nlm4_notify_args structure, the call-state and frame can be tracked better. This prevents the possibility of segfaulting when the call-state is used after being free'd.
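Editorial illustration (not part of the patch): the commit message above explains that a structure pairing the call-state with the notification frame keeps the call-state alive until the notification has been sent. A minimal sketch of that ownership rule follows. The structure layout and every name in it (sketch_cs_t, sketch_frame_t, sketch_notify_args_t and the helper functions) are assumptions for illustration only; the fields of the real nlm4_notify_args are not shown in this report.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the call-state and the call frame. */
typedef struct sketch_cs    { int refcount; } sketch_cs_t;
typedef struct sketch_frame { int id;       } sketch_frame_t;

/* Hypothetical analogue of the notify-args idea: it owns one reference
 * on the call-state and remembers which frame carries the notification. */
typedef struct sketch_notify_args {
    sketch_cs_t    *cs;
    sketch_frame_t *frame;
} sketch_notify_args_t;

/* Illustration-only counter of freed call-states. */
static int cs_freed = 0;

static sketch_cs_t *sketch_cs_ref(sketch_cs_t *cs)
{
    cs->refcount++;
    return cs;
}

static void sketch_cs_unref(sketch_cs_t *cs)
{
    if (--cs->refcount == 0) {
        cs_freed++;
        free(cs);
    }
}

/* Creating the notify args pins the call-state for the pending
 * notification. */
static sketch_notify_args_t *sketch_notify_args_new(sketch_cs_t *cs,
                                                    sketch_frame_t *frame)
{
    sketch_notify_args_t *na = calloc(1, sizeof(*na));
    if (na == NULL)
        return NULL;
    na->cs = sketch_cs_ref(cs);
    na->frame = frame;
    return na;
}

/* Called after the notification went out: release the pinned call-state. */
static void sketch_notify_args_free(sketch_notify_args_t *na)
{
    sketch_cs_unref(na->cs);
    free(na);
}
```

In this sketch the request path can drop its own reference as soon as it has replied; the call-state is only freed once the notification path has released its reference as well.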
Cherry picked from commit b81997264f079983fa02bd5fa2b3715224942b00:
> BUG: 1467313
> Change-Id: I285d2bc552f509e5145653b7a50afcff827cd612
> Signed-off-by: Niels de Vos <ndevos>
> Reviewed-on: https://review.gluster.org/17700
> Smoke: Gluster Build System <jenkins.org>
> CentOS-regression: Gluster Build System <jenkins.org>
> Reviewed-by: Kaleb KEITHLEY <kkeithle>
> Reviewed-by: jiffin tony Thottan <jthottan>

Change-Id: I285d2bc552f509e5145653b7a50afcff827cd612
BUG: 1471869
Signed-off-by: Niels de Vos <ndevos>
Reviewed-on: https://review.gluster.org/17801
CentOS-regression: Gluster Build System <jenkins.org>
Smoke: Gluster Build System <jenkins.org>
Reviewed-by: Shyamsundar Ranganathan <srangana>

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.11.2, please open a new bug report.

glusterfs-3.11.2 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-July/031908.html
[2] https://www.gluster.org/pipermail/gluster-users/