Bug 1481600
| Summary: | rpc: client_t and related objects leaked due to incorrect ref counts | |||
|---|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Milind Changire <mchangir> | |
| Component: | rpc | Assignee: | Milind Changire <mchangir> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | ||
| Severity: | unspecified | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | mainline | CC: | bugs, moagrawa | |
| Target Milestone: | --- | |||
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | glusterfs-3.13.0 | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1487033 1487036 | Environment: | |
| Last Closed: | 2017-12-08 17:38:25 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1487033, 1487036 | |||
REVIEW: https://review.gluster.org/17982 (rpc: destroy transport after client_t) posted (#9) for review on master by Milind Changire (mchangir)
REVIEW: https://review.gluster.org/17982 (rpc: destroy transport after client_t) posted (#10) for review on master by Milind Changire (mchangir)
REVIEW: https://review.gluster.org/17982 (rpc: destroy transport after client_t) posted (#11) for review on master by Milind Changire (mchangir)
REVIEW: https://review.gluster.org/17982 (rpc: destroy transport after client_t) posted (#12) for review on master by Milind Changire (mchangir)
REVIEW: https://review.gluster.org/17982 (rpc: destroy transport after client_t) posted (#13) for review on master by Milind Changire (mchangir)
REVIEW: https://review.gluster.org/17982 (rpc: destroy transport after client_t) posted (#14) for review on master by Milind Changire (mchangir)
REVIEW: https://review.gluster.org/17982 (rpc: destroy transport after client_t) posted (#15) for review on master by Milind Changire (mchangir)
REVIEW: https://review.gluster.org/17982 (rpc: destroy transport after client_t) posted (#16) for review on master by Milind Changire (mchangir)
REVIEW: https://review.gluster.org/17982 (rpc: destroy transport after client_t) posted (#17) for review on master by Milind Changire (mchangir)

COMMIT: https://review.gluster.org/17982 committed in master by Raghavendra G (rgowdapp)
------
commit 24b95089a18a6a40e7703cb344e92025d67f3086
Author: Milind Changire <mchangir>
Date: Wed Aug 30 11:25:29 2017 +0530

rpc: destroy transport after client_t

Problem:
1. Ref counting increment on the client_t object is done in rpcsvc_request_init() which is incorrect.
2. Ref not taken when delegating to grace_time_handler()

Solution:
1. Only fop requests which require processing down the graph via stack 'frames' now ref count the request in get_frame_from_request()
2. Take ref on client_t object in server_rpc_notify() but avoid dropping in RPCSVC_EVENT_TRANSPORT_DESTROY. Drop the ref unconditionally when exiting out of grace_time_handler(). Also, avoid dropping ref on client_t in RPCSVC_EVENT_TRANSPORT_DESTROY when ref management has been delegated to grace_time_handler()

Change-Id: Ic16246bebc7ea4490545b26564658f4b081675e4
BUG: 1481600
Reported-by: Raghavendra G <rgowdapp>
Signed-off-by: Milind Changire <mchangir>
Reviewed-on: https://review.gluster.org/17982
Tested-by: Raghavendra G <rgowdapp>
Reviewed-by: Raghavendra G <rgowdapp>
CentOS-regression: Gluster Build System <jenkins.org>
Smoke: Gluster Build System <jenkins.org>

Correction to the problem description:
Description of problem:
Problem:
1. incorrectly placed gf_client_ref() in rpcsvc_request_init()
gf_client_ref() in rpcsvc_request_init() should be moved to
get_frame_from_request()
2. incorrect ref handling in server_rpc_notify() and grace_time_handler()
2.1 last ref count on client_t should be dropped in
RPCSVC_EVENT_TRANSPORT_DESTROY only for non-grace-time-handling case
2.2 ref should be taken on client_t before being delegated to
grace_time_handler()
2.3 ref should be dropped from client_t in server_setvolume() when the
grace_time_handler() is successfully canceled for a re-connected client
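
The hand-off rules in 2.1 to 2.3 can be illustrated with a small toy model. This is only a sketch and not the GlusterFS code: the toy_client struct and every toy_* function are hypothetical names, the client starts with one ref standing in for the one created at SETVOLUME time, and the sketch assumes a single transport whose DESTROY event arrives before the grace timer fires or is cancelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for client_t; every name in this block is hypothetical. */
typedef struct {
    int  refs;       /* starts at 1: the ref created at SETVOLUME time */
    bool delegated;  /* true while the grace timer owns a ref */
    bool destroyed;  /* set when the last ref is dropped */
} toy_client;

static void toy_ref(toy_client *c) { c->refs++; }

static void toy_unref(toy_client *c)
{
    assert(c->refs > 0);
    if (--c->refs == 0)
        c->destroyed = true;
}

/* DISCONNECT: take a ref, then either hand it to the grace timer (2.2)
 * or clean up immediately, which drops the SETVOLUME ref. */
static void toy_disconnect(toy_client *c, bool grace_enabled)
{
    toy_ref(c);
    if (grace_enabled)
        c->delegated = true;
    else
        toy_unref(c);
}

/* TRANSPORT_DESTROY: drop the last ref only in the non-grace case (2.1). */
static void toy_transport_destroy(toy_client *c)
{
    if (!c->delegated)
        toy_unref(c);
}

/* Grace timer expiry: clean up, then drop the delegated ref unconditionally.
 * Assumes DESTROY has already been delivered for this transport. */
static void toy_grace_expired(toy_client *c)
{
    toy_unref(c);            /* cleanup drops the SETVOLUME ref */
    c->delegated = false;
    toy_unref(c);            /* the ref taken before delegation */
}

/* Reconnect: the timer was cancelled, so its ref is dropped here (2.3). */
static void toy_reconnect_cancelled_timer(toy_client *c)
{
    c->delegated = false;
    toy_unref(c);
}
```

In this model every path ends balanced: disconnect followed by destroy frees the client when no grace timer is involved, the timer expiry frees it in the grace case, and a cancelled timer leaves exactly the one ref that the re-connected client still needs.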
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.13.0, please open a new bug report.

glusterfs-3.13.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html
[2] https://www.gluster.org/pipermail/gluster-users/
Description of problem:
Problem:
1. asymmetrical ref counting
   The first call to gf_client_get() creates a new client_t object and sets the bind count and the ref count to 1. Additional gf_client_get() calls increment only the bind count, not the ref count. This causes confusion about when the ref count should be decremented on a gf_client_put(), since gf_client_put() currently decrements only the bind count.
2. missing unref on some handshake and glusterfs program actors
   server_submit_reply() is called with a NULL frame pointer for the following actors.
   Handshake actors:
   GETSPEC: can't add a ref on this request since the client object hasn't been created when this request hits the server; so this is good; no unref is required either. The tests, however, request the spec with an explicit 'gluster system getspec' command after a SETVOLUME, so there would be a ref that needs to be accounted for.
   SETVOLUME: does gf_client_get(), which inits the bind count and the ref count to 1; this is the good case.
   SET_LK_VER: the actor function does a gf_client_get() as well as a gf_client_put(). The problem is that the rpcsvc_request_create() path adds a ref to the client_t on receiving this request but fails to drop the ref in server_submit_reply(), since the frame pointer passed is NULL.
   PING: rpcsvc_request_init() adds a ref but can't unref the request since frame is NULL in server_submit_reply()
   Glusterfs actors:
   RELEASE: rpcsvc_request_init() adds a ref but can't unref the request since frame is NULL in server_submit_reply()
   RELEASE_DIR: rpcsvc_request_init() adds a ref but can't unref the request since frame is NULL in server_submit_reply()
   NULL: rpcsvc_request_init() adds a ref but can't unref the request since frame is NULL in server_submit_reply()

Version-Release number of selected component (if applicable):

How reproducible: always (with valgrind)
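
To make the two problems concrete, here is a minimal toy model of the counting behaviour described in this report. It is a sketch under stated assumptions, not the GlusterFS source: toy_counted_client, toy_frame and all toy_* functions are hypothetical names, and the model tracks only the bind count and the ref count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for client_t: only the two counters are modelled. */
typedef struct {
    int bind;
    int ref;
} toy_counted_client;

/* Hypothetical stand-in for a call frame that points at its client. */
typedef struct {
    toy_counted_client *client;
} toy_frame;

/* As described for gf_client_get(): the first call sets both counts to 1,
 * later calls bump only the bind count. */
static void toy_get(toy_counted_client *c)
{
    if (c->bind == 0 && c->ref == 0) {
        c->bind = 1;
        c->ref = 1;
    } else {
        c->bind++;
    }
}

/* As described for gf_client_put(): only the bind count is decremented. */
static void toy_put(toy_counted_client *c)
{
    assert(c->bind > 0);
    c->bind--;
}

/* Request setup takes a ref for every incoming request. */
static void toy_request_init(toy_counted_client *c)
{
    c->ref++;
}

/* The reply path can drop the ref only when it has a frame to find it by. */
static void toy_submit_reply(toy_frame *frame)
{
    if (frame != NULL)
        frame->client->ref--;
}
```

Starting from a zeroed client, a toy_get() for SETVOLUME leaves both counts at 1. A request that replies with a NULL frame (as PING, RELEASE, RELEASE_DIR and NULL are reported to do) then leaves the ref count one higher than before, while a request that replies through a frame leaves it unchanged. That residual count is the leak in this model.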