Bug 1421719 - volume stop generates error log
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Assignee: Atin Mukherjee
QA Contact:
URL:
Whiteboard: brick-multiplexing-testing
Depends On:
Blocks:
 
Reported: 2017-02-13 14:11 UTC by Atin Mukherjee
Modified: 2017-08-09 07:58 UTC
CC: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-09 07:58:28 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments: None

Description Atin Mukherjee 2017-02-13 14:11:10 UTC
Description of problem:

On stopping a volume, the following logs are constantly seen in the glusterd log:

[2017-02-13 14:07:41.340381] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x12a)[0x7f42a4c569aa] (--> /usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1b5)[0x7f42a4a1db75] (--> /usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f42a4a1dc6e] (--> /usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7f42a4a1f1e9] (--> /usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x84)[0x7f42a4a1fa74] ))))) 0-management: forced unwinding frame type(brick operations) op(--(1)) called at 2017-02-13 14:07:41.262775 (xid=0x8)
[2017-02-13 14:07:41.341037] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x12a)[0x7f42a4c569aa] (--> /usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1b5)[0x7f42a4a1db75] (--> /usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f42a4a1dc6e] (--> /usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7f42a4a1f1e9] (--> /usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x84)[0x7f42a4a1fa74] ))))) 0-management: forced unwinding frame type(brick operations) op(--(1)) called at 2017-02-13 14:07:41.262790 (xid=0x9)
[2017-02-13 14:07:41.341590] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x12a)[0x7f42a4c569aa] (--> /usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1b5)[0x7f42a4a1db75] (--> /usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f42a4a1dc6e] (--> /usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7f42a4a1f1e9] (--> /usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x84)[0x7f42a4a1fa74] ))))) 0-management: forced unwinding frame type(brick operations) op(--(1)) called at 2017-02-13 14:07:41.262798 (xid=0xa)
[2017-02-13 14:07:41.342144] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x12a)[0x7f42a4c569aa] (--> /usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1b5)[0x7f42a4a1db75] (--> /usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f42a4a1dc6e] (--> /usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x89)[0x7f42a4a1f1e9] (--> /usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x84)[0x7f42a4a1fa74] ))))) 0-management: forced unwinding frame type(brick operations) op(--(1)) called at 2017-02-13 14:07:41.262807 (xid=0xb)

Version-Release number of selected component (if applicable):
mainline

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 hari gowtham 2017-02-14 12:08:26 UTC
How reproducible:
1/1

Steps to Reproduce:
1. Create a volume.
2. Stop the volume (see the command sketch below).
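
A minimal reproduction sketch for the steps above; the volume name, brick host, and brick path are placeholders, and the glusterd log file name may differ between versions:

# 'force' is only needed if the brick sits on the root filesystem
gluster volume create testvol host1:/bricks/testvol/brick1 force
gluster volume start testvol

# Stopping the volume is when the errors appear in the glusterd log
gluster volume stop testvol
grep saved_frames_unwind /var/log/glusterfs/glusterd.log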

Actual results:
The saved_frames_unwind error messages shown above are logged.

Expected results:
No error messages should be logged for a clean volume stop.

Comment 2 Jeff Darcy 2017-02-15 22:14:40 UTC
Is this really *constant*?  When I did this on one of my own machines, I got exactly two of these messages - one per brick.  How many bricks was your test using?

This is probably because the RPC connection was closed while the GLUSTERD_BRICK_TERMINATE requests themselves were still outstanding (see glusterfs_handle_terminate). This isn't seen in the non-multiplexing case because there we just send a SIGKILL instead of an RPC, but that's not an option here. Working around this would likely require significant rearranging of a control flow that has already proven quite fragile, which would be more disruptive than some log messages, so I would recommend against spending time on this until true release blockers have been fixed.
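
For comparison, the behaviour depends on whether brick multiplexing is enabled. A minimal sketch, assuming the cluster-wide cluster.brick-multiplex option that this testing targets (whiteboard: brick-multiplexing-testing):

# With multiplexing on, glusterd stops bricks via a GLUSTERD_BRICK_TERMINATE RPC,
# which is what leaves the unwound frames behind when the connection is torn down
gluster volume set all cluster.brick-multiplex on

# With multiplexing off, the brick process is simply killed instead,
# so volume stop should not log the errors above
gluster volume set all cluster.brick-multiplex off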

Comment 3 Atin Mukherjee 2017-08-09 07:58:28 UTC
I'm not seeing this anymore in the latest upstream mainline.

