Bug 1409135

Summary: [Replicate] "RPC call decoding failed" leading to IO hang & mount inaccessible
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Karan Sandha <ksandha>
Component: rpc
Assignee: Poornima G <pgurusid>
Status: CLOSED ERRATA
QA Contact: Karan Sandha <ksandha>
Severity: high
Priority: unspecified
Version: rhgs-3.2
CC: amukherj, ksandha, nchilaka, rcyriac, rhs-bugs, rjoseph, skoduri
Target Release: RHGS 3.2.0
Hardware: All
OS: Linux
Fixed In Version: glusterfs-3.8.4-15
Doc Type: If docs needed, set a value
Clones: 1421937
Last Closed: 2017-03-23 06:00:59 UTC
Type: Bug
Bug Depends On: 1421937, 1422363, 1422787, 1422788, 1425740, 1427012, 1428670    
Bug Blocks: 1351528    

Description Karan Sandha 2016-12-29 18:42:46 UTC
Description of problem:
RPC failed to decode the message on the NFS mount, leading to an IO hang and the mount becoming inaccessible. The volume was mounted over NFS and a sequential write was started with the O_DIRECT flag.

Version-Release number of selected component (if applicable):
3.8.4-8
Logs are placed at 
rhsqe-repo.lab.eng.blr.redhat.com:/var/www/html/sosreports/<bug>

How reproducible:
Tried Once

Steps to Reproduce:
1. 4 servers and 4 clients; mount 1:1 with gNFS.
2. Daemonize FIO on the 4 clients and start a sequential write with the O_DIRECT flag (see the sketch below).
3. Shortly after the IOs start, the tool hangs.
4. Multiple errors and warnings appear in nfs.log.
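
A minimal fio invocation that approximates steps 1-2. The exact job parameters are not in this report, so the mount point, block size, file size, and queue depth below are assumptions; backgrounding with nohup stands in for "daemonize":

# Sequential write with O_DIRECT on the gNFS mount (assumed at /mnt/gnfs), run in the background.
nohup fio --name=seqwrite --directory=/mnt/gnfs --rw=write --bs=1M --size=10G \
    --direct=1 --ioengine=libaio --iodepth=16 --numjobs=4 > /var/log/fio_seqwrite.log 2>&1 &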

<SNIP>
[2016-12-29 10:27:29.424871] W [xdr-rpc.c:55:xdr_to_rpc_call] 0-rpc: failed to decode call msg
[2016-12-29 10:27:29.425032] W [rpc-clnt.c:717:rpc_clnt_handle_cbk] 0-testvol-client-0: RPC call decoding failed
[2016-12-29 10:27:29.443275] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f9ccdb40682] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f9ccd90675e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f9ccd90686e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7f9ccd907fd4] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x94)[0x7f9ccd908864] ))))) 0-testvol-client-0: forced unwinding frame type(GlusterFS 3.3) op(FINODELK(30)) called at 2016-12-29 10:26:55.289465 (xid=0xa8ddd)
[2016-12-29 10:27:29.443308] E [MSGID: 114031] [client-rpc-fops.c:1601:client3_3_finodelk_cbk] 0-testvol-client-0: remote operation failed [Transport endpoint is not connected]
[2016-12-29 10:27:29.443571] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f9ccdb40682] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f9ccd90675e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f9ccd90686e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7f9ccd907fd4] (-->

</SNIP>

Actual results:
The IO tool hung and there were multiple errors and warnings in the logs.

Expected results:
No IO hang should be observed.

Additional info:

[root@gqas004 ~]# gluster v info
 
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: e2212c28-f04a-4f08-9f17-b0fb74434bbf
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gqas003.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick0
Brick2: gqas004.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick1
Brick3: gqas007.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick2
Brick4: gqas012.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick3
Options Reconfigured:
cluster.use-compound-fops: off
network.remote-dio: off
performance.strict-o-direct: on
network.inode-lru-limit: 90000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
server.allow-insecure: on
performance.stat-prefetch: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: off
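
For reference, the o-direct related options above are set with the standard gluster CLI; a minimal sketch using the volume name from this report, with values mirroring the output above:

gluster volume set testvol performance.strict-o-direct on
gluster volume set testvol network.remote-dio off
gluster volume set testvol cluster.use-compound-fops off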

Comment 3 Nag Pavan Chilakam 2017-01-03 06:55:40 UTC
I am seeing a lot of these errors in my systemic setup for regression testing of 3.2.

On my systemic setup, I am creating the same directory path simultaneously from 3 different clients.
Each client used a different server IP to mount the volume via the FUSE protocol.
Each client was also dumping sosreports into the volume mount every 5 minutes in a screen session, with top output appended to a file every minute (a sketch of this workload follows the example list below).
The directory creations were done by different users:
Eg:
client1(el 7.2) was running the dir-creation using pavan@rhs-client23
client2(el 6.7) as root@rhs-client24
client3(el 7.3) as cli21@rhs-client21
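
A rough sketch of that background workload on each client. The mount point, directory names, and log file names are assumptions; the actual scripts are not in this report:

# Dump a sosreport into the volume mount every 5 minutes (mount point /mnt/sysvol is assumed).
while true; do sosreport --batch --tmp-dir /mnt/sysvol/sosreports; sleep 300; done &
# Append top output to a file on the mount every minute.
while true; do top -b -n 1 >> /mnt/sysvol/top_$(hostname).log; sleep 60; done &
# Same-path directory creation, running in parallel from all three clients.
while true; do mkdir -p /mnt/sysvol/shared/dir_$RANDOM; done &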


Note: these logs are with respect to client1, i.e. rhs-client21.
Note, however, that I am able to access the mount.

sosreports available at 
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/3.2_logs/systemic_testing_logs/regression_cycle/same_dir_create_clients/rhs-client21.lab.eng.blr.redhat.com/

test execution details available at https://docs.google.com/spreadsheets/d/1iP5Mi1TewBFVh8HTmlcBm9072Bgsbgkr3CLcGmawDys/edit#gid=632186609

Version-Release number of selected component (if applicable):
============
3.8.4-10

Other BZs for reference (raised for issues on the same setup):
1409472 - brick crashed on systemic setup
1397907 - seeing frequent kernel hangs when doing operations both on fuse client and gluster nodes on replica volumes [NEEDINFO]
1409568 - seeing socket disconnects and transport endpoint not connected frequently on systemic setup 
1409572 - In fuse mount logs:seeing input/output error with split-brain observed logs and failing GETXATTR and STAT
1409580 - seeing stale file handle errors in fuse mount logs in systemic testing
1409583 - seeing RPC status error messages and timeouts due to RPC (rpc-clnt.c:200:call_bail)
1409584 - seeing remote operation failed [Invalid argument] wrt inode locking in my systemic setup

Comment 4 Nag Pavan Chilakam 2017-01-03 06:56:57 UTC
Following are the errors seen:
[2016-12-31 22:30:27.780829] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1f2)[0x7f69dfd68882] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7f69dfb30717] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f69dfb3082e] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x99)[0x7f69dfb308e9] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1fa)[0x7f69dfb30b3a] ))))) 0-sysvol-client-6: forced unwinding frame type(GlusterFS 3.3) op(OPEN(11)) called at 2016-12-31 22:29:01.224843 (xid=0x42ad609)
[2016-12-31 22:30:27.780958] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1f2)[0x7f69dfd68882] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7f69dfb30717] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f69dfb3082e] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x99)[0x7f69dfb308e9] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1fa)[0x7f69dfb30b3a] ))))) 0-sysvol-client-6: forced unwinding frame type(GlusterFS 3.3) op(OPEN(11)) called at 2016-12-31 22:29:01.224850 (xid=0x42ad60a)
[2016-12-31 22:30:27.781091] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1f2)[0x7f69dfd68882] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7f69dfb30717] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f69dfb3082e] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x99)[0x7f69dfb308e9] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1fa)[0x7f69dfb30b3a] ))))) 0-sysvol-client-6: forced unwinding frame type(GlusterFS 3.3) op(OPEN(11)) called at 2016-12-31 22:29:01.224856 (xid=0x42ad60b)
[2016-12-31 22:30:27.781222] E [rpc-clnt.c:365:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1f2)[0x7f69dfd68882] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7f69dfb30717] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f69dfb3082e] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x99)[0x7f69dfb308e9] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1fa)[0x7f69dfb30b3a] ))))) 0-sysvol-client-6: forced unwinding frame type(GlusterFS 3.3) op(OPEN(11)) called at 2016-12-31 22:29:01.224862 (xid=0x42ad60c)

Comment 5 Nag Pavan Chilakam 2017-01-03 06:59:08 UTC
(In reply to nchilaka from comment #3)
> I am seeing lot of these errors in my systemic setup for regression testing
> of 3.2
> 
> On my systemic setup, I am doing same path directory creation simultaneously
> from 3 different clients.
> Each client used different server IP to mount the volume using fuse protocol
> Also, Each client were dumping sosreports every 5min into the volume mount
> in a screen session, along with top output being appended to a file every
> minute
> The dir-creations were happening from different users
> Eg:
> client1(el 7.2) was running the dir-creation using pavan@rhs-client23
> client2(el 6.7) as root@rhs-client24
> client3(el 7.3) as cli21@rhs-client21
> 
> 
> Note: these logs are wrt client1 ie rhs-client21
> Also, however note that I am able to access the mount
> 
> sosreports available at 
> http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/3.2_logs/
> systemic_testing_logs/regression_cycle/same_dir_create_clients/rhs-client21.
> lab.eng.blr.redhat.com/
> 
One small correction: I am seeing the issue on rhs-client24, so refer to
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/3.2_logs/systemic_testing_logs/regression_cycle/same_dir_create_clients/rhs-client24.lab.eng.blr.redhat.com/

> test execution details available at
> https://docs.google.com/spreadsheets/d/
> 1iP5Mi1TewBFVh8HTmlcBm9072Bgsbgkr3CLcGmawDys/edit#gid=632186609
> 
> Version-Release number of selected component (if applicable):
> ============
> 3.8.4-10
> 
> other BZs for reference(raised wrt issues on same setup)
> 1409472 - brick crashed on systemic setup
> 1397907 - seeing frequent kernel hangs when doing operations both on fuse
> client and gluster nodes on replica volumes (edit) [NEEDINFO]
> 1409568 - seeing socket disconnects and transport endpoint not connected
> frequently on systemic setup 
> 1409572 - In fuse mount logs:seeing input/output error with split-brain
> observed logs and failing GETXATTR and STAT
> 1409580 - seeing stale file handle errors in fuse mount logs in systemic
> testing
> 1409583 - seeing RPC status error messages and timeouts due to RPC
> (rpc-clnt.c:200:call_bail)
> 1409584 - seeing remote operation failed [Invalid argument] wrt inode
> locking in my systemic setup

Comment 11 Poornima G 2017-01-12 05:18:53 UTC
The error is not generated by the upcall request but by a fop callback. We (Rtalur, Rajesh) are yet to identify which fop it is and which field of the received XDR was corrupt.

Comment 12 Poornima G 2017-01-25 12:28:28 UTC
I am unable to reproduce the original "failed to decode call msg" error; the only difference is that this is the latest downstream build with debug mode. I shall try once more with the latest Gluster RPMs.

But I do see disconnect errors consistently:
[2017-01-25 11:46:46.592556] W [socket.c:590:__socket_rwv] 0-socket.nfs-server: writev on 192.168.97.120:1011 failed (Connection reset by peer)
[2017-01-25 11:46:48.588515] I [socket.c:2403:socket_event_handler] 0-transport: disconnecting now

These disconnects are between the NFS clients and the Gluster NFS server, as indicated by the IP address.

Comment 13 Poornima G 2017-01-27 06:43:45 UTC
With the latest downstream RPMs, I am not able to reproduce the "failed to decode call msg" error.

But I still get the following errors:
[2017-01-27 05:37:17.284133] I [socket.c:2403:socket_event_handler] 0-transport: disconnecting now
[2017-01-27 05:37:17.289722] I [socket.c:3439:socket_submit_request] 0-testvol-client-0: not connected (priv->connected = 0)

[2017-01-25 11:46:46.592556] W [socket.c:590:__socket_rwv] 0-socket.nfs-server: writev on 192.168.97.120:1011 failed (Connection reset by peer)
[2017-01-25 11:46:48.588515] I [socket.c:2403:socket_event_handler] 0-transport: disconnecting now

As per my testing, this doesn't seem to be an upcall issue, as the disconnects appear even when no upcall is in progress. It could be an RPC problem, or, under heavy load, throttling in NFS (client-server) could send ECONNRESET; I'm not sure. Need more time for RCA.

Comment 14 Soumya Koduri 2017-01-27 10:37:05 UTC
Similar issues/logs to those mentioned in this bug's description are observed in bug 1417062. The setup is still in the same state. Please take a look if needed.

Comment 15 Poornima G 2017-02-14 07:22:27 UTC
Fix posted:  https://review.gluster.org/#/c/16613/1

Comment 16 Poornima G 2017-02-16 09:14:59 UTC
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/97927/

Comment 19 rjoseph 2017-02-21 15:29:35 UTC
Karan, as discussed, can you please provide the pstack output of the mount process as well as of the brick processes?
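
For reference, one way to capture those stacks, assuming the standard Gluster process names (glusterfs for the mount/gNFS process, glusterfsd for the bricks) and hypothetical output paths under /tmp:

# On the client / gNFS node: stacks of the glusterfs processes.
for pid in $(pgrep -x glusterfs); do pstack $pid > /tmp/mount_$pid.pstack; done
# On each server: stacks of the brick (glusterfsd) processes.
for pid in $(pgrep -x glusterfsd); do pstack $pid > /tmp/brick_$pid.pstack; done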

Comment 24 errata-xmlrpc 2017-03-23 06:00:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html