Description of problem:
Upon upgrading from U4 to U6, I see I/O errors on client. Mount point is not accessible.

[root@bob-the-minion fuse]# ls /mnt/nfs
ls: cannot access /mnt/nfs: Input/output error
[root@bob-the-minion fuse]#

Version-Release number of selected component (if applicable):
glusterfs 3.3.0.14rhs built on Aug 29 2013 09:33:47

Steps to Reproduce:
1. Install U4 and create a 1x2 volume
2. Mount the volume with NFS
3. Create some data.
4. umount client and stop the volume.
5. Upgrade to U6
6. Try creating data again.
7. Brick processes crash.

Actual results:
Brick processes crash and mount point is inaccessible

Additional info:
Attaching sosreports and core files from both the servers.
Created attachment 793074: sosreports and core
Per discussion with Sac, the client in question is 2.1. Not for bigbend or u6.
As per discussion with Sachidananda: the server was upgraded from U4 to U6. After the upgrade, a 2.1 client was used to connect to the server. The brick crashed when the 2.1 client started I/O operations.
NFS is giving I/O errors because the brick server has crashed. This is the expected behavior. The problem is in the RPC layer, where the server tries to decode the data sent by the client without checking the client version.

Following is the backtrace output from the crash:

#0  0x000000368aa8860b in memcpy () from /lib64/libc.so.6
#1  0x00007ffdf8d05c9f in memdup (orig_buf=<value optimized out>, size=<value optimized out>, fill=0xf2d878) at /usr/include/bits/string3.h:52
#2  dict_unserialize (orig_buf=<value optimized out>, size=<value optimized out>, fill=0xf2d878) at dict.c:2428
#3  0x00007ffdef585fbb in server_writev (req=0x7ffdeee95208) at server3_1-fops.c:3393
#4  0x00007ffdf8ae27a7 in rpcsvc_handle_rpc_call (svc=0xecafa0, trans=<value optimized out>, msg=<value optimized out>) at rpcsvc.c:502
#5  0x00007ffdf8ae28a3 in rpcsvc_notify (trans=0xf88180, mydata=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpcsvc.c:612
#6  0x00007ffdf8ae3308 in rpc_transport_notify (this=<value optimized out>, event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:489
#7  0x00007ffdf55d0a04 in socket_event_poll_in (this=0xf88180) at socket.c:1677
#8  0x00007ffdf55d0ae7 in socket_event_handler (fd=<value optimized out>, idx=8, data=0xf88180, poll_in=1, poll_out=0, poll_err=<value optimized out>) at socket.c:1792
#9  0x00007ffdf8d2e104 in event_dispatch_epoll_handler (event_pool=0xea2e20) at event.c:785
#10 event_dispatch_epoll (event_pool=0xea2e20) at event.c:847
#11 0x00000000004077d4 in main (argc=<value optimized out>, argv=0x7fff67008cc8) at glusterfsd.c:1841

Here server_writev calls dict_unserialize without checking the client version, passing xdata_val to it for parsing. It looks like the 2.1 client is sending a different xdata_val than what is expected here.
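To illustrate the class of failure seen in the backtrace (a memcpy driven by lengths taken from a client-supplied buffer), here is a minimal sketch of a length-checked decoder. This is not GlusterFS code and not necessarily what the fix does: the wire layout, the function name checked_unserialize, and the helper read_be32 are all hypothetical, chosen only to show how every length field can be validated against the remaining buffer size before anything is copied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical wire layout (an assumption, NOT the real GlusterFS dict format):
 *   [count: 4 bytes, big-endian]
 *   then, for each pair:
 *   [key_len: 4 bytes BE][val_len: 4 bytes BE][key bytes][value bytes]
 */

/* Read a 32-bit big-endian integer from p (caller guarantees 4 readable bytes). */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Walk the buffer and verify that every declared length fits inside it.
 * Returns 0 if the buffer is well-formed, -1 otherwise.
 */
int checked_unserialize(const unsigned char *buf, size_t size)
{
    assert(buf != NULL);            /* programmer error, not a wire error */

    if (size < 4)
        return -1;                  /* too short to hold the pair count */

    uint32_t count = read_be32(buf);
    size_t off = 4;

    /* Each pair needs at least an 8-byte header, so reject absurd counts early. */
    if (count > (size - off) / 8)
        return -1;

    for (uint32_t i = 0; i < count; i++) {
        if (size - off < 8)
            return -1;              /* header would run past the end */

        uint32_t key_len = read_be32(buf + off);
        uint32_t val_len = read_be32(buf + off + 4);
        off += 8;

        if (key_len > size - off)
            return -1;              /* key would run past the end */
        off += key_len;

        if (val_len > size - off)
            return -1;              /* value would run past the end */

        /* Only at this point would copying val_len bytes from buf + off be safe. */
        off += val_len;
    }
    return 0;
}
```

With a check like this, a buffer from a peer using a different encoding would be rejected with an error instead of reaching memcpy with an out-of-range size. A version check on the connecting client, as suggested above, addresses the same problem one level earlier.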
https://code.engineering.redhat.com/gerrit/#/c/12575/
Connected from 2.1 clients and did some I/O, no crashes seen. IO successful.

Client:
[root@bob-the-minion fuse-2.0]# glusterfs --version
glusterfs 3.4.0.32rhs built on Sep 6 2013 10:27:55

Server:
glusterfs 3.3.0.14rhs
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1262.html