Description of problem:

I replaced bricks on one of my servers, i.e.:

  replace-brick wpkg ewcs7:/var/spool/glusterfs/d_wpkg ewcs7:/data/glusterfs/wpkg/d/brick commit force

One of my clients began a loop where it was logging "I [client.c:2090:client_rpc_notify] 1-wpkg-client-8: disconnected" several times a second, with a corresponding "I [socket.c:1798:socket_event_handler] 0-transport: disconnecting now" from glusterd on ewcs7. I finally narrowed it down to the replace-brick. The client was continuing to send PORTBYBRICK requests to ewcs7 for /var/spool/glusterfs/d_wpkg. The result was:

  Retval: -1
  Errno: 0
  Port: 0

Version-Release number of selected component (if applicable):
3.3.1

How reproducible:
Uncertain. Failed on 3 mounted volumes on one client.

Steps to Reproduce:
1. Replace a brick
2. ???

Actual results:
Logs start filling with attempts to connect to the old brick.

Expected results:
If a PORTBYBRICK rpc request fails because the brick does not exist, an error should be raised and the graph should be reloaded.
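For illustration, here is a minimal C sketch of the failure mode described above. It is not GlusterFS source, and the names (pmap_reply, query_port_by_brick) are hypothetical. The point is that a reply of Retval -1 / Errno 0 / Port 0 carries no hint that the brick is gone for good, so a retry loop treats it like a transient failure and never escalates to a graph reload:

/* A minimal sketch (hypothetical names, not GlusterFS source) of the retry
 * logic implied by the report: a portmap PORTBYBRICK reply of
 * {op_ret = -1, op_errno = 0, port = 0} is indistinguishable from a
 * transient failure, so the client re-queues the query forever instead of
 * treating "brick gone" as fatal. */
#include <stdio.h>

struct pmap_reply {
    int op_ret;   /* -1 on failure                */
    int op_errno; /* 0 here, so no ENOENT hint    */
    int port;     /* 0 when the brick is unknown  */
};

/* Stand-in for the PORTBYBRICK rpc: the brick path was removed by
 * replace-brick, so glusterd always answers with a failed lookup. */
static struct pmap_reply query_port_by_brick(const char *brick_path)
{
    (void)brick_path;
    return (struct pmap_reply){ .op_ret = -1, .op_errno = 0, .port = 0 };
}

int main(void)
{
    const char *old_brick = "/var/spool/glusterfs/d_wpkg";

    for (int attempt = 0; attempt < 5; attempt++) { /* really: forever */
        struct pmap_reply r = query_port_by_brick(old_brick);
        if (r.op_ret == -1) {
            /* Expected behaviour per the report: raise an error and reload
             * the graph. Actual behaviour: log "disconnected" and retry. */
            printf("attempt %d: Retval: %d Errno: %d Port: %d -> retrying\n",
                   attempt, r.op_ret, r.op_errno, r.port);
            continue;
        }
        printf("connecting to port %d\n", r.port);
        break;
    }
    return 0;
}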
Given what I thought I knew of the problem, I thought that by making a client graph change I could make it reload the graph and cure its problem. I disabled quick-read and the problem continued. I reset quick-read and the problem still continued. Here's the new graph:

Given volfile:
+------------------------------------------------------------------------------+
  1: volume wpkg-client-0
  2:     type protocol/client
  3:     option remote-host ewcs2
  4:     option remote-subvolume /var/spool/glusterfs/a_wpkg
  5:     option transport-type tcp
  6: end-volume
  7:
  8: volume wpkg-client-1
  9:     type protocol/client
 10:     option remote-host ewcs10
 11:     option remote-subvolume /data/glusterfs/wpkg/a
 12:     option transport-type tcp
 13: end-volume
 14:
 15: volume wpkg-client-2
 16:     type protocol/client
 17:     option remote-host ewcs7
 18:     option remote-subvolume /data/glusterfs/wpkg/a/brick
 19:     option transport-type tcp
 20: end-volume
 21:
 22: volume wpkg-client-3
 23:     type protocol/client
 24:     option remote-host ewcs2
 25:     option remote-subvolume /var/spool/glusterfs/b_wpkg
 26:     option transport-type tcp
 27: end-volume
 28:
 29: volume wpkg-client-4
 30:     type protocol/client
 31:     option remote-host ewcs10
 32:     option remote-subvolume /data/glusterfs/wpkg/b
 33:     option transport-type tcp
 34: end-volume
 35:
 36: volume wpkg-client-5
 37:     type protocol/client
 38:     option remote-host ewcs7
 39:     option remote-subvolume /data/glusterfs/wpkg/b/brick
 40:     option transport-type tcp
 41: end-volume
 42:
 43: volume wpkg-client-6
 44:     type protocol/client
 45:     option remote-host ewcs2
 46:     option remote-subvolume /var/spool/glusterfs/c_wpkg
 47:     option transport-type tcp
 48: end-volume
 49:
 50: volume wpkg-client-7
 51:     type protocol/client
 52:     option remote-host ewcs10
 53:     option remote-subvolume /data/glusterfs/wpkg/c
 54:     option transport-type tcp
 55: end-volume
 56:
 57: volume wpkg-client-8
 58:     type protocol/client
 59:     option remote-host ewcs7
 60:     option remote-subvolume /data/glusterfs/wpkg/c/brick
 61:     option transport-type tcp
 62: end-volume
 63:
 64: volume wpkg-client-9
 65:     type protocol/client
 66:     option remote-host ewcs2
 67:     option remote-subvolume /data/glusterfs/wpkg/d/brick
 68:     option transport-type tcp
 69: end-volume
 70:
 71: volume wpkg-client-10
 72:     type protocol/client
 73:     option remote-host ewcs10
 74:     option remote-subvolume /data/glusterfs/wpkg/d
 75:     option transport-type tcp
 76: end-volume
 77:
 78: volume wpkg-client-11
 79:     type protocol/client
 80:     option remote-host ewcs7
 81:     option remote-subvolume /data/glusterfs/wpkg/d/brick
 82:     option transport-type tcp
 83: end-volume
 84:
 85: volume wpkg-replicate-0
 86:     type cluster/replicate
 87:     subvolumes wpkg-client-0 wpkg-client-1 wpkg-client-2
 88: end-volume
 89:
 90: volume wpkg-replicate-1
 91:     type cluster/replicate
 92:     subvolumes wpkg-client-3 wpkg-client-4 wpkg-client-5
 93: end-volume
 94:
 95: volume wpkg-replicate-2
 96:     type cluster/replicate
 97:     subvolumes wpkg-client-6 wpkg-client-7 wpkg-client-8
 98: end-volume
 99:
100: volume wpkg-replicate-3
101:     type cluster/replicate
102:     subvolumes wpkg-client-9 wpkg-client-10 wpkg-client-11
103: end-volume
104:
105: volume wpkg-dht
106:     type cluster/distribute
107:     subvolumes wpkg-replicate-0 wpkg-replicate-1 wpkg-replicate-2 wpkg-replicate-3
108: end-volume
109:
110: volume wpkg-write-behind
111:     type performance/write-behind
112:     subvolumes wpkg-dht
113: end-volume
114:
115: volume wpkg-read-ahead
116:     type performance/read-ahead
117:     subvolumes wpkg-write-behind
118: end-volume
119:
120: volume wpkg-quick-read
121:     type performance/quick-read
122:     option cache-size 8MB
123:     subvolumes wpkg-read-ahead
124: end-volume
125:
126: volume wpkg-md-cache
127:     type performance/md-cache
128:     subvolumes wpkg-quick-read
129: end-volume
130:
131: volume wpkg
132:     type debug/io-stats
133:     option log-level INFO
134:     option latency-measurement off
135:     option count-fop-hits off
136:     subvolumes wpkg-md-cache
137: end-volume

So it's getting the right graph, but for some reason it's still requesting the port for the former brick.
Joe,

Could you attach the client logs where you see the PMAP_PORTBYBRICK rpc failure messages? I suspect that the old graph, in which the client xlator still refers to the 'old brick', is still 'alive'. IOW, I believe that the graph switch hasn't happened (cleanly). These logs would help me confirm this hunch.
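To make that hunch concrete, here is a toy C model (hypothetical names, not GlusterFS source) of an unclean graph switch: a new graph is built from the fetched volfile, but the old graph's client xlators are never torn down, so their reconnect timers keep asking glusterd for a brick path that no longer exists:

/* Toy model of the hypothesis above, with hypothetical names. */
#include <stdio.h>
#include <stdbool.h>

struct client_xlator {
    const char *remote_subvolume; /* brick path this client resolves */
    bool        timer_active;     /* reconnect timer still armed?    */
};

/* A clean switch must explicitly disconnect the old graph's clients;
 * an unclean one leaves their timers armed. */
static void graph_switch(struct client_xlator *old_clients, int n, bool clean)
{
    for (int i = 0; i < n; i++) {
        if (clean)
            old_clients[i].timer_active = false; /* torn down with old graph */
        /* unclean: timer stays armed -> PORTBYBRICK loop in the client log */
    }
}

int main(void)
{
    struct client_xlator old[] = {
        { "/var/spool/glusterfs/d_wpkg", true }, /* the replaced brick */
    };

    graph_switch(old, 1, /* clean = */ false);

    if (old[0].timer_active)
        printf("old graph alive: still asking glusterd for %s\n",
               old[0].remote_subvolume);
    return 0;
}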
In a similar case, I tried doing a live upgrade from 3.3.2qa1 to 3.4.0. One of the changes (which needs an actual mention in the release notes, but that's a different issue) is that the brick port number ranges changed. It appears to me that the fuse clients have no way to pick up port number changes for bricks. I'm including TRACE-level logs from the client:

[2013-07-16 21:44:28.596656] T [rpc-clnt.c:426:rpc_clnt_reconnect] 0-gvhome-client-1: attempting reconnect
[2013-07-16 21:44:28.596733] D [name.c:149:client_fill_address_family] 0-gvhome-client-1: address-family not specified, guessing it to be inet from (remote-host: admin02.mgmt)
[2013-07-16 21:44:28.596765] T [name.c:219:af_inet_client_get_remote_sockaddr] 0-gvhome-client-1: option remote-port missing in volume gvhome-client-1. Defaulting to 24007
[2013-07-16 21:44:28.596793] T [common-utils.c:104:gf_resolve_ip6] 0-resolver: flushing DNS cache
[2013-07-16 21:44:28.596819] T [common-utils.c:111:gf_resolve_ip6] 0-resolver: DNS cache not present, freshly probing hostname: admin02.mgmt
[2013-07-16 21:44:28.603614] D [common-utils.c:151:gf_resolve_ip6] 0-resolver: returning ip-172.22.4.209 (port-24007) for hostname: admin02.mgmt and port: 24007
[2013-07-16 21:44:28.603678] T [socket.c:370:__socket_nodelay] 0-gvhome-client-1: NODELAY enabled for socket 20
[2013-07-16 21:44:28.603703] T [socket.c:424:__socket_keepalive] 0-gvhome-client-1: Keep-alive enabled for socket 20, interval 2, idle: 20
[2013-07-16 21:44:28.603982] D [socket.c:280:__socket_disconnect] 0-gvhome-client-1: shutdown() returned -1. Transport endpoint is not connected
[2013-07-16 21:44:28.604043] T [rpc-clnt.c:532:rpc_clnt_connection_cleanup] 0-gvhome-client-1: cleaning up state in transport object 0x1df49e0
[2013-07-16 21:44:28.604073] D [socket.c:184:__socket_rwv] 0-gvhome-client-1: EOF from peer 172.22.4.209:24009
[2013-07-16 21:44:28.604100] D [socket.c:1512:__socket_proto_state_machine] 0-gvhome-client-1: reading from socket failed. Error (Transport endpoint is not connected), peer (172.22.4.209:24009)
[2013-07-16 21:44:28.604124] D [socket.c:1798:socket_event_handler] 0-transport: disconnecting now
[2013-07-16 21:44:28.604161] T [rpc-clnt.c:532:rpc_clnt_connection_cleanup] 0-gvhome-client-1: cleaning up state in transport object 0x1df49e0

and you can see that it's trying to connect to port 24009 even though the port has changed for admin02.mgmt:

Status of volume: gvhome
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick admin01.mgmt:/brick/home                          24011   Y       3387
Brick admin02.mgmt:/brick/home                          49152   Y       8135
Self-heal Daemon on localhost                           N/A     Y       3431
Self-heal Daemon on admin02.mgmt                        N/A     Y       8154

I would suspect that the same issue would happen if glusterfsd processes were restarted and the bricks did not bind to the same port numbers they had before.

Todd
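A short C sketch (hypothetical names, not GlusterFS source) of the gap this trace suggests, as the reporter reads it: the client reconnects to the brick port it learned at mount time (24009) instead of first re-resolving the brick's current port (49152) via a portmap query to glusterd, which always listens on 24007:

/* Sketch of stale-port reconnect vs. re-resolving through the portmapper. */
#include <stdio.h>

/* Stand-in for a PMAP_PORTBYBRICK query to glusterd on 24007. */
static int portmap_lookup(const char *brick_path)
{
    (void)brick_path;
    return 49152; /* the brick re-registered here after the upgrade */
}

int main(void)
{
    const char *brick = "admin02.mgmt:/brick/home";
    int cached_port = 24009; /* learned before the upgrade, now stale */

    /* Observed behaviour: reconnect straight to the cached port. */
    printf("reconnect to %s:%d -> EOF from peer\n", brick, cached_port);

    /* Behaviour that would survive a port change: re-resolve first. */
    int current_port = portmap_lookup(brick);
    printf("re-resolved %s -> port %d, reconnecting there\n",
           brick, current_port);
    return 0;
}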
Oh, and on sending the process a HUP to get it to 'reincarnate', it logs the following:

[2013-07-16 21:55:22.335282] I [glusterfsd.c:896:reincarnate] 0-glusterfsd: Fetching the volume file from server...
[2013-07-16 21:55:22.335370] T [rpc-clnt.c:1305:rpc_clnt_record] 0-glusterfs: Auth Info: pid: 0, uid: 0, gid: 0, owner:
[2013-07-16 21:55:22.335401] T [rpc-clnt.c:1185:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 84, payload: 20, rpc hdr: 64
[2013-07-16 21:55:22.335471] T [rpc-clnt.c:1504:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x19x Program: GlusterFS Handshake, ProgVers: 2, Proc: 2) to rpc-transport (glusterfs)
[2013-07-16 21:55:22.335936] T [rpc-clnt.c:682:rpc_clnt_reply_init] 0-glusterfs: received rpc message (RPC XID: 0x19x Program: GlusterFS Handshake, ProgVers: 2, Proc: 2) from rpc-transport (glusterfs)
[2013-07-16 21:55:22.336008] I [glusterfsd-mgmt.c:1560:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing

Obviously the volfile hasn't changed, but some details, like the port number for the brick, have.

Todd
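A sketch (hypothetical names, not GlusterFS source) of the short-circuit the last log line suggests: on SIGHUP the client refetches the volfile, but if the fetched spec is identical to the running one it skips the graph reload entirely, so state that lives outside the volfile, such as the brick's current port, is never refreshed:

/* Sketch of a getspec callback that compares specs and skips the reload. */
#include <stdio.h>
#include <string.h>

static void reload_graph(const char *volfile)
{
    (void)volfile;
    printf("rebuilding graph (and re-resolving brick ports)\n");
}

static void getspec_cbk(const char *running, const char *fetched)
{
    if (strcmp(running, fetched) == 0) {
        /* Mirrors "No change in volfile, continuing": ports stay stale. */
        printf("No change in volfile, continuing\n");
        return;
    }
    reload_graph(fetched);
}

int main(void)
{
    const char *volfile = "volume gvhome-client-1 ... end-volume";
    getspec_cbk(volfile, volfile); /* identical spec -> no reload */
    return 0;
}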