| Summary: | [RHS-C] Adding 2nd RHS node failed with error "Failed to fetch gluster peer list from server Server1 on Cluster" | | |
|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Prasanth <pprakash> |
| Component: | rhsc | Assignee: | Bala.FA <barumuga> |
| Status: | CLOSED ERRATA | QA Contact: | Prasanth <pprakash> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | | |
| Version: | 2.1 | CC: | dpati, dtsang, knarra, mmahoney, pprakash, rhs-bugs, sdharane, ssampat |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | RHGS 2.1.2 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | rhsc-2.1.2-0.25.master.el6_5 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-02-25 08:04:17 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | 972619 | | |
| Bug Blocks: | | | |
| Attachments: | | | |
Description (Prasanth, 2013-11-18 11:48:56 UTC)
rhsc-log-collector reports can be downloaded from: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1031613/

Ifconfig output before adding the host to the cluster:
```
[root@vm10 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:1A:4A:46:26:2D
          inet addr:10.70.36.82  Bcast:10.70.37.255  Mask:255.255.254.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:4910 errors:0 dropped:0 overruns:0 frame:0
          TX packets:95 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:307109 (299.9 KiB)  TX bytes:12543 (12.2 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:700 errors:0 dropped:0 overruns:0 frame:0
          TX packets:700 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:111428 (108.8 KiB)  TX bytes:111428 (108.8 KiB)

virbr0    Link encap:Ethernet  HWaddr 52:54:00:13:24:07
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
```
The following traceback appears in the vdsm log on vm10.lab.eng.blr.redhat.com:

```
MainProcess|Thread-154::ERROR::2013-11-18 16:51:17,565::supervdsmServer::94::SuperVdsm.ServerCallback::(wrapper) Error in wrapper
Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer", line 92, in wrapper
    res = func(*args, **kwargs)
  File "/usr/share/vdsm/supervdsmServer", line 362, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/vdsm/gluster/__init__.py", line 31, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/vdsm/gluster/cli.py", line 876, in peerStatus
    raise ge.GlusterHostsListFailedException(rc=e.rc, err=e.err)
GlusterHostsListFailedException: Hosts list failed
error: (null)
return code: -22
MainProcess|Thread-155::DEBUG::2013-11-18 16:51:17,898::supervdsmServer::90::SuperVdsm.ServerCallback::(wrapper) call wrapper with (None,) {}
```
It looks like glusterfs fails on 'gluster peer status'.
Below is the log from glusterd. It looks like 'peer probe' and 'peer status' are called back to back, which causes 'peer status' to return an empty list. Similar behaviour is also addressed in bz#972619.

```
[2013-11-18 11:21:17.141761] I [glusterd-handler.c:821:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req vm11.lab.eng.blr.redhat.com 24007
[2013-11-18 11:21:17.156228] I [glusterd-handler.c:2905:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: vm11.lab.eng.blr.redhat.com (24007)
[2013-11-18 11:21:17.340987] I [rpc-clnt.c:976:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2013-11-18 11:21:17.341045] I [socket.c:3505:socket_init] 0-management: SSL support is NOT enabled
[2013-11-18 11:21:17.341055] I [socket.c:3520:socket_init] 0-management: using system polling thread
[2013-11-18 11:21:17.347291] I [glusterd-handler.c:2886:glusterd_friend_add] 0-management: connect returned 0
[2013-11-18 11:21:17.563649] I [glusterd-handler.c:1018:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2013-11-18 11:21:17.563864] W [dict.c:778:str_to_data] (-->/usr/lib64/glusterfs/3.4.0.43.1u2rhs/xlator/mgmt/glusterd.so(__glusterd_handle_cli_list_friends+0xa2) [0x7f9adce28842] (-->/usr/lib64/glusterfs/3.4.0.43.1u2rhs/xlator/mgmt/glusterd.so(glusterd_list_friends+0x114) [0x7f9adce283f4] (-->/usr/lib64/libglusterfs.so.0(dict_set_str+0x1c) [0x7f9ae1652cfc]))) 0-dict: value is NULL
[2013-11-18 11:21:18.025020] I [glusterd-handler.c:1073:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2013-11-18 11:21:18.860709] I [glusterd-rpc-ops.c:235:__glusterd_probe_cbk] 0-glusterd: Received probe resp from uuid: e05b29ce-8b8c-4b62-9d81-747eac6916da, host: vm11.lab.eng.blr.redhat.com
[2013-11-18 11:21:19.200455] I [glusterd-rpc-ops.c:307:__glusterd_probe_cbk] 0-glusterd: Received resp to probe req
[2013-11-18 11:21:19.200596] I [glusterd-rpc-ops.c:357:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: e05b29ce-8b8c-4b62-9d81-747eac6916da, host: vm11.lab.eng.blr.redhat.com, port: 0
[2013-11-18 11:21:19.728628] I [glusterd-handshake.c:556:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 2
[2013-11-18 11:21:20.136427] I [glusterd-handler.c:2324:__glusterd_handle_probe_query] 0-glusterd: Received probe from uuid: e05b29ce-8b8c-4b62-9d81-747eac6916da
[2013-11-18 11:21:20.136523] I [glusterd-handler.c:2376:__glusterd_handle_probe_query] 0-glusterd: Responded to 10.70.36.83, op_ret: 0, op_errno: 0, ret: 0
[2013-11-18 11:21:20.428050] I [glusterd-handler.c:2028:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: e05b29ce-8b8c-4b62-9d81-747eac6916da
[2013-11-18 11:21:20.428118] I [glusterd-handler.c:3059:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to vm11.lab.eng.blr.redhat.com (0), ret: 0
[2013-11-18 11:21:20.818363] I [glusterd-sm.c:495:glusterd_ac_send_friend_update] 0-: Added uuid: e05b29ce-8b8c-4b62-9d81-747eac6916da, host: vm11.lab.eng.blr.redhat.com
[2013-11-18 11:21:21.194497] I [glusterd-handler.c:2190:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: e05b29ce-8b8c-4b62-9d81-747eac6916da
[2013-11-18 11:21:21.194544] I [glusterd-handler.c:2235:__glusterd_handle_friend_update] 0-: Received uuid: 1f65487d-8859-4e70-b518-682368aaf91c, hostname:10.70.36.82
[2013-11-18 11:21:21.194553] I [glusterd-handler.c:2244:__glusterd_handle_friend_update] 0-: Received my uuid as Friend
[2013-11-18 11:21:21.336185] I [glusterd-rpc-ops.c:554:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: e05b29ce-8b8c-4b62-9d81-747eac6916da
[2013-11-18 11:24:29.680426] I [glusterd-handler.c:1018:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
```

Verified in cb9 (rhsc-2.1.2-0.25.master.el6_5.noarch). Haven't seen this issue so far during my testing with the cb9 build; I was able to add the 2nd node immediately after node1 came up, so marking it as verified.
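The race described above (listing peers immediately after a probe, before the handshake completes) suggests a client-side retry as a workaround. A minimal sketch in Python; the helper name `wait_for_peer` and the injectable `run` parameter are illustrative assumptions, not the actual fix shipped in rhsc/vdsm:

```python
import subprocess
import time


def wait_for_peer(peer, retries=10, delay=1.0, run=None):
    """Poll 'gluster peer status' until `peer` appears in its output.

    Hypothetical helper: a peer probed a moment ago may not yet be
    listed, so retry instead of failing on the first empty response.
    `run` can be injected for testing; by default it shells out to
    the gluster CLI.
    """
    if run is None:
        def run():
            return subprocess.check_output(
                ["gluster", "peer", "status"]).decode()
    for _ in range(retries):
        if peer in run():
            return True
        time.sleep(delay)
    return False
```

With such a helper, the caller would probe first (`gluster peer probe <host>`) and then call `wait_for_peer(<host>)` before trusting the peer list, rather than issuing both commands back to back.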
Will re-open the bug if it happens again.

As this bug is similar to https://bugzilla.redhat.com/show_bug.cgi?id=972619, it is not required to add another doc text to this bug.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0208.html