Bug 1433276
| Summary: | glusterd crashes when peering an IP where the address is more than acceptable range (>255) OR with random hostnames | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Nag Pavan Chilakam <nchilaka> |
| Component: | glusterd | Assignee: | Atin Mukherjee <amukherj> |
| Status: | CLOSED ERRATA | QA Contact: | Ambarish <asoman> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.2 | CC: | amukherj, asoman, mchangir, rcyriac, rhinduja, rhs-bugs, storage-qa-internal, vbellur |
| Target Milestone: | --- | | |
| Target Release: | RHGS 3.3.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.8.4-19 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1433578 1440162 | Environment: | |
| Last Closed: | 2017-09-21 04:33:25 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1433578, 1434399 | | |
| Bug Blocks: | 1417151, 1440162, 1449076 | | |
Description
Nag Pavan Chilakam
2017-03-17 09:51:56 UTC
I hit this on my setup as well just now.

[root@localhost bricks]# gluster peer probe 10.70.37.12345
peer probe: failed: Probe returned with Transport endpoint is not connected
[root@localhost bricks]#

The weird thing is that I see this file getting created with the wrong/random hostname:

[root@localhost peers]# ll -h /var/lib/glusterd/peers/
total 12K
-rw-------. 1 root root 73 Mar 17 05:52 02ef4e27-a38e-4e1e-8b75-a0657c2eae6b
-rw-------. 1 root root 75 Mar 17 05:52 10.70.37.12345 -----> BAD
-rw-------. 1 root root 94 Mar 17 05:52 f6384f3a-ab69-4757-8fc8-eda43bd17c2e
[root@localhost peers]#
[root@localhost peers]# cat 10.70.37.12345
uuid=00000000-0000-0000-0000-000000000000
state=0
hostname1=10.70.37.12345
[root@localhost peers]#

Peer status fails on the crashed node as well:

[root@localhost peers]# gluster peer status
peer status: failed
[root@localhost peers]#

Though it works fine on the other nodes:

[root@localhost /]# gluster peer status
Number of Peers: 2

Hostname: 10.70.37.65
Uuid: 32095651-cbda-40e8-941c-6b75c260610e
State: Peer in Cluster (Connected)

Hostname: 10.70.37.116
Uuid: 02ef4e27-a38e-4e1e-8b75-a0657c2eae6b
State: Peer in Cluster (Connected)
[root@localhost /]#

The issue is also reproducible if I run peer probe with "abcd".

Samikshan shared a similar upstream BZ, https://bugzilla.redhat.com/show_bug.cgi?id=770048, which was later closed as WFM because no one could reproduce it. But it is very consistent now.

Downstream patch: https://code.engineering.redhat.com/gerrit/101366

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774
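
For anyone triaging a similar failure, the two symptoms in the description (a probe argument with an out-of-range octet, and a leftover peer file with an all-zero uuid) can be checked with a small script. This is only a minimal sketch and not the glusterd fix: the function names are hypothetical, the hostname check is purely syntactic, and it assumes, as the directory listing in the description suggests, that healthy peer files are named by peer UUID and hold key=value lines.

```python
import ipaddress
import re
import uuid
from pathlib import Path

# Hypothetical helpers for triage; none of this is glusterd code.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")
NIL_UUID = uuid.UUID(int=0)


def classify_peer_address(addr: str) -> str:
    """Return 'ip', 'hostname', or 'invalid' for a peer probe argument."""
    try:
        ipaddress.ip_address(addr)
        return "ip"
    except ValueError:
        pass
    labels = addr.rstrip(".").split(".")
    # An all-numeric dotted name is treated as a malformed IPv4 address,
    # for example one with an octet above 255.
    if all(label.isdigit() for label in labels):
        return "invalid"
    if len(addr) <= 253 and all(_LABEL.fullmatch(label) for label in labels):
        return "hostname"
    return "invalid"


def parse_peer_file(path: Path) -> dict:
    """Read key=value lines such as uuid=..., state=..., hostname1=..."""
    fields = {}
    for line in path.read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def find_suspect_peer_files(peers_dir) -> list:
    """List peer files not named by a UUID or holding a missing/nil uuid."""
    suspects = []
    for path in sorted(Path(peers_dir).iterdir()):
        if not path.is_file():
            continue
        try:
            uuid.UUID(path.name)
            named_by_uuid = True
        except ValueError:
            named_by_uuid = False
        try:
            peer_uuid = uuid.UUID(parse_peer_file(path).get("uuid", ""))
        except ValueError:
            peer_uuid = None
        if not named_by_uuid or peer_uuid is None or peer_uuid == NIL_UUID:
            suspects.append(path)
    return suspects
```

With the values from the description, classify_peer_address("10.70.37.12345") is rejected as invalid, while "abcd" passes because it is a syntactically valid hostname; catching that case would need an actual name lookup, which is left out here because it depends on the local resolver. Running find_suspect_peer_files against a copy of /var/lib/glusterd/peers/ would flag the file named 10.70.37.12345.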