Bug 1565729
Summary: | volume creation fails - when a 5 node gluster cluster is reduced to 3 node by removing labels on 2 nodes | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | krishnaram Karthick <kramdoss>
Component: | heketi | Assignee: | Michael Adam <madam>
Status: | CLOSED ERRATA | QA Contact: | krishnaram Karthick <kramdoss>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | cns-3.9 | CC: | hchiramm, pprakash, rcyriac, rhs-bugs, rtalur, storage-qa-internal, vinug
Target Milestone: | --- | Keywords: | ZStream
Target Release: | CNS 3.9 Async | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | rhgs-volmanager-container-3.3.1-8.3 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2018-04-19 03:34:39 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Description
krishnaram Karthick
2018-04-10 15:47:22 UTC
Created attachment 1419998: heketi_logs
Created attachment 1419999: topology file
Two items I noticed looking through the logs:

1) The node health monitor thread has not been started. This is probably due to an "old" heketi config that lacks the parameter needed to enable this thread. With the monitor on, the volume create operation will not try to use nodes it knows to be unavailable.

2) The volume create operation retried correctly, but it must never have hit a combination of nodes where all nodes were up. We may need to tweak the number of retries performed to increase the chances of a working node selection.

But before we work on #2, we should retest with #1 working.

This issue was due to node health monitoring not being enabled. With rhgs-volmanager-container-3.3.1-8.3, it is enabled by default.

Heketi 6.0.0
[heketi] INFO 2018/04/16 14:34:34 Loaded kubernetes executor
[heketi] ERROR 2018/04/16 14:34:34 /src/github.com/heketi/heketi/apps/glusterfs/app.go:100: invalid log level:
[heketi] INFO 2018/04/16 14:34:34 Block: Auto Create Block Hosting Volume set to true
[heketi] INFO 2018/04/16 14:34:34 Block: New Block Hosting Volume size 100 GB
[heketi] INFO 2018/04/16 14:34:34 GlusterFS Application Loaded
[heketi] INFO 2018/04/16 14:34:34 Started Node Health Cache Monitor
Listening on port 8080

Verified the bug in rhgs-volmanager-container-3.3.1-8.4.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1178
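
To put rough numbers on the two observations from the log analysis, here is a minimal Python sketch. It is not heketi's actual placement logic: it assumes each attempt picks 3 nodes uniformly at random from the 5 in the topology, that attempts are independent, and that 2 nodes are down. The function names, the retry count of 5, and the health cache represented as a plain dict are all hypothetical and exist only for illustration.

```python
from math import comb


def p_all_up(total: int, down: int, replica: int) -> float:
    """Chance that one uniformly random pick of `replica` nodes avoids every down node."""
    up = total - down
    if replica > up:
        return 0.0
    return comb(up, replica) / comb(total, replica)


def p_create_fails(total: int, down: int, replica: int, attempts: int) -> float:
    """Chance that every one of `attempts` independent picks includes a down node."""
    return (1.0 - p_all_up(total, down, replica)) ** attempts


def healthy_candidates(nodes, health_cache):
    """Keep only nodes the (hypothetical) health cache reports as up.

    Nodes missing from the cache are treated as down.
    """
    return [n for n in nodes if health_cache.get(n, False)]


if __name__ == "__main__":
    # Without health information: 5 nodes in the topology, 2 of them down.
    print(p_all_up(5, 2, 3))            # chance a single blind attempt works
    print(p_create_fails(5, 2, 3, 5))   # chance 5 blind attempts all fail

    # With health information: down nodes are filtered out before selection.
    cache = {"n1": True, "n2": True, "n3": True, "n4": False, "n5": False}
    print(healthy_candidates(["n1", "n2", "n3", "n4", "n5"], cache))
```

Under these assumptions only 1 of the 10 possible 3-node combinations is fully up, so a handful of blind retries can easily all fail. Filtering on known node health before selection removes that dependence on luck, which is consistent with retesting #1 before tuning the retry count in #2.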