Bug 1029211
Summary: | [Scale] Import 64 Node Cluster - One Host Fails To Successfully Come Up | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Matt Mahoney <mmahoney>
Component: | rhsc | Assignee: | Timothy Asir <tjeyasin>
Status: | CLOSED EOL | QA Contact: | storage-qa-internal <storage-qa-internal>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 2.1 | CC: | dpati, mmahoney, rhs-bugs, sabose, vagarwal
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2015-12-03 17:14:15 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
Matt Mahoney, 2013-11-11 22:06:44 UTC

Created attachment 822636 [details]
Import - One Host Fails To Successfully Come Up

Note: the same host that failed to come up had previously been successfully added to a cluster.

Matt, please mention which node/server did not come up in the actual result, and also attach all the logs for debugging purposes.

Matt, after multiple rounds of probing/detaching/volume creation, I was able to see this issue only once: I had a 64-node cluster with a total of 8 volumes and 512 bricks in all. One host went to the Unassigned state after the imports, but on the very next sync-up of the hosts the unassigned host also came up. After analysis I found GlusterHostUUIDNotFoundException entries in vdsm.log, which means that fetching the UUID details for the node failed once; on the next sync-up the same command succeeded, so the host came up. Check whether the same was the case in your scenario as well, or whether the host remains in the Unassigned state forever.

Let's try to test this using Corbett build CB10. If we can reproduce it even with a 5-minute resync, then we need to take a look at it. We might document this bug in the release notes.

Will retest in Corbett.

On the nodes that failed to install, vdsm failed to start with:

    2013-12-17 17:18:43 DEBUG otopi.plugins.otopi.services.rhel plugin.executeRaw:383 execute-result: ('/sbin/service', 'vdsmd', 'start'), rc=1
    2013-12-17 17:18:43 DEBUG otopi.plugins.otopi.services.rhel plugin.execute:441 execute-output: ('/sbin/service', 'vdsmd', 'start') stdout:
    Stopping ksmtuned: [FAILED]
    vdsm: Stop conflicting ksmtuned [FAILED]
    vdsm start [FAILED]

Tim, could you look into this?

Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release for which you requested a review is now End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/. If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.
Note: the same host that failed to come Up had previously been successfully added to a cluster. Matt, Pls. mention, which node/server did not come up, in actual result and also attach all the logs for debugging purpose. Matt, After multiple rounds of probing/detaching/volume creation, I was able to see this issue only once where I had 64 nodes cluster and total 8 volumes with 512 bricks in all. One host went to unassigned state after the imports. BUT the very next sync up of the hosts, the one unassigned host also came up. After analysis I find that there are GlusterHostUUIDNotFoundException in vdsm.log, which means getting the uuid details for the node has failed once, but next sync up time the same command was successful and so the host comes up. Check if the same was the case in your scenario as well, or the host remains in unassigned state forever? Lets try to test this using Corbett build CB10 and if we can reproduce it even with 5 min resynch, then we need to take a look at it. We might document this bug in the release note. Will retest in Corbett. On the nodes that failed to install, vdsm failed to start with 2013-12-17 17:18:43 DEBUG otopi.plugins.otopi.services.rhel plugin.executeRaw:383 execute-result: ('/sbin/service', 'vdsmd', 'start'), rc=1 2013-12-17 17:18:43 DEBUG otopi.plugins.otopi.services.rhel plugin.execute:441 execute-output: ('/sbin/service', 'vdsmd', 'start') stdout: Stopping ksmtuned: [FAILED] vdsm: Stop conflicting ksmtuned[FAILED] vdsm start[FAILED] Tim, could you look into this Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release for which you requested us to review, is now End of Life. Please See https://access.redhat.com/support/policy/updates/rhs/ If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release. |