Bug 1340049
Summary: | Gluster related xattr are not present on the brick after workload is run in the volume | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Shekhar Berry <shberry>
Component: | glusterd | Assignee: | Atin Mukherjee <amukherj>
Status: | CLOSED NOTABUG | QA Contact: | storage-qa-internal <storage-qa-internal>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | rhgs-3.1 | CC: | eboyd, jeder, jharriga, mliyazud, rcyriac, rhs-bugs, shberry, storage-qa-internal, vbellur
Target Milestone: | --- | Keywords: | ZStream
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2016-06-03 05:41:52 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description Shekhar Berry 2016-05-26 11:22:05 UTC
Link to log files: http://perf1.perf.lab.eng.bos.redhat.com/pub/mpillai/aplo/

So here is the analysis:

From gprfs13 cmd_history.log:

[2016-05-25 05:47:53.805251] : volume create cgluster replica 2 172.17.40.13:/bricks/b/g 172.17.40.14:/bricks/b/g 172.17.40.15:/bricks/b/g 172.17.40.16:/bricks/b/g 172.17.40.22:/bricks/b/g 172.17.40.24:/bricks/b/g : SUCCESS
[2016-05-25 05:47:54.632645] : v start cgluster : SUCCESS

So the volume was created and started at 05:47.

From gprfs14 bricks/bricks-b-g.log:

[2016-05-25 10:26:15.925553] W [MSGID: 113075] [posix-helpers.c:1824:posix_health_check_thread_proc] 0-cgluster-posix: health_check on /bricks/b/g returned [No such file or directory]
[2016-05-25 10:26:15.925563] M [MSGID: 113075] [posix-helpers.c:1845:posix_health_check_thread_proc] 0-cgluster-posix: health-check failed, going down
[2016-05-25 10:26:45.965533] M [MSGID: 113075] [posix-helpers.c:1851:posix_health_check_thread_proc] 0-cgluster-posix: still alive! -> SIGTERM
[2016-05-25 10:26:45.982655] W [glusterfsd.c:1251:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dc5) [0x7f241b8f6dc5] -->/usr/sbin/glusterfsd(glusterfs_sigwaiter+0xe5) [0x7f241cf70915] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x6b) [0x7f241cf7078b] ) 0-: received signum (15), shutting down

The first log entry shows that the posix health check failed with ENOENT, which means the brick path was deleted manually while the volume was up and running.

From gprfs13 cmd_history.log:

[2016-05-25 11:25:27.752059] : v stop cgluster : SUCCESS

The volume was stopped. Note that volume stop succeeds even if the brick path has been deleted.

[2016-05-25 11:25:40.199575] : v start cgluster : FAILED : Pre Validation failed on 172.17.40.14. Failed to get extended attribute trusted.glusterfs.volume-id for brick dir /bricks/b/g. Reason : No data available

Volume start failed here. This indicates that the brick directory exists again; otherwise the failure reason would have been that the brick does not exist. Instead, glusterd could not find the xattrs on the brick, which indicates the brick directory was recreated manually after the volume had already been configured; the xattrs were lost in the process and the volume could not be started.

From the look of the log files this is a setup issue: the bricks were removed manually while the volume still existed, which is *not supported*.

Shekhar,

As discussed, do you mind closing this bug now, given that you are unable to hit it? Feel free to reopen if you hit it again.

~Atin

I am closing this bug now; please feel free to reopen if you hit it again.

As Atin mentioned, it's closed.
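For reference, one way to verify whether the volume-id xattr is present on a brick path is getfattr (from the attr package, run as root on the brick host); the brick path below is the one from this setup, and the commands are only a verification sketch, not part of the original report:

    # dump all xattrs (including the trusted.* namespace) on the brick root, hex-encoded
    getfattr -d -m . -e hex /bricks/b/g

    # or query the volume-id xattr directly
    getfattr -n trusted.glusterfs.volume-id -e hex /bricks/b/g

On a healthy brick the output includes trusted.glusterfs.volume-id; its absence, as in the "No data available" pre-validation error above, means the directory was recreated without the xattrs that glusterd stamped on it at volume-create time.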