| Summary: | SMB:Glusterd on one node crashes while doing add-brick operation followed by rebalance. | |||
|---|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | surabhi <sbhaloth> | |
| Component: | distribute | Assignee: | Nithya Balachandran <nbalacha> | |
| Status: | CLOSED DEFERRED | QA Contact: | surabhi <sbhaloth> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 2.1 | CC: | asriram, lmohanty, nbalacha, nlevinki, rtalur, sbhaloth, sdharane, spalai, vagarwal, vbellur | |
| Target Milestone: | --- | |||
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | dht-add-brick | |||
| Fixed In Version: | Doc Type: | Known Issue | ||
| Doc Text: | While rebalance is in progress, adding a brick to the cluster displays an error message, "failed to get index" in the gluster log file. | Story Points: | --- | |
| Clone Of: | ||||
| : | 1286074 (view as bug list) | Environment: | ||
| Last Closed: | 2015-11-27 10:35:59 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | ||||
| Bug Blocks: | 1035040, 1286074 | |||
|
Description
surabhi
2013-12-09 11:17:49 UTC
Tried the rebalance test and saw the crash again.

Glusterfs version:
glusterfs-fuse-3.4.0.49rhs-1.el6rhs.x86_64
glusterfs-server-3.4.0.49rhs-1.el6rhs.x86_64

[2013-12-17 11:24:10.216412] I [socket.c:3520:socket_init] 0-management: using system polling thread
[2013-12-17 11:24:15.391601] E [glusterd-utils.c:7825:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2013-12-17 11:24:15.406033] E [glusterd-utils.c:7825:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2013-12-17 11:24:15.426745] E [glusterd-utils.c:7825:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2013-12-17 11:26:05.756373] I [glusterd-handshake.c:364:__server_event_notify] 0-: received defrag status updated
[2013-12-17 11:26:05.763349] W [socket.c:522:__socket_rwv] 0-management: readv o

Latest Sosreports placed in above location.

Please add doctext for this known issue.

Could you please retry this with the latest patches? A couple of fixes that address similar issues are part of the 3.4.0.54rhs build. A similar issue: https://bugzilla.redhat.com/show_bug.cgi?id=1024316

I will try it on glusterfs-3.4.0.55rhs-1.el6rhs.x86_64 and update the results.

I tried it on build 33 and was able to reproduce the bug on it. Here are the details:

Creating directory at /mnt/withreaddir//TestDir0/TestDir2/TestDir2
Creating files in /mnt/withreaddir//TestDir0/TestDir2/TestDir2......
Cannot open file: No such file or directory
flock() on closed filehandle FH at ./CreateDirAndFileTree.pl line 74.
Cannot lock - Bad file descriptor

root.42.178[Jan-08-2014- 6:30:55] >rpm -qa | grep gluster
glusterfs-fuse-3.4.0.33rhs-1.el6rhs.x86_64
glusterfs-rdma-3.4.0.33rhs-1.el6rhs.x86_64
glusterfs-libs-3.4.0.33rhs-1.el6rhs.x86_64
glusterfs-geo-replication-3.4.0.33rhs-1.el6rhs.x86_64
glusterfs-api-3.4.0.33rhs-1.el6rhs.x86_64
glusterfs-server-3.4.0.33rhs-1.el6rhs.x86_64
glusterfs-devel-3.4.0.33rhs-1.el6rhs.x86_64
glusterfs-3.4.0.33rhs-1.el6rhs.x86_64
glusterfs-api-devel-3.4.0.33rhs-1.el6rhs.x86_64
glusterfs-debuginfo-3.4.0.33rhs-1.el6rhs.x86_64

Analysis as of now: Gluster fails to create/open a file when all of the following hold:
a. The file's hash corresponds to the new brick.
b. The file is not directly under the root (/) of the volume.
c. The directory, or chain of directories, under which the file lies has not yet been created on the new brick.

The above analysis is for BZ 1049181.

Tried it on glusterfs-3.4.0.55rhs-1.el6rhs.x86_64: a core is no longer generated, but the failures seen while doing rebalance are still present.

[2014-01-20 06:16:57.077109] E [glusterd-utils.c:4007:glusterd_nodesvc_unlink_socket_file] 0-management: Failed to remove /var/run/fdc31c62f15c054be9507d58711f3d14.socket error: No such file or directory
[2014-01-20 06:16:57.079450] I [mem-pool.c:539:mem_pool_destroy] 0-management: size=2236 max=0 total=0
[2014-01-20 06:16:57.079473] I [rpc-clnt.c:977:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2014-01-20 06:17:22.171858] E [glusterd-utils.c:7964:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2014-01-20 06:17:22.189004] E [glusterd-utils.c:7964:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index
[2014-01-20 06:17:22.307779] E [glusterd-utils.c:7964:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index

Sosreports are updated.

Please review the edited doc text and sign off.

Doc text looks fine.

Cloning to 3.1.

To be fixed in a future release. |
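When retesting on a new build, it can help to count how often the "failed to get index" error shows up in the glusterd log rather than eyeballing it. The following is a minimal sketch in Python, not part of Gluster itself. It assumes the log line layout seen in the excerpts in this report (`[timestamp] LEVEL [file:line:function] domain: message`). The names `LOG_LINE`, `parse_line`, and `count_errors` are hypothetical helpers introduced only for illustration.

```python
import re
from collections import Counter

# Assumed layout, taken from the log excerpts quoted in this report:
# [timestamp] LEVEL [source-file:line:function] domain: message
LOG_LINE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<domain>\S*): (?P<msg>.*)"
)


def parse_line(text):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(text.strip())
    return match.groupdict() if match else None


def count_errors(lines, needle="failed to get index"):
    """Count error-level (E) entries containing `needle`, grouped by function name."""
    counts = Counter()
    for text in lines:
        entry = parse_line(text)
        if entry and entry["level"] == "E" and needle in entry["msg"]:
            counts[entry["func"]] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from this report.
    sample = [
        "[2013-12-17 11:24:10.216412] I [socket.c:3520:socket_init] 0-management: using system polling thread",
        "[2013-12-17 11:24:15.391601] E [glusterd-utils.c:7825:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index",
        "[2013-12-17 11:24:15.406033] E [glusterd-utils.c:7825:glusterd_volume_rebalance_use_rsp_dict] 0-: failed to get index",
    ]
    for func, n in count_errors(sample).items():
        print(func, n)
```

To run it against a real log, pass an open file object to `count_errors` in place of the sample list. The log file path depends on the installation and is not given in this report.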