Bug 981708
Summary: | Dist-geo-rep: fail to sync files, status is 'faulty' and log gives warning '1 subvolumes down -- not fixing' | |||
---|---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Rachana Patel <racpatel> | |
Component: | geo-replication | Assignee: | Bug Updates Notification Mailing List <rhs-bugs> | |
Status: | CLOSED EOL | QA Contact: | Matt Zywusko <mzywusko> | |
Severity: | high | Docs Contact: | ||
Priority: | medium | |||
Version: | 2.1 | CC: | avishwan, chrisw, csaba, mzywusko, nsathyan, rhs-bugs, rwheeler, sdharane, vagarwal | |
Target Milestone: | --- | |||
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | glusterfs-3.4.0.15rhs-1 | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 982913 (view as bug list) | Environment: | ||
Last Closed: | 2015-11-25 08:49:48 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 982913 |
Description
Rachana Patel
2013-07-05 14:31:48 UTC
Shishir, let's discuss this on Monday. I think the solution should be the same as https://code.engineering.redhat.com/gerrit/9801, but I am not very sure.

Not able to reproduce with build glusterfs-3.4.0.18rhs-1.

Able to reproduce with glusterfs-3.4.0.24rhs-1.

Steps to Reproduce:
1. Had a distributed volume and created data on it.
- Created and started a geo-rep session.
- Added a new RHSS (10.70.37.67) to the master cluster and added a brick to the volume (Brick6: 10.70.37.67:/rhs/brick1/2).
- Started the rebalance process for that server.
- Ran gsec, create push-pem, and the start command (with force option).
- Kept checking status: the process on one server goes into faulty status and after some time it is defunct.

[root@4DVM3 ~]# gluster v info 4_master2
Volume Name: 4_master2
Type: Distribute
Volume ID: 82f7b96b-ac89-4603-9e64-c3be5d6b2db3
Status: Started
Number of Bricks: 6
Transport-type: tcp
Bricks:
Brick1: 10.70.37.110:/rhs/brick1/2
Brick2: 10.70.37.81:/rhs/brick1/2
Brick3: 10.70.37.110:/rhs/brick2/2
Brick4: 10.70.37.81:/rhs/brick2/2
Brick5: 10.70.37.110:/rhs/brick3/2
Brick6: 10.70.37.67:/rhs/brick1/2
Options Reconfigured:
geo-replication.indexing: on
geo-replication.ignore-pid-check: on
changelog.changelog: on

[root@4DVM2 ~]# gluster volume geo 4_master2 rhsauto026.lab.eng.blr.redhat.com::4_slave2 status
NODE                            MASTER       SLAVE                                          HEALTH     UPTIME
-------------------------------------------------------------------------------------------------------------------------
4DVM2.lab.eng.blr.redhat.com    4_master2    rhsauto026.lab.eng.blr.redhat.com::4_slave2    defunct    N/A
4DVM3.lab.eng.blr.redhat.com    4_master2    rhsauto026.lab.eng.blr.redhat.com::4_slave2    Stable     3 days 05:11:36
4DVM5.lab.eng.blr.redhat.com    4_master2    rhsauto026.lab.eng.blr.redhat.com::4_slave2    Stable     01:34:15

Log from that server:

[root@4DVM2 ~]# less /var/log/glusterfs/geo-replication/4_master2/ssh%3A%2F%2Froot%4010.70.37.1%3Agluster%3A%2F%2F127.0.0.1%3A4_slave2.%2Frhs%2Fbrick1%2F2.gluster.log | grep -B6 shutting
[2013-08-30 01:18:06.963483] I
[client.c:2103:client_rpc_notify] 0-4_master2-client-2: disconnected from 10.70.37.110:49156. Client process will keep trying to connect to glusterd until brick's port is available.
[2013-08-30 01:18:06.963492] W [dht-common.c:5111:dht_notify] 0-4_master2-dht: Received CHILD_DOWN. Exiting
[2013-08-30 01:18:06.963521] I [client.c:2103:client_rpc_notify] 0-4_master2-client-3: disconnected from 10.70.37.81:49156. Client process will keep trying to connect to glusterd until brick's port is available.
[2013-08-30 01:18:06.963531] W [dht-common.c:5111:dht_notify] 0-4_master2-dht: Received CHILD_DOWN. Exiting
[2013-08-30 01:18:06.963559] I [client.c:2103:client_rpc_notify] 0-4_master2-client-4: disconnected from 10.70.37.110:49157. Client process will keep trying to connect to glusterd until brick's port is available.
[2013-08-30 01:18:06.963567] W [dht-common.c:5111:dht_notify] 0-4_master2-dht: Received CHILD_DOWN. Exiting
[2013-08-30 01:18:06.963964] W [glusterfsd.c:1062:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x315aee890d] (-->/lib64/libpthread.so.0() [0x315b607851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x4052dd]))) 0-: received signum (15), shutting down
--
[2013-08-30 01:22:39.197446] I [dht-layout.c:633:dht_layout_normalize] 0-4_master2-dht: found anomalies in /flat/1/etc2/rpm. holes=1 overlaps=0 missing=1 down=0 misc=0
[2013-08-30 01:22:39.749246] I [dht-layout.c:633:dht_layout_normalize] 0-4_master2-dht: found anomalies in /flat/1/etc2/mcelog. holes=1 overlaps=0 missing=1 down=0 misc=0
[2013-08-30 01:22:39.749544] W [client-rpc-fops.c:322:client3_3_mkdir_cbk] 0-4_master2-client-5: remote operation failed: File exists. Path: /flat/1/etc2/mcelog
[2013-08-30 01:22:39.958014] I [dht-layout.c:633:dht_layout_normalize] 0-4_master2-dht: found anomalies in /flat/1/etc2/makedev.d.
holes=1 overlaps=0 missing=1 down=0 misc=0
[2013-08-30 01:22:39.968967] W [client-rpc-fops.c:322:client3_3_mkdir_cbk] 0-4_master2-client-5: remote operation failed: File exists. Path: /flat/1/etc2/makedev.d
[2013-08-30 01:22:40.701434] I [fuse-bridge.c:5714:fuse_thread_proc] 0-fuse: unmounting /tmp/gsyncd-aux-mount-iCFhsq
[2013-08-30 01:22:40.701745] W [glusterfsd.c:1062:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x315aee890d] (-->/lib64/libpthread.so.0() [0x315b607851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x4052dd]))) 0-: received signum (15), shutting down
--
[2013-08-30 14:43:46.197873] I [dht-layout.c:633:dht_layout_normalize] 0-4_master2-dht: found anomalies in /flat/59/etc6/sssd. holes=1 overlaps=0 missing=1 down=0 misc=0
[2013-08-30 14:43:46.485099] I [dht-layout.c:633:dht_layout_normalize] 0-4_master2-dht: found anomalies in /flat/59/etc6/postfix. holes=1 overlaps=0 missing=1 down=0 misc=0
[2013-08-30 14:43:46.492639] W [client-rpc-fops.c:322:client3_3_mkdir_cbk] 0-4_master2-client-5: remote operation failed: File exists. Path: /flat/59/etc6/postfix
[2013-08-30 14:43:46.794727] I [dht-layout.c:633:dht_layout_normalize] 0-4_master2-dht: found anomalies in /flat/59/etc6/init.
holes=1 overlaps=1 missing=0 down=0 misc=0
[2013-08-30 14:43:47.015001] I [dht-common.c:1035:dht_lookup_everywhere_cbk] 0-4_master2-dht: deleting stale linkfile /flat/59/etc6/init/rc.conf on 4_master2-client-0
[2013-08-30 14:43:47.192651] I [fuse-bridge.c:5714:fuse_thread_proc] 0-fuse: unmounting /tmp/gsyncd-aux-mount-dZdREe
[2013-08-30 14:43:47.393305] W [glusterfsd.c:1062:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x315aee890d] (-->/lib64/libpthread.so.0() [0x315b607851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x4052dd]))) 0-: received signum (15), shutting down
--
[2013-08-30 14:44:02.448437] I [client-handshake.c:1468:client_setvolume_cbk] 0-4_master2-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2013-08-30 14:44:02.452657] I [fuse-bridge.c:5855:fuse_graph_setup] 0-fuse: switched to graph 0
[2013-08-30 14:44:02.452760] I [client-handshake.c:450:client_set_lk_version_cbk] 0-4_master2-client-1: Server lk version = 1
[2013-08-30 14:44:02.452780] I [client-handshake.c:450:client_set_lk_version_cbk] 0-4_master2-client-3: Server lk version = 1
[2013-08-30 14:44:02.452874] I [fuse-bridge.c:4810:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2013-08-30 15:57:05.809306] I [fuse-bridge.c:5714:fuse_thread_proc] 0-fuse: unmounting /tmp/gsyncd-aux-mount-1jTEVe
[2013-08-30 15:57:05.977090] W [glusterfsd.c:1062:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x315aee890d] (-->/lib64/libpthread.so.0() [0x315b607851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x4052dd]))) 0-: received signum (15), shutting down
--
[2013-08-30 15:57:21.918372] I [client-handshake.c:450:client_set_lk_version_cbk] 0-4_master2-client-4: Server lk version = 1
[2013-08-30 15:57:21.918583] I [fuse-bridge.c:4810:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2013-08-30 17:21:46.566736] I [dht-common.c:1141:dht_lookup_linkfile_cbk] 0-4_master2-dht:
lookup of /flat/21/etc10/makedev.d/02dac960 on 4_master2-client-2 (following linkfile) reached link
[2013-08-30 17:21:46.648623] W [client-rpc-fops.c:256:client3_3_mknod_cbk] 0-4_master2-client-2: remote operation failed: File exists. Path: /flat/21/etc10/makedev.d/02dac960
[2013-08-30 17:21:46.669311] W [dht-linkfile.c:44:dht_linkfile_lookup_cbk] 0-4_master2-dht: got non-linkfile 4_master2-client-2:/flat/21/etc10/makedev.d/02dac960
[2013-08-30 17:21:47.002521] I [fuse-bridge.c:5714:fuse_thread_proc] 0-fuse: unmounting /tmp/gsyncd-aux-mount-CA2gwx
[2013-08-30 17:21:47.430519] W [glusterfsd.c:1062:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x315aee890d] (-->/lib64/libpthread.so.0() [0x315b607851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x4052dd]))) 0-: received signum (15), shutting down
--
[2013-09-01 03:14:09.105737] I [dht-layout.c:633:dht_layout_normalize] 0-4_master2-dht: found anomalies in /flat/flat/44/etc2/hal/fdi/policy. holes=1 overlaps=1 missing=2 down=0 misc=0
[2013-09-01 03:14:09.232154] I [dht-layout.c:633:dht_layout_normalize] 0-4_master2-dht: found anomalies in /flat/flat/44/etc2/kdump-adv-conf. holes=1 overlaps=1 missing=2 down=0 misc=0
[2013-09-01 03:14:09.959418] I [dht-layout.c:633:dht_layout_normalize] 0-4_master2-dht: found anomalies in /flat/flat/44/etc2/kdump-adv-conf/kdump_initscripts. holes=1 overlaps=1 missing=2 down=0 misc=0
[2013-09-01 03:14:10.599752] I [dht-layout.c:633:dht_layout_normalize] 0-4_master2-dht: found anomalies in /flat/flat/44/etc2/gtk-2.0/x86_64-redhat-linux-gnu.
holes=1 overlaps=2 missing=0 down=0 misc=0
[2013-09-01 03:14:11.130747] I [dht-common.c:1035:dht_lookup_everywhere_cbk] 0-4_master2-dht: deleting stale linkfile /flat/flat/44/etc2/gtk-2.0/x86_64-redhat-linux-gnu/gdk-pixbuf.loaders on 4_master2-client-0
[2013-09-01 03:14:12.023580] I [fuse-bridge.c:5714:fuse_thread_proc] 0-fuse: unmounting /tmp/gsyncd-aux-mount-bxpGbb
[2013-09-01 03:14:13.136548] W [glusterfsd.c:1062:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x315aee890d] (-->/lib64/libpthread.so.0() [0x315b607851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x4052dd]))) 0-: received signum (15), shutting down
-->

[root@4DVM2 ~]# less /var/log/glusterfs/geo-replication/4_master2/ssh%3A%2F%2Froot%4010.70.37.1%3Agluster%3A%2F%2F127.0.0.1%3A4_slave2.%2Frhs%2Fbrick1%2F2.gluster.log | grep 'disk layout missing'
[2013-08-30 01:18:18.871504] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:20.568471] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:21.710717] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:22.872744] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:22.995967] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:24.254494] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:24.526342] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:25.566394] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:25.714968] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:26.782343] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:27.029267] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:28.027141] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:28.293197] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:29.358315] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:29.718380] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:30.600632] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:31.896036] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:32.174283] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:33.302341] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:33.529066] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:34.386815] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:35.083716] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:35.516559] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:36.303773] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:36.660047] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:37.607971] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:38.116217] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30
01:18:38.719176] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:39.315605] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:39.978429] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:40.751351] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:41.431811] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:41.855414] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:42.661853] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:42.963473] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:43.888822] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:44.202181] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:45.032295] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:45.393314] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:46.218640] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:46.535993] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:47.483077] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 01:18:47.723434] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:48.728541] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30
01:18:48.945093] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: / - disk layout missing
[2013-08-30 01:18:49.904291] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat - disk layout missing
[2013-08-30 04:47:38.025596] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/14 - disk layout missing
[2013-08-30 06:32:47.115101] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/20 - disk layout missing
[2013-08-30 08:21:40.693015] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/28 - disk layout missing
[2013-08-30 08:30:57.120143] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/28 - disk layout missing
[2013-08-30 08:32:05.010634] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/28 - disk layout missing
[2013-08-30 09:46:26.413607] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/35 - disk layout missing
[2013-08-30 22:45:49.459738] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/51 - disk layout missing
[2013-08-31 02:27:51.010273] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/70 - disk layout missing
[2013-08-31 02:56:09.013264] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/72 - disk layout missing
[2013-08-31 13:04:19.605368] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/flat - disk layout missing
[2013-08-31 13:04:19.606665] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/flat - disk layout missing
[2013-08-31 20:52:33.006487] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/flat/26/etc3/selinux/targeted/modules/active/modules - disk layout missing
[2013-08-31 20:52:33.010413] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/flat/26/etc3/selinux/targeted/modules/active/modules - disk layout missing
[2013-08-31 23:22:36.005677] I [dht-layout.c:720:dht_layout_dir_mismatch]
0-4_master2-dht: /flat/flat/33/etc5/rc.d/rc5.d - disk layout missing
[2013-08-31 23:22:36.029402] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/flat/33/etc5/rc.d/rc5.d - disk layout missing
[2013-09-02 06:13:34.016586] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/flat/47 - disk layout missing
[2013-09-02 06:13:34.070065] I [dht-layout.c:720:dht_layout_dir_mismatch] 0-4_master2-dht: /flat/flat/47 - disk layout missing

This time '1 subvolumes down -- not fixing' is not present in the log.

Getting the same defect in build 3.4.0.32rhs-1.el6_4.x86_64 as well.

Targeting for 3.0.0 (Denali) release.

Seems related to failures from dht. Need to investigate further.

Closing this bug since the RHGS 2.1 release reached EOL. Required bugs are cloned to RHGS 3.1. Please re-open this issue if found again.