Bug 996003
Summary: | DHT: Rebalance never finishes in case of a node restart | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | shylesh <shmohan> |
Component: | glusterfs | Assignee: | Susant Kumar Palai <spalai> |
Status: | CLOSED DUPLICATE | QA Contact: | shylesh <shmohan> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 2.1 | CC: | nsathyan, rhs-bugs, shmohan, spalai, vbellur |
Target Milestone: | --- | ||
Target Release: | RHGS 3.0.0 | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2014-06-03 06:50:33 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 923774 |
Description
shylesh, 2013-08-12 07:39:51 UTC
Targeting for 3.0.0 (Denali) release.

Shylesh, I was not able to reproduce the issue with the above steps. Created a 2-brick distribute volume:

```
[root@localhost upstream]# gluster v i
Volume Name: test1
Type: Distribute
Volume ID: 84a3db85-9328-4d4d-adc2-045bddb55414
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.70.42.211:/brick/1
Brick2: 10.70.43.49:/brick/1
```

Created 10000 directories on the mount point:

```
[root@localhost mnt]# df -h
Filesystem                     Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root   9.6G  6.7G  2.5G  74% /
tmpfs                         1004M     0 1004M   0% /dev/shm
/dev/vda1                      485M   32M  428M   7% /boot
/dev/mapper/VolGroup-lv_brick  9.8G   33M  9.8G   1% /brick
10.70.42.211:test1              20G   88M   20G   1% /mnt
[root@localhost mnt]# ls
[root@localhost mnt]# mkdir dir{1..10000}
```

From 10.70.42.211, started a fix-layout rebalance:

```
[root@localhost mnt]# gluster volume rebalance test1 fix-layout start
volume rebalance: test1: success: Starting rebalance on volume test1 has been successful.
ID: 074a2a5f-8b60-4b33-8908-569cec40215e

[root@localhost upstream]# gluster volume rebalance test1 status
        Node  Rebalanced-files  size    scanned  failures  skipped  status                  run time in secs
   ---------  ----------------  ------  -------  --------  -------  ----------------------  ----------------
   localhost  0                 0Bytes  0        0         0        fix-layout in progress  2.00
10.70.42.211  0                 0Bytes  0        0         0        fix-layout in progress  1.00
```

In the middle of the operation, rebooted 10.70.43.49:

```
[root@localhost upstream]# reboot

Broadcast message from root (/dev/pts/0) at 7:31 ...
The system is going down for reboot NOW!
Connection to 10.70.43.49 closed by remote host.
Connection to 10.70.43.49 closed.
```
After the server came up, checked rebalance status:

```
[root@localhost mnt]# gluster v rebalance test1 status
       Node  Rebalanced-files  size    scanned  failures  skipped  status             run time in secs
  ---------  ----------------  ------  -------  --------  -------  -----------------  ----------------
  localhost  0                 0Bytes  0        1         0        fix-layout failed  64.00
10.70.43.49  0                 0Bytes  0        1         0        fix-layout failed  0.00
```

Found the following in the log:

```
[2014-03-04 07:37:28.828568] I [dht-common.c:2601:dht_setxattr] 0-test1-dht: fixing the layout of /dir2123
[2014-03-04 07:37:28.838269] I [dht-common.c:2601:dht_setxattr] 0-test1-dht: fixing the layout of /dir2124
[2014-03-04 07:37:28.852623] I [dht-common.c:2601:dht_setxattr] 0-test1-dht: fixing the layout of /dir2125
[2014-03-04 07:37:28.856090] I [dht-common.c:2601:dht_setxattr] 0-test1-dht: fixing the layout of /dir2126
[2014-03-04 07:37:28.865942] I [dht-common.c:2601:dht_setxattr] 0-test1-dht: fixing the layout of /dir2127
[2014-03-04 07:38:12.083731] I [dht-rebalance.c:1826:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 56.00 secs
[2014-03-04 07:38:12.083771] I [dht-rebalance.c:1829:gf_defrag_status_get] 0-glusterfs: Files migrated: 0, size: 0, lookups: 0, failures: 0, skipped: 0
[2014-03-04 07:38:20.382415] W [socket.c:522:__socket_rwv] 0-test1-client-1: readv on 10.70.43.49:49156 failed (Connection reset by peer)
[2014-03-04 07:38:20.382698] E [rpc-clnt.c:369:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x147) [0x7f3d853525ae] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x139) [0x7f3d85351ba6] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0x1f) [0x7f3d853516a3]))) 0-test1-client-1: forced unwinding frame type(GlusterFS 3.3) op(OPENDIR(20)) called at 2014-03-04 07:37:28.872648 (xid=0x361a)
[2014-03-04 07:38:20.382723] W [client-rpc-fops.c:2689:client3_3_opendir_cbk] 0-test1-client-1: remote operation failed: Transport endpoint is not connected. Path: /dir2127 (b520279b-781b-4f3b-90b9-3018557b0360)
[2014-03-04 07:38:20.383338] E [rpc-clnt.c:369:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x147) [0x7f3d853525ae] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x139) [0x7f3d85351ba6] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0x1f) [0x7f3d853516a3]))) 0-test1-client-1: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2014-03-04 07:37:58.574734 (xid=0x361b)
[2014-03-04 07:38:20.383348] W [client-handshake.c:276:client_ping_cbk] 0-test1-client-1: timer must have expired
[2014-03-04 07:38:20.383365] I [client.c:2208:client_rpc_notify] 0-test1-client-1: disconnected from 10.70.43.49:49156. Client process will keep trying to connect to glusterd until brick's port is available
[2014-03-04 07:38:20.383376] W [dht-common.c:5172:dht_notify] 0-test1-dht: Received CHILD_DOWN. Exiting
[2014-03-04 07:38:20.383383] I [dht-rebalance.c:1849:gf_defrag_stop] 0-: Received stop command on rebalance
[2014-03-04 07:38:20.383443] E [dht-rebalance.c:1541:gf_defrag_fix_layout] 0-test1-dht: Fix layout failed for /dir2127
[2014-03-04 07:38:20.383583] I [dht-rebalance.c:1826:gf_defrag_status_get] 0-glusterfs: Rebalance is failed. Time taken is 64.00 secs
[2014-03-04 07:38:20.383595] I [dht-rebalance.c:1829:gf_defrag_status_get] 0-glusterfs: Files migrated: 0, size: 0, lookups: 0, failures: 1, skipped: 0
[2014-03-04 07:38:20.383809] W [glusterfsd.c:1128:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x3ea50e890d] (-->/lib64/libpthread.so.0() [0x3ea5807851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe4) [0x407e55]))) 0-: received signum (15), shutting down
```

Will confirm the behaviour with the code.

Shylesh, here is the code snippet on how a child-down event is handled. From dht_notify:

```
case GF_EVENT_CHILD_DOWN:
        subvol = data;

        if (conf->assert_no_child_down) {
                gf_log (this->name, GF_LOG_WARNING,
                        "Received CHILD_DOWN. Exiting");
                if (conf->defrag) {
                        /* status is set to GF_DEFRAG_STATUS_FAILED
                         * in gf_defrag_stop */
                        gf_defrag_stop (conf->defrag,
                                        GF_DEFRAG_STATUS_FAILED, NULL);
                } else {
                        kill (getpid(), SIGTERM);
                }
        }
```

From gf_defrag_status_get:

```
gf_defrag_status_get (gf_defrag_info_t *defrag, dict_t *dict)
{
        switch (defrag->defrag_status) {
        case GF_DEFRAG_STATUS_NOT_STARTED:
                status = "not started";
                break;
        case GF_DEFRAG_STATUS_STARTED:
                status = "in progress";
                break;
        case GF_DEFRAG_STATUS_STOPPED:
                status = "stopped";
                break;
        case GF_DEFRAG_STATUS_COMPLETE:
                status = "completed";
                break;
        case GF_DEFRAG_STATUS_FAILED:
                status = "failed"; /* status is set to failed and logged */
                break;
        default:
                break;
        }

        gf_log (THIS->name, GF_LOG_INFO, "Rebalance is %s. Time taken is %.2f "
                "secs", status, elapsed);
        gf_log (THIS->name, GF_LOG_INFO, "Files migrated: %"PRIu64", size: %"
                PRIu64", lookups: %"PRIu64", failures: %"PRIu64", skipped: "
                "%"PRIu64, files, size, lookup, failures, skipped);
}
```

Can you confirm the behaviour with the latest build?

Susant, the description says the volume type was distributed-replicate, whereas you tried with a distribute volume. Please try with a distributed-replicate volume and let me know.

Shylesh, I am able to see one issue with reboot.
Reproducible: Always

Volume:

```
Volume Name: test1
Type: Distributed-Replicate
Volume ID: 1dcd8dc0-3d7e-4e7f-b561-c3fadd053bf3
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.42.211:/brick/1
Brick2: 10.70.43.49:/brick/1
Brick3: 10.70.42.211:/brick/2
Brick4: 10.70.43.49:/brick/2
```

Created some data:

```
[root@localhost mnt]# ls
dir1   dir11  dir13  dir15  dir17  dir19  dir20  dir4  dir6  dir8  zinkile1   zinkile2  zinkile4  zinkile6  zinkile8
dir10  dir12  dir14  dir16  dir18  dir2   dir3   dir5  dir7  dir9  zinkile10  zinkile3  zinkile5  zinkile7  zinkile9
```

Added a pair of bricks and started rebalance:

```
[root@localhost mnt]# gluster v add-brick test1 10.70.42.211:/brick/5 10.70.43.49:/brick/5
volume add-brick: success

[root@localhost mnt]# gluster v rebalance test1 status
       Node  Rebalanced-files  size    scanned  failures  skipped  status     run time in secs
  ---------  ----------------  ------  -------  --------  -------  ---------  ----------------
  localhost  0                 0Bytes  10       0         3        completed  0.00
10.70.43.49  0                 0Bytes  10       0         0        completed  0.00
volume rebalance: test1: success:
```

Now rebooted 10.70.43.49:

```
[root@localhost ~]# reboot

Broadcast message from root (/dev/pts/0) at 18:18 ...
The system is going down for reboot NOW!
Connection to 10.70.43.49 closed by remote host.
Connection to 10.70.43.49 closed.
```
Now logged in to 10.70.43.49 and started glusterd:

```
[susant@admin ~]$ vm3
root.43.49's password:
Last login: Tue Mar 4 13:49:40 2014 from dhcp-0-200.blr.redhat.com
[root@localhost ~]# pgrep glusterd
[root@localhost ~]# glusterd
```

After glusterd started, saw that rebalance started again on 10.70.43.49:

```
[2014-03-04 18:20:08.269956] I [dht-rebalance.c:1393:gf_defrag_migrate_data] 0-test1-dht: Migration operation on dir /dir13 took 0.00 secs
[2014-03-04 18:20:08.273536] I [dht-common.c:2601:dht_setxattr] 0-test1-dht: fixing the layout of /dir14
[2014-03-04 18:20:08.276153] I [dht-rebalance.c:1170:gf_defrag_migrate_data] 0-test1-dht: migrate data called on /dir14
[2014-03-04 18:20:08.278087] I [dht-rebalance.c:1393:gf_defrag_migrate_data] 0-test1-dht: Migration operation on dir /dir14 took 0.00 secs
[2014-03-04 18:20:08.281345] I [dht-common.c:2601:dht_setxattr] 0-test1-dht: fixing the layout of /dir15
[2014-03-04 18:20:08.284134] I [dht-rebalance.c:1170:gf_defrag_migrate_data] 0-test1-dht: migrate data called on /dir15
[2014-03-04 18:20:08.286188] I [dht-rebalance.c:1393:gf_defrag_migrate_data] 0-test1-dht: Migration operation on dir /dir15 took 0.00 secs
[2014-03-04 18:20:08.289099] I [dht-common.c:2601:dht_setxattr] 0-test1-dht: fixing the layout of /dir16
[2014-03-04 18:20:08.291804] I [dht-rebalance.c:1170:gf_defrag_migrate_data] 0-test1-dht: migrate data called on /dir16
[2014-03-04 18:20:08.293951] I [dht-rebalance.c:1393:gf_defrag_migrate_data] 0-test1-dht: Migration operation on dir /dir16 took 0.00 secs
[2014-03-04 18:20:08.297058] I [dht-common.c:2601:dht_setxattr] 0-test1-dht: fixing the layout of /dir17
[2014-03-04 18:20:08.299245] I [dht-rebalance.c:1170:gf_defrag_migrate_data] 0-test1-dht: migrate data called on /dir17
[2014-03-04 18:20:08.300983] I [dht-rebalance.c:1393:gf_defrag_migrate_data] 0-test1-dht: Migration operation on dir /dir17 took 0.00 secs
[2014-03-04 18:20:08.303776] I [dht-common.c:2601:dht_setxattr] 0-test1-dht: fixing the layout of /dir18
[2014-03-04 18:20:08.306262] I [dht-rebalance.c:1170:gf_defrag_migrate_data] 0-test1-dht: migrate data called on /dir18
[2014-03-04 18:20:08.308070] I [dht-rebalance.c:1393:gf_defrag_migrate_data] 0-test1-dht: Migration operation on dir /dir18 took 0.00 secs
[2014-03-04 18:20:08.311072] I [dht-common.c:2601:dht_setxattr] 0-test1-dht: fixing the layout of /dir19
[2014-03-04 18:20:08.313643] I [dht-rebalance.c:1170:gf_defrag_migrate_data] 0-test1-dht: migrate data called on /dir19
[2014-03-04 18:20:08.315300] I [dht-rebalance.c:1393:gf_defrag_migrate_data] 0-test1-dht: Migration operation on dir /dir19 took 0.00 secs
[2014-03-04 18:20:08.318545] I [dht-common.c:2601:dht_setxattr] 0-test1-dht: fixing the layout of /dir20
[2014-03-04 18:20:08.321079] I [dht-rebalance.c:1170:gf_defrag_migrate_data] 0-test1-dht: migrate data called on /dir20
[2014-03-04 18:20:08.323018] I [dht-rebalance.c:1393:gf_defrag_migrate_data] 0-test1-dht: Migration operation on dir /dir20 took 0.00 secs
[2014-03-04 18:20:08.327121] I [dht-rebalance.c:1826:gf_defrag_status_get] 0-glusterfs: Rebalance is completed. Time taken is 0.00 secs
[2014-03-04 18:20:08.327163] I [dht-rebalance.c:1829:gf_defrag_status_get] 0-glusterfs: Files migrated: 0, size: 0, lookups: 10, failures: 0, skipped: 0
```

It's a glusterd bug. Will check with Kaushal.

Dev ack to 3.0 RHS BZs.

Hi Shylesh, here are my findings for this bug.
```
[root@localhost mnt]# gvi
Volume Name: test1
Type: Distribute
Volume ID: 7552e742-d2e8-4282-82b7-c210c8f5ff2c
Status: Started
Snap Volume: no
Number of Bricks: 5
Transport-type: tcp
Bricks:
Brick1: 192.168.122.11:/brick/a
Brick2: 192.168.122.100:/brick/b
Brick3: 192.168.122.100:/brick/c
```

Added two bricks:

```
[root@localhost mnt]# gvi
Volume Name: test1
Type: Distribute
Volume ID: 7552e742-d2e8-4282-82b7-c210c8f5ff2c
Status: Started
Snap Volume: no
Number of Bricks: 5
Transport-type: tcp
Bricks:
Brick1: 192.168.122.11:/brick/a
Brick2: 192.168.122.100:/brick/b
Brick3: 192.168.122.100:/brick/c
Brick4: 192.168.122.11:/brick/@
Brick5: 192.168.122.100:/brick/@
```

Started fix-layout, then rebooted 192.168.122.100. After the server came up, rebalance restarted only for 192.168.122.100. On 192.168.122.100, checked status after reboot:

```
[root@localhost ~]# gluster v rebalance test1 status
          Node  Rebalanced-files  size    scanned  failures  skipped  status                  run time in secs
     ---------  ----------------  ------  -------  --------  -------  ----------------------  ----------------
     localhost  0                 0Bytes  0        0         0        fix-layout in progress  13.00
192.168.122.11  0                 0Bytes  0        1         0        fix-layout failed       3.00
volume rebalance: test1: success:

[root@localhost ~]# gluster v rebalance test1 status
          Node  Rebalanced-files  size    scanned  failures  skipped  status                  run time in secs
     ---------  ----------------  ------  -------  --------  -------  ----------------------  ----------------
     localhost  0                 0Bytes  0        0         0        fix-layout completed    20.00
192.168.122.11  0                 0Bytes  0        1         0        fix-layout failed       3.00
volume rebalance: test1: success:
[root@localhost ~]#
```

Hence, moving this to ON_QA, as the behaviour has changed over time: if any server goes down in the middle of the rebalance process, after reboot only that server's rebalance process should restart. The issue I mentioned in comment 8 is handled in bug https://bugzilla.redhat.com/show_bug.cgi?id=1046908.

*** This bug has been marked as a duplicate of bug 1046908 ***