Bug 1562484

Summary: [Tracker-RHGS-BZ#1631664] device remove fails with error 'Id not found', although the device is still present
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: krishnaram Karthick <kramdoss>
Component: heketi
Assignee: Michael Adam <madam>
Status: CLOSED DEFERRED
QA Contact: krishnaram Karthick <kramdoss>
Severity: high
Docs Contact:
Priority: unspecified
Version: cns-3.9
CC: amukherj, hchiramm, jmulligan, ndevos, rhs-bugs, rtalur, storage-qa-internal
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-01-24 00:16:17 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1631664    
Bug Blocks: 1641915    
Attachments: heketi_logs (flags: none)

Description krishnaram Karthick 2018-03-31 04:00:45 UTC
Description of problem:
With the following test case, device removal fails with the error 'Failed to remove device, error: Id not found'

1) create a few volumes via heketi
2) remove the device on which the volumes are created
3) while device removal is in progress, reduce the gluster pod count to 2, i.e., remove the label 'glusterfs: storage-host' from one of the nodes (see the label commands sketched below)
4) device removal fails with the error below - this is expected

[root@dhcp47-116 ~]# heketi-cli device remove 2acb109d758154a8d12de3b90e084f0c
Error: Failed to remove device, error: Unable to replace brick 10.70.47.166:/var/lib/heketi/mounts/vg_2acb109d758154a8d12de3b90e084f0c/brick_32967d19fad73bae52c45bdc3e7574a1/brick with 10.70.47.166:/var/lib/heketi/mounts/vg_ea53cfb78199251ba13668b8ff16a350/brick_7cad9f700851411d3ae9b687a7a1f308/brick for volume vol_9480cf14bde1c564b1d67d8e364fe18f

5) Now, add the label back and wait for the gluster pod to be up and running
6) re-run the device remove operation

heketi-cli device remove 2acb109d758154a8d12de3b90e084f0c
Error: Failed to remove device, error: Id not found
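
For reference, steps 3 and 5 were done by toggling the node label that controls where the gluster pods are scheduled. A minimal sketch, assuming an OpenShift/CNS setup and a placeholder node name (not taken from this report):

# step 3: drop the label so the gluster pod on this node is removed
oc label node <node-name> glusterfs-

# step 5: add the label back so the gluster pod is rescheduled
oc label node <node-name> glusterfs=storage-host

# wait until the pod is back in Running state before re-running the device remove
oc get pods -o wide -w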

[root@dhcp47-116 ~]# heketi-cli node info 4f29a5682ca8842da2f8ceb41c2eec94
Node Id: 4f29a5682ca8842da2f8ceb41c2eec94
State: online
Cluster Id: 82a2d4b56387e2912c23dc1ff8abb025
Zone: 1
Management Hostname: dhcp47-166.lab.eng.blr.redhat.com
Storage Hostname: 10.70.47.166
Devices:
Id:2acb109d758154a8d12de3b90e084f0c   Name:/dev/sdd            State:offline   Size (GiB):99      Used (GiB):15      Free (GiB):84      
Id:ea53cfb78199251ba13668b8ff16a350   Name:/dev/sde            State:online    Size (GiB):599     Used (GiB):2       Free (GiB):597     

This is not expected. This code path was fixed in the 3.9 release, but it looks like it is not completely fixed.

Version-Release number of selected component (if applicable):
 rpm -qa | grep 'heketi'
heketi-6.0.0-7.1.el7rhgs.x86_64
python-heketi-6.0.0-7.1.el7rhgs.x86_64
heketi-client-6.0.0-7.1.el7rhgs.x86_64

How reproducible:
1/1

Actual results:
device remove failed

Expected results:
device remove should succeed

Additional info:
heketi logs will be attached

Comment 2 krishnaram Karthick 2018-03-31 04:14:32 UTC
Created attachment 1415356 [details]
heketi_logs

Comment 3 Humble Chirammal 2018-03-31 14:31:37 UTC
We will analyze the issue in detail soon.

However, I want to make one comment here about the error message itself, i.e.

>> "Error: Failed to remove device, error: Id not found"

I think these 'Id not found' messages should also list the problematic ID. Mostly it is going to be the same ID provided in the command itself, but it is good to display the ID in the error, and it would also help if we are failing to find some other ID (not sure we have such a code path, though) in this transaction.
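
Until the error carries the ID, one way to cross-check which IDs heketi actually knows is to dump the topology and grep for the ID from the failing command; a rough sketch using the device ID from this report:

# list every Id heketi knows about, then look for the one passed to 'device remove'
heketi-cli topology info | grep -i 'Id:'
heketi-cli topology info | grep 2acb109d758154a8d12de3b90e084f0c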

Comment 4 Raghavendra Talur 2018-04-02 13:23:57 UTC
volume for which replace brick failed: vol_9480cf14bde1c564b1d67d8e364fe18f

original brick: 10.70.47.166:/var/lib/heketi/mounts/vg_2acb109d758154a8d12de3b90e084f0c/brick_32967d19fad73bae52c45bdc3e7574a1/
new brick: 10.70.47.166:/var/lib/heketi/mounts/vg_ea53cfb78199251ba13668b8ff16a350/brick_7cad9f700851411d3ae9b687a7a1f308/


gluster replied with a failure on the command:
[kubeexec] ERROR 2018/03/31 03:40:33 /src/github.com/heketi/heketi/executors/kubeexec/kubeexec.go:240: Failed to run command [gluster --mode=script volume replace-brick vol_9480cf14bde1c564b1d67d8e364fe18f 10.70.47.166:/var/lib/heketi/mounts/vg_2acb109d758154a8d12de3b90e084f0c/brick_32967d19fad73bae52c45bdc3e7574a1/brick 10.70.47.166:/var/lib/heketi/mounts/vg_ea53cfb78199251ba13668b8ff16a350/brick_7cad9f700851411d3ae9b687a7a1f308/brick commit force] on glusterfs-storage-zs9kh: Err[command terminated with exit code 1]: Stdout []: Stderr [volume replace-brick: failed: Commit failed on 10.70.47.162. Please check log file for details.




original brick list from gluster:
10.70.47.162:/var/lib/heketi/mounts/vg_94138d2abb10d88c1d83ba18c881b389/brick_da6dd5bfff1415b786f8ce88446a0846/brick
10.70.46.45:/var/lib/heketi/mounts/vg_4c39fa5f5ccdd94712b471fc46b92760/brick_8140f263b15c19b5e31cd4f5632a369d/brick
10.70.47.166:/var/lib/heketi/mounts/vg_2acb109d758154a8d12de3b90e084f0c/brick_32967d19fad73bae52c45bdc3e7574a1/brick

new brick list from gluster:
10.70.47.162:/var/lib/heketi/mounts/vg_94138d2abb10d88c1d83ba18c881b389/brick_da6dd5bfff1415b786f8ce88446a0846/brick
10.70.46.45:/var/lib/heketi/mounts/vg_4c39fa5f5ccdd94712b471fc46b92760/brick_8140f263b15c19b5e31cd4f5632a369d/brick
10.70.47.166:/var/lib/heketi/mounts/vg_ea53cfb78199251ba13668b8ff16a350/brick_7cad9f700851411d3ae9b687a7a1f308/brick

ROOT CAUSE:
gluster updated its brick list even though it replied with a failure on the replace-brick command.
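
The brick lists above can be compared directly from both sides; a minimal sketch, where the heketi volume id is a placeholder since it is not captured in this report:

# actual brick list as gluster sees it (run inside one of the gluster pods)
gluster volume info vol_9480cf14bde1c564b1d67d8e364fe18f | grep -i '^brick'

# heketi's view of the same volume
heketi-cli volume list
heketi-cli volume info <heketi-volume-id>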

Comment 5 Raghavendra Talur 2018-04-02 13:39:38 UTC
For glusterd debugging:

We reproduced the issue on our setup, hence the IDs and volume names are different, but here is the log:

[2018-04-02 11:30:58.108514] W [socket.c:593:__socket_rwv] 0-management: readv on 192.168.10.103:24007 failed (No data available)
[2018-04-02 11:30:58.108550] I [MSGID: 106004] [glusterd-handler.c:6317:__glusterd_peer_rpc_notify] 0-management: Peer <192.168.10.103> (<cd9b911d-c537-4ef4-82a1-1f8fd26a8e7a>), in state <Peer in Cluster>, has disconnected from glusterd.
[2018-04-02 11:30:58.108694] W [glusterd-locks.c:854:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.12.6/xlator/mgmt/glusterd.so(+0x2322a) [0x7fb0f80de22a] -->/usr/lib64/glusterfs/3.12.6/xlator/mgmt/glusterd.so(+0x2d198) [0x7fb0f80e8198] -->/usr/lib64/glusterfs/3.12.6/xlator/mgmt/glusterd.so(+0xe458c) [0x7fb0f819f58c] ) 0-management: Lock owner mismatch. Lock for vol vol_1c15c1637ac1e325f03d56fdd41f75f5 held by 0eafd295-aa2d-4170-a7b8-3ce98fbff0fc
[2018-04-02 11:30:58.108704] W [MSGID: 106118] [glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock not released for vol_1c15c1637ac1e325f03d56fdd41f75f5
[2018-04-02 11:30:58.108722] W [glusterd-locks.c:843:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.12.6/xlator/mgmt/glusterd.so(+0x2322a) [0x7fb0f80de22a] -->/usr/lib64/glusterfs/3.12.6/xlator/mgmt/glusterd.so(+0x2d198) [0x7fb0f80e8198] -->/usr/lib64/glusterfs/3.12.6/xlator/mgmt/glusterd.so(+0xe4765) [0x7fb0f819f765] ) 0-management: Lock for vol vol_a12b9272cc1f4cb70c17b0ccbbc57633 not held
[2018-04-02 11:30:58.108729] W [MSGID: 106118] [glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock not released for vol_a12b9272cc1f4cb70c17b0ccbbc57633
[2018-04-02 11:30:58.108743] W [glusterd-locks.c:843:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.12.6/xlator/mgmt/glusterd.so(+0x2322a) [0x7fb0f80de22a] -->/usr/lib64/glusterfs/3.12.6/xlator/mgmt/glusterd.so(+0x2d198) [0x7fb0f80e8198] -->/usr/lib64/glusterfs/3.12.6/xlator/mgmt/glusterd.so(+0xe4765) [0x7fb0f819f765] ) 0-management: Lock for vol vol_b2b243820c1277c994178a58d465db25 not held
[2018-04-02 11:30:58.108749] W [MSGID: 106118] [glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock not released for vol_b2b243820c1277c994178a58d465db25
[2018-04-02 11:30:58.108762] W [glusterd-locks.c:843:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.12.6/xlator/mgmt/glusterd.so(+0x2322a) [0x7fb0f80de22a] -->/usr/lib64/glusterfs/3.12.6/xlator/mgmt/glusterd.so(+0x2d198) [0x7fb0f80e8198] -->/usr/lib64/glusterfs/3.12.6/xlator/mgmt/glusterd.so(+0xe4765) [0x7fb0f819f765] ) 0-management: Lock for vol vol_d1a96444079258dae3e28dd4a8b6fd81 not held
[2018-04-02 11:30:58.108768] W [MSGID: 106118] [glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock not released for vol_d1a96444079258dae3e28dd4a8b6fd81
[2018-04-02 11:30:58.109188] E [rpc-clnt.c:350:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fb0fd46cedb] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7fb0fd231e6e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fb0fd231f8e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7fb0fd233710] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7fb0fd234200] ))))) 0-management: forced unwinding frame type(glusterd mgmt v3) op(--(4)) called at 2018-04-02 11:30:58.107966 (xid=0x13)
[2018-04-02 11:30:58.109236] E [MSGID: 106116] [glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Commit failed on 192.168.10.103. Please check log file for details.
[2018-04-02 11:30:58.109372] I [MSGID: 106144] [glusterd-pmap.c:396:pmap_registry_remove] 0-pmap: removing brick /var/lib/heketi/mounts/vg_e08f107a4127b1b05d577654773be4cc/brick_168912fe5c0deb6c23772ea7acd96550/brick on port 49154
[2018-04-02 11:30:58.109405] I [socket.c:2474:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2018-04-02 11:31:00.164657] E [MSGID: 106123] [glusterd-mgmt.c:1677:glusterd_mgmt_v3_commit] 0-management: Commit failed on peers
[2018-04-02 11:31:00.164697] E [MSGID: 106123] [glusterd-replace-brick.c:669:glusterd_mgmt_v3_initiate_replace_brick_cmd_phases] 0-management: Commit Op Failed
[2018-04-02 11:32:14.112970] E [socket.c:2369:socket_connect_finish] 0-management: connection to 192.168.10.103:24007 failed (No route to host); disconnecting socket
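
The lines above are from the glusterd log on the node that initiated the replace-brick; on an RHGS node they can typically be pulled with something like:

grep -iE 'replace|commit failed' /var/log/glusterfs/glusterd.log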



How to reproduce:
execute the replace-brick command and bring down a node in the cluster which is neither the owner of the old brick nor the owner of the new brick (see the sketch below).
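
A rough sketch of that reproducer with placeholder host names and brick paths (host-C is a peer that owns neither brick):

# from host-A, which owns both the old and the new brick in this scenario:
gluster --mode=script volume replace-brick <volname> \
    host-A:/path/to/old_brick host-A:/path/to/new_brick commit force

# while the command is in flight (before the commit phase completes),
# take down the third node, e.g. by stopping glusterd on host-C:
systemctl stop glusterd

# replace-brick then reports "Commit failed on host-C", yet
# 'gluster volume info <volname>' already lists the new brick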

Comment 6 Raghavendra Talur 2018-04-02 13:42:57 UTC
Impact: High

In this case, as heketi considers the command to be a failure, we clean up (or try to) the new bricks that were created. As the gluster brick process is already using the new brick partition, heketi won't be able to unmount it. However, we still remove the entry from fstab. This will lead to the brick not coming up on the next reboot, and the state of the volume from the heketi and gluster perspectives will have diverged.
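
A sketch of how that divergence shows up on the affected node, reusing the new-brick id from the failed replace-brick above (the fstab file heketi manages is commonly /var/lib/heketi/fstab on CNS nodes):

# the new brick is still mounted and in use by the gluster volume ...
mount | grep brick_7cad9f700851411d3ae9b687a7a1f308

# ... but its entry was removed by heketi's cleanup, so it will not be remounted after a reboot
grep brick_7cad9f700851411d3ae9b687a7a1f308 /var/lib/heketi/fstab || echo "fstab entry missing"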

Comment 9 Niels de Vos 2018-05-09 09:25:46 UTC
John, I think this problem should be prevented with the changes to how the operations are structured in the current version. Could you confirm? If you agree, it will be fixed with cns-3.10.0.

Comment 10 John Mulligan 2018-05-09 14:38:10 UTC
Unfortunately, the pending operation protection in device replace is very weak. Plus there seem to be the above-discussed issues within gluster itself.
We can get better here by continuing to restructure how device replace is done, but I don't know when we'll have the time to do that.

Comment 11 Niels de Vos 2018-05-09 15:32:18 UTC
(In reply to John Mulligan from comment #10)
> Unfortunately, the pending operation protection in device replace is very
> weak. Plus there seem to be the above-discussed issues within gluster itself.
> We can get better here by continuing to restructure how device replace is
> done, but I don't know when we'll have the time to do that.

Thanks, we'll reconsider this for cns-3.11.