Bug 1221949

Summary: Detach tier start on one volume fails because of failed commit on another volume.
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Triveni Rao <trao>
Component: core Assignee: Mohammed Rafi KC <rkavunga>
Status: CLOSED CURRENTRELEASE QA Contact: Sweta Anandpara <sanandpa>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: rhgs-3.1 CC: amukherj, kramdoss, nchilaka, rcyriac, rhs-bugs, rkavunga, sankarshan
Target Milestone: --- Keywords: ZStream
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard: TIERING
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1224084 (view as bug list) Environment:
Last Closed: 2018-11-11 15:10:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1223636, 1224084    

Description Triveni Rao 2015-05-15 10:27:13 UTC
Description of problem:

Detach tier start on one volume fails because a detach tier commit failed on another volume.
Once detach tier commit has failed on one volume, detach tier start is blocked on any other volume.

Version-Release number of selected component (if applicable):


How reproducible:
easily

Steps to Reproduce:
1. Create a distributed-replicate volume, attach a tier, FUSE-mount it, write data, then run detach tier start and commit.
2. The detach tier commit fails. Create a second, distribute volume, attach a tier, and run detach tier start.
3. Detach tier start fails because the other volume's commit failed.
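
The steps above can be sketched as an ordered command list. This is a minimal sketch that only builds the command strings and does not run anything. The host names (`server1`, `server2`), volume names (`volA`, `volB`), brick paths, and the `dd` write are hypothetical placeholders that do not come from this report, and the `attach-tier`/`detach-tier` syntax is assumed to match the 3.7-era CLI used in the transcript below.

```python
# Sketch only: builds the CLI commands for the reproduction steps, in order.
# All hosts, volume names, and brick paths are hypothetical placeholders.

def repro_commands(hosts=("server1", "server2"), first="volA", second="volB"):
    """Return the command lines for the three reproduction steps."""
    h1, h2 = hosts
    # Four cold bricks, ordered so that each replica pair spans both hosts.
    cold = " ".join(
        f"{h}:/bricks/cold{i}/{first}" for i in (1, 2) for h in (h1, h2)
    )
    return [
        # Step 1: tiered distributed-replicate volume, write data, detach.
        f"gluster volume create {first} replica 2 {cold}",
        f"gluster volume start {first}",
        f"gluster volume attach-tier {first} {h1}:/bricks/hot1/{first}",
        f"mkdir -p /mnt/{first}",
        f"mount -t glusterfs {h1}:/{first} /mnt/{first}",
        f"dd if=/dev/zero of=/mnt/{first}/testfile bs=1M count=10",
        f"gluster volume detach-tier {first} start",
        f"gluster volume detach-tier {first} commit",  # reported to fail
        # Step 2: second, plain distribute volume with a tier.
        f"gluster volume create {second} "
        f"{h1}:/bricks/cold1/{second} {h2}:/bricks/cold1/{second}",
        f"gluster volume start {second}",
        f"gluster volume attach-tier {second} {h1}:/bricks/hot1/{second}",
        # Step 3: this start is the command reported to be blocked.
        f"gluster volume detach-tier {second} start",
    ]

commands = repro_commands()
for command in commands:
    print(command)
```

Printing the list rather than executing it keeps the sketch safe to run on a machine without a Gluster cluster; the commands would need real hosts and brick directories before being run by hand.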


Additional info:

[root@rhsqa14-vm1 ~]# gluster v detach-tier vol1 start
volume detach-tier start: failed: An earlier remove-brick task exists for volume vol1. Either commit it or stop it before starting a new task.
[root@rhsqa14-vm1 ~]# 

[root@rhsqa14-vm1 ~]# gluster v detach-tier vol1 commit
volume detach-tier commit: failed: Staging failed on 4cdeee40-2cb6-463e-ba08-905cedb3d26a. Error: Deleting all the bricks of the volume is not allowed
[root@rhsqa14-vm1 ~]# 


Log messages:

[2015-05-14 06:34:47.596813] I [MSGID: 100030] [glusterfsd.c:2294:main] 0-glusterd: Started running glusterd version 3.7.0beta2 (args: glusterd --xlator-option *.upgrade=on -N)
[2015-05-14 06:34:47.605211] I [graph.c:269:gf_add_cmdline_options] 0-management: adding option 'upgrade' for volume 'management' with value 'on'
[2015-05-14 06:34:47.605328] I [glusterd.c:1282:init] 0-management: Maximum allowed open file descriptors set to 65536
[2015-05-14 06:34:47.605370] I [glusterd.c:1327:init] 0-management: Using /var/lib/glusterd as working directory
[2015-05-14 06:34:47.630137] E [rpc-transport.c:291:rpc_transport_load] 0-rpc-transport: /usr/lib64/glusterfs/3.7.0beta2/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2015-05-14 06:34:47.630198] W [rpc-transport.c:295:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine
[2015-05-14 06:34:47.630218] W [rpcsvc.c:1595:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2015-05-14 06:34:47.630235] E [glusterd.c:1515:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2015-05-14 06:34:47.649973] I [glusterd.c:413:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system
[2015-05-14 06:34:47.650135] E [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info, returned error: (No such file or directory)
[2015-05-14 06:34:47.650161] E [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info, returned error: (No such file or directory)
[2015-05-14 06:34:47.650173] I [glusterd-store.c:2005:glusterd_restore_op_version] 0-management: Detected new install. Setting op-version to maximum : 30700
[2015-05-14 06:34:47.650462] E [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info, returned error: (No such file or directory)
[2015-05-14 06:34:47.650781] I [glusterd.c:184:glusterd_uuid_generate_save] 0-management: generated UUID: 87acbf29-e821-48bf-9aa8-bbda9321e609
[2015-05-14 06:34:47.817571] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600
[2015-05-14 06:34:47.818755] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2015-05-14 06:34:47.819295] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600
[2015-05-14 06:34:47.819873] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600
[2015-05-14 06:34:47.820373] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600
[2015-05-14 06:34:47.820867] I [glusterd-store.c:3371:glusterd_store_retrieve_missed_snaps_list] 0-management: No missed snaps list.
[2015-05-14 06:34:47.821075] E [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/options, returned error: (No such file or directory)
Final graph:
+------------------------------------------------------------------------------+
...skipping...
[2015-05-15 10:26:00.442113] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:01.442822] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:03.443521] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:04.443964] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:06.444470] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:07.444820] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:09.445513] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:10.445746] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:12.446313] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:13.446708] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:15.447287] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:16.447697] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:18.448374] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:19.448700] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:21.449319] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
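
Because the excerpt above is dominated by repeated `readv ... failed` warnings, a per-daemon count can make it easier to scan. The following is a minimal sketch, assuming the log line layout seen in this report (timestamp, level letter, optional MSGID, source location, domain, message); the regular expression and the `summarize` helper are illustrative and not part of any Gluster tooling.

```python
import re
from collections import Counter

# One glusterd log line as it appears in this report:
# [timestamp] LEVEL [optional MSGID] [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"(?:\[MSGID: \d+\] )?"
    r"\[(?P<source>[^\]]+)\] (?P<domain>\S+): (?P<message>.*)$"
)

def summarize(lines):
    """Count log lines per (level, domain); non-matching lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[(match["level"], match["domain"])] += 1
    return counts

# Sample lines copied from the log excerpt in this report.
sample = [
    "[2015-05-15 10:26:00.442113] W [socket.c:642:__socket_rwv] 0-quotad: "
    "readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket "
    "failed (Invalid argument)",
    "[2015-05-15 10:26:01.442822] W [socket.c:642:__socket_rwv] 0-nfs: "
    "readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket "
    "failed (Invalid argument)",
    "[2015-05-15 10:26:03.443521] W [socket.c:642:__socket_rwv] 0-quotad: "
    "readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket "
    "failed (Invalid argument)",
    "Final graph:",
]

summary = summarize(sample)
for (level, domain), count in summary.most_common():
    print(level, domain, count)
```

Lines that do not follow the layout, such as `Final graph:`, are ignored rather than treated as errors, so the helper can be pointed at a whole log file.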

Comment 1 Triveni Rao 2015-05-19 06:10:56 UTC
In addition to the issues above, I see several problems that are after-effects of the failed detach-tier commit:

1. Volume set operations do not work on the volume.
2. Volume status info does not show proper output.
3. You cannot detach the tier from another volume.
4. Detach-tier commit does not show the proper host name/ID of the node from which the bricks are being decommissioned.

Example:
[root@rhsqa14-vm1 ~]# gluster v detach-tier v1 commit
volume detach-tier commit: failed: Staging failed on f32b5f3e-d3f0-4a06-bea4-45b661519f90. Error: Brick 10.70.46.233:/rhs/brick3/m0 is not decommissioned. Use start or force option
Staging failed on 388d666a-82be-4cf3-9bec-af3df504fe54. Error: Brick 10.70.46.233:/rhs/brick3/m0 is not decommissioned. Use start or force option
Staging failed on ed2d05d3-fbd2-40d4-88d8-22f2171ae5dd. Error: Brick 10.70.46.233:/rhs/brick3/m0 is not decommissioned. Use start or force option
[root@rhsqa14-vm1 ~]# g
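
To make item 4 easier to see, the commit output above can be split into peer UUID and error text. This is a minimal sketch that assumes the `Staging failed on <uuid>. Error: <text>` wording shown in this report; the `staging_failures` helper is hypothetical and not part of the gluster CLI.

```python
import re
from collections import defaultdict

# Matches the "Staging failed on <uuid>. Error: <text>" fragments in CLI output.
STAGING = re.compile(
    r"Staging failed on (?P<uuid>[0-9a-f-]{36})\. Error: (?P<error>.*)"
)

def staging_failures(output):
    """Return a list of (peer_uuid, error_text) pairs found in CLI output."""
    return [(m["uuid"], m["error"]) for m in STAGING.finditer(output)]

# CLI output copied from the example in this comment.
output = (
    "volume detach-tier commit: failed: Staging failed on "
    "f32b5f3e-d3f0-4a06-bea4-45b661519f90. Error: Brick "
    "10.70.46.233:/rhs/brick3/m0 is not decommissioned. "
    "Use start or force option\n"
    "Staging failed on 388d666a-82be-4cf3-9bec-af3df504fe54. Error: Brick "
    "10.70.46.233:/rhs/brick3/m0 is not decommissioned. "
    "Use start or force option\n"
    "Staging failed on ed2d05d3-fbd2-40d4-88d8-22f2171ae5dd. Error: Brick "
    "10.70.46.233:/rhs/brick3/m0 is not decommissioned. "
    "Use start or force option\n"
)

failures = staging_failures(output)

# Group peers by the error they reported.
peers_by_error = defaultdict(list)
for uuid, error in failures:
    peers_by_error[error].append(uuid)

for error, peers in peers_by_error.items():
    print(f"{len(peers)} peer(s): {error}")
```

Grouping by error text shows that all three peers name the same brick, which is the behaviour item 4 describes.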

Comment 5 Mohammed Rafi KC 2015-05-22 07:30:27 UTC
Based on the RCA given in bug #1222442, this bug appears to be dependent on that bug.

Comment 8 Nag Pavan Chilakam 2016-03-01 06:50:17 UTC
Karthick, can you kindly check this and close or update it accordingly?

Comment 14 krishnaram Karthick 2020-09-28 02:57:23 UTC
clearing stale needinfos.