Description of problem:
Detach-tier start on one volume fails because of a failed commit on another volume. If detach-tier commit fails on one volume, detach-tier start is blocked for other volumes.

Version-Release number of selected component (if applicable):

How reproducible:
Easily

Steps to Reproduce:
1. Create a dist-rep volume, attach a tier, FUSE mount it, write data, then run detach-tier start and commit.
2. The detach commit fails. Create one more distribute volume, attach a tier, and run detach-tier start.
3. Detach start fails because of the other volume's failed commit.

Additional info:
[root@rhsqa14-vm1 ~]# gluster v detach-tier vol1 start
volume detach-tier start: failed: An earlier remove-brick task exists for volume vol1. Either commit it or stop it before starting a new task.
[root@rhsqa14-vm1 ~]#
[root@rhsqa14-vm1 ~]# gluster v detach-tier vol1 start
volume detach-tier start: failed: An earlier remove-brick task exists for volume vol1. Either commit it or stop it before starting a new task.
You have new mail in /var/spool/mail/root
[root@rhsqa14-vm1 ~]#
[root@rhsqa14-vm1 ~]#
[root@rhsqa14-vm1 ~]# gluster v detach-tier vol1 commit
volume detach-tier commit: failed: Staging failed on 4cdeee40-2cb6-463e-ba08-905cedb3d26a. Error: Deleting all the bricks of the volume is not allowed
[root@rhsqa14-vm1 ~]#

Log messages:
[2015-05-14 06:34:47.596813] I [MSGID: 100030] [glusterfsd.c:2294:main] 0-glusterd: Started running glusterd version 3.7.0beta2 (args: glusterd --xlator-option *.upgrade=on -N)
[2015-05-14 06:34:47.605211] I [graph.c:269:gf_add_cmdline_options] 0-management: adding option 'upgrade' for volume 'management' with value 'on'
[2015-05-14 06:34:47.605328] I [glusterd.c:1282:init] 0-management: Maximum allowed open file descriptors set to 65536
[2015-05-14 06:34:47.605370] I [glusterd.c:1327:init] 0-management: Using /var/lib/glusterd as working directory
[2015-05-14 06:34:47.630137] E [rpc-transport.c:291:rpc_transport_load] 0-rpc-transport: /usr/lib64/glusterfs/3.7.0beta2/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2015-05-14 06:34:47.630198] W [rpc-transport.c:295:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine
[2015-05-14 06:34:47.630218] W [rpcsvc.c:1595:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2015-05-14 06:34:47.630235] E [glusterd.c:1515:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2015-05-14 06:34:47.649973] I [glusterd.c:413:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system
[2015-05-14 06:34:47.650135] E [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info, returned error: (No such file or directory)
[2015-05-14 06:34:47.650161] E [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info, returned error: (No such file or directory)
[2015-05-14 06:34:47.650173] I [glusterd-store.c:2005:glusterd_restore_op_version] 0-management: Detected new install. Setting op-version to maximum : 30700
[2015-05-14 06:34:47.650462] E [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info, returned error: (No such file or directory)
[2015-05-14 06:34:47.650781] I [glusterd.c:184:glusterd_uuid_generate_save] 0-management: generated UUID: 87acbf29-e821-48bf-9aa8-bbda9321e609
[2015-05-14 06:34:47.817571] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600
[2015-05-14 06:34:47.818755] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2015-05-14 06:34:47.819295] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600
[2015-05-14 06:34:47.819873] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600
[2015-05-14 06:34:47.820373] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600
[2015-05-14 06:34:47.820867] I [glusterd-store.c:3371:glusterd_store_retrieve_missed_snaps_list] 0-management: No missed snaps list.
[2015-05-14 06:34:47.821075] E [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/options, returned error: (No such file or directory)
Final graph:
+------------------------------------------------------------------------------+
...skipping...
gument)
[2015-05-15 10:26:00.442113] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:01.442822] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:03.443521] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:04.443964] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:06.444470] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:07.444820] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:09.445513] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:10.445746] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:12.446313] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:13.446708] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:15.447287] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:16.447697] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:18.448374] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:19.448700] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:21.449319] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
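The reproduction steps in the description could be scripted roughly as below. This is only a sketch: the host names (host1, host2), brick paths, mount point, and the first volume's name (vol0) are hypothetical placeholders, and treating vol1 as the second volume is an assumption based on the transcript. The attach-tier form used here is the one I believe 3.7 accepts. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Reproduction sketch. host1/host2, brick paths, /mnt/vol0 and vol0 are
# hypothetical placeholders; adjust them for the test cluster.
# By default the commands are only printed; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

repro() {
    # Step 1: dist-rep volume, attach tier, FUSE mount, write data,
    # then detach-tier start and commit.
    run gluster volume create vol0 replica 2 \
        host1:/rhs/brick1/vol0 host2:/rhs/brick1/vol0 \
        host1:/rhs/brick2/vol0 host2:/rhs/brick2/vol0
    run gluster volume start vol0
    run gluster volume attach-tier vol0 host1:/rhs/brick3/vol0 host2:/rhs/brick3/vol0
    run mount -t glusterfs host1:/vol0 /mnt/vol0
    run dd if=/dev/urandom of=/mnt/vol0/testfile bs=1M count=100
    run gluster volume detach-tier vol0 start
    run gluster volume detach-tier vol0 commit   # the report says this fails

    # Step 2: second, plain distribute volume with a tier.
    run gluster volume create vol1 host1:/rhs/brick4/vol1 host2:/rhs/brick4/vol1
    run gluster volume start vol1
    run gluster volume attach-tier vol1 host1:/rhs/brick5/vol1 host2:/rhs/brick5/vol1

    # Step 3: per the report, this is rejected with
    # "An earlier remove-brick task exists for volume vol1".
    run gluster volume detach-tier vol1 start
}

repro
```

The script does not stop on errors, so with DRY_RUN=0 the failing commit in step 1 does not prevent steps 2 and 3 from running.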
Along with the above issue, I see a couple of issues that are after-effects of the failed detach-tier commit:
1. Volume set operations do not work on the volume.
2. Volume status info does not show proper output.
3. You cannot detach a tier from another volume.
4. detach-tier commit does not show the proper host name/ID from which the bricks are being decommissioned.

ex:
[root@rhsqa14-vm1 ~]# gluster v detach-tier v1 commit
volume detach-tier commit: failed: Staging failed on f32b5f3e-d3f0-4a06-bea4-45b661519f90. Error: Brick 10.70.46.233:/rhs/brick3/m0 is not decommissioned. Use start or force option
Staging failed on 388d666a-82be-4cf3-9bec-af3df504fe54. Error: Brick 10.70.46.233:/rhs/brick3/m0 is not decommissioned. Use start or force option
Staging failed on ed2d05d3-fbd2-40d4-88d8-22f2171ae5dd. Error: Brick 10.70.46.233:/rhs/brick3/m0 is not decommissioned. Use start or force option
[root@rhsqa14-vm1 ~]# g
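Since the commit output above identifies peers only by UUID, a small helper like the following may make it easier to see which peers rejected the operation. It is a sketch: the function name staging_failed_peers is made up for illustration, and it assumes a grep that supports -o and -E (GNU and BSD grep do).

```shell
#!/bin/sh
# Hypothetical helper: read gluster CLI output on stdin and print the
# unique peer UUIDs named in "Staging failed on <uuid>" messages.
staging_failed_peers() {
    grep -oE 'Staging failed on [0-9a-f-]{36}' | awk '{print $4}' | sort -u
}

# Example using the commit output quoted in this comment.
staging_failed_peers <<'EOF'
volume detach-tier commit: failed: Staging failed on f32b5f3e-d3f0-4a06-bea4-45b661519f90. Error: Brick 10.70.46.233:/rhs/brick3/m0 is not decommissioned. Use start or force option
Staging failed on 388d666a-82be-4cf3-9bec-af3df504fe54. Error: Brick 10.70.46.233:/rhs/brick3/m0 is not decommissioned. Use start or force option
Staging failed on ed2d05d3-fbd2-40d4-88d8-22f2171ae5dd. Error: Brick 10.70.46.233:/rhs/brick3/m0 is not decommissioned. Use start or force option
EOF
```

Assuming `gluster pool list` is available on the installed version, the printed UUIDs can then be matched against its UUID and Hostname columns by hand.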
Based on the RCA given in bug #1222442, this bug looks dependent on that bug.
Karthick, can you kindly check this and close or update accordingly?
clearing stale needinfos.