Bug 1221949 - Detach tier start on one volume fails because of failed commit on another volume.
Summary: Detach tier start on one volume fails because of failed commit on another vol...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: core
Version: rhgs-3.1
Hardware: x86_64
OS: Linux
unspecified
urgent
Target Milestone: ---
: ---
Assignee: Mohammed Rafi KC
QA Contact: Sweta Anandpara
URL:
Whiteboard: TIERING
Depends On:
Blocks: 1223636 1224084
 
Reported: 2015-05-15 10:27 UTC by Triveni Rao
Modified: 2020-09-28 02:57 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1224084 (view as bug list)
Environment:
Last Closed: 2018-11-11 15:10:28 UTC
Embargoed:



Description Triveni Rao 2015-05-15 10:27:13 UTC
Description of problem:

Detach tier start on one volume fails because of a failed commit on another volume.
If detach tier commit fails on one volume, detach tier start is blocked on another volume.

Version-Release number of selected component (if applicable):


How reproducible:
easily

Steps to Reproduce:
1. Create a dist-rep volume, attach a tier, FUSE mount it, write data, then run detach tier start and commit.
2. The detach commit fails. Create one more distribute volume, attach a tier, and run detach start.
3. Detach start fails because the other volume's commit failed.
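
The steps above can be sketched as an ordered list of gluster CLI calls. This is only an illustration, not the reporter's actual commands: the host name, brick paths, and volume names (server1, /bricks/..., vol0, vol1) are hypothetical placeholders, the FUSE mount and data-write part of step 1 is left out, and nothing is executed unless you pass the commands to subprocess yourself on a disposable test cluster.

```python
import shlex

# Hypothetical placeholder: the report does not name the hosts or brick paths.
HOST = "server1"


def brick(name):
    """Build a HOST:/path brick spec under a made-up /bricks directory."""
    return f"{HOST}:/bricks/{name}"


def repro_commands():
    """Return the gluster CLI calls for the reproduction steps, in order."""
    # --mode=script suppresses the interactive y/n confirmations.
    g = ["gluster", "--mode=script", "volume"]
    cold = [brick(f"cold{i}") for i in range(4)]
    return [
        # Step 1: distributed-replicate volume, tier attached, then detached.
        # (Mounting the volume and writing data is omitted from this sketch.)
        g + ["create", "vol0", "replica", "2"] + cold + ["force"],
        g + ["start", "vol0"],
        g + ["attach-tier", "vol0", brick("hot0"), brick("hot1")],
        g + ["detach-tier", "vol0", "start"],
        g + ["detach-tier", "vol0", "commit"],  # reported to fail
        # Step 2: a second, plain distribute volume with its own tier.
        g + ["create", "vol1", brick("dist0"), brick("dist1"), "force"],
        g + ["start", "vol1"],
        g + ["attach-tier", "vol1", brick("hot2"), brick("hot3")],
        # Step 3: reported to fail because of the earlier failed commit.
        g + ["detach-tier", "vol1", "start"],
    ]


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for cmd in repro_commands():
        print(shlex.join(cmd))
```

To run the sequence for real, each entry could be passed to subprocess.run(cmd) with the return code and stderr recorded, so the failing step is easy to spot.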


Additional info:

[root@rhsqa14-vm1 ~]# gluster v detach-tier vol1 start
volume detach-tier start: failed: An earlier remove-brick task exists for volume vol1. Either commit it or stop it before starting a new task.
[root@rhsqa14-vm1 ~]# 

[root@rhsqa14-vm1 ~]# gluster v detach-tier vol1 start
volume detach-tier start: failed: An earlier remove-brick task exists for volume vol1. Either commit it or stop it before starting a new task.
You have new mail in /var/spool/mail/root
[root@rhsqa14-vm1 ~]# 
[root@rhsqa14-vm1 ~]# 
[root@rhsqa14-vm1 ~]# gluster v detach-tier vol1 commit
volume detach-tier commit: failed: Staging failed on 4cdeee40-2cb6-463e-ba08-905cedb3d26a. Error: Deleting all the bricks of the volume is not allowed
[root@rhsqa14-vm1 ~]# 


Log messages:

[2015-05-14 06:34:47.596813] I [MSGID: 100030] [glusterfsd.c:2294:main] 0-glusterd: Started running glusterd version 3.7.0beta2 (args: glusterd --xlator-option *.upgrade=on -N)
[2015-05-14 06:34:47.605211] I [graph.c:269:gf_add_cmdline_options] 0-management: adding option 'upgrade' for volume 'management' with value 'on'
[2015-05-14 06:34:47.605328] I [glusterd.c:1282:init] 0-management: Maximum allowed open file descriptors set to 65536
[2015-05-14 06:34:47.605370] I [glusterd.c:1327:init] 0-management: Using /var/lib/glusterd as working directory
[2015-05-14 06:34:47.630137] E [rpc-transport.c:291:rpc_transport_load] 0-rpc-transport: /usr/lib64/glusterfs/3.7.0beta2/rpc-transport/rdma.so: cannot open shared object file: No such file or directory
[2015-05-14 06:34:47.630198] W [rpc-transport.c:295:rpc_transport_load] 0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid or not found on this machine
[2015-05-14 06:34:47.630218] W [rpcsvc.c:1595:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
[2015-05-14 06:34:47.630235] E [glusterd.c:1515:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2015-05-14 06:34:47.649973] I [glusterd.c:413:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system
[2015-05-14 06:34:47.650135] E [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info, returned error: (No such file or directory)
[2015-05-14 06:34:47.650161] E [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info, returned error: (No such file or directory)
[2015-05-14 06:34:47.650173] I [glusterd-store.c:2005:glusterd_restore_op_version] 0-management: Detected new install. Setting op-version to maximum : 30700
[2015-05-14 06:34:47.650462] E [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/glusterd.info, returned error: (No such file or directory)
[2015-05-14 06:34:47.650781] I [glusterd.c:184:glusterd_uuid_generate_save] 0-management: generated UUID: 87acbf29-e821-48bf-9aa8-bbda9321e609
[2015-05-14 06:34:47.817571] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600
[2015-05-14 06:34:47.818755] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2015-05-14 06:34:47.819295] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600
[2015-05-14 06:34:47.819873] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600
[2015-05-14 06:34:47.820373] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600
[2015-05-14 06:34:47.820867] I [glusterd-store.c:3371:glusterd_store_retrieve_missed_snaps_list] 0-management: No missed snaps list.
[2015-05-14 06:34:47.821075] E [store.c:432:gf_store_handle_retrieve] 0-: Path corresponding to /var/lib/glusterd/options, returned error: (No such file or directory)
Final graph:
+------------------------------------------------------------------------------+
...skipping...
gument)
[2015-05-15 10:26:00.442113] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:01.442822] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:03.443521] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:04.443964] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:06.444470] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:07.444820] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:09.445513] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:10.445746] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:12.446313] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:13.446708] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:15.447287] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:16.447697] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:18.448374] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)
[2015-05-15 10:26:19.448700] W [socket.c:642:__socket_rwv] 0-nfs: readv on /var/run/gluster/cd9131b65e498c98e62c155fa7f02179.socket failed (Invalid argument)
[2015-05-15 10:26:21.449319] W [socket.c:642:__socket_rwv] 0-quotad: readv on /var/run/gluster/87382ddc53b616370f1b86e694eba7fc.socket failed (Invalid argument)

Comment 1 Triveni Rao 2015-05-19 06:10:56 UTC
Along with the above issue, I see a couple of issues that are after-effects of the failed detach-tier commit:

1. Volume set operations do not work on the volume.
2. Volume status info does not show proper output.
3. You cannot detach the tier from another volume.
4. detach-tier commit does not show the proper host name/ID from which the bricks are being decommissioned.

ex:
[root@rhsqa14-vm1 ~]# gluster v detach-tier v1 commit
volume detach-tier commit: failed: Staging failed on f32b5f3e-d3f0-4a06-bea4-45b661519f90. Error: Brick 10.70.46.233:/rhs/brick3/m0 is not decommissioned. Use start or force option
Staging failed on 388d666a-82be-4cf3-9bec-af3df504fe54. Error: Brick 10.70.46.233:/rhs/brick3/m0 is not decommissioned. Use start or force option
Staging failed on ed2d05d3-fbd2-40d4-88d8-22f2171ae5dd. Error: Brick 10.70.46.233:/rhs/brick3/m0 is not decommissioned. Use start or force option
[root@rhsqa14-vm1 ~]# g
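
For triage, the "Staging failed on <uuid>. Error: ..." lines in the output above can be split into peer UUID and error text. The following is a minimal sketch that assumes the message format is exactly as shown in this report; the function name parse_staging_failures is made up for illustration and is not part of gluster.

```python
import re

# Matches the failure lines as they appear in this report:
#   Staging failed on <peer uuid>. Error: <message>
STAGING_RE = re.compile(
    r"Staging failed on (?P<peer>[0-9a-f-]{36})\. Error: (?P<error>.*)"
)


def parse_staging_failures(output):
    """Return a list of (peer_uuid, error_message) pairs found in CLI output."""
    return [
        (m.group("peer"), m.group("error").strip())
        for m in STAGING_RE.finditer(output)
    ]


def peers_by_error(output):
    """Group the reporting peer UUIDs by the error message they returned."""
    grouped = {}
    for peer, error in parse_staging_failures(output):
        grouped.setdefault(error, []).append(peer)
    return grouped
```

The UUIDs returned here identify the peers that rejected the operation, not the host that owns the brick; they can be matched against the Uuid values printed by gluster peer status to find the corresponding host names.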

Comment 5 Mohammed Rafi KC 2015-05-22 07:30:27 UTC
Based on the RCA given in bug #1222442, this bug looks like it is dependent on the same bug.

Comment 6 Mohammed Rafi KC 2015-05-22 07:30:48 UTC
Based on the RCA given in bug #1222442, this bug looks like it is dependent on the same.

Comment 8 Nag Pavan Chilakam 2016-03-01 06:50:17 UTC
Karthick, can you kindly check this and close or update accordingly?

Comment 9 Nag Pavan Chilakam 2016-03-01 06:50:59 UTC
Karthick, can you kindly check this and close or update accordingly?

Comment 14 krishnaram Karthick 2020-09-28 02:57:23 UTC
clearing stale needinfos.

