Bug 1572599 - converting distribute to distribute-replicate is failing
Summary: converting distribute to distribute-replicate is failing
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Nikhil Ladha
QA Contact: Bala Konda Reddy M
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-04-27 11:53 UTC by Vijay Avuthu
Modified: 2020-06-10 12:12 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-10 11:59:56 UTC
Embargoed:



Description Vijay Avuthu 2018-04-27 11:53:27 UTC
Description of problem:

Converting a distribute volume (2 bricks) to distribute-replicate (2 * 3) fails with "Commit failed on <node>".

Version-Release number of selected component (if applicable):

glusterfs-3.12.2-8.el7rhgs.x86_64

How reproducible:

Always

Steps to Reproduce:
1. Create a distribute volume (2 bricks) and start it.
2. Add bricks so that it converts to a Distribute-Replicate (2 * 3) volume.
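
The two steps above can be sketched as a small shell function. This is only an illustration: the volume name and brick paths are taken from the output in this report, while the `GLUSTER` variable and the function name `reproduce_add_brick` are hypothetical helpers added so the commands can be dry-run (for example with `GLUSTER=echo`) before touching a real cluster.

```shell
#!/bin/sh
# Sketch of the reproduction steps. GLUSTER is a hypothetical override
# (not part of the gluster CLI); set GLUSTER=echo to print the commands
# instead of running them.
reproduce_add_brick() {
    gluster_cmd="${GLUSTER:-gluster}"

    # Step 1: create and start a two-brick distribute volume.
    "$gluster_cmd" volume create dist \
        10.70.35.61:/bricks/brick1/b0 \
        10.70.35.174:/bricks/brick1/b1 || return 1
    "$gluster_cmd" volume start dist || return 1

    # Step 2: add four bricks with "replica 3" to convert the volume
    # to distribute-replicate (2 * 3).
    "$gluster_cmd" volume add-brick dist replica 3 \
        10.70.35.17:/bricks/brick1/b2 \
        10.70.35.163:/bricks/brick1/b3 \
        10.70.35.136:/bricks/brick1/b4 \
        10.70.35.214:/bricks/brick1/b5
}
```

Running `GLUSTER=echo; reproduce_add_brick` prints the three commands without executing them.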


> 
# gluster vol info
 
Volume Name: dist
Type: Distribute
Volume ID: f648ecbc-5763-4ee0-9496-b1c5379cc480
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.70.35.61:/bricks/brick1/b0
Brick2: 10.70.35.174:/bricks/brick1/b1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
cluster.localtime-logging: disable
# 

> 
# gluster vol add-brick dist replica 3 10.70.35.17:/bricks/brick1/b2 10.70.35.163:/bricks/brick1/b3 10.70.35.136:/bricks/brick1/b4 10.70.35.214:/bricks/brick1/b5

volume add-brick: failed: Commit failed on dhcp35-17.lab.eng.blr.redhat.com. Please check log file for details.
Commit failed on dhcp35-214.lab.eng.blr.redhat.com. Please check log file for details.
Commit failed on dhcp35-163.lab.eng.blr.redhat.com. Please check log file for details.
Commit failed on dhcp35-136.lab.eng.blr.redhat.com. Please check log file for details.
#

# gluster vol status 
Status of volume: dist
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.61:/bricks/brick1/b0         49152     0          Y       6697 
Brick 10.70.35.17:/bricks/brick1/b2         N/A       N/A        N       N/A  
Brick 10.70.35.163:/bricks/brick1/b3        N/A       N/A        N       N/A  
Brick 10.70.35.174:/bricks/brick1/b1        49152     0          Y       5628 
Brick 10.70.35.136:/bricks/brick1/b4        N/A       N/A        N       N/A  
Brick 10.70.35.214:/bricks/brick1/b5        N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       6842 
Self-heal Daemon on dhcp35-214.lab.eng.blr.
redhat.com                                  N/A       N/A        N       N/A  
Self-heal Daemon on dhcp35-17.lab.eng.blr.r
edhat.com                                   N/A       N/A        N       N/A  
Self-heal Daemon on dhcp35-163.lab.eng.blr.
redhat.com                                  N/A       N/A        N       N/A  
Self-heal Daemon on dhcp35-136.lab.eng.blr.
redhat.com                                  N/A       N/A        N       N/A  
Self-heal Daemon on dhcp35-174.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       5744 
 
Task Status of Volume dist
------------------------------------------------------------------------------
There are no active volume tasks
 
# 

> glusterd logs from failed node:

[2018-04-27 11:19:31.807111] I [run.c:190:runner_log] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x39f05) [0x7fd5810c2f05] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe29dd) [0x7fd58116b9dd] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fd58c672805] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh --volname=dist --version=1 --volume-op=add-brick --gd-workdir=/var/lib/glusterd
[2018-04-27 11:19:31.807293] I [MSGID: 106578] [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: replica-count is set 3
[2018-04-27 11:19:31.807340] I [MSGID: 106578] [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: type is set 2, need to change it
[2018-04-27 11:19:32.960558] E [MSGID: 106054] [glusterd-utils.c:13609:glusterd_handle_replicate_brick_ops] 0-management: Failed to set extended attribute trusted.add-brick : Read-only file system [Read-only file system]
[2018-04-27 11:19:32.992604] E [MSGID: 106074] [glusterd-brick-ops.c:2590:glusterd_op_add_brick] 0-glusterd: Unable to add bricks
[2018-04-27 11:19:32.992704] E [MSGID: 106123] [glusterd-mgmt.c:312:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit failed.
[2018-04-27 11:19:32.992755] E [MSGID: 106123] [glusterd-mgmt-handler.c:603:glusterd_handle_commit_fn] 0-management: commit failed on operation Add brick
[2018-04-27 11:21:29.338876] I [MSGID: 106488] [glusterd-handler.c:1549:__glusterd_handle_cli_get_volume] 0-management: Received get vol req
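
To pull only the error-level entries out of a glusterd log like the excerpt above, a filter along these lines may help. It is a minimal sketch: it assumes the line layout shown here (timestamp in brackets, then a one-letter severity as the third whitespace-separated field), and both the function name `glusterd_errors` and the default path `/var/log/glusterfs/glusterd.log` are assumptions that may need adjusting for a given installation.

```shell
#!/bin/sh
# Print only error-level ("E") lines from a glusterd log.
# Assumed line layout: "[YYYY-MM-DD HH:MM:SS.usec] E [MSGID: n] ..."
# so the severity letter is the third whitespace-separated field.
glusterd_errors() {
    # The default log path is an assumption; pass a file to override it.
    awk '$3 == "E"' "${1:-/var/log/glusterfs/glusterd.log}"
}
```

On the failed node, `glusterd_errors` applied to the excerpt above would keep the four `E` lines (MSGID 106054, 106074 and the two 106123 entries) and drop the informational ones.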

Additional info:


sos reports:

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/vavuthu/volume_convertion/

Comment 15 Atin Mukherjee 2018-11-27 03:04:46 UTC
Vijay - the needinfo has been pending since May. Can you please clear it at the earliest?

Comment 17 Atin Mukherjee 2018-12-21 08:09:17 UTC
Karthik - I'd like to see this bug addressed in 3.4 BU5. Please plan it accordingly.

