Bug 1264520 - volume rebalance start is successful but status reports failed
Summary: volume rebalance start is successful but status reports failed
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: 3.6.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Nithya Balachandran
QA Contact:
URL:
Whiteboard: dht-rebalance-usability
Depends On:
Blocks:
 
Reported: 2015-09-18 16:31 UTC by Leildin
Modified: 2016-08-19 12:46 UTC (History)
6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-19 12:46:36 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments (Terms of Use)
data-rebalance.vol file (2.71 KB, text/plain)
2015-09-18 16:31 UTC, Leildin
no flags Details

Description Leildin 2015-09-18 16:31:09 UTC
Created attachment 1075005 [details]
data-rebalance.vol file

Description of problem:

Upgrading a distributed volume from 6 to 8 bricks.
XFS disks mounted at /gluster/bricks/brick*.
Added bricks 7 and 8 without any problems.

Version-Release number of selected component (if applicable):
3.6.2 Gluster packages all around

How reproducible:
every time I try a rebalance, with or without the fix-layout option

Steps to Reproduce:
1. gluster volume rebalance data start
2. gluster volume rebalance data status
3. cat /var/lib/glusterd/vols/data/data-rebalance.vol to see the incorrectly generated volfile

Actual results:
Failed/non-started rebalance

/var/lib/glusterd/vols/data/data-rebalance.vol contains:
"volume data-client-0
    type protocol/client
    option send-gids true
    option frame-timeout 45
    option transport-type tcp
    option remote-subvolume /gluster/bricks/brick1/data
    option remote-host gls-safran1
    option ping-timeout 42
end-volume"

The same "volume data-client-0" stanza is repeated for the first 6 bricks, followed by
volume data-client-1 for brick 7 and
volume data-client-2 for brick 8.
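
A quick way to confirm the duplication (a sketch only, assuming the volume name and volfile path shown above) is to count the client stanza headers in the generated file:

# grep '^volume data-client-' /var/lib/glusterd/vols/data/data-rebalance.vol | sort | uniq -c

With the bug present, "volume data-client-0" shows a count of 6; on a correctly generated file each of data-client-0 through data-client-7 appears exactly once.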


Expected results:
Rebalance between the 8 bricks

Additional info:
gluster volume info:

Volume Name: data
Type: Distribute
Volume ID: df8943d0-0e40-4ba9-87f4-08bd5f962f9d
Status: Started
Number of Bricks: 8
Transport-type: tcp
Bricks:
Brick1: gls-safran1:/gluster/bricks/brick1/data
Brick2: gls-safran1:/gluster/bricks/brick2/data
Brick3: gls-safran1:/gluster/bricks/brick3/data
Brick4: gls-safran1:/gluster/bricks/brick4/data
Brick5: gls-safran1:/gluster/bricks/brick5/data
Brick6: gls-safran1:/gluster/bricks/brick6/data
Brick7: gls-safran1:/gluster/bricks/brick7/data
Brick8: gls-safran1:/gluster/bricks/brick8/data
Options Reconfigured:
performance.stat-prefetch: off
performance.cache-refresh-timeout: 30
network.frame-timeout: 45
performance.io-cache: off
performance.quick-read: off
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
diagnostics.brick-log-level: INFO
nfs.disable: on
diagnostics.client-log-level: INFO
server.allow-insecure: on
performance.cache-size: 128MB
performance.io-thread-count: 16
performance.write-behind-window-size: 4MB
performance.flush-behind: off
performance.read-ahead: on





This is my first bug submission, so please forgive any errors or omissions!

Comment 1 Joe Julian 2015-09-18 16:44:08 UTC
The failure occurs because the vol file is generated incorrectly. Parsing of the vol file fails because it contains multiple "volume data-client-0" entries.

I suspect it has something to do with all the bricks for this volume residing on a single server.

Comment 2 Leildin 2015-09-18 16:46:00 UTC
(In reply to Leildin from comment #0)

Sorry, missed some stuff here:
 
> Description of problem:
> 
> Upgrading a distributed volume from 6 to 8 bricks.
> xfs disks mounted to /gluster/bricks/brick*
> added brick 7 and 8 without any problems

The volume is called data.

The volume sees the bricks fine and I want to push the layout and files onto the new bricks.
I run gluster volume rebalance data start.
It's successful and I get a job ID.

When I check the status, it says my job failed at 0%.

We tried to regenerate all .vol files in /var/lib/glusterd/vols/data by running pkill glusterd ; glusterd --xlator-option *.upgrade=on -N ; glusterd
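
For reference, a sketch of the same regeneration sequence with the xlator-option glob quoted, so the shell cannot expand it against files in the current directory:

# pkill glusterd
# glusterd --xlator-option '*.upgrade=on' -N
# glusterd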

The .vol files have the same problem, as can be seen in the attachment and summarised in "Actual results".

Comment 3 Leildin 2015-09-21 10:15:16 UTC
So, a quick workaround is to manually edit the .vol files that contain the error; for me these were:

data-rebalance.vol
trusted-data.tcp-fuse.vol
data.tcp-fuse.vol

After these files are corrected, the rebalance can be run.
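
To verify that none of the regenerated volfiles still carries a duplicated client stanza, something like the following should be enough (a sketch, assuming the volume is named data as above):

# for f in /var/lib/glusterd/vols/data/*.vol; do echo "== $f"; grep '^volume .*-client-' "$f" | sort | uniq -d; done

Any stanza header printed under a file name is duplicated and still needs fixing.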

Comment 4 Leildin 2015-09-22 10:46:20 UTC
After the rebalance is done, the files are regenerated and the same bug is reintroduced!

Clients can't mount the volume in this state. They have the same parsing error.

Comment 5 Jiffin 2015-09-22 12:40:35 UTC
Is this bug still valid? It seems your problem is related to the creation of the vol files. Can you change the bug accordingly?

Comment 6 Leildin 2015-09-22 14:55:53 UTC
This bug is still valid. It happened after an add-brick.
I had been using this volume without problems for over a year.

When I added the bricks everything went fine, and I then tried to rebalance to bring the new bricks into use with the rest.

I have changed one option since finding this bug, I removed:
network.frame-timeout: 45

# volume reset data network.frame-timeout

If you need any more info, let me know. I have gluster clients here and there that can't mount the volume; I have to unmount and remount them after correcting the config files a second time.

Comment 7 Joe Julian 2015-09-22 15:24:14 UTC
I *think* Jiffin was suggesting that you change the component to glusterd, but I'm not entirely sure.

Comment 8 Atin Mukherjee 2015-09-22 17:25:52 UTC
Since this is an issue in the add-brick path, I would request the DHT team to look into it. Hence I am moving the component back to distribute.

Comment 9 Leildin 2015-09-28 17:07:53 UTC
Just upgraded from 3.6.2 to 3.7.4 without much of a hitch.
Rebooted and found my volume could not be mounted.
Looked at the .vol files; they had been regenerated with the same incorrect information as above.

I would like to add that I have a single server with all 8 bricks on it.
This server has never been part of a trusted pool.

Comment 10 Nithya Balachandran 2016-08-19 05:18:15 UTC
My apologies for the extremely delayed response.

I went through the code and the glusterd process generates the volfiles based on the info stored in /var/lib/glusterd/.  It looks like something might be wrong there.

glusterd uses the information in the /var/lib/glusterd/<volname>/bricks directory to generate the client info portion for the client vol files (this includes any fuse client, rebalance etc).

For example, I have a volume called loop with 3 bricks.

Volume Name: loop
Type: Distribute
Volume ID: 68b941df-b656-4950-bcfa-bdd940b774a7
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: 192.168.122.9:/bricks/brick2/b2
Brick2: 192.168.122.9:/bricks/brick1/b2
Brick3: 192.168.122.8:/bricks/brick2/b2
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
diagnostics.client-log-level: INFO


If I check the brick info stored in /var/lib/glusterd/vols/loop/bricks, I see 

-rw------- 1 root root 179 Aug 17 13:39 192.168.122.8:-bricks-brick2-b2
-rw------- 1 root root 175 Aug 17 13:39 192.168.122.9:-bricks-brick1-b2
-rw------- 1 root root 175 Aug 17 13:39 192.168.122.9:-bricks-brick2-b2


These files contain the information which is used to generate the volfiles.


[root@nb-rhs3-srv1 bricks]# cat 192.168.122.9:-bricks-brick2-b2
hostname=192.168.122.9
path=/bricks/brick2/b2
real_path=/bricks/brick2/b2
listen-port=0
rdma.listen-port=0
decommissioned=0
brick-id=loop-client-0   <--- client 0
mount_dir=/b2
snap-status=0


[root@nb-rhs3-srv1 bricks]# cat 192.168.122.9:-bricks-brick1-b2
hostname=192.168.122.9
path=/bricks/brick1/b2
real_path=/bricks/brick1/b2
listen-port=0
rdma.listen-port=0
decommissioned=0
brick-id=loop-client-1    <--- client 1
mount_dir=/b2
snap-status=0

[root@nb-rhs3-srv1 bricks]# cat 192.168.122.8:-bricks-brick2-b2
hostname=192.168.122.8
path=/bricks/brick2/b2
real_path=/bricks/brick2/b2
listen-port=49152
rdma.listen-port=0
decommissioned=0
brick-id=loop-client-2   <--- client 2
mount_dir=/b2
snap-status=0


It sounds like the files in /var/lib/glusterd/vols/data/bricks for the original 6 bricks have for some reason ended up with the same brick-id.

We do not know why this could have happened. If you have any steps to reproduce the issue, please let us know.

Can you please send across the contents of /var/lib/glusterd/vols/data on the server so we can confirm this theory?

If this is the case, this problem will show up every time the volfiles are generated (if you were to change an option or add/remove bricks, for example). You will need to edit the files and correct the brick ids in the same order as listed in the gluster volume info.


Brick1: gls-safran1:/gluster/bricks/brick1/data   <-- data-client-0
Brick2: gls-safran1:/gluster/bricks/brick2/data   <-- data-client-1
Brick3: gls-safran1:/gluster/bricks/brick3/data   <-- data-client-2
Brick4: gls-safran1:/gluster/bricks/brick4/data   <-- data-client-3
Brick5: gls-safran1:/gluster/bricks/brick5/data   <-- data-client-4
Brick6: gls-safran1:/gluster/bricks/brick6/data   <-- data-client-5
Brick7: gls-safran1:/gluster/bricks/brick7/data   <-- data-client-6
Brick8: gls-safran1:/gluster/bricks/brick8/data   <-- data-client-7
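
A quick way to check whether the stored brick files really carry duplicate ids (a sketch, assuming the paths described above):

# grep '^brick-id=' /var/lib/glusterd/vols/data/bricks/* | awk -F= '{print $2}' | sort | uniq -c

Each id from data-client-0 through data-client-7 should appear exactly once; any count greater than 1 confirms the duplication.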

Please let me know if you have any questions.

Comment 11 Leildin 2016-08-19 09:23:09 UTC
(In reply to Nithya Balachandran from comment #10)
> Can you please send across the contents of /var/lib/glusterd/vols/data on
> the server so we can confirm this theory?
> 
> If this is the case, this problem will show up every time the volfiles are
> generated (if you were to change an option or add/remove bricks, for
> example). You will need to edit the files and correct the brick ids in the
> same order as listed in the gluster volume info.

Hi,

I have since moved on to gluster 3.7.14 on all of my servers.
I can confirm that while I had the bug, any rebalance or option change would corrupt the vol files.
I had to go back in, correct them, and then upgrade to stop hitting the bug.
Do you still want the /var/lib/glusterd/vols/data files?
They are correct and don't get corrupted anymore.

Comment 12 Nithya Balachandran 2016-08-19 10:55:08 UTC
(In reply to Leildin from comment #11)
> Hi,
> 
> I have since moved on to gluster 3.7.14 on all of my servers.
> I can confirm that while I had the bug, any rebalance or option change would
> corrupt the vol files.
> I had to go back in, correct them, and then upgrade to stop hitting the bug.
> Do you still want the /var/lib/glusterd/vols/data files?
> They are correct and don't get corrupted anymore.



Hi,

If you corrected the files in /var/lib/glusterd/vols/data/bricks, then yes, this should not happen anymore and we don't need the files. 


In that case can we close the BZ?

Comment 13 Nithya Balachandran 2016-08-19 10:59:31 UTC
If you have not corrected the files in /var/lib/glusterd/vols/data/bricks, please send them across.

Comment 14 Leildin 2016-08-19 12:13:19 UTC
(In reply to Nithya Balachandran from comment #13)
> If you have not corrected the files in /var/lib/glusterd/vols/data/bricks,
> please send them across.

I've corrected them since hitting the bug, changed versions, and done all kinds of rebalances and option changes without any issue since.
Let's close this BZ.

Comment 15 Nithya Balachandran 2016-08-19 12:46:36 UTC
Thanks. Closing this BZ as per comment #14.

