Bug 1355846 - Data corruption when disabling sharding
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: sharding
Version: 3.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Assignee: Krutika Dhananjay
QA Contact: bugs@gluster.org
 
Reported: 2016-07-12 17:33 UTC by Alessandro Corbelli
Modified: 2017-11-07 10:36 UTC

Last Closed: 2017-11-07 10:36:36 UTC

Description Alessandro Corbelli 2016-07-12 17:33:01 UTC
Description of problem:

When sharding is switched off on a previously sharded volume, the sharded data become corrupted and are no longer readable until sharding is set back to on.

How reproducible:


Steps to Reproduce:
1. create a sharded volume
2. create a file that makes use of multiple shards
3. disable sharding
4. try to read the previously created file

Actual results:

The file is unreadable; only the first shard is available.
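
For reference, a sketch of what is happening, assuming the standard shard layout: shards after the first block are stored as hidden files under the .shard directory on each brick, named <GFID-of-base-file>.<block-number>, so a 50MB file with a 10MB shard size has a base file plus four shard files. The listing below is illustrative, not captured from this setup:

node1# ls /export/sdb1/brick/.shard
0f32988c-...(GFID).1  0f32988c-...(GFID).2  0f32988c-...(GFID).3  0f32988c-...(GFID).4

With features.shard off, the shard translator is no longer loaded in the client graph, so reads return only the base file.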


Expected results:

The file must remain readable even if sharding is disabled, or the "gluster" CLI must refuse to disable sharding if there are sharded files in the volume.





node1# gluster volume set gv0 features.shard on
volume set: success
node1# gluster volume set gv0 features.shard-block-size 10MB
volume set: success
node1# gluster volume info                                  
 
Volume Name: gv0
Type: Replicate
Volume ID: 2a36dc0f-1d9b-469c-82de-9d8d98321b83
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 1.2.3.112:/export/sdb1/brick
Brick2: 1.2.3.113:/export/sdb1/brick
Brick3: 1.2.3.114:/export/sdb1/brick
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.shard: on
features.shard-block-size: 10MB
performance.write-behind-window-size: 1GB
performance.cache-size: 1GB

client# fallocate -l 50M testfile
client# md5sum testfile 
25e317773f308e446cc84c503a6d1f85  testfile

node1# gluster volume set gv0 features.shard off
volume set: success

client# md5sum testfile 
f1c9645dbc14efddc7d8a322685f26eb  testfile
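
That checksum is consistent with reading 10MB of zero bytes, i.e. exactly one shard-block-size worth of the fallocated (zero-filled) file. Assuming GNU coreutils, this can be checked locally:

client# head -c 10M /dev/zero | md5sum
f1c9645dbc14efddc7d8a322685f26eb  -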

node1# gluster volume set gv0 features.shard on
volume set: success

client# md5sum testfile
25e317773f308e446cc84c503a6d1f85  testfile

Comment 1 Krutika Dhananjay 2016-07-13 05:25:52 UTC
Thanks for the feedback. Until this is fixed, would you mind following the instruction documented at http://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/shard/:

"... If you want to disable sharding on a volume, it is advisable to create a new volume without sharding and copy out contents of this volume into the new volume."


-Krutika

Comment 2 Alessandro Corbelli 2016-07-13 06:50:19 UTC
I wasn't aware of that paragraph in the docs, but may I suggest blocking the disabling of sharding when sharded files exist, unless a "force" option is set?

With data corruption at stake, you can't rely on a simple warning in the docs.

Additionally, what would happen when changing the shard size with existing sharded files? I think the same corruption would occur.

Until there is a fix, the gluster CLI should block these kinds of changes for safety.

Comment 3 Alessandro Corbelli 2016-07-13 06:51:46 UTC
(How can I edit my comment? I made many typos that I would like to correct.)

Comment 4 Niels de Vos 2016-09-12 05:37:03 UTC
All 3.8.x bugs are now reported against version 3.8 (without .x). For more information, see http://www.gluster.org/pipermail/gluster-devel/2016-September/050859.html

Comment 5 Alessandro Corbelli 2016-10-29 13:01:14 UTC
Is this fixed?

The ability to easily corrupt the whole cluster directly from a Gluster command, by changing the shard size, is very scary.

There is no way back: if you change the shard size on a populated cluster, older files are corrupted and newer files are saved with the new shard size. You have to revert the shard size to access the older files, but then you lose the newer files, or vice versa: keep the new size and you have access to the newer files but lose the older ones.

This must be fixed. Don't allow users to change the shard size when data is already on the cluster (or allow the change only with a --force argument or similar).

Comment 6 Krutika Dhananjay 2016-11-14 11:15:45 UTC
Sorry about the late response.

So you see this issue even when you change the shard-block-size on a volume? Or is this only if sharding itself is disabled altogether?

We wrote sharding so that it would keep working even if the shard-block-size on a volume is changed, by storing the block-size property of every sharded file in the form of an extended attribute called "trusted.glusterfs.shard-block-size".
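
For example, the attribute can be read directly off a brick with getfattr; the path and value below are illustrative (0xa00000 is 10MB):

node1# getfattr -e hex -n trusted.glusterfs.shard-block-size /export/sdb1/brick/testfile
# file: export/sdb1/brick/testfile
trusted.glusterfs.shard-block-size=0x0000000000a00000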

Comment 7 Alessandro Corbelli 2016-11-14 11:25:13 UTC
Yes, the issue arises in these two cases:

1) disabling sharding on a sharded volume with data on it
2) changing the shard size on a sharded volume with data on it

Another idea could be to store the shard size for each file in an xattr.
In that case, changing the shard size won't affect existing files, as the older shard size can be used by reading its value from the xattr.

Comment 8 Krutika Dhananjay 2016-11-14 13:24:54 UTC
(In reply to Alessandro Corbelli from comment #7)
> Yes, the issue arises in these two cases:
> 
> 1) disabling sharding on a sharded volume with data on it

This is currently not fixed. The same is documented at http://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/shard/ 

> 2) changing the shard size on a sharded volume with data on it

Hmm, this is something I wasn't able to recreate; I tried it just now. Do you have a consistently reproducible test case for this?

> 
> Another idea could be to store the shard size for each file in an xattr.
> In that case, changing the shard size won't affect existing files, as the
> older shard size can be used by reading its value from the xattr.

Yes, that is precisely what sharding does in its current state. It's the same thing I was referring to in comment #6. :)

-Krutika

Comment 9 Alessandro Corbelli 2016-11-14 15:32:39 UTC
You are probably right.
I don't remember exactly whether changing the shard size would affect the stored data, but disabling sharding would certainly break everything. You could still use sharding if the shard size is stored in the xattr, ignoring the volume's default.

Comment 11 Krutika Dhananjay 2016-11-15 01:26:46 UTC
Got it. So would it be safe to say comment #5 doesn't hold true anymore?

-Krutika

Comment 12 Alessandro Corbelli 2016-12-26 19:09:58 UTC
Yes, comment #5 is wrong.

Comment 13 Niels de Vos 2017-11-07 10:36:36 UTC
This bug is getting closed because the 3.8 version is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.

