Bug 1335156 - Ext4 FS corruption in VM hosted on sharded volume
Summary: Ext4 FS corruption in VM hosted on sharded volume
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: sharding
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.1.3
Assignee: Krutika Dhananjay
QA Contact: Bhaskarakiran
URL:
Whiteboard:
Depends On:
Blocks: Gluster-HC-1 1311817
 
Reported: 2016-05-11 12:57 UTC by Bhaskarakiran
Modified: 2016-11-23 23:12 UTC (History)
7 users (show)

Fixed In Version: glusterfs-3.7.9-6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-06-29 14:09:45 UTC
Embargoed:


Attachments
sosreport (6.33 MB, application/x-xz)
2016-05-23 06:05 UTC, Bhaskarakiran

Description Bhaskarakiran 2016-05-11 12:57:20 UTC
Description of problem:
------------------------

Installed RHEL6 on the VM. Added a 100G disk and formatted it with ext4. Ran I/O on it (dd, fio, Linux kernel untar, and directory creation). Input/output errors are seen during the I/O, and listing the contents afterwards shows the following:

-?????????? ? ?    ?           ?            ? 189dhcp46-189-512k-vdb-write-seq.results_bw.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-512k-vdb-write-seq.results_iops.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-randread-para.results_bw.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-randread-para.results_iops.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-randread-seq.results_bw.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-randread-seq.results_iops.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-randwrite-para.results_bw.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-randwrite-para.results_iops.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-randwrite-seq.results_bw.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-randwrite-seq.results_iops.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-read-para.results_bw.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-read-para.results_iops.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-read-seq.results_bw.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-read-seq.results_iops.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-write-para.results_bw.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-write-para.results_iops.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-write-seq.results_bw.log
-?????????? ? ?    ?           ?            ? 189dhcp46-189-64k-vdb-write-seq.results_iops.log
drwxr-xr-x. 1 root root 30154752 May 11 18:22 dirs
drwxr-xr-x. 2 root root 11284480 May 10 14:58 files
drwxr-xr-x. 4 root root     4096 May 11 16:09 iso
drwxr-xr-x. 6 root root     4096 May 11 16:26 li ?x
d?????????? ? ?    ?           ?            ? lost*found

There was data in the iso directory but I no longer see it, even though nothing in it was deleted.
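For reference, the in-guest workload described above could be approximated with a sketch like the one below; the device name, mount point, and fio parameters are illustrative assumptions rather than values taken from this report:

# Format and mount the attached 100G data disk (device and mount point assumed)
mkfs.ext4 /dev/vdb
mkdir -p /mnt/test && mount /dev/vdb /mnt/test

# dd: large sequential write
dd if=/dev/zero of=/mnt/test/ddfile bs=1M count=10240 oflag=direct

# fio: one of the sequential/random read/write jobs (parameters illustrative)
fio --name=write-seq --filename=/mnt/test/fiofile --rw=write \
    --bs=64k --size=10G --ioengine=libaio --direct=1

# Linux kernel untar and directory creation
tar -xf linux-4.x.tar.xz -C /mnt/test/
mkdir -p /mnt/test/dirs/dir{00001..10000}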

[root@rhsqa13 .shard]# gluster v status
Status of volume: data 
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dhcp43-201.lab.eng.blr.redhat.com:/rh
gs/data/data                                49154     0          Y       3670
Brick dhcp43-219.lab.eng.blr.redhat.com:/rh
gs/data/data                                49154     0          Y       6335
Brick dhcp43-220.lab.eng.blr.redhat.com:/rh
gs/data/data                                49154     0          Y       3295
NFS Server on localhost                     2049      0          Y       7007
Self-heal Daemon on localhost               N/A       N/A        Y       7023
NFS Server on dhcp43-219.lab.eng.blr.redhat
.com                                        2049      0          Y       8702
Self-heal Daemon on dhcp43-219.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       8710
NFS Server on dhcp43-220.lab.eng.blr.redhat
.com                                        2049      0          Y       11181
Self-heal Daemon on dhcp43-220.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       11189

Task Status of Volume data
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: engine_vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dhcp43-201.lab.eng.blr.redhat.com:/rh
gs/engine/ev                                49152     0          Y       17634
Brick dhcp43-219.lab.eng.blr.redhat.com:/rh
gs/engine/ev                                49152     0          Y       8682 
Brick dhcp43-220.lab.eng.blr.redhat.com:/rh
gs/engine/ev                                49152     0          Y       17048
NFS Server on localhost                     2049      0          Y       7007 
Self-heal Daemon on localhost               N/A       N/A        Y       7023 
NFS Server on dhcp43-219.lab.eng.blr.redhat
.com                                        2049      0          Y       8702 
Self-heal Daemon on dhcp43-219.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       8710 
NFS Server on dhcp43-220.lab.eng.blr.redhat
.com                                        2049      0          Y       11181
Self-heal Daemon on dhcp43-220.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       11189
 
Task Status of Volume engine_vol
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vmstore
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dhcp43-201.lab.eng.blr.redhat.com:/rh
gs/vmstore/vms                              49153     0          Y       69218
Brick dhcp43-219.lab.eng.blr.redhat.com:/rh
gs/vmstore/vms                              49153     0          Y       18353
Brick dhcp43-220.lab.eng.blr.redhat.com:/rh
gs/vmstore/vms                              49153     0          Y       26617
NFS Server on localhost                     2049      0          Y       7007
Self-heal Daemon on localhost               N/A       N/A        Y       7023
NFS Server on dhcp43-219.lab.eng.blr.redhat
.com                                        2049      0          Y       8702
Self-heal Daemon on dhcp43-219.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       8710
NFS Server on dhcp43-220.lab.eng.blr.redhat
.com                                        2049      0          Y       11181
Self-heal Daemon on dhcp43-220.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       11189

Task Status of Volume vmstore
------------------------------------------------------------------------------
There are no active volume tasks

[root@rhsqa13 .shard]# gluster v info

Volume Name: data
Type: Replicate
Volume ID: 12b4d188-85c7-429b-81ba-59b641efd15e
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: dhcp43-201.lab.eng.blr.redhat.com:/rhgs/data/data
Brick2: dhcp43-219.lab.eng.blr.redhat.com:/rhgs/data/data
Brick3: dhcp43-220.lab.eng.blr.redhat.com:/rhgs/data/data
Options Reconfigured:  
diagnostics.client-log-level: DEBUG
cluster.data-self-heal-algorithm: full
performance.low-prio-threads: 32
features.shard-block-size: 512MB
features.shard: on
storage.owner-gid: 36  
storage.owner-uid: 36  
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.readdir-ahead: on
cluster.shd-max-threads: 4

Volume Name: engine_vol
Type: Replicate
Volume ID: b98cda8e-19a0-4372-9518-361d9e2d8315
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: dhcp43-201.lab.eng.blr.redhat.com:/rhgs/engine/ev
Brick2: dhcp43-219.lab.eng.blr.redhat.com:/rhgs/engine/ev
Brick3: dhcp43-220.lab.eng.blr.redhat.com:/rhgs/engine/ev
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36  
storage.owner-gid: 36  
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.shd-max-threads: 4

Volume Name: vmstore   
Type: Replicate
Volume ID: 27f1afb4-6fe8-4cf1-9d29-8deefbcdb43f
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: dhcp43-201.lab.eng.blr.redhat.com:/rhgs/vmstore/vms
Brick2: dhcp43-219.lab.eng.blr.redhat.com:/rhgs/vmstore/vms
Brick3: dhcp43-220.lab.eng.blr.redhat.com:/rhgs/vmstore/vms
Options Reconfigured:  
diagnostics.client-log-level: DEBUG
cluster.data-self-heal-algorithm: full
performance.low-prio-threads: 32
features.shard-block-size: 512MB
features.shard: on
storage.owner-gid: 36  
storage.owner-uid: 36  
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.readdir-ahead: on
cluster.shd-max-threads: 4
[root@rhsqa13 .shard]# 
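For reference, sharding options like those listed under 'Options Reconfigured' above are normally applied per volume with gluster volume set; a minimal sketch using a volume name from this report:

gluster volume set vmstore features.shard on
gluster volume set vmstore features.shard-block-size 512MB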


Version-Release number of selected component (if applicable):
--------------------------------------------------------------
3.7.9-3

How reproducible:
-----------------
100%

Steps to Reproduce:
-------------------
As in description.

Actual results:


Expected results:


Additional info:
----------------
sosreports will be attached.

Comment 2 Krutika Dhananjay 2016-05-11 13:06:39 UTC
An input/output error would most likely be due to a split-brain.
Did you check for that?
If the issue is split-brain, then the bug is not in sharding.

-Krutika
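
For reference, a split-brain check on this setup would typically look like the following (volume names taken from the report):

# List files currently in split-brain on the affected volumes
gluster volume heal vmstore info split-brain
gluster volume heal data info split-brain

# Broader view of any pending heals
gluster volume heal vmstore info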

Comment 3 Bhaskarakiran 2016-05-20 05:09:00 UTC
My bad for delaying this. I have not seen any split-brain during this corruption.

Comment 4 Krutika Dhananjay 2016-05-20 06:32:18 UTC
Could you please attach the sosreports?

Comment 5 Bhaskarakiran 2016-05-20 08:50:59 UTC
The same VM is not there now due to setup issues. I will have to re-create it. Will provide the sosreports once it is done.

Comment 6 Bhaskarakiran 2016-05-23 06:05:15 UTC
Created attachment 1160452 [details]
sosreport

Comment 7 Krutika Dhananjay 2016-05-23 08:54:45 UTC
Hi Bhaskar,

I checked the attachment.
There are no directories '/var/log/glusterfs' or '/var/lib/glusterd' in the sosreport. Could you attach the correct sosreports?

-Krutika

Comment 8 Bhaskarakiran 2016-05-24 07:43:17 UTC
All the sosreports are copied to:

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1335156/

Comment 9 Krutika Dhananjay 2016-05-24 12:56:57 UTC
I checked your setup.

For the disks with uuids badff993-fe76-4147-97b7-dfd53293907f and 7ff05ee6-b76d-46cd-a98a-17569e2bd318, there are corresponding directories under images/ in the volume 'vmstore', each of which contains another uuid file, plus a .lease and a .meta file for that uuid file.

For 7ff05ee6-b76d-46cd-a98a-17569e2bd318, it turns out there were two such files, with two .meta and two .lease files.
Is this normal?

Also, for one of those files - namely 8bd7bc1d-8ed3-49e4-92ff-72cd009cd6a6 - the file was empty and all its shards were also 0 bytes. In essence, the file contains no data at all.

Not sure if this is normal either.

Still needs investigation.

-Krutika
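
For context on how the on-brick layout referenced above can be checked: with sharding, the base file holds the first block at its original path and the remaining blocks live under the brick's .shard directory, named <gfid>.<block-index>. A minimal inspection sketch (brick path taken from the report; the image path and GFID below are placeholders, not values from this bug):

# On any brick of the vmstore volume, read the GFID of the base image file
getfattr -n trusted.gfid -e hex /rhgs/vmstore/vms/<path-to-image-file>

# Convert the hex value to the canonical UUID form (8-4-4-4-12) and list its
# shards; blocks that have been written should be non-zero in size
ls -l /rhgs/vmstore/vms/.shard/ | grep <gfid>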

Comment 10 Krutika Dhananjay 2016-05-24 12:59:44 UTC
One more thing: I found the files associated with the uuids for the corrupt data disks given by Bhaskarakiran on the volume vmstore, as opposed to data.
Is this okay, or is that incorrect?

Comment 11 Bhaskarakiran 2016-05-24 16:16:18 UTC
The primary disk, i.e. the OS disk, is on data, while the I/O is run on the disk attached from vmstore. The corruption is seen on the disk that comes from vmstore.

Comment 14 Pranith Kumar K 2016-05-27 09:57:34 UTC
https://code.engineering.redhat.com/gerrit/74760, which fixes bug 1330044, is the patch that fixes the I/O errors seen even when there were no files in split-brain.

Pranith

Comment 15 Bhaskarakiran 2016-06-01 05:49:03 UTC
Tested on the 3.7.9-6 build and didn't see the corruption. Marking this as fixed for now. Will re-open if it's seen again.

