Bug 1386678 - [md-cache]: forceful brick down shows i/o error for some files and unknown permissions for some files
Summary: [md-cache]: forceful brick down shows i/o error for some files and unknown permissions for some files
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: disperse
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.2.0
Assignee: Pranith Kumar K
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-10-19 12:36 UTC by Nag Pavan Chilakam
Modified: 2016-12-19 12:44 UTC
9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-19 12:44:44 UTC
Target Upstream Version:


Attachments (Terms of Use)
complete file list(but displayed only partial list) (1.18 MB, text/plain)
2016-10-19 12:38 UTC, Nag Pavan Chilakam

Description Nag Pavan Chilakam 2016-10-19 12:36:57 UTC
On an EC volume (4+2) with md-cache enabled, I created about 1 lakh 20-byte files and then brought down one brick by killing the brick process.
After that I did a lookup (ls -l); it seemed to display all files without issue.
I then brought down another brick by forcefully unmounting the brick backend (the XFS LV).
I then issued an ls on the client. The client displayed files as below.


For some files:
[root@dhcp35-180 dir1]# time ls -lt
ls: cannot access c.4047: Input/output error
ls: cannot access c.4048: Input/output error
ls: cannot access c.4049: Input/output error
ls: cannot access c.4050: Input/output error
ls: cannot access c.4051: Input/output error
ls: cannot access c.4052: Input/output error
ls: cannot access c.4053: Input/output error
ls: cannot access c.4054: Input/output error
ls: cannot access c.4055: Input/output error
ls: cannot access c.4056: Input/output error
ls: cannot access c.4057: Input/output error
ls: cannot access c.4058: Input/output error
ls: cannot access c.4059: Input/output error
ls: cannot access c.4060: Input/output error
ls: cannot access c.4061: Input/output error
ls: cannot access c.4062: Input/output error
ls: cannot access c.4063: Input/output error


For some files:
-?????????? ? ?    ?       ?            ? c.4248
-?????????? ? ?    ?       ?            ? c.4249
-?????????? ? ?    ?       ?            ? c.4250
-?????????? ? ?    ?       ?            ? c.4251
-?????????? ? ?    ?       ?            ? c.4252
-?????????? ? ?    ?       ?            ? c.4253
-?????????? ? ?    ?       ?            ? c.4254


For some files:

-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2571
-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2570
-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2569
-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2568
-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2567
-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2566
-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2565
-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2564
-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2563
-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2562
-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2561
-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2560
-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2559
-rw-r--r--. 1 root root   20 Oct 19 16:14 a.2558




In total there were 1,10,000 (110,000) files, but ls displayed only about 25,000 files (the complete output is attached).
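
For reference, a quick way to compare the visible count against the expected total (the mount path here is illustrative, not taken from this setup):

    ls /mnt/ecvol/dir1 | wc -l    # expected ~110000; only ~25000 entries were listed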


However, on a second ls, everything displayed correctly.


I saw this once even on an AFR volume with md-cache enabled (with just one brick of a replica pair down, brought down by killing the brick process).
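
For reference, a rough sketch of the reproduction steps described above (the mount point, brick paths and file counts are illustrative; the exact commands used were not recorded):

    # on the client (FUSE mount assumed at /mnt/ecvol)
    for i in $(seq 1 110000); do printf '%020d' "$i" > /mnt/ecvol/dir1/c.$i; done   # ~1 lakh 20-byte files

    # on the first server: bring down one brick by killing its process
    gluster volume status ecvol        # note the Pid of the target brick (glusterfsd)
    kill <brick-pid>                   # <brick-pid> is a placeholder

    # on the client: lookup still lists everything fine
    ls -l /mnt/ecvol/dir1 > /dev/null

    # on a second server: forcefully (lazily) unmount the brick backend (the XFS LV)
    umount -l /rhs/brick3/ecvol

    # on the client: this listing showed Input/output errors and '??????????' entries
    ls -lt /mnt/ecvol/dir1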



Status of volume: ecvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.86:/rhs/brick2/ecvol         N/A       N/A        N       N/A  
Brick 10.70.35.9:/rhs/brick2/ecvol          49155     0          Y       18173
Brick 10.70.35.153:/rhs/brick2/ecvol        49154     0          Y       15230
Brick 10.70.35.79:/rhs/brick2/ecvol         49154     0          Y       13959
Brick 10.70.35.86:/rhs/brick3/ecvol         49154     0          Y       17610 ===>above brick was forcefully unmounted using umount -l
Brick 10.70.35.9:/rhs/brick3/ecvol          49156     0          Y       18193
Self-heal Daemon on localhost               N/A       N/A        Y       17631
Self-heal Daemon on 10.70.35.153            N/A       N/A        Y       15856
Self-heal Daemon on 10.70.35.79             N/A       N/A        Y       14590
Self-heal Daemon on 10.70.35.9              N/A       N/A        Y       18858
 
Task Status of Volume ecvol
------------------------------------------------------------------------------
There are no active volume tasks
 

Volume Name: ecvol
Type: Disperse
Volume ID: 809177ca-258a-4262-9ec5-7744ea4f7564
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.35.86:/rhs/brick2/ecvol
Brick2: 10.70.35.9:/rhs/brick2/ecvol
Brick3: 10.70.35.153:/rhs/brick2/ecvol
Brick4: 10.70.35.79:/rhs/brick2/ecvol
Brick5: 10.70.35.86:/rhs/brick3/ecvol
Brick6: 10.70.35.9:/rhs/brick3/ecvol
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
features.cache-invalidation: on
features.cache-invalidation-timeout: 60
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 60
[root@dhcp35-86 ~]# 
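
For reference, the md-cache-related settings listed above would typically be enabled with 'gluster volume set'; a sketch using the option names and values exactly as they appear in the volume info (not necessarily the commands that were actually run on this setup):

    gluster volume set ecvol features.cache-invalidation on
    gluster volume set ecvol features.cache-invalidation-timeout 60
    gluster volume set ecvol performance.stat-prefetch on          # the md-cache xlator
    gluster volume set ecvol performance.cache-invalidation on
    gluster volume set ecvol performance.md-cache-timeout 60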



BUILD:
The build was taken from what was mentioned in http://etherpad.corp.redhat.com/md-cache-3-2


nfs-ganesha-gluster-2.4.0-2.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-api-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-events-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-libs-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-cli-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-server-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-rdma-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-2.26.git0a405a4.el7rhgs.x86_64
python-gluster-3.8.4-2.26.git0a405a4.el7rhgs.noarch
glusterfs-fuse-3.8.4-2.26.git0a405a4.el7rhgs.x86_64

Comment 2 Nag Pavan Chilakam 2016-10-19 12:38:16 UTC
Created attachment 1212137 [details]
complete file list(but displayed only partial list)

Comment 3 Poornima G 2016-10-20 07:27:02 UTC
The EC volume is used with redundancy count 2, correct? If so, we expect the data to remain intact even after bringing down 2 bricks, right?

Does a fresh mount also display the same error when 'ls' is issued?

Also, did you happen to try the same test case with the md-cache features disabled?
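
If it helps, a possible way to rerun the same test with the md-cache related options turned off (option names are the ones shown in the volume info in the description; this is only a sketch, not a prescribed procedure):

    gluster volume set ecvol performance.stat-prefetch off        # disables the md-cache xlator
    gluster volume set ecvol performance.cache-invalidation off
    gluster volume set ecvol features.cache-invalidation off
    gluster volume reset ecvol performance.md-cache-timeout       # drop the custom timeout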

Comment 4 Poornima G 2016-10-20 07:33:33 UTC
I see that the other brick was brought down by "forcefully umounting the brick". If something goes wrong on the backend without the client's knowledge, I am not sure the client is intelligent enough to identify the failure the first time and re-read from a different brick.

Also, md-cache doesn't cache the readdirp data; readdir-ahead may have some role to play. If the second brick is brought down by killing the brick process instead, this issue may not be seen.

My understanding is that this issue will be reproducible even without the md-cache feature enabled. Can we check with the EC maintainers whether this is expected behaviour?
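
For comparison, a sketch of bringing the second brick down by killing its process instead of unmounting the backend (the PID comes from the Pid column of 'gluster volume status' as shown in the description; <pid> is a placeholder):

    gluster volume status ecvol                        # find the glusterfsd Pid of the target brick
    kill <pid>                                         # terminate that brick process
    # alternatively, locate the brick process by its path:
    ps -ef | grep glusterfsd | grep '/rhs/brick3/ecvol'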

Comment 5 Atin Mukherjee 2016-10-27 09:17:27 UTC
Pranith/Ashish - could you provide your inputs on comment 4?

Comment 6 Pranith Kumar K 2016-10-27 10:15:53 UTC
This looks like an EC issue more than an md-cache one. We will take a look first, and if we find that it is not an EC issue, we can hand it over to the correct component.

Comment 7 Poornima G 2016-11-07 13:35:29 UTC
As discussed offline, Nag mentioned he was able to see the same issue with AFR, and that killing bricks also showed the same error. Can you please confirm?

Comment 8 Atin Mukherjee 2016-11-07 13:51:38 UTC
Based on comment 7 I am moving the component to disperse.

Comment 9 Ashish Pandey 2016-11-17 08:45:16 UTC
Nag,
The sosreport is also missing for this bug.

Comment 12 Ashish Pandey 2016-11-25 11:45:29 UTC
Nag,

I have enough reason to doubt that this sosreport belongs to the same issue you mentioned in comment#1.

[root@apandey bug.1386678]# 
[root@apandey bug.1386678]# ll server/
total 24
drwxrwxrwx. 3 root root 4096 Nov 25 16:57 dhcp35-116.lab.eng.blr.redhat.com
drwxrwxrwx. 3 root root 4096 Nov 25 16:58 dhcp35-135.lab.eng.blr.redhat.com
drwxrwxrwx. 3 root root 4096 Nov 25 16:59 dhcp35-196.lab.eng.blr.redhat.com
drwxrwxrwx. 3 root root 4096 Nov 25 16:59 dhcp35-239.lab.eng.blr.redhat.com
drwxrwxrwx. 3 root root 4096 Nov 25 16:59 dhcp35-37.lab.eng.blr.redhat.com
drwxrwxrwx. 3 root root 4096 Nov 25 16:59 dhcp35-8.lab.eng.blr.redhat.com

The IPs of the servers do not match the volume info you provided.


 1: volume ecvol-client-0
  2:     type protocol/client
  3:     option ping-timeout 42
  4:     option remote-host 10.70.35.37
  5:     option remote-subvolume /rhs/brick1/ecvol
  6:     option transport-type socket
  7:     option transport.address-family inet
  8:     option send-gids true
  9: end-volume
 10:
 11: volume ecvol-client-1
 12:     type protocol/client
 13:     option ping-timeout 42
 14:     option remote-host 10.70.35.116
 15:     option remote-subvolume /rhs/brick1/ecvol
 16:     option transport-type socket
 17:     option transport.address-family inet
 18:     option send-gids true
 19: end-volume
 20:
 21: volume ecvol-client-2
 22:     type protocol/client
 23:     option ping-timeout 42
 24:     option remote-host 10.70.35.239
 25:     option remote-subvolume /rhs/brick1/ecvol
 26:     option transport-type socket
 27:     option transport.address-family inet
 28:     option send-gids true
 29: end-volume
 30:
 31: volume ecvol-client-3
 32:     type protocol/client
 33:     option ping-timeout 42
 34:     option remote-host 10.70.35.135
 35:     option remote-subvolume /rhs/brick1/ecvol
 36:     option transport-type socket
 37:     option transport.address-family inet
 38:     option send-gids true
 39: end-volume


Also, on one of the bricks I saw crash logs.


Please confirm that it has the logs for this bug only.
If you don't have the sosreport, please reproduce the issue and provide it.

Also, as Poornima asked in comment#7, did you try it with AFR?
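
For reference, one way to cross-check whether a sosreport belongs to this volume is to compare the brick hosts from the volume info with the remote-host entries in the client volfile, e.g. (the volfile path inside the sosreport is an assumption and may vary by version):

    gluster volume info ecvol | grep '^Brick'
    grep 'option remote-host' var/lib/glusterd/vols/ecvol/trusted-ecvol.tcp-fuse.vol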

Comment 13 surabhi 2016-11-29 10:02:48 UTC
As per the triage, we all agree that this BZ has to be fixed in rhgs-3.2.0. Providing qa_ack.

Comment 15 Nag Pavan Chilakam 2016-12-01 09:51:19 UTC
Currently I am blocked because I am hitting
https://bugzilla.redhat.com/show_bug.cgi?id=1397667

I will upgrade to the build with the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1397667 and then update.

Comment 17 Nag Pavan Chilakam 2016-12-13 07:04:03 UTC
I have tried to reproduce this issue but could not hit it.

