Bug 1387499
Summary: | Should not display wrong information by stat when data heal is pending for a non-zero size file | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Ravishankar N <ravishankar> |
Component: | arbiter | Assignee: | Ravishankar N <ravishankar> |
Status: | CLOSED WONTFIX | QA Contact: | Karan Sandha <ksandha> |
Severity: | high | Docs Contact: | |
Priority: | low | ||
Version: | rhgs-3.2 | CC: | amukherj, bugs, nchilaka, pkarampu, ravishankar, rhinduja, rhs-bugs, storage-qa-internal |
Target Milestone: | --- | Keywords: | Triaged, ZStream |
Target Release: | --- | ||
Hardware: | All | ||
OS: | All | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | 1356974 | Environment: | |
Last Closed: | 2018-11-13 03:22:15 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1356974 | ||
Bug Blocks: |
Description
Ravishankar N
2016-10-21 06:05:39 UTC
After some code-reading and debugging, I found that this is not a bug specific to arbiter. In AFR we do have checks to fail afr_stat() or any read transaction when the only good copy is down. The problem here is that the stat is not reaching AFR: it is served from the kernel cache, which was populated by the lookup response sent by AFR. For example, if we fuse mount the volume with attribute-timeout=0 and entry-timeout=0, we get EIO for the reproducer given in the description, because there is no kernel caching and afr_stat is hit, which fails the fop with EIO.

Note: I'm not changing the component from arbiter to replicate because I think the acks might be lost if I do that.

On a fuse mount, stat is converted to a Lookup. If Lookup on such files fails, unlink on such files will never happen, especially for files which are in split-brain. While the stat output for the file is wrong, nothing much can be done with the contents of such a file:

root@dhcp35-190 - ~ 22:26:41 :) ⚡ gluster v heal r2 info
Brick localhost.localdomain:/home/gfs/r2_0
Status: Connected
Number of entries: 0

Brick localhost.localdomain:/home/gfs/r2_1
Status: Transport endpoint is not connected
Number of entries: -

Brick localhost.localdomain:/home/gfs/r2_2
/di1/a
Status: Connected
Number of entries: 1

root@dhcp35-190 - ~ 22:26:46 :) ⚡ ls -l /mnt/r2/di1
total 0
-rw-r--r--. 1 root root 0 Nov 20 22:24 a
root@dhcp35-190 - ~ 22:26:53 :) ⚡ cp -r /mnt/r2/di1/ /mnt/r2/di2
cp: error reading '/mnt/r2/di1/a': Input/output error <<-----
root@dhcp35-190 - ~ 22:30:41 :( ⚡ truncate -s 0 /mnt/r2/di1/a
truncate: failed to truncate '/mnt/r2/di1/a' at 0 bytes: Input/output error
root@dhcp35-190 - ~ 22:30:48 :( ⚡ echo abc > /mnt/r2/di1/a
-bash: /mnt/r2/di1/a: Input/output error
root@dhcp35-190 - ~ 22:30:56 :( ⚡ unlink /mnt/r2/di1/a <<---- Only deletion is successful.

Ravi, I was under the impression that the 'cp -r' would succeed; it just occurred to me that I wasn't thinking through it correctly.
I think we can fix this a bit later also. Could you check whether the above cases work the same way for other protocols like NFS/Samba? If yes, we can defer this.
Pranith

Tested using gluster NFS and NFS ganesha mounts; confirmed that reads and writes were failing with EIO as expected.

In light of comment #5, and after discussing with Nag and Pranith, the BZ can be moved beyond 3.2.0.

To elaborate: as described in the RCA (comment #4), the incorrect stat size is due to caching in the kernel. But since stat is the only command that succeeds, i.e. reads, writes etc. would still fail with EIO to the application (because they hit AFR), we should be good.

I am not providing doc text since there is no workaround needed per se, i.e. once the bricks come up, heal and I/O can continue.
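The behaviour described in the RCA, where stat succeeds from cached attributes while reads fail with EIO, can be checked from a client with a small probe. The following is only a minimal sketch: `probe` and `describe` are hypothetical helper names introduced for illustration, not part of Gluster, and the script assumes nothing beyond a path on a mounted volume.

```python
import errno
import os
import sys


def probe(path, nbytes=1):
    """Return (stat_size, read_errno) for path.

    stat_size is the size reported by stat(), or None if stat() itself failed.
    read_errno is None if open+read succeeded, otherwise the errno raised.
    """
    try:
        stat_size = os.stat(path).st_size
    except OSError:
        stat_size = None

    read_errno = None
    try:
        with open(path, "rb") as fh:
            fh.read(nbytes)
    except OSError as exc:
        read_errno = exc.errno
    return stat_size, read_errno


def describe(stat_size, read_errno):
    """Summarise a probe result in one line (hypothetical wording)."""
    if stat_size is not None and read_errno == errno.EIO:
        # The case discussed in this bug: attributes are returned,
        # but the data path fails.
        return "stat succeeded but read failed with EIO; reported size may be stale"
    if read_errno is not None:
        return "read failed: " + errno.errorcode.get(read_errno, str(read_errno))
    return "stat and read both succeeded"


if __name__ == "__main__":
    for p in sys.argv[1:]:
        size, err = probe(p)
        print(p, size, describe(size, err))
```

Saved under an arbitrary name such as `probe.py`, it could be run against the file from the transcript above, e.g. `python3 probe.py /mnt/r2/di1/a`. Per the RCA, on a mount with attribute-timeout=0 and entry-timeout=0 the stat itself is expected to fail with EIO as well, in which case the probe reports a stat size of None.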