Bug 1589829 - Metadata is being written even though the data bricks are full and files could not be created successfully
Summary: Metadata is being written even though the data bricks are full and files cou...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: arbiter
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Ravishankar N
QA Contact: Karan Sandha
URL:
Whiteboard:
Depends On: 1593242
Blocks: 1588452
 
Reported: 2018-06-11 13:20 UTC by Saravanakumar
Modified: 2018-11-19 05:08 UTC
CC: 12 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1588452
Environment:
Last Closed: 2018-11-19 05:08:13 UTC
Embargoed:


Attachments

Comment 6 Nitin Goyal 2018-06-13 08:55:20 UTC
Link to logs and output of getfattr and heal info:

http://rhsqe-repo.lab.eng.blr.redhat.com/cns/bugs/BZ-1589829/
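
A minimal sketch of the commands typically used to gather this kind of data; the brick path is a placeholder, and the volume name volvol2 is taken from the getfattr output in comment 8:

# On each brick, dump the AFR xattrs of the affected file:
getfattr -d -m . -e hex /path/to/brick/file11

# On any server node, list entries pending heal for the volume:
gluster volume heal volvol2 info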

Comment 7 Raghavendra Talur 2018-08-02 12:22:26 UTC
Ravishankar,

With the data in comment 6, can we determine if this is a bug or not?

questions to be answered:
1. Is it compliant with POSIX?
2. Is it the same behavior as in replica 3?

If the answer is Yes to both of the above questions, then we can close this bug.
If the answer is No to either, we have to provide a fix soon, as this will be seen much more frequently with the small 1 GB volumes that are predominant in the CNS use case.

Comment 8 Ravishankar N 2018-08-03 04:35:01 UTC
(In reply to Raghavendra Talur from comment #7)
> Ravishankar,
> 
> With the data in comment 6, can we determine if this is a bug or not?

From the logs and getfattr output, I see that the file is present only on the arbiter brick, which blames the other 2 bricks:

On the 2 data bricks:
getfattr: file11: No such file or directory

On arbiter brick:
$ getfattr -d -m . -e hex file11
# file: file11
security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
trusted.afr.volvol2-client-0=0x000000020000000000000000
trusted.afr.volvol2-client-1=0x0000002e0000002f00000000
trusted.gfid=0xd57a8e9624424490a1a029fbba27c68f
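
For anyone decoding these values: each trusted.afr.<volname>-client-N xattr holds three big-endian 32-bit counters, recording pending data, metadata and entry operations against that brick. A small bash sketch (the variable name is illustrative):

xattr=0000002e0000002f00000000   # value blaming volvol2-client-1, 0x prefix dropped
echo "data=$((16#${xattr:0:8})) metadata=$((16#${xattr:8:8})) entry=$((16#${xattr:16:8}))"
# prints: data=46 metadata=47 entry=0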


> 
> questions to be answered:
> 1. Is it compliant with POSIX?
Yes. AFR unwinds the create with a failure, so the entry can be present or absent on the partially successful bricks (see the sketch at the end of this comment).

> 2. Is it the same behavior as in replica 3?
Yes. Note that if AFR fails the FOP, the only guarantee it needs to provide is that the contents on all 3 bricks are the same when all bricks are up and there is no hindrance to self-heal being carried out.

That said, we are also working on fixes for entry FOP consistency, where we can safely delete files that are not present on a quorum number of bricks, and on resolving gfid split-brains (https://bugzilla.redhat.com/show_bug.cgi?id=1593242#c18).
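
To illustrate the point above, a hypothetical reproduction sketch; the mount point and brick paths are placeholders, not taken from this report:

# With both data bricks at 100% utilization, create a file from the client mount:
touch /mnt/volvol2/file11   # the create fails and the error is returned to the application
# The entry may still exist on the arbiter brick (and be absent on the data bricks)
# until self-heal reconciles it, which matches the getfattr output in this report:
ls /bricks/arbiter/brick/file11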

Comment 11 Ravishankar N 2018-11-19 05:08:13 UTC
Given that this situation can be hit only when the data disks are full, and that there is no data loss of any kind nor any false reporting of success to the application for these entry operations, this bug is not a priority right now and is being closed. Entry FOP consistency will still be undertaken as part of bug 1593242, as explained in comment #8.

