Bug 1471031 - dht_(f)xattrop does not implement migration checks [NEEDINFO]
dht_(f)xattrop does not implement migration checks
Status: POST
Product: GlusterFS
Classification: Community
Component: distribute (Show other bugs)
mainline
Unspecified Unspecified
unspecified Severity unspecified
: ---
: ---
Assigned To: Nithya Balachandran
:
Depends On:
Blocks: 1498081 1515434
  Show dependency treegraph
 
Reported: 2017-07-14 05:45 EDT by Nithya Balachandran
Modified: 2017-12-04 01:04 EST (History)
2 users (show)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1498081 1515434 (view as bug list)
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
nbalacha: needinfo? (kdhananj)


Attachments (Terms of Use)

  None (edit)
Description Nithya Balachandran 2017-07-14 05:45:06 EDT
Description of problem:

Sharding uses xattrops to set the size info on the file. DHT does not currently implement migration phase1/phase2 checks for these and can possibly lead to issues when sharding is enabled.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
Comment 1 Worker Ant 2017-07-14 05:55:24 EDT
REVIEW: https://review.gluster.org/17776 (cluster/dht: Add migration checks to dht_(f)xattrop) posted (#1) for review on master by N Balachandran (nbalacha@redhat.com)
Comment 2 Worker Ant 2017-07-14 06:05:31 EDT
REVIEW: https://review.gluster.org/17776 (cluster/dht: Add migration checks to dht_(f)xattrop) posted (#2) for review on master by N Balachandran (nbalacha@redhat.com)
Comment 3 Worker Ant 2017-07-17 01:38:41 EDT
REVIEW: https://review.gluster.org/17776 (cluster/dht: Add migration checks to dht_(f)xattrop) posted (#3) for review on master by N Balachandran (nbalacha@redhat.com)
Comment 4 Worker Ant 2017-07-24 08:13:30 EDT
REVIEW: https://review.gluster.org/17776 (cluster/dht: Add migration checks to dht_(f)xattrop) posted (#4) for review on master by N Balachandran (nbalacha@redhat.com)
Comment 5 Nithya Balachandran 2017-12-04 01:04:04 EST
We may run into issues with dht_migrate_file and xattrops sent by the shard xlator where ,as the xattrop is an ADD operation, so this the wrong value could end up being set.

file  = data file
file' = linkto file

Time                Source                                  Target
-----------------------------------------------------------------------

t0                  file                                   file' 
                    (shard.size = x1)

t1                  file  (listxattr)                      file'
                    (shard.size = x1)

t2                  file                                   file' (setxattr) 
                    shard.size=x1                          shard.size=x1

t3                  file (xattrop (ADD))                   file' 
                    shard.size = x2                        (shard.size = x1)
 [operation not performed on the target as this is not marked PHASE1 yet]

Now, a write + xattrop is performed after the S+T bits have been set.

t4                  file (S+T) (xattrop(ADD))             file'
                    shard.size = x3                       (shard.size = x3')
The operation is now performed on the target as well but the value will be different as it is an ADD performed in a different initial value.


Convert target to data file
t5                   file (S+T)                           file
                     shard.size = x3                      (shard.size = x3')

A write + xattrop is now performed but is sent only to the target file (as it is now a data file on the hashed subvol)

t6                   file (S +T)                          file xattrop(ADD))
                     shard.size = x3                      shard.size = x4

Copy xattrs again from src to target

t7                   file (S + T ) listxattr             file (setxattr
                     shard.size = x3                     shard.size=x3


Convert source to linkto

t8                   file'                               file 
                     shard.size= x3                      (shard.size = x3)



The listxattr+setxattr at t7 would fix the race at t3. However, it introduces its own race and hence possible file corruption. 


A possible solution would be to:

1. Set the S+T bits before performing the initial listxattr + setxattr. This does not fix the race where xattrops may reach the dst out of order.
2. Set only the Posix ACLS in the second setxattr.

Initially I thought the f/xattrops need only check for PHASE2 and that the xattrs could be copied across from the second listxattr+setxattr values. However, that leaves a window where the values could go out of sync.

A write could hit the src before the dst linkto as been converted to a data file.  After the dst has been converted to a data file but before the listxattr+setxattr, a lookup from somewhere could update the inode ctx to point to the new hashed subvol (dst) and the xattrop would be sent only on the dst. However, if phase1 checks were not implemented, this add would be on the wrong value and the listxattr+setxattr would then overwrite it with the wrong value.



@Krutika, is the above understanding of how xattrops work with shard correct?

Note You need to log in before you can comment on or make changes to this bug.