RCA:

1. posix_mknod succeeds even though the linkto xattr is not set, because there is no space left. However, the dentry/file is still created. Hence this is a file with only the T bit set, but without any linkto xattr.
2. Since mknod succeeds, dht proceeds to create the data file on a subvol with enough space (min-free-disk scenario).
3. Now we have two files, and both are treated as data files (because of the missing linkto xattr on the hashed-subvol). Also, since the file on the hashed-subvol has only the T bit set, all calls to IS_DHT_MIGRATION_PHASE2 succeed. This causes fops like dht_writev_cbk, dht_stat_cbk etc. (fops handling migration) to falsely assume that a migration is going on and invoke the dht_migration_complete_check task, which fails because of the missing linkto xattr. This error is propagated back to the application in these calls (write, stat etc.).

Fix: Two possible fixes.

1. Make posix_mknod atomic with respect to file creation and xattr setting. It should succeed only if both succeed. When xattr setting fails, it should clean up the file.
2. dht_linkfile_create_cbk should double check whether the file created is indeed a linkto file (by asking for the linkto xattr and iatt in the cbk). If it finds that the file created is not a linkto file, it cleans up the file and fails the create call with an appropriate error (ENOSPC in this case).

regards,
Raghavendra
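The failure mode and the two proposed fixes in the comment above can be illustrated with a small in-memory model. This is only a sketch and not GlusterFS code: the `Brick` class, `mknod_buggy`, `mknod_atomic`, `linkfile_create_verified` and the two check helpers are hypothetical names invented for illustration, and the mode-only check is a simplification based on the description above, not on the actual IS_DHT_MIGRATION_PHASE2 macro.

```python
import errno
import stat

# Assumption: the linkto xattr key and the sticky ("T") bit stand in for
# what dht uses to mark a linkto file.
LINKTO_XATTR = "trusted.glusterfs.dht.linkto"
LINKTO_MODE = stat.S_ISVTX


class Brick:
    """Toy brick: creating a dentry always works, xattr storage may be full."""

    def __init__(self, full):
        self.full = full
        self.files = {}  # path -> {"mode": int, "xattrs": dict}

    def create(self, path, mode):
        self.files[path] = {"mode": mode, "xattrs": {}}

    def setxattr(self, path, key, value):
        if self.full:
            raise OSError(errno.ENOSPC, "No space left on device")
        self.files[path]["xattrs"][key] = value

    def unlink(self, path):
        del self.files[path]


def mknod_buggy(brick, path, mode, xattrs):
    """Models the reported behaviour: xattr failure is swallowed."""
    brick.create(path, mode)
    for key, value in xattrs.items():
        try:
            brick.setxattr(path, key, value)
        except OSError:
            pass  # the file stays behind with only the T bit set
    return 0


def mknod_atomic(brick, path, mode, xattrs):
    """Fix 1 sketch: succeed only if creation and xattr setting both succeed."""
    brick.create(path, mode)
    try:
        for key, value in xattrs.items():
            brick.setxattr(path, key, value)
    except OSError as err:
        brick.unlink(path)  # clean up the half-created file
        return -err.errno
    return 0


def has_t_bit_only_check(brick, path):
    """Simplified mode-only check: true for any file with the T bit set."""
    entry = brick.files.get(path)
    return entry is not None and bool(entry["mode"] & stat.S_ISVTX)


def is_linkto_file(brick, path):
    """A real linkto file needs both the T bit and the linkto xattr."""
    entry = brick.files.get(path)
    return has_t_bit_only_check(brick, path) and LINKTO_XATTR in entry["xattrs"]


def linkfile_create_verified(brick, path, target):
    """Fix 2 sketch: re-check the created file in the callback and clean up."""
    ret = mknod_buggy(brick, path, LINKTO_MODE, {LINKTO_XATTR: target})
    if ret == 0 and not is_linkto_file(brick, path):
        brick.unlink(path)
        return -errno.ENOSPC
    return ret
```

On a full brick, `mknod_buggy` returns 0 and leaves a file that passes the mode-only check but is not a linkto file, which is the state described in the RCA. Both `mknod_atomic` and `linkfile_create_verified` return `-ENOSPC` in that situation and leave no file behind.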
With patch [1], this bug will go away. Note that [1] is already part of the 3.4.0 branch.

[1] https://code.engineering.redhat.com/gerrit/124629
Verified this BZ on glusterfs version 3.12.2-8. Created files until we got "No Space left on device" on the mount point, and didn't see any duplicate files (linkto files). Hence, moving this BZ to Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607