Description of problem: ==================== on a disperse volume, if we interrupt or stop a copy of a file(was using dd), the fuse mount log throws below error which is misleading by saying "No such file or directory" [2019-10-18 13:36:20.318902] E [fuse-bridge.c:220:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f20d6711a9b] (--> /usr/lib64/glusterfs/6.0/xlator/mount/fuse.so(+0x8201)[0x7f20cdabc201] (--> /usr/lib64/glusterfs/6.0/xlator/mount/fuse.so(+0x8ada)[0x7f20cdabcada] (--> /lib64/libpthread.so.0(+0x7ea5)[0x7f20d554eea5] (--> /lib64/libc.so.6(clone+0x6d)[0x7f20d4e148cd] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory Version-Release number of selected component (if applicable): ========== 4.0.15 How reproducible: =========== quite consistent Steps to Reproduce: ================== 1) 3x(4+2) volume with below options set disperse.shd-max-threads: 8 server.event-threads: 8 client.event-threads: 8 2) fuse mount the volume and run an IO as below: created a file using dd, then copying this file to a new file using dd with unaligned block size eg: dd if=1_basefile_5203af56e3ece89811d53f597868112f of=dfile_dd-1-547 bs=547 3. now while the copying is in progress, do a ctrl+c to stop it Actual results: =========== fuse mount log registers an error as below [2019-10-18 13:36:20.318902] E [fuse-bridge.c:220:check_and_dump_fuse_W] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7f20d6711a9b] (--> /usr/lib64/glusterfs/6.0/xlator/mount/fuse.so(+0x8201)[0x7f20cdabc201] (--> /usr/lib64/glusterfs/6.0/xlator/mount/fuse.so(+0x8ada)[0x7f20cdabcada] (--> /lib64/libpthread.so.0(+0x7ea5)[0x7f20d554eea5] (--> /lib64/libc.so.6(clone+0x6d)[0x7f20d4e148cd] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory Expected results: ================ seems a bit misleading by saying "no such file or dir" which may need to be corrected volinfo Volume Name: ecvol Type: Distributed-Disperse Volume ID: eb126eba-37c6-46e1-98ab-ea8992406977 Status: Started Snapshot Count: 0 Number of Bricks: 3 x (4 + 2) = 18 Transport-type: tcp Bricks: Brick1: dhcp35-190.lab.eng.blr.redhat.com:/gluster/brick1/ecvol Brick2: dhcp35-237.lab.eng.blr.redhat.com:/gluster/brick1/ecvol Brick3: dhcp35-208.lab.eng.blr.redhat.com:/gluster/brick1/ecvol Brick4: dhcp35-32.lab.eng.blr.redhat.com:/gluster/brick1/ecvol Brick5: dhcp35-29.lab.eng.blr.redhat.com:/gluster/brick1/ecvol Brick6: dhcp35-157.lab.eng.blr.redhat.com:/gluster/brick1/ecvol Brick7: dhcp35-190.lab.eng.blr.redhat.com:/gluster/brick2/ecvol Brick8: dhcp35-237.lab.eng.blr.redhat.com:/gluster/brick2/ecvol Brick9: dhcp35-208.lab.eng.blr.redhat.com:/gluster/brick2/ecvol Brick10: dhcp35-32.lab.eng.blr.redhat.com:/gluster/brick2/ecvol Brick11: dhcp35-29.lab.eng.blr.redhat.com:/gluster/brick2/ecvol Brick12: dhcp35-157.lab.eng.blr.redhat.com:/gluster/brick2/ecvol Brick13: dhcp35-190.lab.eng.blr.redhat.com:/gluster/brick3/ecvol Brick14: dhcp35-237.lab.eng.blr.redhat.com:/gluster/brick3/ecvol Brick15: dhcp35-208.lab.eng.blr.redhat.com:/gluster/brick3/ecvol Brick16: dhcp35-32.lab.eng.blr.redhat.com:/gluster/brick3/ecvol Brick17: dhcp35-29.lab.eng.blr.redhat.com:/gluster/brick3/ecvol Brick18: dhcp35-157.lab.eng.blr.redhat.com:/gluster/brick3/ecvol Options Reconfigured: nfs.disable: on storage.fips-mode-rchecksum: on transport.address-family: inet disperse.shd-max-threads: 8 server.event-threads: 8 client.event-threads: 8
Quoting https://bugzilla.redhat.com/1699869#c61 : > This is an error that can occur with a reverse entry invalidation request. Such a request needs to specify a directory inode and a name. The effect of the request is that the cached lookup data associating the name in the dir inode with some inode is dropped. > > There is no guarantee though that the nodeid via which the reverse entry invalidation request intends to specify the dir inode actually resolves to a dir inode: it can be the case that it does not resolve to an inode, or it does resolve to an inode but that's not the inode of a directory. In former case the write(2) to /dev/fuse on behalf of the request fails with ENOENT; in latter case, it fails with ENOTDIR. > > This is the latter case. Probably there is a race? I can imagine that kernel forgets the directory inode on its own (ie. on its own agenda and not due to an invalidation request) and simultaneously invalidation of some entry of this dir is triggered in glusterfs, taking effect earlier than processing the respective FORGET / BATCH_FORGET messages. If that's the case, then it's most likely a harmless phenomenon. So the phenomenon is harmless and the error message is correct -- we do need to report "No such file or directory" if the system fails the syscall with ENOENT. (Note that this is admittedly odd, as write(2) / write(3p) does not specify ENOENT as possible errno for the write syscall; it's a FUSE idiosyncrasy to use ENOENT for indication of certain system states as part of communication through /dev/fuse.) However, we acknowledge that flooding the logs with this message is annoying. The following upstream patches seek to mitigate this issue: https://review.gluster.org/23187, "fuse: Set limit on invalidate queue size" https://review.gluster.org/23461, "glusterfs/fuse: Reduce the default lru-limit value" Please let me know what action plan to follow (should I provide a custom build with backports of these? Or wait while these patches hit downstream? Actually the first change is the important one, as the effect of the second can be reached also by setting a volume option).
*** Bug 1773901 has been marked as a duplicate of this bug. ***
I investigated the case. The conjecture put forward in comment #3 was not correct. It's not invalidation what happens, but interruption. If a system call is interrupted, glusterfs gets an INTERRUPT message. From that point on handling the interrupted message and handling the interrupt are in race (that's why number of threads makes a difference). If handling the interrupted message wins the race, then by the time the answer is sent to the interrupt the reference to the original message is dangling. This is indicated by the kernel by failing the write to /dev/fuse with ENOENT. It's a harmless phenomenon. So even if the analysis in comment #3 was incorrect, the conclusion is still valid: it's a FUSE idiosyncrasy. To improve the UX, I'll will send a patch to upstream that degrades the log message from Error to Warning and conceals the errno string which feels counter-intuitive in this situation. There will be no need to backport the patch. I will close the bug with NOTABUG on QA ack.
Upstream change: https://review.gluster.org/23974
(In reply to Csaba Henk from comment #17) > I investigated the case. The conjecture put forward in comment #3 was not > correct. It's not invalidation what happens, but interruption. If a system > call is interrupted, glusterfs gets an INTERRUPT message. From that point on > handling the interrupted message and handling the interrupt are in race > (that's why number of threads makes a difference). If handling the > interrupted message wins the race, then by the time the answer is sent to > the interrupt the reference to the original message is dangling. This is > indicated by the kernel by failing the write to /dev/fuse with ENOENT. It's > a harmless phenomenon. So even if the analysis in comment #3 was incorrect, > the conclusion is still valid: it's a FUSE idiosyncrasy. > > To improve the UX, I'll will send a patch to upstream that degrades the log > message from Error to Warning and conceals the errno string which feels > counter-intuitive in this situation. There will be no need to backport the > patch. > > I will close the bug with NOTABUG on QA ack. Hi Csaba, After putting all the effort required from engineering side to get the bug to onqa and QE spending test cycle to validate the fix, closing as "NOTABUG" is quite unproductive. Also, I don't see a reason closing this as NOTABUG, as this is a bug given that we are seeing messages which can be false alarms. If we believe these messages occur due to fuse limitations, then we should either suppress such messages, without suppressing genuinely expected messages. For Now we will have to failqa the bug, and then we can take the necessary action
(In reply to Csaba Henk from comment #17) > I investigated the case. The conjecture put forward in comment #3 was not > correct. It's not invalidation what happens, but interruption. If a system > call is interrupted, glusterfs gets an INTERRUPT message. From that point on > handling the interrupted message and handling the interrupt are in race > (that's why number of threads makes a difference). If handling the > interrupted message wins the race, then by the time the answer is sent to > the interrupt the reference to the original message is dangling. This is > indicated by the kernel by failing the write to /dev/fuse with ENOENT. It's > a harmless phenomenon. So even if the analysis in comment #3 was incorrect, > the conclusion is still valid: it's a FUSE idiosyncrasy. > > To improve the UX, I'll will send a patch to upstream that degrades the log > message from Error to Warning and conceals the errno string which feels > counter-intuitive in this situation. There will be no need to backport the > patch. Also, in the case of moving the logs to warning, are we making sure, that we won't be supressing genuine errors (ie in case of really missing files/dirs)? If the answer is yes, then you can get the patch merged and this same bug can have qe to retest, and see if the logs are being suppressed or not(ie moved to warning) as part of validation > > I will close the bug with NOTABUG on QA ack.
We decided to backport the change mentioned in comment #18. Then obviously the resolution will be something other than NOTABUG. However, the patch needs to be revised/refined to address some concerns (of reviewers and of myself). When the new revision is submitted, I'll answer the remaining questions in light of that.
New revision of patch addresses the following concerns: - suppressing genuinely expected messages: The degradation of /dev/fuse write errors happens context sensitively. That is, caller of the convenience function can specify which errnos are to be treated as non-errors for glusterfs. - flooding of logs: degraded log messages are degraded to DEBUG so they won't be seen under normal conditions. (Note that the frequency reported in comment #14 and the 1-1 mapping of discussed log messages with SIGINT-s does not sound to be a real flooding situation, but it will be even better now.)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0288