Created attachment 1653961 [details]
Logs for Gluster FUSE and SMB Client access

Description of problem:
The client receives no error when a WRITE FOP triggers autocommit. This is reproducible via the FUSE client and also via an SMB client using the glusterfs_vfs plugin. In my opinion this is critical: if a client application triggers autocommit via WRITE, it assumes the WRITE succeeded and deletes the file from its cache (client-side data loss). In the backend, of course, there is no data loss.

Version-Release number of selected component (if applicable):
5.10

How reproducible:

Steps to Reproduce (with FUSE client):
1. Create a gluster volume and enable worm-file-level (180s is the default autocommit period)
2. Mount the volume via the native FUSE client
3. In the mount path do:
   $ echo test >> file1.txt && sleep 185 && echo test >> file1.txt
(A command-level sketch of these steps is given below.)

Actual results:
There is *no* error in bash like "Permission denied" or "Read-only file system" after the second WRITE FOP triggers autocommit.

Expected results:
There should be an error message.

Additional info:
If one triggers autocommit with RENAME or TRUNCATE, there is an error message.
The attachment contains gluster trace logs (client and brick) for the reproduction steps above. It also contains SMB logs at trace level as an example of what happens for an SMB client (mounted via mount.cifs in bash).
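A minimal command sketch of the reproduction, assuming a fresh single-brick test volume; the brick path and the mount point /mnt/repo2 are illustrative, and the actual setup reported below is a 2+1 arbiter replica:

# create and start a small test volume (brick path is illustrative)
$ gluster volume create repo2 fs-davids-c1-n1:/gluster/brick3/glusterbrick force
$ gluster volume start repo2

# enable file-level WORM; the autocommit period defaults to 180s
$ gluster volume set repo2 features.worm-file-level on

# mount via the native FUSE client
$ mount -t glusterfs fs-davids-c1-n1:/repo2 /mnt/repo2

# the first write creates the file; after the autocommit period the second
# write turns the file WORM in the backend but returns no error to the client
$ cd /mnt/repo2
$ echo test >> file1.txt && sleep 185 && echo test >> file1.txt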
Additional information from the attached smb client log:

The initial WRITE correctly receives no error:

[2020/01/20 09:43:18.375622, 10, pid=28300, effective(1101109, 1100513), real(1101109, 0)] ../source3/smbd/aio.c:935(aio_pwrite_smb2_done)
  pwrite_recv returned 5, err = no error

But the WRITE which triggers autocommit also receives no error, which is wrong because the WRITE FOP was blocked in the backend:

[2020/01/20 09:46:23.647131, 10, pid=28300, effective(1101109, 1100513), real(1101109, 0)] ../source3/smbd/aio.c:935(aio_pwrite_smb2_done)
  pwrite_recv returned 5, err = no error
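For completeness, a hedged sketch of the SMB-side reproduction; the share name, server, user and mount point are placeholders (the share in the attached logs is exported through Samba's vfs_glusterfs module):

# mount the gluster-backed Samba share via mount.cifs (names/credentials are placeholders)
$ mount -t cifs //fs-davids-c1-n1/repo2 /mnt/smb -o username=testuser,vers=3.0

# same pattern as with FUSE: the second write triggers autocommit in the
# backend, but the SMB client still sees it as successful
$ cd /mnt/smb
$ echo test >> file1.txt && sleep 185 && echo test >> file1.txt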
Can you try the same experiment with 'gluster volume set <> write-behind off' and see if this works fine?
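A sketch of the suggested experiment, assuming the volume name repo2 and an existing FUSE mount at /mnt/repo2 (the mount point is illustrative):

# disable the write-behind translator on the volume
$ gluster volume set repo2 performance.write-behind off

# repeat the reproduction on a fresh file
$ cd /mnt/repo2
$ echo test >> file2.txt && sleep 185 && echo test >> file2.txt

# re-enable write-behind afterwards if needed
$ gluster volume set repo2 performance.write-behind on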
I forgot to give you our volume options:

Volume Name: repo2
Type: Replicate
Volume ID: 47b9d9e4-be80-4138-8a4c-d3fb77ba2db0
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: fs-davids-c1-n1:/gluster/brick3/glusterbrick
Brick2: fs-davids-c1-n2:/gluster/brick3/glusterbrick
Brick3: fs-davids-c1-n3:/gluster/arbiter3/glusterbrick (arbiter)
Options Reconfigured:
diagnostics.client-log-level: INFO
diagnostics.brick-log-level: INFO
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
user.smb: disable
features.read-only: off
features.worm: off
features.worm-file-level: on
features.retention-mode: enterprise
features.default-retention-period: 120
network.ping-timeout: 10
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.nl-cache: on
performance.nl-cache-timeout: 600
client.event-threads: 32
server.event-threads: 32
cluster.lookup-optimize: on
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
performance.cache-samba-metadata: on
performance.cache-ima-xattrs: on
performance.io-thread-count: 64
cluster.use-compound-fops: on
performance.cache-size: 512MB
performance.cache-refresh-timeout: 10
performance.read-ahead: off
performance.write-behind-window-size: 4MB
performance.write-behind: off
storage.build-pgfid: on
features.utime: on
storage.ctime: on
cluster.quorum-type: auto
features.bitrot: on
features.scrub: Active
features.scrub-freq: daily
@Amar You are right, disabling the write-behind feature leads to error messages for both the FUSE and the SMB client. Do you think there is a way to get error messages with write-behind enabled? In my opinion a client should receive an error message when a WRITE FOP fails on a WORMed file.
One way is to disable write-behind for files which are WORMed, which write-behind can figure out in the open()/lookup() call itself. Looks like a good one to have for Release-8. Can this be moved to GitHub as an issue, so we can track it for release-8 at least?
Yes, I can move it to GitHub as an issue and track it for release-8 at least. There we can discuss the details of the fix.
Link to the github issue: https://github.com/gluster/glusterfs/issues/812
This bug has been moved to https://github.com/gluster/glusterfs/issues/979 and will be tracked there from now on. Visit the GitHub issue URL for further details.