Description of problem:

In a RHHI-V 1.8 cluster, when we live-migrate several VMs at the same time there's a high probability that some of the VMs will get a corruption in a qcow2 disk after running in the destination host for a while (5-15 minutes). All qcow2 images are stored in Gluster volumes. Gluster itself appears to be healthy and we haven't seen any errors.

We have seen these kinds of corruption:

qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with refcount block); further corruption events will be suppressed
qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with qcow2_header); further corruption events will be suppressed
qcow2: Marking image as corrupt: Preventing invalid allocation of refcount block at offset 0; further corruption events will be suppressed

Version-Release number of selected component (if applicable):

qemu-kvm-6.2.0-11.module+el8.6.0+14707+5aa4b42d.x86_64
libvirt-8.0.0-5.module+el8.6.0+14480+c0a3aa0f.x86_64
kernel-4.18.0-372.16.1.el8_6.x86_64
redhat-release-virtualization-host-4.5.1-1.el8ev.x86_64

How reproducible:

Easily reproducible in the customer's environment in several independent clusters. Not reproduced locally.

Steps to Reproduce:

1. The setup consists of a RHHI-V 1.8 cluster:
   - 3 hosts running RHVH 4.5.1.1-0.20220717.0+1
   - RHV-M: ovirt-engine-4.5.3.5-1.el8ev.noarch
   - Gluster storage on the same hosts
   - Live migrations and Gluster share the same network interface: a bridge using a LACP bond of 2x100 Gbps
2. Live-migrate 8 VMs from one host to another.
3. After the VMs have been migrated, wait ~15 minutes for any corruption event.

Actual results:

There's a high chance of a corruption like this:

~~~
2023-05-25 16:52:36.504+0000: 3852944: debug : qemuMonitorJSONIOProcessLine:222 : Line [{"timestamp": {"seconds": 1685033556, "microseconds": 504261}, "event": "BLOCK_IMAGE_CORRUPTED", "data": {"device": "", "msg": "Preventing invalid write on metadata (overlaps with refcount block)", "offset": 8590327808, "node-name": "libvirt-2-format", "fatal": true, "size": 4096}}]
2023-05-25 16:52:36.504+0000: 3852944: info : qemuMonitorJSONIOProcessLine:237 : QEMU_MONITOR_RECV_EVENT: mon=0x7f9dfc1f95d0 event={"timestamp": {"seconds": 1685033556, "microseconds": 504261}, "event": "BLOCK_IMAGE_CORRUPTED", "data": {"device": "", "msg": "Preventing invalid write on metadata (overlaps with refcount block)", "offset": 8590327808, "node-name": "libvirt-2-format", "fatal": true, "size": 4096}}
2023-05-25 16:52:36.504+0000: 3852944: debug : qemuMonitorJSONIOProcessEvent:185 : mon=0x7f9dfc1f95d0 obj=0x7f9d9c00b190
2023-05-25 16:52:36.504+0000: 3852944: debug : qemuMonitorEmitEvent:1122 : mon=0x7f9dfc1f95d0 event=BLOCK_IMAGE_CORRUPTED
2023-05-25 16:52:36.504+0000: 3852944: debug : qemuProcessHandleEvent:549 : vm=0x7f9de0818400
2023-05-25 16:52:36.504+0000: 3852944: debug : virObjectEventNew:621 : obj=0x7f9da80456e0
2023-05-25 16:52:36.529+0000: 3852944: debug : qemuMonitorJSONIOProcessLine:222 : Line [{"timestamp": {"seconds": 1685033556, "microseconds": 529133}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "nospace": false, "node-name": "libvirt-2-format", "reason": "Input/output error", "operation": "write", "action": "stop"}}]
2023-05-25 16:52:36.529+0000: 3852944: info : qemuMonitorJSONIOProcessLine:237 : QEMU_MONITOR_RECV_EVENT: mon=0x7f9dfc1f95d0 event={"timestamp": {"seconds": 1685033556, "microseconds": 529133}, "event": "BLOCK_IO_ERROR", "data": {"device": "", "nospace": false, "node-name": "libvirt-2-format", "reason": "Input/output error", "operation": "write", "action": "stop"}}
2023-05-25 16:52:36.529+0000: 3852944: debug : qemuMonitorJSONIOProcessEvent:185 : mon=0x7f9dfc1f95d0 obj=0x7f9d9c017ff0
2023-05-25 16:52:36.529+0000: 3852944: debug : qemuMonitorEmitEvent:1122 : mon=0x7f9dfc1f95d0 event=BLOCK_IO_ERROR
2023-05-25 16:52:36.529+0000: 3852944: debug : qemuProcessHandleEvent:549 : vm=0x7f9de0818400
2023-05-25 16:52:36.529+0000: 3852944: debug : virObjectEventNew:621 : obj=0x7f9d9c02e050
2023-05-25 16:52:36.529+0000: 3852944: debug : qemuMonitorJSONIOProcessEvent:209 : handle BLOCK_IO_ERROR handler=0x7f9e16438db0 data=0x7f9d9c055220
2023-05-25 16:52:36.529+0000: 3852944: debug : qemuMonitorEmitIOError:1199 : mon=0x7f9dfc1f95d0
2023-05-25 16:52:36.529+0000: 3852944: debug : virObjectEventNew:621 : obj=0x7f9d9c02e0e0
2023-05-25 16:52:36.529+0000: 3852944: debug : virObjectEventNew:621 : obj=0x7f9d9c02e170
2023-05-25 16:52:36.529+0000: 3852944: debug : qemuProcessHandleIOError:861 : Transitioned guest test1 to paused state due to IO error
2023-05-25 16:52:36.529+0000: 3852944: debug : virObjectEventNew:621 : obj=0x7f9da8045e50
2023-05-25 16:52:36.530+0000: 3852944: debug : qemuProcessHandleIOError:874 : Preserving lock state '<null>'
~~~

qemu command line of this particular VM:

~~~
2023-05-25 16:43:41.816+0000: starting up libvirt version: 8.0.0, package: 5.module+el8.6.0+14480+c0a3aa0f (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2022-03-15-19:57:04, ), qemu version: 6.2.0qemu-kvm-6.2.0-11.module+el8.6.0+14707+5aa4b42d, kernel: 4.18.0-372.16.1.el8_6.x86_64, hostname: host1
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
HOME=/var/lib/libvirt/qemu/domain-68-test1 \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-68-test1/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-68-test1/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-68-test1/.config \
/usr/libexec/qemu-kvm \
-name guest=test1itis.lan,debug-threads=on \
-S \
-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-68-test1/master-key.aes"}' \
-blockdev '{"driver":"file","filename":"/usr/share/OVMF/OVMF_CODE.secboot.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"}' \
-blockdev '{"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/5e7bcb91-0163-4e78-a615-dcf99ee72828.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"}' \
-machine pc-q35-rhel8.6.0,usb=off,smm=on,dump-guest-core=off,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format \
-accel kvm \
-global mch.extended-tseg-mbytes=24 \
-cpu EPYC,ibpb=on,virt-ssbd=on,monitor=off,x2apic=on,hypervisor=on,svm=off,topoext=on \
-global driver=cfi.pflash01,property=secure,value=on \
-m size=8388608k,slots=16,maxmem=33554432k \
-overcommit mem-lock=off \
-smp 4,maxcpus=64,sockets=16,dies=1,cores=4,threads=1 \
-object '{"qom-type":"iothread","id":"iothread1"}' \
-object '{"qom-type":"memory-backend-ram","id":"ram-node0","size":8589934592}' \
-numa node,nodeid=0,cpus=0-63,memdev=ram-node0 \
-uuid 5e7bcb91-0163-4e78-a615-dcf99ee72828 \
-smbios 'type=1,manufacturer=Red Hat,product=RHEL,version=8.6-1.el8ev,serial=fe1b8db3-4c1b-ea11-9fc6-00000000003c,uuid=5e7bcb91-0163-4e78-a615-dcf99ee72828,sku=8.6.0,family=RHV' \
-smbios 'type=2,manufacturer=Red Hat,product=RHEL-AV' \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=83,server=on,wait=off \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=2023-05-25T16:43:40,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-hpet \
-no-shutdown \
-global ICH9-LPC.disable_s3=1 \
-global ICH9-LPC.disable_s4=1 \
-boot menu=on,splash-time=30000,strict=on \
-device pcie-root-port,port=16,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 \
-device pcie-root-port,port=17,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 \
-device pcie-root-port,port=18,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 \
-device pcie-root-port,port=19,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 \
-device pcie-root-port,port=20,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 \
-device pcie-root-port,port=21,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 \
-device pcie-root-port,port=22,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 \
-device pcie-root-port,port=23,chassis=8,id=pci.8,bus=pcie.0,addr=0x2.0x7 \
-device pcie-root-port,port=24,chassis=9,id=pci.9,bus=pcie.0,multifunction=on,addr=0x3 \
-device pcie-root-port,port=25,chassis=10,id=pci.10,bus=pcie.0,addr=0x3.0x1 \
-device pcie-root-port,port=26,chassis=11,id=pci.11,bus=pcie.0,addr=0x3.0x2 \
-device pcie-root-port,port=27,chassis=12,id=pci.12,bus=pcie.0,addr=0x3.0x3 \
-device pcie-root-port,port=28,chassis=13,id=pci.13,bus=pcie.0,addr=0x3.0x4 \
-device pcie-root-port,port=29,chassis=14,id=pci.14,bus=pcie.0,addr=0x3.0x5 \
-device pcie-root-port,port=30,chassis=15,id=pci.15,bus=pcie.0,addr=0x3.0x6 \
-device pcie-root-port,port=31,chassis=16,id=pci.16,bus=pcie.0,addr=0x3.0x7 \
-device qemu-xhci,p2=8,p3=8,id=ua-868e398a-2ca8-445c-8d03-7d1dc2197fd8,bus=pci.3,addr=0x0 \
-device virtio-scsi-pci,iothread=iothread1,id=ua-80e32062-9b28-489a-82e3-ce4cd33ddd8c,bus=pci.2,addr=0x0 \
-device virtio-serial-pci,id=ua-dbe4f0f4-be14-43fd-a4a1-d6474e2395d1,max_ports=16,bus=pci.4,addr=0x0 \
-device ide-cd,bus=ide.2,id=ua-ca4f4620-c019-4b42-8050-5da43e1b28af,werror=report,rerror=report \
-blockdev '{"driver":"file","filename":"/rhev/data-center/mnt/glusterSD/host1-example.com:_data-storage-01/ebd88800-b2ef-4475-8400-93f1af83b7ab/images/0dada7a0-94d2-48a9-83cf-a968f90f33b6/a1ba2184-5a1b-43e4-8112-0305bc90c7ce","aio":"threads","node-name":"libvirt-2-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-2-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-2-storage","backing":null}' \
-device scsi-hd,bus=ua-80e32062-9b28-489a-82e3-ce4cd33ddd8c.0,channel=0,scsi-id=0,lun=0,device_id=0dada7a0-94d2-48a9-83cf-a968f90f33b6,drive=libvirt-2-format,id=ua-0dada7a0-94d2-48a9-83cf-a968f90f33b6,bootindex=1,write-cache=on,serial=0dada7a0-94d2-48a9-83cf-a968f90f33b6,werror=stop,rerror=stop \
-blockdev '{"driver":"file","filename":"/rhev/data-center/mnt/glusterSD/host1-example.com:_data-storage-01/ebd88800-b2ef-4475-8400-93f1af83b7ab/images/41acd2f1-a292-476e-9e41-0b54aaa3c05a/55822524-824c-4dfb-b686-0762529f32af","aio":"threads","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":null}' \
-device scsi-hd,bus=ua-80e32062-9b28-489a-82e3-ce4cd33ddd8c.0,channel=0,scsi-id=0,lun=1,device_id=41acd2f1-a292-476e-9e41-0b54aaa3c05a,drive=libvirt-1-format,id=ua-41acd2f1-a292-476e-9e41-0b54aaa3c05a,write-cache=on,serial=41acd2f1-a292-476e-9e41-0b54aaa3c05a,werror=stop,rerror=stop \
-netdev tap,fds=84:86:87:88,id=hostua-51b344b8-a525-4e9d-9a55-c65c5451ae93,vhost=on,vhostfds=89:90:91:92 \
-device virtio-net-pci,mq=on,vectors=10,host_mtu=1500,netdev=hostua-51b344b8-a525-4e9d-9a55-c65c5451ae93,id=ua-51b344b8-a525-4e9d-9a55-c65c5451ae93,mac=56:6f:6a:39:00:6b,bootindex=2,bus=pci.1,addr=0x0 \
-chardev socket,id=charchannel0,fd=81,server=on,wait=off \
-device virtserialport,bus=ua-dbe4f0f4-be14-43fd-a4a1-d6474e2395d1.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 \
-chardev spicevmc,id=charchannel1,name=vdagent \
-device virtserialport,bus=ua-dbe4f0f4-be14-43fd-a4a1-d6474e2395d1.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 \
-audiodev '{"id":"audio1","driver":"spice"}' \
-spice port=5924,tls-port=5925,addr=192.168.1.1,x509-dir=/etc/pki/vdsm/libvirt-spice,tls-channel=main,tls-channel=display,tls-channel=inputs,tls-channel=cursor,tls-channel=playback,tls-channel=record,tls-channel=smartcard,tls-channel=usbredir,seamless-migration=on \
-device qxl-vga,id=ua-4fd19b52-d5cc-40a9-8954-86048eb37596,ram_size=67108864,vram_size=33554432,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1 \
-incoming defer \
-device virtio-balloon-pci,id=ua-87a286d9-3059-4d66-89ec-beb8fa0c18f4,bus=pci.5,addr=0x0 \
-object '{"qom-type":"rng-random","id":"objua-c60516a5-d513-4722-983e-5b4d737ef918","filename":"/dev/urandom"}' \
-device virtio-rng-pci,rng=objua-c60516a5-d513-4722-983e-5b4d737ef918,id=ua-c60516a5-d513-4722-983e-5b4d737ef918,bus=pci.6,addr=0x0 \
-device vmcoreinfo \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
2023-05-25 16:49:05.494+0000: Domain id=68 is tainted: custom-ga-command
qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with refcount block); further corruption events will be suppressed
~~~

Expected results:

No corruptions

Additional info:
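One way to catch these events as they happen during a reproduction attempt (a suggested monitoring approach, not something captured in the logs above) is to watch the QMP events directly with virsh on the destination host:

~~~
# Print BLOCK_IMAGE_CORRUPTED QMP events for all domains as they arrive
virsh qemu-monitor-event --loop --timestamp --event BLOCK_IMAGE_CORRUPTED

# Watch for the follow-up I/O error that pauses the guest
virsh event --all --loop --timestamp --event io-error
~~~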
Hi Juan,

Do you have the command lines for all VMs involved, particularly the source VMs before migration? It would also be good to see them grouped by each pairing of source and destination.

How does the storage migration work in this case? I presume the images are not copied anywhere, but source and destination use the very same images on the gluster storage, right?

qcow2 corruption can be caused by concurrent writes to an image while it is in use by a VM, but this should be prevented by qemu’s file locks. I don’t know anything about gluster, so I’ll just have to ask: Does this particular configuration in this case support OFD file locks (fcntl() with F_OFD_SETLK)? (Such concurrent access would be a misconfiguration, so is unlikely regardless of whether locking works or not, but is still something that would be good to be able to rule out.)

So far, I don’t have much of an idea. Seeing the full `qemu-img check` log would be good. What I find most interesting so far is that

> qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with qcow2_header); further corruption events will be suppressed
> qcow2: Marking image as corrupt: Preventing invalid allocation of refcount block at offset 0; further corruption events will be suppressed

Both indicate attempted writes to offset 0. I think this can only be explained if the cached refcount information on the destination is completely wrong, because offset 0 (the image header) can never be available for allocation.
The first immediate thought with corruption after migration with shared storage is what cache coherence guarantees the filesystem makes. I understand that we're running on a glusterfs FUSE filesystem here (i.e. not the built-in gluster driver in QEMU).

Specifically, during migration we have an image file opened on two different hosts with O_DIRECT. At first, the destination host uses it read-only and only source host writes to the image, then calls fdatasync() and stops writing to the image. Then the destination host re-reads anything from the image that could have changed and starts writing to it. It is important that the destination host can see everything the source wrote up to its fdatasync(), i.e. the destination host must not read stale data from its local cache.

It would be good if someone who knows gluster could confirm that gluster supports this. If it doesn't, we can't do live migration with shared storage with it.

Sunil, I'm not sure if you're the right person to answer this. If not, can you please forward the request to the appropriate person?
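As a crude illustration of that requirement (purely a sketch: the test file path under the gluster mount is a placeholder, and dd's conv=fsync only approximates the fdatasync() qemu issues), one could check whether a flush on one host is immediately visible to an O_DIRECT reader on another:

~~~
# Host A (acting as the migration source): write a known pattern with
# O_DIRECT and flush it before "handing over".
dd if=/dev/urandom of=/rhev/data-center/mnt/glusterSD/<volume>/coherence-test \
   bs=64k count=16 oflag=direct conv=fsync
sha256sum /rhev/data-center/mnt/glusterSD/<volume>/coherence-test

# Host B (acting as the destination), immediately afterwards: read the same
# file with O_DIRECT and compare checksums. A mismatch would mean the client
# served stale cached data.
dd if=/rhev/data-center/mnt/glusterSD/<volume>/coherence-test bs=64k \
   iflag=direct status=none | sha256sum
~~~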
(In reply to Hanna Czenczek from comment #3)
> Hi Juan,
>
> Do you have the command lines for all VMs involved, particularly the source
> VMs before migration? It would also be good to see them grouped by each
> pairing of source and destination.

The sosreports we have are all generated after the corruptions happened. Let me review if I can correlate some of the events with the source VM using previous sosreports. I expect the source and destination VMs to be identical, but I'll confirm.

> How does the storage migration work in this case? I presume the images are
> not copied anywhere, but source and destination use the very same images on
> the gluster storage, right?

That's right, the qcow2 image file is stored in the gluster volume which is mounted by all 3 hosts under the same path. So the image is not copied/migrated anywhere.

> qcow2 corruption can be caused by concurrent writes to an image while it is
> in use by a VM, but this should be prevented by qemu’s file locks. I don’t
> know anything about gluster, so I’ll just have to ask: Does this particular
> configuration in this case support OFD file locks (fcntl() with F_OFD_SETLK)?

Gluster uses FUSE and does not support OFD file locks (it only supports F_GETLK, F_SETLK and F_SETLKW).

> So far, I don’t have much of an idea. Seeing the full `qemu-img check` log
> would be good.

I'll try to get that info.

> What I find most interesting so far is that
>
> > qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with qcow2_header); further corruption events will be suppressed
> > qcow2: Marking image as corrupt: Preventing invalid allocation of refcount block at offset 0; further corruption events will be suppressed
>
> Both indicate attempted writes to offset 0. I think this can only be
> explained if the cached refcount information on the destination is
> completely wrong, because offset 0 (the image header) can never be available
> for allocation.

The Gluster team has suggested turning off some performance optimizations that are enabled by default. We are now waiting for the results of testing this:

performance.open-behind
performance.flush-behind
performance.write-behind

https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/volume_option_table

"The performance.flush-behind will tell system that flush is done when this is still being done in the background, so this could explain possible corruption of the new VM process on the other node start to read before flush has been completed...this is just a theory. BUT if you detect the corruption 5 mins later after VM was moved I would expect all data being flush by that, so not sure if this will help."
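For reference, those translator options are toggled per volume with `gluster volume set`; a sketch with a placeholder volume name:

~~~
gluster volume set <VOLNAME> performance.open-behind off
gluster volume set <VOLNAME> performance.flush-behind off
gluster volume set <VOLNAME> performance.write-behind off
~~~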
Additional example of corrupted image:

# su vdsm -s /bin/sh -c "qemu-img info -U --backing-chain /rhev/data-center/mnt/glusterSD/host1.example.com:_data-storage-01/ebd88800-b2ef-4475-8400-93f1af83b7ab/images/85d45178-10b9-452d-83cf-e30bb277c34e/1e9894e9-5760-4c6b-a32a-ab54d8e96741"
image: /rhev/data-center/mnt/glusterSD/host1.example.com:_data-storage-01/ebd88800-b2ef-4475-8400-93f1af83b7ab/images/85d45178-10b9-452d-83cf-e30bb277c34e/1e9894e9-5760-4c6b-a32a-ab54d8e96741
file format: qcow2
virtual size: 50 GiB (53687091200 bytes)
disk size: 8.23 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: true
    extended l2: false

# su vdsm -s /bin/sh -c "qemu-img check -r all /rhev/data-center/mnt/glusterSD/host1.example.com:_data-storage-01/ebd88800-b2ef-4475-8400-93f1af83b7ab/images/85d45178-10b9-452d-83cf-e30bb277c34e/1e9894e9-5760-4c6b-a32a-ab54d8e96741"
Repairing OFLAG_COPIED data cluster: l2_entry=80000000 refcount=1
Repairing OFLAG_COPIED data cluster: l2_entry=80010000 refcount=1
The following inconsistencies were found and repaired:

    0 leaked clusters
    2 corruptions

Double checking the fixed image now...
No errors were found on the image.
134358/819200 = 16.40% allocated, 9.49% fragmented, 0.00% compressed clusters
Image end offset: 8820621312
(In reply to Kevin Wolf from comment #5)
> The first immediate thought with corruption after migration with shared
> storage is what cache coherence guarantees the filesystem makes. I
> understand that we're running on a glusterfs FUSE filesystem here (i.e. not
> the built-in gluster driver in QEMU).
>
> Specifically, during migration we have an image file opened on two different
> hosts with O_DIRECT. At first, the destination host uses it read-only and
> only source host writes to the image, then calls fdatasync() and stops
> writing to the image. Then the destination host re-reads anything from the
> image that could have changed and starts writing to it. It is important that
> the destination host can see everything the source wrote up to its
> fdatasync(), i.e. the destination host must not read stale data from its
> local cache.
>
> It would be good if someone who knows gluster could confirm that gluster
> supports this. If it doesn't, we can't do live migration with shared storage
> with it.
>
> Sunil, I'm not sure if you're the right person to answer this. If not, can
> you please forward the request to the appropriate person?

This has already been sorted out via https://bugzilla.redhat.com/show_bug.cgi?id=2213809
The customer has tested with the following Gluster volume options:

performance.open-behind off
performance.flush-behind off

but there was no difference: 15 minutes after migrating 10 VMs, they got 1 VM paused due to corruption.
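It may be worth double-checking the effective values on the volume, and whether performance.write-behind (also in the suggested list) is still on; for example, with a placeholder volume name:

~~~
gluster volume get <VOLNAME> performance.open-behind
gluster volume get <VOLNAME> performance.flush-behind
gluster volume get <VOLNAME> performance.write-behind
~~~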
(In reply to Sunil Kumar Acharya from comment #12)
> This has already been sorted out via
> https://bugzilla.redhat.com/show_bug.cgi?id=2213809

That BZ is about the Gluster OFD locks, and we have seen that the fcntl calls to acquire the locks succeed even though Gluster doesn't support them; they are translated to regular locks, see:

https://bugzilla.redhat.com/show_bug.cgi?id=2213809#c6

However, that doesn't answer the question of whether, once the source host finishes its fdatasync() call, the destination host can immediately read the synced data. Can we get confirmation on this point?

Thank you.
(In reply to Juan Orti from comment #15)
> (In reply to Sunil Kumar Acharya from comment #12)
> > This has already been sorted out via
> > https://bugzilla.redhat.com/show_bug.cgi?id=2213809
>
> That BZ is about the Gluster OFD locks, and we have seen that the fcntl
> calls to acquire the locks succeed even though Gluster doesn't support them;
> they are translated to regular locks, see:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=2213809#c6
>
> However, that doesn't answer the question of whether, once the source host
> finishes its fdatasync() call, the destination host can immediately read the
> synced data. Can we get confirmation on this point?
>
> Thank you.

Can you please share the gluster configuration and the parameters passed to the client when mounting the volume? Ideally, gluster should access the fresh data if cache invalidation is enabled; otherwise it might access stale data.
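A possible way to collect that information (placeholder volume name; exact mount paths may differ):

~~~
# On one of the gluster nodes: volume layout and all effective options
gluster volume info <VOLNAME>
gluster volume get <VOLNAME> all

# On each hypervisor (gluster FUSE client): mount options and client process
grep glusterfs /proc/mounts
ps -ef | grep '[g]lusterfs'
~~~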