+++ This bug was initially created as a clone of Bug #1485962 +++

Description of problem:
tcmu-runner is no longer going to open the block device with O_SYNC, so writes have a chance of getting cached in write-behind. When that happens, there is a chance that on failover some data could be stuck in the cache and be lost. So performance.strict-o-direct should be on.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

--- Additional comment from Worker Ant on 2017-08-28 10:52:31 EDT ---

REVIEW: https://review.gluster.org/18120 (gluster-block: strict-o-direct should be on) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
https://review.gluster.org/#/c/18120/
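For anyone hitting this on an existing deployment before the updated gluster-block group file lands, the same effect can be had by setting the options by hand. A minimal sketch, assuming a volume named <volname> that backs gluster-block (both options appear in the verified group output later in this bug):

# Illustrative only: force O_DIRECT to be honoured so write-behind cannot
# cache writes that tcmu-runner now issues without O_SYNC.
gluster volume set <volname> performance.strict-o-direct on
# remote-dio must be disabled for strict-o-direct to take effect end-to-end.
gluster volume set <volname> network.remote-dio disable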
Manoj,

This will change our perf benchmarks, as write-behind was caching writes before this patch. Wanted to let you know that we need to run the perf benchmarks again. Leaving a needinfo for the moment; not sure if there is any other way to notify you on the bz.

Pranith
(In reply to Pranith Kumar K from comment #3)
> Manoj,
> This will change our perf benchmarks as write-behind was caching before
> this patch. Wanted to let you know that we need to do perf benchmark again.
> Leaving a needinfo at the moment, not sure if there is any other way to
> notify you on the bz.
> 
> Pranith

Ack. That should be good enough to clear the needinfo? :)
Tested and verified this on the builds glusterfs-3.8.4-44.el7rhgs.x86_64, gluster-block-0.2.1-11.el7rhgs.x86_64, and tcmu-runner-1.2.0-14.el7rhgs.x86_64. 'gluster volume set <volname> group gluster-block' does set the new option performance.strict-o-direct to on, along with the other already-present options in the group. Moving this bug to verified for RHGS 3.3.0. Logs are pasted below.

[root@dhcp47-117 ~]# gluster v create ozone replica 3 10.70.47.121:/bricks/brick8/ozone_0 10.70.47.113:/bricks/brick8/ozone_1 10.70.47.114:/bricks/brick8/ozone_2 10.70.47.115:/bricks/brick8/ozone_3 10.70.47.116:/bricks/brick8/ozone_4 10.70.47.117:/bricks/brick8/ozone_5
volume create: ozone: success: please start the volume to access data
[root@dhcp47-117 ~]# gluster v info ozone

Volume Name: ozone
Type: Distributed-Replicate
Volume ID: dacd299a-23f3-4ab9-a5ac-0cfb26e77223
Status: Created
Snapshot Count: 0
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.47.121:/bricks/brick8/ozone_0
Brick2: 10.70.47.113:/bricks/brick8/ozone_1
Brick3: 10.70.47.114:/bricks/brick8/ozone_2
Brick4: 10.70.47.115:/bricks/brick8/ozone_3
Brick5: 10.70.47.116:/bricks/brick8/ozone_4
Brick6: 10.70.47.117:/bricks/brick8/ozone_5
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
cluster.brick-multiplex: enable
[root@dhcp47-117 ~]#
[root@dhcp47-117 ~]# gluster v set ozone group gluster-block
volume set: success
[root@dhcp47-117 ~]# gluster v info ozone

Volume Name: ozone
Type: Distributed-Replicate
Volume ID: dacd299a-23f3-4ab9-a5ac-0cfb26e77223
Status: Created
Snapshot Count: 0
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 10.70.47.121:/bricks/brick8/ozone_0
Brick2: 10.70.47.113:/bricks/brick8/ozone_1
Brick3: 10.70.47.114:/bricks/brick8/ozone_2
Brick4: 10.70.47.115:/bricks/brick8/ozone_3
Brick5: 10.70.47.116:/bricks/brick8/ozone_4
Brick6: 10.70.47.117:/bricks/brick8/ozone_5
Options Reconfigured:
server.allow-insecure: on
user.cifs: off
features.shard-block-size: 64MB
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.quorum-type: auto
cluster.eager-lock: disable
network.remote-dio: disable
performance.strict-o-direct: on
performance.readdir-ahead: off
performance.open-behind: off
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
cluster.brick-multiplex: enable
[root@dhcp47-117 ~]# rpm -qa | grep gluster
glusterfs-3.8.4-44.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-44.el7rhgs.x86_64
glusterfs-api-3.8.4-44.el7rhgs.x86_64
glusterfs-fuse-3.8.4-44.el7rhgs.x86_64
python-gluster-3.8.4-44.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-3.2.0-14.el7_4.2.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-client-xlators-3.8.4-44.el7rhgs.x86_64
glusterfs-server-3.8.4-44.el7rhgs.x86_64
glusterfs-rdma-3.8.4-44.el7rhgs.x86_64
vdsm-gluster-4.17.33-1.2.el7rhgs.noarch
gluster-nagios-addons-0.2.9-1.el7rhgs.x86_64
glusterfs-cli-3.8.4-44.el7rhgs.x86_64
glusterfs-libs-3.8.4-44.el7rhgs.x86_64
glusterfs-events-3.8.4-44.el7rhgs.x86_64
gluster-block-0.2.1-11.el7rhgs.x86_64
[root@dhcp47-117 ~]#
[root@dhcp47-117 ~]# rpm -qa | grep tcmu-runner
tcmu-runner-1.2.0-14.el7rhgs.x86_64
[root@dhcp47-117 ~]#
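As an additional spot check (not part of the logs above), the effective value of the new option can also be queried directly. A minimal sketch, assuming the volume get command available in this glusterfs version:

# Hypothetical extra verification step: confirm the group applied the option.
gluster volume get ozone performance.strict-o-direct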
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774