+++ This bug was initially created as a clone of Bug #1472757 +++

Description of problem:

I created a VM using disks stored on a gluster volume and ran the sysbench test below on it. This caused I/O errors, and the disk was remounted read-only.

# sysbench prepare --test=oltp --mysql-table-engine=innodb --mysql-password=pwd --oltp-table-size=500000000 --oltp-dist-type=gaussian

Errors in dmesg:

[ 2838.983763] blk_update_request: I/O error, dev vdb, sector 0
[ 2842.932506] blk_update_request: I/O error, dev vdb, sector 524722736
[ 2842.932577] Aborting journal on device vdb-8.
[ 2842.933882] EXT4-fs error (device vdb): ext4_journal_check_start:56: Detected aborted journal
[ 2842.934009] EXT4-fs (vdb): Remounting filesystem read-only

In the gluster logs on the host:

mount log:

[2017-07-19 07:42:47.219501] W [MSGID: 114031] [client-rpc-fops.c:2938:client3_3_lookup_cbk] 0-glusterlocal1-client-0: remote operation failed. Path: /.shard/c37d9820-0fd8-4d8e-af67-a9a54e5a99af.843 (00000000-0000-0000-0000-000000000000) [No data available]
[2017-07-19 07:42:47.219587] E [MSGID: 133010] [shard.c:1725:shard_common_lookup_shards_cbk] 0-glusterlocal1-shard: Lookup on shard 843 failed. Base file gfid = c37d9820-0fd8-4d8e-af67-a9a54e5a99af [No data available]

brick log:

[2017-07-19 07:42:16.094979] E [MSGID: 113020] [posix.c:1361:posix_mknod] 0-glusterlocal1-posix: setting gfid on /rhgs/bricks/gv1/.shard/c37d9820-0fd8-4d8e-af67-a9a54e5a99af.842 failed
[2017-07-19 07:42:47.218982] E [MSGID: 113002] [posix.c:253:posix_lookup] 0-glusterlocal1-posix: buf->ia_gfid is null for /rhgs/bricks/gv1/.shard/c37d9820-0fd8-4d8e-af67-a9a54e5a99af.843 [No data available]

gluster vol info:

Volume Name: glusterlocal1
Type: Distribute
Volume ID: 3b0d4b90-10a4-4a91-80c1-27d051daf731
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: server1:/rhgs/bricks/gv1
Options Reconfigured:
performance.strict-o-direct: on
storage.owner-gid: 107
storage.owner-uid: 107
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: off
performance.low-prio-threads: 32
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Create an image on the gluster volume mount point using qemu-img create:
   qemu-img create -f qcow2 -o preallocation=off /mnt/glusterlocal1/vm1boot.img 500G
2. Start the VM with the image created in step 1 attached as an additional device.
3. Install MariaDB and sysbench. Configure the database to live on a filesystem backed by the device from step 1.
4. Run sysbench prepare.

Actual results:
Fails to create data.
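For anyone decoding the shard paths in the logs above: with features.shard enabled, gluster keeps block 0 in the base file itself and stores every later block N as a separate file named <gfid-of-base-file>.N under .shard/ on the brick. A minimal sketch of that naming scheme, with a hypothetical shard_path() helper (an illustration only, not gluster source):

#include <stdio.h>

/* Hypothetical helper illustrating gluster's shard naming scheme as it
 * appears in the logs above: block N (N >= 1) of a sharded file is a
 * separate file named <gfid-of-base-file>.N under .shard/ on the brick.
 * Not gluster source code. */
static void shard_path(char *buf, size_t len, const char *brick_root,
                       const char *base_gfid, unsigned block)
{
        snprintf(buf, len, "%s/.shard/%s.%u", brick_root, base_gfid, block);
}

int main(void)
{
        char path[256];

        /* Recreates the failing path from the brick log above. */
        shard_path(path, sizeof(path), "/rhgs/bricks/gv1",
                   "c37d9820-0fd8-4d8e-af67-a9a54e5a99af", 843);
        puts(path);
        return 0;
}

So the failing lookups on shards 842/843 are lookups of ordinary files on the brick, which is why the posix-layer gfid errors below surface directly as read errors in the VM.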
REVIEW: https://review.gluster.org/17821 (storage/posix: Use the ret value of posix_gfid_heal()) posted (#2) for review on master by Krutika Dhananjay (kdhananj)
REVIEW: https://review.gluster.org/17821 (storage/posix: Use the ret value of posix_gfid_heal()) posted (#3) for review on master by Krutika Dhananjay (kdhananj)
COMMIT: https://review.gluster.org/17821 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit 669868d23eaeba42809fca7be134137c607d64ed
Author: Krutika Dhananjay <kdhananj>
Date:   Wed Jul 19 16:14:59 2017 +0530

    storage/posix: Use the ret value of posix_gfid_heal()

    ... to make the change in commit acf8cfdf truly useful.

    Without this, a race between entry creation fops and lookup at
    posix layer can cause lookups to fail with ENODATA, as opposed
    to ENOENT.

    Change-Id: I44a226872283a25f1f4812f03f68921c5eb335bb
    BUG: 1472758
    Signed-off-by: Krutika Dhananjay <kdhananj>
    Reviewed-on: https://review.gluster.org/17821
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu>
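To make the commit message concrete: per the message, posix_gfid_heal()'s return value was previously discarded, so a lookup racing with shard creation pressed on with a null gfid and surfaced ENODATA ("No data available", as in the brick logs above) instead of the truthful ENOENT, which callers could have handled as "shard not created yet". A toy model of the before/after behaviour, with semantics distilled from the commit message (heal_gfid() is a hypothetical stand-in for posix_gfid_heal(); none of this is gluster source):

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for posix_gfid_heal(): fails with -ENOENT when the entry
 * raced away (or was never created) underneath the lookup. */
static int heal_gfid(int entry_exists)
{
        return entry_exists ? 0 : -ENOENT;
}

/* Before the fix: the heal's return value was dropped, the lookup
 * carried on with a null gfid, and the caller saw ENODATA (the
 * "buf->ia_gfid is null ... [No data available]" brick-log error). */
static int lookup_old(int entry_exists)
{
        (void) heal_gfid(entry_exists);   /* return value ignored */
        return entry_exists ? 0 : -ENODATA;
}

/* After the fix: a failed heal short-circuits the lookup, so the
 * caller sees ENOENT and can treat the shard as not-yet-created. */
static int lookup_new(int entry_exists)
{
        int ret = heal_gfid(entry_exists);

        if (ret < 0)
                return ret;               /* propagate ENOENT */
        return 0;
}

int main(void)
{
        printf("old: %s\n", strerror(-lookup_old(0)));  /* No data available */
        printf("new: %s\n", strerror(-lookup_new(0)));  /* No such file or directory */
        return 0;
}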
This also seems to affect disperse volumes. Attaching logs from when we try to copy an image with qemu-img from one storage to another. We get this on VM storage configured like this:

Volume Name: fs02
Type: Disperse
Volume ID: 7f3d96e7-8d1e-48b8-bad0-dc5b3de13b38
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: vbgsan01:/gluster/fs02/fs02
Brick2: vbgsan02:/gluster/fs02/fs02
Brick3: vbgsan03:/gluster/fs02/fs02

From one brick log:

[2017-07-30 15:29:43.815362] E [MSGID: 113002] [posix.c:266:posix_lookup] 0-fs02-posix: buf->ia_gfid is null for /gluster/fs02/fs02/.shard/fea7c5e9-f1a7-4684-bd5c-08383af3a2fb.5079 [No data available]
[2017-07-30 15:29:43.815527] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-fs02-server: 400238: LOOKUP /.shard/fea7c5e9-f1a7-4684-bd5c-08383af3a2fb.5079 (be318638-e8a0-4c6d-977d-7a937aa84806/fea7c5e9-f1a7-4684-bd5c-08383af3a2fb.5079) ==> (No data available) [No data available]
[2017-07-30 15:29:46.837927] W [MSGID: 113096] [posix-handle.c:761:posix_handle_hard] 0-fs02-posix: link /gluster/fs02/fs02/.shard/fea7c5e9-f1a7-4684-bd5c-08383af3a2fb.5180 -> /gluster/fs02/fs02/.glusterfs/1e/2e/1e2e59bb-f8e3-41d1-93c6-db795bbc96d4failed [File exists]
[2017-07-30 15:29:46.837962] E [MSGID: 113020] [posix.c:1402:posix_mknod] 0-fs02-posix: setting gfid on /gluster/fs02/fs02/.shard/fea7c5e9-f1a7-4684-bd5c-08383af3a2fb.5180 failed
[2017-07-30 15:29:51.418415] E [MSGID: 113002] [posix.c:266:posix_lookup] 0-fs02-posix: buf->ia_gfid is null for /gluster/fs02/fs02/.shard/fea7c5e9-f1a7-4684-bd5c-08383af3a2fb.5315 [No data available]
[2017-07-30 15:29:51.418471] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-fs02-server: 414456: LOOKUP /.shard/fea7c5e9-f1a7-4684-bd5c-08383af3a2fb.5315 (be318638-e8a0-4c6d-977d-7a937aa84806/fea7c5e9-f1a7-4684-bd5c-08383af3a2fb.5315) ==> (No data available) [No data available]

From the mount log:

[2017-07-30 13:09:35.041843] I [MSGID: 109066] [dht-rename.c:1569:dht_rename] 0-fs02-dht: renaming /0924ff77-ef51-435b-b90d-50bfbf2e8de7/images/cab4e8d0-a82a-4048-b0e8-a9c1bd2e38bf/797faedf-b7e2-4b3c-bff6-82264efa11f5.meta.new (hash=fs02-disperse-0/cache=fs02-disperse-0) => /0924ff77-ef51-435b-b90d-50bfbf2e8de7/images/cab4e8d0-a82a-4048-b0e8-a9c1bd2e38bf/797faedf-b7e2-4b3c-bff6-82264efa11f5.meta (hash=fs02-disperse-0/cache=fs02-disperse-0)
[2017-07-30 13:09:35.841507] I [MSGID: 109066] [dht-rename.c:1569:dht_rename] 0-fs02-dht: renaming /0924ff77-ef51-435b-b90d-50bfbf2e8de7/images/cab4e8d0-a82a-4048-b0e8-a9c1bd2e38bf/4c3cc823-e977-4fa1-b233-718c445d632e.meta.new (hash=fs02-disperse-0/cache=fs02-disperse-0) => /0924ff77-ef51-435b-b90d-50bfbf2e8de7/images/cab4e8d0-a82a-4048-b0e8-a9c1bd2e38bf/4c3cc823-e977-4fa1-b233-718c445d632e.meta (hash=fs02-disperse-0/cache=<nul>)
[2017-07-30 13:36:57.236163] W [MSGID: 114031] [client-rpc-fops.c:2933:client3_3_lookup_cbk] 0-fs02-client-0: remote operation failed. Path: /.shard/3997817a-4678-4e75-8131-438db9faca9a.1300 (00000000-0000-0000-0000-000000000000) [No data available]
[2017-07-30 13:36:57.237191] W [MSGID: 114031] [client-rpc-fops.c:2933:client3_3_lookup_cbk] 0-fs02-client-2: remote operation failed. Path: /.shard/3997817a-4678-4e75-8131-438db9faca9a.1300 (00000000-0000-0000-0000-000000000000) [No data available]
[2017-07-30 13:36:57.237220] W [MSGID: 122053] [ec-common.c:121:ec_check_status] 0-fs02-disperse-0: Operation failed on 1 of 3 subvolumes.(up=111, mask=111, remaining=000, good=101, bad=010)
[2017-07-30 13:36:57.237229] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 0-fs02-disperse-0: Heal failed [Invalid argument]
[2017-07-30 13:36:57.248807] W [MSGID: 114031] [client-rpc-fops.c:2933:client3_3_lookup_cbk] 0-fs02-client-0: remote operation failed. Path: /.shard/3997817a-4678-4e75-8131-438db9faca9a.1301 (00000000-0000-0000-0000-000000000000) [No data available]
[2017-07-30 13:36:57.248973] W [MSGID: 114031] [client-rpc-fops.c:2933:client3_3_lookup_cbk] 0-fs02-client-1: remote operation failed. Path: /.shard/3997817a-4678-4e75-8131-438db9faca9a.1301 (00000000-0000-0000-0000-000000000000) [No data available]
[2017-07-30 13:36:57.249502] W [MSGID: 122053] [ec-common.c:121:ec_check_status] 0-fs02-disperse-0: Operation failed on 1 of 3 subvolumes.(up=111, mask=111, remaining=000, good=011, bad=100)
[2017-07-30 13:36:57.249524] W [MSGID: 122002] [ec-common.c:71:ec_heal_report] 0-fs02-disperse-0: Heal failed [Invalid argument]
[2017-07-30 13:36:57.249535] E [MSGID: 133010] [shard.c:1725:shard_common_lookup_shards_cbk] 0-fs02-shard: Lookup on shard 1301 failed. Base file gfid = 3997817a-4678-4e75-8131-438db9faca9a [No data available]
[2017-07-30 13:36:57.249585] W [fuse-bridge.c:2228:fuse_readv_cbk] 0-glusterfs-fuse: 84787: READ => -1 gfid=3997817a-4678-4e75-8131-438db9faca9a fd=0x7f0cfa192210 (No data available)
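As an aside, the up/mask/good/bad fields in the ec_check_status lines read as per-brick bitmaps of the disperse set printed in binary, so good=101, bad=010 would mean two bricks answered consistently and one returned the ENODATA error, which matches the "failed on 1 of 3 subvolumes" text. A tiny decoder under that assumption (the bit-to-brick ordering here is my reading of the log format, not something the logs state):

#include <stdio.h>

/* Decode the binary bitmaps from the ec_check_status log lines above.
 * Assumption (mine, not from gluster docs): each digit corresponds to
 * one brick of the 2+1 disperse set, and 1 means the brick is in the
 * named state. */
static void decode(const char *label, const char *bits)
{
        printf("%s =", label);
        for (int i = 0; bits[i] != '\0'; i++)
                if (bits[i] == '1')
                        printf(" brick%d", i);
        printf("\n");
}

int main(void)
{
        /* Values from the 13:36:57.237220 log line. */
        decode("good", "101");  /* bricks that returned consistent data */
        decode("bad",  "010");  /* the brick that failed the lookup */
        return 0;
}

Either way, the per-shard ENODATA lookups are the same posix-layer signature as in the original report, just surfacing through the disperse translator instead of plain distribute.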
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.13.0, please open a new bug report.

glusterfs-3.13.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the gluster-users mailing list [2] and on the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-December/000087.html
[2] https://www.gluster.org/pipermail/gluster-users/