Bug 1732961
| Summary: | shard file with different gfid in different subvolume causing VM to pause on stale file | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Olaf Buitelaar <olaf.buitelaar> |
| Component: | distribute | Assignee: | Susant Kumar Palai <spalai> |
| Status: | CLOSED UPSTREAM | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | mainline | CC: | billycole, bugs, olaf.buitelaar, rhs-bugs, storage-qa-internal |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-03-12 12:59:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Olaf Buitelaar
2019-07-24 20:41:38 UTC
Olaf, can you update the IO pattern from the application? With that, I will try to reproduce the issue. For now, if lookup-optimize is turned off, you can delete the linkto file (with the permission ---------T) on the backend directly and do a lookup from the mount (or a fresh mount).

Olaf, Gluster 3.12.15 is an upstream release, so it looks like you are using the upstream bits. I am modifying the BZ accordingly. Regards, Nithya

Olaf, please provide the information Susant has asked for, as well as the gluster volume info and the gluster mount and brick logs. Regards, Nithya

Dear Susant, sorry, I completely missed your comment. I don't really think it's possible to change the IO pattern of the applications: they are essentially VMs (via KVM, managed by oVirt). The VMs run various apps via Docker; the most IO-intensive are probably MariaDB databases. I recently updated to Gluster 6.4 and am currently running Gluster 6.5, and the issue still exists. However, I did notice a few things: on VMs where this issue occurred before, I ran sparsify (https://ovirt.org/develop/release-management/features/virt/virt-sparsify.html), and since then I have not observed the issue. On 2 VMs the issue still exists; these VMs have snapshots and didn't allow me to run sparsify. They currently only seem to pause when running xfs_fsr, except once while a brick replace was in progress (as we're migrating to a new environment). @Nithya, I've attached the logs sent earlier to the mailing list. If you require new logs, let me know exactly which ones you're interested in, since it shouldn't be hard to reproduce the issue here. I don't think the mount logs will add much, since the VMs use libgfapi. Thanks, Olaf

Created attachment 1608565 [details]
ansible job to clear T file, showing shards affected
Created attachment 1608566 [details]
ansible job to clear T file, showing shards affected
Created attachment 1608567 [details]
qemu log
Created attachment 1608568 [details]
qemu log
Created attachment 1608569 [details]
qemu log
Created attachment 1608570 [details]
qemu log
Created attachment 1608571 [details]
volume info
Created attachment 1608572 [details]
volume info
Created attachment 1608573 [details]
volume info
Created attachment 1608574 [details]
volume info
Created attachment 1608575 [details]
gluster logs
Created attachment 1608576 [details]
gluster logs
Created attachment 1608577 [details]
gluster logs
Created attachment 1608578 [details]
gluster logs
Created attachment 1608579 [details]
gluster logs
Created attachment 1608580 [details]
gluster logs
Created attachment 1608581 [details]
gluster logs
Created attachment 1608582 [details]
gluster logs
Created attachment 1608583 [details]
gluster logs
Created attachment 1609814 [details]
another instance of stale file logs with v6.5
Just adding another sample of the stale file error.
The entry you're probably interested in is in lease-08.dc01.adsolutions-data-gfs-bricks-bricka-ovirt-data.log, with the line:
[2019-08-30 03:48:16.689485] E [MSGID: 113002] [posix-entry-ops.c:323:posix_lookup] 1-ovirt-data-posix: buf->ia_gfid is null for /data0/gfs/bricks/brick1/ovirt-data/.shard/8a27b91a-ff02-42dc-bd4c-caa019424de8.1682 [No data available]
[2019-08-30 03:48:16.689544] E [MSGID: 115050] [server-rpc-fops_v2.c:158:server4_lookup_cbk] 0-ovirt-data-server: 20674155: LOOKUP /.shard/8a27b91a-ff02-42dc-bd4c-caa019424de8.1682 (be318638-e8a0-4c6d-977d-7a937aa84806/8a27b91a-ff02-42dc-bd4c-caa019424de8.1682), client: CTX_ID:e00800f6-01d4-4d32-b374-1cd8f82dae57-GRAPH_ID:0-PID:18060-HOST:lease-07.dc01.adsolutions-PC_NAME:ovirt-data-client-9-RECON_NO:-0, error-xlator: ovirt-data-posix [No data available]
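The summary describes the same shard carrying a different gfid on different subvolumes. A hedged way to check for that from the bricks is to compare the trusted.gfid xattr of the shard on each subvolume. The sketch below reuses the brick path and shard name from the log entry above; the host names are only illustrative, not a statement of which hosts actually serve this subvolume:

```sh
# Hedged sketch: compare the gfid of one shard across subvolumes by reading
# the trusted.gfid xattr directly on each brick (requires root).
# Brick path and shard name come from the log above; the host names are
# illustrative placeholders.
SHARD=".shard/8a27b91a-ff02-42dc-bd4c-caa019424de8.1682"
for host in lease-07.dc01.adsolutions lease-08.dc01.adsolutions; do
    echo "== $host =="
    ssh "$host" getfattr -n trusted.gfid -e hex \
        "/data0/gfs/bricks/brick1/ovirt-data/$SHARD"
done
# differing trusted.gfid values (or a missing file/xattr) between subvolumes
# would match the mismatch described in the bug summary
```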
This error seems different from the earlier occurrences, which usually logged something like:
[2019-07-17 01:21:52.768672] I [MSGID: 113030] [posix-entry-ops.c:1146:posix_unlink] 0-ovirt-kube-posix: open-fd-key-status: 0 for /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.111756
[2019-07-17 01:21:52.768713] I [MSGID: 113031] [posix-entry-ops.c:1053:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr status: 0 for /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.111756
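For reference, a minimal sketch of the workaround Susant suggested earlier (removing the stale DHT linkto file directly on the brick and then forcing a fresh lookup), assuming cluster.lookup-optimize is turned off. The brick path and shard name are taken from the logs above; adjust them to the affected brick:

```sh
# Minimal sketch of the linkto-file cleanup described earlier, assuming
# cluster.lookup-optimize is off. Brick path and shard name come from the
# logs above; run as root on the brick host.
BRICK=/data0/gfs/bricks/brick1/ovirt-data
SHARD="$BRICK/.shard/8a27b91a-ff02-42dc-bd4c-caa019424de8.1682"

# DHT linkto files carry only the sticky bit (mode 1000, shown as ---------T)
# plus the trusted.glusterfs.dht.linkto xattr; list candidates under .shard
find "$BRICK/.shard" -type f -perm 1000

# verify the xattrs before touching anything
getfattr -d -m . -e hex "$SHARD"

# remove the stale linkto file directly on the brick, then trigger a fresh
# lookup of the affected file from a client mount (or a fresh mount)
rm "$SHARD"
```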
Hello,

We're also seeing this issue with some frequency. I've been unable to reproduce the exact steps, but can attest that it seems to be more prevalent when the volume is closer to full: seemingly anywhere above about 80% or 90% the issue appears. Here's a log from our instance:

[2019-12-03 18:16:14.623264] E [MSGID: 109040] [dht-helper.c:1365:dht_migration_complete_check_task] 0-scratch-dht: c9b3d9cd-ffb2-4ea3-af8a-dff45151f86d: failed to lookup the file on scratch-dht [Stale file handle]
[2019-12-03 18:16:14.626687] W [fuse-bridge.c:2830:fuse_readv_cbk] 0-glusterfs-fuse: 189341727: READ => -1 gfid=4068dcae-a9d1-4429-aa86-621f9a242fd1 fd=0x7fe552460b88 (Stale file handle)
[2019-12-03 18:16:14.639266] W [fuse-bridge.c:2830:fuse_readv_cbk] 0-glusterfs-fuse: 189341728: READ => -1 gfid=4068dcae-a9d1-4429-aa86-621f9a242fd1 fd=0x7fe552460b88 (Stale file handle)
[2019-12-03 18:16:14.639405] W [fuse-bridge.c:2830:fuse_readv_cbk] 0-glusterfs-fuse: 189341729: READ => -1 gfid=4068dcae-a9d1-4429-aa86-621f9a242fd1 fd=0x7fe552460b88 (Stale file handle)
[2019-12-03 18:16:14.640918] W [fuse-bridge.c:2830:fuse_readv_cbk] 0-glusterfs-fuse: 189341741: READ => -1 gfid=4068dcae-a9d1-4429-aa86-621f9a242fd1 fd=0x7fe552460b88 (Stale file handle)
[2019-12-03 18:16:14.642831] W [fuse-bridge.c:2830:fuse_readv_cbk] 0-glusterfs-fuse: 189341730: READ => -1 gfid=4068dcae-a9d1-4429-aa86-621f9a242fd1 fd=0x7fe552460b88 (Stale file handle)
[2019-12-03 18:16:18.023463] W [fuse-bridge.c:2830:fuse_readv_cbk] 0-glusterfs-fuse: 189342328: READ => -1 gfid=4068dcae-a9d1-4429-aa86-621f9a242fd1 fd=0x7fe552460b88 (Stale file handle)

We're running Gluster 6.5 on Amazon Linux 2, which is a CentOS 7 variant. Our shard size is set to 512MB, if that matters, and we have a 3x2 setup for this volume, totaling 14TB. Thanks! Tim

Another bit of detail on this: at least some of it seems to be persisting on the client as well. I went through each of the servers and cleaned up the empty shard files, but one of my clients continues to show the above error while another does not. I tried to clear the cache by running:

echo 3 | sudo tee /proc/sys/vm/drop_caches

but the problem still continued on that client. So it does seem that there is perhaps an issue on the client side only as well, or at least some confused state that doesn't get flushed once the issue on the server is corrected.

Hi Tim, I'm sorry, I forgot to mention that the procedure changed for me with newer Gluster versions. First it was enough to just clean up the stale files, but now after cleaning the files I always had to run

gluster volume rebalance [volume name] start

before it was fixed. So you might want to try a rebalance to see if it resolves the issue. Best, Olaf
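A hedged sketch of the sequence described in the reply above, with the volume name as a placeholder: clean up the stale files on the bricks first, then start and monitor a rebalance.

```sh
# Hedged sketch of the cleanup-then-rebalance sequence described above.
# "ovirt-data" is a placeholder; substitute the affected volume name.
VOL=ovirt-data

# after removing the stale shard/linkto files on each brick, start a rebalance
gluster volume rebalance "$VOL" start

# monitor until the rebalance completes; per the report above, the stale-file
# errors only cleared after this step on newer Gluster versions
gluster volume rebalance "$VOL" status
```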
Created attachment 1642453 [details]
different stale file

Today I found a completely different scenario in which stale files can occur, while investigating why the gitlab docker image (https://hub.docker.com/r/gitlab/gitlab-ce) couldn't start on a regular mount: it just fails on the postgresql initialisation. I had attempted this several times before, all the way back on gluster 3.12.15 (the latest tests were performed on gluster 6.6). I decided to take a different route and create a new volume with the options

network.remote-dio off
performance.strict-o-direct on
cluster.choose-local off

Interestingly enough, this allowed the postgresql initialisation to complete and the container to fully start up and be functional.

To test further, I also wanted to see if it would work with sharding, since none of my other docker containers work with a gluster volume which is sharded. So I created another volume, with sharding, and this also worked well, except that both volumes report a (the same) stale file on /gitlab/data/postgresql/data/pg_stat_tmp/global.stat:

[fuse-bridge.c:1509:fuse_fd_cbk] 0-glusterfs-fuse: 13945691: OPEN() /gitlab/data/postgresql/data/pg_stat_tmp/global.stat => -1 (Stale file handle)

I couldn't directly find anything in the logs, but I've attached it all, including the docker stack file used to deploy gitlab (which is one of the most bloated containers I've seen, running many services inside).

For the record:

============================
On this volume gitlab wouldn't start, since it wouldn't get past the postgresql initialisation:

gluster v info docker2

Volume Name: docker2
Type: Distributed-Replicate
Volume ID: 4e0670a0-3d00-4360-98bd-3da844cedae5
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: 10.201.0.5:/data0/gfs/bricks/brick1/docker2
Brick2: 10.201.0.9:/data0/gfs/bricks/brick1/docker2
Brick3: 10.201.0.3:/data0/gfs/bricks/bricka/docker2 (arbiter)
Brick4: 10.201.0.6:/data0/gfs/bricks/brick1/docker2
Brick5: 10.201.0.7:/data0/gfs/bricks/brick1/docker2
Brick6: 10.201.0.4:/data0/gfs/bricks/bricka/docker2 (arbiter)
Brick7: 10.201.0.1:/data0/gfs/bricks/brick1/docker2
Brick8: 10.201.0.8:/data0/gfs/bricks/brick1/docker2
Brick9: 10.201.0.2:/data0/gfs/bricks/bricka/docker2 (arbiter)
Options Reconfigured:
performance.cache-size: 128MB
transport.address-family: inet
nfs.disable: on
cluster.brick-multiplex: on

============================
Commands used to create the non-sharded volume:

gluster volume create docker2-dio replica 3 arbiter 1 10.201.0.1:/data0/gfs/bricks/brick1/docker2-dio 10.201.0.5:/data0/gfs/bricks/brick1/docker2-dio 10.201.0.6:/data0/gfs/bricks/bricka/docker2-dio
gluster volume set docker2-dio network.remote-dio off
gluster volume set docker2-dio performance.strict-o-direct on
gluster volume set docker2-dio cluster.choose-local off
gluster v start docker2-dio

gluster v info docker2-dio

Volume Name: docker2-dio
Type: Replicate
Volume ID: 54b3a0dc-20a9-4d29-a7ea-bd7cd8500b91
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.201.0.1:/data0/gfs/bricks/brick1/docker2-dio
Brick2: 10.201.0.5:/data0/gfs/bricks/brick1/docker2-dio
Brick3: 10.201.0.6:/data0/gfs/bricks/bricka/docker2-dio (arbiter)
Options Reconfigured:
cluster.choose-local: off
performance.strict-o-direct: on
network.remote-dio: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.brick-multiplex: on

============================
Commands used to create the sharded volume:

gluster volume create docker2-dio-shrd replica 3 arbiter 1 10.201.0.1:/data0/gfs/bricks/brick1/docker2-dio-shrd 10.201.0.5:/data0/gfs/bricks/brick1/docker2-dio-shrd 10.201.0.6:/data0/gfs/bricks/bricka/docker2-dio-shrd
gluster volume set docker2-dio-shrd network.remote-dio off
gluster volume set docker2-dio-shrd performance.strict-o-direct on
gluster volume set docker2-dio-shrd cluster.choose-local off
gluster volume set docker2-dio-shrd features.shard on
gluster v start docker2-dio-shrd

gluster v info docker2-dio-shrd

Volume Name: docker2-dio-shrd
Type: Replicate
Volume ID: 50300e86-56d6-4e2d-be61-944f562e58f1
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.201.0.1:/data0/gfs/bricks/brick1/docker2-dio-shrd
Brick2: 10.201.0.5:/data0/gfs/bricks/brick1/docker2-dio-shrd
Brick3: 10.201.0.6:/data0/gfs/bricks/bricka/docker2-dio-shrd (arbiter)
Options Reconfigured:
features.shard: on
cluster.choose-local: off
performance.strict-o-direct: on
network.remote-dio: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.brick-multiplex: on

============================
Available space:

df -h | grep docker2-dio
10.201.0.5:/docker2-dio        5.4T  2.6T  2.9T  47% /mnt/docker2-dio
10.201.0.5:/docker2-dio-shrd   5.4T  2.6T  2.9T  47% /mnt/docker2-dio-shrd

I forgot to mention that I'm able to stat the file (/gitlab/data/postgresql/data/pg_stat_tmp/global.stat) and hexdump it, despite all the stale file warnings:

stat /mnt/docker2-dio-shrd/gitlab/data/postgresql/data/pg_stat_tmp/global.stat
  File: '/mnt/docker2-dio-shrd/gitlab/data/postgresql/data/pg_stat_tmp/global.stat'
  Size: 977          Blocks: 2          IO Block: 131072   regular file
Device: 50h/80d      Inode: 12028693131041868837  Links: 1
Access: (0600/-rw-------)  Uid: (  996/ gluster)   Gid: (  996/  chrony)
Context: system_u:object_r:fusefs_t:s0
Access: 2019-12-05 19:39:05.173555251 +0000
Modify: 2019-12-05 19:39:05.195042075 +0000
Change: 2019-12-05 19:39:05.196744057 +0000

hexdump /mnt/docker2-dio-shrd/gitlab/data/postgresql/data/pg_stat_tmp/global.stat | head
0000000 bc9d 01a5 8b56 e066 3bf8 0002 0104 0000
0000010 0000 0000 0002 0000 0000 0000 006c 0000
0000020 0000 0000 0e5f 0000 0000 0000 000b 0000
0000030 0000 0000 0601 0000 0000 0000 0000 0000
0000040 0000 0000 148e 0000 0000 0000 0000 0000
0000050 0000 0000 5c53 0018 0000 0000 8d64 ae66
0000060 3be6 0002 0000 0000 0000 0000 0000 0000
0000070 0000 0000 0000 0000 0000 0000 0000 0000
*
00000e0 0000 0000 8d64 ae66 3be6 0002 0144 0000

This bug is moved to https://github.com/gluster/glusterfs/issues/980, and will be tracked there from now on. Visit the GitHub issue URL for further details.