Bug 1532868
| Summary: | gluster upgrade causes vm disk errors | | |
| --- | --- | --- | --- |
| Product: | [Community] GlusterFS | Reporter: | bugs |
| Component: | bitrot | Assignee: | bugs <bugs> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | medium | Docs Contact: | bugs <bugs> |
| Priority: | unspecified | | |
| Version: | 3.12 | CC: | awestbro, bkunal, bugs, bugs, dgallowa, dwojslaw, kdhananj, rassilon, rgowdapp, srangana |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.12.5 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1598788 (view as bug list) | Environment: | |
| Last Closed: | 2018-10-23 14:44:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1598788 | | |
Description
bugs
2018-01-09 22:56:11 UTC
We recently updated the cluster to 3.12.4. Images that had the problem still do. Just figured this is an interesting point, since one of the major fixes in 3.12.4 was a shard issue.

(In reply to bugs from comment #0)
> Description of problem:
> We have a gluster replica x3 that is used for VM storage. After upgrading
> from 3.10 to 3.12, ovirt started logging "VM <vm name> has been paused due to
> unknown storage error." and pausing the affected VMs. To perform the upgrade
> I followed the gluster upgrade guide.
>
> After looking into it some more I found the following lines in our mount
> log. To me this suggests an issue with sharding. Note that copying the VM
> image and using the new copy seems to fix the problem.
>
> [2018-01-09 15:45:47.098380] E [shard.c:426:shard_modify_size_and_block_count]
> (-->/usr/lib64/glusterfs/3.12.3/xlator/cluster/distribute.so(+0x6abed) [0x7f7241b3dbed]
> -->/usr/lib64/glusterfs/3.12.3/xlator/features/shard.so(+0xbafe) [0x7f72418bbafe]
> -->/usr/lib64/glusterfs/3.12.3/xlator/features/shard.so(+0xb35b) [0x7f72418bb35b] )
> 0-tortoise-shard: Failed to get trusted.glusterfs.shard.file-size for 6493ab88-f4a8-4696-a52e-6425a595fc80
> [2018-01-09 15:45:47.098419] W [fuse-bridge.c:874:fuse_attr_cbk] 0-glusterfs-fuse: 4369015: STAT()
> /aee19709-5859-4e48-b761-c4f8a140ea61/images/855cfe92-22ad-4f40-94f8-2547fa5e0f8e/e29713bf-557a-4943-a7c5-c29edc141c01 => -1 (Invalid argument)
>
> Version-Release number of selected component (if applicable):
> centos-release-gluster310.noarch   1.0-1.el7.centos   @extras
> centos-release-gluster312.noarch   1.0-1.el7.centos   @extras
> glusterfs.x86_64                   3.12.3-1.el7       @centos-gluster312
> glusterfs-api.x86_64               3.12.3-1.el7       @centos-gluster312
> glusterfs-cli.x86_64               3.12.3-1.el7       @centos-gluster312
> glusterfs-client-xlators.x86_64    3.12.3-1.el7       @centos-gluster312
> glusterfs-fuse.x86_64              3.12.3-1.el7       @centos-gluster312
> glusterfs-gnfs.x86_64              3.12.3-1.el7       @centos-gluster312
> glusterfs-libs.x86_64              3.12.3-1.el7       @centos-gluster312
> glusterfs-server.x86_64            3.12.3-1.el7       @centos-gluster312
> libntirpc.x86_64                   1.5.3-1.el7        @centos-gluster312
> nfs-ganesha.x86_64                 2.5.3-1.el6        @centos-gluster312
> nfs-ganesha-gluster.x86_64         2.5.3-1.el6        @centos-gluster312
> userspace-rcu.x86_64               0.10.0-3.el7       @centos-gluster312
>
> Previous version:
> Dec 06 03:50:42 Updated: glusterfs-libs.x86_64 3.10.8-1.el7
> Dec 06 03:50:42 Updated: glusterfs.x86_64 3.10.8-1.el7
> Dec 06 03:50:42 Updated: glusterfs-client-xlators.x86_64 3.10.8-1.el7
> Dec 06 03:50:42 Updated: glusterfs-api.x86_64 3.10.8-1.el7
> Dec 06 03:50:42 Updated: glusterfs-fuse.x86_64 3.10.8-1.el7
> Dec 06 03:50:42 Updated: glusterfs-cli.x86_64 3.10.8-1.el7
> Dec 06 03:50:45 Updated: glusterfs-server.x86_64 3.10.8-1.el7
> Dec 06 03:50:45 Updated: glusterfs-ganesha.x86_64 3.10.8-1.el
>
> How reproducible:
> Happened on multiple VM images. I do not have another cluster to try it on.
>
> Steps to Reproduce:
> 1.
> 2.
> 3.
>
> Actual results:
>
> Expected results:
>
> Additional info:
> Volume Name: tortoise
> Type: Replicate
> Volume ID: d4c00537-f1e8-4c43-b21d-90c9a6c5dee9
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: stor1.example.com:/tort/brick
> Brick2: stor2.example.com:/tort/brick
> Brick3: stor3.example.com:/tort/brick (arbiter)
> Options Reconfigured:
> user.cifs: off
> features.shard: on
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.eager-lock: enable
> performance.low-prio-threads: 32
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> auth.allow: *
> network.ping-timeout: 10
> cluster.quorum-type: auto
> server.allow-insecure: on
> storage.owner-gid: 36
> storage.owner-uid: 36
> performance.strict-o-direct: enable
> network.remote-dio: enable
> transport.address-family: inet
> nfs.disable: on
> features.shard-block-size: 256MB
> cluster.enable-shared-storage: disable
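With features.shard enabled (shard-block-size 256MB on this volume), an image is stored as a base file plus fixed-size shard files under a hidden .shard directory on each brick, and the aggregated size and block count live in the trusted.glusterfs.shard.file-size xattr of the base file; the shard translator needs that xattr in the STAT/LOOKUP reply to report the real image size, which is why its absence turns into the "Invalid argument" seen in the mount log above. A minimal sketch of inspecting this on a brick, assuming the usual shard on-disk layout; the brick path comes from the volume info above and the image path and gfid come from the mount log, so adjust them for whichever image is affected.

    # Run as root on one of the replica bricks (stor1/stor2/stor3).
    BRICK=/tort/brick
    IMG=aee19709-5859-4e48-b761-c4f8a140ea61/images/855cfe92-22ad-4f40-94f8-2547fa5e0f8e/e29713bf-557a-4943-a7c5-c29edc141c01

    # The base file carries the shard xattrs (block-size and aggregated file-size).
    getfattr -d -m trusted.glusterfs.shard -e hex "$BRICK/$IMG"

    # The remaining blocks are separate files named <gfid>.<index> under .shard;
    # the gfid here is the one from the shard error in the mount log above.
    ls "$BRICK/.shard" | grep 6493ab88-f4a8-4696-a52e-6425a595fc80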
---

1. So this issue was seen *after* the upgrade and not *during* the same, right?

2. Were you running rebalance/remove-brick when you saw this issue?

3. If this bug is consistently reproducible, could you please get the tcpdump output of the client machine and share it along with the newly generated client and brick logs?

Here's what you need to do to capture tcpdump output:
tcpdump -i <network interface name> -w <output-filename>.pcap

-Krutika

---

> 1. So this issue was seen *after* the upgrade and not *during* the same,
> right?

Yes, we started experiencing this issue after our cluster was upgraded to 3.12.3.

> 2. Were you running rebalance/remove-brick when you saw this issue?

No, the cluster was online and was not in a healing state.

> 3. If this bug is consistently reproducible, could you please get the
> tcpdump output of the client machine and share it along with the newly
> generated client and brick logs?
> Here's what you need to do to capture tcpdump output:
> tcpdump -i <network interface name> -w <output-filename>.pcap

This bug is reproducible in the sense that some of our VMs will get paused at some point before we copy their disks, and if we do not make a new image it will keep happening. I have no guarantee of how much capture data would be necessary, and the tcpdump would also include VM image data.

> -Krutika
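To keep a capture like the one requested above manageable on a busy hypervisor, tcpdump can be restricted to Gluster traffic and rotated. A minimal sketch; the interface name is a placeholder, and the port filter assumes the usual GlusterFS ports (24007 for management, bricks from 49152 upward), so check the actual brick ports with gluster volume status before relying on it.

    # Capture only Gluster management and brick traffic on the client/hypervisor,
    # rotating into 100 MB files and keeping at most 50 of them.
    tcpdump -i eth0 -s 0 \
        'tcp port 24007 or tcp portrange 49152-49251' \
        -C 100 -W 50 -w /var/tmp/bz1532868.pcap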
---

I just hit this bug a few hours after upgrading from glusterfs-3.8.4-18.el7rhgs.x86_64 to 3.8.4-54.el7rhgs.x86_64.

The VM started back up fine after manually setting it to Up.

I have sosreports from the RHV Manager, the hypervisor, and one of the Gluster nodes if it'd help.

=== RHV Manager ===

2018-02-22 19:29:17,200Z INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-4) [] VM 'da38e6f8-9ccb-4ce0-95f0-afc98a171386'(teuthology) moved from 'Up' --> 'Paused'
2018-02-22 19:29:17,243Z INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-4) [] EVENT_ID: VM_PAUSED(1,025), Correlation ID: null, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: VM teuthology has been paused.
2018-02-22 19:29:17,290Z ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-4) [] EVENT_ID: VM_PAUSED_ERROR(139), Correlation ID: null, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: VM teuthology has been paused due to unknown storage error.

=== Hypervisor Gluster log ===

[2018-02-22 19:29:17.197805] E [shard.c:426:shard_modify_size_and_block_count] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/distribute.so(+0x6a26d) [0x7f854855926d] -->/usr/lib64/glusterfs/3.8.4/xlator/features/shard.so(+0xb38e) [0x7f85482d838e] -->/usr/lib64/glusterfs/3.8.4/xlator/features/shard.so(+0xac3b) [0x7f85482d7c3b] ) 0-ssdstorage-shard: Failed to get trusted.glusterfs.shard.file-size for 4a9b76ea-b373-4129-adcb-3a9b04b14ea1
[2018-02-22 19:29:17.197840] W [fuse-bridge.c:767:fuse_attr_cbk] 0-glusterfs-fuse: 686161812: STAT() /ba9d818f-aa63-4a96-a9be-8d50d04fe44e/images/e4493c34-2805-41ed-8a1b-bfc1a6adad93/6f0ab3fe-e64f-4ad8-8a25-a0881e9713c3 => -1 (Invalid argument)

Nothing relevant in vdsm.log aside from this one line:

2018-02-22 20:29:17,108+0000 INFO (periodic/314) [vdsm.api] START getVolumeSize(sdUUID=u'ba9d818f-aa63-4a96-a9be-8d50d04fe44e', spUUID=u'28fc87ad-2e28-44d2-8ce4-2e63b9bad4c6', imgUUID=u'e4493c34-2805-41ed-8a1b-bfc1a6adad93', volUUID=u'6f0ab3fe-e64f-4ad8-8a25-a0881e9713c3', options=None) from=internal, task_id=15aec0cf-8854-4959-9e14-3bb14a00dc63 (api:46)

---

(In reply to David Galloway from comment #4)
> I just hit this bug a few hours after upgrading from
> glusterfs-3.8.4-18.el7rhgs.x86_64 to 3.8.4-54.el7rhgs.x86_64.
>
> The VM started back up fine after manually setting it to Up.
>
> I have sosreports from the RHV Manager, the hypervisor, and one of the
> Gluster nodes if it'd help.
>
> === RHV Manager ===
>
> 2018-02-22 19:29:17,200Z INFO
> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
> (ForkJoinPool-1-worker-4) [] VM
> 'da38e6f8-9ccb-4ce0-95f0-afc98a171386'(teuthology) moved from 'Up' --> 'Paused'
> 2018-02-22 19:29:17,243Z INFO
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (ForkJoinPool-1-worker-4) [] EVENT_ID: VM_PAUSED(1,025), Correlation ID:
> null, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: VM
> teuthology has been paused.
> 2018-02-22 19:29:17,290Z ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (ForkJoinPool-1-worker-4) [] EVENT_ID: VM_PAUSED_ERROR(139), Correlation ID:
> null, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: VM
> teuthology has been paused due to unknown storage error.
>
> === Hypervisor Gluster log ===
>
> [2018-02-22 19:29:17.197805] E [shard.c:426:shard_modify_size_and_block_count]
> (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/distribute.so(+0x6a26d) [0x7f854855926d]
> -->/usr/lib64/glusterfs/3.8.4/xlator/features/shard.so(+0xb38e) [0x7f85482d838e]
> -->/usr/lib64/glusterfs/3.8.4/xlator/features/shard.so(+0xac3b) [0x7f85482d7c3b] )
> 0-ssdstorage-shard: Failed to get trusted.glusterfs.shard.file-size for 4a9b76ea-b373-4129-adcb-3a9b04b14ea1
> [2018-02-22 19:29:17.197840] W [fuse-bridge.c:767:fuse_attr_cbk] 0-glusterfs-fuse: 686161812: STAT()
> /ba9d818f-aa63-4a96-a9be-8d50d04fe44e/images/e4493c34-2805-41ed-8a1b-bfc1a6adad93/6f0ab3fe-e64f-4ad8-8a25-a0881e9713c3 => -1 (Invalid argument)
>
> Nothing relevant in vdsm.log aside from this one line:
>
> 2018-02-22 20:29:17,108+0000 INFO (periodic/314) [vdsm.api] START
> getVolumeSize(sdUUID=u'ba9d818f-aa63-4a96-a9be-8d50d04fe44e',
> spUUID=u'28fc87ad-2e28-44d2-8ce4-2e63b9bad4c6',
> imgUUID=u'e4493c34-2805-41ed-8a1b-bfc1a6adad93',
> volUUID=u'6f0ab3fe-e64f-4ad8-8a25-a0881e9713c3', options=None)
> from=internal, task_id=15aec0cf-8854-4959-9e14-3bb14a00dc63 (api:46)

If this bug is consistently reproducible, could you please get the tcpdump output of the gluster client machine and share it along with the newly generated client and brick logs?

Here's what you need to do to capture tcpdump output:
tcpdump -i <network interface name> -w <output-filename>.pcap

---

http://file.rdu.redhat.com/~dgallowa/bz1532868-2min.pcap

The pause happens at 16:25:17.

I unpaused it this morning and the VM paused again within minutes, so I'm consistently able to reproduce at this time. This unfortunately means all of Ceph's upstream testing has ground to a halt, however.

    [root@ssdstore02 9b]# getfattr -d -m . -e hex 4a9b76ea-b373-4129-adcb-3a9b04b14ea1
    # file: 4a9b76ea-b373-4129-adcb-3a9b04b14ea1
    security.selinux=0x73797374656d5f753a6f626a6563745f723a756e6c6162656c65645f743a733000
    trusted.afr.dirty=0x000000000000000000000000
    trusted.afr.ssdstorage-client-0=0x000000000000000000000000
    trusted.afr.ssdstorage-client-1=0x000000000000000000000000
    trusted.afr.ssdstorage-client-2=0x000000000000000000000000
    trusted.bit-rot.version=0x020000000000000059122de60002a5dc
    trusted.gfid=0x4a9b76eab3734129adcb3a9b04b14ea1
    trusted.glusterfs.shard.block-size=0x0000000000400000
    trusted.glusterfs.shard.file-size=0x000000fa0000000000000000000000000000000077dd75100000000000000000

More logs: file.rdu.redhat.com/~dgallowa/bz1532868/
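The trusted.glusterfs.shard.file-size value above is what shard uses to report the aggregated image size. A minimal sketch of decoding it, assuming the value is four big-endian 64-bit fields with the file size first and the 512-byte block count third; that layout is inferred from the dump above rather than taken from the shard translator source, so treat it as an assumption.

    # Decode the xattr value dumped above (leading 0x stripped), using bash printf.
    V=000000fa0000000000000000000000000000000077dd75100000000000000000
    printf 'file size  : %d bytes\n' "0x${V:0:16}"
    printf 'block count: %d (512-byte units)\n' "0x${V:32:16}"

If the assumed layout holds, this reports a 1,073,741,824,000-byte (1000 GiB) image with roughly 2.01 billion 512-byte blocks allocated, which suggests the on-disk value itself is sane.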
---

Thanks David. This issue is caused by a bug in bit-rot. The patch at https://review.gluster.org/#/c/19021/ should fix this.

There is evidence in the logs to suggest that every time shard logged a missing trusted.glusterfs.shard.file-size xattr, the message was preceded by a metadata heal on the same file. There is also evidence in the tcpdump shared by David that in one rare instance of STAT the trusted.glusterfs.shard.file-size xattr was not returned by the brick, and the pause happened right then.

In upstream this fix is available since glusterfs-3.12.5.

You can work around the issue by disabling metadata self-heal:

    # gluster volume set <VOL> cluster.metadata-self-heal off

Could you please confirm if this fixed the issue for you?

-Krutika

---

Bug stands fixed in 3.12.5 as per comment #8; closing the bug.

---

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.
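For anyone applying the workaround from the diagnosis above on an affected release, a minimal sketch; the volume name tortoise is taken from the volume info earlier in this bug, and the follow-up commands are ordinary gluster CLI checks rather than part of the original suggestion.

    # Disable metadata self-heal on the affected volume (the suggested workaround).
    gluster volume set tortoise cluster.metadata-self-heal off

    # Confirm the option is now off.
    gluster volume get tortoise cluster.metadata-self-heal

    # Optionally check whether any heals are pending while the option is disabled.
    gluster volume heal tortoise info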