Bug 1836233
Summary: | RHHI-V 1.6: one host becomes non-operational | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | SATHEESARAN <sasundar> |
Component: | sharding | Assignee: | Krutika Dhananjay <kdhananj> |
Status: | CLOSED ERRATA | QA Contact: | SATHEESARAN <sasundar> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | rhgs-3.5 | CC: | atoborek, dberry, kdhananj, mpandey, pprakash, puebele, rafrojas, rhs-bugs, rkothiya, sheggodu, storage-qa-internal |
Target Milestone: | --- | Keywords: | ZStream |
Target Release: | RHGS 3.5.z Batch Update 2 | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | glusterfs-6.0-35 | Doc Type: | No Doc Update |
Doc Text: | Story Points: | --- | |
Clone Of: | 1815192 | Environment: | |
Last Closed: | 2020-06-16 06:20:09 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1815192 |
Description SATHEESARAN 2020-05-15 13:00:04 UTC
There are two parts to this issue:

1. The crash itself, which is fixed by https://review.gluster.org/#/c/glusterfs/+/24244/. This is a simple null-dereference bug, and the patch is non-intrusive (see the sketch at the end of this report for the general pattern).
2. The source of the ESTALE error itself, which triggers the error logging in shard where the crash happens. The ESTALE issue is seen only in distributed-replicate RHHI volumes and is being tracked separately at https://bugzilla.redhat.com/show_bug.cgi?id=1835180 as a bug in DHT, the source of the ESTALE errors.

The combined effect is that as long as the ESTALE error persists, the mount process keeps crashing in the same place every time, even after restarts. So far two customers have hit this bug.

The fix for 1) has been merged upstream. If we at least take the fix for 1), customers won't see a service disruption in terms of mounts crashing; the fuse mount point will continue to operate without interruption. Background deletion will keep failing with ESTALE, but that is acceptable since it is a background job.

-Krutika

Verified with RHGS 3.5.2 (glusterfs-6.0-37.el8rhgs) and RHVH 4.4.1:

1. Created an RHHI-V deployment with a glusterfs storage domain for hosted-engine storage
2. Created a few more gluster storage domains backed by 2x3 gluster replicate volumes
3. Ran various workloads and VM life-cycle operations
4. Added more bricks and expanded the volume to 3x3
5. Triggered rebalance
6. Removed the set of bricks

With all these operations, no errors were seen.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2572

*** Bug 1851526 has been marked as a duplicate of this bug. ***

*** Bug 2128703 has been marked as a duplicate of this bug. ***
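For context on what the null-dereference fix amounts to, below is a minimal standalone sketch of the defensive pattern: when an error callback fires (e.g. on ESTALE from background shard deletion), guard against a missing context before dereferencing it for logging. This is an illustration under assumed, hypothetical names (shard_ctx, shard_log_failure), not the actual glusterfs shard code or the upstream patch.

```c
/*
 * Hypothetical sketch of the defensive pattern behind the crash fix.
 * Names and structures are illustrative only, not the real shard xlator.
 */
#include <stdio.h>
#include <errno.h>

struct shard_ctx {
    const char *base_gfid;   /* identifier used only for log messages */
};

static void
shard_log_failure(struct shard_ctx *ctx, int op_errno)
{
    /* The reported crash was a plain NULL dereference in an error-logging
     * path; the fix amounts to bailing out (logging less detail) when the
     * context is absent instead of dereferencing it unconditionally. */
    if (!ctx) {
        fprintf(stderr, "shard: operation failed (errno=%d), no context\n",
                op_errno);
        return;
    }
    fprintf(stderr, "shard: failure on base file %s (errno=%d)\n",
            ctx->base_gfid, op_errno);
}

int
main(void)
{
    /* Simulate the error path with and without a valid context. */
    struct shard_ctx ctx = { .base_gfid = "example-gfid" };

    shard_log_failure(&ctx, ESTALE);
    shard_log_failure(NULL, ESTALE);   /* the unguarded version of this
                                          pattern is what crashed */
    return 0;
}
```

With such a guard in place, a persistent ESTALE from the DHT side (bug 1835180) still makes background deletion fail, but it no longer brings down the fuse mount process.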