Bug 1856417 - gluster plugin removes sockets from /var/run/gluster [rhel-7.8.z]
Summary: gluster plugin removes sockets from /var/run/gluster [rhel-7.8.z]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: sos
Version: 7.8
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Jan Jansky
QA Contact: Miroslav Hradílek
URL:
Whiteboard:
Depends On:
Blocks: 1857587 1857590
 
Reported: 2020-07-13 15:03 UTC by Bryn M. Reeves
Modified: 2023-12-15 18:26 UTC
CC List: 25 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1857587 1857590
Environment:
Last Closed: 2020-08-06 15:38:38 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links
System                  ID                        Private  Priority  Status  Summary                           Last Updated
Github                  sosreport/sos pull 2153   0        None      closed  [gluster] remove only dump files  2021-01-20 13:47:24 UTC
Red Hat Product Errata  RHBA-2020:3356            0        None      None    None                              2020-08-06 15:38:49 UTC

Internal Links: 1857587 1857590

Description Bryn M. Reeves 2020-07-13 15:03:57 UTC
Description of problem:
The sos gluster plugin includes the ability to generate and collect a gluster statedump. This requires signalling the gluster daemon, collecting a set of files, and then cleaning up those files. The collection of the statedump is optional (via gluster.dump option), but importantly the cleanup is not.

Historically, statedumps were hard coded to appear at "/tmp/glusterfs-statedumps" (which had its own set of problems), but this was an isolated directory used for nothing else: removing all content under that location was conventionally considered safe.

In sos-3.8 a request was received to change the statedump location to "/var/run/gluster" (and later to "/run/gluster", although this change is as-yet unreleased). Unfortunately it appears there was a miscommunication with this request and the new directory was treated in the same manner as the private "/tmp/glusterfs-statedumps"; however, it is not private, and other socket and state files may exist at this location. The current gluster postproc() method will unconditionally remove all non-directory entries found at /var/run/gluster/*.

The bug is slightly tricky to reproduce since the behaviour depends on the order of directory entries returned by the call to os.listdir() that the gluster plugin uses to identify files to delete. Since it uses os.remove(), if any directory entry appears in the list before the sockets or other files then an exception is raised when the plugin calls os.remove() with a directory path: this aborts further processing in postproc() and prevents the effect of the bug from being seen.
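
For illustration only, a minimal sketch of the kind of cleanup logic described above; this is not the plugin's actual source, and the directory constant and function name are assumptions:

import os

STATEDUMP_DIR = "/var/run/gluster"   # statedump location used since sos-3.8

def postproc_sketch():
    # Tries to delete every entry returned by os.listdir(), not just the
    # statedump files. os.remove() raises an exception when handed a
    # directory path, which aborts the loop (and the rest of postproc())
    # before any later entries are touched, hence the order-dependent
    # behaviour described above.
    for name in os.listdir(STATEDUMP_DIR):
        os.remove(os.path.join(STATEDUMP_DIR, name))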

Version-Release number of selected component (if applicable):
3.8-6.el7

How reproducible:
Difficult: on the tmpfs file system that backs /run, os.listdir() (getdents) returns entries in reverse order of inode number, and the code in the gluster plugin uses os.remove() to delete entries. This means that to be vulnerable the sockets must have a higher inode number than *any* subdirectory of /var/run/gluster.

On the test systems I have been using, a "todelete" directory is created after the sockets have been set up and before the dump runs: the existence of this directory masks the problem on those systems. Temporarily deleting or moving this directory causes the bug to happen on any system.
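
A quick way to check whether a given system meets the inode-ordering condition above (a diagnostic sketch, not part of sos; the directory path is taken from the report):

import os

GLUSTER_RUN = "/var/run/gluster"

# Print entries in reverse inode order, i.e. the order in which tmpfs
# getdents() is reported to return them; if any subdirectory sorts before
# the sockets, the buggy cleanup aborts early and the symptom is masked.
entries = sorted(os.listdir(GLUSTER_RUN),
                 key=lambda n: os.lstat(os.path.join(GLUSTER_RUN, n)).st_ino,
                 reverse=True)
for name in entries:
    path = os.path.join(GLUSTER_RUN, name)
    kind = "dir" if os.path.isdir(path) else "file"
    print("%12d  %-4s  %s" % (os.lstat(path).st_ino, kind, name))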

Steps to Reproduce:
0. ls /run/gluster
1. artificial: mv /run/gluster/todelete /run
2. sosreport -o gluster
3. ls /run/gluster

Actual results:
1st ls, sockets are present:

0aa6e551f54c6f79.socket  bitd                     changelog-21357eaae81cbfd8.sock  changelog-fd0df90decdc62f7.sock  glustershd  quotad  shared_storage  vols
0fc01c18ee6bbc20.socket  cfaf3b2f10ccf625.socket  changelog-83c57f4b7afaa897.sock  d3e3fc322c43b721.socket          nfs         scrub   snaps

2nd ls, only directories remain:

# ls /run/gluster
bitd  glustershd  nfs  quotad  scrub  shared_storage  snaps  vols

Expected results:
Only the files matching "*.dump.*" that are generated during the statedump are removed.

Additional info:

Comment 2 Bryn M. Reeves 2020-07-13 15:06:26 UTC
Just confirmed that the "todelete" directory on the test system was an unrelated artefact from another test - this means that on a "normal" gluster deployment it should be possible to reproduce this without making any changes to the content of /var/run/gluster:

1. ls /run/gluster
2. sosreport -o gluster
3. ls /run/gluster

With the same actual and expected results as described in comment #0.

Comment 3 Jake Hunsaker 2020-07-13 15:58:25 UTC
From IRC, there are a few items to consider:

- The direct fix for this is updating `postproc()` within the plugin to only remove `.*dump.*` files.

- However, we still have an issue in that there's some crazy logic going on with `wait_for_statedump_files()`, where we're looking for a magic string to know the statedumps have finished writing. The current approach appears to read in the entirety of each statedump file while scanning it for this string; this is potentially very bad if statedumps become huge.

- Should we be verifying the PID in the dump file names, or just assume any statedump present is valuable, since we're removing them in `postproc()` later?


This is definitely 100% an sos bug, but one thought that occurs to me is that sos would have a significantly easier approach to this if we could directly call for statedumps of the gluster processes, and redirect the output to a specific directory via some command option. Looking at [0], gluster volumes currently have a command invocation for statedumps, whereas the gluster processes rely on receiving a particular signal.

Would it be possible to add a gluster command that triggers these statedumps and doesn't return until the dumps are complete? We'd be able to remove a lot of the hoops we're jumping through in sos if this were added.



I'd like to get some input from the gluster team with regards to adjusting our approach to statedump collection. In the immediate term, we need to fix this with a patch for the sos plugin. Is it sufficient to just update the `postproc()` method to only remove `.*dump.*` named files, or do we need to do more? Can we improve (or somehow remove) `wait_for_statedump_files()`?
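
As a rough illustration of the memory concern above, the completion check could read only the tail of each dump file rather than the whole thing. A hedged sketch, assuming the dumps match "*.dump.*" and end with a "DUMP-END-TIME" line (both assumptions here, not the plugin's actual logic):

import glob
import os

DUMP_GLOB = "/var/run/gluster/*.dump.*"   # assumed statedump naming
END_MARKER = b"DUMP-END-TIME"             # assumed end-of-dump marker

def statedumps_complete(tail_bytes=4096):
    """Return True if at least one statedump exists and each one ends with the marker."""
    paths = glob.glob(DUMP_GLOB)
    if not paths:
        return False
    for path in paths:
        size = os.path.getsize(path)
        with open(path, "rb") as f:
            # Read only the last few KiB instead of the whole file.
            f.seek(max(size - tail_bytes, 0))
            if END_MARKER not in f.read():
                return False
    return True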

Comment 7 Mohit Agrawal 2020-07-14 08:27:29 UTC
(In reply to Jake Hunsaker from comment #3)

As we know, sosreport is used heavily, so I believe we can do the enhancement later. At the current stage we should focus only on the issue we are facing due to the postproc function. The function should remove only files that start with "glusterdump" and contain "dump" as a string. There is one issue if we use the gluster CLI command to generate a statedump: the command will generate a statedump on all cluster nodes, irrespective of the node on which the sosreport command was executed, so I believe the gluster CLI command is not the optimal way to do this. I think you can do one thing: instead of sending the USR1 signal to all gluster processes in one shot and then waiting, send the signal to the gluster processes one by one, on a per-PID basis (saving all gluster PIDs in an array), and then check the last file generated for each PID in /var/run/gluster. In postproc we can use the same array of gluster PIDs, together with the "dump" string, to delete the statedumps in /var/run/gluster.

Thanks,
Mohit Agrawal
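
A minimal sketch of the per-PID approach Mohit describes above, assuming the gluster PIDs are already known and that each process writes dump files containing its PID in the name; the function names and timeout are illustrative, not the plugin's code:

import glob
import os
import signal
import time

STATEDUMP_DIR = "/var/run/gluster"

def trigger_statedumps(pids, timeout=30):
    """Signal each gluster process in turn and wait for its dump to appear."""
    for pid in pids:
        os.kill(pid, signal.SIGUSR1)   # gluster writes a statedump on USR1
        pattern = os.path.join(STATEDUMP_DIR, "*.%d.dump.*" % pid)
        deadline = time.time() + timeout
        while time.time() < deadline and not glob.glob(pattern):
            time.sleep(1)

def cleanup_statedumps(pids):
    """Remove only the dump files belonging to the signalled PIDs."""
    for pid in pids:
        for path in glob.glob(os.path.join(STATEDUMP_DIR, "*.%d.dump.*" % pid)):
            os.remove(path)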

Comment 9 Bryn M. Reeves 2020-07-14 09:28:28 UTC
> The function should remove only files that start with "glusterdump" and contain "dump" as a string.

This doesn't work; there are additional files generated that appear to derive from mount point names. The common pattern is that they all match "*.dump.$EPOCH_TIMESTAMP":

glusterdump.1452.dump.1594652913
glusterdump.1454.dump.1594652912
glusterdump.1454.dump.1594652913
glusterdump.24959.dump.1594652912
glusterdump.24959.dump.1594652913
glusterdump.24961.dump.1594652912
glusterdump.24961.dump.1594652913
glusterdump.8205.dump.1594652912
glusterdump.8205.dump.1594652913
glusterdump.8245.dump.1594652912
glusterdump.8245.dump.1594652913
glustershd
mnt-data1-1.24697.dump.1594652912
mnt-data1-1.24697.dump.1594652913
mnt-data2-2.24719.dump.1594652912
mnt-data2-2.24719.dump.1594652913
var-lib-glusterd-ss_brick.14293.dump.1594652912
var-lib-glusterd-ss_brick.14293.dump.1594652913

Comment 10 Mohit Agrawal 2020-07-14 09:40:45 UTC
(In reply to Bryn M. Reeves from comment #9)

Can you please share the command line arguments of PID 24697?
ps -aef | grep 24697

Comment 12 Bryn M. Reeves 2020-07-14 11:04:09 UTC
> There is one issue if we use the gluster CLI command to generate a statedump: the command will generate a statedump on all cluster nodes, irrespective of the node on which the sosreport command was executed, so I believe the gluster CLI command is not the optimal way to do this.

I would suggest in that case that Gluster should extend the CLI statedump facility to support something like a --single-node option: it seems crazy that the command implements this, but in a way that is not actually useful for product support purposes... It would also be extremely helpful if gluster internalised the "wait for statedump completion" heuristic; implementing this in another tool is always going to be a hack, and the concerns with large statedumps are valid.

It doesn't help us in the short-term and we will need to implement a solution for this in sos for the time being, but I think looking forward we ought to fix this properly by sorting out the division of labour between sos and gluster and making this procedure more supportable and robust.

Comment 15 Mohit Agrawal 2020-07-15 05:50:46 UTC
(In reply to Bryn M. Reeves from comment #12)

We will try to implement it.

Comment 16 Mohit Agrawal 2020-07-15 05:58:39 UTC
(In reply to Mohit Agrawal from comment #15)
> We will try to implement it.

We will try to implement a CLI option to generate a statedump for the local node only in a future release.
As of now, the best approach to resolve the issue is to delete only the dump files matching the pattern ".dump.$EPOCH_TIMESTAMP" from /var/run/gluster in the postproc function.

Comment 18 Miroslav Hradílek 2020-07-15 14:27:46 UTC
Granting QA_Ack since I should be able to reproduce on a provided gluster machine.

I would still feel safer if somebody could verify it in their environment and provide OtherQE.

Comment 34 Nag Pavan Chilakam 2020-07-21 16:36:19 UTC
I tested the above sosreport rpm (3.8.9) and the fix works fine on a gluster system: the socket files under /var/run/gluster are *NOT* getting deleted. When I tested with sos-3.8-8.el7_8.noarch, the socket files in /var/run/gluster were deleted.


########### Testing on 3.8.8 (where bug exists)
=== before taking the sosreport =====
[root@dhcp35-192 ~]# ls /var/run/gluster
3b5d83783745a9a9.socket  49b4f07db3811e08.socket  6c9b73c025d2fa73.socket  7f8326229b07a3e1.socket  b99c477d219cec2d.socket  bitd  glustershd  nfs  quotad  scrub  snaps  vols

[root@dhcp35-192 ~]# ls /run/gluster
3b5d83783745a9a9.socket  49b4f07db3811e08.socket  6c9b73c025d2fa73.socket  7f8326229b07a3e1.socket  b99c477d219cec2d.socket  bitd  glustershd  nfs  quotad  scrub  snaps  vols
[root@dhcp35-192 ~]# sosreport

==== after taking sosreport =======
 # WE CAN SEE THE SOCKET FILES WERE DELETED 
[root@dhcp35-192 ~]# ls /var/run/gluster
bitd  glustershd  nfs  quotad  scrub  snaps  vols



############## TESTING ON 3.8.9 RPM ####

==== before taking sosreport ======
[root@dhcp35-39 ~]# ls /var/run/gluster
2ff7cd3d52a860f7.socket  74e196dd79a94211.socket  bitd                     nfs     snaps
4b458d8762c9e23f.socket  8325df1cf239166d.socket  d85011831cd42717.socket  quotad  vols
63b963d1f191f2b4.socket  b3bec54e60fb4542.socket  glustershd               scrub


=====after sosreport ====
+++ SOCKET FILES STILL EXIST ++++++

[root@dhcp35-39 ~]# ls /var/run/gluster
2ff7cd3d52a860f7.socket  74e196dd79a94211.socket  bitd                     nfs     snaps
4b458d8762c9e23f.socket  8325df1cf239166d.socket  d85011831cd42717.socket  quotad  vols
63b963d1f191f2b4.socket  b3bec54e60fb4542.socket  glustershd               scrub
[root@dhcp35-39 ~]#

Comment 35 Nag Pavan Chilakam 2020-07-21 16:38:59 UTC
Note that I tested on 2 different machines running kernel version "3.10.0-1127.10.1.el7.x86_64" on RHEL 7.8, though the kernel version has nothing to do with the issue.

Comment 39 Bryn M. Reeves 2020-07-22 14:25:20 UTC
> anything else to check for other than socket files not getting deleted?

The broken versions will remove any non-directory entry from /var/run/gluster; in our testing the socket files are the only content that actually exists at this location (and so the only things that get removed).

If you were to e.g. "touch /var/run/gluster/somefile" then that (and any other user created non-directory objects) would also be removed.

The fixed version will only remove files that match the shell pattern "*.dump.[0-9]*" (and in a later update the gluster daemon state files named like "glusterd_state_[0-9]*_[0-9]*").
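
For reference, a hedged sketch of what pattern-restricted cleanup along these lines can look like; the two patterns are the ones quoted above, while the directory constant and function name are assumptions:

import fnmatch
import os

STATEDUMP_DIR = "/var/run/gluster"
PATTERNS = ("*.dump.[0-9]*", "glusterd_state_[0-9]*_[0-9]*")

def postproc_fixed_sketch():
    # Only statedump and glusterd state files are removed; the sockets and
    # subdirectories under /var/run/gluster are left untouched.
    for name in os.listdir(STATEDUMP_DIR):
        if any(fnmatch.fnmatch(name, pat) for pat in PATTERNS):
            os.remove(os.path.join(STATEDUMP_DIR, name))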

Comment 43 errata-xmlrpc 2020-08-06 15:38:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (sos bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3356

Comment 49 Tim Swan 2020-10-06 06:41:15 UTC
We noticed that the sos.noarch rpm appears to be listed as installed; however, updated rpms are not included in the rhel-7-server-rhvh-4-rpms repo. We do see several updated versions under rhel-7-server-rpms. I'm not certain how this can be closed as ERRATA when it is not actually updating on RHVH.

Comment 50 Tim Swan 2020-10-06 06:53:46 UTC
I neglected to mention that this has a significant negative impact on RHHI-V. This bug is not addressed in the RHVH repo and has led to a lot of problems in our environment.

Comment 51 Pavel Moravec 2020-10-06 06:58:53 UTC
(In reply to Tim Swan from comment #49)

Hello,
The sos package should be delivered in the rhel-7-server-rpms repo only, as it is part of the OS. As far as I know, this also applies to any layered product.

If you can't update the sos package from the rhel-7-server-rpms repo, or if the RHVH documentation recommends something else, or if there is some other problem around this, please raise a support ticket.

