Bug 1593545 - [GSS] stop-all-gluster-processes.sh script can potentially kill random PIDs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: core
Version: rhgs-3.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Sanju
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-06-21 04:28 UTC by Damian Wojsław
Modified: 2021-12-10 16:25 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-09-05 13:07:53 UTC
Embargoed:


Description Damian Wojsław 2018-06-21 04:28:02 UTC
Description of problem:
The customer ran stop-all-gluster-processes.sh while preparing for an upgrade. During the run, the following for loop (line 75 of https://github.com/gluster/glusterfs/blob/master/extras/stop-all-gluster-processes.sh#L75):
```
for pidfile in $(find /var/run/gluster/ -name '*.pid');
do
    local pid=$(cat ${pidfile});
    echo "sending SIG${signal} to pid: ${pid}";
    kill -${signal} ${pid};
done
```

found pid files stored under /var/run/gluster/snaps/ that belong to a MySQL container volume, which resulted in the loop trying to kill PID 1.

root@hostname:/usr/share/glusterfs/scripts # ./stop-all-gluster-processes.sh -g
sending SIGTERM to pid: 1288
sending SIGTERM to pid: 25556
sending SIGTERM to pid: 25275
sending SIGTERM to pid: 22927
sending SIGTERM to pid: 22464
sending SIGTERM to pid: 22243
sending SIGTERM to pid: 8441
sending SIGTERM to pid: 8521
sending SIGTERM to pid: 8354
sending SIGTERM to pid: 8317
sending SIGTERM to pid: 8318
sending SIGTERM to pid: 8319
sending SIGTERM to pid: 1449
sending SIGTERM to pid: 1
sending SIGTERM to pid: 1
sending SIGTERM to pid: 1
sending SIGTERM to pid: 1
sending SIGTERM to pid: 1
sending SIGTERM to pid: 1
sending SIGTERM to pid: 1
sending SIGTERM to pid: 1
sending SIGTERM to pid: 1
sending SIGTERM to pid: 1

sending SIGKILL to pid: 8521
./stop-all-gluster-processes.sh: line 79: kill: (8521) - No such process
sending SIGKILL to pid: 1449
./stop-all-gluster-processes.sh: line 79: kill: (1449) - No such process
sending SIGKILL to pid: 1
sending SIGKILL to pid: 1
sending SIGKILL to pid: 1
sending SIGKILL to pid: 1
sending SIGKILL to pid: 1
sending SIGKILL to pid: 1
sending SIGKILL to pid: 1
sending SIGKILL to pid: 1
sending SIGKILL to pid: 1
sending SIGKILL to pid: 1

Version-Release number of selected component (if applicable):

Red Hat Enterprise Linux Server release 7.5 (Maipo)
redhat-storage-server-3.3.1.0-1.el7rhgs.noarch
glusterfs-server-3.8.4-54.8.el7rhgs.x86_64

How reproducible:
Every time the customer runs the script for maintenance.

Actual results:
The script picks up every pid file it finds, even those from unmounted snapshots, which can lead to unrelated PIDs being killed.

Expected results:
The script only picks up PIDs of gluster processes, not arbitrary PIDs from /var/run/gluster/snaps.
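
One possible guard, as a rough sketch only and not the actual fix: validate each PID before signalling it. The helper names `is_gluster_pid` and `kill_gluster_pidfiles` and the `PROC_ROOT` variable are hypothetical, and the check assumes that the command name of every gluster daemon (as reported in /proc/<pid>/comm) starts with "gluster".

```shell
# Hypothetical helpers, not part of the shipped script.
# PROC_ROOT exists only so the check can be exercised against a fake /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Succeed only if $1 is a numeric PID greater than 1 whose command
# name starts with "gluster" (assumption: glusterd, glusterfsd, glusterfs).
is_gluster_pid() {
    local pid="$1" comm
    # Reject empty or non-numeric pidfile contents.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Never signal init, whatever a pid file claims.
    [ "$pid" -gt 1 ] || return 1
    # The process must exist on this host.
    [ -r "${PROC_ROOT}/${pid}/comm" ] || return 1
    read -r comm < "${PROC_ROOT}/${pid}/comm" || return 1
    case "$comm" in
        gluster*) return 0 ;;
        *) return 1 ;;
    esac
}

# Same loop as in the report, but it skips any PID that fails the check.
# Usage: kill_gluster_pidfiles TERM /var/run/gluster
# (assumes pid file paths contain no newlines)
kill_gluster_pidfiles() {
    local signal="$1" dir="$2" pidfile pid
    find "$dir" -name '*.pid' | while IFS= read -r pidfile; do
        pid=$(head -n 1 "$pidfile" 2>/dev/null | tr -d '[:space:]')
        if is_gluster_pid "$pid"; then
            echo "sending SIG${signal} to pid: ${pid}"
            kill "-${signal}" "$pid"
        else
            echo "skipping ${pidfile}: pid '${pid}' is not a gluster process" >&2
        fi
    done
}
```

With a check of this kind, a pid file written inside a container (where the daemon runs as PID 1 in its own namespace) would be skipped instead of causing a signal to be sent to the host's PID 1.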

Additional info:

