Bug 1471794 - Brick Multiplexing: Different brick processes pointing to same socket, process file and volfile-id
Status: CLOSED WONTFIX
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: core
Version: 3.3
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: urgent
Assigned To: Mohit Agrawal
QA Contact: Rahul Hinduja
brick-multiplexing
Depends On:
Blocks:
 
Reported: 2017-07-17 08:54 EDT by nchilaka
Modified: 2018-02-13 07:47 EST
CC: 5 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-02-13 07:47:44 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None

Description nchilaka 2017-07-17 08:54:32 EDT
Description of problem:
===========
I am raising a separate bug to track the issue I mentioned originally in BZ#1444086 - Brick Multiplexing:Different brick processes pointing to same socket, process file and volfile-id must not lead to IO loss when one of the volume is down


I see that we can end up having different glusterfsd processes pointing to the same socket and volfile-id.




The same problem is seen even on 3.8.4-33.

The reason I am raising a separate bug is based on the discussion in BZ#1444086 (comments 11, 12, 13, 14).




Description of problem:
=========================
With brick multiplexing it is quite easy to end up with two brick processes (glusterfsd) pointing to the same socket and volfile-id.

While I still need to understand the implications (which I suspect can be severe), this problem is consistently reproducible.
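
A quick way to check for this condition on a node (an illustrative one-liner, assuming only standard ps/grep/sort tools; it simply counts duplicate -S socket arguments among the running brick processes):

ps -eo args | grep '[g]lusterfsd' | grep -o '\-S [^ ]*' | sort | uniq -c

Any socket path reported with a count greater than 1 is being claimed by more than one glusterfsd process.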

Version-Release number of selected component (if applicable):
=====================
3.8.4-22 to 3.8.4-33 (all builds)

How reproducible:
======
2/2

Steps to Reproduce:
1. Have a gluster setup (I have 6 nodes) with brick multiplexing enabled.
2. Create a volume, say v1, which is 1x3 spanning n1, n2, n3.
Now the glusterfsd process will look something like this when checked on node n1:

root     20014     1  0 19:22 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.23 --volfile-id cross31.10.70.35.23.rhs-brick10-cross31 -p /var/lib/glusterd/vols/cross31/run/10.70.35.23-rhs-brick10-cross31.pid -S /var/lib/glusterd/vols/cross31/run/daemon-10.70.35.23.socket --brick-name /rhs/brick10/cross31 -l /var/log/glusterfs/bricks/rhs-brick10-cross31.log --xlator-option *-posix.glusterd-uuid=2b1a4ca7-5c9b-4169-add4-23530cea101a --brick-port 49153 --xlator-option cross31-server.listen-port=49153

3. Now create another 1x3 volume, say v2; the bricks of this volume will get attached to the same PIDs.

4. Now enable USS on v1 (or any other option which will result in v1 getting new PIDs for its bricks on a restart).
5. Now stop and start v1 (an illustrative command sequence for these steps is given below).
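
For reference, a rough command sequence covering the above steps (illustrative only; the volume names v1/v2, node names n1..n3 and brick paths below are placeholders, while the actual run used a volume named cross31 on /rhs/brick10):

gluster volume set all cluster.brick-multiplex on
gluster volume create v1 replica 3 n1:/rhs/brick10/v1 n2:/rhs/brick10/v1 n3:/rhs/brick10/v1
gluster volume start v1
ps -ef | grep glusterfsd                      # note the brick PID, -S socket and --volfile-id
gluster volume create v2 replica 3 n1:/rhs/brick11/v2 n2:/rhs/brick11/v2 n3:/rhs/brick11/v2
gluster volume start v2                       # with multiplexing, v2's bricks attach to the same PID
gluster volume set v1 features.uss enable     # any option that changes v1's brick volfile
gluster volume stop v1
gluster volume start v1
ps -ef | grep glusterfsd                      # a second glusterfsd shows up for v1's brick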


Actual results:
==========
It can be seen that a new PID is spawned for the bricks (because a volume option changed, which prevents them from attaching to the first PID).

The problem is that the new PID is connected to the same socket as the first PID, and so are the volfile-id and log file, as shown below:



[root@dhcp35-23 3.8.4-22]# ps -ef|grep glusterfsd
===>old PID
root     20014     1  0 19:22 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.23 --volfile-id cross31.10.70.35.23.rhs-brick10-cross31 -p /var/lib/glusterd/vols/cross31/run/10.70.35.23-rhs-brick10-cross31.pid -S /var/lib/glusterd/vols/cross31/run/daemon-10.70.35.23.socket --brick-name /rhs/brick10/cross31 -l /var/log/glusterfs/bricks/rhs-brick10-cross31.log --xlator-option *-posix.glusterd-uuid=2b1a4ca7-5c9b-4169-add4-23530cea101a --brick-port 49153 --xlator-option cross31-server.listen-port=49153

==>new PID
root     20320     1  0 19:27 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.23 --volfile-id cross31.10.70.35.23.rhs-brick10-cross31 -p /var/lib/glusterd/vols/cross31/run/10.70.35.23-rhs-brick10-cross31.pid -S /var/lib/glusterd/vols/cross31/run/daemon-10.70.35.23.socket --brick-name /rhs/brick10/cross31 -l /var/log/glusterfs/bricks/rhs-brick10-cross31.log --xlator-option *-posix.glusterd-uuid=2b1a4ca7-5c9b-4169-add4-23530cea101a --brick-port 49152 --xlator-option cross31-server.listen-port=49152
root     20340     1  0 19:27 ?        00:00:00 /usr/sbin/glusterfsd -s localhost --volfile-id snapd/cross31 -p /var/lib/glusterd/vols/cross31/run/cross31-snapd.pid -l /var/log/glusterfs/snaps/cross31/snapd.log --brick-name snapd-cross31 -S /var/run/gluster/d451ea3d83a68af025cee105cafdd8a2.socket --brick-port 49154 --xlator-option cross31-server.listen-port=49154 --no-mem-accounting
root     20472 30155  0 19:38 pts/0    00:00:00 grep --color=auto glusterfs
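
To confirm which of the two PIDs actually holds the shared listening socket, something like the following can be used (an illustrative check, assuming ss is available on the node; the socket path is the -S argument from the ps output above):

ss -xlp | grep daemon-10.70.35.23.socket

Only one process can be bound to that unix socket path at a time, so this should show which of the two brick processes glusterd can actually reach over it.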
