Bug 989448 - Dist-geo-rep: Stale glusterfs mount process in slave after geo-rep session stop and delete
Status: CLOSED EOL
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: geo-replication
Version: 2.1
Hardware: x86_64 Linux
Priority: high
Severity: medium
Assigned To: Bug Updates Notification Mailing List
QA Contact: shilpa
Keywords: usability
Depends On:
Blocks:
Reported: 2013-07-29 05:44 EDT by M S Vishwanath Bhat
Modified: 2016-05-31 21:56 EDT (History)
7 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-25 03:48:59 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
glusterfs mount process logfile (24.99 KB, text/x-log)
2013-07-29 05:44 EDT, M S Vishwanath Bhat
glusterd log file from the slave (2.62 KB, text/x-log)
2013-07-29 05:45 EDT, M S Vishwanath Bhat

Description M S Vishwanath Bhat 2013-07-29 05:44:54 EDT
Created attachment 779710 [details]
glusterfs mount process logfile

Description of problem:
After stopping and then deleting the geo-rep session from the master volume, a stale glusterfs (FUSE) mount process is left running on the slave.


Version-Release number of selected component (if applicable):
glusterfs-3.4.0.12rhs.beta6-1.el6rhs.x86_64

How reproducible:
Hit once.

Steps to Reproduce:
1. Create and start the master and slave volumes.
2. Create and start a geo-rep session between the master and the slave.
3. After the data has synced, stop the geo-rep session and then delete it.
4. Stop the master and slave volumes (a command sketch is given below).
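A minimal command sequence for the steps above, as a sketch only: the host, brick, and volume names are placeholders (not taken from this report), and the geo-replication syntax shown is the distributed geo-rep CLI form; adjust it to the exact build in use.

# 1. Create and start the master and slave volumes (hosts/bricks are illustrative)
gluster volume create master m1:/rhs/brick1/master m2:/rhs/brick1/master
gluster volume start master
gluster volume create slave s1:/rhs/brick1/slave s2:/rhs/brick1/slave
gluster volume start slave

# 2. Create and start the geo-rep session between master and slave
gluster volume geo-replication master s1::slave create push-pem
gluster volume geo-replication master s1::slave start

# 3. After the data has synced, stop the session and then delete it
gluster volume geo-replication master s1::slave stop
gluster volume geo-replication master s1::slave delete

# 4. Stop the master and slave volumes
gluster volume stop master
gluster volume stop slave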

Actual results:
On one of the slave nodes, a stale glusterfs mount process is still running:

[root@falcon ~]# ps -aef | grep gluster
root     18992     1  0 Jul24 ?        00:00:30 /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid
root     19334     1  0 Jul24 ?        00:00:22 glusterfs -s localhost --xlator-option=*dht.lookup-unhashed=off --volfile-id hosa-slave -l /var/log/glusterfs/geo-replication-slaves/slave.log /tmp/tmp.GoL7yiyMEf


Expected results:
There should be no stale glusterfs processes.
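A possible manual cleanup, sketched here and not part of the original report: locate the leftover auxiliary mount on the affected slave node and unmount it lazily; the mount point is the gsyncd temporary directory shown in the ps output above and will differ on every run.

# Find the stale glusterfs client process for the slave volume's auxiliary mount
ps -ef | grep '[g]lusterfs.*volfile-id'

# Lazy-unmount the gsyncd temporary mount point; the client process should then exit
umount -l /tmp/tmp.GoL7yiyMEf

# If the process keeps running after the unmount, kill it by PID (19334 in the output above)
kill 19334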


Additional info:

These error messages were seen in the log file of the slave mount.


[2013-07-26 06:11:41.820108] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2013-07-26 06:11:41.829570] I [glusterfsd-mgmt.c:1545:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2013-07-26 06:11:41.829676] I [glusterfsd-mgmt.c:1545:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2013-07-27 21:46:01.493633] I [glusterfsd.c:1128:reincarnate] 0-glusterfsd: Fetching the volume file from server...
[2013-07-27 21:46:01.494777] I [glusterfsd-mgmt.c:1545:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2013-07-29 09:27:50.114288] W [socket.c:522:__socket_rwv] 0-hosa-slave-client-0: readv on 10.70.43.183:49152 failed (No data available)
[2013-07-29 09:27:50.114380] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-0: disconnected from 10.70.43.183:49152. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:27:52.248986] W [socket.c:522:__socket_rwv] 0-hosa-slave-client-3: readv on 10.70.42.219:49152 failed (No data available)
[2013-07-29 09:27:52.249323] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-3: disconnected from 10.70.42.219:49152. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:27:52.249827] W [socket.c:522:__socket_rwv] 0-hosa-slave-client-1: readv on 10.70.43.197:49152 failed (No data available)
[2013-07-29 09:27:52.250122] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-1: disconnected from 10.70.43.197:49152. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:27:52.250168] E [afr-common.c:3822:afr_notify] 0-hosa-slave-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2013-07-29 09:27:52.251311] W [socket.c:522:__socket_rwv] 0-hosa-slave-client-2: readv on 10.70.43.48:49152 failed (No data available)
[2013-07-29 09:27:52.251561] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-2: disconnected from 10.70.43.48:49152. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:27:52.251608] E [afr-common.c:3822:afr_notify] 0-hosa-slave-replicate-1: All subvolumes are down. Going offline until atleast one of them comes back up.
[2013-07-29 09:28:00.427878] E [client-handshake.c:1741:client_query_portmap_cbk] 0-hosa-slave-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2013-07-29 09:28:00.428047] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-0: disconnected from 10.70.43.183:24007. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:28:02.437290] E [client-handshake.c:1741:client_query_portmap_cbk] 0-hosa-slave-client-3: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2013-07-29 09:28:02.437569] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-3: disconnected from 10.70.42.219:24007. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:28:02.443829] E [client-handshake.c:1741:client_query_portmap_cbk] 0-hosa-slave-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2013-07-29 09:28:02.444072] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-1: disconnected from 10.70.43.197:24007. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:28:02.449285] E [client-handshake.c:1741:client_query_portmap_cbk] 0-hosa-slave-client-2: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2013-07-29 09:28:02.449493] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-2: disconnected from 10.70.43.48:24007. Client process will keep trying to connect to glusterd until brick's port is available. 



I have attached the glusterfs mount log and glusterd log from that slave node.
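For completeness, the portmap failures in the log can be cross-checked on the slave node once the slave volume has been stopped; a sketch of that check, using the slave volume name from the ps output above:

# The slave volume is stopped, so no brick ports can be resolved ...
gluster volume status hosa-slave

# ... yet the gsyncd auxiliary mount is still present in the mount table
grep '/tmp/tmp\.' /proc/mounts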
Comment 1 M S Vishwanath Bhat 2013-07-29 05:45:44 EDT
Created attachment 779711 [details]
glusterd log file from the slave
Comment 3 Amar Tumballi 2013-08-01 06:09:40 EDT
Csaba, can you please take a look at this ASAP? We need all of the 'time' values in the code to be in UTC (so that failover-failback is seamless).
Comment 4 Amar Tumballi 2013-08-01 06:11:58 EDT
The above comment was meant for bug 987272.
Comment 5 M S Vishwanath Bhat 2013-08-07 05:02:26 EDT
I hit this again with 3.4.0.15rhs.
Comment 6 Aravinda VK 2014-12-24 02:03:00 EST
Need to check the behavior with the latest 2.1 and 3.0 RPMs.
Comment 7 Aravinda VK 2015-11-25 03:48:59 EST
Closing this bug since the RHGS 2.1 release has reached EOL. The required bugs have been cloned to RHGS 3.1. Please re-open this issue if it is seen again.
Comment 8 Aravinda VK 2015-11-25 03:50:50 EST
Closing this bug since the RHGS 2.1 release has reached EOL. The required bugs have been cloned to RHGS 3.1. Please re-open this issue if it is seen again.
