Bug 989448

Summary: Dist-geo-rep: Stale glusterfs mount process in slave after geo-rep session stop and delete
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: geo-replication
Reporter: M S Vishwanath Bhat <vbhat>
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
QA Contact: shilpa <smanjara>
Status: CLOSED EOL
Severity: medium
Priority: high
Version: 2.1
CC: avishwan, chrisw, csaba, mzywusko, rhs-bugs, rwheeler, vagarwal
Hardware: x86_64
OS: Linux
Whiteboard: usability
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-11-25 08:48:59 UTC
Attachments:
glusterfs mount process logfile (flags: none)
glusterd log file from the slave (flags: none)

Description M S Vishwanath Bhat 2013-07-29 09:44:54 UTC
Created attachment 779710 [details]
glusterfs mount process logfile

Description of problem:
After stopping and then deleting the geo-replication session from the master volume, a stale glusterfs (FUSE) mount process is left behind on the slave.


Version-Release number of selected component (if applicable):
glusterfs-3.4.0.12rhs.beta6-1.el6rhs.x86_64

How reproducible:
Hit once.

Steps to Reproduce:
1. Create and start master and slave volumes.
2. Create and start a geo-rep session between master and slave.
3. After the data has synced, stop the geo-rep session and then delete it (a CLI sketch of these steps follows the list).
4. Stop the master and slave volumes.
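
For reference, the four steps above correspond roughly to the following CLI sequence. This is a minimal sketch: the volume names, brick paths, and host names are hypothetical, and the exact geo-replication syntax may differ slightly between releases.

# 1. Create and start the master and slave volumes
#    (hypothetical names and brick paths).
gluster volume create master-vol masterhost:/bricks/master-brick
gluster volume start master-vol
gluster volume create slave-vol slavehost:/bricks/slave-brick    # on the slave cluster
gluster volume start slave-vol

# 2. Create and start a geo-rep session between master and slave.
gluster volume geo-replication master-vol slavehost::slave-vol create push-pem
gluster volume geo-replication master-vol slavehost::slave-vol start

# 3. After the data has synced, stop the session and then delete it.
gluster volume geo-replication master-vol slavehost::slave-vol stop
gluster volume geo-replication master-vol slavehost::slave-vol delete

# 4. Stop the master and slave volumes.
gluster volume stop master-vol
gluster volume stop slave-vol    # on the slave cluster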

Actual results:
On one of the slave nodes:

[root@falcon ~]# ps -aef | grep gluster
root     18992     1  0 Jul24 ?        00:00:30 /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid
root     19334     1  0 Jul24 ?        00:00:22 glusterfs -s localhost --xlator-option=*dht.lookup-unhashed=off --volfile-id hosa-slave -l /var/log/glusterfs/geo-replication-slaves/slave.log /tmp/tmp.GoL7yiyMEf
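
As a manual workaround (not a fix), the stale mount can be cleaned up on the affected slave node. A sketch, assuming the temporary mountpoint and PID from the ps output above:

# Confirm the auxiliary mount is still present.
grep /tmp/tmp.GoL7yiyMEf /proc/mounts

# Lazily unmount it; the glusterfs client should exit once the mount is gone.
umount -l /tmp/tmp.GoL7yiyMEf

# If the process still lingers, terminate it by PID (19334 in this instance)
# and verify that nothing stale remains.
kill 19334
ps -ef | grep '[g]lusterfs'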


Expected results:
There should be no stale glusterfs mount processes left on the slave after the session is deleted.


Additional info:

The following error messages were seen in the log file of the stale slave mount process.


[2013-07-26 06:11:41.820108] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2013-07-26 06:11:41.829570] I [glusterfsd-mgmt.c:1545:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2013-07-26 06:11:41.829676] I [glusterfsd-mgmt.c:1545:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2013-07-27 21:46:01.493633] I [glusterfsd.c:1128:reincarnate] 0-glusterfsd: Fetching the volume file from server...
[2013-07-27 21:46:01.494777] I [glusterfsd-mgmt.c:1545:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2013-07-29 09:27:50.114288] W [socket.c:522:__socket_rwv] 0-hosa-slave-client-0: readv on 10.70.43.183:49152 failed (No data available)
[2013-07-29 09:27:50.114380] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-0: disconnected from 10.70.43.183:49152. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:27:52.248986] W [socket.c:522:__socket_rwv] 0-hosa-slave-client-3: readv on 10.70.42.219:49152 failed (No data available)
[2013-07-29 09:27:52.249323] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-3: disconnected from 10.70.42.219:49152. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:27:52.249827] W [socket.c:522:__socket_rwv] 0-hosa-slave-client-1: readv on 10.70.43.197:49152 failed (No data available)
[2013-07-29 09:27:52.250122] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-1: disconnected from 10.70.43.197:49152. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:27:52.250168] E [afr-common.c:3822:afr_notify] 0-hosa-slave-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2013-07-29 09:27:52.251311] W [socket.c:522:__socket_rwv] 0-hosa-slave-client-2: readv on 10.70.43.48:49152 failed (No data available)
[2013-07-29 09:27:52.251561] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-2: disconnected from 10.70.43.48:49152. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:27:52.251608] E [afr-common.c:3822:afr_notify] 0-hosa-slave-replicate-1: All subvolumes are down. Going offline until atleast one of them comes back up.
[2013-07-29 09:28:00.427878] E [client-handshake.c:1741:client_query_portmap_cbk] 0-hosa-slave-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2013-07-29 09:28:00.428047] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-0: disconnected from 10.70.43.183:24007. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:28:02.437290] E [client-handshake.c:1741:client_query_portmap_cbk] 0-hosa-slave-client-3: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2013-07-29 09:28:02.437569] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-3: disconnected from 10.70.42.219:24007. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:28:02.443829] E [client-handshake.c:1741:client_query_portmap_cbk] 0-hosa-slave-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2013-07-29 09:28:02.444072] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-1: disconnected from 10.70.43.197:24007. Client process will keep trying to connect to glusterd until brick's port is available. 
[2013-07-29 09:28:02.449285] E [client-handshake.c:1741:client_query_portmap_cbk] 0-hosa-slave-client-2: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2013-07-29 09:28:02.449493] I [client.c:2103:client_rpc_notify] 0-hosa-slave-client-2: disconnected from 10.70.43.48:24007. Client process will keep trying to connect to glusterd until brick's port is available. 
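
The portmap errors above are consistent with the slave volume's bricks having been stopped while the stale client kept retrying. The check that the log itself suggests, using the volume name from this setup:

# Run on a slave node; with the volume stopped, no brick processes
# should be listed as online.
gluster volume status hosa-slave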



I have attached the glusterfs mount log and glusterd log from that slave node.

Comment 1 M S Vishwanath Bhat 2013-07-29 09:45:44 UTC
Created attachment 779711 [details]
glusterd log file from the slave

Comment 3 Amar Tumballi 2013-08-01 10:09:40 UTC
Csaba, can you please take a look at this ASAP? We need all times in the code to be in UTC (so that failover-failback is seamless).

Comment 4 Amar Tumballi 2013-08-01 10:11:58 UTC
The above comment was meant for bug 987272.

Comment 5 M S Vishwanath Bhat 2013-08-07 09:02:26 UTC
I hit this again with glusterfs-3.4.0.15rhs.

Comment 6 Aravinda VK 2014-12-24 07:03:00 UTC
We need to check the behavior with the latest 2.1 and 3.0 RPMs.

Comment 7 Aravinda VK 2015-11-25 08:48:59 UTC
Closing this bug since the RHGS 2.1 release has reached EOL. The required bugs have been cloned to RHGS 3.1. Please reopen this issue if it is seen again.
