Bug 911489 - Geo-replication causing Virtual Machines to be put into read-only mode.
Summary: Geo-replication causing Virtual Machines to be put into read-only mode.
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: mainline
Hardware: All
OS: Linux
Priority: medium
Severity: urgent
Target Milestone: ---
Assignee: Csaba Henk
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-02-15 07:32 UTC by David Bomba
Modified: 2015-10-22 15:46 UTC
CC: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-10-22 15:46:38 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments: none

Description David Bomba 2013-02-15 07:32:47 UTC
Description of problem:

When geo-replication is started on a brick that stores Virtual Machine images, running (Linux) virtual machines are put into read-only mode, requiring a reboot and an fsck of their boot drives before the virtual machines can be restored.

Version-Release number of selected component (if applicable):

gluster 3.3.1

How reproducible:

Citrix XenServer with a Gluster 3.3.1 storage node (1 brick), geo-replicating to a separate Gluster node. Starting geo-replication consistently causes the Linux VMs being replicated to move into read-only mode.

Steps to Reproduce:
1. Run Linux VMs on Citrix XenServer backed by a Gluster 3.3.1 single-brick volume.
2. Start geo-replication from that volume to a separate Gluster node.
3. Observe the running VMs move into read-only mode.

Actual results:

Running VMs are put into read-only mode.

Expected results:

Transparent replication without any negative impact on running instances

Additional info:

It has been suggested that adding --no-blocking-io at line 508 of /usr/libexec/glusterfs/python/syncdaemon/resource.py may help prevent any file locking that rsync may hold on the virtual machine .vhd files. We cannot test this ourselves, as we are in production and cannot risk further outages.
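
(For anyone able to test this outside production, a rough sketch of trying the suggestion follows. This is an assumption, not a verified fix: the exact line number and rsync argument list in resource.py vary by build, and rsync's --no-blocking-io option only changes how rsync drives its remote-shell pipe, so whether it helps here is unconfirmed.)

# Hypothetical sketch only -- back up the file and verify the rsync
# argument list in your build before editing:
cp /usr/libexec/glusterfs/python/syncdaemon/resource.py \
   /usr/libexec/glusterfs/python/syncdaemon/resource.py.bak
# Add --no-blocking-io to the rsync invocation (reported to be near
# line 508), then restart the session so the change takes effect:
gluster volume geo-replication vol1 file:///mnt/replication/superpool stop
gluster volume geo-replication vol1 file:///mnt/replication/superpool start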

Comment 1 Amar Tumballi 2013-02-28 04:49:19 UTC
David,

As of now, if you are syncing 'active/live' VMs to the destination, the source and destination will be in an inconsistent state.

But from your report it is not fully clear why a source VM would get locked up. We only do a read-only rsync of the file to the destination; no locks are used (not even shared locks).

In the meantime, can you post the logs of the geo-rep mount and brick processes?

Comment 2 David Bomba 2013-02-28 07:26:45 UTC
Hi Amar,

Correct, we started geo-replication in an inconsistent state. We have a 5 TB volume of VMs on which we started replication; about 300 GB had transferred before this issue became noticeable. Our system has been stable since we disabled geo-replication, so we believe the issue is either a direct result of geo-replication or perhaps an I/O-wait issue caused by the replication target.

Please find below the log output of the VM host and its geo-replication log:

+------------------------------------------------------------------------------+
[2013-01-16 08:54:01.678492] I [rpc-clnt.c:1657:rpc_clnt_reconfig] 0-vol1-client-0: changing port to 24009 (from 0)
[2013-01-16 08:54:05.608849] I [client-handshake.c:1636:select_server_supported_programs] 0-vol1-client-0: Using Program GlusterFS 3.3.1, Num (1298437), Version (330)
[2013-01-16 08:54:05.609233] I [client-handshake.c:1433:client_setvolume_cbk] 0-vol1-client-0: Connected to 192.168.0.201:24009, attached to remote volume '/mnt/gluster/1'.
[2013-01-16 08:54:05.609263] I [client-handshake.c:1445:client_setvolume_cbk] 0-vol1-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2013-01-16 08:54:05.612386] I [fuse-bridge.c:4191:fuse_graph_setup] 0-fuse: switched to graph 0
[2013-01-16 08:54:05.612501] I [client-handshake.c:453:client_set_lk_version_cbk] 0-vol1-client-0: Server lk version = 1
[2013-01-16 08:54:05.612668] I [fuse-bridge.c:3376:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2013-01-16 09:08:46.566220] I [fuse-bridge.c:4091:fuse_thread_proc] 0-fuse: unmounting /tmp/gsyncd-aux-mount-rwKjb0
[2013-01-16 09:08:46.572910] W [glusterfsd.c:831:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x7f3c73a8811d] (-->/lib64/libpthread.so.0(+0x7851) [0x7f3c740d4851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405d4d]))) 0-: received signum (15), shutting down
[2013-01-16 09:08:46.572932] I [fuse-bridge.c:4648:fini] 0-fuse: Unmounting '/tmp/gsyncd-aux-mount-rwKjb0'.
[2013-01-18 02:24:44.356748] I [glusterfsd.c:1666:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.3.1
[2013-01-18 02:24:44.508105] I [io-cache.c:1549:check_cache_size_ok] 0-vol1-quick-read: Max cache size is 101432086528
[2013-01-18 02:24:44.508161] I [io-cache.c:1549:check_cache_size_ok] 0-vol1-io-cache: Max cache size is 101432086528
[2013-01-18 02:24:44.508691] I [client.c:2142:notify] 0-vol1-client-0: parent translators are ready, attempting connect on transport
Given volfile:
+------------------------------------------------------------------------------+
  1: volume vol1-client-0
  2:     type protocol/client
  3:     option remote-host gluster
  4:     option remote-subvolume /mnt/gluster/1
  5:     option transport-type tcp
  6:     option username 575e93cc-701f-4bc8-a695-0b7d905bc0ad
  7:     option password 858e5da6-c3ee-4356-a726-b17346a77f91
  8: end-volume
  9: 
 10: volume vol1-write-behind
 11:     type performance/write-behind
 12:     subvolumes vol1-client-0
 13: end-volume
 14: 
 15: volume vol1-read-ahead
 16:     type performance/read-ahead
 17:     subvolumes vol1-write-behind
 18: end-volume
 19: 
 20: volume vol1-io-cache
 21:     type performance/io-cache
 22:     subvolumes vol1-read-ahead
 23: end-volume
 24: 
 25: volume vol1-quick-read
 26:     type performance/quick-read
 27:     subvolumes vol1-io-cache
 28: end-volume
 29: 
 30: volume vol1-md-cache
 31:     type performance/md-cache
 32:     subvolumes vol1-quick-read
 33: end-volume
 34: 
 35: volume vol1
 36:     type debug/io-stats
 37:     option latency-measurement off
 38:     option count-fop-hits off
 39:     subvolumes vol1-md-cache
 40: end-volume

+------------------------------------------------------------------------------+
[2013-01-18 02:24:44.511127] I [rpc-clnt.c:1657:rpc_clnt_reconfig] 0-vol1-client-0: changing port to 24009 (from 0)
[2013-01-18 02:24:48.400660] I [client-handshake.c:1636:select_server_supported_programs] 0-vol1-client-0: Using Program GlusterFS 3.3.1, Num (1298437), Version (330)
[2013-01-18 02:24:48.401007] I [client-handshake.c:1433:client_setvolume_cbk] 0-vol1-client-0: Connected to 192.168.0.201:24009, attached to remote volume '/mnt/gluster/1'.
[2013-01-18 02:24:48.401037] I [client-handshake.c:1445:client_setvolume_cbk] 0-vol1-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2013-01-18 02:24:48.404569] I [fuse-bridge.c:4191:fuse_graph_setup] 0-fuse: switched to graph 0
[2013-01-18 02:24:48.404677] I [client-handshake.c:453:client_set_lk_version_cbk] 0-vol1-client-0: Server lk version = 1
[2013-01-18 02:24:48.404800] I [fuse-bridge.c:3376:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2013-01-18 04:34:21.835220] I [fuse-bridge.c:4091:fuse_thread_proc] 0-fuse: unmounting /tmp/gsyncd-aux-mount-uAHWaX
[2013-01-18 04:34:21.854701] W [glusterfsd.c:831:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x7f8a8b99b11d] (-->/lib64/libpthread.so.0(+0x7851) [0x7f8a8bfe7851] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405d4d]))) 0-: received signum (15), shutting down
[2013-01-18 04:34:21.854730] I [fuse-bridge.c:4648:fini] 0-fuse: Unmounting '/tmp/gsyncd-aux-mount-uAHWaX'.
[root@gluster vol1]# cat file%3A%2F%2F%2Fmnt%2Freplication%2Fsuperpool.
file%3A%2F%2F%2Fmnt%2Freplication%2Fsuperpool.gluster.log  file%3A%2F%2F%2Fmnt%2Freplication%2Fsuperpool.log
[root@gluster vol1]# cat file%3A%2F%2F%2Fmnt%2Freplication%2Fsuperpool.log 
[2013-01-15 04:58:21.411711] I [monitor(monitor):21:set_state] Monitor: new state: starting...
[2013-01-15 04:58:21.420644] I [monitor(monitor):80:monitor] Monitor: ------------------------------------------------------------
[2013-01-15 04:58:21.420742] I [monitor(monitor):81:monitor] Monitor: starting gsyncd worker
[2013-01-15 04:58:21.460053] I [gsyncd:354:main_i] <top>: syncing: gluster://localhost:vol1 -> file:///mnt/replication/superpool
[2013-01-15 04:58:26.832029] I [master:284:crawl] GMaster: new master is f1fdfebf-89eb-4d49-b154-f13ec79184bb
[2013-01-15 04:58:26.832207] I [master:288:crawl] GMaster: primary master with volume id f1fdfebf-89eb-4d49-b154-f13ec79184bb ...
[2013-01-15 04:59:21.674072] I [monitor(monitor):21:set_state] Monitor: new state: OK
[2013-01-15 17:07:09.148614] I [syncdutils(monitor):142:finalize] <top>: exiting.
[2013-01-15 17:07:09.157291] E [syncdutils:173:log_raise_exception] <top>: connection to peer is broken
[2013-01-15 17:07:09.168896] E [resource:191:errlog] Popen: command "/usr/libexec/glusterfs/gsyncd --session-owner f1fdfebf-89eb-4d49-b154-f13ec79184bb -N --listen --timeout 120 file:///mnt/replication/superpool" returned with 1, saying:
[2013-01-15 17:07:09.169038] E [resource:194:logerr] Popen: /usr/libexec/glusterfs/gsyncd> [2013-01-15 04:58:21.471726] W [rpc-transport.c:174:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket"
[2013-01-15 17:07:09.169136] E [resource:194:logerr] Popen: /usr/libexec/glusterfs/gsyncd> [2013-01-15 04:58:21.519780] I [cli-rpc-ops.c:4321:gf_cli3_1_getwd_cbk] 0-cli: Received resp to getwd
[2013-01-15 17:07:09.169231] E [resource:194:logerr] Popen: /usr/libexec/glusterfs/gsyncd> [2013-01-15 04:58:21.519810] I [input.c:46:cli_batch] 0-: Exiting with: 0
[2013-01-15 17:07:09.169337] I [syncdutils:142:finalize] <top>: exiting.
[2013-01-16 08:54:01.116357] I [monitor(monitor):21:set_state] Monitor: new state: starting...
[2013-01-16 08:54:01.117120] I [monitor(monitor):80:monitor] Monitor: ------------------------------------------------------------
[2013-01-16 08:54:01.117218] I [monitor(monitor):81:monitor] Monitor: starting gsyncd worker
[2013-01-16 08:54:01.157299] I [gsyncd:354:main_i] <top>: syncing: gluster://localhost:vol1 -> file:///mnt/replication/superpool
[2013-01-16 08:54:07.69381] I [master:284:crawl] GMaster: new master is f1fdfebf-89eb-4d49-b154-f13ec79184bb
[2013-01-16 08:54:07.69557] I [master:288:crawl] GMaster: primary master with volume id f1fdfebf-89eb-4d49-b154-f13ec79184bb ...
[2013-01-16 08:55:01.694350] I [monitor(monitor):21:set_state] Monitor: new state: OK
[2013-01-16 09:08:46.124933] I [syncdutils(monitor):142:finalize] <top>: exiting.
[2013-01-16 09:08:46.126538] E [syncdutils:173:log_raise_exception] <top>: connection to peer is broken
[2013-01-16 09:08:46.127949] E [resource:191:errlog] Popen: command "/usr/libexec/glusterfs/gsyncd --session-owner f1fdfebf-89eb-4d49-b154-f13ec79184bb -N --listen --timeout 120 file:///mnt/replication/superpool" returned with 1, saying:
[2013-01-16 09:08:46.128273] E [resource:194:logerr] Popen: /usr/libexec/glusterfs/gsyncd> [2013-01-16 08:54:01.169395] W [rpc-transport.c:174:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket"
[2013-01-16 09:08:46.128412] E [resource:194:logerr] Popen: /usr/libexec/glusterfs/gsyncd> [2013-01-16 08:54:01.221267] I [cli-rpc-ops.c:4321:gf_cli3_1_getwd_cbk] 0-cli: Received resp to getwd
[2013-01-16 09:08:46.128546] E [resource:194:logerr] Popen: /usr/libexec/glusterfs/gsyncd> [2013-01-16 08:54:01.221361] I [input.c:46:cli_batch] 0-: Exiting with: 0
[2013-01-16 09:08:46.128693] I [syncdutils:142:finalize] <top>: exiting.
[2013-01-18 02:24:44.166646] I [monitor(monitor):21:set_state] Monitor: new state: starting...
[2013-01-18 02:24:44.167433] I [monitor(monitor):80:monitor] Monitor: ------------------------------------------------------------
[2013-01-18 02:24:44.167541] I [monitor(monitor):81:monitor] Monitor: starting gsyncd worker
[2013-01-18 02:24:44.206756] I [gsyncd:354:main_i] <top>: syncing: gluster://localhost:vol1 -> file:///mnt/replication/superpool
[2013-01-18 02:24:49.591288] I [master:284:crawl] GMaster: new master is f1fdfebf-89eb-4d49-b154-f13ec79184bb
[2013-01-18 02:24:49.591461] I [master:288:crawl] GMaster: primary master with volume id f1fdfebf-89eb-4d49-b154-f13ec79184bb ...
[2013-01-18 02:25:44.498188] I [monitor(monitor):21:set_state] Monitor: new state: OK
[2013-01-18 04:34:19.290539] I [syncdutils(monitor):142:finalize] <top>: exiting.
[2013-01-18 04:34:19.292119] E [syncdutils:173:log_raise_exception] <top>: connection to peer is broken
[2013-01-18 04:34:19.293314] E [resource:191:errlog] Popen: command "/usr/libexec/glusterfs/gsyncd --session-owner f1fdfebf-89eb-4d49-b154-f13ec79184bb -N --listen --timeout 120 file:///mnt/replication/superpool" returned with 1, saying:
[2013-01-18 04:34:19.293431] E [resource:194:logerr] Popen: /usr/libexec/glusterfs/gsyncd> [2013-01-18 02:24:44.219045] W [rpc-transport.c:174:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket"
[2013-01-18 04:34:19.293525] E [resource:194:logerr] Popen: /usr/libexec/glusterfs/gsyncd> [2013-01-18 02:24:44.266324] I [cli-rpc-ops.c:4321:gf_cli3_1_getwd_cbk] 0-cli: Received resp to getwd
[2013-01-18 04:34:19.293617] E [resource:194:logerr] Popen: /usr/libexec/glusterfs/gsyncd> [2013-01-18 02:24:44.266422] I [input.c:46:cli_batch] 0-: Exiting with: 0
[2013-01-18 04:34:19.293720] I [syncdutils:142:finalize] <top>: exiting.

The bricks have no logs.

Comment 3 Venky Shankar 2013-02-28 09:50:26 UTC
David,

I see you are syncing to a local directory tree (/mnt/data/superpool). Since we use rsync for data transfers and the slave is a local directory tree, rsync will use a full copy instead of diff transfers. And since this is a live VM, each crawl results in a full data copy.
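
(For context, rsync treats local and remote destinations differently: with a local destination it defaults to --whole-file, so every crawl recopies a changed image in full, while the rolling-checksum delta transfer only applies over a remote shell. A rough illustration, using paths from this report, with vm.vhd and backup-host as placeholders:)

# Local destination: rsync implies --whole-file, so a changed VM image
# is recopied in full on every crawl:
rsync -a /mnt/gluster/1/vm.vhd /mnt/replication/superpool/
# Remote destination: only the changed blocks go over the wire:
rsync -a /mnt/gluster/1/vm.vhd backup-host:/replication/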

I could not figure out much from the logs at this point. Would it be possible for you to put geo-rep in DEBUG mode (via # gluster volume geo-replication <master> <slave> config log_level DEBUG) and provide the logs?
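
(For reference, with this report's session names filled in from the logs above, that would be:)

# gluster volume geo-replication vol1 file:///mnt/replication/superpool config log_level DEBUG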

One more thing: is the slave a local FS or a GlusterFS mount, by any chance?

Comment 4 David Bomba 2013-02-28 10:08:58 UTC
Hi Venky,

We mounted the Gluster volume on a mount point (/mnt/replication/*) and, by the looks of it, implemented geo-rep on top of that.

What you have said about full copy vs diff makes sense. We will unmount the brick and then start geo-replication again.

We will enable Debug logging for further info and report back with the results.

Thanks!

Comment 5 Venky Shankar 2013-02-28 14:14:57 UTC
David,

invoke geo-replication like this:

# gluster volume geo-replication vol1 <hostname>::<slave-vol> start

Note: the '::' causes the gluster mount to be done on the slave itself, so rsync can take advantage of syncing diffs.

If the slave is specified as <hostname>:<slave-vol>, the slave volume is mounted on the master, which is not what you want (rsync would again see a local directory tree and use a full copy).
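
(A sketch of the two forms side by side, with slave-host and slavevol as placeholders:)

# Gluster-aware slave ('::'): the slave volume is mounted on the slave
# node itself, so rsync works over ssh and can sync diffs:
gluster volume geo-replication vol1 slave-host::slavevol start

# Plain slave (single ':'): the slave volume is mounted on the master,
# so rsync sees a local directory tree and falls back to full copies:
gluster volume geo-replication vol1 slave-host:slavevol start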

Comment 6 David Bomba 2013-03-07 11:16:57 UTC
Tried geo-replication with the <hostname>::<slave-vol> form; however, the status returned 'faulty' each time I tried it.

We also unmounted the Gluster volume and performed a standard geo-replication as documented on the Gluster website; again we were able to reproduce the virtual machines being sent into read-only mode.

The affected machines were ones that typically performed quite a bit of I/O, so we believe something interferes with I/O operations when geo-replication is active.

There was nothing out of the ordinary observed in the logs; the geo-replication process continued unabated until we stopped it due to the instability it was causing in our production environment.

No significant I/O wait was observed on either the host or the slave machines.



I [gsyncd:354:main_i] <top>: syncing: gluster://localhost:vol3 -> ssh://10.1.1.40::/mnt/backup/replication/rackpool
[2013-03-07 22:08:24.140345] E [syncdutils:173:log_raise_exception] <top>: connection to peer is broken
[2013-03-07 22:08:24.140575] E [resource:191:errlog] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-JJdCaL/gsycnd-ssh-%r@%h:%p 10.1.1.40 /usr/local/libexec/glusterfs/gsyncd --session-owner 22c059e2-e24c-4f49-90f4-700531869455 -N --listen --timeout 120 gluster://localhost:/mnt/backup/replication/rackpool" returned with 1, saying:
[2013-03-07 22:08:24.140681] E [resource:194:logerr] Popen: ssh> [2013-03-07 22:08:34.942640] W [rpc-transport.c:174:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket"
[2013-03-07 22:08:24.140760] E [resource:194:logerr] Popen: ssh> [2013-03-07 22:08:34.993852] I [cli-rpc-ops.c:4321:gf_cli3_1_getwd_cbk] 0-cli: Received resp to getwd
[2013-03-07 22:08:24.140839] E [resource:194:logerr] Popen: ssh> [2013-03-07 22:08:34.993905] I [input.c:46:cli_batch] 0-: Exiting with: 0
[2013-03-07 22:08:24.141000] I [syncdutils:142:finalize] <top>: exiting.
[2013-03-07 22:08:34.152384] I [monitor(monitor):80:monitor] Monitor: ------------------------------------------------------------
[2013-03-07 22:08:34.152611] I [monitor(monitor):81:monitor] Monitor: starting gsyncd worker
[2013-03-07 22:08:34.192703] I [gsyncd:354:main_i] <top>: syncing: gluster://localhost:vol3 -> ssh://10.1.1.40::/mnt/backup/replication/rackpool
[root@gluster vol3]# gluster volume geo-replication vol3 10.1.1.40::/mnt/backup/replication/rackpool status

Comment 7 Kaleb KEITHLEY 2015-10-22 15:46:38 UTC
Because of the large number of bugs filed against it, the 'mainline' version is ambiguous and is about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.

