Bug 816150 - geo-rep session b/w master volume and slave volume through gluster is broken
Summary: geo-rep session b/w master volume and slave volume through gluster is broken
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Csaba Henk
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-04-25 11:01 UTC by Vijaykumar Koppad
Modified: 2014-08-25 00:49 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-04-26 13:04:59 UTC
Regression: ---
Mount Type: ---
Documentation: DP
CRM:
Verified Versions:
Embargoed:



Description Vijaykumar Koppad 2012-04-25 11:01:56 UTC
Description of problem: If I start a geo-rep session between two volumes that are on two different machines through the gluster CLI
(that is, gluster volume geo-rep <master-vol> <host:slave-vol> start),
the status shows faulty.

Version-Release number of selected component (if applicable):
[2a59514236630756dc996e08b50f539ccc2d3ff0]

How reproducible: always


Steps to Reproduce:
1. Create volumes on two different machines.
2. Start a geo-rep session between them like this:
  gluster volume geo-rep <master-vol> <host:slave-vol> start
  
Actual results: The status shows faulty.


Expected results: The status should go to OK.


Additional info:
Master logs
########################
[2012-04-25 00:53:05.476368] E [resource:169:errfail] Popen: /usr/local/libexec/glusterfs/gsyncd> [2012-04-25 00:53:05.374375] W [rpc-transport.c:183:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket"
[2012-04-25 00:53:05.476458] E [resource:169:errfail] Popen: /usr/local/libexec/glusterfs/gsyncd> [2012-04-25 00:53:05.420236] I [cli-rpc-ops.c:4255:gf_cli3_1_getwd_cbk] 0-cli: Received resp to getwd
[2012-04-25 00:53:05.476545] E [resource:169:errfail] Popen: /usr/local/libexec/glusterfs/gsyncd> [2012-04-25 00:53:05.420363] I [input.c:46:cli_batch] 0-: Exiting with: 0
[2012-04-25 00:53:05.476644] I [syncdutils:142:finalize] <top>: exiting.
[2012-04-25 00:53:15.487909] I [monitor(monitor):80:monitor] Monitor: ------------------------------------------------------------
[2012-04-25 00:53:15.488188] I [monitor(monitor):81:monitor] Monitor: starting gsyncd worker
[2012-04-25 00:53:15.531266] I [gsyncd:355:main_i] <top>: syncing: gluster://localhost:master_sec -> gluster://10.16.157.30:slave2
[2012-04-25 00:53:15.635225] E [syncdutils:173:log_raise_exception] <top>: connection to peer is broken
[2012-04-25 00:53:15.635406] E [resource:166:errfail] Popen: command "/usr/local/libexec/glusterfs/gsyncd --session-owner b4a5e609-d195-427f-8a73-6e10d2c3c17c -N --listen --timeout 120 gluster://10.16.157.30:slave2" returned with 1, saying:
[2012-04-25 00:53:15.635510] E [resource:169:errfail] Popen: /usr/local/libexec/glusterfs/gsyncd> [2012-04-25 00:53:15.535079] W [rpc-transport.c:183:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket"
[2012-04-25 00:53:15.635600] E [resource:169:errfail] Popen: /usr/local/libexec/glusterfs/gsyncd> [2012-04-25 00:53:15.580887] I [cli-rpc-ops.c:4255:gf_cli3_1_getwd_cbk] 0-cli: Received resp to getwd
[2012-04-25 00:53:15.635688] E [resource:169:errfail] Popen: /usr/local/libexec/glusterfs/gsyncd> [2012-04-25 00:53:15.580991] I [input.c:46:cli_batch] 0-: Exiting with: 0
[2012-04-25 00:53:15.635782] I [syncdutils:142:finalize] <top>: exiting.
[2012-04-25 00:53:25.647176] I [monitor(monitor):80:monitor] Monitor: ------------------------------------------------------------
[2012-04-25 00:53:25.647412] I [monitor(monitor):81:monitor] Monitor: starting gsyncd worker
[2012-04-25 00:53:25.688819] I [gsyncd:355:main_i] <top>: syncing: gluster://localhost:master_sec -> gluster://10.16.157.30:slave2
[2012-04-25 00:53:25.793880] E [syncdutils:173:log_raise_exception] <top>: connection to peer is broken
[2012-04-25 00:53:25.794087] E [resource:166:errfail] Popen: command "/usr/local/libexec/glusterfs/gsyncd --session-owner b4a5e609-d195-427f-8a73-6e10d2c3c17c -N --listen --timeout 120 gluster://10.16.157.30:slave2" returned with 1, saying:
[2012-04-25 00:53:25.794204] E [resource:169:errfail] Popen: /usr/local/libexec/glusterfs/gsyncd> [2012-04-25 00:53:25.692889] W [rpc-transport.c:183:rpc_transport_load] 0-rpc-transport: missing 'option transport-type'. defaulting to "socket"
[2012-04-25 00:53:25.794312] E [resource:169:errfail] Popen: /usr/local/libexec/glusterfs/gsyncd> [2012-04-25 00:53:25.738895] I [cli-rpc-ops.c:4255:gf_cli3_1_getwd_cbk] 0-cli: Received resp to getwd
[2012-04-25 00:53:25.794401] E [resource:169:errfail] Popen: /usr/local/libexec/glusterfs/gsyncd> [2012-04-25 00:53:25.738994] I [input.c:46:cli_batch] 0-: Exiting with: 0
[2012-04-25 00:53:25.794497] I [syncdutils:142:finalize] <top>: exiting.
################################################
 Slave logs 
###############################################

[2012-04-25 00:53:35.951468] E [syncdutils(slave):178:log_raise_exception] <top>: glusterfs session went down
[2012-04-25 00:53:35.951666] I [syncdutils(slave):142:finalize] <top>: exiting.
[2012-04-25 00:53:46.98125] I [gsyncd(slave):355:main_i] <top>: syncing: gluster://10.16.157.30:slave2
[2012-04-25 00:53:46.111968] E [syncdutils(slave):178:log_raise_exception] <top>: glusterfs session went down
[2012-04-25 00:53:46.112171] I [syncdutils(slave):142:finalize] <top>: exiting.
[2012-04-25 00:53:56.257071] I [gsyncd(slave):355:main_i] <top>: syncing: gluster://10.16.157.30:slave2
[2012-04-25 00:53:56.270577] E [syncdutils(slave):178:log_raise_exception] <top>: glusterfs session went down
[2012-04-25 00:53:56.270762] I [syncdutils(slave):142:finalize] <top>: exiting.
[2012-04-25 00:54:06.416770] I [gsyncd(slave):355:main_i] <top>: syncing: gluster://10.16.157.30:slave2
[2012-04-25 00:54:06.430231] E [syncdutils(slave):178:log_raise_exception] <top>: glusterfs session went down
[2012-04-25 00:54:06.430422] I [syncdutils(slave):142:finalize] <top>: exiting.
[2012-04-25 00:54:16.575644] I [gsyncd(slave):355:main_i] <top>: syncing: gluster://10.16.157.30:slave2
[2012-04-25 00:54:16.589251] E [syncdutils(slave):178:log_raise_exception] <top>: glusterfs session went down
[2012-04-25 00:54:16.589467] I [syncdutils(slave):142:finalize] <top>: exiting.
[2012-04-25 00:54:26.733683] I [gsyncd(slave):355:main_i] <top>: syncing: gluster://10.16.157.30:slave2
[2012-04-25 00:54:26.747188] E [syncdutils(slave):178:log_raise_exception] <top>: glusterfs session went down
[2012-04-25 00:54:26.747419] I [syncdutils(slave):142:finalize] <top>: exiting.
[2012-04-25 00:54:36.892220] I [gsyncd(slave):355:main_i] <top>: syncing: gluster://10.16.157.30:slave2
[2012-04-25 00:54:36.905839] E [syncdutils(slave):178:log_raise_exception] <top>: glusterfs session went down
[2012-04-25 00:54:36.906047] I [syncdutils(slave):142:finalize] <top>: exiting.
[2012-04-25 00:54:47.50266] I [gsyncd(slave):355:main_i] <top>: syncing: gluster://10.16.157.30:slave2
[2012-04-25 00:54:47.63973] E [syncdutils(slave):178:log_raise_exception] <top>: glusterfs session went down
[2012-04-25 00:54:47.64186] I [syncdutils(slave):142:finalize] <top>: exiting.
######################################################
slave gluster logs
#######################################################

[2012-04-25 00:55:07.377849] I [glusterfsd-mgmt.c:1664:mgmt_rpc_notify] 0-glusterfsd-mgmt: -1 connect attempts left
[2012-04-25 00:55:07.377963] W [glusterfsd.c:794:cleanup_and_exit] (-->/usr/local/lib/libgfrpc.so.0(rpc_transport_notify+0x130) [0x7f513ae55ec4] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x219) [0x7f513ae59d27] (-->/usr/local/sbin/glusterfs() [0x40ce63]))) 0-: received signum (1), shutting down
[2012-04-25 00:55:07.377996] I [fuse-bridge.c:4649:fini] 0-fuse: Unmounting '/tmp/gsyncd-aux-mount-He7gRK'.
[2012-04-25 00:55:17.533616] I [glusterfsd.c:1629:main] 0-/usr/local/sbin/glusterfs: Started running /usr/local/sbin/glusterfs version 3git
[2012-04-25 00:55:17.534071] E [mount.c:580:fuse_mount_sys] 0-glusterfs-fuse: calling mount
[2012-04-25 00:55:17.534344] E [mount.c:583:fuse_mount_sys] 0-glusterfs-fuse: mount returned 0
[2012-04-25 00:55:17.534635] E [mount.c:625:fuse_mount_sys] 0-glusterfs-fuse: writing status
[2012-04-25 00:55:17.534666] E [mount.c:637:fuse_mount_sys] 0-glusterfs-fuse: Mount child exiting
[2012-04-25 00:55:17.537444] E [socket.c:1724:socket_connect_finish] 0-glusterfs: connection to  failed (No route to host)
[2012-04-25 00:55:17.537540] E [glusterfsd-mgmt.c:1661:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: Transport endpoint is not connected
[2012-04-25 00:55:17.537553] I [glusterfsd-mgmt.c:1664:mgmt_rpc_notify] 0-glusterfsd-mgmt: -1 connect attempts left
[2012-04-25 00:55:17.537671] W [glusterfsd.c:794:cleanup_and_exit] (-->/usr/local/lib/libgfrpc.so.0(rpc_transport_notify+0x130) [0x7f5269231ec4] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x219) [0x7f5269235d27] (-->/usr/local/sbin/glusterfs() [0x40ce63]))) 0-: received signum (1), shutting down
[2012-04-25 00:55:17.537696] I [fuse-bridge.c:4649:fini] 0-fuse: Unmounting '/tmp/gsyncd-aux-mount-5MDAA8'.
[2012-04-25 00:55:27.692183] I [glusterfsd.c:1629:main] 0-/usr/local/sbin/glusterfs: Started running /usr/local/sbin/glusterfs version 3git
[2012-04-25 00:55:27.692650] E [mount.c:580:fuse_mount_sys] 0-glusterfs-fuse: calling mount
[2012-04-25 00:55:27.692877] E [mount.c:583:fuse_mount_sys] 0-glusterfs-fuse: mount returned 0
[2012-04-25 00:55:27.693175] E [mount.c:625:fuse_mount_sys] 0-glusterfs-fuse: writing status
[2012-04-25 00:55:27.693217] E [mount.c:637:fuse_mount_sys] 0-glusterfs-fuse: Mount child exiting
[2012-04-25 00:55:27.696134] E [socket.c:1724:socket_connect_finish] 0-glusterfs: connection to  failed (No route to host)
[2012-04-25 00:55:27.696197] E [glusterfsd-mgmt.c:1661:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: Transport endpoint is not connected
[2012-04-25 00:55:27.696210] I [glusterfsd-mgmt.c:1664:mgmt_rpc_notify] 0-glusterfsd-mgmt: -1 connect attempts left
[2012-04-25 00:55:27.696328] W [glusterfsd.c:794:cleanup_and_exit] (-->/usr/local/lib/libgfrpc.so.0(rpc_transport_notify+0x130) [0x7f48b0647ec4] (-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x219) [0x7f48b064bd27] (-->/usr/local/sbin/glusterfs() [0x40ce63]))) 0-: received signum (1), shutting down
[2012-04-25 00:55:27.696355] I [fuse-bridge.c:4649:fini] 0-fuse: Unmounting '/tmp/gsyncd-aux-mount-3f_LQn'.
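
The retry loop above repeats the same handful of messages, so it can help to collapse the log to its distinct E-level entries. The following is only a rough sketch: the line shape (timestamp, level letter, location, message) is inferred from the logs pasted in this report, and the names LOG_LINE and summarize_errors are hypothetical helpers, not part of gsyncd or glusterfs.

```python
import re
from collections import Counter

# Assumed line shape, inferred from the logs in this report:
#   [timestamp] LEVEL [location] message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) \[(?P<loc>[^\]]+)\] (?P<msg>.*)$"
)


def summarize_errors(lines):
    """Count E-level entries, keyed by (location, message)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "E":
            counts[(match.group("loc"), match.group("msg").strip())] += 1
    return counts


# Usage (the log path is hypothetical):
#   with open("slave-gluster.log") as fh:
#       for (loc, msg), n in summarize_errors(fh).most_common():
#           print(n, loc, msg)
```

Run over the slave gluster log, this would group the repeated socket_connect_finish entries ("No route to host") together. Note that the fuse_mount_sys lines are also logged at E level in this build, so they will show up in the summary even though they look informational.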

Comment 1 Vijaykumar Koppad 2012-04-26 13:04:59 UTC
I found that this issue is caused by iptables. Running iptables -F resolved the issue. Flushing the firewall rules is not recommended for customers, though, so we will address this with proper documentation.
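
To confirm a firewall problem like this without flushing any rules, a plain TCP connect from the master to the slave host can be tried first. This is a minimal sketch, not something taken from gsyncd: probe_tcp is a hypothetical helper, and port 24007 is assumed to be the glusterd management port on the slave. Adjust it if your setup differs.

```python
import errno
import socket

GLUSTERD_PORT = 24007  # assumption: glusterd management port on the slave


def probe_tcp(host, port=GLUSTERD_PORT, timeout=3.0):
    """Try one TCP connect to host:port and return (ok, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        # No answer at all: packets may be silently dropped.
        return False, "timed out (packets possibly dropped by a firewall)"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Same error text as in the slave gluster logs.
            return False, "no route to host (possibly a firewall REJECT rule)"
        if exc.errno == errno.ECONNREFUSED:
            return False, "connection refused (nothing listening on the port)"
        return False, str(exc)


# Usage, with the slave address from the logs in this report:
#   print(probe_tcp("10.16.157.30"))
```

A "no route to host" result between hosts that can otherwise reach each other points toward a reject rule rather than a routing problem. In that case, opening only the ports gluster needs is a safer fix than iptables -F.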

