| Summary: | geo-replication fails with mesg "connection to peer is broken" | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Lakshmipathi G <lakshmipathi> | ||||
| Component: | geo-replication | Assignee: | Csaba Henk <csaba> | ||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |||||
| Severity: | high | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 3.3-beta | CC: | gluster-bugs, rahulcs, vijay, vshankar | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | Type: | --- | |||||
| Regression: | RTNR | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
Description
Csaba Henk
2011-10-03 10:03:33 UTC
There are some problems with logging to /dev/stderr by the slave-side CLI on CentOS 5.2 (kernel 2.6.18-238.el5).
Debugging with strace, we can see:
open("/dev/stderr", O_WRONLY|O_CREAT|O_APPEND, 0666) = -1 ENXIO (No such device or address)
See the following test program:
# echo '#!/bin/sh
echo foo > "$1"' > /tmp/test.sh
# chmod a+x /tmp/test.sh
# /tmp/test.sh /dev/stderr
foo
# ssh localhost /tmp/test.sh /dev/stderr
/tmp/test.sh: line 2: /dev/stderr: No such device or address
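The same check can be sketched without ssh. The Python snippet below is an illustration, not part of the original report: it runs a child process whose stderr is either a socket or a pipe, and the child tries to open /dev/stderr with the flags seen in the strace output. The names `CHILD` and `probe_stderr_open` are hypothetical. On Linux systems that behave like the one reported here, the socket case is expected to report ENXIO; other platforms may differ.

```python
import socket
import subprocess
import sys

# Child program: open /dev/stderr the way a logger would (same flags as in
# the strace output) and print "OK" or the errno name on stdout.
CHILD = r'''
import errno, os
try:
    fd = os.open("/dev/stderr", os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o666)
    os.close(fd)
    print("OK")
except OSError as e:
    print(errno.errorcode.get(e.errno, str(e.errno)))
'''


def probe_stderr_open(use_socket):
    """Run CHILD with stderr attached to a socket or a pipe; return its report."""
    cmd = [sys.executable, "-c", CHILD]
    if use_socket:
        # Give the child one end of a socket pair as its stderr.
        parent_end, child_end = socket.socketpair()
        try:
            proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                                  stderr=child_end.fileno())
        finally:
            parent_end.close()
            child_end.close()
    else:
        # Ordinary case: stderr is a pipe read by the parent.
        proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.PIPE)
    return proc.stdout.decode().strip()


if __name__ == "__main__":
    print("stderr is a pipe:  ", probe_stderr_open(False))
    print("stderr is a socket:", probe_stderr_open(True))
```

Comparing the two results on the affected host separates "stderr is a socket" from any other effect of going through ssh.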
Apparently, opening /dev/stderr for writing fails with ENXIO on this system if stderr is a socket. I don't know whether this is a bug or a feature.
It can be worked around with the attached hotfix, which sends those logs to /dev/null instead.
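The idea behind such a workaround can be sketched in Python as follows. This is not the attached hotfix and not gsyncd code; `open_log_fd` is a hypothetical helper that only illustrates the fallback: try the requested log path, and if the open fails with ENXIO, use /dev/null instead.

```python
import errno
import os


def open_log_fd(path, fallback=os.devnull):
    """Open `path` for appending; on ENXIO, open `fallback` instead.

    Hypothetical helper for illustration only. Other errors are re-raised
    so that real misconfiguration (e.g. a permission problem) stays visible.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
    try:
        return os.open(path, flags, 0o666)
    except OSError as e:
        if e.errno != errno.ENXIO:
            raise
        # The target cannot be opened for writing (for example /dev/stderr
        # backed by a socket), so discard the logs rather than fail.
        return os.open(fallback, os.O_WRONLY)


if __name__ == "__main__":
    fd = open_log_fd("/dev/stderr")
    os.write(fd, b"log line\n")
    os.close(fd)
```

Falling back only on ENXIO keeps the change narrow. The cost is that logs are silently dropped on affected systems, which matches the trade-off described for the hotfix.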
Starting geo-replication with glusterfs-3.3qa13 fails with the following error message. (On the same setup, if I install glfs-3.2.3, it works.)
# gluster volume geo-replication pythonchk root.11.140:/pychk2 start
# cat /usr/local/var/log/glusterfs/geo-replication/pythonchk/ssh%3A%2F%2Froot%4010.1.11.140%3Afile%3A%2F%2F%2Fpychk2.log
[2011-10-03 04:23:12.913885] I [monitor(monitor):22:set_state] Monitor: new state: starting...
[2011-10-03 04:23:12.918194] I [monitor(monitor):63:monitor] Monitor: ------------------------------------------------------------
[2011-10-03 04:23:12.918314] I [monitor(monitor):64:monitor] Monitor: starting gsyncd worker
[2011-10-03 04:23:12.964476] I [gsyncd:352:main_i] <top>: syncing: gluster://localhost:pythonchk -> ssh://root.11.140:/pychk2
[2011-10-03 04:23:13.182882] E [syncdutils:171:log_raise_exception] <top>: connection to peer is broken
[2011-10-03 04:23:13.183144] E [resource:166:errfail] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /etc/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-POjX7l/gsycnd-ssh-%r@%h:%p root.11.140 /usr/local/libexec/glusterfs/gsyncd --session-owner b4efd5e8-c72a-478b-88cb-0dad5298aeaf -N --listen --timeout 120 file:///pychk2" returned with 1, saying:
[2011-10-03 04:23:13.183264] E [resource:170:errfail] Popen: ssh> Warning: Identity file /etc/glusterd/geo-replication/secret.pem not accessible: No such file or directory.
[2011-10-03 04:23:13.183352] E [resource:170:errfail] Popen: ssh> ERROR: failed to open logfile "/dev/stderr" (No such device or address)
[2011-10-03 04:23:13.183438] E [resource:170:errfail] Popen: ssh> ERROR: failed to open logfile /dev/stderr
[2011-10-03 04:23:13.183521] E [resource:170:errfail] Popen: ssh> gsyncd initializaion failed
[2011-10-03 04:23:13.183667] I [syncdutils:140:finalize] <top>: exiting.
[2011-10-03 04:23:14.185211] I [monitor(monitor):22:set_state] Monitor: new state: faulty
[2011-10-03 04:23:24.188642] I [monitor(monitor):63:monitor] Monitor: ------------------------------------------------------------
[2011-10-03 04:23:24.188827] I [monitor(monitor):64:monitor] Monitor: starting gsyncd worker
[2011-10-03 04:23:24.235424] I [gsyncd:352:main_i] <top>: syncing: gluster://localhost:pythonchk -> ssh://root.11.140:/pychk2
[2011-10-03 04:23:24.388060] E [syncdutils:171:log_raise_exception] <top>: connection to peer is broken
[2011-10-03 04:23:24.388242] E [resource:166:errfail] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /etc/glusterd/geo-replication/secret.pem -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-ISX8fc/gsycnd-ssh-%r@%h:%p root.11.140 /usr/local/libexec/glusterfs/gsyncd --session-owner b4efd5e8-c72a-478b-88cb-0dad5298aeaf -N --listen --timeout 120 file:///pychk2" returned with 1, saying:
[2011-10-03 04:23:24.388345] E [resource:170:errfail] Popen: ssh> Warning: Identity file /etc/glusterd/geo-replication/secret.pem not accessible: No such file or directory.
[2011-10-03 04:23:24.388432] E [resource:170:errfail] Popen: ssh> ERROR: failed to open logfile "/dev/stderr" (No such device or address)
[2011-10-03 04:23:24.388550] E [resource:170:errfail] Popen: ssh> ERROR: failed to open logfile /dev/stderr
[2011-10-03 04:23:24.388641] E [resource:170:errfail] Popen: ssh> gsyncd initializaion failed
[2011-10-03 04:23:24.388806] I [syncdutils:140:finalize] <top>: exiting.

CHANGE: http://review.gluster.com/560 (This works around broken /dev/stderr on some systems.) merged in master by Vijay Bellur (vijay)