Bug 1340756

Summary: [geo-rep]: AttributeError: 'Popen' object has no attribute 'elines'
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Rahul Hinduja <rhinduja>
Component: geo-replication Assignee: Aravinda VK <avishwan>
Status: CLOSED ERRATA QA Contact: Rahul Hinduja <rhinduja>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.1 CC: amukherj, asrivast, avishwan, csaba, hamiller, khiremat, olim, rabhat, rcyriac, rhinduja
Target Milestone: --- Keywords: ZStream
Target Release: RHGS 3.2.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: glusterfs-3.8.4-1 Doc Type: Bug Fix
Doc Text:
Previously, when the rsync command failed with an error, geo-replication attempted to retrieve the error status after the child rsync process was already closed. This caused geo-replication to fail with an elines error. The elines attribute in the error object is now initialized correctly so that this failure does not occur.
Story Points: ---
Clone Of:
: 1372193 (view as bug list) Environment:
Last Closed: 2017-03-23 05:33:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1351522, 1351530, 1372193, 1374595, 1374596, 1374597    

Description Rahul Hinduja 2016-05-30 08:57:18 UTC
Description of problem:
=======================

During a tarssh sync, with rmdir operations running from the master NFS client, the following traceback was seen on one of the master nodes:

[2016-05-29 22:40:09.701273] I [master(/bricks/brick0/master_brick1):1192:crawl] _GMaster: slave's time: (1464560999, 0)
[2016-05-29 22:40:10.335872] I [syncdutils(/bricks/brick1/master_brick6):220:finalize] <top>: exiting.
[2016-05-29 22:40:10.336153] E [syncdutils(/bricks/brick0/master_brick1):276:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 306, in twrap
    tf(*aa)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1575, in syncjob
    po.errfail()
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 242, in errfail
    self.errlog()
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 223, in errlog
    if self.elines:
AttributeError: 'Popen' object has no attribute 'elines'
[2016-05-29 22:40:10.336755] E [syncdutils(/bricks/brick0/master_brick1):252:log_raise_exception] <top>: connection to peer is broken
[2016-05-29 22:40:10.340313] E [resource(/bricks/brick0/master_brick1):226:errlog] Popen: command "ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-jsxe5z/55c207f77f58e7b6352df3f6f7e6779b.sock root.37.88 /nonexistent/gsyncd --session-owner 4f616379-ac10-4dde-a8c8-1a8c5dfb71f8 -N --listen --timeout 120 gluster://localhost:slave" returned with 255, saying:
[2016-05-29 22:40:10.341049] E [resource(/bricks/brick0/master_brick1):230:logerr] Popen: ssh> [2016-05-29 22:30:00.731420] I [cli.c:721:main] 0-cli: Started running /usr/sbin/gluster with version 3.7.9
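
To illustrate the traceback above: errlog() reads self.elines, and the AttributeError suggests the attribute was never set on that Popen object before the command failed. The following is only a minimal sketch of that failure mode and of the "initialize it up front" remedy described in the Doc Text. The class names BrokenPopen and FixedPopen, the drain_stderr() method, and the run_failing() helper are hypothetical and are not taken from resource.py.

```python
import subprocess
import sys


class BrokenPopen(subprocess.Popen):
    """Hypothetical stand-in: elines exists only after stderr is drained."""

    def drain_stderr(self):
        # This is the only place that creates the attribute.
        self.elines = self.stderr.read().decode().splitlines()

    def errlog(self):
        # Raises AttributeError if drain_stderr() never ran.
        if self.elines:
            return list(self.elines)
        return []


class FixedPopen(BrokenPopen):
    """Same behaviour, but elines always exists after construction."""

    def __init__(self, *args, **kwargs):
        self.elines = []  # initialize before anything can read it
        super().__init__(*args, **kwargs)


def run_failing(cls):
    # A child that exits non-zero, standing in for a failed rsync/ssh.
    po = cls([sys.executable, "-c", "import sys; sys.exit(3)"],
             stderr=subprocess.PIPE)
    po.wait()
    po.stderr.close()
    try:
        return po.errlog()
    except AttributeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(run_failing(BrokenPopen))
    print(run_failing(FixedPopen))
```

With the attribute initialized in the constructor, the error path can always log whatever stderr lines were collected (possibly none) instead of crashing the worker.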


Version-Release number of selected component (if applicable):
==============================================================

glusterfs-geo-replication-3.7.9-6.el7rhgs.x86_64
glusterfs-3.7.9-6.el7rhgs.x86_64


How reproducible:
=================

Seen only once so far; the frequency is unknown. Will update the BZ if it is observed again.


Steps to Reproduce:
===================
1. Run the Geo-Rep automated cases, which perform create, rmdir, and other fops. Client: NFS protocol; sync mode: tarssh.

Comment 2 Rahul Hinduja 2016-05-30 09:09:47 UTC
sosreports @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1340756/

Comment 3 Jeff 2016-07-13 21:46:03 UTC
I am receiving the same error:

[2016-07-13 19:25:52.16657] I [master(/srv/gluster):1192:crawl] _GMaster: slave's time: (1468363119, 0)
[2016-07-13 19:25:52.583666] E [syncdutils(/srv/gluster):276:log_raise_exception] <top>: FAIL: 
Traceback (most recent call last):
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/syncdutils.py", line 306, in twrap
    tf(*aa)
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/master.py", line 1575, in syncjob
    po.errfail()
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py", line 242, in errfail
    self.errlog()
  File "/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/resource.py", line 223, in errlog
    if self.elines:
AttributeError: 'Popen' object has no attribute 'elines'
[2016-07-13 19:25:52.585411] I [syncdutils(/srv/gluster):220:finalize] <top>: exiting.
[2016-07-13 19:25:52.588901] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.
[2016-07-13 19:25:52.589368] I [syncdutils(agent):220:finalize] <top>: exiting.
[2016-07-13 19:25:52.928951] I [monitor(monitor):343:monitor] Monitor: worker(/srv/gluster) died in startup phase

geo-replication status is Faulty and it appears as if the gsyncd.py process is unable to start on the slave server.

gluster-server 3.7.11-1~bpo8+1 on Debian 8.

Comment 4 Jeff 2016-07-14 23:59:55 UTC
The error I was receiving was due to rsync not being installed on the slave server.
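
For anyone hitting the same symptom, a quick check for missing binaries can rule this cause out. This is a rough sketch only: the default tool list (rsync, ssh, tar) is my assumption about what a geo-replication session might need, and missing_tools() is a hypothetical helper, not part of gsyncd. It has to be run on the slave host itself.

```python
import shutil


def missing_tools(tools=("rsync", "ssh", "tar")):
    """Return the names from `tools` that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("missing on this host: " + ", ".join(missing))
    else:
        print("all listed tools found on PATH")
```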

Comment 6 Aravinda VK 2016-09-01 07:13:31 UTC
Upstream patch sent for review:
http://review.gluster.org/#/c/15379/

Comment 8 Atin Mukherjee 2016-09-19 09:00:19 UTC
Upstream mainline : http://review.gluster.org/15379
Upstream 3.8 : http://review.gluster.org/15447
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/85005

Comment 10 Atin Mukherjee 2016-09-19 09:02:54 UTC
(In reply to Atin Mukherjee from comment #8)
> Upstream mainline : http://review.gluster.org/15379
> Upstream 3.8 : http://review.gluster.org/15447
> downstream patch : https://code.engineering.redhat.com/gerrit/#/c/85005

Correction, downstream patch link is https://code.engineering.redhat.com/gerrit/#/c/85007

Comment 20 Alok 2017-02-07 12:12:30 UTC
Approving the accelerated fix. Please note that the fix has to be in 3.2 to avoid regression.

Comment 32 errata-xmlrpc 2017-03-23 05:33:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html