Bug 1374286 - [geo-rep]: defunct tar process while using tar+ssh sync
Summary: [geo-rep]: defunct tar process while using tar+ssh sync
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Aravinda VK
QA Contact:
URL:
Whiteboard:
Depends On: 1373976
Blocks: 1375541 1375542 1375543
Reported: 2016-09-08 12:00 UTC by Aravinda VK
Modified: 2017-03-06 17:25 UTC (History)

Fixed In Version: glusterfs-3.10.0
Doc Text:
Clone Of: 1373976
: 1375541 1375542 1375543
Environment:
Last Closed: 2017-03-06 17:25:17 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Aravinda VK 2016-09-08 12:00:00 UTC
+++ This bug was initially created as a clone of Bug #1373976 +++

Description of problem:
=======================

While syncing data using tar over SSH, the sync completes, but many tar processes are left defunct.

[root@dhcp41-167 ~]# ps -eaf | grep tar
root     12520  4519  1 17:19 ?        00:00:00 tar --sparse -cf - --files-from -
root     12521  4522  1 17:19 ?        00:00:00 tar --sparse -cf - --files-from -
root     12522  4519  6 17:19 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root@10.70.41.203 tar --overwrite -xf - -C /proc/22664/cwd
root     12523  4522  6 17:19 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root@10.70.41.203 tar --overwrite -xf - -C /proc/22663/cwd
root     12524  4510  1 17:19 ?        00:00:00 tar --sparse -cf - --files-from -
root     12525  4510 10 17:19 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root@10.70.41.203 tar --overwrite -xf - -C /proc/22665/cwd
root     12526  4498  0 17:19 ?        00:00:00 tar --sparse -cf - --files-from -
root     12527  4498  0 17:19 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root@10.70.41.203 tar --overwrite -xf - -C /proc/22662/cwd
root     12529  4186  0 17:19 pts/0    00:00:00 grep tar
[root@dhcp41-167 ~]#
[root@dhcp41-167 ~]# ps -eaf | grep tar
root     12520  4519  1 17:19 ?        00:00:00 [tar] <defunct>
root     12521  4522  0 17:19 ?        00:00:00 tar --sparse -cf - --files-from -
root     12523  4522  5 17:19 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root@10.70.41.203 tar --overwrite -xf - -C /proc/22663/cwd
root     12524  4510  1 17:19 ?        00:00:00 tar --sparse -cf - --files-from -
root     12525  4510  7 17:19 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root@10.70.41.203 tar --overwrite -xf - -C /proc/22665/cwd
root     12526  4498  0 17:19 ?        00:00:00 tar --sparse -cf - --files-from -
root     12527  4498  1 17:19 ?        00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root@10.70.41.203 tar --overwrite -xf - -C /proc/22662/cwd
root     12531  4186  0 17:19 pts/0    00:00:00 grep tar
[root@dhcp41-167 ~]#
[root@dhcp41-167 ~]# ps -eaf | grep tar
root     12520  4519  0 17:19 ?        00:00:00 [tar] <defunct>
root     12521  4522  0 17:19 ?        00:00:00 [tar] <defunct>
root     12524  4510  0 17:19 ?        00:00:00 [tar] <defunct>
root     12526  4498  0 17:19 ?        00:00:00 [tar] <defunct>
root     12533  4186  0 17:19 pts/0    00:00:00 grep tar
[root@dhcp41-167 ~]#
[root@dhcp41-167 ~]# ps -eaf | grep tar
root     12520  4519  0 17:19 ?        00:00:00 [tar] <defunct>
root     12521  4522  0 17:19 ?        00:00:00 [tar] <defunct>
root     12524  4510  0 17:19 ?        00:00:00 [tar] <defunct>
root     12526  4498  0 17:19 ?        00:00:00 [tar] <defunct>
root     12543  4186  0 17:19 pts/0    00:00:00 grep tar
[root@dhcp41-167 ~]# 


Steps to Reproduce:
===================
1. Set up geo-rep between master and slave
2. Set the config parameter use-tarssh to true
3. Start geo-replication
4. Write some data on the master volume
5. Monitor tar processes on the master nodes using "ps -eaf | grep tar"

Actual results:
===============

Data on master and slave is synced and the arequal checksums match. However, many tar processes are left defunct.
[root@dhcp41-167 ~]# ps -eaf | grep tar
root     12520  4519  0 17:19 ?        00:00:00 [tar] <defunct>
root     12521  4522  0 17:19 ?        00:00:00 [tar] <defunct>
root     12524  4510  0 17:19 ?        00:00:00 [tar] <defunct>
root     12526  4498  0 17:19 ?        00:00:00 [tar] <defunct>
root     12543  4186  0 17:19 pts/0    00:00:00 grep tar
[root@dhcp41-167 ~]# 


Expected results:
=================
No tar process should be left defunct after the sync completes.
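For context: a process is listed as <defunct> when it has exited but its parent has not yet collected its exit status. The following is a minimal sketch of that behaviour on Linux, not code from geo-replication. The helper names (proc_state, zombie_then_reaped) and the use of the "true" command as a short-lived child are assumptions made for illustration.

```python
import subprocess
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the pid is gone."""
    try:
        with open("/proc/%d/stat" % pid) as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # Layout is "pid (comm) state ..."; comm may contain spaces or ")",
    # so split on the last ")" before reading the state field.
    return data.rsplit(")", 1)[1].split()[0]


def zombie_then_reaped():
    """Start a short-lived child, observe it as a zombie, then reap it."""
    child = subprocess.Popen(["true"])  # exits almost immediately
    deadline = time.monotonic() + 5
    state = proc_state(child.pid)
    # Until the parent waits, the exited child stays in state "Z" (defunct).
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = proc_state(child.pid)
    before = state
    child.wait()  # collecting the exit status removes the process table entry
    after = proc_state(child.pid)
    return before, after


if __name__ == "__main__":
    print(zombie_then_reaped())
```

The state "Z" here corresponds to the "[tar] <defunct>" entries in the ps output above: the child has finished, and only the missing wait keeps its entry in the process table.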

Comment 1 Worker Ant 2016-09-08 12:02:46 UTC
REVIEW: http://review.gluster.org/15426 (geo-rep: Defunct tar process after sync) posted (#1) for review on master by Aravinda VK (avishwan@redhat.com)

Comment 2 Worker Ant 2016-09-08 16:13:22 UTC
REVIEW: http://review.gluster.org/15426 (geo-rep: Defunct tar process after sync) posted (#2) for review on master by Aravinda VK (avishwan@redhat.com)

Comment 3 Worker Ant 2016-09-13 10:59:31 UTC
COMMIT: http://review.gluster.org/15426 committed in master by Aravinda VK (avishwan@redhat.com) 
------
commit 6b30e9bf5a612e105eb7ded0a89ef25fd8530ba5
Author: Aravinda VK <avishwan@redhat.com>
Date:   Thu Sep 8 17:30:37 2016 +0530

    geo-rep: Defunct tar process after sync
    
    After every sync iteration with tarssh mode leaves defunct tar
    process.
    
    Added wait for tar process to prevent this issue.
    
    BUG: 1374286
    Change-Id: I9953239ef601cc1970c814b00074b45eb00f481e
    Signed-off-by: Aravinda VK <avishwan@redhat.com>
    Reviewed-on: http://review.gluster.org/15426
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Reviewed-by: Saravanakumar Arumugam <sarumuga@redhat.com>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Kotresh HR <khiremat@redhat.com>
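The commit message above says a wait was added for the tar process. The following is a minimal sketch of that idea, not the actual patch: a producer process is piped into a consumer process, and the parent waits on both so that neither is left defunct. The function name run_pipeline and the use of "cat" as a stand-in for both commands are assumptions for illustration. In the reported setup the producer would correspond to the local tar command and the consumer to the ssh command seen in the ps output.

```python
import subprocess


def run_pipeline(producer_cmd, consumer_cmd, file_names):
    """Pipe producer into consumer, feed file names to the producer, reap both."""
    producer = subprocess.Popen(
        producer_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.DEVNULL
    )
    # Close the parent's copy of the pipe so the producer gets SIGPIPE
    # if the consumer exits early.
    producer.stdout.close()

    # One name per line, as a "--files-from -" style reader would expect.
    for name in file_names:
        producer.stdin.write(name.encode() + b"\n")
    producer.stdin.close()  # flush and signal end of input

    # Waiting on the consumer alone would leave the producer as a zombie;
    # wait on both to collect both exit statuses.
    producer_rc = producer.wait()
    consumer_rc = consumer.wait()
    return producer_rc, consumer_rc


if __name__ == "__main__":
    # Stand-in commands only; no tar or ssh is run here.
    print(run_pipeline(["cat"], ["cat"], ["a.txt", "b.txt"]))
```

This sketch writes the whole list before waiting, which is fine for short lists. A production version would also need to handle a producer that exits before all input is written.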

Comment 4 Shyamsundar 2017-03-06 17:25:17 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/

