+++ This bug was initially created as a clone of Bug #1373976 +++

Description of problem:
=======================

While syncing data using tar, the sync completes but many tar processes become defunct.

[root@dhcp41-167 ~]# ps -eaf | grep tar
root 12520 4519 1 17:19 ? 00:00:00 tar --sparse -cf - --files-from -
root 12521 4522 1 17:19 ? 00:00:00 tar --sparse -cf - --files-from -
root 12522 4519 6 17:19 ? 00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root.41.203 tar --overwrite -xf - -C /proc/22664/cwd
root 12523 4522 6 17:19 ? 00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root.41.203 tar --overwrite -xf - -C /proc/22663/cwd
root 12524 4510 1 17:19 ? 00:00:00 tar --sparse -cf - --files-from -
root 12525 4510 10 17:19 ? 00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root.41.203 tar --overwrite -xf - -C /proc/22665/cwd
root 12526 4498 0 17:19 ? 00:00:00 tar --sparse -cf - --files-from -
root 12527 4498 0 17:19 ? 00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root.41.203 tar --overwrite -xf - -C /proc/22662/cwd
root 12529 4186 0 17:19 pts/0 00:00:00 grep tar
[root@dhcp41-167 ~]#

[root@dhcp41-167 ~]# ps -eaf | grep tar
root 12520 4519 1 17:19 ? 00:00:00 [tar] <defunct>
root 12521 4522 0 17:19 ? 00:00:00 tar --sparse -cf - --files-from -
root 12523 4522 5 17:19 ? 00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root.41.203 tar --overwrite -xf - -C /proc/22663/cwd
root 12524 4510 1 17:19 ? 00:00:00 tar --sparse -cf - --files-from -
root 12525 4510 7 17:19 ? 00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root.41.203 tar --overwrite -xf - -C /proc/22665/cwd
root 12526 4498 0 17:19 ? 00:00:00 tar --sparse -cf - --files-from -
root 12527 4498 1 17:19 ? 00:00:00 ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/tar_ssh.pem -p 22 root.41.203 tar --overwrite -xf - -C /proc/22662/cwd
root 12531 4186 0 17:19 pts/0 00:00:00 grep tar
[root@dhcp41-167 ~]#

[root@dhcp41-167 ~]# ps -eaf | grep tar
root 12520 4519 0 17:19 ? 00:00:00 [tar] <defunct>
root 12521 4522 0 17:19 ? 00:00:00 [tar] <defunct>
root 12524 4510 0 17:19 ? 00:00:00 [tar] <defunct>
root 12526 4498 0 17:19 ? 00:00:00 [tar] <defunct>
root 12533 4186 0 17:19 pts/0 00:00:00 grep tar
[root@dhcp41-167 ~]#

[root@dhcp41-167 ~]# ps -eaf | grep tar
root 12520 4519 0 17:19 ? 00:00:00 [tar] <defunct>
root 12521 4522 0 17:19 ? 00:00:00 [tar] <defunct>
root 12524 4510 0 17:19 ? 00:00:00 [tar] <defunct>
root 12526 4498 0 17:19 ? 00:00:00 [tar] <defunct>
root 12543 4186 0 17:19 pts/0 00:00:00 grep tar
[root@dhcp41-167 ~]#

Steps to Reproduce:
===================

1. Set up geo-rep between master and slave
2. Set config parameter use-tarssh true
3. Start geo-replication
4. Write some data on master volume
5. Monitor tar processes on master nodes using "ps -eaf | grep tar"

Actual results:
===============

Data at master and slave is synced and the arequal checksum matches. However, many tar processes become defunct.

[root@dhcp41-167 ~]# ps -eaf | grep tar
root 12520 4519 0 17:19 ? 00:00:00 [tar] <defunct>
root 12521 4522 0 17:19 ? 00:00:00 [tar] <defunct>
root 12524 4510 0 17:19 ? 00:00:00 [tar] <defunct>
root 12526 4498 0 17:19 ? 00:00:00 [tar] <defunct>
root 12543 4186 0 17:19 pts/0 00:00:00 grep tar
[root@dhcp41-167 ~]#

Expected results:
=================

No tar process should be defunct
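
For anyone repeating the monitoring step, the defunct entries can be picked out of the "ps -eaf" output programmatically. This is a minimal sketch, not part of geo-replication: the function name find_defunct is hypothetical, and it assumes the column layout seen in the output above (UID, PID, PPID, C, STIME, TTY, TIME, then the command).

```python
def find_defunct(ps_output, name="tar"):
    """Return (pid, ppid) pairs for `ps -eaf` lines showing `[name] <defunct>`.

    Hypothetical helper for illustration; assumes the command starts at the
    eighth whitespace-separated column, as in the output quoted in this report.
    """
    found = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the header, shell prompts and anything too short to be a ps row.
        if len(fields) < 9 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        cmd = fields[7:]
        # A zombie is shown as "[tar] <defunct>" instead of its full argv.
        if cmd[0] == "[%s]" % name and "<defunct>" in cmd[1:]:
            found.append((int(fields[1]), int(fields[2])))
    return found
```

The PPID in each pair identifies the parent that has not yet waited on the child, which in the output above is a different parent for each defunct tar.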
REVIEW: http://review.gluster.org/15426 (geo-rep: Defunct tar process after sync) posted (#1) for review on master by Aravinda VK (avishwan)
REVIEW: http://review.gluster.org/15426 (geo-rep: Defunct tar process after sync) posted (#2) for review on master by Aravinda VK (avishwan)
COMMIT: http://review.gluster.org/15426 committed in master by Aravinda VK (avishwan)
------
commit 6b30e9bf5a612e105eb7ded0a89ef25fd8530ba5
Author: Aravinda VK <avishwan>
Date:   Thu Sep 8 17:30:37 2016 +0530

    geo-rep: Defunct tar process after sync

    After every sync iteration with tarssh mode leaves defunct tar
    process. Added wait for tar process to prevent this issue.

    BUG: 1374286
    Change-Id: I9953239ef601cc1970c814b00074b45eb00f481e
    Signed-off-by: Aravinda VK <avishwan>
    Reviewed-on: http://review.gluster.org/15426
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Saravanakumar Arumugam <sarumuga>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Kotresh HR <khiremat>
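
The commit message says a wait was added for the tar process. The general pattern can be sketched as follows: a child that has exited stays in the process table as <defunct> until its parent collects its exit status. This is an illustration under stated assumptions, not the actual patch: the function name pipe_and_reap and its parameters are hypothetical, and the two commands stand in for the local tar and the ssh-wrapped remote tar seen in the ps output.

```python
import subprocess
import sys


def pipe_and_reap(producer_cmd, consumer_cmd, data):
    """Run `producer_cmd | consumer_cmd`, feed `data` to the producer, reap both.

    Hypothetical helper for illustration only. Suitable for small inputs:
    a large `data` written in one call could block if the pipes fill up.
    """
    producer = subprocess.Popen(
        producer_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE)

    # Close the parent's copy of the pipe so the producer gets SIGPIPE
    # if the consumer exits early, and the consumer sees EOF at the end.
    producer.stdout.close()

    # In the tarssh case this would be the newline-separated file list.
    producer.stdin.write(data)
    producer.stdin.close()

    # communicate() waits for the consumer and collects its output.
    out, _ = consumer.communicate()

    # Without this wait() the exited producer is never reaped and shows
    # up in ps as "<defunct>" for as long as the parent keeps running.
    producer.wait()

    return producer.returncode, consumer.returncode, out
```

Waiting only on the last process in the pipeline is an easy mistake to make, because the transfer is complete once the consumer exits. The producer's exit status still has to be collected separately.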
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/