Bug 1100107 - xfsprogs: xfs_copy succeeds but exits with error code
Summary: xfsprogs: xfs_copy succeeds but exits with error code
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: xfsprogs
Version: 6.5
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Eric Sandeen
QA Contact: Eryu Guan
URL:
Whiteboard:
Depends On:
Blocks: 1100376
Reported: 2014-05-22 03:27 UTC by Junxiao Bi
Modified: 2014-10-14 07:49 UTC (History)

Fixed In Version: xfsprogs-3.1.1-16.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1100376 (view as bug list)
Environment:
Last Closed: 2014-10-14 07:49:55 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2014:1564 | 0 | normal | SHIPPED_LIVE | xfsprogs bug update | 2014-10-14 01:27:44 UTC

Description Junxiao Bi 2014-05-22 03:27:23 UTC
Description of problem:
xfs_copy uses SIGKILL to kill its child threads before exiting. Because SIGKILL sent to a thread terminates the whole process, xfs_copy exits with error code 137 even when the copy itself succeeds, so a script cannot tell whether the copy succeeded.

The following patch fixes it.

From b3580a15e10e153d7443a2e0c05f570d94b9b5a6 Mon Sep 17 00:00:00 2001
From: Junxiao Bi <junxiao.bi@oracle.com>
Date: Tue, 6 May 2014 14:27:31 +0800
Subject: [PATCH] xfsprogs: xfs_copy: use exit() to replace killall()

Sending a SIGKILL signal to a child thread terminates the whole process,
so xfs_copy returns the error value 137. This makes it hard for a script
to know whether the copy succeeded.

Calling exit() in the main thread terminates the whole process and returns
the right value. Replace killall()+abort() with exit(1) to match the old
exit behaviour in the error case. Also remove killall()+pthread_exit(NULL),
since return 0 will be followed by an exit(0) to terminate the process.

Bug story from Christoph Hellwig:
Btw, I think the reason for this cruft is that xfs_copy was originally
written using the IRIX sproc interface, and the port to pthreads didn't
remove this gem:

http://marc.info/?l=linux-xfs&m=99535721110020&w=2

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joe jin <joe.jin@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Haxby <john.haxby@oracle.com>
Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
---
 copy/xfs_copy.c |   30 +-----------------------------
 1 files changed, 1 insertions(+), 29 deletions(-)

diff --git a/copy/xfs_copy.c b/copy/xfs_copy.c
index 39517da..39bb9d7 100644
--- a/copy/xfs_copy.c
+++ b/copy/xfs_copy.c
@@ -217,25 +217,6 @@ handle_error:
 }
 
 void
-killall(void)
-{
-	int i;
-
-	/* only the parent gets to kill things */
-
-	if (getpid() != parent_pid)
-		return;
-
-	for (i = 0; i < num_targets; i++)  {
-		if (target[i].state == ACTIVE)  {
-			/* kill up target threads */
-			pthread_kill(target[i].pid, SIGKILL);
-			pthread_mutex_unlock(&targ[i].wait);
-		}
-	}
-}
-
-void
 handler(int sig)
 {
 	pid_t	pid = getpid();
@@ -400,8 +381,7 @@ read_wbuf(int fd, wbuf *buf, xfs_mount_t *mp)
 	if (buf->length > buf->size)  {
 		do_warn(_("assert error:  buf->length = %d, buf->size = %d\n"),
 			buf->length, buf->size);
-		killall();
-		abort();
+		exit(1);
 	}
 
 	if ((res = read(fd, buf->data, buf->length)) < 0)  {
@@ -591,11 +571,6 @@ main(int argc, char **argv)
 
 	parent_pid = getpid();
 
-	if (atexit(killall))  {
-		do_log(_("%s: couldn't register atexit function.\n"), progname);
-		die_perror();
-	}
-
 	/* open up source -- is it a file? */
 
 	open_flags = O_RDONLY;
@@ -1154,9 +1129,6 @@ main(int argc, char **argv)
 	}
 
 	check_errors();
-	killall();
-	pthread_exit(NULL);
-	/*NOTREACHED*/
 	return 0;
 }
 
-- 
1.7.1


Version-Release number of selected component (if applicable):
xfsprogs-3.1.1

How reproducible:


Steps to Reproduce:
1. xfs_copy source target
2. echo $?

Actual results:


Expected results:


Additional info:

Comment 2 Eric Sandeen 2014-05-22 16:58:05 UTC
Yep, may as well fix this.  It is committed upstream now:

http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/cmds/xfsprogs.git;a=commitdiff;h=2277ce35c37c75aa3c146261d5abe32f9cc39baa

Comment 4 Eryu Guan 2014-06-29 12:29:03 UTC
Verified with /kernel/filesystems/xfs/1104956-xfs_copy-corrupt, test passed with xfsprogs-3.1.1-16.el6

Comment 5 errata-xmlrpc 2014-10-14 07:49:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1564.html

