1100107 – xfsprogs: xfs_copy succeeds but exits with error code

Bug 1100107 - xfsprogs: xfs_copy succeeds but exits with error code

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	xfsprogs
Sub Component:
Version:	6.5
Hardware:	All
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Eric Sandeen
QA Contact:	Eryu Guan
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1100376

Reported:	2014-05-22 03:27 UTC by Junxiao Bi
Modified:	2014-10-14 07:49 UTC
CC List:	1 user
Fixed In Version:	xfsprogs-3.1.1-16.el6
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	1100376
Environment:
Last Closed:	2014-10-14 07:49:55 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2014:1564	0	normal	SHIPPED_LIVE	xfsprogs bug update	2014-10-14 01:27:44 UTC

Description Junxiao Bi 2014-05-22 03:27:23 UTC

Description of problem:
xfs_copy uses SIGKILL to kill its child threads before it exits. SIGKILL terminates the whole process, so xfs_copy ends with status 137 (128 + signal 9) even when the copy succeeded. A script that checks the exit status therefore cannot tell whether the copy succeeded.
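
The effect can be reproduced outside xfs_copy. The following is a minimal sketch, not code from xfsprogs: the names `idle_worker`, `shell_status`, and `run_child` are hypothetical, and the worker thread only stands in for an xfs_copy target thread. A forked child either sends SIGKILL to its own worker thread, as the old code did, or calls exit(0), and the parent converts the wait status into the value a shell would show in `$?`.

```c
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for an xfs_copy target thread: it just sleeps. */
static void *idle_worker(void *arg)
{
	(void)arg;
	for (;;)
		pause();
	return NULL;
}

/* Convert a waitpid() status into the value a POSIX shell reports in $?. */
static int shell_status(int wstatus)
{
	if (WIFEXITED(wstatus))
		return WEXITSTATUS(wstatus);
	if (WIFSIGNALED(wstatus))
		return 128 + WTERMSIG(wstatus);
	return -1;
}

/*
 * Fork a child that starts one worker thread and then either
 *   use_sigkill == 1: sends SIGKILL to that thread (old behaviour), or
 *   use_sigkill == 0: calls exit(0) from the main thread (patched behaviour).
 * Returns the shell-style exit status of the child, or -1 on error.
 */
static int run_child(int use_sigkill)
{
	pid_t pid;
	int wstatus;

	assert(use_sigkill == 0 || use_sigkill == 1);

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		pthread_t tid;

		if (pthread_create(&tid, NULL, idle_worker, NULL) != 0)
			_exit(2);
		if (use_sigkill) {
			/* SIGKILL cannot be limited to one thread: the process dies. */
			pthread_kill(tid, SIGKILL);
			for (;;)
				pause();
		}
		exit(0);	/* ends the whole process, worker thread included */
	}

	if (waitpid(pid, &wstatus, 0) < 0)
		return -1;
	return shell_status(wstatus);
}
```

On Linux, where SIGKILL is signal 9, `run_child(1)` yields 137 and `run_child(0)` yields 0. Build with `-pthread` if the C library requires it.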

The following patch fixes it.

From b3580a15e10e153d7443a2e0c05f570d94b9b5a6 Mon Sep 17 00:00:00 2001
From: Junxiao Bi <junxiao.bi>
Date: Tue, 6 May 2014 14:27:31 +0800
Subject: [PATCH] xfsprogs: xfs_copy: use exit() to replace killall()

Sending a SIGKILL signal to a child thread terminates the whole process,
so xfs_copy returns the error value 137. This makes it hard for a script to
know whether the copy succeeded.

Calling exit() in the main thread terminates the whole process and returns the
right value. Replace killall()+abort() with exit(1) to match the old way of
exiting in the error case. Also remove killall()+pthread_exit(NULL), since
return 0 will be followed by an exit(0) to terminate the process.

Bug story from Christoph Hellwig:
Btw, I think the reason for this cruft is that xfs_copy was originally
written using the IRIX sproc interface, and the port to pthreads didn't
remove this gem:

http://marc.info/?l=linux-xfs&m=99535721110020&w=2

Signed-off-by: Junxiao Bi <junxiao.bi>
Cc: Joe jin <joe.jin>
Reviewed-by: Christoph Hellwig <hch>
Reviewed-by: John Haxby <john.haxby>
Reviewed-by: Ethan Zhao <ethan.zhao>
---
 copy/xfs_copy.c |   30 +-----------------------------
 1 files changed, 1 insertions(+), 29 deletions(-)

diff --git a/copy/xfs_copy.c b/copy/xfs_copy.c
index 39517da..39bb9d7 100644
--- a/copy/xfs_copy.c
+++ b/copy/xfs_copy.c
@@ -217,25 +217,6 @@ handle_error:
 }
 
 void
-killall(void)
-{
-	int i;
-
-	/* only the parent gets to kill things */
-
-	if (getpid() != parent_pid)
-		return;
-
-	for (i = 0; i < num_targets; i++)  {
-		if (target[i].state == ACTIVE)  {
-			/* kill up target threads */
-			pthread_kill(target[i].pid, SIGKILL);
-			pthread_mutex_unlock(&targ[i].wait);
-		}
-	}
-}
-
-void
 handler(int sig)
 {
 	pid_t	pid = getpid();
@@ -400,8 +381,7 @@ read_wbuf(int fd, wbuf *buf, xfs_mount_t *mp)
 	if (buf->length > buf->size)  {
 		do_warn(_("assert error:  buf->length = %d, buf->size = %d\n"),
 			buf->length, buf->size);
-		killall();
-		abort();
+		exit(1);
 	}
 
 	if ((res = read(fd, buf->data, buf->length)) < 0)  {
@@ -591,11 +571,6 @@ main(int argc, char **argv)
 
 	parent_pid = getpid();
 
-	if (atexit(killall))  {
-		do_log(_("%s: couldn't register atexit function.\n"), progname);
-		die_perror();
-	}
-
 	/* open up source -- is it a file? */
 
 	open_flags = O_RDONLY;
@@ -1154,9 +1129,6 @@ main(int argc, char **argv)
 	}
 
 	check_errors();
-	killall();
-	pthread_exit(NULL);
-	/*NOTREACHED*/
 	return 0;
 }
 
-- 
1.7.1
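
The patch also drops the atexit(killall) registration. As a rough illustration of why that matters, the sketch below shows that a SIGKILL sent from an atexit handler overrides the status passed to exit(). It is an assumption-laden simplification, not xfsprogs code: `kill_self_at_exit` and `status_with_atexit_kill` are hypothetical names, and the handler uses kill(getpid(), SIGKILL) in place of the real killall(), which called pthread_kill() only on targets still marked ACTIVE.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for killall(): deliver SIGKILL during exit processing. */
static void kill_self_at_exit(void)
{
	kill(getpid(), SIGKILL);
}

/*
 * Fork a child that calls exit(0), optionally after registering the
 * handler above with atexit(). Returns the shell-style status of the
 * child (128 + signal number if it was killed), or -1 on error.
 */
static int status_with_atexit_kill(int register_handler)
{
	pid_t pid;
	int wstatus;

	assert(register_handler == 0 || register_handler == 1);

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		if (register_handler && atexit(kill_self_at_exit) != 0)
			_exit(2);
		exit(0);	/* runs atexit handlers before the process ends */
	}

	if (waitpid(pid, &wstatus, 0) < 0)
		return -1;
	if (WIFSIGNALED(wstatus))
		return 128 + WTERMSIG(wstatus);
	if (WIFEXITED(wstatus))
		return WEXITSTATUS(wstatus);
	return -1;
}
```

With the handler registered, the child is reported as killed by SIGKILL (137 on Linux) although it asked to exit with 0. Without the handler, the status is 0.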


Version-Release number of selected component (if applicable):
xfsprogs-3.1.1

How reproducible:


Steps to Reproduce:
1. xfs_copy source target
2. echo $?
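
For a caller written in C rather than shell, the check in step 2 could look like the sketch below. It is only an illustration: `run_and_classify` is a hypothetical helper, and `source` and `target` are the placeholder names from the steps above. It runs a command through system() and returns the same value a shell would show in `$?`, so a run killed by SIGKILL is reported as 137.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Run a shell command and return its shell-style exit status:
 * the exit code for a normal exit, 128 + signal number if the
 * command was killed by a signal, or -1 if it could not be run.
 */
static int run_and_classify(const char *command)
{
	int wstatus;

	/* system(NULL) only probes for a shell, so require a real command. */
	assert(command != NULL);

	wstatus = system(command);
	if (wstatus == -1)
		return -1;
	if (WIFSIGNALED(wstatus))
		return 128 + WTERMSIG(wstatus);
	if (WIFEXITED(wstatus))
		return WEXITSTATUS(wstatus);
	return -1;
}
```

A caller would treat `run_and_classify("xfs_copy source target") == 0` as success. With the unpatched xfsprogs described in this report, the result would be 137 even after a successful copy.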

Actual results:
echo $? prints 137 even though the copy succeeded.

Expected results:
echo $? prints 0 after a successful copy.

Additional info:

Comment 2 Eric Sandeen 2014-05-22 16:58:05 UTC

Yep, may as well fix this.  It is committed upstream now:

http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/cmds/xfsprogs.git;a=commitdiff;h=2277ce35c37c75aa3c146261d5abe32f9cc39baa

Comment 4 Eryu Guan 2014-06-29 12:29:03 UTC

Verified with the test case /kernel/filesystems/xfs/1104956-xfs_copy-corrupt; the test passed with xfsprogs-3.1.1-16.el6.

Comment 5 errata-xmlrpc 2014-10-14 07:49:55 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1564.html
