Bug 654822 - GNU make hanging at end of build
Summary: GNU make hanging at end of build
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: make
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Petr Machata
QA Contact: qe-baseos-tools-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-11-18 20:11 UTC by Anthony Green
Modified: 2015-05-05 01:35 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-11-25 12:10:35 UTC
Target Upstream Version:
Embargoed:


Attachments:
srpm that won't finish building (1.27 MB, application/x-rpm), 2011-01-07 16:20 UTC, Anthony Green

Description Anthony Green 2010-11-18 20:11:21 UTC
Description of problem:
I'm porting a package from Fedora to EPEL.  However, when I build the attached srpm on RHEL 6.0 it appears as though GNU make is hanging on a read (based on strace output).

Version-Release number of selected component (if applicable):
make-3.81-19.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. rpmbuild --rebuild autogen-5.11-1.fc15.src.rpm
  
Actual results:
Freezes at the end of make 

Expected results:
Continues on with packaging (ends up failing for other reasons).

Additional info:

Comment 2 Petr Machata 2010-12-09 13:45:16 UTC
Not reproducible with 5.9.4-7, the latest in the Fedora repository.  Could you attach the problematic srpm to this bug?

Comment 3 RHEL Program Management 2011-01-07 15:31:23 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 4 Anthony Green 2011-01-07 16:20:33 UTC
Created attachment 472256
srpm that won't finish building

Comment 5 Petr Machata 2011-01-10 15:39:53 UTC
It's reproducible with that srpm.  (Even on Fedora with the right make version.)

Comment 6 Petr Machata 2011-01-11 19:52:57 UTC
What seems to be a minimal reproducer:

--checkopt-xx.def--
AutoGen Definitions options;
prog-name = check;
prog-title = "Checkout Automated Options";
flag = { name = e; };

--Makefile--
all-am:
	agen5/autogen checkopt-xx.def # $(MAKE)
	echo DONE

$ make -C doc -r -j2

Notes: the comment with $(MAKE) has to be there; no other variable will do.  It has to be the new autogen that is launched.  In -jN, N must be >1 (presumably to enable the jobserver).

Comment 7 Petr Machata 2011-01-12 23:33:30 UTC
So the minimal stand-alone reproducer is this:

--hang.mf--
run: hang
	+./hang
	echo DONE

--hang.c--
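#include <stdio.h>
#include <unistd.h>  /* fork(), execl() */

int main(int argc, char ** argv) {
    /* The child inherits the parent's open descriptors and keeps
       them open while cat sits waiting on stdin. */
    if (fork() == 0)
      execl("/bin/cat", "/bin/cat", (char *) NULL);
    /* The parent exits at once, without waiting for the child. */
    return 0;
}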

$ make -r -j2 -f hang.mf run
./hang
echo DONE
DONE
# and here it hangs, waiting for cat to finish.  Pressing C-d makes cat exit, and make then returns.

When you remove the initial "+", make doesn't hang.  I don't know what the problem is yet, but that's the essence of the autogen build hang.  In autogen what hangs make is the process "sh".  When rpmbuild hangs, "pstree" shows a pack of sh's rooted right under "init", as the autogen process that launched them died without collecting them.  Killing those sh's un-hangs the build and rpmbuild finishes with error.

The easiest workaround is not to pass %{?_smp_mflags} to make.  I'll look into fixing the make problem next.

Comment 8 Petr Machata 2011-01-14 00:46:11 UTC
When make sees $(MAKE) or an initial + in a recipe, it assumes that the command will recurse and therefore, in jobserver mode, leaves the jobserver pipe open in the sub-process.  (That pipe is used to coordinate parallel builds across several make instances.)  Before the toplevel make exits, it reads from that pipe and waits for all the synchronization tokens to turn up.  But your build is stuck in some innocent "sh" that has no idea that it's supposed to be part of a recursive build and that it should close those descriptors that it will never use anyway.  So it doesn't, and the toplevel make hangs there indefinitely.
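
The effect can be imitated without make at all.  What follows is only a sketch of the situation described above, not make's own code: a parent puts two "tokens" into a pipe, forks a child that stays alive, closes its own write end, reads the tokens back and then checks whether end-of-file arrives.  A read on a pipe reports end-of-file only once every copy of the write end is closed, so a stray child that kept the descriptor holds the parent up.  The names simulate and drain_and_check_eof, the token count and the 500 ms timeout are made up for this illustration; poll() with a timeout stands in for the blocking read so that the demo terminates.

```c
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read `expected` tokens back, then wait up to timeout_ms for EOF.
   Returns 1 if EOF (hang-up) was seen, 0 if a read would still
   block, -1 on error. */
int drain_and_check_eof(int rfd, int expected, int timeout_ms) {
    char c;
    for (int i = 0; i < expected; i++)
        if (read(rfd, &c, 1) != 1)
            return -1;
    struct pollfd p = { .fd = rfd, .events = POLLIN };
    int r = poll(&p, 1, timeout_ms);
    if (r < 0)
        return -1;
    return (r > 0 && (p.revents & POLLHUP)) ? 1 : 0;
}

/* Hypothetical stand-in for "toplevel make plus one stray child".
   stray_keeps_pipe != 0: the child never closes the write end. */
int simulate(int stray_keeps_pipe) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], "++", 2) != 2)      /* two tokens in the pipe */
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                        /* the stray child */
        close(fds[0]);
        if (!stray_keeps_pipe)
            close(fds[1]);                 /* the well-behaved case */
        pause();                           /* linger until killed */
        _exit(0);
    }

    close(fds[1]);                         /* parent drops its write end */
    int eof = drain_and_check_eof(fds[0], 2, 500);

    kill(pid, SIGKILL);                    /* tidy up the demo child */
    waitpid(pid, NULL, 0);
    close(fds[0]);
    return eof;
}
```

Calling simulate(0) and simulate(1) from a small main shows the difference: the parent sees end-of-file only in the first case, where the child closed the descriptor it had no use for.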

On the make side, dropping the master_job_slots sanity check in main.c:clean_jobserver gets rid of the problem.

On the autogen side, in doc/Makefile.in, doing something like this gets rid of the recursion trigger:
_MAKE := $(MAKE)
agdoc.texi      : # self-depends upon all executables
	MAKE=$(_MAKE) ./mk-agen-texi.sh
But note that this is just working around the problem.  That variable is being passed down presumably to be used in recursive make invocation, so technically make is right to catch that.

The only upstreamable solution, I think, would be for autogen to collect its children.  I don't know why it doesn't; I think I've seen some comments related to SIGCHLD etc. in the code.  FWIW, the shell that stays hanging is opened in agen5/agShell.c:chainOpen.
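
For illustration, "collecting its children" could look roughly like the sketch below.  This is not autogen's code and makes no claim about how chainOpen works; it only assumes a helper process that reads commands from a pipe on its stdin.  /bin/cat stands in for the shell, and start_helper and stop_helper are hypothetical names.  The parent closes the pipe, so the helper sees end-of-file and exits, and then waits for it, leaving no orphan behind to hold inherited descriptors.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start /bin/cat with its stdin connected to a private pipe.
   Stores the write end in *to_child; returns the pid or -1. */
pid_t start_helper(int *to_child) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                        /* child: becomes the helper */
        if (fds[0] != STDIN_FILENO) {
            dup2(fds[0], STDIN_FILENO);
            close(fds[0]);
        }
        close(fds[1]);                     /* child must not hold the write end */
        execl("/bin/cat", "/bin/cat", (char *) NULL);
        _exit(127);                        /* only reached if exec failed */
    }

    close(fds[0]);                         /* parent keeps only the write end */
    *to_child = fds[1];
    return pid;
}

/* Signal EOF to the helper, then reap it.
   Returns the helper's exit status, or -1 on failure. */
int stop_helper(pid_t pid, int to_child) {
    int status;
    close(to_child);                       /* helper's read now returns EOF */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pair every start_helper with a stop_helper before exiting, so the helper never ends up reparented to init while still holding descriptors.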

