Description of Problem:
dhcpcd-1.3.22pl1-7.src.rpm fails to build out-of-the-box on a stock 7.3 "everything" install. Output of "rpm -ba" is attached.

Version-Release number of selected component (if applicable):
    # rpm -q dhcpcd
    dhcpcd-1.3.22pl1-7
    # ls -l dhcpcd-1.3.22pl1-7.src.rpm
    -rw-r--r-- 1 root root 149330 Apr 16 22:17 dhcpcd-1.3.22pl1-7.src.rpm
    # md5sum dhcpcd-1.3.22pl1-7.src.rpm
    c1cd851bfda824ad3a552df8a7edf896 dhcpcd-1.3.22pl1-7.src.rpm
    # rpm -K dhcpcd-1.3.22pl1-7.src.rpm
    dhcpcd-1.3.22pl1-7.src.rpm: md5 gpg OK

How Reproducible:
100% (always)

Steps to Reproduce:
1. Install stock 7.3 (Valhalla) "everything" install
2. rpm -Uvh {73-distro}/SRPMS/SRPMS/dhcpcd-1.3.22pl1-7.src.rpm
3. rpm -ba /usr/src/redhat/SPECS/dhcpcd.spec

Actual Results:
output attached, build does not complete

Expected Results:
build should/would complete
Created attachment 59446: Output of "rpm -ba"
Try doing:
    rpm --define '_smp_mflags -j1' --rebuild foo.src.rpm
and see if that helps. My guess is that a race condition in the makefiles is breaking things on SMP machines under "unique" circumstances.
My build machine is a laptop with a single processor, so multi-CPU interaction issues are not even *possible*. If I run a (bash) for-loop that installs and builds each SRPM one at a time, fetching them over an NFS mount, I almost always get this error. If I just build it by hand, it seems to build, even without trying the _smp_mflags suggestion. Strange.
Need information that allows reproducing it here...
OK, when I use the command:
    rpm --define '_smp_mflags -j1' --rebuild dhcpcd*.src.rpm
... it builds successfully. As I originally reported in this defect, I _am_ able to reproduce the problem by compiling it on a 7.3-everything install, 2-way system with:
    rpm -Uvh dhcpcd*.src.rpm
    rpm -ba /usr/src/redhat/SPECS/dhcpcd.spec
... but if I add --define '_smp_mflags -j1' to the latter rpm command line, I still get the build failure.
Couldn't reproduce no matter what...
I am seeing the same problem, Glen.. you're not crazy. It is seemingly timing dependent. A successful build will have the following steps at the beginning of the make:

    + make 'CFLAGS=-O2 -march=i386 -mcpu=i686'
    cd . && \
      /bin/sh /usr/src/redhat/BUILD/dhcpcd-1.3.22-pl1/missing --run automake --gnu Makefile
    cd . && \
      CONFIG_HEADERS= CONFIG_LINKS= \
      CONFIG_FILES=Makefile /bin/sh ./config.status
    config.status: creating Makefile
    ...

An unsuccessful build will omit the second part and just have:

    + make 'CFLAGS=-O2 -march=i386 -mcpu=i686'
    cd . && \
      /bin/sh /usr/src/redhat/BUILD/dhcpcd-1.3.22-pl1/missing --run automake --gnu Makefile
    ...

What seems to be happening here is something hinky with the following rules in the original Makefile:

    $(srcdir)/Makefile.in: Makefile.am $(top_srcdir)/configure.in $(ACLOCAL_M4)
            cd $(top_srcdir) && \
              $(AUTOMAKE) --gnu Makefile
    Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
            cd $(top_builddir) && \
              CONFIG_HEADERS= CONFIG_LINKS= \
              CONFIG_FILES=$@ $(SHELL) ./config.status

In a successful compile, both rules are executed.. in an unsuccessful compile, only the top one is. The result in a bad compile is that Makefile.in is updated but Makefile is not. That is bad because cache.o is not included in the list of objects (since cache.c was added as part of a patch and thus not in the original Makefile.in).

It would seem to be a timing issue with the filesystem (ick!) or a bug in make (almost as bad). I can run it one time (that is, rpm -bc) and do a rpm --clean and run it again and it fails.

My configuration is a UP machine with RH7.3, updated to the latest RPMs as of a couple of weeks ago, running an ext3 filesystem.

I hope the details help.
--Jim
A little more detective work shows that the problem is in fact a timing problem that is caused by the patching done to Makefile.am. Let me try to give a time sequence that illustrates the problem:

Time 0: RPM is unpacked and patched. Makefile.am (was patched) now has time 0.

Time 1: RPM runs the configure script. The configure script will generate a new Makefile based upon Makefile.in as the very last thing it does. Makefile now has time 1.

Time 1 OR 2: RPM runs make. The first actual target to be made is Makefile. Makefile depends on Makefile.in. Makefile.in depends on Makefile.am. Makefile.in is pristine from the untar.. its time is effectively -1. Makefile.am has time 0 from the patch; thus Makefile.in is rebuilt. Now it is time to decide if the Makefile should be rebuilt: Makefile has time 1. Makefile.in MAY have time 1 OR MAY have time 2.

If Makefile.in has time 1, Makefile is NOT rebuilt and so does not contain the new file "cache.o" and the build fails.
If Makefile.in has time 2, Makefile must be rebuilt and everyone is happy.

I was able to get around this by doing the following (ick!): Add a patch to Makefile.in that adds a "sleep 1" to the Makefile.in rule BEFORE automake is run. This will guarantee that Makefile.in will have "time 2" in my previous example. However, this patch itself gives Makefile.in a new time stamp, which screws things up worse (since it is the same as or newer than Makefile.am). So, after the patch, I use touch to set Makefile.in's timestamp to something older (like --reference=configure). This will work every time.

Of course, this is a silly practical solution, but I was testing my theory. I'm not sure what the best thing is to do here but it is definitely a timing problem. I tried this on many different systems -- sometimes it failed 25% of the time, sometimes it failed 75% of the time. This was true for RH7.1 and RH7.3.

I hope someone is listening to this email. It is a real problem.
Thanks,
--Jim
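For anyone who wants to see the "time 1 vs. time 2" decision in isolation, here is a rough shell sketch of it. It is not taken from the dhcpcd sources: the needs_rebuild helper and the timestamps are made up for illustration, and it assumes make only rebuilds a target when a prerequisite is strictly newer, with timestamps compared at whole-second granularity.

```shell
#!/bin/sh
# Sketch only: file names mirror the report, the timestamps are invented.
# Assumes a shell whose "test" supports -nt (bash and dash both do).

# needs_rebuild TARGET PREREQ (hypothetical helper)
# Prints "yes" if PREREQ is strictly newer than TARGET, else "no".
needs_rebuild() {
    if [ "$2" -nt "$1" ]; then
        echo yes
    else
        echo no
    fi
}

demo_dir=$(mktemp -d)

# "Time 1": configure writes Makefile as its last step.
touch -t 200206011200.01 "$demo_dir/Makefile"

# Case A: automake regenerates Makefile.in within the same second.
touch -t 200206011200.01 "$demo_dir/Makefile.in"
result_same=$(needs_rebuild "$demo_dir/Makefile" "$demo_dir/Makefile.in")

# Case B: automake finishes one second later ("time 2").
touch -t 200206011200.02 "$demo_dir/Makefile.in"
result_later=$(needs_rebuild "$demo_dir/Makefile" "$demo_dir/Makefile.in")

echo "same second:  rebuild Makefile? $result_same"
echo "later second: rebuild Makefile? $result_later"

rm -rf "$demo_dir"
```

When both files carry the same second, the check says the Makefile is up to date, which matches the failing builds where config.status is never re-run.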
I re-installed 7.3 on my 2-way P-II (yes, 2) Kayak today and was able to reproduce the failure with a script that I will attach shortly. I believe that if you run the script on a 2-way 7.3 x86 with the arguments:
    # ./buildscript $PATH_TO_7_3_SRPMS/dhcpcd-1.3.22pl1-7.src.rpm
... you'll see the problem in a relatively short timeframe. Sorry it's not more reproducible than that, but this _will_ eventually show the problem on my system. The script repeatedly installs-builds-cleans until rpmbuild reports a non-zero exit status. I ran the script 3 times and was able to see the failure:
a) after 13 times (first execution of the script)
b) after 20 times (second execution of the script)
c) after 7 times (third execution of the script)
... given Jim's extra investigation of time-stamps and the timing of this, I think it is a reproducible issue and should get more investigation (IMO). I believe there's a nastier bug lurking behind this...
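For readers without access to the attachment, the install-build-clean loop described above could look roughly like the sketch below. This is NOT the attached script: run_until_failure and build_cycle are hypothetical names, the retry cap is an added assumption, and the spec path is simply the one from the original report.

```shell
#!/bin/sh
# Hypothetical sketch of the loop described above, not the attached script.

# run_until_failure MAX CMD [ARGS...]
# Runs CMD repeatedly until it exits non-zero or MAX attempts are used up.
# Leaves the number of runs in $attempts; returns 0 if a failure was seen.
run_until_failure() {
    max=$1
    shift
    attempts=0
    while [ "$attempts" -lt "$max" ]; do
        attempts=$((attempts + 1))
        if ! "$@"; then
            echo "failure on attempt $attempts"
            return 0
        fi
    done
    echo "no failure in $max attempts"
    return 1
}

# build_cycle SRPM: one install-build-clean pass.
# Assumes the default /usr/src/redhat layout from the original report.
build_cycle() {
    rpm -Uvh "$1" &&
    rpmbuild -ba --clean /usr/src/redhat/SPECS/dhcpcd.spec
}

# Example invocation (left commented out so that sourcing this file
# does not start a build):
#   run_until_failure 100 build_cycle "$PATH_TO_7_3_SRPMS/dhcpcd-1.3.22pl1-7.src.rpm"
```

The cap just keeps the loop from running forever on a machine where the race never triggers.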
Created attachment 68924: Test-case (script) to reproduce the bug