It has been observed in the wild that distinct RPM builds (usually of related software, slight version variance) can result in some files with ELF build-id tags that match each other. This is undesirable if there is a need to concurrently install the related -debuginfo subrpms, because then file conflicts are detected by rpm under /usr/lib/debug/.build-id/XX/YYYYYY*. It is sometimes desirable to install debuginfo for multiple versions, if e.g. more than one kernel happens to be installed and one may want to debug each of them after a reboot; or to have a machine serve as a debuginfo repository for other purposes (such as systemtap module compilation). The problem comes from the build-id numbers being coded in Fedora/RHEL as the default ELF-content-hash. For similar enough software, the compiler's output content may match for some executables/libraries, and thus result in the duplicate-build-id. That results in conflicts. Roland McGrath has opined [1] that using a uuid (random, apprx guaranteed non-conflicting) build-id would not be appropriate because it would sacrifice repeatable builds. I'm not so sure about that this is actually so valuable. Still, it turns out that we don't have to sacrifice the property while having unique build-ids across similar builds. The proposed answer is to salt the content hash: to include an additional build-identification string into the hash calculations, for example the RPM's n-v-r-a full name. The attached patch does this at the rpm/find-debuginfo/debugedit level. Testing indicates that the resulting files/packages are fine: - just as functional as before, both execution (.text segments bitwise identical) and debugging - stripped .debug files are indeed named differently by n-v-r-a, even for identical sources & direct object code - the new .debug hashes are preserved across recompilation of the n-v-r-a rpm (let's call this maintaining the Roland Invariant) - the change is system-wide, but transparent in that existing debuggers etc. work without modification or awareness that the hashing algorithm has changed - no changes are required to binutils/ld (because debugedit already recalculates the build-ids from scratch (!)). The same salting concept may be welcome there (for non-fedora-distro usage). - no changes are required to individual packages / spec files [1] http://fedoraproject.org/wiki/Releases/FeatureBuildId
Created attachment 791552 [details] prototype for build-id salting at the debugedit level
Nice. I like the simple solution, that it works transparently and the fact that it maintains the "Roland Invariant". Since you use the ${name} in the salt this should also help in the case of an SCL (Software Collection) packages (which changes the package name, but might have some identical binaries).
Nice work, Frank. I wish I'd thought of it myself. ;-)
The original plan for this, which still is the actually right answer, is that no two builds not intended to be identical should use the same source file names. That is, include N-V-R.A in the source directory name, where historically only N-V was included. If that's done, then no two builds would ever collide on build IDs unless they lacked -g, and any such lacks should just be fixed. Frank's hack is an approximation of this, and suffices for the build ID part of the reason that N-V-R.A in source directory names is desireable. It doesn't address the other reasons that it's desireable, and by papering over the build ID problem with the status quo it just further delays the proper resolution (which has already been delayed five years).
(In reply to Roland McGrath from comment #4) > The original plan for this, which still is the actually right answer, is > that no two builds not intended to be identical should use the same source > file names. That is, include N-V-R.A in the source directory name, where > historically only N-V was included. If that's done, then no two builds > would ever collide on build IDs unless they lacked -g, and any such lacks > should just be fixed. Aha, so that is why debugedit recalculates the build-id. Because it changes the DWARF source file locations. Currently we call debugedit (from find-debuginfo.sh) with: -b "$RPM_BUILD_DIR" -d /usr/src/debug So one solution would just be to call it with: -b "$RPM_BUILD_DIR" -d /usr/src/debug/${N-V-R.A} But maybe we should adjust rpmbuild/fedpkg/koji to use ${N-V-R.A} directly? I don't know where that is precisely. Is that something that is controlled by an rpm macro?
Just an note to make sure to double check the ${N-V-R.A} is truly unique with respect to SCL (Software Collection) packages. Currently it seems they are both build with the same "basename" as their non-SCLified counterparts (e.g. both are build inside a directory called elfutils-0.154). But I am probably mixing my terminology. Just want to make sure that when using the ${N-V-R.A} for the source directory name inside the RPM_BUILD_DIR is indeed unique for a "base package" and a "SCL" package.
I wrote some patches for NVRA rpm source dirs in 2008: http://people.redhat.com/jkratoch/multidebug/ This is probably why I got the Cc from Roland. Going to rebase them and recheck what those patches did.
Created attachment 792472 [details] find-debuginfo.sh + debugedit.c patch (In reply to Jan Kratochvil from comment #7) > I wrote some patches for NVRA rpm source dirs in 2008: > http://people.redhat.com/jkratoch/multidebug/ This was too ugly. (In reply to Mark Wielaard from comment #5) > Currently we call debugedit (from find-debuginfo.sh) with: > -b "$RPM_BUILD_DIR" -d /usr/src/debug > > So one solution would just be to call it with: > -b "$RPM_BUILD_DIR" -d /usr/src/debug/${N-V-R.A} I have tried this idea and it seems to work (tested on dwdiff.fc19) but I haven't tried a test-mass-rebuild with it.
I was under the impression that debugedit with a larger destdir was avoided partly because of string length limitations: rpm tools/debugedit.c: if (strlen (dest_dir) > strlen (base_dir)) { fprintf (stderr, "Dest dir longer than base dir is not supported\n"); exit (1); }
yeah, the build path (base_dir) has to be larger than the install path (dest_dir) since debugedit just does a simple replacement and pad with NUL instead of fiddling with the binary in any way.
Is it a problem in this case? Test build went fine for me but I did not test it in Koji as it would require a chain build I am not aware how to do. Or maybe we could finally rewrite debugedit as full read-in/write-out?
We can ask rel-eng for a private tag for this if you want. I don't think it's a problem, but I don't off-hand remember what the base_dir is... I *think* it's /builddir/build/BUILD/N-V but my memory is hazy.
If here is an agreement with my Comment 8 we could go with the test rebuild. In fact Koji path should be equal to mock path so it should be all reproducible locally. I will ask at #fedora-devel whether Koji tag is OK or if I rather should run a test mass rebuild on some Red Hat machine.
Without mechanically ensuring enlargement of the builddir path length, you're likely to get sporadic failures, e.g. from packages that do their own insufficiently-long BuildRoot:-or-equivalent setup in their spec files.
repoquery --disablerepo='*' --enablerepo=rawhide-source --archlist=src --qf '-%{release}.x86_64 %{name}-%{version}' -qa|perl -lne 'print length((/^(\S+)/)[0])." $_"'|sort -n|tail 41 -0.9.20120403gitf7d3f802f3750.fc20.x86_64 gauche-gtk-0.6 So the largest enlargement is by 41 characters which can be workarounded for mock (from its original '/builddir') by: config_opts['chroothome'] = '/builddir11111111112222222222333333333344444444445555555555' config_opts['environment']['HOME'] = config_opts['chroothome'] config_opts['macros']['%_topdir'] = '%s/build' % config_opts['chroothome'] But that also means such packages may no longer be buildable by users with regular rpmbuild. Maybe that will at least force their maintainers to make the revision strings shorter. :-)
s/revision strings/release strings/
Although it may be better to fix rather the build source directory, which will have no problems with debugedit then. It may be incompatible with some .spec files but Fedora can fix that for the next release.
(... whereas the salting approach works for current & unmodified spec files.)
Created attachment 793468 [details] %{buildsubdir} -> NVRA rpm patch But the salting approach works only for the kernel package, isn't it? Other packages conflict due to /usr/src/debug/N-V/ . Running a new mass rebuild with this patch instead. It needs no mock changes and it has no problem with debugedit.
Mass rebuild results so far for the Comment 20 patch: 352 rebuilt fine 28 failed - 5 are unrelated FTBFS, 4 are real regressions; rest still untested. A clear reason can be seen for example in: Canna-3.7p3-40.fc20 in .spec: - cat $RPM_BUILD_DIR/Canna-%{version}/zipcode[...] But apparently the number of regressions needing a fix should be very small.
Created attachment 795744 [details] %{buildsubdir} -> NVRA rpm patch #2
4093 pass 243 fail 81 cleanfail 42 cleanpass -> 100.0*42/(4093+243) = 0.968% of packages PASS->FAIL to build with new rpm With 14166 Fedora packages it is about 137 packages needing to be fixed. I have verified few of the packages that they really have needless .spec assumptions: Canna-3.7p3-40.fc20.src.rpm - cat $RPM_BUILD_DIR/Canna-%{version}/zipcode[...] GMT-4.5.9-6.fc21.src.rpm - ln -s %{gmthome}/coast $RPM_BUILD_DIR/GMT%{version}/share GAPDoc-1.5.1-4.fc20.src.rpm - cp -a %{name}-%{version} $RPM_BUILD_ROOT%{_gap_dir}/pkg R-3.0.1-4.fc20.src.rpm - %define __perl_provides %{_builddir}/R-%{version}/%{name}-prov chmod +x %{__perl_provides} android-tools-20130123git98d0789-2.fc20.src.rpm - Wrong %setup, also not using -T + NVRA cannot work with two top directories %setup -q -b 1 -n extras %setup -q -b 0 -n %{packdname} Is the Comment 22 patch acceptable for F-21 rpmbuild? It needs doc updates and to write a Fedora Change page. While it fixes the build-id conflicts it still does not enable installing multiple *-debuginfo.rpm (except for kernel-debuginfo.rpm) because files like /usr/lib/debug/usr/bin/BINARY will fail. It would need an additional change (so that /usr/lib/debug/.build-id/** turn symlinks->files, /usr/lib/debug/usr/** turn files->symlinks and conflicts of those symlinks are ignored, like with some multi-arch coloring).
"It would need an additional change (so that /usr/lib/debug/.build-id/** turn symlinks->files, /usr/lib/debug/usr/** turn files->symlinks and conflicts of those symlinks are ignored, like with some multi-arch coloring)." Those changes are IMHO well overdue.
Changing the needinfo to Panu, as he is the most qualified one to answer this.
I've a rather uneasy feeling about the directory rename. A less disruptive way of achieving NVRA in the paths could be arranging $RPM_BUILD_DIR to be per-package private, ie create an intermediate N(E)VRA directory as $RPM_BUILD_DIR / %{_builddir}. AFAICS this would achieve the desired N(E)VRA salting in absolutely all packages, avoid the breakages witnessed in comment #23 and make the whole thing a bit "safer" against bad tarballs and the like.
The disadvantage would be that /usr/src/debug/* directories would all have one additional subdirectory (of mostly the same name). This does not seem clean to me. With the few .spec files failing compatibility test I checked in Comment 23 it mostly pointed to already imperfectly coded .spec files anyway.
The specs in comment #23 aren't doing anything particularly evil though, and if Fedora specs have such "imperfections" despite all the gazillion guidelines, think of all the specs elsewhere in the world. Unnecessarily breaking non-evil specs is bad. There might well be better alternatives yet, but the private builddir approach solves other (ages old) issues besides the one at hand here.
(In reply to Panu Matilainen from comment #26) > I've a rather uneasy feeling about the directory rename. A less disruptive > way of achieving NVRA in the paths could be arranging $RPM_BUILD_DIR to be > per-package private, ie create an intermediate N(E)VRA directory as > $RPM_BUILD_DIR / %{_builddir}. Yes, that's fine. > AFAICS this would achieve the desired N(E)VRA salting in absolutely all packages No, because there exist packages (e.g. ocaml) that generate debuginfo without embedded build-tree name references, which lead to pure content-based hash collisions. The debugedit-level hash salting would still be beneficial.
(In reply to Frank Ch. Eigler from comment #29) > (In reply to Panu Matilainen from comment #26) > > AFAICS this would achieve the desired N(E)VRA salting in absolutely all packages > > No, because there exist packages (e.g. ocaml) that generate debuginfo without > embedded build-tree name references, which lead to pure content-based > hash collisions. The debugedit-level hash salting would still be beneficial. Ah, what I really meant (I suppose) is that it'd cause practically all packages to contain the "salting" in their source filenames no matter what the spec does. Whether other tooling actually uses the source filenames for this is a different issue. Anyway, purely from rpm POV the debugedit-level salting would be by far the least disruptive (and thus preferred) option and one that seems more independent of other tooling as well.
So the pros and cons are: pro: keeping compatibility with ~0.968% of .spec files con: needless single subdirectory in /usr/src/debug/*/ con: it is only rpm-level hack, upstream tools are still not fixed
FYI I am not going to work on the remaining part if rpms are going to be broken this way. I believed it can be done the right way and not just hacked up.
Um, what exactly are you referring with "this way"? I see about three different approaches being suggested, each with different set of pros and cons.
I was referring to the Comment 28 introducing /usr/src/debug/* directories containing always just one subdirectory.
This should be fixed now in rawhide as of rpm >= 4.13.0.1-4 as a side-effect of the backports to enable parallel-installable debuginfo.