Bug 2284036 - rpm 4.20 alpha will not build the current kernel spec
Summary: rpm 4.20 alpha will not build the current kernel spec
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: rpm
Version: rawhide
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Panu Matilainen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-05-30 21:05 UTC by Justin M. Forbes
Modified: 2024-07-04 08:07 UTC (History)

Fixed In Version: rpm-4.19.91-5.fc41
Clone Of:
Environment:
Last Closed: 2024-05-31 15:13:20 UTC
Type: ---
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github rpm-software-management rpm pull 3137 0 None open Fix a %buildroot regression on an early %__spec_install_pre %global o… 2024-05-31 09:08:48 UTC

Description Justin M. Forbes 2024-05-30 21:05:38 UTC
https://koji.fedoraproject.org/koji/taskinfo?taskID=118301536 has the most recent logs, but I did attempt to rebuild the srpm from Monday's kernel build just to verify that it was rpm and not a kernel change. 

We end up with a whole bunch of warnings, on top of the errors which eventually fail the build.

Reproducible: Always

Steps to Reproduce:
Attempt to build a kernel in rawhide.
Actual Results:  
Many warnings, and a failed build

Expected Results:  
a properly built kernel.

Not being able to build rawhide kernels is a substantial issue.

Comment 1 Panu Matilainen 2024-05-31 06:14:08 UTC
Thanks for the report, we'll look into this as prio 1 of course.

The warnings seem to be for a good reason:

%define uname_suffix %{lua:
        local flavour = rpm.expand('%{?1:+%{1}}')
        flavour = flavour:gsub('-', '_')
        if flavour ~= '' then
                print(flavour)
        end
}

That is *not* a parametric macro, so there's never any macro %{1} defined. I don't see how that would ever have worked when called with an argument, but older rpm versions just didn't warn about this. Fixing this is simple enough: just add parentheses after the macro name for both these macros, i.e.

%define uname_suffix() %{lua:
...
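
Spelled out in full, the parametric form might look like the sketch below. It only combines the macro body quoted above with the suggested parentheses; it is an illustration, not a copy of the fixed kernel spec.

```spec
# Sketch only: the trailing () makes the macro parametric, so the first
# argument is defined while the body is being expanded.
%define uname_suffix() %{lua:
        local flavour = rpm.expand('%{?1:+%{1}}')
        flavour = flavour:gsub('-', '_')
        if flavour ~= '' then
                print(flavour)
        end
}
```

With this definition, an invocation such as %{uname_suffix rt-debug} would be expected to yield +rt_debug, while a bare %{uname_suffix} yields nothing. The flavour name rt-debug is only an example.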

As for the errors, I see the following:

1) apparent compile switch errors, such as:
- relocation R_X86_64_32 against `.rodata' can not be used when making a PIE object; recompile with -fPIE
/usr/bin/ld: failed to set dynamic section sizes: bad value
- clang: error: unsupported argument 'gnu2' to option '-mtls-dialect=' for target 'x86_64-redhat-linux-gnu'

2) error: could not create '/kernel-6.10-rc1-27-g4a4be1ad3a6e': Permission denied

The latter looks like some macro is unexpectedly expanding to "", but I have no idea yet what would cause that or the compile switch problems. I'll report back as soon as I have something, off to investigate now.

Comment 2 Panu Matilainen 2024-05-31 06:27:27 UTC
> I did attempt to rebuild the srpm from Monday's kernel build just to verify that it was rpm and not a kernel change. 

If you have the logs, it would be helpful to be able to compare the success/failure of an otherwise identical kernel build. I can see eln-builds on koji but I'm not sure how closely that reflects current Fedora.

Comment 3 Panu Matilainen 2024-05-31 06:41:19 UTC
Looking at the logs from the most recent successful f41 build before rpm 4.20: https://koji.fedoraproject.org/koji/buildinfo?buildID=2457885, eg https://kojipkgs.fedoraproject.org//packages/kernel/6.10.0/0.rc1.20240528git2bfcfd584ff5.18.fc41/data/logs/x86_64/build.log

The -fPIE and '-mtls-dialect=' errors are there already, eg:

/usr/bin/ld: /builddir/build/BUILD/kernel-6.10-rc1-13-g2bfcfd584ff5/linux-6.10.0-0.rc1.20240528git2bfcfd584ff5.18.fc41.x86_64/samples/bpf/bpftool/bootstrap/libbpf/libbpf.a(libbpf-in.o): relocation R_X86_64_32 against `.rodata' can not be used when making a PIE object; recompile with -fPIE
/usr/bin/ld: failed to set dynamic section sizes: bad value
collect2: error: ld returned 1 exit status

clang: error: unsupported argument 'gnu2' to option '-mtls-dialect=' for target 'x86_64-redhat-linux-gnu'
clang: error: unsupported argument 'gnu2' to option '-mtls-dialect=' for target 'x86_64-redhat-linux-gnu'
make[1]: *** [Makefile:231: /builddir/build/BUILD/kernel-6.10-rc1-13-g2bfcfd584ff5/linux-6.10.0-0.rc1.20240528git2bfcfd584ff5.18.fc41.x86_64/tools/testing/selftests/bpf/liburandom_read.so] Error 1

So, there are things failing in that build before 4.20, whether you were aware of that or not. Anyway, for me this is good news because I couldn't see how rpm would throw up bad compiler options now. So the issue with 4.20 comes down to just this:

error: could not create '/kernel-6.10-rc1-27-g4a4be1ad3a6e': Permission denied

Now I only need to find what *that* is...

Comment 4 Panu Matilainen 2024-05-31 06:56:11 UTC
Okay, found where it goes wrong. On a successful build this looks like:

> '/usr/bin/python3' util/setup.py --quiet install --root='//builddir/build/BUILDROOT/kernel-6.10.0-0.rc1.20240528git2bfcfd584ff5.18.fc41.x86_64'
> [...]
>  self.initialize_options()
> + mkdir -p /builddir/build/BUILDROOT/kernel-6.10.0-0.rc1.20240528git2bfcfd584ff5.18.fc41.x86_64//usr/share/man/man1

On the failing builds:

> '/usr/bin/python3' util/setup.py --quiet install --root='/kernel-6.10-rc1-27-g4a4be1ad3a6e'
> [...]
>  self.initialize_options()
> error: could not create '/kernel-6.10-rc1-27-g4a4be1ad3a6e': Permission denied

So somehow somewhere the %{buildroot} and/or $RPM_BUILD_ROOT is empty. There have been changes around that so it kinda figures and the path is quite different now, but then it works for most of the spec so this must be some rather special case. Back to digging.
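
A spec that wants to fail loudly in this situation, rather than try to install into /, could add a guard along these lines. This is a hypothetical sketch and not something the kernel spec is known to contain.

```spec
%install
# Hypothetical guard: abort early if the build root variable is missing
# from the scriptlet environment.
if [ -z "${RPM_BUILD_ROOT:-}" ]; then
    echo "RPM_BUILD_ROOT is empty, refusing to install into /" >&2
    exit 1
fi
```

The check only makes the failure easier to read; it does nothing about the underlying cause.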

Comment 5 Panu Matilainen 2024-05-31 07:55:53 UTC
Okay so the problem comes, somehow, from this:

%global __spec_install_pre %{___build_pre}

I don't yet quite understand how that causes RPM_BUILD_ROOT to be empty inside %install, but that's it, I've been able to reproduce this locally with a much smaller spec.

I don't know how to build a custom kernel on fedora infra without uploading an entire src.rpm so haven't been able to test this (yet anyway), but changing the %global to %define *should* fix it. 

The Fedora recommendation to always use %global is a bad one because the difference is much more subtle than that: %global causes the macro body to be expanded immediately, at definition time, whereas many, many uses want expand-on-use behavior instead.
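
A minimal sketch of that difference, using made-up macro names (flavor, snapshot and deferred are purely illustrative):

```spec
# Illustrative only: all three macro names here are hypothetical.
%define flavor one
%global snapshot %{flavor}
%define deferred %{flavor}
%define flavor two
# From here on, snapshot still holds "one" because its body was expanded
# when it was defined, while deferred yields "two" because its body is
# expanded only when it is used.
```

Which behavior is wanted depends on whether the macros referenced in the body are already final at the point of definition.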

I'm off to chase down the underlying cause here because this doesn't seem right, but using %define would seem more appropriate there regardless of rpm version. Assuming that works, because it IS certainly possible to have problems in the other direction too.

Comment 6 Panu Matilainen 2024-05-31 08:23:08 UTC
Scratch the above, I see kernel is aware of the %global/%define difference and crafted for it. I'll note that such overrides to deep internals are playing with fire.

But I see the actual issue now. There's a conditional, well over 20 years old, in the rpm macros around buildroot that is used to set RPM_BUILD_ROOT in the scriptlet execution environment:

> %{?buildroot:RPM_BUILD_ROOT=\"%{buildroot}\"

At the very early point in the spec where this __spec_install_pre is being expanded, the *real* buildroot is not yet known at all because name, version etc are not known. For the rest of the stuff like RPM_PACKAGE_NAME=%{NAME} it doesn't matter because it "falls through" as-is, and gets re-expanded later to the proper value, but that silly conditional from pre-historic times causes RPM_BUILD_ROOT to be dropped entirely from the scriptlet execution environment.
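
The effect can be sketched with a reduced, hypothetical macro. The name early_env is made up, and the sketch assumes it sits near the top of a spec where, per the analysis above, neither buildroot nor NAME is defined yet:

```spec
# Hypothetical reduced case, not the real rpm macro definition.
%global early_env %{?buildroot:RPM_BUILD_ROOT="%{buildroot}"} RPM_PACKAGE_NAME="%{NAME}"
```

Under that assumption the conditional expands to nothing at definition time, so the stored body keeps only the RPM_PACKAGE_NAME part, with %{NAME} left as-is for later re-expansion. Defining the same body with %define would instead keep the conditional intact until the macro is actually used.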

Fix coming right up.

Comment 7 Panu Matilainen 2024-05-31 09:34:02 UTC
That particular issue fixed in rpm-4.19.91-5.fc41 now, hopefully that's all there is.

Apologies for the disruption, this was a good catch, but the kernel spec is meddling with internals that we cannot guarantee to stay stable in all cases. And you'll probably want to look into those unrelated pre-existing errors in the build.

Comment 8 Panu Matilainen 2024-05-31 09:37:48 UTC
Actually, let's leave this open until verified by a build...

Comment 9 Panu Matilainen 2024-05-31 11:41:10 UTC
Build here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-15aa3e51a5, once it passes the gating (these gates make airports seem like a walk in the park)

Comment 10 Justin M. Forbes 2024-05-31 13:18:58 UTC
I have a new kernel build going, thanks for looking into this so quickly.  Yes, the compile error is on some selftests which are explicitly allowed to fail.

Comment 11 Panu Matilainen 2024-05-31 14:18:09 UTC
Right, seems to be passing (although not entirely complete yet).

Comment 12 Justin M. Forbes 2024-05-31 15:13:20 UTC
Completed now. Thank you very much!

