Bug 1691430 - dnf.exceptions.Error: Incorrect or unknown "arch": armv7hcnl
Summary: dnf.exceptions.Error: Incorrect or unknown "arch": armv7hcnl
Keywords:
Status: POST
Alias: None
Product: Fedora
Classification: Fedora
Component: dnf
Version: 30
Hardware: aarch64
OS: Linux
Priority: high
Severity: unspecified
Target Milestone: ---
Assignee: Jaroslav Rohel
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ARMTracker
Reported: 2019-03-21 15:15 UTC by Paul Whalen
Modified: 2019-08-13 16:42 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed:


Attachments
anaconda log (5.23 KB, text/plain)
2019-03-21 15:21 UTC, Paul Whalen

Description Paul Whalen 2019-03-21 15:15:11 UTC
Description of problem:
Armv7 virt install on aarch64 fails with:

Traceback (most recent call last):
  File "/sbin/anaconda", line 615, in <module>
    display.setup_display(anaconda, opts)
  File "/usr/lib/python3.7/site-packages/pyanaconda/display.py", line 346, in setup_display
    anaconda.initInterface()
  File "/usr/lib/python3.7/site-packages/pyanaconda/anaconda.py", line 288, in initInterface
    self._intf = TextUserInterface(self.storage, self.payload)
  File "/usr/lib/python3.7/site-packages/pyanaconda/anaconda.py", line 100, in payload
    self._payload = klass(self.ksdata)
  File "/usr/lib/python3.7/site-packages/pyanaconda/payload/dnfpayload.py", line 304, in __init__
    self._configure()
  File "/usr/lib/python3.7/site-packages/pyanaconda/payload/dnfpayload.py", line 660, in _configure
    self._base = dnf.Base()
  File "/usr/lib/python3.7/site-packages/dnf/base.py", line 93, in __init__
    self._conf = conf or self._setup_default_conf()
  File "/usr/lib/python3.7/site-packages/dnf/base.py", line 152, in _setup_default_conf
    conf = dnf.conf.Conf()
  File "/usr/lib/python3.7/site-packages/dnf/conf/config.py", line 213, in __init__
    self.arch = hawkey.detect_arch()
  File "/usr/lib/python3.7/site-packages/dnf/conf/config.py", line 80, in __setattr__


Version-Release number of selected component (if applicable):
dnf-4.2.1-1.fc30.noarch


How reproducible:
every time

Comment 1 Paul Whalen 2019-03-21 15:21:42 UTC
Created attachment 1546574
anaconda log

Comment 2 Paul Whalen 2019-03-21 15:44:34 UTC
This is happening on an F30 aarch64 host, attempting to install an F30 armv7 guest. F29 aarch64 host with f30 armv7 guest works ok

Comment 3 Jeremy Linton 2019-04-23 21:25:20 UTC
This problem seems to have worked its way into the F29 repos too. I just `dnf upgraded` an F29 armv7 guest, and now it's doing this too.

dnf is 4.2.2-2.fc29.

Worse yet, it seems --forcearch armv7hl doesn't work around the problem.

Comment 4 Jeremy Linton 2019-04-23 21:35:36 UTC
I just hacked up the _BASEARCH_MAP table in dnf/rpm/__init__.py, and now I can use --forcearch.

Comment 5 Paul Whalen 2019-05-06 15:32:57 UTC
This seems to affect Seattle hardware, not reproducible on a Mustang.

Comment 6 David Rheinsberg 2019-05-28 07:30:15 UTC
This makes `dnf` refuse operation on all my `armv7hl` machines. This affects both F29 and F30.

I investigated and I assume this is triggered by a change in `libdnf` which generalized the architecture detection on ARM. It now produces `armv7hcnl` for my machines, because it detects NEON and AES support. Before, it would return `armv7hl`.

I opened a PR against `dnf` to properly detect `armv7hcnl` as architecture:

    https://github.com/rpm-software-management/dnf/pull/1404

As a workaround I use the following command to patch dnf on a running system:

    sed -i "s/'armv7hnl', 'armv8hl'/'armv7hnl', 'armv7hcnl', 'armv8hl'/" /usr/lib/python3.7/site-packages/dnf/rpm/__init__.py

(It is safe to run this multiple times. It will only have an effect the first time it is run.)
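
For readers unfamiliar with the table that the sed command edits, here is a minimal Python sketch of how such a lookup behaves. It is a simplified model and not the dnf source: the table is abridged to a single row, and `invert`, `basearch` and `UnknownArchError` are illustrative names chosen for this example.

```python
class UnknownArchError(Exception):
    """Illustrative stand-in for the error that dnf raises."""


def invert(table):
    """Turn {basearch: (arch, ...)} into {arch: basearch}."""
    return {arch: base for base, arches in table.items() for arch in arches}


# Abridged, hypothetical tables: only the ARM hard-float row is modelled.
BEFORE = invert({"armhfp": ("armv7hl", "armv7hnl", "armv8hl")})
AFTER = invert({"armhfp": ("armv7hl", "armv7hnl", "armv7hcnl", "armv8hl")})


def basearch(arch, table):
    """Look up the base architecture, failing loudly on unknown names."""
    try:
        return table[arch]
    except KeyError:
        raise UnknownArchError(f'Incorrect or unknown "arch": {arch}') from None
```

With the `BEFORE` table, `basearch("armv7hcnl", BEFORE)` raises, which mirrors the error in this report. With the `AFTER` table the same call returns `"armhfp"`.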

Comment 7 Yaakov Selkowitz 2019-05-30 23:56:52 UTC
That doesn't seem to be sufficient.  On F29 and F30 VMs hosted on a Seattle, editing dnf/rpm/__init__.py is enough to get metadata to download, but then trying to install or update anything fails because "package ____.armv7hl does not have a compatible architecture".

Comment 8 David Rheinsberg 2019-05-31 04:58:13 UTC
(In reply to Yaakov Selkowitz from comment #7)
> That doesn't seem to be sufficient.  On F29 and F30 VMs hosted on a Seattle,
> editing dnf/rpm/__init__.py is enough to get metadata to download, but then
> trying to install or update anything fails because "package ____.armv7hl
> does not have a compatible architecture".

Correct. You still need `--forcearch=armv7hl`. With that, everything works fine for me. If someone figures out how to fix that properly, please go ahead ;)

Comment 9 Neal Gompa 2019-06-04 10:57:09 UTC
(In reply to David Rheinsberg from comment #8)
> (In reply to Yaakov Selkowitz from comment #7)
> > That doesn't seem to be sufficient.  On F29 and F30 VMs hosted on a Seattle,
> > editing dnf/rpm/__init__.py is enough to get metadata to download, but then
> > trying to install or update anything fails because "package ____.armv7hl
> > does not have a compatible architecture".
> 
> Correct. You still need `--forcearch=armv7hl`. With that, everything works
> fine for me. If someone figures out how to fix that properly, please go
> ahead ;)

Fixing that requires rpm to declare 32-bit arm arches to be compatible with aarch64 in the same way that 32-bit x86 is compatible with x86_64.

Comment 10 Peter Robinson 2019-06-04 12:48:03 UTC
(In reply to Neal Gompa from comment #9)
> (In reply to David Rheinsberg from comment #8)
> > (In reply to Yaakov Selkowitz from comment #7)
> > > That doesn't seem to be sufficient.  On F29 and F30 VMs hosted on a Seattle,
> > > editing dnf/rpm/__init__.py is enough to get metadata to download, but then
> > > trying to install or update anything fails because "package ____.armv7hl
> > > does not have a compatible architecture".
> > 
> > Correct. You still need `--forcearch=armv7hl`. With that, everything works
> > fine for me. If someone figures out how to fix that properly, please go
> > ahead ;)
> 
> Fixing that requires rpm to declare 32-bit arm arches to be compatible with
> aarch64 in the same way that 32-bit x86 is compatible with x86_64.

Why? They're not compatible. And they're as a result reported completely differently. The 64 bit variant is reported as aarch64, whereas the 32 bit variant is reported as armv7l, armv8l, etc.

Comment 11 Neal Gompa 2019-06-04 12:55:15 UTC
(In reply to Peter Robinson from comment #10)
> (In reply to Neal Gompa from comment #9)
> > (In reply to David Rheinsberg from comment #8)
> > > (In reply to Yaakov Selkowitz from comment #7)
> > > > That doesn't seem to be sufficient.  On F29 and F30 VMs hosted on a Seattle,
> > > > editing dnf/rpm/__init__.py is enough to get metadata to download, but then
> > > > trying to install or update anything fails because "package ____.armv7hl
> > > > does not have a compatible architecture".
> > > 
> > > Correct. You still need `--forcearch=armv7hl`. With that, everything works
> > > fine for me. If someone figures out how to fix that properly, please go
> > > ahead ;)
> > 
> > Fixing that requires rpm to declare 32-bit arm arches to be compatible with
> > aarch64 in the same way that 32-bit x86 is compatible with x86_64.
> 
> Why? They're not compatible. And they're as a result reported completely
> differently. The 64 bit variant is reported as aarch64, whereas the 32 bit
> variant is reported as armv7l armv8l etc

It's not that black-and-white. There are AArch64 systems that do support running 32-bit ARM code (the ARM builders we use in Mageia are such systems). OpenMandriva and openSUSE have similar builders in place too, so that they can use more performant hardware to build for 32-bit ARM.

Unfortunately, RPM is not able to determine this compatibility at runtime, so mock with --forcearch is used in such cases so that we can do builds for 32-bit ARM on AArch64.

Comment 12 Peter Robinson 2019-06-04 18:37:44 UTC
> > Why? They're not compatible. And they're as a result reported completely
> > differently. The 64 bit variant is reported as aarch64, whereas the 32 bit
> > variant is reported as armv7l armv8l etc
> 
> It's not that black-and-white. There are AArch64 systems that do support
> running 32-bit ARM code (the ARM builders we use in Mageia are such

It is actually very black and white but you're conflating two completely different issues.

The actual problem:
The architecture ISAs, unlike x86 and x86_64, are incompatible. More on that below.

Your answer:
Some vendor SoCs ship both ISAs in the same piece of silicon to enable application compatibility by being able to run both ISAs side by side, giving an appearance of compatibility when there isn't any.

The x86 -> x86_64 ISAs are compatible, the latter initially being purely 64 bit instructions added to x86 instructions. An instruction superset if you will.

The "arm" and "aarch64" ISAs are not, and they're certainly not a sub/superset like x86. There are components around the ISA that are compatible or mostly compatible, such as VFP and SIMD, but the core ISAs are incompatible, with things like registers and other such things widely different. That's why the kernel has two separate arm/arm64 directories, unlike say powerpc or x86 where they're combined.

Silicon vendors, whether they use the Cortex-Axx reference designs from Arm or third party designs from Arm licensees, can choose which components they put in the silicon, because each one adds cost, power and other such issues.

There's "ARMv8" silicon that has purely aarch64 ISA components (e.g. Marvell ThunderX2), both arm and aarch64 ISA (Cortex-A57) or just arm (Cortex-A32).

> Unfortunately, RPM is not able to determine this compatibility at runtime,
> so mock with --forcearch is used in such cases so that we can do builds for
> 32-bit ARM on AArch64.

Which is why it should only attempt this if the device reports armv7l or armv8l from uname -a. If it reports aarch64, it should be assumed it's aarch64 and hence incompatible.
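
The rule in the previous paragraph can be sketched in Python as follows. This only illustrates the proposed policy. `may_use_arm32_packages` is a hypothetical helper, not a function from rpm or dnf, and it takes the machine string that `uname -m` would print.

```python
import platform


def may_use_arm32_packages(machine):
    """Return True only when the kernel reports a 32-bit ARM machine.

    A host reporting "aarch64" is treated as incompatible, even if the
    silicon could also execute 32-bit ARM code.
    """
    return machine in ("armv7l", "armv8l")


if __name__ == "__main__":
    # platform.machine() returns the same string as `uname -m` on Linux.
    machine = platform.machine()
    print(machine, may_use_arm32_packages(machine))
```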

This is actually going to get even worse in newer upcoming chips where they can possibly run EL0 as arm (32-bit userspace) but not EL1 (32-bit kernel).

This has clearly regressed due to trying to make assumptions around the two architectures that are naive at best or just wrong.

Comment 13 Gerd Hoffmann 2019-06-05 06:50:57 UTC
> Fixing that requires rpm to declare 32-bit arm arches to be compatible with
> aarch64 in the same way that 32-bit x86 is compatible with x86_64.

Hmm?  I don't think so.  rpm (or dnf?) needs to learn that armv7hcnl is a
superset of armv7hl and thus armv7hl rpms will work just fine on armv7hcnl
machines.

comment 6 explains this.

To compare with x86:  It's like dnf/rpm knowing that i386 rpms will work just
fine on i686 machines because i686 is an i386 superset.
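
The superset idea can be sketched as a fallback chain in Python. The table below is hypothetical and heavily abridged. It is not copied from rpm or libsolv, and the names `FALLBACK`, `installable_arches` and `is_compatible` are made up for this illustration.

```python
# Hypothetical table: each arch points at the next less specific arch
# whose packages it can also run.
FALLBACK = {
    "armv7hcnl": "armv7hnl",
    "armv7hnl": "armv7hl",
    "armv7hl": "noarch",
    "i686": "i386",
    "i386": "noarch",
}


def installable_arches(machine_arch):
    """List the package arches a machine accepts, most specific first."""
    chain = []
    arch = machine_arch
    while arch is not None:
        chain.append(arch)
        arch = FALLBACK.get(arch)  # None once the chain ends
    return chain


def is_compatible(package_arch, machine_arch):
    """True if a package built for package_arch may be installed."""
    return package_arch in installable_arches(machine_arch)
```

Under this model an armv7hl package is accepted on an armv7hcnl machine, but not the other way around.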

Comment 14 Panu Matilainen 2019-06-05 08:41:40 UTC
> Which is why it should only attempt this if the device reports armv7l or armv8l from uname -a, if it reports aarch64 it should be assumed it's aarch64 and hence incompatible.
>
> This is actually going to get even worse in newer upcoming chips where they can possibly run EL0 as arm (32-bit userspace) but not EL1 (32-bit kernel).

They really should've named it "aargh"...

Comment 15 Jaroslav Rohel 2019-06-20 11:09:10 UTC
The reported bug "dnf.exceptions.Error: Incorrect or unknown "arch": armv7hcnl" was fixed.
See comment 6 for more info. PR https://github.com/rpm-software-management/dnf/pull/1404 was merged.

I will close the bug. If there is another problem, please open a new bug report.

Comment 16 Fedora Update System 2019-07-04 13:50:21 UTC
FEDORA-2019-58c2d3f1aa has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-58c2d3f1aa

Comment 17 Fedora Update System 2019-07-05 00:45:55 UTC
dnf-4.2.7-1.fc30, libdnf-0.35.1-1.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-58c2d3f1aa

Comment 18 Paul Whalen 2019-07-19 17:16:53 UTC
This still fails, now with incompatible arch errors:


 Problem 1: conflicting requests
  - package kernel-lpae-5.3.0-0.rc0.git3.1.fc31.armv7hl does not have a compatible architecture
  - nothing provides kernel-lpae-core-uname-r = 5.3.0-0.rc0.git3.1.fc31.armv7hl+lpae needed by kernel-lpae-5.3.0-0.rc0.git3.1.fc31.armv7hl
  - nothing provides kernel-lpae-modules-uname-r = 5.3.0-0.rc0.git3.1.fc31.armv7hl+lpae needed by kernel-lpae-5.3.0-0.rc0.git3.1.fc31.armv7hl
 Problem 2: conflicting requests
  - package grubby-deprecated-8.40-34.fc31.armv7hl does not have a compatible architecture

Comment 19 Peter Robinson 2019-07-19 18:05:23 UTC
pkratoch: all these Arm changes need to be reverted; they're incorrect and breaking things. At the very least they should be made configurable, so the distros that want incorrect implementations can opt into them. We in Fedora (I'm speaking as both the Fedora Arm lead and as part of the actual Arm community) do not want this, and it's causing us significant support load. So please revert all the Arm changes and engage with the Arm community in Fedora on subsequent changes.

Comment 20 Pavla Kratochvilova 2019-07-22 07:16:11 UTC
Peter, I can only see PR https://github.com/rpm-software-management/dnf/pull/1404 associated with this bug. Is it sufficient to revert only this PR or are there more Arm changes you wish to revert?

Comment 21 Peter Robinson 2019-07-22 11:07:15 UTC
There were at least two changes. The first is an incorrect "enhancement" to running ARMv7 on aarch64, which is when we started seeing the errors in this bug report. Then there was the fix for this bug. All should go.

Comment 22 Pavla Kratochvilova 2019-07-22 11:59:23 UTC
Do you know, by any chance, which commit caused this? And if not, is it ok, if for now I make a Fedora 30 update with only the second patch reverted and revert the first one after I can discuss it with Jaroslav Rohel? If I understand this correctly, the first patch is in F30 now, the second is not, so such an update would not change anything.

Comment 23 Peter Robinson 2019-07-22 14:12:48 UTC
No, I don't, but if you just revert the second it'll still be broken.

Comment 24 Fedora Update System 2019-07-23 07:21:17 UTC
FEDORA-2019-672a74d688 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-672a74d688

Comment 25 Pavla Kratochvilova 2019-07-23 07:25:25 UTC
This bug, of course, shouldn't have been added to the new update to Fedora 30. Sorry about that, I removed it now. Also, I am moving this back to NEW.

Comment 26 Jun Aruga 2019-08-04 00:12:25 UTC
I hit this issue when running the Fedora 32-bit ARM container image "arm32v7/fedora": https://hub.docker.com/r/arm32v7/fedora/ with QEMU on Travis CI.

https://travis-ci.org/junaruga/fedora-workshop-multiarch/jobs/567414034#L994
> Error: Incorrect or unknown "arch": armv7hcnl

Can we add a unit test to check this, by adding a 32-bit ARM case to the dnf upstream project's CI or to the %check section of dnf.spec?

Comment 27 Jun Aruga 2019-08-05 09:10:54 UTC
> https://travis-ci.org/junaruga/fedora-workshop-multiarch/jobs/567414034#L994
> > Error: Incorrect or unknown "arch": armv7hcnl

I created a reproducer that you can run locally.

Prepare the Dockerfile below.

```
$ cat Dockerfile 
FROM arm32v7/fedora

RUN uname -m
RUN rpm -q rpm --qf "%{arch}\n"
RUN rpm -q dnf

RUN ARCH=$(rpm -q rpm --qf "%{arch}") && \
  dnf -y --forcearch "${ARCH}" upgrade && \
  dnf -y --forcearch "${ARCH}" install gcc
```

Install qemu-user-static RPM if you have not installed it yet. [1]

```
$ sudo dnf install qemu-user-static
```

Then you will see the /proc/sys/fs/binfmt_misc/qemu-$arch files installed by the RPM on your local machine.
You can now run containers for other architectures locally. My environment is x86_64.

```
$ ls /proc/sys/fs/binfmt_misc/qemu-*

$ uname -m
x86_64

$ podman run --rm -t arm32v7/fedora uname -m
armv7l
```

Then build the Dockerfile above like this to see the error message.

```
$ podman build --rm -t my-fedora-armv7hl .
...
Installed:
  gnupg2-smime-2.2.13-1.fc29.armv7hl     grubby-8.40-18.fc29.armv7hl           
  libxkbcommon-0.8.2-1.fc29.armv7hl      pinentry-1.1.0-4.fc29.armv7hl         
  trousers-0.3.13-11.fc29.armv7hl        libsecret-0.18.7-1.fc29.armv7hl       
  xkeyboard-config-2.24-5.fc29.noarch    trousers-lib-0.3.13-11.fc29.armv7hl   

Complete!
Error: Incorrect or unknown "arch": armv7hcnl
Error: error building at STEP "RUN ARCH=$(rpm -q rpm --qf "%{arch}") &&   dnf -y --forcearch "${ARCH}" upgrade &&   dnf -y --forcearch "${ARCH}" install   gcc": error while running runtime: exit status 1
```

The container image has the following environment.

```
$ uname -m
armv7l

$ rpm -q rpm --qf "%{arch}\n"
armv7hl

$ rpm -q dnf
dnf-4.0.9-2.fc29.noarch
```

* [1] The qemu-user-static RPM installs the /proc/sys/fs/binfmt_misc/qemu-* files.
  The files are not removed even when the RPM is removed with "dnf remove qemu-user-static".
  The files are not harmful, but if you want to remove them, you can run the command below at your own risk.

  ```
  # find /proc/sys/fs/binfmt_misc -type f -name 'qemu-*' -exec sh -c 'echo -1 > {}' \;
  ```

  I reported the issue to the qemu project.
  qemu-user-static: qemu-user-static works even after "dnf remove qemu-user-static"
  https://bugzilla.redhat.com/show_bug.cgi?id=1732178

Comment 28 Jaroslav Rohel 2019-08-07 14:15:30 UTC
Comment 21

>There was at least two changes, the first one where we started to get these errors, which are an incorrect "enhancement" to running ARMv7 on aarch64,

Do you mean the commit "Improve ARM detection" ( https://github.com/rpm-software-management/libdnf/pull/442 )? The commit unifies detection, which is a good idea.
The same algorithm was also added to RPM: https://github.com/rpm-software-management/rpm/commit/8c3a7b8fa92b49a811fe36b60857b12f5d7db8a8 .

I'm not sure whether an armv7 CPU with crypto instructions physically exists. Maybe the problem occurs only under "qemu" virtualization, i.e. emulation of a CPU that does not physically exist.
I created a utility with the same detection algorithm and tried it with "qemu-arm" hosted on x86-64. The problem occurs only if "qemu-arm" is started without a CPU definition or with the definition "any" or "max" (max in this case means all features).

#qemu-arm -cpu cortex-a7 /home/containers/fedora30_arm/arch_detect
Result 'armv7hnl'

#qemu-arm -cpu cortex-r5f /home/containers/fedora30_arm/arch_detect
Result 'armv7hl'

#qemu-arm -cpu any /home/containers/fedora30_arm/arch_detect
Result 'armv7hcnl'

#qemu-arm -cpu max /home/containers/fedora30_arm/arch_detect
Result 'armv7hcnl'

#qemu-arm /home/containers/fedora30_arm/arch_detect
Result 'armv7hcnl'

Anyway, from my point of view "armv7hcnl" is "armv7hnl" with the crypto instruction set added. Isn't it?
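
As a rough illustration of how the results above fit together, the architecture name can be modelled as a base name plus one letter per detected feature. This is a sketch only and not the libdnf code. `compose_arm_arch` is a hypothetical helper, it handles little-endian names only, and the feature flags per QEMU CPU model are inferred from the results listed above rather than read from QEMU.

```python
def compose_arm_arch(version, hard_float, crypto, neon):
    """Build a little-endian ARM arch name such as "armv7hnl".

    Letters: h = hard-float FPU, c = crypto, n = NEON, l = little endian.
    """
    name = f"armv{version}"
    if hard_float:
        name += "h"
    if crypto:
        name += "c"
    if neon:
        name += "n"
    return name + "l"


# Feature flags inferred from the test results, for illustration only.
QEMU_CPUS = {
    "cortex-a7": dict(hard_float=True, crypto=False, neon=True),
    "cortex-r5f": dict(hard_float=True, crypto=False, neon=False),
    "max": dict(hard_float=True, crypto=True, neon=True),
}

if __name__ == "__main__":
    for cpu, features in QEMU_CPUS.items():
        print(cpu, compose_arm_arch(7, **features))
```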

Comment 29 Jaroslav Rohel 2019-08-08 10:13:01 UTC
I suggest adding the new "armv7hcnl" architecture (as a superset of "armv7hnl") to "rpm" and "libsolv". What do you think about it?

Maybe "armv7hcnl" is nonsense, and "qemu-arm" or "uname" wrongly reports armv7 with crypto instead of armv8.

Summary of the situation:
Some time ago the "Improve arm detection" patches were merged into "libdnf" https://github.com/rpm-software-management/libdnf/pull/442 and into "rpm" https://github.com/rpm-software-management/rpm/commit/8c3a7b8fa92b49a811fe36b60857b12f5d7db8a8.
The patches unify detection of all armv* architecture subtypes and add support for detecting the crypto extensions.
I don't know whether an arm v7 CPU with the crypto extension physically exists, but "qemu-arm" with the default (unspecified) CPU is now detected (according to my tests in comment #28) as "armv7hcnl" (fpu, crypto, NEON, little endian).
The "armv7hcnl" architecture was not supported in dnf. A patch was added to dnf https://github.com/rpm-software-management/dnf/pull/1404/files which adds the "armv7hcnl" architecture and assumes that "armv7hcnl" is a superset of "armv7hnl".
But a similar architecture table exists in more places. I found one in the libsolv project ("libsolv/src/poolarch.c"). I also found "arch_canon" in the rpm project ("rpm/rpm.c"). This is a problem for DNF. There is probably a workaround using "--forcearch armv7hnl".

Comment 30 Peter Robinson 2019-08-08 12:43:31 UTC
(In reply to Jaroslav Rohel from comment #29)
> I suggest to add the new "armv7hcnl" (as superset of "armv7hnl")
> architecture into the "rpm" and "libsolv". What do you think about it?
> 
> Maybe the "armv7hcnl" is nonsense. And the "qemu-arm" or the "uname" reports
> wrongly armv7 with crypto instead armv8.

No, please don't add this at all. The "c" component is actually garbage in this context and should not be added. It's not an extension that is available on ARMv7; see some of the explanation in comment 12.

Comment 31 Jaroslav Rohel 2019-08-09 07:14:36 UTC
Reply to Peter Robinson comment #30

DNF on "qemu-arm" is (in some configurations) broken now. We want to fix it.
I see 2 solutions:
1. Add the new "armv7hcnl" architecture (as a superset of "armv7hnl").
2. Change the detection algorithm to detect the crypto extension only if arm version >= 8. This means "armv7hnl" will be detected instead of "armv7hcnl".

I considered both solutions. I'm going to do the second one -> crypto will be detected only if arm version >= 8.
OK?

Comment 32 Jaroslav Rohel 2019-08-09 08:20:09 UTC
PR https://github.com/rpm-software-management/libdnf/pull/771
The crypto extension is detected only on arm version >= 8.
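
The rule can be sketched in a few lines of Python. This is not the code from the PR. `detect_arm_arch` is a hypothetical helper that assumes a hard-float, little-endian system and takes the raw feature flags as arguments.

```python
def detect_arm_arch(arm_version, neon, crypto_reported):
    """Compose an arch name, honouring crypto only on ARM version >= 8."""
    crypto = crypto_reported and arm_version >= 8
    name = f"armv{arm_version}h"
    if crypto:
        name += "c"
    if neon:
        name += "n"
    return name + "l"
```

With this gate, a CPU that reports version 7 together with crypto, as "qemu-arm" does by default, comes out as "armv7hnl" rather than "armv7hcnl".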

Comment 33 Peter Robinson 2019-08-13 16:42:41 UTC
> DNF on "qemu-arm" is (in some configuration) broken now. We want to fix it.
> I see 2 solutions:
> 1. adding the new "armv7hcnl" (as superset of "armv7hnl")
> 2. change of the detection algorithm to detect crypto extension only if arm
> version >= 8. This means "armv7hnl" will be detected instead "armv7hcnl".
> 
> I considered both solutions. I'm goint to do second one -> crypto will be
> detected only if arm version >= 8.
> OK?

3. Revert all the changes made around this so it's not special cased at all.

This specific feature is not something that should be optimised for at compile time, especially on ARMv7. It should be detected at run time. Please don't do this.

