Bug 915269 - [arch] dnf doesn't work on armv7hl
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: dnf
Version: 18
Hardware: arm Linux
Priority: low
Severity: unspecified
Assigned To: Ales Kozumplik
QA Contact: Fedora Extras Quality Assurance
Depends On:
Blocks: ARMTracker
Reported: 2013-02-25 06:01 EST by T.C. Hollingsworth
Modified: 2014-09-30 19:41 EDT (History)

See Also:
Fixed In Version: hawkey-0.3.16-2.git1e5a593.fc19
Doc Type: Bug Fix
Last Closed: 2013-08-14 23:01:21 EDT
Type: Bug
Description T.C. Hollingsworth 2013-02-25 06:01:41 EST
dnf can only find noarch packages on armv7hl.  It fails to install arched packages and just says "Nothing to do".

This might be related to the fact that uname -m returns armv7l with the OLPC kernel.  I believe yum was patched to handle this.
Comment 1 Ales Kozumplik 2013-02-25 06:12:52 EST
The good thing is libsolv *does* seem to have the armv7hl support. Probably some magic in dnf is missing, but since the arch.py is waiting to be replaced anyway this current bug will be handled then.
Comment 2 Ales Kozumplik 2013-02-27 02:47:31 EST
Hm, also---where can I test this?  I don't have any armv7l machine available.
Comment 3 T.C. Hollingsworth 2013-02-27 16:44:53 EST
There's an ARMv7 test machine that all Fedora packagers have SSH and sudo access to, but I think it's on F17. :-(
http://fedoraproject.org/wiki/Test_Machine_Resources_For_Package_Maintainers

OLPC also gives out free armv7hl XO-1.75 laptops to people working on Fedora ARM:
http://wiki.laptop.org/go/Contributors

You might also want to ping jcm@redhat.com or blc@redhat.com in case they have something set up for RH folks.
Comment 4 Ales Kozumplik 2013-04-16 11:31:33 EDT
Hello Jonathan, are there any armv7l machines (remotely) available internally? I'd like to test DNF there because of this bug.
Comment 5 Ales Kozumplik 2013-04-19 07:40:31 EDT
T.C.,

I looked at this today and found that hawkey, the underlying library of DNF, depends on uname reporting the architecture precisely. In this case, it won't think that armv7hl packages are eligible and so will skip them. This is something that should be fixed in the kernel so we don't have to hack the tools everywhere.

Reassigning.
Comment 6 Ales Kozumplik 2013-04-19 08:19:13 EDT
FWIW, it looks like the Yum patch the OP is referring to is c757d314aed28e59dd92dcd0af44b0cc43744f59, which looks at the expansion of the rpm macro _target_cpu.

And it looks like a fedora patch is helping RPM get this right:

rpm-4.9.0-armhfp-logic.patch:
...
+           if (strcmp(un.machine, "armv7l") == 0 ) {
+               if (has_neon() && has_hfp())
+                    strcpy(un.machine, "armv7hnl");
+                else if (has_hfp())
+                    strcpy(un.machine, "armv7hl");
...

This really ought to be fixed in the kernel, before more people start developing for Fedora on ARM.
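
For readers who want to experiment with the branch quoted above, here is a minimal standalone sketch of the same mapping. It is not the rpm code: `map_arm_machine` is a hypothetical name, the two capability checks are passed in as plain flags so the logic can be exercised without /proc/cpuinfo, and the fall-through behaviour (returning the name unchanged) is an assumption, since the quoted patch elides what follows.

```c
#include <string.h>

/*
 * Hypothetical helper: map the machine string reported by uname() to an
 * RPM architecture name, following only the branch quoted from
 * rpm-4.9.0-armhfp-logic.patch. The has_hfp / has_neon flags stand in
 * for the patch's has_hfp() and has_neon() calls.
 */
const char *map_arm_machine(const char *machine, int has_hfp, int has_neon)
{
    if (strcmp(machine, "armv7l") != 0)
        return machine;      /* not the case the quoted branch covers */
    if (has_neon && has_hfp)
        return "armv7hnl";
    if (has_hfp)
        return "armv7hl";
    return machine;          /* assumption: leave the name unchanged */
}
```

With these assumptions, `map_arm_machine("armv7l", 1, 0)` yields "armv7hl", which is the name the reporter's packages carry.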
Comment 7 Peter Robinson 2013-04-19 09:08:08 EDT
There are ARM machines available for testing through beaker now I believe for people at RedHat. 

armv7 is the right designator for the HW. The issue is there are three  floating-point ABIs that can be used on ARM:
*) soft (software FP emulation) - which has never been used on Fedora AFAIK
*) softfp - allows the generation of code using hardware floating-point instructions, but still uses the soft-float calling conventions. This is used on the armv5tel and armv7l binaries (with an optional armv7nl for NEON optimised binaries). This was the only version supported on Fedora releases < F-15. It was dropped in Fedora 19 so the last supported release will be Fedora 18. 
*) hard - allows generation of floating-point instructions and uses FPU-specific calling conventions. This was introduced on Fedora during Fedora 15 for a hardfp bringup and was officially supported from Fedora 17 (F-16 was skipped). As of Fedora 19 it's the only ARM 32bit option supported. The ABI is identified with rpms using armv7hl (with an optional armv7hnl for NEON optimised binaries).

The problem is that all three of the above can run on the same HW. As on ix86, the kernel doesn't care about the FPU, so you can use the same kernel to run any of the three userspaces. However, the hard-float and soft-float ABIs are not link-compatible: you must compile your entire program with the same ABI and link with a compatible set of libraries. From Fedora 18 there's a separate linker path.

So there's no easy/real way to detect this at the kernel level and hence the rpm/yum hacks. I suppose they will need to be replicated in dnf/zif/whatever.
Comment 8 Ales Kozumplik 2013-04-19 09:22:59 EDT
(In reply to comment #7)
> There are ARM machines available for testing through beaker now I believe
> for people at RedHat.

That's what I tried and the job failed: http://beaker.farm.hsv.redhat.com/bkr/jobs/350

> 
> armv7 is the right designator for the HW. The issue is there are three 
> floating-point ABIs that can be used on ARM:
> *) soft (software FP emulation) - which has never been used on Fedora AFAIK
> *) softfp - allows the generation of code using hardware floating-point
> instructions, but still uses the soft-float calling conventions. This is
> used on the armv5tel and armv7l binaries (with an optional armv7nl for NEON
> optimised binaries). This was the only version supported on Fedora releases
> < F-15. It was dropped in Fedora 19 so the last supported release will be
> Fedora 18. 
> *) hard - allows generation of floating-point instructions and uses
> FPU-specific calling conventions. This was introduced on Fedora during
> Fedora 15 for a hardfp bringup and was officially supported from Fedora 17
> (F-16 was skipped). As of Fedora 19 it's the only ARM 32bit option
> supported. The ABI is identified with rpms using armv7hl (with an optional
> armv7hnl for NEON optimised binaries).

Is 'armv7hnl' the same as 'armv7nhl'? I see libsolv knows about the latter and won't accept 'armv7hnl'. Why is this not standardized across distributions?

> 
> The problem is all three of the above can run on the same HW and like on
> ix86 the kernel doesn't care about FPU so you can use the same kernel to run
> any of the three userspace but the hard-float and soft-float ABIs are not
> link-compatible; you must compile your entire program with the same ABI, and
> link with a compatible set of libraries. 

For clarification, this is the same as with x86_64 and i686, no? Both can run on the same HW but must be linked with the respective libraries.

> So there's no easy/real way to detect this at the kernel level and hence the
> rpm/yum hacks. I suppose they will need to be replicated in dnf/zif/whatever.

OK. As an alternative to fixing this in the kernel the ARM team can create a library that tells us what .rpms can be installed on the particular machine (like in this case it would return "armv7hl"). So the remaining tools can free themselves from this. To prevent conditional compilation the library should transparently use uname() on non-arm arches.
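
A rough sketch of what such a library entry point might look like follows. Everything here is hypothetical (the name `installable_arch`, the `arm_refine_fn` hook type, the return convention); no such library exists in this thread. The only real API used is POSIX `uname()`. On non-ARM machines, or when no hook is supplied, it simply passes through what `uname()` reports.

```c
#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical hook: given the uname() machine string on ARM, return the
 * RPM arch name to use (for example "armv7hl"). */
typedef const char *(*arm_refine_fn)(const char *machine);

/*
 * Hypothetical entry point: write the installable RPM arch into `out`.
 * Returns 0 on success, -1 on error or if `out` is too small.
 */
int installable_arch(char *out, size_t len, arm_refine_fn refine)
{
    struct utsname un;
    const char *arch;

    if (out == NULL || len == 0 || uname(&un) != 0)
        return -1;

    arch = un.machine;
    /* Only 32-bit ARM names start with "arm"; other arches pass through. */
    if (refine != NULL && strncmp(arch, "arm", 3) == 0)
        arch = refine(arch);

    if (arch == NULL || strlen(arch) >= len)
        return -1;
    strcpy(out, arch);
    return 0;
}
```

How the hook decides between soft-float and hard-float names is deliberately left open, since that is exactly the point under discussion here.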
Comment 9 Ales Kozumplik 2013-04-19 09:47:07 EDT
(In reply to comment #7)
> So there's no easy/real way to detect this at the kernel level

I am not sure this is exactly true.  The RPM patch detects the neon and floating point as such:

+static int has_neon()
+{
+        char buffer[4096], *p;
+        int fd = open("/proc/cpuinfo", O_RDONLY);
+        if (read(fd, &buffer, sizeof(buffer) - 1) == -1) {
+                rpmlog(RPMLOG_WARNING, _("read(/proc/cpuinfo) failed\n"));
+                close(fd);
+                return 0;
+        }
+        close(fd);
+
+        p = strstr(buffer, "Features");
+        p = strtok(p, "\n");
+        p = strstr(p, "neon");
+        p = strtok(p, " ");
+        if (p == NULL) {
+                rpmlog(RPMLOG_WARNING, _("/proc/cpuinfo has no 'Features' line\n"));
+                return 0;
+        } else if (strcmp(p, "neon") == 0) {
+                return 1;
+        }
+        return 0;
+}
+
+static int has_hfp()
+{
+        char buffer[4096], *p;
+        int fd = open("/proc/cpuinfo", O_RDONLY);
+        if (read(fd, &buffer, sizeof(buffer) - 1) == -1) {
+                rpmlog(RPMLOG_WARNING, _("read(/proc/cpuinfo) failed\n"));
+                close(fd);
+                return 0;
+        }
+        close(fd);
+
+        p = strstr(buffer, "Features");
+        p = strtok(p, "\n");
+        p = strstr(p, "vfpv3");
+        p = strtok(p, " ");
+        if (p == NULL) {
+                rpmlog(RPMLOG_WARNING, _("/proc/cpuinfo has no 'Features' line\n"));
+                return 0;
+        } else if (strcmp(p, "vfpv3") == 0) {
+                return 1;
+        }
+        return 0;
+}
+#endif

So it looks at /proc/cpuinfo. And isn't it the kernel that puts the information in /proc/cpuinfo? So it must know about both the FP and Neon.
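
As an aside, the quoted functions appear to test for NULL only after the pointer has already been handed to strstr(), and the buffer is not NUL-terminated after read(). If dnf or hawkey ends up replicating this check, a more defensive variant could look like the sketch below. It is an illustration, not the rpm code: `cpuinfo_has_feature` is a hypothetical name, and it takes the file contents as a NUL-terminated string so it can be tested without an ARM machine.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if `feature` appears as a whole whitespace-separated word on
 * the "Features" line of `cpuinfo`, 0 otherwise. `cpuinfo` must be a
 * NUL-terminated copy of /proc/cpuinfo; it is not modified.
 */
int cpuinfo_has_feature(const char *cpuinfo, const char *feature)
{
    const char *line, *end, *p;
    size_t flen;

    if (cpuinfo == NULL || feature == NULL || *feature == '\0')
        return 0;
    flen = strlen(feature);

    line = strstr(cpuinfo, "Features");
    if (line == NULL)
        return 0;                       /* no Features line at all */
    end = strchr(line, '\n');
    if (end == NULL)
        end = line + strlen(line);      /* last line without newline */
    p = strchr(line, ':');
    if (p == NULL || p >= end)
        return 0;
    p++;

    while (p < end) {
        const char *tok;

        while (p < end && (*p == ' ' || *p == '\t'))
            p++;                        /* skip separators */
        tok = p;
        while (p < end && *p != ' ' && *p != '\t')
            p++;                        /* scan one word */
        if ((size_t)(p - tok) == flen && strncmp(tok, feature, flen) == 0)
            return 1;
    }
    return 0;
}
```

Matching whole words keeps "vfp" and "vfpv3" distinct, which a plain substring search would not.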
Comment 10 Peter Robinson 2013-04-19 10:00:59 EDT
> So it looks at /proc/cpuinfo. And isn't it the kernel that puts the
> information in /proc/cpuinfo? So it must know about both the FP and Neon.

You can have HW that is capable of HFP but still be running a software stack that is softfp. The same goes for NEON. E.g. we compile all binaries without NEON support on Fedora because there is HW that doesn't have it (Tegra2, Marvell), so you should never use the information in /proc/cpuinfo to make the decision as to what binaries to install.
Comment 11 Ales Kozumplik 2013-04-22 02:36:45 EDT
Peter you are contradicting yourself, it was said in comment 7:

> So there's no easy/real way to detect this at the kernel level and hence the
> rpm/yum hacks. I suppose they will need to be replicated in dnf/zif/whatever.

then I showed the "rpm hack" and you are saying /proc/cpuinfo should never be used to make the decision about binaries to install. The precise reason why RPM needs to know the arch is to see which architectures are binary compatible and which packages are thus safe to install.

In any case, if /proc/cpuinfo can not be trusted it is quite clearly the case this issue is something the userspace can not do much about.
Comment 12 Peter Robinson 2013-04-27 05:54:10 EDT
(In reply to comment #11)
> Peter you are contradicting yourself, it was said in comment 7:

How so?

> > So there's no easy/real way to detect this at the kernel level and hence the
> > rpm/yum hacks. I suppose they will need to be replicated in dnf/zif/whatever.
> 
> then I showed the "rpm hack" and you are saying /proc/cpuinfo should never
> be used to make the decision about binaries to install. The precise reason
> why RPM needs to know the arch is to see what architectures are binary
> compatible and what packages thus safe to install.

/proc/cpuinfo is used but it's not the only thing that has to be used. As explained it's possible to have 3 different binary types running on the same HW. So while the contents of that file can be used to determine whether or not to install a neon (or iwmmxt or something else) optimised version of a package it can't be the only thing to determine that.

> In any case, if /proc/cpuinfo can not be trusted it is quite clearly the
> case this issue is something the userspace can not do much about.

It's not the only thing needed to take into account. Userspace works fine as it works in yum, rpm and I believe even zif supports it fine.
Comment 13 Ales Kozumplik 2013-04-29 02:36:26 EDT
There's the contradiction again. For ARM, as far as I can see, neither yum nor rpm looks at anything other than /proc/cpuinfo.
Comment 14 Josh Boyer 2013-07-03 10:11:10 EDT
This stalled out.  I don't think it's a kernel problem.  I'm gathering that Ales doesn't think it's a yum/rpm problem.  I'm sticking it in the distribution component until you guys work it out.  Play nice.
Comment 15 Bill Nottingham 2013-07-09 12:06:37 EDT
Take the example of x86_64. rpm supports building packages for:

x86_64
ia32e
amd64
em64t
athlon
i686
i586
...
noarch

Hence:
arch_compat: x86_64: amd64 em64t athlon noarch
arch_compat: amd64: x86_64 em64t athlon noarch
arch_compat: ia32e: x86_64 em64t athlon noarch

uname, in the kernel, reports x86_64. Distinctions beyond uname (ia32e, amd64, etc.) are done in the RPM code, and filtered up to higher-level apps (yum, dnf, etc.)

So, in this case, I think it would be the correct solution where:

- uname, in the kernel, reports the generic armv7l
- distinctions beyond that are done at the RPM level
- and then filtered up to the higher level code
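
To make the layering concrete, here is a small sketch of the RPM-level step: a lookup over just the three arch_compat lines quoted above. The table and the name `arch_accepts` are illustrative assumptions. It checks one level only, whereas real rpm configuration contains more entries and follows compatibility chains, so this is not a substitute for asking rpm.

```c
#include <stddef.h>
#include <string.h>

struct arch_compat {
    const char *arch;
    const char *compat[5];      /* NULL-terminated list */
};

/* Only the three arch_compat lines quoted in this comment. */
static const struct arch_compat compat_table[] = {
    { "x86_64", { "amd64",  "em64t", "athlon", "noarch", NULL } },
    { "amd64",  { "x86_64", "em64t", "athlon", "noarch", NULL } },
    { "ia32e",  { "x86_64", "em64t", "athlon", "noarch", NULL } },
};

/*
 * Return 1 if a package built for `pkg_arch` is listed as directly
 * compatible with `base` (or is the same arch), 0 otherwise.
 * One-level lookup only; chains are not followed.
 */
int arch_accepts(const char *base, const char *pkg_arch)
{
    size_t i, j;

    if (strcmp(base, pkg_arch) == 0)
        return 1;
    for (i = 0; i < sizeof compat_table / sizeof compat_table[0]; i++) {
        if (strcmp(compat_table[i].arch, base) != 0)
            continue;
        for (j = 0; compat_table[i].compat[j] != NULL; j++)
            if (strcmp(compat_table[i].compat[j], pkg_arch) == 0)
                return 1;
    }
    return 0;
}
```

A higher-level tool would call something like this with the arch RPM settled on, not with the raw uname string.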
Comment 16 Dennis Gilmore 2013-07-21 20:10:03 EDT
rpm applies a patch adding has_hfp() depending on whether the target arch is armv5tel or not. The build of rpm for software floating point doesn't know how to detect hard or soft floating point at run time, so it's set at build time. yum calls into rpm to detect which arches are compatible.
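
One way a tool could fix this choice at build time, sketched below, is to ask the compiler which calling convention it is targeting. This is a swapped-in illustration, not what the rpm patch does: it relies on `__ARM_PCS_VFP`, a macro that ARM compilers predefine when building for the hard-float calling convention, and `build_float_abi` is a hypothetical name.

```c
/*
 * Report the floating-point calling convention this code was compiled
 * for. The answer describes the userspace the binary belongs to, which
 * is the thing /proc/cpuinfo cannot tell you.
 */
const char *build_float_abi(void)
{
#if defined(__arm__) && defined(__ARM_PCS_VFP)
    return "hard";              /* hard-float calling convention */
#elif defined(__arm__)
    return "soft-float-cc";     /* soft or softfp: soft-float calling convention */
#else
    return "not-arm32";         /* built for some other architecture */
#endif
}
```

A binary built this way on an armv7hl system would report "hard" regardless of which kernel it later runs under.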
Comment 17 Dennis Gilmore 2013-07-22 15:50:23 EDT
Reassigning to dnf, as that is where the bug lies and where it needs to be fixed.
Comment 18 Ales Kozumplik 2013-07-23 04:17:07 EDT
I'm deprioritizing this as there are currently not many users waiting for DNF on ARM (yell if untrue). If this changes and if I can borrow an ARM machine, I'll fix this using the same hacks yum and rpm do. I still doubt it's the right thing to do, hacking this in random places in userspace when the correct machine name should be returned by uname, as on all the other arches (and the kernel definitely has the means to implement this, as I've shown in comment 9).
Comment 19 Ales Kozumplik 2013-07-29 08:07:56 EDT
Dennis, can I please get (temporary) ssh access to an F19 ARM machine to build hawkey and test it?
Comment 20 Peter Robinson 2013-07-29 08:14:11 EDT
(In reply to Ales Kozumplik from comment #19)
> Dennis can I please get a (temporary) ssh access to an F19 arm machine to
> build hawkey and test it?

You can provision F-19 ARM devices via the internal Beaker instance.
Comment 23 Ales Kozumplik 2013-07-30 11:15:08 EDT
I'm targeting this fix for F19 still.

Libsolv will have to be rebased: https://github.com/openSUSE/libsolv/commit/a59d11d84d0ff7dcb0787c493eff1a6a982fc2fb
Comment 24 Ales Kozumplik 2013-07-31 11:44:16 EDT
handled in hawkey commit 1e5a593, dnf on v7 ARMs should work starting with hawkey-0.3.16-2 and libsolv-0.3.0-8.
Comment 25 Fedora Update System 2013-08-01 04:36:08 EDT
hawkey-0.3.16-2.git1e5a593.fc19,libsolv-0.3.0-8.gita59d11d.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/hawkey-0.3.16-2.git1e5a593.fc19,libsolv-0.3.0-8.gita59d11d.fc19
Comment 26 Michal Toman 2013-08-01 09:40:09 EDT
(In reply to Peter Robinson from comment #10)
> > So it looks at /proc/cpuinfo. And isn't it the kernel that puts the
> > information in /proc/cpuinfo? So it must know about both the FP and Neon.
> 
> You can have HW that is capable of HFP but still be running a software stack
> that is softfp. Same goes for NEON. EG we compile all binaries without neon
> support on Fedora because there is HW that doesn't have it (Tegra2, Marvell)
> so you should never use the information in /proc/cpuinfo to make the
> decision as to what binaries to install.

I ran into similar problems some time ago when playing with ARM and I have to agree with Ales. If one kernel is able to execute several userspace architectures, there should be a common way to determine which one is actually in use (maybe a client library?). The decision is apparently too complex to be embedded into each application.

I can imagine the situation is similar for i386 userspace running on x86_64 kernel.
Comment 27 Fedora Update System 2013-08-02 17:56:27 EDT
Package hawkey-0.3.16-2.git1e5a593.fc19, libsolv-0.3.0-8.gita59d11d.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing hawkey-0.3.16-2.git1e5a593.fc19 libsolv-0.3.0-8.gita59d11d.fc19'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-14092/hawkey-0.3.16-2.git1e5a593.fc19,libsolv-0.3.0-8.gita59d11d.fc19
then log in and leave karma (feedback).
Comment 28 Dennis Gilmore 2013-08-06 21:15:45 EDT
At least in rawhide, dnf doesn't work.

dnf-0.3.10-2.giteb9dddb.fc20.noarch
hawkey-0.3.16-1.git4e79abc.fc20.armv7hl
libsolv-0.3.0-9.gita59d11d.fc20.armv7hl

I have the above installed which is the latest in rawhide
Comment 29 Ales Kozumplik 2013-08-07 04:10:01 EDT
hawkey-0.3.16-2.git1e5a593.fc20 is the latest in f20. You probably can't see it for armv7 because of bug 991910, but as explained there, that was the Mass Rebuild's fault. All should be fine with the next round of builds.
Comment 30 Fedora Update System 2013-08-14 23:01:21 EDT
hawkey-0.3.16-2.git1e5a593.fc19, libsolv-0.3.0-8.gita59d11d.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.
