Bug 1483278

Summary: lorax runtime-install tries to install 'rdma' on armhfp, but 'rdma-core' (which replaced it) is not built for armhfp - breaks composes
Product: [Fedora] Fedora
Component: lorax
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Whiteboard: AcceptedBlocker
Reporter: Kevin Fenzi <kevin>
Assignee: Brian Lane <bcl>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: anaconda-maint-list, awilliam, bcl, dledford, honli, jarodwilson, pbrobinson, robatino
Target Milestone: ---
Target Release: ---
Last Closed: 2017-09-06 20:34:50 UTC
Type: Bug
Bug Blocks: 1396702

Description Kevin Fenzi 2017-08-20 01:10:30 UTC
The old rdma package built and was provided on armv7 (I have no idea if it worked or not). Thus, lorax expects that it will be available to make netinstall isos and other images. 

The new rdma-core package excludes armv7 and breaks composes where it's expected to exist.

See:
https://koji.fedoraproject.org/koji/taskinfo?taskID=21322139

So, possible solutions: 

1) drop the exclude in rdma-core. Does it still apply? 

or

2) Patch lorax to not require rdma on armv7.

Comment 1 Kevin Fenzi 2017-08-21 20:29:34 UTC
Until we sort this out I have unblocked rdma and libmlx4 in f27. 

Once we fix this properly, we should block them in f28/rawhide and f27.

Comment 2 Honggang LI 2017-08-22 00:34:44 UTC
(In reply to Kevin Fenzi from comment #0)
> The old rdma package built and was provided on armv7 (I have no idea if it
> worked or not). Thus, lorax expects that it will be available to make
> netinstall isos and other images. 
> 
> The new rdma-core package excludes armv7 and breaks composes where it's
> expected to exist. 
> 
> See:
> https://koji.fedoraproject.org/koji/taskinfo?taskID=21322139
> 
> So, possible solutions: 
> 
> 1) drop the exclude in rdma-core. Does it still apply? 

We can't drop the exclude in rdma-core; please see the rdma-core.spec file for details:

rdma-core (f27)]$ vi rdma-core.spec
# 32-bit arm is missing required arch-specific memory barriers,
ExcludeArch: %{arm}

> or
> 
> 2) Patch lorax to not require rdma on armv7.

Comment 3 Doug Ledford 2017-08-22 00:43:20 UTC
Just because the old rdma package (which was nothing but scripts and was a noarch package) could be installed on armv7 did not mean that libibverbs and librdmacm (the two basic components of user space RDMA applications) were also there and usable. They might have been there, but they certainly were not usable. Along with the transition from separate librdmacm and libibverbs packages to a single rdma-core that combines these components (as well as several others) into a cohesive package, we tightened down which arches we allow the build to happen on. If we don't have the right support on an arch, we no longer allow the package to build on that arch. This matters most when the arch would appear to work but produce random corruption due to coherency issues, as all 32-bit arm arches will, since none of them have the PCI DMA coherency properties required for RDMA to work right. It is better to refuse to build than to build a known broken product that gives the appearance of working.

So, we will need to update lorax to know that the rdma-core package is not applicable to any 32bit arm arch.
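
The change described here amounts to an arch check when the installer runtime's package list is assembled. A minimal Python sketch of that logic follows. The names `EXCLUDED_ARCHES` and `runtime_packages` and the sample package list are hypothetical, chosen for illustration only; this is not lorax's actual code.

```python
# Hypothetical sketch: skip packages that are not built for the target arch.
# The exclusion table is illustrative and covers only the case in this bug.
EXCLUDED_ARCHES = {
    # rdma-core carries "ExcludeArch: %{arm}", so it does not exist on armhfp.
    "rdma-core": {"armhfp"},
}


def runtime_packages(basearch, wanted):
    """Return the packages from `wanted` that can be installed on `basearch`."""
    return [
        pkg for pkg in wanted
        if basearch not in EXCLUDED_ARCHES.get(pkg, set())
    ]


if __name__ == "__main__":
    wanted = ["kernel", "rdma-core"]
    for arch in ("x86_64", "armhfp"):
        print(arch, runtime_packages(arch, wanted))
```

Keeping the exclusions in one table, rather than scattering arch checks through the package list, would make it easier to see at a glance which packages are skipped where.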

Comment 4 Kevin Fenzi 2017-08-22 16:22:55 UTC
Fair enough. 

Moving to lorax...

Comment 5 Adam Williamson 2017-08-22 20:48:19 UTC
So I've submitted a patch for lorax:

https://github.com/rhinstaller/lorax/pull/240

However, pbrobinson notes there are also qemu and libvirt dep chains to worry about here. qemu currently builds against rdma for ARM, because its spec has this:

%ifnarch s390 s390x
BuildRequires: librdmacm-devel
%endif

libvirt also has a dep chain running through the new rdma-core: libvirt-daemon-driver-storage requires libvirt-daemon-driver-storage-rbd, which requires librbd and librados, both of which require libibverbs (which comes from rdma-core). Both of those chains exist on ARM.
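
To see how far a missing library propagates, the libvirt chain above can be modelled as a small graph and walked in reverse. This is only a sketch: the `REQUIRES` table holds just the edges named in this comment, with simplified package names, and `affected_by` is a hypothetical helper. A real analysis would read the edges from repository metadata.

```python
from collections import deque

# Illustrative "requires" edges, taken only from the chain described above.
REQUIRES = {
    "libvirt-daemon-driver-storage": ["libvirt-daemon-driver-storage-rbd"],
    "libvirt-daemon-driver-storage-rbd": ["librbd", "librados"],
    "librbd": ["libibverbs"],
    "librados": ["libibverbs"],
    "libibverbs": [],
}


def affected_by(missing, requires):
    """Return every package that directly or transitively requires `missing`."""
    # Invert the graph so we can walk from a dependency to its dependents.
    required_by = {}
    for pkg, deps in requires.items():
        for dep in deps:
            required_by.setdefault(dep, set()).add(pkg)

    # Breadth-first walk over the inverted edges.
    seen = set()
    queue = deque([missing])
    while queue:
        current = queue.popleft()
        for parent in required_by.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


if __name__ == "__main__":
    print(sorted(affected_by("libibverbs", REQUIRES)))
```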

There are really quite a lot of dep chains running through libibverbs, actually:

[root@adam vms]# dnf repoquery --whatrequires "libibverbs.so.1()(64bit)"
Failed to synchronize cache for repo 'kanarip-phabricator', disabling.
Last metadata expiration check: 0:00:00 ago on Tue 22 Aug 2017 01:45:14 PM PDT.
ceph-base-1:12.1.3-1.fc27.x86_64
ceph-common-1:12.1.3-1.fc27.x86_64
ceph-fuse-1:12.1.3-1.fc27.x86_64
ceph-mds-1:12.1.3-1.fc27.x86_64
ceph-mgr-1:12.1.3-1.fc27.x86_64
ceph-mon-1:12.1.3-1.fc27.x86_64
ceph-osd-1:12.1.3-1.fc27.x86_64
ceph-radosgw-1:12.1.3-1.fc27.x86_64
ceph-test-1:12.1.3-1.fc27.x86_64
corosynclib-0:2.4.2-4.fc27.x86_64
dapl-0:2.1.9-4.fc26.x86_64
fio-0:2.99-3.fc27.x86_64
ga-openmpi-0:5.3b-24.fc27.x86_64
glusterfs-rdma-0:3.11.2-3.fc27.x86_64
ibacm-0:14-4.fc27.x86_64
libcephfs2-1:12.1.3-1.fc27.x86_64
libcephfs_jni1-1:12.1.3-1.fc27.x86_64
libfabric-0:1.4.2-4.fc27.x86_64
libhfi1-0:0.5-26.fc27.x86_64
libibcm-0:14-4.fc27.x86_64
libibverbs-devel-0:1.2.1-6.fc27.x86_64
libibverbs-utils-0:14-4.fc27.x86_64
libipathverbs-0:1.3-4.fc27.x86_64
libmlx4-0:1.0.6-9.fc27.x86_64
libmlx5-0:1.0.2-5.fc27.x86_64
libmthca-0:1.0.6-18.fc27.x86_64
libnes-0:1.1.4-10.fc27.x86_64
libocrdma-0:1.0.8-6.fc27.x86_64
librados-devel-1:12.1.3-1.fc27.x86_64
librados2-1:12.1.3-1.fc27.x86_64
libradosstriper1-1:12.1.3-1.fc27.x86_64
librbd1-1:12.1.3-1.fc27.x86_64
librdmacm-0:14-4.fc27.x86_64
librdmacm-utils-0:14-4.fc27.x86_64
librgw2-1:12.1.3-1.fc27.x86_64
libusnic_verbs-0:2.0.2-4.fc27.x86_64
libvma-0:8.0.1-1.fc25.x86_64
openmpi-0:2.0.2-2.fc26.x86_64
perftest-0:3.0-4.fc27.x86_64
qemu-system-aarch64-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-alpha-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-arm-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-cris-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-lm32-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-m68k-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-microblaze-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-mips-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-moxie-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-nios2-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-or1k-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-ppc-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-s390x-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-sh4-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-sparc-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-tricore-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-unicore32-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-x86-core-2:2.10.0-0.2.rc3.fc27.x86_64
qemu-system-xtensa-core-2:2.10.0-0.2.rc3.fc27.x86_64
qperf-0:0.4.9-11.fc27.x86_64
qpid-cpp-client-rdma-0:1.36.0-6.fc27.x86_64
rbd-mirror-1:12.1.3-1.fc27.x86_64
rbd-nbd-1:12.1.3-1.fc27.x86_64
rdma-core-devel-0:14-4.fc27.x86_64
scsi-target-utils-0:1.0.70-3.fc27.x86_64
srp_daemon-0:14-4.fc27.x86_64
srptools-0:1.0.3-2.fc26.x86_64

Probably all, or most, of those exist on ARM. So this change is really pretty disruptive beyond just the lorax thing, and even after we fix lorax I suspect many of those dep chains are going to cause compose issues. Someone really ought to have planned all this out before ditching libibverbs from ARM.
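
A listing like the repoquery output above is easier to reason about once it is grouped by package family. The following Python sketch splits each line into its name, epoch, version, release and arch, then counts names by their leading component. `parse_nevra` and `count_families` are hypothetical helpers, and the parser assumes every line carries an explicit epoch, as the lines in this listing do.

```python
from collections import Counter


def parse_nevra(line):
    """Split 'name-epoch:version-release.arch' into its five parts.

    Assumes an explicit epoch is present, as in the listing above.
    """
    rest, arch = line.strip().rsplit(".", 1)
    name, epoch_version, release = rest.rsplit("-", 2)
    epoch, version = epoch_version.split(":", 1)
    return name, epoch, version, release, arch


def count_families(lines):
    """Count packages by the first dash-separated component of their name."""
    families = Counter()
    for line in lines:
        name = parse_nevra(line)[0]
        families[name.split("-", 1)[0]] += 1
    return families


if __name__ == "__main__":
    # A few lines copied from the repoquery output in this comment.
    sample = [
        "ceph-base-1:12.1.3-1.fc27.x86_64",
        "ceph-common-1:12.1.3-1.fc27.x86_64",
        "qemu-system-arm-core-2:2.10.0-0.2.rc3.fc27.x86_64",
        "librdmacm-0:14-4.fc27.x86_64",
    ]
    for family, count in sorted(count_families(sample).items()):
        print(family, count)
```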

Another thing I noticed about this, BTW - I can't see any code in anaconda or blivet that ensures 'rdma' or 'rdma-core' is installed if you set things up such that one of the system partitions is accessed via an RDMA network. Nor is rdma listed in any of the core comps groups anywhere I can find. So I rather suspect if you actually do install with a system partition accessed via RDMA, if you don't manually install the relevant packages afterwards, then the first time you update a kernel (or run dracut for any other reason) after installing, you'll suddenly find the system doesn't boot any more. That's rather a separate issue, though.

Comment 6 Adam Williamson 2017-08-22 21:21:49 UTC
I filed #1484155 to cover the problems with dependencies, leaving this to be specifically about the lorax problem.

Comment 7 Adam Williamson 2017-08-22 21:32:09 UTC
This is an automatic Beta blocker, per "Bugs which entirely prevent the composition of one or more of the release-blocking images required to be built for a currently-pending (pre-)release" - https://fedoraproject.org/wiki/QA:SOP_blocker_bug_process#Automatic_blockers .

Comment 8 Adam Williamson 2017-08-22 21:32:27 UTC
https://github.com/rhinstaller/lorax/pull/240

Comment 9 Adam Williamson 2017-09-06 20:34:50 UTC
This has been resolved for a while now; composes aren't choking on it any more.