[RHEL7] weak-modules will always pick the highest kernel revision version of a module if multiple are installed, even if that is incompatible with an older kernel that's installed
Cause: weak-modules has no group information for --add-kernel (unlike --add-modules) and tries to check modules one by one, which does not work in the general case; that was the reason grouping was implemented for --add-modules. The bug is that it feeds the link installer all the modules at once, and since they all land in the same place, only the last one wins before the configuration check.
Consequence: weak-modules always picks the build of a module made for the highest kernel version if multiple builds are installed, even if that build is incompatible with an older installed kernel.
Fix: Implement grouping for the --add-kernel case as well.
Result: weak-modules picks the highest compatible version.
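As a rough illustration of what "grouping" could mean here, the sketch below collects module paths by the kernel version directory they live under, so each build can be handled as one unit instead of as a flat list. The function name group_by_kver and the choice of grouping key are assumptions made for illustration; the report does not show how weak-modules groups modules internally. It also assumes a sort that supports -V (GNU coreutils).

```shell
# Sketch only: group module paths by the kernel version they were built for.
# Input (stdin): one /lib/modules/<kver>/extra/<name>.ko path per line.
# Output: one line per kernel version, "<kver> <name>.ko <name>.ko ...",
# ordered by kernel version.
group_by_kver() {
    awk -F/ '
        # With "/" as separator, field 4 is <kver> and the last field is the file name.
        { kver = $4; mods[kver] = mods[kver] " " $NF }
        END { for (k in mods) print k mods[k] }
    ' | sort -V
}
```

For example, `ls /lib/modules/*/extra/*.ko | group_by_kver` would print one line per build, which a caller could then validate and link as a whole.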
Created attachment 1497575: versioncheck patch
Description of problem:
If multiple builds of an 'extra' module are available on a host and we install a kernel older than the newest kernel version (kvers) the module was built for, weak-modules always tries to use the build for that highest version, even if it is not ABI compatible. The final check then determines that this build is incompatible and removes the links, so weak-modules ends up not linking to *any* version of that module.
That's a bit confusing wording-wise, but here's an example:
<snip>
[root@rhel75-dhcptest1 ~]# rpm -qa | grep enterpriseonload-kmod
enterpriseonload-kmod-3.10.0-693.21.1.el7-5.0.5_ms1-1.el7.x86_64
enterpriseonload-kmod-3.10.0-693.11.1.el7-5.0.5_ms1-1.el7.x86_64
enterpriseonload-kmod-3.10.0-862.2.3.el7-5.0.5_ms1-1.el7.x86_64
[root@rhel75-dhcptest1 ~]# rpm -q kernel
kernel-3.10.0-693.11.1.el7.x86_64
kernel-3.10.0-693.37.4.el7.x86_64
<snip>
We have the openonload drivers installed for:
3.10.0-693.11.1.el7
3.10.0-693.21.1.el7
3.10.0-862.2.3.el7
These are all the same release of the driver, just built against different kversions.
If, with those already installed, we install a kernel version that's older than the newest of those, no weak-modules link is created: in the end the script only checks the newest build and then deletes its links because that build is not compatible:
<snip>
[root@rhel75-dhcptest1 ~]# grep 'ln -sf \|rm \|compatible' weak-updates-3.10.0-693.37.4.el7.didntwork.txt
+ rm -f /tmp/weak-modules.xR2Hjj/tmp.gnpdCS0tJp
+ ln -sf /lib/modules/3.10.0-693.11.1.el7.x86_64/extra/onload.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/onload.ko
+ ln -sf /lib/modules/3.10.0-693.11.1.el7.x86_64/extra/onload_cplane.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/onload_cplane.ko
+ ln -sf /lib/modules/3.10.0-693.11.1.el7.x86_64/extra/sfc.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc.ko
+ ln -sf /lib/modules/3.10.0-693.11.1.el7.x86_64/extra/sfc_affinity.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc_affinity.ko
+ ln -sf /lib/modules/3.10.0-693.11.1.el7.x86_64/extra/sfc_char.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc_char.ko
+ ln -sf /lib/modules/3.10.0-693.11.1.el7.x86_64/extra/sfc_resource.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc_resource.ko
+ ln -sf /lib/modules/3.10.0-693.21.1.el7.x86_64/extra/onload.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/onload.ko
+ ln -sf /lib/modules/3.10.0-693.21.1.el7.x86_64/extra/onload_cplane.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/onload_cplane.ko
+ ln -sf /lib/modules/3.10.0-693.21.1.el7.x86_64/extra/sfc.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc.ko
+ ln -sf /lib/modules/3.10.0-693.21.1.el7.x86_64/extra/sfc_affinity.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc_affinity.ko
+ ln -sf /lib/modules/3.10.0-693.21.1.el7.x86_64/extra/sfc_char.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc_char.ko
+ ln -sf /lib/modules/3.10.0-693.21.1.el7.x86_64/extra/sfc_resource.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc_resource.ko
+ ln -sf /lib/modules/3.10.0-862.2.3.el7.x86_64/extra/onload.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/onload.ko
+ ln -sf /lib/modules/3.10.0-862.2.3.el7.x86_64/extra/onload_cplane.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/onload_cplane.ko
+ ln -sf /lib/modules/3.10.0-862.2.3.el7.x86_64/extra/sfc.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc.ko
+ ln -sf /lib/modules/3.10.0-862.2.3.el7.x86_64/extra/sfc_affinity.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc_affinity.ko
+ ln -sf /lib/modules/3.10.0-862.2.3.el7.x86_64/extra/sfc_char.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc_char.ko
+ ln -sf /lib/modules/3.10.0-862.2.3.el7.x86_64/extra/sfc_resource.ko /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc_resource.ko
+ rm -f /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/onload.ko
+ pr_verbose 'Module onload.ko from kernel 3.10.0-862.2.3.el7.x86_64 is not compatible with kernel 3.10.0-693.37.4.el7.x86_64 in symbols: sme_me_mask'
+ rm -f /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc_resource.ko
+ pr_verbose 'Module sfc_resource.ko from kernel 3.10.0-862.2.3.el7.x86_64 is not compatible with kernel 3.10.0-693.37.4.el7.x86_64 in symbols: sme_me_mask iommu_group_get iommu_present arch_dma_alloc_attrs iommu_detach_device iommu_capable iommu_domain_free iommu_domain_alloc iommu_unmap iommu_map iommu_group_add_device iommu_attach_device iommu_group_remove_device'
+ rm -f /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/onload_cplane.ko
+ pr_verbose 'Module onload_cplane.ko from kernel 3.10.0-862.2.3.el7.x86_64 is not compatible with kernel 3.10.0-693.37.4.el7.x86_64 in symbols: neigh_update'
+ rm -f /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc.ko
+ pr_verbose 'Module sfc.ko from kernel 3.10.0-862.2.3.el7.x86_64 is not compatible with kernel 3.10.0-693.37.4.el7.x86_64 in symbols: sme_me_mask alloc_etherdev_mqs_rh napi_complete_done arch_dma_alloc_attrs flow_keys_dissector ___pskb_trim_adjust_truesize ktime_get_snapshot __skb_flow_dissect napi_schedule_prep genl_register_family'
+ rm -f /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc_affinity.ko
+ pr_verbose 'Module sfc_affinity.ko from kernel 3.10.0-862.2.3.el7.x86_64 is not compatible with kernel 3.10.0-693.37.4.el7.x86_64 in symbols: efx_dl_unregister_driver efx_dl_filter_remove efx_dl_filter_insert efx_dl_register_driver_api_ver_22'
+ rm -f /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates/sfc_char.ko
+ pr_verbose 'Module sfc_char.ko from kernel 3.10.0-862.2.3.el7.x86_64 is not compatible with kernel 3.10.0-693.37.4.el7.x86_64 in symbols: efrm_resource_ref efrm_pt_pace efrm_pd_from_resource efrm_client_get_nic efrm_vi_tx_alt_alloc efrm_pio_unlink_vi efrm_client_put efrm_pd_dma_unmap efrm_vf_resource_release efrm_filter_block_kernel efrm_pd_vport_alloc efrm_pd_set_min_align efrm_vi_get_pd efrm_filter_remove efhw_nic_release_dl_device efrm_eventq_kill_callback efrm_vi_q_alloc_sanitize_size __efrm_vi_attr_init efrm_vi_attr_set_ps_buffer_size efrm_filter_insert efrm_vi_set_alloc efrm_port_sniff efrm_vi_q_alloc efrm_client_add_ref efrm_vi_alloc efrm_pt_flush efrm_vi_set_from_resource efx_dl_vport_filter_remove efrm_license_challenge efrm_vi_set_to_resource efrm_pd_has_vport efhw_nic_acquire_dl_device efrm_pd_release efrm_vi_set_get_rss_context efrm_pio_link_vi efrm_tx_port_sniff efrm_resource_release efrm_vf_resource_alloc efrm_eventq_register_callback efrm_eventq_request_wakeup efrm_vi_resource_release efrm_vi_attr_set_with_interrupt efrm_pio_from_resource efrm_client_get efrm_v3_license_challenge efrm_vi_set_num_vis efrm_pio_to_resource efrm_vi_tx_alt_free efrm_vi_attr_set_interrupt_core efx_dl_vport_filter_insert efrm_pd_stack_id_get efrm_vi_attr_set_instance efrm_vi_set_get_pd efrm_pd_get_vport_id efrm_vi_register_flush_callback efrm_pio_alloc efrm_pd_dma_map efrm_pd_alloc efrm_pd_to_resource efrm_vi_attr_set_packed_stream efrm_vi_attr_set_pd'
+ rm -f /tmp/weak-modules.xR2Hjj/tmp.N7AjctZYup
+ rm -rf /tmp/weak-modules.xR2Hjj
<snip>
It iterates through the extra modules in ascending order, so the 862 builds end up as the "final" links. When it then checks those for validity, they're (unsurprisingly) incompatible and are removed. So, no weak-modules links for the kernel.
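The overwrite seen in the trace can be reproduced in isolation. The sketch below uses a scratch directory with empty placeholder files rather than real modules; only the kernel version strings and the module name are taken from the log, and the variable names are made up for the example.

```shell
# Sketch only: fake module trees under a scratch directory, no real modules.
root=$(mktemp -d)
target=3.10.0-693.37.4.el7.x86_64
mkdir -p "$root/$target/weak-updates"

for kver in 3.10.0-693.11.1.el7.x86_64 \
            3.10.0-693.21.1.el7.x86_64 \
            3.10.0-862.2.3.el7.x86_64; do
    mkdir -p "$root/$kver/extra"
    : > "$root/$kver/extra/onload.ko"
    # Same destination on every pass, so each link replaces the previous one.
    ln -sf "$root/$kver/extra/onload.ko" "$root/$target/weak-updates/onload.ko"
done

# Only the link created on the last pass survives.
final=$(readlink "$root/$target/weak-updates/onload.ko")
echo "$final"
rm -rf "$root"
```

Because the loop runs in ascending order, the surviving link points at the 862 build, which is the only one the later compatibility check ever sees.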
Apparently in 20-15 we largely did the same thing, but sorted in descending order, so we always ended up checking the oldest version, for whatever reason. That worked for the client, but only by luck.
I worked up a quick patch for this that does only one thing: it checks whether the kvers it is about to link is newer than the installed/target kernel and, if so, doesn't create the link. The logic then iterates through the builds and ends up with the highest installed module kvers that is older than or equal to the target/installed kernel (patch attached).
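The version guard described above might look roughly like the following. This is not the attached patch, whose contents are not shown here; the function names kver_le and pick_not_newer are hypothetical, and the comparison relies on sort -V from GNU coreutils.

```shell
# Sketch only: not the attached patch. Assumes GNU sort with -V.

# True when version $1 sorts at or before version $2.
kver_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Print the newest build kernel among $2.. that is not newer than target $1.
# Prints an empty line when every build is newer than the target.
pick_not_newer() {
    target=$1
    shift
    best=
    # Ascending order, so the last build that passes the guard wins.
    for kver in $(printf '%s\n' "$@" | sort -V); do
        if kver_le "$kver" "$target"; then
            best=$kver
        fi
    done
    printf '%s\n' "$best"
}
```

With the three builds from this report and the 3.10.0-693.37.4.el7.x86_64 target, the guard skips the 862 build and settles on the 693.21.1 one. Note that it compares version strings only and does not check ABI compatibility itself.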
Ideally, we'd check each version and just install the newest compatible one, but that'd require a more extensive change to the logic in the script. That said, I don't know if this is the correct way to go about ameliorating this so we'll have to review.
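A minimal sketch of that "newest compatible" selection follows. The compatibility test is a deliberately simplified stand-in (every symbol a build needs must appear in a list of symbols the target kernel provides); the real check in weak-modules is not reproduced here. The function names and the input format are assumptions, and sort -rV again assumes GNU coreutils.

```shell
# Sketch only: is_compatible is a simplified stand-in for the real ABI check.

# True when every symbol in $1 (space separated) also appears in $2.
is_compatible() {
    needed=$1
    provided=$2
    for sym in $needed; do
        case " $provided " in
            *" $sym "*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}

# Read "<kver> <sym> <sym> ..." lines on stdin, walk the builds newest first,
# and print the first kernel version whose symbols are all provided.
# $1: space-separated symbols provided by the target kernel.
newest_compatible() {
    provided_syms=$1
    sort -rV | while read -r kver needed_syms; do
        if is_compatible "$needed_syms" "$provided_syms"; then
            printf '%s\n' "$kver"
            break
        fi
    done
}
```

Unlike the version guard, this approach evaluates each build on its own, so a build for a newer kernel would still be chosen if it happened to be compatible.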
Version-Release number of selected component (if applicable):
kmod-20-21
How reproducible:
Always per the instructions above
Steps to Reproduce:
1. Install multiple kvers 'extras' modules
2. Install kernel older than newest one of 1)
3. If that newer kvers build is incompatible, we don't get a link at all
Actual results:
Installed kernel doesn't get a weak-modules link even though there's a compatible version installed on the host.
Expected results:
Installed kernel should get a weak-modules link
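One way to tell the actual result from the expected one is to count the symlinks under the target kernel's weak-updates directory after the kernel is installed. The helper name count_weak_links is made up for this sketch.

```shell
# Sketch only: print the number of symlinks found under the given directory.
count_weak_links() {
    find "$1" -type l | wc -l | tr -d ' '
}
```

On the host from this report, `count_weak_links /lib/modules/3.10.0-693.37.4.el7.x86_64/weak-updates` printing 0 would correspond to the actual result, and a non-zero count to the expected one.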
Additional info:
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:2020