Bug 1777308 - [Docs] How to blacklist the Matrox Video Driver 4.11.0 in RHVH
Summary: [Docs] How to blacklist the Matrox Video Driver 4.11.0 in RHVH
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: Documentation
Version: 4.3.8
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ovirt-4.4.6
Target Release: ---
Assignee: Richard Hoch
QA Contact: rhev-docs@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-11-27 11:24 UTC by Daniele
Modified: 2023-03-24 16:12 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-15 10:42:17 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:
sgoodman: needinfo-


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 4639181 0 None None None 2019-12-05 14:48:07 UTC

Comment 1 Yuval Turgeman 2019-11-27 11:33:12 UTC
Does this really work on 7.7? The module is built for 3.10.0-514.el7.x86_64.

Comment 2 Juan Orti 2019-11-29 08:08:30 UTC
It doesn't work on 7.7; it has a hard dependency on the exact kernel version.

The PCI ID mentioned in the errata (102b:0538, Matrox MGA G200eH3) is listed as supported in the 7.7 kernel. My question is: was version 4.11 of the driver merged into RHEL releases later than 7.3?

filename:       /lib/modules/3.10.0-1062.el7.x86_64/kernel/drivers/gpu/drm/mgag200/mgag200.ko.xz
license:        GPL
description:    MGA G200 SE
author:         Matthew Garrett
retpoline:      Y
rhelversion:    7.7
srcversion:     D81D70DB8DF7D3D172A3999
alias:          pci:v0000102Bd00000538sv*sd*bc*sc*i*  <----
alias:          pci:v0000102Bd00000536sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000534sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000533sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000532sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000530sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000524sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000522sv*sd*bc*sc*i*
depends:        drm,drm_kms_helper,ttm,i2c-algo-bit
intree:         Y
vermagic:       3.10.0-1062.el7.x86_64 SMP mod_unload modversions 
signer:         Red Hat Enterprise Linux kernel signing key
sig_key:        AA:82:E3:5D:3B:30:2F:E1:49:5F:77:7E:DC:90:37:79:1F:0C:C5:9F
sig_hashalgo:   sha256
parm:           modeset:Disable/Enable modesetting (int)
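
The alias lines in the modinfo output above encode the PCI vendor and device IDs that the module claims. As a rough illustration (not part of the original comment), the following Python sketch checks whether a given vendor:device pair appears in such output. The function name `module_claims_pci_id` is hypothetical, and the parsing assumes the `pci:v<8 hex digits>d<8 hex digits>` alias layout seen in this output.

```python
import re

# Matches modinfo alias lines such as:
#   alias:          pci:v0000102Bd00000538sv*sd*bc*sc*i*
ALIAS_RE = re.compile(
    r"^alias:\s+pci:v([0-9A-Fa-f]{8})d([0-9A-Fa-f]{8})", re.MULTILINE
)


def module_claims_pci_id(modinfo_text, pci_id):
    """Return True if modinfo_text lists a PCI alias for pci_id ('vvvv:dddd').

    Hypothetical helper: it only understands aliases with explicit
    vendor and device IDs, not wildcard entries.
    """
    vendor, device = (int(part, 16) for part in pci_id.split(":"))
    for match in ALIAS_RE.finditer(modinfo_text):
        # Compare numerically so case and zero padding do not matter.
        if int(match.group(1), 16) == vendor and int(match.group(2), 16) == device:
            return True
    return False


# Two alias lines copied from the modinfo output in this comment.
SAMPLE = """\
alias:          pci:v0000102Bd00000538sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000522sv*sd*bc*sc*i*
"""
```

In practice the text would come from running `modinfo mgag200` on the host; the sample string is used here so the sketch runs anywhere.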

Comment 3 Juan Orti 2019-12-02 08:32:54 UTC
The reported driver version is 1.0.0

kernel: mgag200 0000:02:00.0: fb0: mgadrmfb frame buffer device
kernel: [drm] Initialized mgag200 1.0.0 20110418 for 0000:02:00.0 on minor 0
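
For reference, the driver name and version can be pulled out of a log line like the one above with a small amount of parsing. This is only a sketch: the function name `drm_driver_version` is hypothetical, and the regular expression assumes the `[drm] Initialized <name> <version> <date>` layout shown in this log.

```python
import re

# Assumed layout: "[drm] Initialized <driver> <version> <yyyymmdd> for ..."
DRM_INIT_RE = re.compile(r"\[drm\] Initialized (\S+) (\d+(?:\.\d+)*) (\d{8})")


def drm_driver_version(log_line):
    """Return (driver, version) from a DRM init log line, or None."""
    match = DRM_INIT_RE.search(log_line)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Log line copied from this comment.
LINE = "kernel: [drm] Initialized mgag200 1.0.0 20110418 for 0000:02:00.0 on minor 0"
```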

Lenovo support recommends either upgrading to driver version 4.11 or blacklisting the driver to avoid the PCIe errors seen on the customer's system.

BIOS Information
        Vendor: Lenovo
        Version: -[TEE142E-2.30]-
        Release Date: 07/02/2019
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 32 MB
        Characteristics:
                PCI is supported
                PNP is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                Serial services are supported (int 14h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
                UEFI is supported
        BIOS Revision: 2.30
        Firmware Revision: 2.80
System Information
        Manufacturer: Lenovo
        Product Name: ThinkSystem SR850 -[7X19CTO1WW]-
        Version: 07
Base Board Information
        Manufacturer: Lenovo
        Product Name: -[7X19CTO1WW]-
        Version: none


02:00.0 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. MGA G200e [Pilot] ServerEngines (SEP1) [102b:0522] (rev 42) (prog-if 00 [VGA controller])
        Subsystem: Emulex Corporation Device [19a2:0101]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 16
        NUMA node: 0
        Region 0: Memory at d0000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: Memory at d1a10000 (32-bit, non-prefetchable) [size=16K]
        Region 2: Memory at d1000000 (32-bit, non-prefetchable) [size=8M]
        Expansion ROM at d1a00000 [disabled] [size=64K]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [54] MSI: Enable- Count=1/1 Maskable- 64bit-
                Address: 00000000  Data: 0000
        Kernel driver in use: mgag200
        Kernel modules: mgag200
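
To compare the device actually present on a host with the IDs discussed in this bug, the PCI ID and the bound driver can be extracted from `lspci -nnk`-style output such as the block above. The following is a minimal sketch under that assumption; `parse_lspci_entry` is a hypothetical name, and only the first line of the entry is searched for the ID so that the subsystem ID is not picked up by mistake.

```python
import re


def parse_lspci_entry(text):
    """Return (pci_id, driver) for one lspci device entry.

    Either element is None when it cannot be found.
    """
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    # The device ID is the bracketed vvvv:dddd pair on the first line;
    # the class code, e.g. [0300], has no colon and is skipped.
    id_match = re.search(r"\[([0-9a-fA-F]{4}:[0-9a-fA-F]{4})\]", first_line)
    driver_match = re.search(r"Kernel driver in use:\s*(\S+)", text)
    return (
        id_match.group(1) if id_match else None,
        driver_match.group(1) if driver_match else None,
    )


# Abbreviated copy of the lspci output in this comment.
ENTRY = """\
02:00.0 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. MGA G200e [Pilot] ServerEngines (SEP1) [102b:0522] (rev 42) (prog-if 00 [VGA controller])
        Subsystem: Emulex Corporation Device [19a2:0101]
        Kernel driver in use: mgag200
        Kernel modules: mgag200
"""
```

For the entry above this yields the ID 102b:0522 with the mgag200 driver bound, which is a different device ID from the 102b:0538 mentioned in comment 2.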

Comment 4 Martin Tessun 2019-12-03 08:40:23 UTC
(In reply to Daniele from comment #0)
> Description of problem:
> A customer of mine (VTS) has had some hypervisors crash. Lenovo has
> identified the Matrox driver as a possible cause.
> 
> Lenovo has pointed at https://support.lenovo.com/us/en/solutions/ht505175 ,
> which mentions that where the driver v4.11.0 is not used it could potentially
> cause PCIe issues and other anomalies.
> 
> Matrox Video Driver 4.11.0 for RHEL Linux Enterprise 7
> https://datacentersupport.lenovo.com/us/en/products/servers/thinksystem/
> sr850/7x19/downloads/DS500305
> 
> Per Red Hat https://access.redhat.com/errata/RHEA-2017:1730, affected
> products include RHEL 7.7
> 
> 
> This driver seems to only be released as part of the driver update program,
> which means it is not available in their repos at the moment and/or in the
> image.
> They've been asked to download and install it manually or to reinstall the
> host with RHEL.
> Of course, they're not eager to do that. They expect this to be added to the
> next RHV-H release.
> 

What about simply blacklisting the Matrox driver? RHV-H does not really need graphics, so wouldn't normal (S)VGA output be sufficient?

> Version-Release number of selected component (if applicable):
> RHV-H 4.3
> 
> How reproducible:
> 
> Steps to Reproduce:
> 1. 
> 2.
> 3.
> 
> Actual results:
> Updated Matrox driver not available in the RHV repos.
> 
> Expected results:
> Updated Matrox driver available in the RHV repos.
> 
> Additional info:

Comment 5 Juan Orti 2019-12-05 14:32:41 UTC
Lenovo recommends version 4.11 of the driver to avoid stability problems.

We couldn't capture vmcores with kdump on this system because the host rebooted immediately after the crash, but blacklisting the mgag200 module fixed the problem. That's enough for RHV, but it's clear that the version shipped in RHEL 7.7 has some problems.
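
One way to confirm that the module is no longer loaded after such a change is to look for it in /proc/modules. The sketch below is illustrative only; `is_module_loaded` is a hypothetical helper, and it assumes the usual /proc/modules layout in which the module name is the first whitespace-separated field of each line.

```python
def is_module_loaded(proc_modules_text, module):
    """Return True if module appears as a loaded module.

    Only the first field of each line is compared, so a module that is
    merely listed as a dependent of another module does not count.
    """
    # The kernel reports module names with underscores, never dashes.
    wanted = module.replace("-", "_")
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False


# Made-up sample in /proc/modules format (sizes and addresses are invented).
SAMPLE = (
    "mgag200 45056 1 - Live 0xffffffffc05e2000\n"
    "drm_kms_helper 179394 1 mgag200, Live 0xffffffffc0543000\n"
)

# On a real host the input would be read like this:
#   from pathlib import Path
#   loaded = is_module_loaded(Path("/proc/modules").read_text(), "mgag200")
```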

Comment 6 Juan Orti 2019-12-18 07:12:07 UTC
Clearing the needinfo as it's already answered in comment 2.

Comment 10 Sandro Bonazzola 2020-01-14 08:54:34 UTC
Moving this to documentation to cover how to blacklist the Matrox driver.

Comment 12 Marina Kalinin 2020-05-01 22:08:00 UTC
Why is this not a RHEL issue?
Is the driver present correctly in any RHEL version?
Is there a RHEL equivalent bug?

Comment 14 ctomasko 2021-02-18 04:33:26 UTC
Placing the bug back into the documentation backlog until writing resources become available to resolve it.

Comment 17 Steve Goodman 2021-04-20 15:24:53 UTC
Sandro, Martin,

How do you blacklist the Matrox driver?

Comment 18 Steve Goodman 2021-04-20 15:26:04 UTC
Also, I just want to point out that I'll be updating the 4.4 documentation, referencing RHEL 8.4/RHVH 4.4 hypervisors, but this bug originally referenced 4.3.

Comment 19 Sandro Bonazzola 2021-04-20 15:39:41 UTC
(In reply to Steve Goodman from comment #17)
> Sandro, Martin,
> 
> How do you blacklist the Matrox driver?

Use this procedure: https://access.redhat.com/solutions/41278
to blacklist the mgag200 module.
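
As a hedged illustration of what a deny configuration for mgag200 might contain, the Python sketch below renders the two modprobe.d(5) directives commonly used for this purpose: `blacklist`, which stops the module from being loaded automatically through its hardware aliases, and `install <module> /bin/false`, which makes explicit load requests fail as well. The function name and the target file name mentioned afterwards are assumptions for illustration, not text taken from the linked article.

```python
def render_denylist_conf(module):
    """Return the contents of a modprobe.d drop-in that denies module."""
    return (
        f"# Do not load {module} automatically via its hardware aliases.\n"
        f"blacklist {module}\n"
        f"# Make explicit 'modprobe {module}' requests fail as well.\n"
        f"install {module} /bin/false\n"
    )


CONF_TEXT = render_denylist_conf("mgag200")
```

The rendered text would be written, as root, to a file ending in `.conf` under /etc/modprobe.d, for example the hypothetical /etc/modprobe.d/mgag200-denylist.conf. If the module is also included in the initramfs, the initramfs typically has to be regenerated and the host rebooted before the change takes full effect.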

Comment 20 Richard Hoch 2021-05-12 14:58:07 UTC
(In reply to Sandro Bonazzola from comment #19)
> (In reply to Steve Goodman from comment #17)
> > Sandro, Martin,
> > 
> > How do you blacklist the Matrox driver?
> 
> With this procedure: https://access.redhat.com/solutions/41278
> by blacklisting mgag200 module.

Sandro,

Are you saying that we should adopt https://access.redhat.com/solutions/41278 for the RHV documentation (RHEL 8 option only)? Do we want to focus on mgag200 or simply list it as an example? Where should this appear in the documentation?

Comment 21 Steve Goodman 2021-05-13 07:25:51 UTC
Here's more specific guidance on the scope of the request in this bug:

1. Add a new module describing how to add a new driver and how to deny a specific module. Make this a section in the appendix of the installation guides.
2. For RHVH: Add a link to that module after the last step in registering a host on Satellite in [1].
3. For RHEL Hosts: Add a link to that module after the last step in [2].

[1] https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html/installing_red_hat_virtualization_as_a_self-hosted_engine_using_the_cockpit_web_interface/installing_the_self-hosted_engine_deployment_host_she_cockpit_deploy#Installing_Red_Hat_Virtualization_Hosts_SHE_deployment_host
[2] https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html/installing_red_hat_virtualization_as_a_self-hosted_engine_using_the_cockpit_web_interface/installing_the_self-hosted_engine_deployment_host_she_cockpit_deploy#Enabling_the_Red_Hat_Enterprise_Linux_Host_Repositories_SHE_deployment_host

Comment 22 Steve Goodman 2021-05-13 07:28:55 UTC
The information on how to deny a specific module is described in the KB article from comment 20: How do I prevent a kernel module from loading automatically? [1]

[1] https://access.redhat.com/solutions/41278

Comment 23 Richard Hoch 2021-05-19 09:45:35 UTC
(In reply to Steve Goodman from comment #21)
> Here's more specific guidance on the scope of the request in this bug:
> 
> 1. Add new module describing how to add a new driver and how to deny a
> specific module. Make this a section in the appendix of the installation
> guides.
> 2. For RHVH: Add a link to that module after the last step in registering a
> host on Satellite in [1].
> 3. For RHEL Hosts: Add a link to that module after the last step in [2].
> 
> [1]
> https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/
> html/installing_red_hat_virtualization_as_a_self-
> hosted_engine_using_the_cockpit_web_interface/installing_the_self-
> hosted_engine_deployment_host_she_cockpit_deploy#Installing_Red_Hat_Virtualiz
> ation_Hosts_SHE_deployment_host
> [2]
> https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/
> html/installing_red_hat_virtualization_as_a_self-
> hosted_engine_using_the_cockpit_web_interface/installing_the_self-
> hosted_engine_deployment_host_she_cockpit_deploy#Enabling_the_Red_Hat_Enterpr
> ise_Linux_Host_Repositories_SHE_deployment_host

Sandro,

How does step 1 differ from what we already have in https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_monitoring_and_updating_the_kernel/managing-kernel-modules_managing-monitoring-and-updating-the-kernel#loading-kernel-modules-automatically-at-system-boot-time_managing-kernel-modules and https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_monitoring_and_updating_the_kernel/managing-kernel-modules_managing-monitoring-and-updating-the-kernel#preventing-kernel-modules-from-being-automatically-loaded-at-system-boot-time_managing-kernel-modules?

Comment 24 Richard Hoch 2021-05-19 15:03:30 UTC
Sandro (and everyone else) -- Just to keep everyone in the loop: Steve is covering how to add a driver, in response to https://bugzilla.redhat.com/show_bug.cgi?id=1834298, in one module, and I'm covering how to deny a module in another module that will appear after Steve's. The goal is that when both of our MRs are completed, merged, and published, customers will experience them as a single unit.

Comment 40 Steve Goodman 2021-06-01 09:31:45 UTC
Peer review completed. I made a few comments. Looks good overall. After fixing the issues I noted, you can move this forward.

Comment 41 Richard Hoch 2021-06-01 14:40:28 UTC
Sandro, Lev -- I got a couple of questions from the peer reviewer that I could not answer. See below:


>>In order to prevent kernel modules loading during boot, the module name must be added to a configuration file for the "modprobe" utility. This file must reside in /etc/modprobe.d .


Q: Does this configuration file need a specific name? Is there any existing documentation about how to create this configuration file, or can we assume our users know how to create it?


>> Ensure the module is not configured to get loaded in /etc/modprobe.conf, /etc/modprobe.d/*, /etc/rc.modules, or /etc/sysconfig/modules/* before making the following modifications.


Q: Is there any existing documentation that explains how a user ensures the module is not configured to get loaded in any of the files mentioned, or can we assume our users know how to do this?

Comment 42 Sandro Bonazzola 2021-06-03 07:50:08 UTC
(In reply to Richard Hoch from comment #41)
> Sandro, Lev -- I got a couple of questions from the peer reviewer that I could
> not answer. See below:
> 
> 
> >>In order to prevent kernel modules loading during boot, the module name must be added to a configuration file for the "modprobe" utility. This file must reside in /etc/modprobe.d .
> 
> 
> Q: Does this configuration file need a specific name? Is there any existing
> documentation about how to create this configuration file, or can we assume
> our users know how to create it?

It's documented in the modprobe.d(5) man page (https://man7.org/linux/man-pages/man5/modprobe.d.5.html): any .conf file within that directory will be processed, so no specific file name is required.


> 
> 
> >> Ensure the module is not configured to get loaded in /etc/modprobe.conf, /etc/modprobe.d/*, /etc/rc.modules, or /etc/sysconfig/modules/* before making the following modifications.
> 
> 
> Q: Is there any existing documentation that explains how a user ensures the
> module is not configured to get loaded in any of the files mentioned, or can
> we assume our users know how to do this?

I guess they can run "modprobe --showconfig", as per https://man7.org/linux/man-pages/man8/modprobe.8.html, and check the output for "install" lines.
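
The check suggested above can be sketched in Python as a filter over the text printed by `modprobe --showconfig`. This is a rough illustration, not a documented procedure: `directives_for_module` is a hypothetical helper, it looks at `blacklist` lines in addition to the `install` lines mentioned above, and it treats dashes and underscores in module names as equivalent.

```python
def directives_for_module(showconfig_text, module):
    """Return the install/blacklist lines that name module."""
    wanted = module.replace("-", "_")
    hits = []
    for line in showconfig_text.splitlines():
        fields = line.split()
        if (
            len(fields) >= 2
            and fields[0] in ("install", "blacklist")
            and fields[1].replace("-", "_") == wanted
        ):
            hits.append(line.strip())
    return hits


# Made-up sample resembling modprobe configuration output.
SAMPLE = (
    "blacklist mgag200\n"
    "install mgag200 /bin/false\n"
    "options drm debug=0\n"
    "alias foo mgag200\n"
)

# On a real host the input could be captured like this:
#   import subprocess
#   text = subprocess.run(["modprobe", "--showconfig"],
#                         capture_output=True, text=True).stdout
```

An empty result suggests that no install or blacklist directive currently applies to the module; the other locations listed in the question (for example /etc/rc.modules) are not covered by this check.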

Comment 44 Richard Hoch 2021-06-09 10:15:55 UTC
Steve, Lev approved the changes. Please review. https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1982

Comment 45 Steve Goodman 2021-06-10 08:34:45 UTC
LGTM!

Comment 47 Richard Hoch 2021-06-10 10:14:20 UTC
Lucie or Guilherme, please review this MR: https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1982. Lev and Steve have approved it. The procedure is basically a rewriting of https://access.redhat.com/solutions/41278 for RHEL 8 only.

Comment 48 Lucie Leistnerova 2021-06-10 11:06:52 UTC
Hi Richard, this issue is for the Node team. Chen, can you please take a look? Thanks!

