Does this really work on 7.7? The module is built for 3.10.0-514.el7.x86_64.
It doesn't work on 7.7; it has a hard dependency on the exact kernel version. The PCI ID mentioned in the errata (102b:0538 Matrox MGA G200eH3) is listed as supported in the 7.7 kernel. My question is: was version 4.11 of the driver merged into RHEL > 7.3?

filename:       /lib/modules/3.10.0-1062.el7.x86_64/kernel/drivers/gpu/drm/mgag200/mgag200.ko.xz
license:        GPL
description:    MGA G200 SE
author:         Matthew Garrett
retpoline:      Y
rhelversion:    7.7
srcversion:     D81D70DB8DF7D3D172A3999
alias:          pci:v0000102Bd00000538sv*sd*bc*sc*i*   <----
alias:          pci:v0000102Bd00000536sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000534sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000533sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000532sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000530sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000524sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000522sv*sd*bc*sc*i*
depends:        drm,drm_kms_helper,ttm,i2c-algo-bit
intree:         Y
vermagic:       3.10.0-1062.el7.x86_64 SMP mod_unload modversions
signer:         Red Hat Enterprise Linux kernel signing key
sig_key:        AA:82:E3:5D:3B:30:2F:E1:49:5F:77:7E:DC:90:37:79:1F:0C:C5:9F
sig_hashalgo:   sha256
parm:           modeset:Disable/Enable modesetting (int)
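The alias check above can also be done programmatically. The following is a minimal sketch, assuming the modinfo output has been captured as text; the function names and the `sample` excerpt are illustrative and not part of any Red Hat tooling. It only handles aliases with explicit vendor and device fields, not wildcarded ones.

```python
import re

# Matches the vendor/device part of a PCI modalias such as
# "pci:v0000102Bd00000538sv*sd*bc*sc*i*" (8 hex digits each).
ALIAS_RE = re.compile(r"pci:v([0-9A-Fa-f]{8})d([0-9A-Fa-f]{8})")


def supported_pci_ids(modinfo_text):
    """Return the set of (vendor, device) pairs listed in the alias lines."""
    ids = set()
    for vendor, device in ALIAS_RE.findall(modinfo_text):
        # Keep the low 16 bits, which is what lspci prints (e.g. 102b:0538).
        ids.add((vendor[-4:].lower(), device[-4:].lower()))
    return ids


def module_claims_device(modinfo_text, pci_id):
    """Check whether a 'vvvv:dddd' PCI ID appears among the module aliases."""
    vendor, device = pci_id.lower().split(":")
    return (vendor, device) in supported_pci_ids(modinfo_text)


# Excerpt of the alias lines quoted in this comment.
sample = """\
alias:          pci:v0000102Bd00000538sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000536sv*sd*bc*sc*i*
alias:          pci:v0000102Bd00000522sv*sd*bc*sc*i*
"""

print(module_claims_device(sample, "102b:0538"))
```

A match only shows that the in-tree module binds to the device; it says nothing about which upstream or vendor driver version the code corresponds to.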
The reported driver version is 1.0.0:

kernel: mgag200 0000:02:00.0: fb0: mgadrmfb frame buffer device
kernel: [drm] Initialized mgag200 1.0.0 20110418 for 0000:02:00.0 on minor 0

Lenovo support recommends upgrading to driver version 4.11 or blacklisting the driver to avoid the PCIe errors seen in the customer's system.

BIOS Information
    Vendor: Lenovo
    Version: -[TEE142E-2.30]-
    Release Date: 07/02/2019
    Address: 0xF0000
    Runtime Size: 64 kB
    ROM Size: 32 MB
    Characteristics:
        PCI is supported
        PNP is supported
        BIOS is upgradeable
        BIOS shadowing is allowed
        Boot from CD is supported
        Selectable boot is supported
        EDD is supported
        Serial services are supported (int 14h)
        ACPI is supported
        USB legacy is supported
        BIOS boot specification is supported
        Targeted content distribution is supported
        UEFI is supported
    BIOS Revision: 2.30
    Firmware Revision: 2.80

System Information
    Manufacturer: Lenovo
    Product Name: ThinkSystem SR850 -[7X19CTO1WW]-
    Version: 07

Base Board Information
    Manufacturer: Lenovo
    Product Name: -[7X19CTO1WW]-
    Version: none

02:00.0 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. MGA G200e [Pilot] ServerEngines (SEP1) [102b:0522] (rev 42) (prog-if 00 [VGA controller])
    Subsystem: Emulex Corporation Device [19a2:0101]
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 16
    NUMA node: 0
    Region 0: Memory at d0000000 (32-bit, non-prefetchable) [size=16M]
    Region 1: Memory at d1a10000 (32-bit, non-prefetchable) [size=16K]
    Region 2: Memory at d1000000 (32-bit, non-prefetchable) [size=8M]
    Expansion ROM at d1a00000 [disabled] [size=64K]
    Capabilities: [dc] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [54] MSI: Enable- Count=1/1 Maskable- 64bit-
        Address: 00000000 Data: 0000
    Kernel driver in use: mgag200
    Kernel modules: mgag200
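To compare the installed device against an errata or a module alias list, the vendor:device pair can be pulled out of the lspci device line. This is a rough sketch, assuming output in the numeric-ID style shown above; `device_id_from_lspci` is a hypothetical helper name.

```python
import re

# A bracketed "vvvv:dddd" pair; the class code "[0300]" has no colon
# and "[Pilot]" is not hex, so neither is matched.
PCI_ID_RE = re.compile(r"\[([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\]")


def device_id_from_lspci(device_line):
    """Return the first 'vvvv:dddd' ID on an lspci device line, or None."""
    match = PCI_ID_RE.search(device_line)
    if match is None:
        return None
    return f"{match.group(1)}:{match.group(2)}".lower()


# First line of the lspci output quoted in this comment.
line = (
    "02:00.0 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. "
    "MGA G200e [Pilot] ServerEngines (SEP1) [102b:0522] (rev 42) "
    "(prog-if 00 [VGA controller])"
)

print(device_id_from_lspci(line))
```

Pass only the device line itself; the Subsystem line carries its own bracketed ID and would otherwise be picked up.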
(In reply to Daniele from comment #0)
> Description of problem:
> A customer of mine (VTS) has had some hypervisors crash. Lenovo has
> identified the Matrox driver as a possible cause.
>
> Lenovo has pointed at https://support.lenovo.com/us/en/solutions/ht505175 ,
> which mentions that where the driver v4.11.0 is not used it could potentially
> cause PCIe issues and other anomalies.
>
> Matrox Video Driver 4.11.0 for RHEL Linux Enterprise 7
> https://datacentersupport.lenovo.com/us/en/products/servers/thinksystem/sr850/7x19/downloads/DS500305
>
> Per RedHat https://access.redhat.com/errata/RHEA-2017:1730, Affected
> Products include RHEL 7.7
>
>
> This driver seems to only be released as part of the driver update program,
> which means it is not available in their repos at the moment and/or in the
> image.
> They've been asked to download and install it manually or to reinstall the
> host with RHEL.
> Of course, they're not eager to do that. They expect this to be added to the
> next RHV-H release.

What about simply blacklisting the Matrox driver? RHV-H does not really need graphics, so normal (S)VGA output should be sufficient?

> Version-Release number of selected component (if applicable):
> RHV-H 4.3
>
> How reproducible:
>
> Steps to Reproduce:
> 1.
> 2.
> 3.
>
> Actual results:
> Updated Matrox driver not available in the RHV repos.
>
> Expected results:
> Updated Matrox driver available in the RHV repos.
>
> Additional info:
Lenovo recommends version 4.11 of the driver to avoid stability problems. We couldn't capture vmcores with kdump on this system because the host rebooted immediately after the crash, but blacklisting the mgag200 module fixed the problem. That's enough for RHV, but it's clear that the version shipped in RHEL 7.7 has some problems.
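One way to confirm that the blacklist took effect after a reboot is to check that mgag200 no longer appears in /proc/modules. A minimal sketch follows; the helper names are hypothetical and the sample text is made up for illustration, not taken from the customer's host.

```python
def loaded_modules(proc_modules_text):
    """Return the module names from /proc/modules-style text (first column)."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def is_loaded(module, proc_modules_text):
    """The kernel reports module names with underscores, so normalize dashes."""
    return module.replace("-", "_") in loaded_modules(proc_modules_text)


def read_proc_modules(path="/proc/modules"):
    """Read the live module list on a Linux host."""
    with open(path) as handle:
        return handle.read()


# Made-up sample in the /proc/modules column layout.
before = (
    "mgag200 41138 1 - Live 0xffffffffc0a5e000\n"
    "drm_kms_helper 186531 1 mgag200, Live 0xffffffffc09e7000\n"
)
after = "drm_kms_helper 186531 0 - Live 0xffffffffc09e7000\n"

print(is_loaded("mgag200", before), is_loaded("mgag200", after))
```

On a real host, pass `read_proc_modules()` as the second argument instead of the sample strings.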
Clearing the needinfo as it's already answered in comment 2.
Moving this to documentation to cover how to blacklist the Matrox driver.
Why is this not a RHEL issue? Is the driver present correctly in any RHEL version? Is there an equivalent RHEL bug?
Placing bug back into the documentation backlog until writing resources become available to resolve.
Sandro, Martin, How do you blacklist the Matrox driver?
Also, I just want to point out that I'll be updating the 4.4 documentation, referencing RHEL 8.4/RHVH 4.4 hypervisors, but this bug originally referenced 4.3.
(In reply to Steve Goodman from comment #17)
> Sandro, Martin,
>
> How do you blacklist the Matrox driver?

With this procedure: https://access.redhat.com/solutions/41278
by blacklisting the mgag200 module.
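For illustration, the modprobe.d entries that deny a module can be sketched as below. This assumes the usual `blacklist` and `install <module> /bin/false` directives described in modprobe.d(5); the function name is hypothetical, and the sketch only renders the file content. The KB article remains the reference for the complete procedure.

```python
import re

# Conservative check so a stray space or newline cannot inject extra directives.
MODULE_NAME_RE = re.compile(r"^[A-Za-z0-9_-]+$")


def render_denylist_conf(module):
    """Return modprobe.d file content that prevents `module` from loading."""
    if not MODULE_NAME_RE.match(module):
        raise ValueError(f"unexpected module name: {module!r}")
    lines = [
        f"# Prevent {module} from loading automatically or on demand.",
        # "blacklist" stops alias-based autoloading of the module.
        f"blacklist {module}",
        # "install" replaces the load command, so explicit loads fail too.
        f"install {module} /bin/false",
    ]
    return "\n".join(lines) + "\n"


conf_text = render_denylist_conf("mgag200")
print(conf_text, end="")
```

The output would be saved as a `.conf` file under /etc/modprobe.d; the exact file name is the administrator's choice.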
(In reply to Sandro Bonazzola from comment #19)
> (In reply to Steve Goodman from comment #17)
> > Sandro, Martin,
> >
> > How do you blacklist the Matrox driver?
>
> With this procedure: https://access.redhat.com/solutions/41278
> by blacklisting mgag200 module.

Sandro,

Are you saying that we adopt https://access.redhat.com/solutions/41278 for RHV documentation (RHEL 8 option only)? Do we want to focus on mgag200 or simply list it as an example? Where should this appear in the documentation?
Here's more specific guidance on the scope of the request in this bug:

1. Add a new module describing how to add a new driver and how to deny a specific module. Make this a section in the appendix of the installation guides.
2. For RHVH: Add a link to that module after the last step in registering a host on Satellite in [1].
3. For RHEL Hosts: Add a link to that module after the last step in [2].

[1] https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html/installing_red_hat_virtualization_as_a_self-hosted_engine_using_the_cockpit_web_interface/installing_the_self-hosted_engine_deployment_host_she_cockpit_deploy#Installing_Red_Hat_Virtualization_Hosts_SHE_deployment_host
[2] https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html/installing_red_hat_virtualization_as_a_self-hosted_engine_using_the_cockpit_web_interface/installing_the_self-hosted_engine_deployment_host_she_cockpit_deploy#Enabling_the_Red_Hat_Enterprise_Linux_Host_Repositories_SHE_deployment_host
The information on how to deny a specific module is described in the KB article from comment 20: How do I prevent a kernel module from loading automatically? [1] [1] https://access.redhat.com/solutions/41278
(In reply to Steve Goodman from comment #21)
> Here's more specific guidance on the scope the request in this bug:
>
> 1. Add new module describing how to add a new driver and how to deny a
> specific module. Make this a section in the appendix of the installation
> guides.
> 2. For RHVH: Add a link to that module after the last step in registering a
> host on Satellite in [1].
> 3. For RHEL Hosts: Add a link to that module after the last step in [2].
>
> [1]
> https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html/installing_red_hat_virtualization_as_a_self-hosted_engine_using_the_cockpit_web_interface/installing_the_self-hosted_engine_deployment_host_she_cockpit_deploy#Installing_Red_Hat_Virtualization_Hosts_SHE_deployment_host
> [2]
> https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.4/html/installing_red_hat_virtualization_as_a_self-hosted_engine_using_the_cockpit_web_interface/installing_the_self-hosted_engine_deployment_host_she_cockpit_deploy#Enabling_the_Red_Hat_Enterprise_Linux_Host_Repositories_SHE_deployment_host

Sandro,

How does step 1 differ from what we already have in
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_monitoring_and_updating_the_kernel/managing-kernel-modules_managing-monitoring-and-updating-the-kernel#loading-kernel-modules-automatically-at-system-boot-time_managing-kernel-modules
and
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_monitoring_and_updating_the_kernel/managing-kernel-modules_managing-monitoring-and-updating-the-kernel#preventing-kernel-modules-from-being-automatically-loaded-at-system-boot-time_managing-kernel-modules
?
Sandro (and everyone else) -- Just to keep everyone in the loop, Steve is covering how to add a driver in response to https://bugzilla.redhat.com/show_bug.cgi?id=1834298 in one module and I'm covering how to deny a module in another module that will appear after Steve's. The goal is that when both our MRs are completed, merged, and published, customers will experience them as a single unit.
Steve, Eli, or Donna please review https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1982
Peer review completed. I made a few comments. Looks good overall. After fixing those issues I noted, you can move this forward.
Sandro, Lev -- I got a couple of questions from the peer reviewer that I could not answer. See below:

>> In order to prevent kernel modules loading during boot, the module name must be added to a configuration file for the "modprobe" utility. This file must reside in /etc/modprobe.d .

Q: Does this configuration file have a specific name? Is there any existing documentation about how to create this configuration file, or can we assume our users know how to create it?

>> Ensure the module is not configured to get loaded in /etc/modprobe.conf, /etc/modprobe.d/*, /etc/rc.modules, or /etc/sysconfig/modules/* before making the following modifications.

Q: Is there any existing documentation that explains how a user ensures the module is not configured to get loaded in any of the files mentioned, or can we assume our users know how to do this?
(In reply to Richard Hoch from comment #41)
> Sandro, Lev -- I got a couple of questions from the peer reviewer that I could
> not answer. See below:
>
> >> In order to prevent kernel modules loading during boot, the module name must be added to a configuration file for the "modprobe" utility. This file must reside in /etc/modprobe.d .
>
> Q: Does this configuration file have a specific name? Is there any existing
> documentation about how to create this configuration file or can we assume
> our users know how to create it?

It's documented in the man page (https://man7.org/linux/man-pages/man5/modprobe.d.5.html): any .conf file within that directory will be processed.

> >> Ensure the module is not configured to get loaded in /etc/modprobe.conf, /etc/modprobe.d/*, /etc/rc.modules, or /etc/sysconfig/modules/* before making the following modifications.
>
> Q: Is there any existing documentation that explains how a user ensures the
> module is not configured to get loaded in any of the files mentioned or can
> we assume our users know how to do this?

I guess "modprobe --showconfig", as per https://man7.org/linux/man-pages/man8/modprobe.8.html, and check there for "install" lines.
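As a rough illustration of that check, the sketch below filters `modprobe --showconfig` output for directives that name a given module. The function names and sample text are hypothetical; only directives whose second field is the module name are considered, so `alias` lines are skipped.

```python
import subprocess

# modprobe.d directives that take the module name as their second field.
DIRECTIVES = ("blacklist", "install", "options", "remove", "softdep")


def directives_for_module(config_text, module):
    """Return the config lines that apply one of DIRECTIVES to `module`."""
    wanted = module.replace("-", "_")
    hits = []
    for line in config_text.splitlines():
        parts = line.split()
        if (
            len(parts) >= 2
            and parts[0] in DIRECTIVES
            and parts[1].replace("-", "_") == wanted
        ):
            hits.append(line.strip())
    return hits


def effective_modprobe_config():
    """Dump the effective configuration on a host where modprobe is available."""
    result = subprocess.run(
        ["modprobe", "--showconfig"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Made-up sample resembling --showconfig output.
sample_config = """\
blacklist pcspkr
install mgag200 /bin/false
options mgag200 modeset=0
alias pci:v0000102Bd00000522sv*sd*bc*sc*i* mgag200
"""

for entry in directives_for_module(sample_config, "mgag200"):
    print(entry)
```

On a live system, `directives_for_module(effective_modprobe_config(), "mgag200")` would perform the same filter against the real configuration.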
Sandro, I changed the Prerequisites and steps 1 and 2 of the procedure in the MR: https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1982. Please review. My main concern is the lines of code I added in steps 1 and 2. It may be easier to see them in this preview: https://cee-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/CCS/job/ccs-mr-preview/37817/artifact/assembly-Installing_Red_Hat_Virtualization_as_a_self-hosted_engine_using_the_Cockpit_web_interface/preview/index.html#proc-Preventing_Kernel_Modules_from_Loading_Automatically_Install_nodes_RHVH
Steve, Lev approved the changes. Please review. https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1982
LGTM!
Lucie or Guilherme, please review this MR: https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1982. Lev and Steve have approved it. The procedure is basically a rewriting of https://access.redhat.com/solutions/41278 for RHEL 8 only.
Hi Richard, this issue is for Node team. Chen, can you please take a look. Thanks!
MR merged and published. https://gitlab.cee.redhat.com/rhci-documentation/docs-Red_Hat_Enterprise_Virtualization/-/merge_requests/1982