Bug 2038101

Summary: [WRB] virtio-net-failover test freeze on x86_64
Product: Red Hat Enterprise Linux 9 Reporter: Miroslav Rezanina <mrezanin>
Component: qemu-kvm Assignee: Virtualization Maintenance <virt-maint>
qemu-kvm sub component: Networking QA Contact: Yanhui Ma <yama>
Status: CLOSED CURRENTRELEASE Docs Contact:
Severity: medium    
Priority: unspecified CC: coli, jinzhao, juzhang, lvivier, virt-maint
Version: unspecified   
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version: WRB 2022-01-26 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-01-26 11:07:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Miroslav Rezanina 2022-01-07 09:44:24 UTC
Found on WRB:
2022-01-05

Affected commit:
Enable make check

Upstream change introducing issue:
unidentified

Issue:
When running make check on x86_64, the run hangs forever on the virtio-net-failover test, which reports the following error:

   ERROR:../tests/qtest/virtio-net-failover.c:336:start_virtio_net: 'dev' should not be NULL ERROR


Temporary solution:
Test disabled.

Expected solution:
Test or QEMU code fixed so that the test passes.

Additional information:

Comment 1 Yanhui Ma 2022-01-10 04:11:51 UTC
Hello Miroslav,

Could you provide the test steps and the qemu version for this bug? There is currently no WRB qemu-kvm build for RHEL 9 in brewweb.
We just want to know whether QE can do anything for it.

Regards,
Yanhui

Comment 2 Miroslav Rezanina 2022-01-12 16:29:18 UTC
(In reply to Yanhui Ma from comment #1)
> hello Miroslav,
> 
> Could you help provide test steps and qemu version for the bug? Because
> currently there is no WRB qemu-kvm for rhel9 in brewweb.
> We just want to know whether QE can do something for it.
> 
> Regards,
> Yanhui

Hi,

The failing build is: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=42354697

To test from source, you just need to apply the following patch on the 220112-6.2.50 branch for WRB:

diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index a4dcd0051d..c986125a69 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -68,6 +68,10 @@ qtests_i386 = \
   (config_all_devices.has_key('CONFIG_RTL8139_PCI') ? ['rtl8139-test'] : []) +              \
   (config_all_devices.has_key('CONFIG_E1000E_PCI_EXPRESS') ? ['fuzz-e1000e-test'] : []) +   \
   (config_all_devices.has_key('CONFIG_ESP_PCI') ? ['am53c974-test'] : []) +                 \
+  (config_all_devices.has_key('CONFIG_VIRTIO_NET') and                                      \
+   config_all_devices.has_key('CONFIG_Q35') and                                             \
+   config_all_devices.has_key('CONFIG_VIRTIO_PCI') and                                      \
+   slirp.found() ? ['virtio-net-failover'] : []) +                                          \
   qtests_pci +                                                                              \
   ['fdc-test',
    'ide-test',
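
The condition added by this patch can be restated as a small Python sketch. This is only an illustration of the logic, not part of the build: the function name `failover_qtests` and the sample dictionary are hypothetical, and the dictionary stands in for Meson's `config_all_devices`. In Meson the ternary binds more loosely than `and`, so the whole `and` chain is the condition.

```python
def failover_qtests(config_all_devices, slirp_found):
    """Return the qtest names the patched condition would add.

    config_all_devices: dict standing in for Meson's config_all_devices.
    slirp_found: stand-in for the result of slirp.found().
    """
    required = ("CONFIG_VIRTIO_NET", "CONFIG_Q35", "CONFIG_VIRTIO_PCI")
    # All three device options must be present AND slirp must be available.
    if all(key in config_all_devices for key in required) and slirp_found:
        return ["virtio-net-failover"]
    return []


# Hypothetical sample configuration with all three options present.
full = {"CONFIG_VIRTIO_NET": "y", "CONFIG_Q35": "y", "CONFIG_VIRTIO_PCI": "y"}
print(failover_qtests(full, slirp_found=True))
print(failover_qtests(full, slirp_found=False))
```

If any of the three keys is missing, or slirp is not found, the test is simply left out of the x86 qtest list.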

Comment 3 Laurent Vivier 2022-01-17 11:07:13 UTC
Hi Mirek,

I'm able to reproduce the problem; I'll have a look. It seems to be downstream-only.

Comment 4 Laurent Vivier 2022-01-17 13:34:48 UTC
The problem seems to be introduced by:

commit db31eb85ebb233e7948e9556faf47ed454eab807
Author: Miroslav Rezanina <mrezanin>
Date:   Fri Oct 19 13:10:31 2018 +0200

    Add x86_64 machine types
    
    Adding changes to add RHEL machine types for x86_64 architecture.
    
    Signed-off-by: Miroslav Rezanina <mrezanin>
    
    Rebase notes (6.1.0):
    - Update qemu64 cpu spec
    
    Merged patches (6.1.0):
    - 59c284ad3b x86: Add x86 rhel8.5 machine types
    - a8868b42fe redhat: x86: Enable 'kvm-asyncpf-int' by default
    - a3995e2eff Remove RHEL 7.0.0 machine type (only x86_64 changes)
    - ad3190a79b Remove RHEL 7.1.0 machine type (only x86_64 changes)
    - 84bbe15d4e Remove RHEL 7.2.0 machine type (only x86_64 changes)
    - 0215eb3356 Remove RHEL 7.3.0 machine types (only x86_64 changes)
    - af69d1ca6e Remove RHEL 7.4.0 machine types (only x86_64 changes)
    - 8f7a74ab78 Remove RHEL 7.5.0 machine types (only x86_64 changes)
    
    Merged patches (weekly-220105):
    - eae7d8dd3c x86/rhel machine types: Add pc_rhel_8_5_compat
    - 6762f56469 x86/rhel machine types: Wire compat into q35 and i440fx

Comment 5 Laurent Vivier 2022-01-17 14:39:08 UTC
The problem occurs because acpi-pci-hotplug-with-bridge-support is disabled by default for RHEL machine types.

You can fix the test by enabling it:

diff --git a/tests/qtest/virtio-net-failover.c b/tests/qtest/virtio-net-failover.c
index 22ad54bb9594..aba3261cbbe0 100644
--- a/tests/qtest/virtio-net-failover.c
+++ b/tests/qtest/virtio-net-failover.c
@@ -23,6 +23,7 @@
 #define PCI_SEL_BASE            0x0010
 
 #define BASE_MACHINE "-M q35 -nodefaults " \
+    "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=on " \
     "-device pcie-root-port,id=root0,addr=0x1,bus=pcie.0,chassis=1 " \
     "-device pcie-root-port,id=root1,addr=0x2,bus=pcie.0,chassis=2 "
 
I'm going to see whether I can fix the test with this support set to off (I think it fails because hotplug is slower with PCIe native hotplug).
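
To make the effect of the patch above easier to see, here is a rough Python sketch that assembles the same argument string as BASE_MACHINE, with and without the extra -global option. The function name `base_machine_args` and the `force_acpi_hotplug` parameter are made up for illustration; the option strings are copied from the patch.

```python
import shlex

# Option added by the patch in this comment.
HOTPLUG_GLOBAL = "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=on"


def base_machine_args(force_acpi_hotplug):
    """Build the BASE_MACHINE arguments as a list (hypothetical helper)."""
    parts = ["-M q35 -nodefaults"]
    if force_acpi_hotplug:
        # Explicitly switch ACPI PCI hotplug back on, since the RHEL
        # machine types disable it by default.
        parts.append(HOTPLUG_GLOBAL)
    parts.append("-device pcie-root-port,id=root0,addr=0x1,bus=pcie.0,chassis=1")
    parts.append("-device pcie-root-port,id=root1,addr=0x2,bus=pcie.0,chassis=2")
    return shlex.split(" ".join(parts))


for flag in (False, True):
    print(flag, base_machine_args(flag))
```

The only difference between the two lists is the `-global` pair, so the rest of the test setup is unchanged by the fix.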

Comment 6 Miroslav Rezanina 2022-01-19 10:10:34 UTC
(In reply to Laurent Vivier from comment #3)
> Hi Mirek,
> 
> I'm able to reproduce the problem, I have a look. It seems downstream only.

Yes, this test passes when using the upstream configuration in 'Initial redhat build' but fails after the downstream configuration is applied in 'Enable make check'.

Comment 7 Laurent Vivier 2022-01-19 11:21:48 UTC
(In reply to Miroslav Rezanina from comment #6)
> (In reply to Laurent Vivier from comment #3)
> > Hi Mirek,
> > 
> > I'm able to reproduce the problem, I have a look. It seems downstream only.
> 
> Yes, this test pass correctly when using upstream configuration in 'Initial
> redhat build' but fails after downstream config is done in 'Enable make
> check'.

Could you apply the patch from comment #5 downstream?

I'm not sure I'll have the time to fix the test to run with a non-default upstream configuration.

Comment 8 Miroslav Rezanina 2022-01-26 11:07:47 UTC
With the fix from comment #5, the test works properly. Applied in WRB on 2022-01-26.

Comment 9 Yanhui Ma 2022-01-29 06:28:09 UTC
I can reproduce the issue with the following steps; pasting them here for reference.

# git clone https://gitlab.com/redhat/rhel/sst/virtualization/qemu-kvm-weekly-rebase
# cd qemu-kvm-weekly-rebase
# git checkout 220112-6.2.50
Then apply the patch from comment 2.
# ./configure --enable-kvm --target-list=x86_64-softmmu
# make
# cd build/
# make check
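
Since the original report describes make check as stuck forever, it may help to run the last step under a time limit when repeating the steps above. The following is a minimal sketch, assuming Python 3 is available; the helper name `run_with_timeout` and the time limits are arbitrary choices, not anything used by WRB.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s, cwd=None):
    """Run cmd and return its exit code, or None if it exceeded timeout_s.

    subprocess.run() kills the direct child on timeout; processes started
    by that child (for example QEMU instances spawned by a test) may need
    to be cleaned up separately.
    """
    try:
        return subprocess.run(cmd, cwd=cwd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None


# Demo with a stand-in command that sleeps longer than the limit.
demo = [sys.executable, "-c", "import time; time.sleep(5)"]
result = run_with_timeout(demo, timeout_s=0.5)
print("timed out" if result is None else "exit code %d" % result)
```

For the reproduction itself, the call would look something like run_with_timeout(["make", "check"], timeout_s=1800, cwd="build"), where the 30-minute limit is just a guess.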