Bug 1684466

Summary: RHEL 8 guest fails to boot with 98 virtio disks attached via multifunction pcie-root-port
Product: Red Hat Enterprise Linux Advanced Virtualization
Reporter: Yiqian Wei <yiwei>
Component: qemu-kvm
Assignee: Sergio Lopez <slopezpa>
Status: CLOSED ERRATA
QA Contact: Yiqian Wei <yiwei>
Severity: low
Docs Contact:
Priority: low
Version: 8.0
CC: chayang, coli, ddepaula, jinzhao, juzhang, rbalakri, slopezpa, virt-maint, xuwei, yiwei
Target Milestone: rc
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-kvm-4.0.0-3.module+el8.1.0+3265+26c4ed71
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-11-06 07:13:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                       Flags
boot guest with cmd               none
Full console log                  none
/run/initramfs/rdsosreport.txt    none
fixed log                         none

Description Yiqian Wei 2019-03-01 10:40:57 UTC
Description of problem:
A RHEL 8 guest fails to boot with 98 virtio disks attached through multifunction pcie-root-port devices. Logging in to the guest through the serial console shows:
...
[  185.591489] random: fast init done
[ TIME ] Timed out waiting for device dev-ma…d\x2d74\x2d\x2d184\x2dswap.device.
[DEPEND] Dependency failed for Resume from h…/dev/mapper/rhel_vm--74--184-swap.
[  OK  ] Reached target Local File Systems (Pre).
[  OK  ] Reached target Local File Systems.
         Starting Create Volatile Files and Directories...
[  OK  ] Started Create Volatile Files and Directories.
[  OK  ] Reached target System Initialization.
[  OK  ] Reached target Basic System.
[  318.489514] dracut-initqueue[392]: Warning: dracut-initqueue timeout - starting timeout scripts
[  319.072988] dracut-initqueue[392]: Warning: dracut-initqueue timeout - starting timeout scripts
---
[  384.064983] dracut-initqueue[392]: Warning: Could not boot.
...

Version-Release number of selected component (if applicable):
host version:
qemu-kvm-3.1.0-18.module+el8+2834+fa8bb6e2.x86_64
kernel-4.18.0-74.el8.x86_64
seabios-1.12.0-1.module+el8+2756+0dd6db45.x86_64
guest: rhel8 (kernel-4.18.0-74.el8.x86_64)

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with the command line in the attachment.
2. On the host, log in to the guest through the serial console:
# nc -U /tmp/console

Actual results:
---
[  318.489514] dracut-initqueue[392]: Warning: dracut-initqueue timeout - starting timeout scripts
[  319.072988] dracut-initqueue[392]: Warning: dracut-initqueue timeout - starting timeout scripts
---
[  384.064983] dracut-initqueue[392]: Warning: Could not boot.
...
Expected results:
The guest boots and login succeeds.

Additional info:

Comment 1 Yiqian Wei 2019-03-01 10:53:01 UTC
Created attachment 1539786 [details]
boot guest with cmd

Comment 2 Yiqian Wei 2019-03-01 11:06:48 UTC
Created attachment 1539789 [details]
Full console log

Actual results:
[  185.591489] random: fast init done
[ TIME ] Timed out waiting for device dev-ma…d\x2d74\x2d\x2d184\x2dswap.device.
[DEPEND] Dependency failed for Resume from h…/dev/mapper/rhel_vm--74--184-swap.
[  OK  ] Reached target Local File Systems (Pre).
[  OK  ] Reached target Local File Systems.
         Starting Create Volatile Files and Directories...
[  OK  ] Started Create Volatile Files and Directories.
[  OK  ] Reached target System Initialization.
[  OK  ] Reached target Basic System.
[  318.489514] dracut-initqueue[392]: Warning: dracut-initqueue timeout - starting timeout scripts
[  319.072988] dracut-initqueue[392]: Warning: dracut-initqueue timeout - starting timeout scripts
---
[  384.061781] dracut-initqueue[392]: Warning: dracut-initqueue timeout - starting timeout scripts
[  384.064983] dracut-initqueue[392]: Warning: Could not boot.
         Starting Setup Virtual Console...
[  OK  ] Started Setup Virtual Console.
         Starting Dracut Emergency Shell...
Warning: /dev/mapper/rhel_vm--74--184-root does not exist
Warning: /dev/rhel_vm-74-184/root does not exist
Warning: /dev/rhel_vm-74-184/swap does not exist

Generating "/run/initramfs/rdsosreport.txt"


Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot
after mounting them and attach it to a bug report.


dracut:/#

Comment 3 Yiqian Wei 2019-03-01 11:08:20 UTC
Created attachment 1539790 [details]
/run/initramfs/rdsosreport.txt

Comment 4 Sergio Lopez 2019-03-12 11:33:48 UTC
I've reproduced the issue here, identifying two problems:

 - Extremely slow PCI device initialization. This is caused by the intel-iommu creating a Flatview for each device, and it's being worked on upstream. We should probably create a separate BZ for tracking this.

 - The Guest fails to initialize the AHCI PCI device which holds the boot disk. The problem here is that the Guest runs out of IRQ vectors (-28 is ENOSPC), with each virtio device consuming two (config + vq). I think this is more a limitation than an actual bug, and we should simply document it somewhere.
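
A minimal sketch of the arithmetic behind the second point, assuming only what the comment states: two vectors per virtio device (config + vq) and an error code of -28. The helper names are hypothetical, and the guest's actual vector budget is not given in this report, so the sketch does not model it.

```python
import errno
import os

# Assumption taken from the comment above: one config vector plus one
# virtqueue vector per virtio device.
VECTORS_PER_VIRTIO_DEVICE = 2


def virtio_vectors_needed(num_devices, per_device=VECTORS_PER_VIRTIO_DEVICE):
    """Return the number of interrupt vectors requested by the virtio devices."""
    return num_devices * per_device


def describe_errno(negative_code):
    """Map a kernel-style negative return value to its symbolic name and text."""
    code = -negative_code
    return errno.errorcode[code], os.strerror(code)


if __name__ == "__main__":
    # 98 virtio disks, as in the reproducer for this bug.
    print("vectors requested by virtio disks:", virtio_vectors_needed(98))
    name, text = describe_errno(-28)
    print("error -28:", name, "-", text)
```

Any other device in the guest, such as the AHCI controller holding the boot disk, has to be served from whatever vectors remain after these requests.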

Comment 5 Sergio Lopez 2019-04-10 13:53:56 UTC
Upstream has a patch addressing this issue, included in qemu-4.0:

commit 4b519ef1de9a7cb8123abadab9e6c5697373087c
Author: Peter Xu <peterx>
Date:   Wed Mar 13 17:43:23 2019 +0800

    intel-iommu: optimize nodmar memory regions
    
    Previously we have per-device system memory aliases when DMAR is
    disabled by the system.  It will slow the system down if there are
    lots of devices especially when DMAR is disabled, because each of the
    aliased system address space will contain O(N) slots, and rendering
    such N address spaces will be O(N^2) complexity.
    
    This patch introduces a shared nodmar memory region and for each
    device we only create an alias to the shared memory region.  With the
    aliasing, QEMU memory core API will be able to detect when devices are
    sharing the same address space (which is the nodmar address space)
    when rendering the FlatViews and the total number of FlatViews can be
    dramatically reduced when there are a lot of devices.
    
    Suggested-by: Paolo Bonzini <pbonzini>
    Signed-off-by: Peter Xu <peterx>
    Message-Id: <20190313094323.18263-1-peterx>
    Signed-off-by: Paolo Bonzini <pbonzini>
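
The complexity argument in the commit message can be illustrated with a toy model. This is only a sketch and not QEMU's memory API: the function names and the tuple-based "layout" are hypothetical stand-ins, and the only assumption carried over from the commit message is that each address space holds O(N) slots and that identical address spaces can be rendered once.

```python
def render_cost(address_spaces):
    """Count the slots rendered when each distinct layout is rendered only once."""
    distinct_layouts = set(address_spaces)
    return sum(len(layout) for layout in distinct_layouts)


def per_device_aliases(n):
    """Old behaviour (toy model): every device has its own layout of n slots.

    Each slot is tagged with the device id, so no two layouts compare equal
    and none of them can be shared.
    """
    return [tuple((dev, slot) for slot in range(n)) for dev in range(n)]


def shared_nodmar(n):
    """New behaviour (toy model): every device aliases one shared layout."""
    shared = tuple(("nodmar", slot) for slot in range(n))
    return [shared for _ in range(n)]


if __name__ == "__main__":
    n = 98  # number of devices in the reproducer
    print("per-device layouts:", render_cost(per_device_aliases(n)), "slots")
    print("shared layout:     ", render_cost(shared_nodmar(n)), "slots")
```

In this model the per-device case grows as N squared while the shared case grows linearly, which matches the reduction in FlatViews that the commit message describes.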

Comment 7 Yiqian Wei 2019-06-12 02:43:56 UTC
(In reply to Sergio Lopez from comment #4)
Hi Sergio

I tested this issue with the fixed version. The detailed test results follow.

> I've reproduced the issue here, identifying two problems:
> 
>  - Extremely slow PCI device initialization. This is caused by the
> intel-iommu creating a Flatview for each device, and it's being worked on
> upstream. We should probably create a separate BZ for tracking this.

Compared with comment 0, PCI device initialization is faster. For details, see the attachment (fixed log).
Is this the accepted result? If yes, we can verify this issue based on the test results of the fixed version.

> 
>  - The Guest fails to initialize the AHCI PCI device which holds the boot
> disk. The problem here is that the Guest runs out of IRQ vectors (-28 is
> ENOSPC), with each virtio device consuming two (config + vq). I think this
> is more a limitation than an actual bug, and we should simply document it
> somewhere.

For this issue, login to the guest still fails when it boots with 98 disks.
I also tried a RHEL 7.7 guest following comment 0; that guest boots successfully with 98 disks.

I am not sure whether the issue you mentioned is the same as https://bugzilla.redhat.com/show_bug.cgi?id=1526370. Could you check?

If yes, we can track this issue through bz1526370.


Thanks
Jing

Comment 8 Yiqian Wei 2019-06-12 02:47:22 UTC
Created attachment 1579564 [details]
fixed log

Comment 9 Sergio Lopez 2019-06-12 06:37:56 UTC
(In reply to Yiqian Wei from comment #7)
> (In reply to Sergio Lopez from comment #4)
> Hi Sergio
> 
> I tested this issue with the fixed version. The detailed test results
> follow.
> 
> > I've reproduced the issue here, identifying two problems:
> > 
> >  - Extremely slow PCI device initialization. This is caused by the
> > intel-iommu creating a Flatview for each device, and it's being worked on
> > upstream. We should probably create a separate BZ for tracking this.
> 
> Compared with comment 0, PCI device initialization is faster. For details,
> see the attachment (fixed log).
> Is this the accepted result? If yes, we can verify this issue based on the
> test results of the fixed version.

Yes, we can consider it verified.

> >  - The Guest fails to initialize the AHCI PCI device which holds the boot
> > disk. The problem here is that the Guest runs out of IRQ vectors (-28 is
> > ENOSPC), with each virtio device consuming two (config + vq). I think this
> > is more a limitation than an actual bug, and we should simply document it
> > somewhere.
> 
> For this issue, login to the guest still fails when it boots with 98 disks.
> I also tried a RHEL 7.7 guest following comment 0; that guest boots
> successfully with 98 disks.

This is a limitation on the guest (it's running out of interrupt vectors), so this isn't exactly a bug. We should probably just document it somewhere and move on.
 
> I am not sure whether the issue you mentioned is the same as
> https://bugzilla.redhat.com/show_bug.cgi?id=1526370. Could you check?
> 
> If yes, we can track this issue through bz1526370.

No, it isn't exactly the same.

Thanks,
Sergio.

Comment 14 errata-xmlrpc 2019-11-06 07:13:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723