Bug 1463163

Summary: Guest OS goes down when IOMMU is enabled for a virtio disk
Product: Red Hat Enterprise Linux 7
Reporter: Jingjing Shao <jishao>
Component: seabios
Assignee: jason wang <jasowang>
Status: CLOSED ERRATA
QA Contact: FuXiangChun <xfu>
Severity: high
Priority: high
Version: 7.4
CC: ailan, chayang, coli, dyuan, jasowang, jherrman, jishao, jtomko, juzhang, knoel, michen, mtessun, pezhang, virt-maint, wainersm, xuzhang, yalzhang
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: seabios-1.10.2-4.el7
Doc Type: Bug Fix
Doc Text:
Previously, after enabling the Input/Output Memory Management Unit (IOMMU) for a Q35 machine type guest, the guest failed to boot and displayed a "No bootable device" error message. This update fixes the handling of IOMMU on Q35 guests, and the described problem no longer occurs.
Story Points: ---
Clone Of:
Clones: 1472131 (view as bug list)
Environment:
Last Closed: 2018-04-10 14:26:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1471076, 1472131
Attachments:
  bios that supports IOMMU_PLATFORM (flags: none)
  scsi controller enable virtio iommu (flags: none)

Description Jingjing Shao 2017-06-20 09:35:56 UTC
Description of problem:
Guest OS goes down when IOMMU is enabled for a virtio disk

Version-Release number of selected component (if applicable):
3.10.0-681.el7.x86_64
qemu-kvm-rhev-2.9.0-10.el7.x86_64
libvirt-3.2.0-10.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. On the host, add "iommu=pt intel_iommu=on" to the kernel command line
2. Add "intel_iommu=on" to the kernel command line of the q35 guest
3. Add the following XML to the guest definition:
   <features>
     <ioapic driver='qemu'/>
   </features>
   ...
   <iommu model='intel'>
     <driver intremap='on' iotlb='on'/>
   </iommu>

4. Add iommu='on' ats='on' to the disk <driver> element:

<disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' iommu='on' ats='on'/>
      <source file='/var/lib/libvirt/images/rhel7.3-q35.qcow2'/>
      <backingStore/>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </disk>

5. Start the guest.
The guest reports "No bootable device".


Actual results:
As step 5 shows, the guest reports "No bootable device".

Expected results:
The guest should start and allow a successful login.

Additional info:
If iommu='on' ats='on' is removed, the guest starts successfully.
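The workaround above amounts to stripping the two attributes from the disk's <driver> element. A minimal Python sketch of that edit (illustrative only; in practice the XML would come from `virsh dumpxml` and be loaded back with `virsh define`):

```python
import xml.etree.ElementTree as ET

DISK_XML = """
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none' iommu='on' ats='on'/>
  <source file='/var/lib/libvirt/images/rhel7.3-q35.qcow2'/>
  <target dev='vda' bus='virtio'/>
</disk>
"""

def strip_virtio_iommu(disk_xml: str) -> str:
    """Remove the iommu/ats attributes from every <driver> element."""
    root = ET.fromstring(disk_xml)
    for driver in root.iter('driver'):
        driver.attrib.pop('iommu', None)
        driver.attrib.pop('ats', None)
    return ET.tostring(root, encoding='unicode')

print(strip_virtio_iommu(DISK_XML))
```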

Comment 2 Jingjing Shao 2017-06-26 07:24:32 UTC
A similar error occurs when the SCSI controller enables virtio IOMMU, so I am
adding the details here. If the two issues have different causes and need a new bug, I will file one.


1. On the host, add "iommu=pt intel_iommu=on" to the kernel command line
2. Add "intel_iommu=on" to the kernel command line of the q35 guest
3. Add the following XML to the guest definition:
   <features>
     <ioapic driver='qemu'/>
   </features>
   ...
   <iommu model='intel'>
     <driver intremap='on' iotlb='on'/>
   </iommu>

4. Add iommu='on' ats='on' to the SCSI controller and change the disk bus to scsi.
....
     <controller type='scsi' index='0' model='virtio-scsi'>
       <driver iommu='on' ats='on'/>
     </controller>
....
     <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' iommu='on' ats='on'/>
      <source file='/nfs/rhel7.3-q35.qcow2'/>
      <target dev='vda' bus='scsi'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>


5. Start the guest; the guest cannot be accessed.

Comment 3 jason wang 2017-07-03 04:27:22 UTC
Created attachment 1293726 [details]
bios that supports IOMMU_PLATFORM

Comment 4 jason wang 2017-07-03 04:28:36 UTC
Please try the attached bios.bin to see if it solves the issue for 7.4 guest.

Note: you need to use qemu-kvm -bios bios.bin to test.

Thanks

Comment 6 Jingjing Shao 2017-07-03 12:34:39 UTC
(In reply to jason wang from comment #4)
> Please try the attached bios.bin to see if it solves the issue for 7.4 guest.
> 
> Note: you need to use qemu-kvm -bios bios.bin to test.
> 
> Thanks


Yes, I used the attached bios.bin to start the guest.

For the virtio disk, it works: the guest starts successfully with IOMMU enabled.
For the SCSI controller, it does not: the guest hangs.

Comment 7 jason wang 2017-07-04 02:05:44 UTC
(In reply to Jingjing Shao from comment #6)
> (In reply to jason wang from comment #4)
> > Please try the attached bios.bin to see if it solves the issue for 7.4 guest.
> > 
> > Note: you need to use qemu-kvm -bios bios.bin to test.
> > 
> > Thanks
> 
> 
> Yes, I used the attached bios.bin to start the guest.
> 
> For the virtio disk, it works: the guest starts successfully with IOMMU enabled.
> For the SCSI controller, it does not: the guest hangs.

Interesting, scsi seems to work for upstream qemu.

Please show me

1) qemu cli & version
2) guest kernel version
3) At what phase did the guest hang? Did you see GRUB?

Thanks

Comment 8 jason wang 2017-07-04 02:21:59 UTC
Works for me with the following cli:

$qemu_path -bios bios.bin \
           -drive file=$img_path,if=none,id=hd \
           -device virtio-scsi-pci,id=scsi,disable-legacy=on,disable-modern=off,iommu_platform=on \
           -device scsi-hd,drive=hd \
           -vnc :10 \
           -netdev tap,id=hn0,vhost=on,queues=4 \
           -device virtio-net-pci,netdev=hn0,vectors=10,mq=on,mrg_rxbuf=off,guest_tso4=off,guest_tso6=off,guest_ecn=off,guest_ufo=off,mac=$mac \
           -m 2G -enable-kvm \
           -cpu host \
           -smp 4 \
           -serial stdio \
           -snapshot

But I'm using upstream kernel as guest.

Comment 9 Jingjing Shao 2017-07-04 06:34:36 UTC
> Interesting, scsi seems to work for upstream qemu.
> 
> Please show me
> 
> 1) qemu cli & version
> 2) guest kernel version
> 3) At what phase did the guest hang? Did you see GRUB?
> 
> Thanks

qemu-kvm-rhev-2.9.0-14.el7.x86_64
guest kernel : 3.10.0-691.el7.x86_64

qemu cli:
 -device virtio-scsi-pci,iommu_platform=on,ats=on,id=scsi0,bus=pci.3,addr=0x0
 -drive file=/var/lib/libvirt/images/q35-bios.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0 -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1

But I think I have found the key factor.

If I add "iommu=on" to the guest kernel command line, I can reproduce this SCSI issue; I have added a screenshot as an attachment.

If I delete "iommu=on" from the guest kernel command line, the guest boots successfully.

The disk case is OK in both scenarios.
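The distinguishing factor, then, is whether "iommu=on" appears on the guest kernel command line, which can be checked inside the guest by parsing /proc/cmdline. A small sketch (the helper name and sample command lines are made up for illustration):

```python
def cmdline_options(cmdline: str) -> dict:
    """Split a kernel command line into a {key: value} map ('' for bare flags)."""
    opts = {}
    for token in cmdline.split():
        key, _, value = token.partition('=')
        opts[key] = value
    return opts

# Sample command lines; only the iommu-related options matter here:
failing = cmdline_options("ro root=/dev/sda1 intel_iommu=on iommu=on")
working = cmdline_options("ro root=/dev/sda1 intel_iommu=on")

print(failing.get('iommu'), 'iommu' in working)  # → on False
```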


Additional info:
I also tried enabling virtio IOMMU on the NIC device in a guest with "iommu=on" on the kernel command line. The guest cannot start successfully.

qemu cli:
-device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:7e:af:e6,bus=pci.7,addr=0x0,iommu_platform=on,ats=on

Comment 10 Jingjing Shao 2017-07-04 06:36:41 UTC
Created attachment 1294097 [details]
scsi controller enable virtio iommu

Comment 11 jason wang 2017-07-04 09:07:55 UTC
Thanks, intel_iommu=on works for upstream kernel. Let me try RHEL.

Comment 12 jason wang 2017-07-04 09:09:57 UTC
Btw, could you please try gpxe to see if it works? If not, please open a new bz for gpxe and assign it to me.

Thanks a lot.

Comment 15 jason wang 2017-07-04 11:45:43 UTC
(In reply to jason wang from comment #11)
> Thanks, intel_iommu=on works for upstream kernel. Let me try RHEL.

I spoke too fast; it looks like upstream is broken too. Please file a new bug with the component set to kernel, and assign it to me.

Thanks

Comment 21 Jingjing Shao 2017-07-05 08:34:08 UTC
(In reply to jason wang from comment #15)
> (In reply to jason wang from comment #11)
> > Thanks, intel_iommu=on works for upstream kernel. Let me try RHEL.
> 
> I spoke too fast; it looks like upstream is broken too. Please file a new bug
> with the component set to kernel, and assign it to me.
> 
> Thanks

Double-checked with jason; filed a new qemu-kvm-rhev bug for the SCSI issue.

https://bugzilla.redhat.com/show_bug.cgi?id=1467811

Comment 22 jason wang 2017-07-06 02:37:48 UTC
The SeaBIOS patch has been posted upstream.

Comment 23 jason wang 2017-07-13 04:22:12 UTC
Patch accepted upstream.

Comment 44 Wainer dos Santos Moschetta 2017-09-28 13:11:40 UTC
Fixed in seabios-1.10.2-4.el7

Comment 46 FuXiangChun 2017-11-28 06:35:11 UTC
Reproduced the bug with seabios-1.10.2-3.el7.x86_64, qemu-kvm-rhev-2.10.0-7.el7.x86_64, and kernel-3.10.0-798.el7.x86_64.

Result:
The guest reports "No bootable device".

Verified the bug with seabios-1.11.0-1.el7.x86_64, qemu-kvm-rhev-2.10.0-7.el7.x86_64, and kernel-3.10.0-798.el7.x86_64.

The guest works well.


steps:

For q35:
1. On the host, add "iommu=pt intel_iommu=on" to the kernel command line
2. Add "intel_iommu=on" to the kernel command line of the q35 guest
3. Add iommu_platform=on,ats=on to "-device virtio-blk-pci...."
4. Key qemu command:

-machine pc-q35-rhel7.5.0
-drive file=/home/seabio-new-system-disk.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.4,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,iommu_platform=on,ats=on,bootindex=1

For pc:

5. Key qemu command:

/usr/libexec/qemu-kvm -name guest=q35-seabios,debug-threads=on -machine pc -cpu SandyBridge,vmx=on -m 8192 -realtime mlock=off -smp 4,sockets=2,cores=2,threads=1 -drive file=/home/seabio-new-system-disk.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,iommu_platform=on,ats=on,disable-legacy=on,disable-modern=off,bootindex=1 -netdev tap,fd=20,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:ee:67:31,iommu_platform=on,ats=on,disable-legacy=on,disable-modern=off -vnc :2 -monitor stdio
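The verification hinges on every virtio device in the command line carrying iommu_platform=on. A small sketch that scans a qemu command line for violations (the parsing is deliberately simplified: it assumes each -device value immediately follows its flag):

```python
import shlex

def virtio_devices_missing_iommu(qemu_cmd: str) -> list:
    """Return device model names of virtio devices lacking iommu_platform=on."""
    args = shlex.split(qemu_cmd)
    missing = []
    for flag, value in zip(args, args[1:]):
        if flag == '-device' and value.startswith('virtio-'):
            # Properties after the model name, e.g. "drive=d0,iommu_platform=on"
            props = dict(p.partition('=')[::2] for p in value.split(',')[1:])
            if props.get('iommu_platform') != 'on':
                missing.append(value.split(',')[0])
    return missing

cmd = ("qemu-kvm -machine pc -device virtio-blk-pci,drive=d0,iommu_platform=on,ats=on "
       "-device virtio-net-pci,netdev=n0,iommu_platform=on,ats=on")
print(virtio_devices_missing_iommu(cmd))  # → []
```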

Comment 47 FuXiangChun 2017-11-28 06:36:59 UTC
According to comment 46, setting this bug to verified.

Comment 50 errata-xmlrpc 2018-04-10 14:26:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0814

Comment 51 jason wang 2021-04-07 03:00:33 UTC
Clearing the unnecessary needinfo flag as the bug has been fixed.

Thanks