Bug 1505873 - Enabling SR-IOV does not work on Dell PowerEdge M830
Summary: Enabling SR-IOV does not work on Dell PowerEdge M830
Keywords:
Status: CLOSED DUPLICATE of bug 1506887
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-host
Version: 4.1.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ovirt-4.1.8
Target Release: ---
Assignee: Alona Kaplan
QA Contact: Pavel Stehlik
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-10-24 12:58 UTC by Frank DeLorey
Modified: 2021-03-11 17:46 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-10-27 05:29:01 UTC
oVirt Team: Network
Target Upstream Version:
Embargoed:


Attachments
messages file from host (4.34 MB, text/plain)
2017-10-24 12:58 UTC, Frank DeLorey
dmesg file from failing host (401.33 KB, text/plain)
2017-10-24 12:59 UTC, Frank DeLorey
lspci from failing host (290.93 KB, text/plain)
2017-10-24 12:59 UTC, Frank DeLorey
BIOS settings from host (999.76 KB, application/zip)
2017-10-24 13:00 UTC, Frank DeLorey
New dmesg after solving reinstall issue (448.67 KB, text/plain)
2017-10-24 21:07 UTC, Frank DeLorey
hostdevListByCaps (835.59 KB, text/plain)
2017-10-25 18:04 UTC, Frank DeLorey


Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 83350 | 0 | master | MERGED | engine: SRIOV physical function can have more than one child device | 2017-11-01 13:09:18 UTC

Description Frank DeLorey 2017-10-24 12:58:46 UTC
Created attachment 1342739 [details]
messages file from host

Description of problem:
Following the RHV 4.1 documentation for enabling SR-IOV on a host does not work on this system. 


Version-Release number of selected component (if applicable):

RHV 4.1.4
RHV-H: rhvh-4.1-0.20170808


How reproducible:
100%


Steps to Reproduce:
1. Verify that SR-IOV is enabled in the BIOS.
2. Edit the host and check the Hostdev Passthrough & SR-IOV option. This adds the kernel parameter intel_iommu=on.
3. After the host reboots, none of the network interfaces are marked with the SR-IOV icon.

Actual results:
None of the host's network ports are marked with the SR-IOV icon.

Expected results:
The network ports should show SR-IOV as available
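
As a cross-check that does not depend on the UI, the host's own view of SR-IOV capability can be read from sysfs. The sketch below is an illustration only, not part of the RHV tooling. It assumes the standard Linux attribute /sys/class/net/<iface>/device/sriov_totalvfs, which the kernel exposes only for ports it treats as SR-IOV physical functions. The function name sriov_capable_nics is hypothetical, and the Administration Portal icon may be driven by other data (for example the host device list reported by VDSM), so the two views can disagree.

```python
from pathlib import Path


def sriov_capable_nics(sysfs_net="/sys/class/net"):
    """Map interface name -> total VFs advertised, for SR-IOV-capable ports only."""
    base = Path(sysfs_net)
    if not base.is_dir():
        return {}
    result = {}
    for iface in sorted(base.iterdir()):
        totalvfs = iface / "device" / "sriov_totalvfs"
        try:
            result[iface.name] = int(totalvfs.read_text().strip())
        except (OSError, ValueError):
            # Attribute missing or unreadable: the kernel does not expose
            # this port as an SR-IOV physical function.
            continue
    return result
```

Calling sriov_capable_nics() on the host and getting an empty result would suggest the problem lies below RHV (BIOS, firmware, or driver). A non-empty result would point toward how the capability is reported up to the engine.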

Additional info:

I am attaching the lspci output, dmesg and messages files

Comment 1 Frank DeLorey 2017-10-24 12:59:24 UTC
Created attachment 1342741 [details]
dmesg file from failing host

Comment 2 Frank DeLorey 2017-10-24 12:59:57 UTC
Created attachment 1342742 [details]
lspci from failing host

Comment 3 Frank DeLorey 2017-10-24 13:00:36 UTC
Created attachment 1342743 [details]
BIOS settings from host

Comment 4 Frank DeLorey 2017-10-24 15:35:35 UTC
Update from the customer:

I checked the passthrough & SR-IOV option, but I noticed that /etc/grub.cfg was not updated as I had expected.

The 'rhvh-4.1-0.20170808.0' entry, which I think is the default, does not include the intel_iommu=on option, while the "tboot 1.9.5" entry does.
Again, the only thing I did was check the box in the UI and reboot. Should I be using the tboot option instead?

### BEGIN /etc/grub.d/10_linux ###
menuentry 'rhvh-4.1-0.20170808.0' --class red --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-3.10.0-693.el7.x86_64-advanced-/dev/mapper/rhvh_nsc--cld--ulpst--0101-root' {
        load_video
        set gfxpayload=keep
        insmod gzio
        insmod part_msdos
        insmod ext2
        set root='hd0,msdos1'
        if [ x$feature_platform_search_hint = xy ]; then
          search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1 --hint='hd0,msdos1'  1998df2d-9c75-4190-97c2-3fae97584e4f
        else
          search --no-floppy --fs-uuid --set=root 1998df2d-9c75-4190-97c2-3fae97584e4f
        fi
        linux16 /rhvh-4.1-0.20170808.0+1/vmlinuz-3.10.0-693.el7.x86_64 root=/dev/rhvh_nsc-cld-ulpst-0101/rhvh-4.1-0.20170808.0+1 ro crashkernel=auto rd.lvm.lv=rhvh_nsc-cld-ulpst-0101/swap rd.lvm.lv=rhvh_nsc-cld-ulpst-0101/rhvh-4.1-0.20170808.0+1 rhgb quiet LANG=en_US.UTF-8 img.bootid=rhvh-4.1-0.20170808.0+1
        initrd16 /rhvh-4.1-0.20170808.0+1/initramfs-3.10.0-693.el7.x86_64.img
}

### END /etc/grub.d/10_linux ###

### BEGIN /etc/grub.d/20_linux_tboot ###
submenu "tboot 1.9.5" {
menuentry 'Red Hat Enterprise Linux GNU/Linux, with tboot 1.9.5 and Linux 3.10.0-693.el7.x86_64' --class red --class gnu-linux --class gnu --class os --class tboot {
        insmod part_msdos
        insmod ext2
        set root='hd0,msdos1'
        if [ x$feature_platform_search_hint = xy ]; then
          search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos1 --hint-efi=hd0,msdos1 --hint-baremetal=ahci0,msdos1 --hint='hd0,msdos1'  1998df2d-9c75-4190-97c2-3fae97584e4f
        else
          search --no-floppy --fs-uuid --set=root 1998df2d-9c75-4190-97c2-3fae97584e4f
        fi
        echo    'Loading tboot 1.9.5 ...'
        multiboot       /tboot.gz logging=serial,memory,vga
        echo    'Loading Linux 3.10.0-693.el7.x86_64 ...'
        module /vmlinuz-3.10.0-693.el7.x86_64 root=/dev/mapper/rhvh_nsc--cld--ulpst--0101-root ro crashkernel=auto rd.lvm.lv=rhvh_nsc-cld-ulpst-0101/root rd.lvm.lv=rhvh_nsc-cld-ulpst-0101/swap rhgb quiet intel_iommu=on
        echo    'Loading initial ramdisk ...'
        module /initramfs-3.10.0-693.el7.x86_64.img
}
}
### END /etc/grub.d/20_linux_tboot ###
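
The comparison the customer made by eye (which menu entry carries intel_iommu=on) can be sketched as a small parser. This is a rough illustration under stated assumptions: it handles only single-quoted menuentry titles like the ones in the listing above, and it treats lines starting with linux, linux16, linuxefi, or module as kernel lines. The function name entries_with_param is hypothetical.

```python
import re

# Directives whose arguments may hold the kernel command line (assumption
# based on the listing above: linux16 for the RHV-H entry, module for tboot).
KERNEL_DIRECTIVES = ("linux", "linux16", "linuxefi", "module")


def entries_with_param(grub_cfg_text, param="intel_iommu=on"):
    """Map each menuentry title to True/False: does a kernel line carry `param`?"""
    found = {}
    current = None
    for line in grub_cfg_text.splitlines():
        stripped = line.strip()
        title = re.match(r"menuentry\s+'([^']*)'", stripped)
        if title:
            current = title.group(1)
            found[current] = False
            continue
        words = stripped.split()
        if current and words and words[0] in KERNEL_DIRECTIVES:
            # Compare whole tokens so a longer parameter cannot match by prefix.
            if param in words[1:]:
                found[current] = True
    return found
```

Run against the grub.cfg text quoted in this comment, a parser like this would be expected to report False for the 'rhvh-4.1-0.20170808.0' entry and True for the tboot entry, matching what the customer observed.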

Comment 5 Frank DeLorey 2017-10-24 20:46:31 UTC
We found that the Installation Guide lists one of the parameters incorrectly: it shows intel_iommu=pt where it should show iommu=pt, and with the incorrect value the host does not boot. We corrected this and reinstalled the host from the Host sub-tab. After rebooting, the node boots and dmesg looks good, but still none of the network ports on this host show SR-IOV.

[root@nsc-cld-ulpst-0101 ~]# cat /var/log/dmesg | grep -i iommu | more
[    0.000000] Command line: BOOT_IMAGE=/rhvh-4.1-0.20170808.0+1/vmlinuz-3.10.0-693.el7.x86_64 root=/dev/rhvh_nsc-cld-ulpst-0101/rhvh-4.1-0.20170808.0+1 ro crashkernel=auto rd.lvm.lv=rhvh_nsc-cld-ulpst-0101/swap rd.lvm.lv=rhvh_nsc-cld-ulpst-0101/rhvh-4.1-0.20170808.0+1 rhgb quiet LANG=en_US.UTF-8 img.bootid=rhvh-4.1-0.20170808.0+1 intel_iommu=on iommu=pt
[    0.000000] Kernel command line: BOOT_IMAGE=/rhvh-4.1-0.20170808.0+1/vmlinuz-3.10.0-693.el7.x86_64 root=/dev/rhvh_nsc-cld-ulpst-0101/rhvh-4.1-0.20170808.0+1 ro crashkernel=auto rd.lvm.lv=rhvh_nsc-cld-ulpst-0101/swap rd.lvm.lv=rhvh_nsc-cld-ulpst-0101/rhvh-4.1-0.20170808.0+1 rhgb quiet LANG=en_US.UTF-8 img.bootid=rhvh-4.1-0.20170808.0+1 intel_iommu=on iommu=pt
[    0.000000] DMAR: IOMMU enabled
[    0.861783] DMAR-IR: IOAPIC id 12 under DRHD base  0xfbffc000 IOMMU 2
[    0.861784] DMAR-IR: IOAPIC id 11 under DRHD base  0xe3ffc000 IOMMU 1
[    0.861785] DMAR-IR: IOAPIC id 10 under DRHD base  0xc7ffc000 IOMMU 0
[    0.861786] DMAR-IR: IOAPIC id 8 under DRHD base  0xabffc000 IOMMU 3
[    0.861787] DMAR-IR: IOAPIC id 9 under DRHD base  0xabffc000 IOMMU 3
[    5.624611] iommu: Adding device 0000:00:00.0 to group 0
[    5.624661] iommu: Adding device 0000:00:01.0 to group 1
[    5.624709] iommu: Adding device 0000:00:02.0 to group 2
[    5.624756] iommu: Adding device 0000:00:03.0 to group 3
[    5.624802] iommu: Adding device 0000:00:03.2 to group 4
[    5.624998] iommu: Adding device 0000:00:05.0 to group 5
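
The parameter mix-up described in this comment (intel_iommu=pt instead of iommu=pt) is easy to miss when reading a long command line. A minimal sketch of an automated check follows. The function names iommu_settings and passthrough_ready are hypothetical, and the check only tests for the intel_iommu=on iommu=pt pair that this host booted with. It does not validate every value the kernel accepts.

```python
def iommu_settings(cmdline):
    """Return the intel_iommu= and iommu= values found on a kernel command line."""
    settings = {"intel_iommu": None, "iommu": None}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in settings:
            settings[key] = value  # last occurrence wins in this sketch
    return settings


def passthrough_ready(cmdline):
    """True when the line carries both intel_iommu=on and iommu=pt."""
    s = iommu_settings(cmdline)
    return s["intel_iommu"] == "on" and s["iommu"] == "pt"
```

On a running host the input would typically be the contents of /proc/cmdline. A result such as {'intel_iommu': 'pt', 'iommu': None} would flag the documentation error described above.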

This is becoming critical for this customer as it is holding up their deployments.

Comment 6 Frank DeLorey 2017-10-24 21:07:23 UTC
Created attachment 1342951 [details]
New dmesg after solving reinstall issue

Comment 7 Dan Kenigsberg 2017-10-25 07:05:41 UTC
I suspect that this is a dup of bug 1474638. Could the customer upgrade to RHV-H-4.1.6 in order to verify that?

Also, lowering urgency: this is about a single customer not being able to use an advanced feature. Urgent severity should be kept for bugs with wide impact or for a crippled production deployment.

Comment 9 Frank DeLorey 2017-10-25 10:06:34 UTC
Dan, I do not believe that we could be hitting bug 1474638 yet: the PF is not showing SR-IOV, so we are not yet at the point of creating the VFs. I will recommend the upgrade anyway, as we will most likely hit that bug once the initial problem is fixed. The sos report from the host is here:

https://api.access.redhat.com/rs/cases/01953577/attachments/b2851de7-0593-4bd4-8766-4be59c8ba1ef

Comment 10 Frank DeLorey 2017-10-25 18:04:00 UTC
Created attachment 1343362 [details]
hostdevListByCaps

Comment 15 Germano Veit Michel 2017-10-27 05:29:01 UTC

*** This bug has been marked as a duplicate of bug 1506887 ***

