> Description of problem:

Xen PCI-passthrough does not work with an Emulex Saturn-X LightPulse
Fibre Channel Host Adapter. From the various reproducers we have
determined that the issue does not occur for PCI pass-through on
non-Emulex cards (non-lpfc), and also does not occur on non-Saturn-X
based lpfc cards.

> Version-Release number of selected component (if applicable):

Red Hat Enterprise Linux 5 Xen Virtualization
Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter
(Emulex: LPe12002, HP: AJ763A/82E)

> How reproducible:

Always

> Steps to Reproduce:

modprobe.conf entry:

---
options pciback hide="(0000:1a:00.0)(0000:1a:00.1)"
install lpfc /sbin/modprobe pciback ; /sbin/modprobe --first-time --ignore-install lpfc
---

After boot, loaded lpfc:

modprobe lpfc

The PCI device has been bound successfully to the pciback driver:

---
[root@ibm-x3550m3-01 ~]# ll /sys/bus/pci/drivers/lpfc/
total 0
--w------- 1 root root 4096 Feb 7 10:59 bind
lrwxrwxrwx 1 root root    0 Feb 7 10:59 module -> ../../../../module/lpfc
--w------- 1 root root 4096 Feb 7 10:59 new_id
--w------- 1 root root 4096 Feb 7 10:59 remove_id
--w------- 1 root root 4096 Feb 7 10:59 unbind

[root@ibm-x3550m3-01 ~]# ll /sys/bus/pci/drivers/pciback
total 0
lrwxrwxrwx 1 root root    0 Feb 7 11:00 0000:1a:00.0 -> ../../../../devices/pci0000:00/0000:00:07.0/0000:1a:00.0
lrwxrwxrwx 1 root root    0 Feb 7 11:00 0000:1a:00.1 -> ../../../../devices/pci0000:00/0000:00:07.0/0000:1a:00.1
--w------- 1 root root 4096 Feb 7 11:00 bind
lrwxrwxrwx 1 root root    0 Feb 7 11:00 module -> ../../../../module/pciback
--w------- 1 root root 4096 Feb 7 11:00 new_id
--w------- 1 root root 4096 Feb 7 11:00 new_slot
-rw------- 1 root root 4096 Feb 7 11:00 permissive
-rw------- 1 root root 4096 Feb 7 11:00 quirks
--w------- 1 root root 4096 Feb 7 11:00 remove_id
--w------- 1 root root 4096 Feb 7 11:00 remove_slot
-r-------- 1 root root 4096 Feb 7 11:00 slots
--w------- 1 root root 4096 Feb 7 11:00 unbind
---
Restarted the xend service:

---
[root@ibm-x3550m3-01 ~]# service xend restart
restart xend:                                              [  OK  ]
---

Assignable devices are shown as follows before vm1 is started:

---
[root@ibm-x3550m3-01 ~]# xm pci-list-assignable-devices
0000:1a:00.1
0000:1a:00.0
---

Then started the RHEL 5 virtual machine:

---
[root@ibm-x3550m3-01 ~]# virsh start vm1
Domain vm1 started
---

On the dom0, we see the following devices assigned to the domU vm1:

---
[root@ibm-x3550m3-01 ~]# xm pci-list vm1
domain bus slot func
0      1a  0    0
0      1a  0    1
---

...they have also now been hidden from the list of assignable devices:

---
[root@ibm-x3550m3-01 ~]# xm pci-list-assignable-devices
[root@ibm-x3550m3-01 ~]#
---

Now on the console for vm1, lspci shows the following devices assigned:

---
[root@localhost ~]# lspci | grep Saturn
00:00.0 Fibre Channel: Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter (rev 03)
00:00.1 Fibre Channel: Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter (rev 03)
---

However, /var/log/messages still shows the same errors:

---
Feb 7 12:16:23 localhost kernel: Emulex LightPulse Fibre Channel SCSI driver 8.2.0.96.2p
Feb 7 12:16:23 localhost kernel: Copyright(c) 2004-2011 Emulex.  All rights reserved.
Feb 7 12:16:23 localhost kernel: PCI: Enabling device 0000:00:00.0 (0000 -> 0002)
Feb 7 12:16:23 localhost kernel: lpfc 0000:00:00.0: ioremap failed for SLIM memory.
Feb 7 12:16:23 localhost kernel: lpfc 0000:00:00.0: 0:1402 Failed to set up pci memory space.
Feb 7 12:16:23 localhost kernel: PCI: Enabling device 0000:00:00.1 (0000 -> 0002)
Feb 7 12:16:23 localhost kernel: lpfc 0000:00:00.1: ioremap failed for SLIM memory.
Feb 7 12:16:23 localhost kernel: lpfc 0000:00:00.1: 0:1402 Failed to set up pci memory space.
---

> Actual results:

The domU cannot access the device:

# dmesg | grep lpfc
lpfc 0000:00:00.0: ioremap failed for SLIM memory.
lpfc 0000:00:00.0: 0:1402 Failed to set up pci memory space.
lpfc 0000:00:00.1: ioremap failed for SLIM memory.
lpfc 0000:00:00.1: 0:1402 Failed to set up pci memory space.

> Expected results:

The domU can access the device.
The error message "ioremap failed for SLIM memory" is printed by
lpfc_sli_pci_mem_setup() [drivers/scsi/lpfc/lpfc_init.c,
2.6.18-274.17.1.el5]:

5293         /* Get the bus address of Bar0 and Bar2 and the number of bytes
5294          * required by each mapping.
5295          */
5296         phba->pci_bar0_map = pci_resource_start(pdev, 0);
5297         bar0map_len = pci_resource_len(pdev, 0);
5298
5299         phba->pci_bar2_map = pci_resource_start(pdev, 2);
5300         bar2map_len = pci_resource_len(pdev, 2);
5301
5302         /* Map HBA SLIM to a kernel virtual address. */
5303         phba->slim_memmap_p = ioremap(phba->pci_bar0_map, bar0map_len);
5304         if (!phba->slim_memmap_p) {
5305                 dev_printk(KERN_ERR, &pdev->dev,
5306                            "ioremap failed for SLIM memory.\n");
5307                 goto out;
5308         }

According to the Sep 23, 2011 domU config file attached to the CP case,
the guest is PV. (This seems consistent with the tendency that HVM
guests get passed-through devices as 06:00.0 and 07:00.0, IIRC, but the
BDFs reported in comment 0 are different.)

First, PCI passthrough to a PV guest is insecure (the guest could set up
DMA wherever it wants). Second, the guest kernel is thus a xenified
kernel, which diverts ioremap() as follows:

ioremap()                          [include/asm-x86_64/mach-xen/asm/io.h]
-> __ioremap()                     [arch/i386/mm/ioremap-xen.c]
   -> get_vm_area()                [mm/vmalloc.c]
   -> __direct_remap_pfn_range()   [arch/i386/mm/ioremap-xen.c]
      -> HYPERVISOR_mmu_update()

After some checks, __ioremap() grabs a virtual address range, then
__direct_remap_pfn_range() kicks the hypervisor in a batched loop to
point the "init_mm" PTEs, covering the vaddr range, to the requested
machine frames. Either the early checks fire, or the hypervisor refuses
one of the PTE updates.
"lspci -v -v -v" on the reproducer machine mentioned in comment 2
reports the following regions:

1a:00.0 Fibre Channel: Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter (rev 03)
        Region 0: Memory at 97a08000 (64-bit, non-prefetchable) [size=4K]
        Region 2: Memory at 97a00000 (64-bit, non-prefetchable) [size=16K]
        Region 4: I/O ports at 2100 [size=256]

1a:00.1 Fibre Channel: Emulex Corporation Saturn-X: LightPulse Fibre Channel Host Adapter (rev 03)
        Region 0: Memory at 97a09000 (64-bit, non-prefetchable) [size=4K]
        Region 2: Memory at 97a04000 (64-bit, non-prefetchable) [size=16K]
        Region 4: I/O ports at 2000 [size=256]

According to the error message and the lpfc_sli_pci_mem_setup() source,
region 0 is "SLIM memory", while region 2 is "HBA control registers".
So the xenified ioremap() fails to map 97a08000..+4K and 97a09000..+4K.

I would recommend:

- checking 97a08000 against the Xen E820 map (xm dmesg) -- does the card
  specify a region that Xen considers reserved RAM (or RAM at all)?

- testing the PV guest with RHEL-5.8 GA in both host and guest

- passthrough to an HVM guest (with iommu enabled) instead of the PV
  guest

- if none of those work, I'll add debug logging to __ioremap() (see
  above) and figure out where exactly it fails.
Created attachment 565629 [details]
Xen __ioremap() debugging

(In reply to comment #3)
> I would recommend:
>
> - checking 97a08000 against the Xen E820 map (xm dmesg) -- does the card
>   specify a region that Xen considers reserved RAM (or RAM at all)?
>
> - testing the PV guest with RHEL-5.8 GA in both host and guest
>
> - passthrough to an HVM guest (with iommu enabled) instead of the PV
>   guest
>
> - if none of those work, I'll add debug logging to __ioremap() (see
>   above) and figure out where exactly it fails.

For the fourth option.
Reproduced the bug in-house with the x86_64 -308 xen kernel (both domU
and dom0). When the guest (domid == 1) logs the following during boot:

PCI: Enabling device 0000:00:00.0 (0000 -> 0002)
PCI: Setting latency timer of device 0000:00:00.0 to 64
lpfc 0000:00:00.0: ioremap failed for SLIM memory.
lpfc 0000:00:00.0: 0:1402 Failed to set up pci memory space.

the hypervisor prints:

(XEN) mm.c:630:d1 Non-privileged (1) attempt to map I/O space 00097a08

Referring back to comment 3:

> Region 0: Memory at 97a08000 (64-bit, non-prefetchable) [size=4K]
> [...]
> According to the error message and the lpfc_sli_pci_mem_setup() source,
> region 0 is "SLIM memory",

Machine frame 97a08 corresponds to this machine address.

Adding the vendor-id:device-id pair ('10df:f100') to
/etc/xen/xend-pci-permissive.sxp only takes care of PCI config space
accesses:

pciback 0000:1a:00.0: enabling permissive mode configuration space accesses!
pciback 0000:1a:00.0: permissive mode is potentially unsafe!
pciback 0000:1a:00.1: enabling permissive mode configuration space accesses!
pciback 0000:1a:00.1: permissive mode is potentially unsafe!

but the hypervisor keeps complaining about the non-privileged mapping
attempt.
This bug is a duplicate of bug 735890. See in particular:

- bug 735890 comment 15,
- bug 735890 comment 16,
- bug 735890 comment 17.

Applying that analysis to the current values:

[root@in-house-reproducer ~]# cat -n \
    /sys/bus/pci/devices/0000:1a:00.0/resource
     1  0x0000000097a08000 0x0000000097a08fff 0x0000000000020204
     2  0x0000000000000000 0x0000000000000000 0x0000000000000000
     3  0x0000000097a00000 0x0000000097a03fff 0x0000000000020204
     4  0x0000000000000000 0x0000000000000000 0x0000000000000000
     5  0x0000000000002100 0x00000000000021ff 0x0000000000020101
     6  0x0000000000000000 0x0000000000000000 0x0000000000000000
     7  0x0000000098200000 0x000000009823ffff 0x0000000000027200

Of those, lines 1, 3 and 7 are iomem ranges, while line 5 describes an
ioport range. The other lines are ignored.

The following is a xend log excerpt, written at domU startup:

Unconstrained device: 0000:1a:00.0
pci: enabling ioport 0x2100/0x100
pci: enabling iomem 0x97a08000/0x1000 pfn 0x97a08/0x1
pci: enabling iomem 0x97a00000/0x4000 pfn 0x97a00/0x4
pci: enabling iomem 0x98200000/0x40000 pfn 0x98200/0x40
pci-msix: remove permission for 0x97a02000/0x20000 0x97a02/0x20
pci-msix: remove permission for 0x97a03000/0x1000 0x97a03/0x1

Note that "iomem 0x97a08000/0x1000" from the above corresponds to

Region 0: Memory at 97a08000 (64-bit, non-prefetchable) [size=4K]

from comment 3, and to

(XEN) mm.c:630:d1 Non-privileged (1) attempt to map I/O space 00097a08

from comment 7.

When xend removes the permission for the MSI-X range 0x97a02000/0x20000
(see duplicate bug 735890), it kills any previously granted permission
for included/overlapping iomem regions:

0x97a02000 -- start of MSI-X iomem (inclusive)
0x97a08000 -- start of lpfc SLIM memory (inclusive)
0x97a09000 -- end of lpfc SLIM memory (exclusive), size 0x1000
0x97a22000 -- end of MSI-X iomem (exclusive), size 0x20000

which causes lpfc_sli_pci_mem_setup() to fail in the guest.

*** This bug has been marked as a duplicate of bug 735890 ***