Bug 652210 - Changing igb max_vfs produces the wrong reaction
Summary: Changing igb max_vfs produces the wrong reaction
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.6
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Stefan Assmann
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On: 645284
Blocks:
Reported: 2010-11-11 11:42 UTC by Hangbin Liu
Modified: 2011-03-02 15:44 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-03-02 15:44:27 UTC
Target Upstream Version:
Embargoed:


Attachments:
lspci-vvv.txt (38.34 KB, text/plain), 2010-12-10 09:40 UTC, Stefan Assmann

Description Hangbin Liu 2010-11-11 11:42:05 UTC
Description of problem:

While testing bz645284 on hp-dl2x170g6-01.rhts.eng.bos.redhat.com, changing the igb max_vfs value produces the wrong reaction.

Version-Release number of selected component (if applicable):

[root@hp-dl2x170g6-01 ~]# uname -r
2.6.18-227.el5

[root@hp-dl2x170g6-01 ~]# lspci | grep 82576
05:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
05:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)

[root@hp-dl2x170g6-01 ~]# rpm -qa | grep kvm
etherboot-zroms-kvm-5.4.4-13.el5
kvm-qemu-img-83-164.el5
kvm-tools-83-164.el5
kvm-83-164.el5
etherboot-roms-kvm-5.4.4-13.el5
kmod-kvm-83-164.el5


How reproducible:

100%

Steps to Reproduce:
1. modprobe -r igb; modprobe igb max_vfs=10
Actual results:

[root@hp-dl2x170g6-01 ~]# modprobe -r igb;modprobe igb max_vfs=10
dca service started, version 1.8
PCI: Enabling device 0000:05:00.0 (0000 -> 0002)
PCI: Enabling device 0000:05:00.1 (0000 -> 0002)


Expected results:

The host reboots in a loop (kernel 2.6.18-227.el5),
or igb VF messages are generated (other RHEL5.5 kernels).

Additional info:

Kernel 2.6.18-194.el5 shows the same reaction on this system.

Comment 1 Stefan Assmann 2010-11-15 09:47:32 UTC
The system hp-dl2x170g6-01.rhts.eng.bos.redhat.com is not SR-IOV capable. Booting the system with RHEL6 revealed:
igb 0000:05:00.1: not enough MMIO resources for SR-IOV

However, the message did not appear on RHEL5. I will investigate why it is not shown there.

Comment 3 Stefan Assmann 2010-11-25 12:12:47 UTC
I took some time to dig into this. The reason no message about MMIO resources is displayed is that the extended PCI config space is not accessible, so the SR-IOV capability registers are not accessible either.

pci_cfg_space_size() already fails to recognize the extended PCI config space.
The exact point where things go wrong is
	if (pci_read_config_dword(dev, 256, &status) != PCIBIOS_SUCCESSFUL)
		goto fail;
[...]
 fail:
	return PCI_CFG_SPACE_SIZE;

Prarit, any idea why this doesn't work?
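
The check quoted above can be mimicked from userspace. The following is a minimal sketch in C, not the kernel code: it reads the dword at offset 256 of a device's sysfs config file and applies the same kind of decision. The function names and the two size constants are made up for this example, and the all-ones test is an assumption about what an undecoded extended config space typically returns. Because sysfs exposes only as much config space as the kernel itself detected, the result reflects the kernel's view rather than an independent hardware probe.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define CFG_SPACE_SIZE      256   /* conventional PCI config space (bytes) */
#define CFG_SPACE_EXT_SIZE  4096  /* PCIe extended config space (bytes) */

/* Decide the config space size from the result of reading the dword at
 * offset 256. nread is the byte count returned by pread(); dword is the
 * value that was read. Hypothetical helper, for illustration only. */
int cfg_size_from_probe(ssize_t nread, uint32_t dword)
{
    if (nread != (ssize_t)sizeof(dword))
        return CFG_SPACE_SIZE;      /* read failed or was short */
    if (dword == 0xffffffffu)
        return CFG_SPACE_SIZE;      /* assumption: all ones means nothing decodes this offset */
    return CFG_SPACE_EXT_SIZE;
}

/* Probe a sysfs config file such as
 * /sys/bus/pci/devices/0000:05:00.0/config.
 * Returns -1 if the file cannot be opened. */
int probe_cfg_space_size(const char *path)
{
    uint32_t dword = 0;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = pread(fd, &dword, sizeof(dword), CFG_SPACE_SIZE);
    close(fd);
    return cfg_size_from_probe(n, dword);
}
```

Run as root (unprivileged reads of the config file are usually truncated), probe_cfg_space_size() would be expected to return 256 on a kernel that cannot reach the extended space for the device and 4096 otherwise.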

Comment 4 Prarit Bhargava 2010-11-29 13:01:25 UTC
(In reply to comment #3)
> I took some time to dig into this and the reason why there's no message
> displayed about MMIO resources is that the extended PCI config space is not
> accessible. Thus the SRIOV capability register are not accessible.
> 
> pci_cfg_space_size() already fails to recognize the extended PCI config space.
> The exact point where things go wrong is
> if (pci_read_config_dword(dev, 256, &status) != PCIBIOS_SUCCESSFUL)
>       goto fail;
> [...]
>  fail:  
>         return PCI_CFG_SPACE_SIZE;
> 
> Prarit, any idea why this doesn't work?

No idea :/ ... but ... ddd, wasn't there something in the errata about this?

P.

Comment 7 Stefan Assmann 2010-12-02 12:16:09 UTC
ping ddutile

Comment 8 Don Dutile (Red Hat) 2010-12-09 22:00:38 UTC
Stefan's assessment of the problem is correct:
(a) the BIOS is not reserving enough space in the PCI bridge mmap-resources to configure the VF devices
(b) Linux's PCI configuration sw isn't savvy enough to re-allocate mmap-resources to resolve this issue (and not cause regressions in other machines).
(although another bz claims some patches may fix it; these same patches have caused regressions in other machines.)

The BIOS must be 'SRIOV savvy', meaning it must scan the (architected) PCI cap structures of devices looking for SRIOV caps, and ensure the PCI bridges leading to those devices have sufficient memory resources for the PF & VF devices below them.
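
As a rough illustration of the resource check described here, the following C sketch computes the size of a bridge memory window from its base and limit and compares it with the MMIO a PF and its VFs would need. It is a simplification under stated assumptions: the helper names are hypothetical, BAR alignment is ignored, and the BAR sizes passed in must come from the actual device, since none are given in this report.

```c
#include <stdint.h>

/* Size in bytes of a bridge window given its base and limit addresses.
 * A base above the limit means the window is disabled, so the size is 0. */
uint64_t bridge_window_size(uint64_t base, uint64_t limit)
{
    return (base > limit) ? 0 : limit - base + 1;
}

/* MMIO needed below the bridge for one PF: the PF's own BARs plus one
 * copy of the VF BARs per VF. Simplified: ignores alignment rules. */
uint64_t sriov_mmio_needed(uint64_t pf_bar_bytes, uint64_t vf_bar_bytes,
                           unsigned num_vfs)
{
    return pf_bar_bytes + vf_bar_bytes * num_vfs;
}

/* Returns 1 if the requirement fits in the window, 0 otherwise. */
int sriov_fits(uint64_t window_bytes, uint64_t needed_bytes)
{
    return needed_bytes <= window_bytes;
}
```

Applied to the bridge dump in comment 15, bridge_window_size(0xfbe00000, 0xfbefffff) gives 0x100000 bytes (1 MiB) for the memory window, and the prefetchable window, whose base fff00000 lies above its limit 000fffff, comes out as 0, that is, disabled.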

btw: I hope VT-d was enabled in the BIOS, and the kernel command line had something like intel_iommu=on or amd_iommu=on (Q: is the machine an Intel or AMD machine, and what is its chipset? ... an lspci -vvv dump would be helpful confirmation.)

Comment 9 Stefan Assmann 2010-12-10 09:38:54 UTC
The question I'm currently after is, why can't RHEL5 access the extended PCI config space on this machine (RHEL6 can). See comment #3.

This should be independent of any *iommu=on or BIOS option.

Q: is the machine an Intel or AMD machine, and what is its chipset?
A: Intel machine with 55x0 chipset

Comment 10 Stefan Assmann 2010-12-10 09:40:23 UTC
Created attachment 467936
lspci-vvv.txt

lspci -vvv on RHEL5

Comment 11 Don Dutile (Red Hat) 2010-12-10 15:05:29 UTC
(In reply to comment #9)
> The question I'm currently after is, why can't RHEL5 access the extended PCI
> config space on this machine (RHEL6 can). See comment #3.
> 
> This should be independent of any *iommu=on or BIOS option.
> 
> Q: is the machine an Intel or AMD machine, and what is it's chipset?
> A: Intel machine with 55x0 chipset

I have _not_ seen a system where RHEL6 can see more of config space than RHEL5 wrt SRIOV caps, so something else is amiss.

If this was tested with a *recent* rhel5.5 installation, and SELinux is enabled,
there were SELinux access (default policy) problems which may be the culprit.
An update to the nightly RHEL5 repo may resolve it.... but....

Comparing the pci_cfg_space_size() code between rhel5 & rhel6, it appears
rhel6 filters this access a bit further depending on the type of device
it is checking. So it's possible that if yet-another-BIOS bug doesn't properly
enable pci-mmconf for the particular device and/or PCI bridge, RHEL5 will be
limited and RHEL6 will work around it.

Given this is the first failure I've seen like this (between RHEL5 & RHEL6 dev-assignment) and the first on this HP system, I (still) tend to believe
a BIOS fix/upgrade would resolve this issue.

Comment 12 Stefan Assmann 2010-12-16 14:15:23 UTC
So I'll check for a new BIOS. Anyway, I don't think it's SELinux related because that's one of the first things I disable.

Comment 14 Tony Camuso 2011-02-12 14:14:47 UTC
I am planning to be on-site in Westford on Tuesday.
I will update the BIOS for this system then.

Comment 15 Don Dutile (Red Hat) 2011-02-21 15:42:54 UTC
Not sure why the sriov messages differ on boot-up, but here's the culprit,
besides not enough MMIO space:

This is the PCI bridge above the 82576s (note the sec/sub-bus nums of 05):

00:05.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 5 (rev 13) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
	I/O behind bridge: 0000e000-0000efff
	Memory behind bridge: fbe00000-fbefffff
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Subsystem: Hewlett-Packard Company Device 330b
	Capabilities: [60] MSI: Enable+ Count=1/2 Maskable+ 64bit-
		Address: fee10000  Data: 40c9
		Masking: 00000002  Pending: 00000000
	Capabilities: [90] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag+ RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 256 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 5GT/s, Width x4, ASPM L0s L1, Latency L0 <512ns, L1 <64us
			ClockPM- Surprise+ LLActRep+ BwNot+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #0, PowerLimit 0.000W; Interlock- NoCompl-
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Off, PwrInd Off, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet+ LinkState+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible+
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range BCD, TimeoutDis+ ARIFwd+
		DevCtl2: Completion Timeout: 260ms to 900ms, TimeoutDis- ARIFwd-
                                                                         ^^^^^

Without ARIFwd enabled on this bridge (by the BIOS), the 82576 VFs cannot
be accessed (they use BDFs other than the PFs' 05:00.x, which requires ARI).
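
To make the BDF point concrete, here is a small C sketch of the SR-IOV routing ID arithmetic: a VF's routing ID is the PF's routing ID plus the First VF Offset plus a multiple of the VF Stride, both of which are read from the PF's SR-IOV capability. The function names are hypothetical, VFs are counted from 0 here for simplicity, and the offset and stride used in the note below are illustrative assumptions, not values read from this machine.

```c
#include <stdint.h>

/* Routing ID layout: bus in bits 15:8, device in bits 7:3, function in bits 2:0. */
uint16_t make_rid(uint8_t bus, uint8_t dev, uint8_t fn)
{
    return (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x7));
}

/* Routing ID of VF n (0-based in this sketch):
 * PF routing ID + First VF Offset + n * VF Stride. */
uint16_t vf_rid(uint16_t pf_rid, uint16_t first_vf_offset,
                uint16_t vf_stride, unsigned n)
{
    return (uint16_t)(pf_rid + first_vf_offset + n * vf_stride);
}

/* Without ARI forwarding, a PCIe downstream port only addresses device 0
 * on its secondary bus, i.e. devfn values 0..7. Anything above that
 * needs ARI. Returns 1 if ARI is needed, 0 otherwise. */
int rid_needs_ari(uint16_t rid)
{
    return (rid & 0xff) > 7;
}
```

Assuming, purely for illustration, a First VF Offset of 0x80 and a VF Stride of 2, the first VF of the PF at 05:00.0 (routing ID 0x0500) would get routing ID 0x0580, which lies outside device 0 and so cannot be reached through a bridge that has ARIFwd disabled.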

We've seen this recently on a couple systems.  Sometimes it's alleviated by
trying a different slot in the machine.  If not, it once again points to a BIOS that is not SRIOV savvy.

Comment 16 Tony Camuso 2011-02-22 15:56:28 UTC
Didn't have time to update the BIOS last Tuesday. 

Maybe later this week ...

Comment 17 Stefan Assmann 2011-03-02 15:44:27 UTC
With an updated BIOS I still couldn't get SR-IOV to work. However I'm getting
igb 0000:05:00.0: not enough MMIO resources for SR-IOV
on RHEL5 now with 2.6.18-246.el5.

That's probably all we can do, having a message saying "sorry no SR-IOV for you".
Case closed, imho. :)

