1055331 – virDevicePCIAddressParseXML check failed for PCI device 0000:00:00.0

Bug 1055331 - virDevicePCIAddressParseXML check failed for PCI device 0000:00:00.0

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	libvirt
Sub Component:
Version:	7.0
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	7.0
Assignee:	Pavel Hrdina
QA Contact:	Virtualization Bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2014-01-20 02:11 UTC by Hao Liu
Modified:	2016-11-21 18:28 UTC
CC List:	10 users
Fixed In Version:	libvirt-1.3.1-1.el7
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2016-11-03 18:07:42 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2016:2577	0	normal	SHIPPED_LIVE	Moderate: libvirt security, bug fix, and enhancement update	2016-11-03 12:07:06 UTC

Description Hao Liu 2014-01-20 02:11:44 UTC

Description of problem:
virDevicePCIAddressParseXML check failed for PCI device 0000:00:00.0

Version-Release number of selected component (if applicable):
libvirt-1.1.1-18.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
# virsh nodedev-reset pci_0000_00_00_0

Actual results:
error: Failed to detach device pci_0000_00_00_0
error: internal error: Insufficient specification for PCI address

Expected results:
error: Failed to reset device pci_0000_00_00_0
error: internal error: Unable to reset PCI device 0000:00:00.0: no FLR, PM reset or bus reset available

This also happens for nodedev-detach and nodedev-reattach.
But be careful when testing this: if the command works, you might crash the host.

Additional info:
This is caused in src/conf/device_conf.c, where virDevicePCIAddressParseXML calls virDevicePCIAddressIsValid.
In virDevicePCIAddressParseXML, all address fields default to 0.
In virDevicePCIAddressIsValid, the address is assumed valid only if at least one of domain/bus/slot is non-zero.
As a result, address 0000:00:00.0 is rejected as invalid even though it is a valid address.
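
The effect of the check described above can be illustrated with a small model. This is a simplified Python sketch, not libvirt's C code: the `PCIAddress` class, the `present` flag, and both function names are hypothetical and exist only to show why an all-zero address fails a "non-zero means specified" test, and how an explicit flag would avoid that.

```python
from dataclasses import dataclass


@dataclass
class PCIAddress:
    # All fields default to 0, mirroring the parser behaviour described above.
    domain: int = 0
    bus: int = 0
    slot: int = 0
    function: int = 0
    present: bool = False  # hypothetical flag, not a libvirt field


def is_valid_zero_means_unset(addr: PCIAddress) -> bool:
    """Model of the reported check: valid only if domain/bus/slot is non-zero."""
    return bool(addr.domain or addr.bus or addr.slot)


def is_valid_with_flag(addr: PCIAddress) -> bool:
    """Alternative sketch: record explicitly whether an address was given."""
    return (
        addr.present
        and 0 <= addr.domain <= 0xFFFF
        and 0 <= addr.bus <= 0xFF
        and 0 <= addr.slot <= 0x1F
        and 0 <= addr.function <= 7
    )


host_bridge = PCIAddress(0, 0, 0, 0, present=True)  # 0000:00:00.0
unset = PCIAddress()                                 # nothing parsed
```

With this model, `host_bridge` is rejected by the first check and accepted by the second, while `unset` is rejected by both.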

Comment 5 Pavel Hrdina 2015-06-18 10:09:05 UTC

Moving this bug to RHEL-7.3 because it's a minor issue, and trying to do anything with the host bridge (which is always at pci_0000_00_00_0) has no reasonable use.  We can either rewrite libvirt so that it does not treat an all-zero value as an unspecified address, or just modify the error message; I don't think that rewriting libvirt gives us any value.

Comment 7 Pavel Hrdina 2016-05-25 11:00:01 UTC

Upstream commit:

commit f8fe8f03455783afcd62d79db7ce4120f514c629
Author: Laine Stump <laine>
Date:   Wed Jul 22 11:59:00 2015 -0400

    conf: more useful error message when pci function is out of range

Comment 9 Jingjing Shao 2016-08-31 05:35:14 UTC

I tried to reproduce the bug with libvirt-1.1.1-18.el7.x86_64, the version given in the description:

    virsh nodedev-reset pci_0000_00_00_0
    error: Failed to reset device pci_0000_00_00_0
    error: internal error: Invalid device 0000:00:00.0 driver file /sys/bus/pci/devices/0000:00:00.0/driver is not a symlink

I see that the patch in comment 7 was committed on 2015-07-22.

I tested with the released 7.2 version, libvirt-1.2.17-13.el7.x86_64:

    virsh nodedev-reset pci_0000_00_00_0
    error: Failed to reset device pci_0000_00_00_0
    error: internal error: Unable to reset PCI device 0000:00:00.0: no FLR, PM reset or bus reset available      <----the error info is as expected

I tested with the newest version in RHEL 7.3:
libvirt-2.0.0-6.el7.x86_64

    virsh nodedev-reset pci_0000_00_00_0
    error: Failed to reset device pci_0000_00_00_0
    error: internal error: Unable to reset PCI device 0000:00:00.0: no FLR, PM reset or bus reset available      <----the error info is as expected

So I think this bug can be marked as verified.

Comment 10 Jingjing Shao 2016-10-10 07:09:09 UTC

Hi Pavel,

I found another issue with nodedev-reset, shown below.
This command should only be used on endpoint devices, is that right?
Should we prevent the reset action on devices that are not endpoints?

Could you help check this issue and update the bug? Thank you in advance.



# virsh nodedev-list --tree
....
 +- pci_0000_00_1c_0
  |   |
  |   +- pci_0000_01_00_0
  |       |
  |       +- pci_0000_02_00_0
  |       |   |
  |       |   +- pci_0000_03_00_0
  |       |       |
  |       |       +- pci_0000_04_00_0
  |       |         
  |       +- pci_0000_02_01_0


# lspci -v
01:00.0 PCI bridge: Renesas Technology Corp. SH7757 PCIe Switch [PS] (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	BIST result: 00
	Bus: primary=01, secondary=02, subordinate=05, sec-latency=0
	Memory behind bridge: c1000000-c18fffff
	Prefetchable memory behind bridge: 00000000c0000000-00000000c0ffffff
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Upstream Port, MSI 00
	Capabilities: [b0] Subsystem: Renesas Technology Corp. SH7757 PCIe Switch [PS]
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: pcieport
	Kernel modules: shpchp

02:00.0 PCI bridge: Renesas Technology Corp. SH7757 PCIe Switch [PS] (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	BIST result: 00
	Bus: primary=02, secondary=03, subordinate=04, sec-latency=0
	Memory behind bridge: c1000000-c18fffff
	Prefetchable memory behind bridge: 00000000c0000000-00000000c0ffffff
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Downstream Port (Slot-), MSI 00
	Capabilities: [b0] Subsystem: Renesas Technology Corp. SH7757 PCIe Switch [PS]
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: pcieport
	Kernel modules: shpchp

02:01.0 PCI bridge: Renesas Technology Corp. SH7757 PCIe Switch [PS] (prog-if 00 [Normal decode])
	Flags: fast devsel
	BIST result: 00
	Bus: primary=02, secondary=05, subordinate=05, sec-latency=0
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Downstream Port (Slot-), MSI 00
	Capabilities: [b0] Subsystem: Renesas Technology Corp. SH7757 PCIe Switch [PS]
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [150] Access Control Services
	Kernel driver in use: pcieport
	Kernel modules: shpchp

03:00.0 PCI bridge: Renesas Technology Corp. SH7757 PCIe-PCI Bridge [PPB] (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	BIST result: 00
	Bus: primary=03, secondary=04, subordinate=04, sec-latency=0
	Memory behind bridge: c1000000-c18fffff
	Prefetchable memory behind bridge: 00000000c0000000-00000000c0ffffff
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express PCI-Express to PCI/PCI-X Bridge, MSI 00
	Capabilities: [b0] Subsystem: Renesas Technology Corp. SH7757 PCIe-PCI Bridge [PPB]
	Capabilities: [100] Advanced Error Reporting
	Kernel modules: shpchp

04:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. G200eR2 (prog-if 00 [VGA controller])
	Flags: medium devsel, IRQ 16
	Memory at c0000000 (32-bit, prefetchable) [disabled] [size=16M]
	[virtual] Memory at c1800000 (32-bit, non-prefetchable) [size=16K]
	[virtual] Memory at c1000000 (32-bit, non-prefetchable) [size=8M]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: [dc] Power Management version 1
	Kernel driver in use: mgag200
	Kernel modules: mgag200



# virsh nodedev-reset pci_0000_01_00_0
Device pci_0000_01_00_0 reset

# lspci -v
01:00.0 PCI bridge: Renesas Technology Corp. SH7757 PCIe Switch [PS] (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	BIST result: 00
	Bus: primary=01, secondary=02, subordinate=05, sec-latency=0
	Memory behind bridge: c1000000-c18fffff
	Prefetchable memory behind bridge: 00000000c0000000-00000000c0ffffff
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Upstream Port, MSI 00
	Capabilities: [b0] Subsystem: Renesas Technology Corp. SH7757 PCIe Switch [PS]
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: pcieport
	Kernel modules: shpchp

02:00.0 PCI bridge: Renesas Technology Corp. SH7757 PCIe Switch [PS] (prog-if 00 [Normal decode])
	Flags: fast devsel
	BIST result: 00
	Bus: primary=00, secondary=00, subordinate=00, sec-latency=0
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Downstream Port (Slot-), MSI 00
	Capabilities: [b0] Subsystem: Renesas Technology Corp. SH7757 PCIe Switch [PS]
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: pcieport
	Kernel modules: shpchp

02:01.0 PCI bridge: Renesas Technology Corp. SH7757 PCIe Switch [PS] (prog-if 00 [Normal decode])
	Flags: fast devsel
	BIST result: 00
	Bus: primary=00, secondary=00, subordinate=00, sec-latency=0
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Downstream Port (Slot-), MSI 00
	Capabilities: [b0] Subsystem: Renesas Technology Corp. SH7757 PCIe Switch [PS]
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [150] Access Control Services
	Kernel driver in use: pcieport
	Kernel modules: shpchp

03:00.0 PCI bridge: Renesas Technology Corp. SH7757 PCIe-PCI Bridge [PPB] (rev ff) (prog-if ff)
	!!! Unknown header type 7f
	Kernel modules: shpchp

04:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. G200eR2 (rev ff) (prog-if ff)
	!!! Unknown header type 7f
	Kernel driver in use: mgag200
	Kernel modules: mgag200
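
The "(rev ff) (prog-if ff)" and "Unknown header type 7f" lines above are consistent with configuration-space reads returning all ones, and the "primary=00, secondary=00, subordinate=00" lines with bus-number registers that were cleared. The following Python sketch decodes those fields from a raw configuration header to make the arithmetic visible. The function name and the sample values are illustrative assumptions; the offsets are the standard PCI configuration header offsets.

```python
def decode_header(cfg: bytes) -> dict:
    """Decode a few fields from the first bytes of PCI configuration space."""
    if len(cfg) < 0x1B:
        raise ValueError("need at least 0x1B bytes of configuration space")
    vendor_id = int.from_bytes(cfg[0x00:0x02], "little")
    return {
        "responding": vendor_id != 0xFFFF,   # all ones usually means no answer
        "revision": cfg[0x08],
        "prog_if": cfg[0x09],
        "header_type": cfg[0x0E] & 0x7F,     # bit 7 is the multi-function flag
        "multifunction": bool(cfg[0x0E] & 0x80),
        # The three bus numbers are only meaningful for header type 1 (bridges).
        "primary_bus": cfg[0x18],
        "secondary_bus": cfg[0x19],
        "subordinate_bus": cfg[0x1A],
    }


# A device that no longer answers: every byte reads back as 0xff.
dead = decode_header(bytes([0xFF]) * 64)

# A bridge with made-up IDs, using the bus numbers 03:00.0 had before the reset.
raw = bytearray(64)
raw[0x00:0x02] = (0x1234).to_bytes(2, "little")  # arbitrary vendor ID
raw[0x0E] = 0x01
raw[0x18], raw[0x19], raw[0x1A] = 0x03, 0x04, 0x04
bridge = decode_header(bytes(raw))
```

Masking 0xff with 0x7f yields 0x7f, which matches the header type lspci reports for 03:00.0 and 04:00.0 after the reset.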

Comment 11 Laine Stump 2016-10-13 13:57:42 UTC

For a question like this, I always turn to Alex - so Alex, you can see from Comment 10 that when we do a reset of the PCIe switch at 01:00.0, that causes the PCI capabilities of 03:00.0 and 04:00.0 (which are both below it in the hierarchy) to get screwed up. But should libvirt really be preventing this? It seems more like a kernel bug to me - maybe a reset of a PCI controller should reset all other controllers and devices below that one in the hierarchy? I would figure that libvirt should simply "do what it's told", and that anything not allowed should be failed and reported by lower layers...

Comment 12 Alex Williamson 2016-10-13 18:09:12 UTC

What resets are available on this upstream switch port?  (lspci -vvvs 1:00.0)  I wonder if it supports any sort of FLR or if we did a secondary bus reset on the parent device.  The latter should be prevented on anything with a subordinate bus, so I sort of expect to find an FLR.  Should we be expecting that an FLR of a bridge pulls reset on the secondary interface?  There's arguably a kernel issue here, but you're giving the user a pretty powerful interface by allowing reset of arbitrary devices.  On one hand, the kernel did reset the device you asked to reset, and dealing with the repercussions of that is the user's problem.  On the other hand, we restore the device, so perhaps we should restore downstream devices as well.  I don't see any practical reason that libvirt would allow reset, or in fact any interaction whatsoever, of non-endpoint devices though.
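
As an illustration of the "endpoints only" idea, a caller could consult pci-sysfs before asking for a reset. The sketch below is an assumption-laden example and not something libvirt is known to do: the function names are hypothetical, it only recognises PCI-to-PCI bridges (class 0x0604), and it relies on the sysfs `class` and `reset` attributes under /sys/bus/pci/devices.

```python
from pathlib import Path

PCI_CLASS_BRIDGE_PCI = 0x0604  # base class 0x06, subclass 0x04


def describe_device(address: str, sysfs_root: str = "/sys/bus/pci/devices") -> dict:
    """Read the class code and check for a 'reset' attribute in pci-sysfs."""
    dev = Path(sysfs_root) / address
    # The class file holds a 24-bit value such as "0x060400".
    class_code = int((dev / "class").read_text().strip(), 16)
    return {
        "is_pci_bridge": (class_code >> 8) == PCI_CLASS_BRIDGE_PCI,
        "has_reset_attr": (dev / "reset").exists(),
    }


def reset_looks_reasonable(address: str, sysfs_root: str = "/sys/bus/pci/devices") -> bool:
    """Hypothetical policy: only reset devices that are not PCI-to-PCI bridges."""
    info = describe_device(address, sysfs_root)
    return info["has_reset_attr"] and not info["is_pci_bridge"]
```

Under this policy, the switch port at 0000:01:00.0 from comment 10 would be refused, while the VGA controller at 0000:04:00.0 would be allowed if the kernel offers a reset for it. Host bridges and other bridge subclasses would need their own rule.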

Comment 13 Jingjing Shao 2016-10-19 08:42:50 UTC

(In reply to Alex Williamson from comment #12)
> What resets are available on this upstream switch port?  (lspci -vvvs
> 1:00.0)  I wonder if it supports any sort of FLR or if we did a secondary
> bus reset on the parent device.  The latter should be prevent on anything
> with a subordinate bus, so I sort of expect to find an FLR. Should we be
> expecting that an FLR of a bridge pulls reset on the secondary interface? 
> There's arguably a kernel issue here, but you're giving the user a pretty
> powerful interface by allowing reset of arbitrary devices.  On one hand, the
> kernel did reset the device you asked to reset and dealing with the
> repercussions of that are the user's problem.  On the other hand, we restore
> the device, so perhaps we should restore downstream devices as well.  I
> don't see any practical reason that libvirt would allow reset, or in fact
> any interaction whatsoever, of non-endpoint devices though.

Hi Alex,
1. I tried this on another machine and got the result below. I cannot find any FLR info for the upstream port. Does that mean it cannot be reset?

1.+- pci_0000_00_1c_6
  |   |
  |   +- pci_0000_0c_00_0
  |       |
  |       +- pci_0000_0d_02_0
  |       |   |
  |       |   +- pci_0000_0e_00_0



2. # lspci -vvvs 0c:00.0
0c:00.0 PCI bridge: Integrated Device Technology, Inc. [IDT] PES12N3A PCI Express Switch (rev 0e) (prog-if 00 [Normal decode])
	Physical Slot: 1
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=0c, secondary=0d, subordinate=11, sec-latency=0
	I/O behind bridge: 00008000-00009fff
	Memory behind bridge: e5400000-e68fffff
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v1) Upstream Port, MSI 00
		DevCap:	MaxPayload 2048 bytes, PhantFunc 0
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ SlotPowerLimit 10.000W
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
		LnkCtl:	ASPM Disabled; Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	Capabilities: [c0] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [200 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=4
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=02 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed+ WRR32+ WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=01
			Status:	NegoPending- InProgress-
			Port Arbitration Table <?>
	Kernel driver in use: pcieport
	Kernel modules: shpchp

Comment 14 Alex Williamson 2016-10-31 15:38:02 UTC

Can you provide lspci -vvv of the parent bridge, 0000:00:1c.6 in the case of comment 13?  I note that the PCH root ports on my system report NoSoftRst- in the PM capability, indicating that a soft reset occurs on D3(hot)->D0 transition.  Laine, is libvirt exclusively using the kernel 'reset' attributes in pci-sysfs to perform device resets or will it induce its own bus resets?  In the latter case, have fun, it's not a kernel issue, in the former case it's bizarre to me that the kernel is offering a reset which /might/ reset the entire hierarchy, when we don't seem to be doing sufficient save and restore from this code path.  I'm not sure why any non-endpoints would list themselves as supporting pci_reset_function().

Additionally the fact that this particular switch loses downstream endpoints after reset (ie. they're not just unprogrammed, they're missing), suggests that perhaps a secondary bug is that these switch ports should have PCI_DEV_FLAGS_NO_BUS_RESET for the kernel to avoid doing bus resets on these devices altogether.

So there might be improvements that could be made in various places here, but the workaround is simply "Don't do that", there's no reason a customer should ever do this and the solution will likely be removing the capability to do it in the first place.  libvirt in particular has no business issuing resets on anything other than endpoint devices.
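
The reset-related hints discussed in this thread can be pulled out of saved `lspci -vvv` output with a few regular expressions. The Python sketch below is only a convenience for reading such dumps. The function and key names are made up, and it assumes the `FLReset+/-` and `NoSoftRst+/-` tokens appear as lspci normally prints them. It reports what the device advertises, not what the kernel will actually do.

```python
import re


def reset_hints(lspci_text: str) -> dict:
    """Extract reset-related hints from the lspci -vvv text of one device."""
    flr = re.search(r"\bFLReset([+-])", lspci_text)
    soft = re.search(r"\bNoSoftRst([+-])", lspci_text)
    buses = re.search(
        r"secondary=([0-9a-f]{2}), subordinate=([0-9a-f]{2})", lspci_text
    )
    return {
        # None means the token was not printed at all.
        "flr": None if flr is None else flr.group(1) == "+",
        # NoSoftRst- indicates a soft reset on the D3hot -> D0 transition.
        "soft_reset_on_d3hot_to_d0": None if soft is None else soft.group(1) == "-",
        # A "Bus:" line with secondary/subordinate numbers marks a bridge.
        "reports_bridge_buses": buses is not None,
    }


# Lines copied from the 0c:00.0 dump in comment 13.
switch_port = """\
	Bus: primary=0c, secondary=0d, subordinate=11, sec-latency=0
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
"""
hints = reset_hints(switch_port)
```

For the switch port in comment 13 this gives no FLR token, no soft reset on the D3hot to D0 transition, and a bridge with buses behind it.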

Comment 15 Jingjing Shao 2016-11-02 06:40:06 UTC

(In reply to Alex Williamson from comment #14)
> Can you provide lspci -vvv of the parent bridge, 0000:00:1c.6 in the case of
> comment 13?  I note that the PCH root ports on my system report NoSoftRst-
> in the PM capability, indicating that a soft reset occurs on D3(hot)->D0
> transition. 

The info for "0000:00:1c.6" is below; it also shows “NoSoftRst” in the PM capability.
 
# lspci -vvvs  00:1c.6
00:1c.6 PCI bridge: Intel Corporation C600/X79 series chipset PCI Express Root Port 3 (rev b5) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin C routed to IRQ 18
	Bus: primary=00, secondary=0c, subordinate=11, sec-latency=0
	I/O behind bridge: 00008000-00009fff
	Memory behind bridge: e5400000-e68fffff
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal+ Unsupported+
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #3, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
			ClockPM- Surprise- LLActRep+ BwNot- ASPMOptComp-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #1, PowerLimit 10.000W; Interlock- NoCompl+
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet- LinkState+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal+ PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
		DevCtl2: Completion Timeout: 1s to 3.5s, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: 00000000  Data: 0000
	Capabilities: [90] Subsystem: Hewlett-Packard Company Device 158a
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
	Kernel driver in use: pcieport
	Kernel modules: shpchp



>Laine, is libvirt exclusively using the kernel 'reset'
> attributes in pci-sysfs to perform device resets or will it induce its own
> bus resets?  In the latter case, have fun, it's not a kernel issue, in the
> former case it's bizarre to me that the kernel is offering a reset which
> /might/ reset the entire hierarchy, when we don't seem to be doing
> sufficient save and restore from this code path.  I'm not sure why any
> non-endpoints would list themselves as supporting pci_reset_function().

Hi Laine,
Could you please check this part? Thank you in advance.


> 
> Additionally the fact that this particular switch loses downstream endpoints
> after reset (ie. they're not just unprogrammed, they're missing), suggests
> that perhaps a secondary bug is that these switch ports should have
> PCI_DEV_FLAGS_NO_BUS_RESET for the kernel to avoid doing bus resets on these
> devices altogether.
> So there might be improvements that could be made in various places here,
> but the workaround is simply "Don't do that", there's no reason a customer
> should ever do this and the solution will likely be removing the capability
> to do it in the first place.  libvirt in particular has no business issuing
> resets on anything other than endpoint devices.


Hi Alex, do I need to file a new bug to track this part? Or is it OK for libvirt to issue resets only on endpoints?

Comment 16 Jingjing Shao 2016-11-02 06:42:09 UTC

Hi Laine,
Could you please check the second part of comment 15? Thank you in advance.

Comment 18 errata-xmlrpc 2016-11-03 18:07:42 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2577.html

Comment 19 Laine Stump 2016-11-21 18:28:40 UTC

(In reply to Alex Williamson from comment #14)
> Laine, is libvirt exclusively using the kernel 'reset'
> attributes in pci-sysfs to perform device resets or will it induce its own
> bus resets?  In the latter case, have fun, it's not a kernel issue, in the
> former case it's bizarre to me that the kernel is offering a reset which
> /might/ reset the entire hierarchy, when we don't seem to be doing
> sufficient save and restore from this code path.  I'm not sure why any
> non-endpoints would list themselves as supporting pci_reset_function().

(As a preface: I understand very little about why past libvirt has done what it has done wrt resetting PCI devices, as the code doing that has been around for a very long time and just moved around occasionally.)

I just looked in virpci.c and found that the only reason that any kind of reset function *at all* is performed in response to the virsh nodedev-reset command is because there is no indication that the device is going to be assigned via VFIO (i.e. it's a low level function that is unaware what might be done (or might have been done in the past) with the device). When we are assigning devices, or cleaning up after an assigned device is no longer needed by a guest, we completely bypass *all* device reset code if VFIO was used / will be used for the device assignment (since VFIO handles all of that for us).

(If the kernel is old enough and/or the user insistent enough that legacy KVM device assignment is used, then libvirt will try to do FLR for the device if available, failing that will try to do a power management reset, and failing *that* will attempt to reset the parent device (I guess that means the PCI controller), but only if there is no other device connected to that PCI controller. But since RHEL7 kernels don't even allow legacy KVM device assignment, that code is effectively dead.)
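
The fallback order described in the previous paragraph can be written down as a small decision function. This is a Python model of the prose only, not the code in virpci.c. The function name and its parameters are hypothetical, and the error text simply reuses the message quoted earlier in this bug.

```python
def choose_reset(has_flr: bool, has_pm_reset: bool, other_devices_on_bus: int) -> str:
    """Pick a reset method in the order described for legacy KVM assignment."""
    if has_flr:
        return "flr"            # function level reset, preferred
    if has_pm_reset:
        return "pm"             # power management reset
    if other_devices_on_bus == 0:
        return "bus"            # reset via the parent, only if nothing else is attached
    raise RuntimeError("no FLR, PM reset or bus reset available")


# A device with neither FLR nor PM reset, alone behind its parent.
method = choose_reset(has_flr=False, has_pm_reset=False, other_devices_on_bus=0)
```

When other devices share the parent and neither FLR nor PM reset is available, the function raises, which corresponds to the "Unable to reset PCI device" error in the expected results.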

I wonder if there is really any practical use for the nodedev-reset function any more...
