Bug 2022589

Summary: libvirtd log got some internal error about node device VPD parse error
Product: Red Hat Enterprise Linux 9 Reporter: yalzhang <yalzhang>
Component: libvirt Assignee: Daniel Berrangé <berrange>
libvirt sub component: General QA Contact: yalzhang <yalzhang>
Status: CLOSED ERRATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: jdenemar, jtomko, lcheng, virt-maint
Version: 9.0 Flags: pm-rhel: mirror+
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-7.10.0-1.el9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-05-17 12:45:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version: 7.10.0
Embargoed:

Description yalzhang@redhat.com 2021-11-12 05:27:46 UTC
Description of problem:
The libvirtd log contains internal errors about parsing the VPD of node devices.

Version-Release number of selected component (if applicable):
libvirt-7.9.0-1.el9.x86_64

How reproducible:
100% on some systems

Steps to Reproduce:
1. Start the system and run "virsh nodedev-list", then check the libvirtd log; it contains errors such as the following:
# grep -i error /var/log/libvirt/libvirtd.log
2021-11-12 03:58:57.998+0000: 15024: error : virPCIVPDResourceIsValidTextValue:195 : internal error: The provided value contains invalid characters: N/A
2021-11-12 03:58:57.998+0000: 15024: error : virPCIVPDParseVPDLargeResourceFields:536 : internal error: Field value contains invalid characters

2. Check the VPD info:
# find  / -name vpd
/sys/devices/pci0000:00/0000:00:03.1/0000:01:00.0/vpd
/sys/devices/pci0000:00/0000:00:03.1/0000:01:00.1/vpd
/sys/devices/pci0000:00/0000:00:01.0/0000:03:00.0/vpd
/sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0/vpd
/sys/devices/pci0000:00/0000:00:03.0/0000:02:00.1/vpd
/sys/devices/pci0000:00/0000:00:02.2/0000:06:00.0/vpd
/sys/devices/pci0000:00/0000:00:02.0/0000:04:00.1/vpd
/sys/devices/pci0000:00/0000:00:02.0/0000:04:00.0/vpd

# virsh nodedev-list --cap vpd
pci_0000_01_00_0
pci_0000_01_00_1
pci_0000_02_00_0
pci_0000_02_00_1
pci_0000_04_00_0
pci_0000_04_00_1

3. In the outputs above, devices 06:00.0 and 03:00.0 expose a vpd file in sysfs but are not listed with the vpd capability, so the errors are probably related to these two devices:
# lspci -vvv -s 06:00.0
06:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
..
Capabilities: [48] Vital Product Data
Product Name: CX354A - ConnectX-3 Pro QSFP
Read-only fields:
[PN] Part number: MCX354A-FCCT        
[EC] Engineering changes: A9
[SN] Serial number: MT1616X19101            
[V0] Vendor specific: PCIe Gen3 x8    
[RV] Reserved: checksum good, 0 byte(s) reserved
Read/write fields:
[V1] Vendor specific: N/A  
[YA] Asset tag: N/A                    
[RW] Read-write area: 101 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 253 byte(s) free
[RW] Read-write area: 252 byte(s) free
End

# lspci -vvv -s 03:00.0
03:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3108 [Invader] (rev 02)
...
Capabilities: [d0] Vital Product Data
pcilib: sysfs_read_vpd: read failed: Input/output error
Not readable
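
The comparison made in step 3 can be sketched as a set difference between the devices that expose a vpd file in sysfs and the node devices that libvirt lists with the vpd capability. This is only a minimal sketch that uses the outputs quoted above as literal data; the helper name nodedev_name is hypothetical and not part of libvirt.

```python
import re

# Paths copied from the "find / -name vpd" output in step 2.
SYSFS_VPD_PATHS = [
    "/sys/devices/pci0000:00/0000:00:03.1/0000:01:00.0/vpd",
    "/sys/devices/pci0000:00/0000:00:03.1/0000:01:00.1/vpd",
    "/sys/devices/pci0000:00/0000:00:01.0/0000:03:00.0/vpd",
    "/sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0/vpd",
    "/sys/devices/pci0000:00/0000:00:03.0/0000:02:00.1/vpd",
    "/sys/devices/pci0000:00/0000:00:02.2/0000:06:00.0/vpd",
    "/sys/devices/pci0000:00/0000:00:02.0/0000:04:00.1/vpd",
    "/sys/devices/pci0000:00/0000:00:02.0/0000:04:00.0/vpd",
]

# Names copied from the "virsh nodedev-list --cap vpd" output in step 2.
NODEDEVS_WITH_VPD = {
    "pci_0000_01_00_0",
    "pci_0000_01_00_1",
    "pci_0000_02_00_0",
    "pci_0000_02_00_1",
    "pci_0000_04_00_0",
    "pci_0000_04_00_1",
}


def nodedev_name(sysfs_path):
    """Hypothetical helper: map a sysfs vpd path to the name format virsh prints."""
    address = sysfs_path.split("/")[-2]  # e.g. "0000:06:00.0"
    return "pci_" + re.sub(r"[:.]", "_", address)


# Devices that have a vpd file but are missing from the vpd capability list.
missing = sorted(
    nodedev_name(path)
    for path in SYSFS_VPD_PATHS
    if nodedev_name(path) not in NODEDEVS_WITH_VPD
)
print(missing)  # ['pci_0000_03_00_0', 'pci_0000_06_00_0']
```

The two remaining names match the devices inspected with lspci in step 3.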

Actual results:
libvirtd logs errors about VPD info.

Expected results:
There should be no error about this; perhaps it could be downgraded to a warning.

Additional info:
The VPD feature was introduced by https://www.mail-archive.com/libvir-list@redhat.com/msg222338.html

Comment 1 Ján Tomko 2021-11-12 11:51:44 UTC
Fixed upstream by:
commit 600f580d623ae4077ddeb6c7cb24f8a315a7c73b
Author:     Dmitrii Shcherbakov <dmitrii.shcherbakov>
CommitDate: 2021-11-02 13:43:23 +0000

    PCI VPD: Skip fields with invalid values

git describe: v7.9.0-30-g600f580d62
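
Going by the commit title, a field whose value fails validation is now skipped instead of being reported as an error. The following is a rough sketch of that behaviour, not libvirt's code, using the field values from the lspci output in the description (trailing padding removed). The ACCEPTED_CHARS allow-list and both function names are assumptions for illustration: this report only shows that "N/A" is rejected, not the exact character set libvirt accepts.

```python
import string

# ASSUMPTION: illustrative allow-list (alphanumerics plus some punctuation).
# The real set used by libvirt is not shown in this report; all that is known
# from the log is that the value "N/A" is considered invalid.
ACCEPTED_CHARS = set(string.ascii_letters + string.digits + " -_,.:;=")


def is_valid_text_value(value):
    """Return True if every character of the value is in the allow-list."""
    return all(ch in ACCEPTED_CHARS for ch in value)


def keep_valid_fields(fields):
    """Keep fields with valid values and collect the keywords that were skipped."""
    kept = {}
    skipped = []
    for keyword, value in fields:
        if is_valid_text_value(value):
            kept[keyword] = value
        else:
            skipped.append(keyword)  # skip the field instead of failing the parse
    return kept, skipped


# Keyword/value pairs taken from the lspci output for 06:00.0.
FIELDS = [
    ("PN", "MCX354A-FCCT"),
    ("EC", "A9"),
    ("SN", "MT1616X19101"),
    ("V0", "PCIe Gen3 x8"),
    ("V1", "N/A"),
    ("YA", "N/A"),
]

kept, skipped = keep_valid_fields(FIELDS)
print(sorted(kept))  # ['EC', 'PN', 'SN', 'V0']
print(skipped)       # ['V1', 'YA']
```

Under this assumed allow-list, only the "/" in "N/A" is rejected, which would explain why the V1 and YA fields triggered the original error.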

Comment 2 yalzhang@redhat.com 2021-12-02 04:41:50 UTC
Tested with libvirt-7.10.0-1.el9.x86_64 on the system; there is no error message in the log.

Comment 5 yalzhang@redhat.com 2021-12-14 11:07:43 UTC
Tested on libvirt-7.10.0-1.module+el8.6.0+13502+4f24a11d.x86_64: the bug is fixed. There are no errors in libvirtd.log, only the debug messages shown below for reference.

# ll /sys/devices/pci0000:00/0000:00:02.2/0000:06:00.0/vpd
-rw-------. 1 root root 0 Dec 14 05:07 /sys/devices/pci0000:00/0000:00:02.2/0000:06:00.0/vpd
# virsh nodedev-dumpxml pci_0000_06_00_0
<device>
  <name>pci_0000_06_00_0</name>
  <path>/sys/devices/pci0000:00/0000:00:02.2/0000:06:00.0</path>
  <parent>pci_0000_00_02_2</parent>
  <driver>
    <name>mlx4_core</name>
  </driver>
  <capability type='pci'>
    <class>0x020000</class>
    <domain>0</domain>
    <bus>6</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x1007'>MT27520 Family [ConnectX-3 Pro]</product>
    <vendor id='0x15b3'>Mellanox Technologies</vendor>
    <capability type='virt_functions' maxCount='4'/>
    <numa node='0'/>
    <pci-express>
      <link validity='cap' port='8' speed='8' width='8'/>
      <link validity='sta' speed='8' width='8'/>
    </pci-express>
  </capability>
</device>

# ll /sys/devices/pci0000:00/0000:00:01.0/0000:03:00.0/vpd
-rw-------. 1 root root 0 Dec 14 05:07 /sys/devices/pci0000:00/0000:00:01.0/0000:03:00.0/vpd
# virsh nodedev-dumpxml pci_0000_03_00_0
<device>
  <name>pci_0000_03_00_0</name>
  <path>/sys/devices/pci0000:00/0000:00:01.0/0000:03:00.0</path>
  <parent>pci_0000_00_01_0</parent>
  <driver>
    <name>megaraid_sas</name>
  </driver>
  <capability type='pci'>
    <class>0x010400</class>
    <domain>0</domain>
    <bus>3</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x005d'>MegaRAID SAS-3 3108 [Invader]</product>
    <vendor id='0x1000'>Broadcom / LSI</vendor>
    <numa node='0'/>
    <pci-express>
      <link validity='cap' port='0' speed='8' width='8'/>
      <link validity='sta' speed='8' width='8'/>
    </pci-express>
  </capability>
</device>

# grep -i virPCIVPD*  /var/log/libvirt/libvirtd.log 
2021-12-14 11:05:50.053+0000: 39063: debug : virPCIVPDReadVPDBytes:427 : Unable to read 1 bytes at offset 0 from fd: 25
2021-12-14 11:05:50.053+0000: 39063: debug : virPCIVPDParse:748 : Encountered an invalid VPD: does not have a VPD-R record
2021-12-14 11:06:01.636+0000: 39062: debug : virPCIVPDParseVPDLargeResourceFields:600 : VPD-W section parsing ended prematurely (RW is not the last field).
2021-12-14 11:06:01.636+0000: 39062: debug : virPCIVPDParse:740 : Encountered an invalid VPD
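
For context on the "RW is not the last field" message: in the PCI VPD format, each field inside a VPD-R or VPD-W section is a 2-byte ASCII keyword, a 1-byte length, and then the data. The lspci output in the description lists sixteen RW fields in the read/write section, which is consistent with that debug message. Below is a minimal sketch of such a check, not libvirt's implementation; the function names and the sample bytes are made up for illustration.

```python
def parse_keyword_fields(section):
    """Split a VPD-R/VPD-W payload into (keyword, data) pairs.

    Each field is: 2-byte ASCII keyword, 1-byte length, then the data bytes.
    Trailing bytes too short to hold a field header are ignored.
    """
    fields = []
    offset = 0
    while offset + 3 <= len(section):
        keyword = section[offset:offset + 2].decode("ascii")
        length = section[offset + 2]
        start = offset + 3
        end = start + length
        if end > len(section):
            raise ValueError("field %s runs past the end of the section" % keyword)
        fields.append((keyword, section[start:end]))
        offset = end
    return fields


def rw_is_last(fields):
    """Return True if there is no RW field or the first RW field is the last field."""
    keywords = [keyword for keyword, _ in fields]
    return "RW" not in keywords or keywords.index("RW") == len(keywords) - 1


# HYPOTHETICAL sample payload: two text fields followed by two RW fields.
section = b"V1\x03N/A" + b"YA\x03N/A" + b"RW\x04" + bytes(4) + b"RW\x02" + bytes(2)
fields = parse_keyword_fields(section)
print([keyword for keyword, _ in fields])  # ['V1', 'YA', 'RW', 'RW']
print(rw_is_last(fields))                  # False
```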

Comment 7 errata-xmlrpc 2022-05-17 12:45:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: libvirt), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2390