Bug 2138178
| Summary: | Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreate) | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Tom Tracy <ttracy> |
| Component: | qemu-kvm | Assignee: | Virtualization Maintenance <virt-maint> |
| qemu-kvm sub component: | PCI | QA Contact: | Yiqian Wei <yiwei> |
| Status: | CLOSED MIGRATED | Docs Contact: | |
| Severity: | low | | |
| Priority: | unspecified | CC: | alex.williamson, chayang, clegoate, coli, jinzhao, juzhang, laine, lvivier, rsibley, virt-maint, zhguo |
| Version: | 9.2 | Keywords: | MigratedToJIRA, Triaged |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-07-07 20:32:26 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
I can start a VM with a VF successfully in the following test env:

5.14.0-178.el9.x86_64
qemu-kvm-7.1.0-3.el9.x86_64
libvirt-8.7.0-1.el9.x86_64
edk2-ovmf-20220826gitba0e0e4c6a-1.el9.noarch
seabios-bin-1.16.0-4.el9.noarch

The domain config:

[1] Q35 + SEABIOS RHEL92

virt-install --machine=q35 --noreboot --name=rhel92 --memory=4096 --vcpus=4 --graphics type=vnc,port=5992,listen=0.0.0.0 --osinfo detect=on,require=off --check all=off --import --noautoconsole --network bridge=switch,model=virtio,mac=52:54:00:00:92:92 --disk path=/home/images/RHEL92.qcow2,bus=virtio,cache=none,format=qcow2,io=threads,size=20 --hostdev 0000:1a:08.2

[2] Q35 + OVMF RHEL92

virt-install --machine=q35 --noreboot --name=rhel92 --memory=4096 --vcpus=4 --graphics type=vnc,port=5992,listen=0.0.0.0 --osinfo detect=on,require=off --check all=off --import --noautoconsole --network bridge=switch,model=virtio,mac=52:54:00:00:92:92 --disk path=/home/images/RHEL92.qcow2,bus=virtio,cache=none,format=qcow2,io=threads,size=20 --boot=uefi --hostdev 0000:1a:08.2

Hi Tom,

Could you please provide the following info?

[1] What's your host env?
# rpm -q qemu-kvm libvirt edk2-ovmf seabios-bin
# uname -r

[2] What's the domain xml you used?
# virsh dumpxml $domain

[3] What's the detailed info of the virtual function and its physical function?
# lspci -s 0000:31:0a.0
# lspci -s 0000:31:00.0

[4] What's the libvirtd log from when you hit this issue?
# cat /var/log/libvirt/qemu/$domain.log

[5] Which libvirt tool did you use to start the domain: virsh, virt-manager, or something else?
Yanghang
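For reference, the VFs passed via --hostdev above are created through the kernel's standard SR-IOV sysfs interface. A minimal sketch; the PF address 0000:1a:00.0 is a placeholder, since the thread only shows VF addresses:

# echo 0 > /sys/bus/pci/devices/0000:1a:00.0/sriov_numvfs
# echo 4 > /sys/bus/pci/devices/0000:1a:00.0/sriov_numvfs
# lspci -nn | grep -i 'Virtual Function'

The first write clears any existing VFs (the count cannot be changed while it is non-zero); the second creates four, which should then show up in the lspci listing.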
Little background here. I test multiple network cards with virtio and passthrough devices, using 10Gb, 25Gb, 100Gb, and 200Gb cards, including 200Gb BlueField cards. I created 5 virtio networks with no issues. When I try to add the 10Gb passthrough card, or any other network card, the guest hangs. I have done this multiple times with RHEL8 and RHEL9 in the past with no issues. The issue started with RHEL9.2.
[1] What's your host env?
# rpm -q qemu-kvm libvirt edk2-ovmf seabios-bin
qemu-kvm-7.1.0-3.el9.x86_64
libvirt-8.7.0-1.el9.x86_64
edk2-ovmf-20220826gitba0e0e4c6a-1.el9.noarch
seabios-bin-1.16.0-4.el9.noarch
# uname -r
5.14.0-182.el9.x86_64 (both host and guest)
Host grub line: BOOT_IMAGE=(hd0,msdos1)/vmlinuz-5.14.0-182.el9.x86_64 root=/dev/mapper/rhel92_kvm_perf184-root ro crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M resume=/dev/mapper/rhel92_kvm_perf184-swap rd.lvm.lv=rhel92_kvm_perf184/root rd.lvm.lv=rhel92_kvm_perf184/swap intel_iommu=on
[2] What's the domain xml you used?
# virsh dumpxml $domain - See attached file
[3] What's the detailed info of the virtual function and its physical function?
[root@perf184 ~]# lspci -vvvs 31:00.1 - Host
31:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
DeviceName: Integrated NIC 1 Port 2-1
Subsystem: Intel Corporation Ethernet Converged Network Adapter X710
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 16
NUMA node: 0
IOMMU group: 28
Region 0: Memory at a7000000 (64-bit, prefetchable) [size=16M]
Region 3: Memory at a9800000 (64-bit, prefetchable) [size=32K]
Expansion ROM at a6480000 [disabled] [size=512K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Address: 0000000000000000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [70] MSI-X: Enable+ Count=129 Masked-
Vector table: BAR=3 offset=00000000
PBA: BAR=3 offset=00001000
Capabilities: [a0] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 2048 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 512 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L1 <16us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (ok), Width x8 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn-
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete- EqualizationPhase1-
EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [e0] Vital Product Data
Product Name: X710 10GbE Controller
Read-only fields:
[V0] Vendor specific: FFV20.5.16
[PN] Part number: YJYK1
[MN] Manufacture ID: 1028
[V1] Vendor specific: DSV1028VPDR.VER2.1
[V3] Vendor specific: DTINIC
[V4] Vendor specific: DCM1001FFFFFF2101FFFFFF
[V5] Vendor specific: NPY2
[V6] Vendor specific: PMT7
[V7] Vendor specific: NMVIntel Corp
[V8] Vendor specific: L1D0
[RV] Reserved: checksum good, 4 byte(s) reserved
Read/write fields:
[Y1] System specific: CCF1
End
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt+ UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
CEMsk: RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn+ ECRCChkCap+ ECRCChkEn+
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [140 v1] Device Serial Number 74-36-de-ff-ff-ca-05-68
Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 0
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
IOVCap: Migration-, Interrupt Message Number: 000
IOVCtl: Enable+ Migration- Interrupt- MSE+ ARIHierarchy-
IOVSta: Migration-
Initial VFs: 64, Total VFs: 64, Number of VFs: 4, Function Dependency Link: 01
VF offset: 79, stride: 1, Device ID: 154c
Supported Page Size: 00000553, System Page Size: 00000001
Region 0: Memory at 00000000a9000000 (64-bit, prefetchable)
Region 3: Memory at 00000000a9810000 (64-bit, prefetchable)
VF Migration: offset: 00000000, BIR: 0
Capabilities: [1a0 v1] Transaction Processing Hints
Device specific mode supported
No steering table available
Capabilities: [1b0 v1] Access Control Services
ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
Kernel driver in use: i40e
Kernel modules: i40e
[root@perf184 ~]# lspci -vvvs 31:0a.0 - virtual
31:0a.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)
Subsystem: Intel Corporation Device 0000
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
NUMA node: 0
IOMMU group: 309
Region 0: Memory at a9000000 (64-bit, prefetchable) [virtual] [size=64K]
Region 3: Memory at a9810000 (64-bit, prefetchable) [virtual] [size=16K]
Capabilities: [70] MSI-X: Enable- Count=5 Masked-
Vector table: BAR=3 offset=00000000
PBA: BAR=3 offset=00002000
Capabilities: [a0] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 2048 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L1 <16us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed unknown (downgraded), Width x0 (downgraded)
TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn-
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 0
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [1a0 v1] Transaction Processing Hints
Device specific mode supported
No steering table available
Capabilities: [1d0 v1] Access Control Services
ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
Kernel driver in use: vfio-pci
Kernel modules: iavf
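As an aside, the VF address 31:0a.0 is consistent with the PF's SR-IOV capability shown above (VF offset: 79, stride: 1): per the SR-IOV spec, VF n's routing ID is the PF's routing ID plus the offset plus n times the stride. A quick check in shell arithmetic:

# pf=0x3101; offset=79; stride=1    # PF 31:00.1 -> (0x31 << 8) | (0x00 << 3) | 0x1
# vf=$((pf + offset + 0 * stride))  # first VF
# printf '%02x:%02x.%x\n' $((vf >> 8)) $((vf >> 3 & 0x1f)) $((vf & 0x7))
31:0a.0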
[4] What's the libvirtd log from when you hit this issue?
# cat /var/log/libvirt/qemu/$domain.log
2022-11-01 14:26:54.108+0000: starting up libvirt version: 8.7.0, package: 1.el9 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2022-09-06-06:21:04, ), qemu version: 7.1.0qemu-kvm-7.1.0-3.el9, kernel: 5.14.0-182.el9.x86_64, hostname: perf184.perf.lab.eng.bos.redhat.com
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
HOME=/var/lib/libvirt/qemu/domain-1-rhel9.2 \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-rhel9.2/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-rhel9.2/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-rhel9.2/.config \
/usr/libexec/qemu-kvm \
-name guest=rhel9.2,debug-threads=on \
-S \
-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-rhel9.2/master-key.aes"}' \
-machine pc-q35-rhel9.0.0,usb=off,dump-guest-core=off,memory-backend=pc.ram \
-accel kvm \
-cpu host,migratable=on \
-m 1031008 \
-object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":1081090244608}' \
-overcommit mem-lock=off \
-smp 64,sockets=64,cores=1,threads=1 \
-uuid 5b562de8-5436-4817-a44e-5c79f53c80e7 \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=29,server=on,wait=off \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-hpet \
-no-shutdown \
-global ICH9-LPC.disable_s3=1 \
-global ICH9-LPC.disable_s4=1 \
-boot strict=on \
-device '{"driver":"pcie-root-port","port":16,"chassis":1,"id":"pci.1","bus":"pcie.0","multifunction":true,"addr":"0x2"}' \
-device '{"driver":"pcie-root-port","port":17,"chassis":2,"id":"pci.2","bus":"pcie.0","addr":"0x2.0x1"}' \
-device '{"driver":"pcie-root-port","port":18,"chassis":3,"id":"pci.3","bus":"pcie.0","addr":"0x2.0x2"}' \
-device '{"driver":"pcie-root-port","port":19,"chassis":4,"id":"pci.4","bus":"pcie.0","addr":"0x2.0x3"}' \
-device '{"driver":"pcie-root-port","port":20,"chassis":5,"id":"pci.5","bus":"pcie.0","addr":"0x2.0x4"}' \
-device '{"driver":"pcie-root-port","port":21,"chassis":6,"id":"pci.6","bus":"pcie.0","addr":"0x2.0x5"}' \
-device '{"driver":"pcie-root-port","port":22,"chassis":7,"id":"pci.7","bus":"pcie.0","addr":"0x2.0x6"}' \
-device '{"driver":"pcie-root-port","port":23,"chassis":8,"id":"pci.8","bus":"pcie.0","addr":"0x2.0x7"}' \
-device '{"driver":"pcie-root-port","port":24,"chassis":9,"id":"pci.9","bus":"pcie.0","multifunction":true,"addr":"0x3"}' \
-device '{"driver":"pcie-root-port","port":25,"chassis":10,"id":"pci.10","bus":"pcie.0","addr":"0x3.0x1"}' \
-device '{"driver":"pcie-root-port","port":26,"chassis":11,"id":"pci.11","bus":"pcie.0","addr":"0x3.0x2"}' \
-device '{"driver":"pcie-root-port","port":27,"chassis":12,"id":"pci.12","bus":"pcie.0","addr":"0x3.0x3"}' \
-device '{"driver":"pcie-root-port","port":28,"chassis":13,"id":"pci.13","bus":"pcie.0","addr":"0x3.0x4"}' \
-device '{"driver":"pcie-root-port","port":29,"chassis":14,"id":"pci.14","bus":"pcie.0","addr":"0x3.0x5"}' \
-device '{"driver":"pcie-root-port","port":30,"chassis":15,"id":"pci.15","bus":"pcie.0","addr":"0x3.0x6"}' \
-device '{"driver":"qemu-xhci","p2":15,"p3":15,"id":"usb","bus":"pci.2","addr":"0x0"}' \
-device '{"driver":"virtio-serial-pci","id":"virtio-serial0","bus":"pci.3","addr":"0x0"}' \
-blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/rhel9.2.qcow2","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":false,"discard":"unmap","driver":"qcow2","file":"libvirt-1-storage","backing":null}' \
-device '{"driver":"virtio-blk-pci","bus":"pci.4","addr":"0x0","drive":"libvirt-1-format","id":"virtio-disk0","bootindex":1}' \
-netdev tap,fd=30,vhost=on,vhostfd=32,id=hostnet0 \
-device '{"driver":"virtio-net-pci","netdev":"hostnet0","id":"net0","mac":"52:54:00:c7:61:48","bootindex":2,"bus":"pci.1","addr":"0x0"}' \
-netdev tap,fds=33:34:35:36,vhost=on,vhostfds=37:38:39:40,id=hostnet1 \
-device '{"driver":"virtio-net-pci","mq":true,"vectors":10,"netdev":"hostnet1","id":"net1","mac":"52:54:00:e6:37:de","bus":"pci.7","addr":"0x0"}' \
-netdev tap,fds=41:42:43:44:45:46:47:48,vhost=on,vhostfds=49:50:51:52:53:54:55:56,id=hostnet2 \
-device '{"driver":"virtio-net-pci","mq":true,"vectors":18,"netdev":"hostnet2","id":"net2","mac":"52:54:00:fa:b2:03","bus":"pci.8","addr":"0x0"}' \
-netdev tap,fds=57:58:59:60:61:62:63:64:65:66:67:68:69:70:71:72:73:74:75:76:77:78:79:80:81:82:83:84:85:86:87:88:89:90:91:92:93:94:95:96:97:98:99:100:101:102:103:104:105:106:107:108:109:110:111:112:113:114:115:116:117:118:119:120,vhost=on,vhostfds=121:122:123:124:125:126:127:128:129:130:131:132:133:134:135:136:137:138:139:140:141:142:143:144:145:146:147:148:149:150:151:152:153:154:155:156:157:158:159:160:161:162:163:164:165:166:167:168:169:170:171:172:173:174:175:176:177:178:179:180:181:182:183:184,id=hostnet3 \
-device '{"driver":"virtio-net-pci","mq":true,"vectors":130,"netdev":"hostnet3","id":"net3","mac":"52:54:00:b6:ee:17","bus":"pci.9","addr":"0x0"}' \
-netdev tap,fds=185:186:187:188:189:190:191:192:193:194:195:196:197:198:199:200:201:202:203:204:205:206:207:208:209:210:211:212:213:214:215:216:217:218:219:220:221:222:223:224:225:226:227:228:229:230:231:232:233:234:235:236:237:238:239:240:241:242:243:244:245:246:247:248,vhost=on,vhostfds=249:250:251:252:253:254:255:256:257:258:259:260:261:262:263:264:265:266:267:268:269:270:271:272:273:274:275:276:277:278:279:280:281:282:283:284:285:286:287:288:289:290:291:292:293:294:295:296:297:298:299:300:301:302:303:304:305:306:307:308:309:310:311:312,id=hostnet4 \
-device '{"driver":"virtio-net-pci","mq":true,"vectors":130,"netdev":"hostnet4","id":"net4","mac":"52:54:00:27:c2:e4","bus":"pci.10","addr":"0x0"}' \
-netdev tap,fds=313:314:315:316:317:318:319:320:321:322:323:324:325:326:327:328:329:330:331:332:333:334:335:336:337:338:339:340:341:342:343:344:345:346:347:348:349:350:351:352:353:354:355:356:357:358:359:360:361:362:363:364:365:366:367:368:369:370:371:372:373:374:375:376,vhost=on,vhostfds=377:378:379:380:381:382:383:384:385:386:387:388:389:390:391:392:393:394:395:396:397:398:399:400:401:402:403:404:405:406:407:408:409:410:411:412:413:414:415:416:417:418:419:420:421:422:423:424:425:426:427:428:429:430:431:432:433:434:435:436:437:438:439:440,id=hostnet5 \
-device '{"driver":"virtio-net-pci","mq":true,"vectors":130,"netdev":"hostnet5","id":"net5","mac":"52:54:00:1d:a2:d2","bus":"pci.11","addr":"0x0"}' \
-chardev pty,id=charserial0 \
-device '{"driver":"isa-serial","chardev":"charserial0","id":"serial0","index":0}' \
-chardev socket,id=charchannel0,fd=28,server=on,wait=off \
-device '{"driver":"virtserialport","bus":"virtio-serial0.0","nr":1,"chardev":"charchannel0","id":"channel0","name":"org.qemu.guest_agent.0"}' \
-device '{"driver":"usb-tablet","id":"input0","bus":"usb.0","port":"1"}' \
-audiodev '{"id":"audio1","driver":"none"}' \
-vnc 127.0.0.1:0,audiodev=audio1 \
-device '{"driver":"virtio-vga","id":"video0","max_outputs":1,"bus":"pcie.0","addr":"0x1"}' \
-device '{"driver":"vfio-pci","host":"0000:31:0a.0","id":"hostdev0","bus":"pci.12","addr":"0x0"}' \
-device '{"driver":"virtio-balloon-pci","id":"balloon0","bus":"pci.5","addr":"0x0"}' \
-object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \
-device '{"driver":"virtio-rng-pci","rng":"objrng0","id":"rng0","bus":"pci.6","addr":"0x0"}' \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
char device redirected to /dev/pts/2 (label charserial0)
[5] Which libvirt tool did you use to start the domain: virsh, virt-manager, or something else?
virt-manager and virsh; the guest hangs either way.
To clear the issue I have to reboot the host.
Tom
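A quick cross-check of the sizes in the QEMU log above (shell arithmetic; -m takes MiB):

# echo $((1031008 * 1024))
1055752192
# echo $((1031008 * 1024 * 1024))
1081090244608

So -m 1031008 corresponds exactly to the domain's <memory unit='KiB'>1055752192</memory> and to the 1081090244608-byte pc.ram backend, roughly 1007 GiB, which matters for the discussion that follows.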
Hi Tom,

I used the domain xml you provided and defined a similar domain (I needed to update some of the domain config due to hardware limits), but I can still start the domain successfully.

Could you please test again with the following domain config to see if the bug still exists in your environment?

[1] Reduce the domain memory to a smaller size, like 4G.
[2] Use a recommended SMP CPU number supported by KVM, like 24 or smaller.
[3] Remove some optional domain devices, like the 5 virtio-net-pci devices.

If possible, please also upload a screenshot of the guest hang as an attachment. :) I think these will be helpful for us to locate the root cause.

Yanghang

Experiment #1: Reduced guest CPUs to 24, reduced memory to 4GB, removed the virtio networks, leaving only the SRIOV device (10Gb card). The guest boots. I knew this config would boot, as the issue is that having both a virtio network AND a passthrough network causes the hang; I tried that before filing the BZ.

Experiment #2: With the passthrough network working, added back all the virtio network devices to the above configuration. The guest boots with 5 virtio networks and 1 passthrough device at 24 CPUs / 4GB memory.

Experiment #3: Added the remaining passthrough devices needed for testing: Intel E810 25Gb, Intel E810 100Gb, and Mellanox ConnectX-6 200Gb. The guest boots with all required/desired network devices.

Experiment #4: Increased the CPU count from 24 to 64. The guest boots with all required/desired network devices.

Experiment #5: Increased memory from 4GB to 512GB. The guest boots with all required/desired network devices.

Experiment #6: Increased memory from 512GB to 640GB. The guest boots.

Experiment #7: Increased memory from 640GB to 750GB. The guest boots.

Note: when adding memory past 256GB you get the following messages, but the guest still boots until you get close to the host memory size, at which point the guest will not boot:

Nov 02 10:08:22 perf184.perf.lab.eng.bos.redhat.com virtqemud[2806]: Cannot start job (query, none, none) in API remoteDispatchConnectGetAllDomainStats for domain rhel9.2; current job is (async nested, none, start) owned by (2827 remoteDispatchDomainCreate, 0 <null>, 2827 remoteDispatchDomainCreate (flags=0x0)) for (30s, 0s, 31s)
Nov 02 10:08:22 perf184.perf.lab.eng.bos.redhat.com virtqemud[2806]: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreate)

So this is definitely a memory allocation issue. I can start testing with 256GB guest memory; I just do not want to see the above errors.

Note: I used the same memory configuration with RHEL9.1 and never had this issue. We do not test migration, so I am not worried about that.

This is a Dell R750 IceLake with 1TB total system memory. When the system hangs on power up, you see nothing from the virt-manager console; all you see is the guest paused:

[root@perf184 ~]# virsh list --all
 Id   Name      State
--------------------------------------
 1    rhel9.2   paused

(In reply to Tom Tracy from comment #6)
> This is a Dell R750 IceLake with 1TB total system memory

<memory unit='KiB'>1055752192</memory>

This is over 98% of the total host memory! Frankly I'm surprised 9.1 could handle this. The host is more than likely hanging because it's thrashing, trying to swap and OOM-kill whatever it can. How close was 9.1 to reaching this same condition, another couple of GB? An increase in memory usage anywhere in the kernel could have resulted in this. Device assignment requires pinning all of VM RAM into host memory. When we're cutting the host this thin on memory, the size of page structures, IOMMU page tables, and perhaps even memory fragmentation can come into play.
Does comparing /proc/meminfo on the host between 9.1 and 9.2 suggest a significant difference in memory?
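For what it's worth, a concrete way to run that comparison, and to watch the pinning itself, is sketched below; which fields are the most telling is my assumption:

# grep -E 'MemTotal|MemFree|MemAvailable|AnonPages|Mlocked|HugePages_' /proc/meminfo
# watch -n1 'grep VmLck /proc/$(pgrep -f qemu-kvm | head -1)/status'

While the guest starts, the second command should show QEMU's locked memory (VmLck) climbing toward the full guest RAM size as vfio pins pages.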
Zhiyi,

Just built 2 more guests on Cascade Lake systems using the latest libvirt rpms you pointed to. I can start the guest with virtio and SRIOV networks, but if I use more than 1/4 of total memory I still see these errors:
Nov 09 13:13:44 perf2.perf.lab.eng.bos.redhat.com virtqemud[2984]: Cannot start job (query, none, none) in API remoteDispatchConnectGetAllDomainStats for domain rhel9.2; current job is (async nested, none, start) owned by (2988 remoteDispatchDomainCreate, 0 <null>, 2988 remoteDispatchDomainCreate (flags=0x0)) for (60s, 0s, 60s)
Nov 09 13:13:44 perf2.perf.lab.eng.bos.redhat.com virtqemud[2984]: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreate)
Note: I only see these messages when adding an SRIOV network. With only virtio networks, you do not see the message above.
Alex
Yes, you are absolutely correct; I miscalculated total memory on the IceLake. But what concerns me is the messages above. Thanks for pointing that out.
Tom
(In reply to Tom Tracy from comment #9)
> Zhiyi,
>
> Just built 2 more guests on Cascade Lake systems using the latest libvirt
> rpms you pointed to. I can start the guest with virtio and SRIOV networks,
> but if I use more than 1/4 of total memory I still see these errors:
>
> Nov 09 13:13:44 perf2.perf.lab.eng.bos.redhat.com virtqemud[2984]: Cannot
> start job (query, none, none) in API remoteDispatchConnectGetAllDomainStats
> for domain rhel9.2; current job is (async nested, none, start) owned by
> (2988 remoteDispatchDomainCreate, 0 <null>, 2988 remoteDispatchDomainCreate
> (flags=0x0)) for (60s, 0s, 60s)
> Nov 09 13:13:44 perf2.perf.lab.eng.bos.redhat.com virtqemud[2984]: Timed out
> during operation: cannot acquire state change lock (held by
> monitor=remoteDispatchDomainCreate)
>
> Note: I only see these messages when adding an SRIOV network. With only
> virtio networks, you do not see the message above.

Pinning memory is not instantaneous; it can certainly take more than 60s for a large VM. Is this part a regression? If this is a new watchdog in libvirt, maybe the timeout is not reasonable for such large VMs with assigned devices.

Setting the subcomponent to PCI rather than Networking, as it seems more appropriate.

Hey Laine,
The error below is seen when starting a VM with an assigned device when the VM size is more than ~256GB. Pinning VM memory at QEMU startup is an intensive process that scales linearly with VM RAM size, so I suspect we're exceeding a timeout in libvirt.

If so, is there anything we can do about it, such as scaling the timeout based on VM size? Is there any residual fallout from these errors relative to libvirt managing the VM, or is this essentially self-correcting once QEMU becomes responsive? Is any of this timeout behavior new to libvirt?

Unfortunately it's very difficult to characterize vfio page pinning throughput, as it depends on factors such as hugepages/THP, memory fragmentation, and of course processor performance and system load.

Thanks
> Nov 09 13:13:44 perf2.perf.lab.eng.bos.redhat.com virtqemud[2984]: Cannot
> start job (query, none, none) in API remoteDispatchConnectGetAllDomainStats
> for domain rhel9.2; current job is (async nested, none, start) owned by
> (2988 remoteDispatchDomainCreate, 0 <null>, 2988 remoteDispatchDomainCreate
> (flags=0x0)) for (60s, 0s, 60s)
> Nov 09 13:13:44 perf2.perf.lab.eng.bos.redhat.com virtqemud[2984]: Timed out
> during operation: cannot acquire state change lock (held by
> monitor=remoteDispatchDomainCreate)
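One of the factors Alex mentions, hugepages, can be requested from the domain XML. A minimal sketch, my suggestion rather than anything tested in this thread, and it assumes 1GiB pages have already been reserved on the host:

<memoryBacking>
  <hugepages>
    <page size='1' unit='GiB'/>
  </hugepages>
</memoryBacking>

With 1GiB pages the pinning pass walks far fewer page structures and sidesteps fragmentation, so it may well finish inside the timeout; whether reserving that much host memory as hugepages is operationally acceptable is a separate question.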
The timeout for that lock has been set to 30 seconds ever since the timeout was added in commit 9b6efcfef in November 2009. If this worked previously, then I'd guess something is making the operation take longer than it used to.

I think it would be better not to lengthen the timeout just based on VM size; if it's only needed in the case of device assignment, then we should only allow the longer timeout in those cases. (This would make the code more cluttered, but would help eliminate the frustration of waiting in cases that were going to time out anyway.) But if it was previously working for the same configuration (can we verify this?), maybe it's an indication that something is broken, rather than that we need to increase the timeout (or maybe both :-))

Looking at this in more detail, it is only a stats request that's timing out while attempting to get the lock currently held by DomainCreate. Is the domain creation actually failing, or is it succeeding and you just see these error messages about virConnectGetAllDomainStats? If the latter, then it's likely that whoever is requesting the stats (virt-manager?) silently ignores the error and recovers the next time it requests the stats. In that case, lengthening timeouts while a DomainCreate is in progress is probably not the right thing to do, as it just causes the application requesting stats to hang for a longer period. (Actually, that application might want a *shorter* timeout so it could avoid becoming unresponsive during these times.)

(In reply to Laine Stump from comment #14)
> Looking at this in more detail, it is only a stats request that's timing
> out while attempting to get the lock currently held by DomainCreate. Is
> the domain creation actually failing, or is it succeeding and you just
> see these error messages about virConnectGetAllDomainStats? If the
> latter, then it's likely that whoever is requesting the stats
> (virt-manager?) silently ignores the error and recovers the next time it
> requests the stats. In that case, lengthening timeouts while a
> DomainCreate is in progress is probably not the right thing to do, as it
> just causes the application requesting stats to hang for a longer
> period. (Actually, that application might want a *shorter* timeout so it
> could avoid becoming unresponsive during these times.)

Thanks, Laine.

Tom, can you please verify whether this is a regression, and in fact whether there's any misbehavior beyond unexpected warnings in the logs?

If this bug is now describing only a nuisance issue, no longer related to the bug title, please update the title, description, and severity, and/or close as appropriate. Thanks.
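On Laine's closing point: a stats client that must not block while DomainCreate holds the job lock can already opt out of waiting. libvirt's bulk stats API accepts the VIR_CONNECT_GET_ALL_DOMAINS_STATS_NOWAIT flag (skipping busy domains rather than queueing behind them), which virsh exposes as:

# virsh domstats --nowait

This does not change the 30s lock timeout itself; it just lets the monitoring application avoid hitting it, assuming the installed libvirt is new enough to support the flag.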
Would like to keep this ticket open, as other team members create large VMs using all of the host resources; they just haven't gotten to RHEL9.2 yet. I never saw these messages with RHEL9.1.
Tom
(In reply to Alex Williamson from comment #15)
> Tom, can you please verify whether this is a regression, and in fact
> whether there's any misbehavior beyond unexpected warnings in the logs?
>
> If this bug is now describing only a nuisance issue, no longer related to
> the bug title, please update the title, description, and severity, and/or
> close as appropriate. Thanks.

Considering this bug is mainly about starting a domain with large memory, I am resetting the QA Contact to the default first.
Description of problem:
When trying to add any SRIOV network device, the guest hangs and does not come online.

Version-Release number of selected component (if applicable):
RHEL9.2
libvirt 8.7.0-1.el9

How reproducible:
Every time

Steps to Reproduce:
1. Stop guest
2. Add PCI host device
3. Select virtual function device
4. Boot guest

Actual results:
Guest hangs, with the following messages in journalctl:

Oct 27 09:55:24 perf184.perf.lab.eng.bos.redhat.com virtqemud[2860]: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainCreate)

Expected results:
Guest should boot with the SRIOV network device.

Additional info:
This happens with any network device (Intel, Mellanox). If you add a virtio bridge, that works without issue. We do not see this with RHEL9.1. The XML looks like this, which is the same as in a RHEL9.1 guest:

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x31' slot='0x0a' function='0x0'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x0c' slot='0x00' function='0x0'/>
</hostdev>
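For completeness, a CLI equivalent of the reproduction steps above (a sketch; the domain name matches this report, and vf.xml is assumed to contain the <hostdev> element shown):

# virsh shutdown rhel9.2
# virsh attach-device rhel9.2 vf.xml --config
# virsh start rhel9.2

On the affected hosts, the final step is where the guest hangs and the state change lock messages appear in the journal.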