Description of problem: libguestfs-test-tool hangs with cpu model Intel Xeon Processor (Icelake) on OpenStack env. $ libguestfs-test-tool ************************************************************ * IMPORTANT NOTICE * * When reporting bugs, include the COMPLETE, UNEDITED * output below in your bug report. * ************************************************************ PATH=/home/cloud-user/.local/bin:/home/cloud-user/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin XDG_RUNTIME_DIR=/run/user/1000 SELinux: Enforcing guestfs_get_append: (null) guestfs_get_autosync: 1 guestfs_get_backend: libvirt guestfs_get_backend_settings: [] guestfs_get_cachedir: /var/tmp guestfs_get_hv: /usr/libexec/qemu-kvm guestfs_get_memsize: 1280 guestfs_get_network: 0 guestfs_get_path: /usr/lib64/guestfs guestfs_get_pgroup: 0 guestfs_get_program: libguestfs-test-tool guestfs_get_recovery_proc: 1 guestfs_get_smp: 1 guestfs_get_sockdir: /run/user/1000 guestfs_get_tmpdir: /tmp guestfs_get_trace: 0 guestfs_get_verbose: 1 host_cpu: x86_64 Launching appliance, timeout set to 600 seconds. 
libguestfs: launch: program=libguestfs-test-tool libguestfs: launch: version=1.50.1rhel=9,release=4.el9,libvirt libguestfs: launch: backend registered: direct libguestfs: launch: backend registered: libvirt libguestfs: launch: backend=libvirt libguestfs: launch: tmpdir=/tmp/libguestfsBkGgeK libguestfs: launch: umask=0022 libguestfs: launch: euid=1000 libguestfs: libvirt version = 9003000 (9.3.0) libguestfs: guest random name = guestfs-g04zn61ghazs33su libguestfs: connect to libvirt libguestfs: opening libvirt handle: URI = qemu:///session, auth = default+wrapper, flags = 0 libguestfs: successfully opened libvirt handle: conn = 0x55920ba11030 libguestfs: qemu version (reported by libvirt) = 8000000 (8.0.0) libguestfs: get libvirt capabilities libguestfs: parsing capabilities XML libguestfs: parsing domcapabilities XML libguestfs: build appliance libguestfs: begin building supermin appliance libguestfs: run supermin libguestfs: command: run: /usr/bin/supermin libguestfs: command: run: \ --build libguestfs: command: run: \ --verbose libguestfs: command: run: \ --if-newer libguestfs: command: run: \ --lock /var/tmp/.guestfs-1000/lock libguestfs: command: run: \ --copy-kernel libguestfs: command: run: \ -f ext2 libguestfs: command: run: \ --host-cpu x86_64 libguestfs: command: run: \ /usr/lib64/guestfs/supermin.d libguestfs: command: run: \ -o /var/tmp/.guestfs-1000/appliance.d supermin: version: 5.3.3 supermin: rpm: detected RPM version 4.16 supermin: rpm: detected RPM architecture x86_64 supermin: package handler: fedora/rpm supermin: acquiring lock on /var/tmp/.guestfs-1000/lock supermin: build: /usr/lib64/guestfs/supermin.d supermin: reading the supermin appliance supermin: build: visiting /usr/lib64/guestfs/supermin.d/base.tar.gz type gzip base image (tar) supermin: build: visiting /usr/lib64/guestfs/supermin.d/daemon.tar.gz type gzip base image (tar) supermin: build: visiting /usr/lib64/guestfs/supermin.d/excludefiles type uncompressed excludefiles supermin: 
build: visiting /usr/lib64/guestfs/supermin.d/hostfiles type uncompressed hostfiles supermin: build: visiting /usr/lib64/guestfs/supermin.d/init.tar.gz type gzip base image (tar) supermin: build: visiting /usr/lib64/guestfs/supermin.d/packages type uncompressed packages supermin: build: visiting /usr/lib64/guestfs/supermin.d/udev-rules.tar.gz type gzip base image (tar) supermin: mapping package names to installed packages supermin: resolving full list of package dependencies supermin: build: 189 packages, including dependencies supermin: build: 32323 files supermin: build: 8465 files, after matching excludefiles supermin: build: 8476 files, after adding hostfiles supermin: build: 8462 files, after removing unreadable files supermin: build: 8487 files, after munging supermin: kernel: looking for kernel using environment variables ... supermin: kernel: looking for kernels in /lib/modules/*/vmlinuz ... supermin: kernel: picked vmlinuz /lib/modules/5.14.0-316.el9.x86_64/vmlinuz supermin: kernel: kernel_version 5.14.0-316.el9.x86_64 supermin: kernel: modpath /lib/modules/5.14.0-316.el9.x86_64 supermin: ext2: creating empty ext2 filesystem '/var/tmp/.guestfs-1000/appliance.d.t2lwt5zk/root' supermin: ext2: populating from base image supermin: ext2: copying files from host filesystem supermin: warning: /usr/libexec/utempter/utempter: Permission denied (ignored) Some distro files are not public readable, so supermin cannot copy them into the appliance. This is a problem with your Linux distro. Please ask your distro to stop doing pointless security by obscurity. You can ignore these warnings. You *do not* need to use sudo. 
supermin: warning: /usr/sbin/unix_update: Permission denied (ignored) supermin: warning: /var/lib/systemd/random-seed: Permission denied (ignored) supermin: ext2: copying kernel modules supermin: warning: /lib/modules/5.14.0-316.el9.x86_64/System.map: Permission denied (ignored) supermin: ext2: creating minimal initrd '/var/tmp/.guestfs-1000/appliance.d.t2lwt5zk/initrd' supermin: ext2: wrote 38 modules to minimal initrd supermin: renaming /var/tmp/.guestfs-1000/appliance.d.t2lwt5zk to /var/tmp/.guestfs-1000/appliance.d libguestfs: finished building supermin appliance libguestfs: command: run: qemu-img --help | grep -sqE -- '\binfo\b.*-U\b' libguestfs: command: run: qemu-img libguestfs: command: run: \ info libguestfs: command: run: \ -U libguestfs: command: run: \ --output json libguestfs: command: run: \ /var/tmp/.guestfs-1000/appliance.d/root libguestfs: parse_json: qemu-img info JSON output:\n{\n "children": [\n {\n "name": "file",\n "info": {\n "children": [\n ],\n "virtual-size": 4294967296,\n "filename": "/var/tmp/.guestfs-1000/appliance.d/root",\n "format": "file",\n "actual-size": 298606592,\n "format-specific": {\n "type": "file",\n "data": {\n }\n },\n "dirty-flag": false\n }\n }\n ],\n "virtual-size": 4294967296,\n "filename": "/var/tmp/.guestfs-1000/appliance.d/root",\n "format": "raw",\n "actual-size": 298606592,\n "dirty-flag": false\n}\n\n libguestfs: command: run: qemu-img libguestfs: command: run: \ create libguestfs: command: run: \ -f qcow2 libguestfs: command: run: \ -o backing_file=/var/tmp/.guestfs-1000/appliance.d/root,backing_fmt=raw libguestfs: command: run: \ /tmp/libguestfsBkGgeK/overlay2.qcow2 Formatting '/tmp/libguestfsBkGgeK/overlay2.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=4294967296 backing_file=/var/tmp/.guestfs-1000/appliance.d/root backing_fmt=raw lazy_refcounts=off refcount_bits=16 libguestfs: create libvirt XML libguestfs: libvirt XML:\n<?xml version="1.0"?>\n<domain type="kvm" 
xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0">\n <name>guestfs-g04zn61ghazs33su</name>\n <memory unit="MiB">1280</memory>\n <currentMemory unit="MiB">1280</currentMemory>\n <cpu mode="maximum">\n <feature policy="disable" name="la57"/>\n </cpu>\n <vcpu>1</vcpu>\n <clock offset="utc">\n <timer name="rtc" tickpolicy="catchup"/>\n <timer name="pit" tickpolicy="delay"/>\n <timer name="hpet" present="no"/>\n </clock>\n <os>\n <type machine="q35">hvm</type>\n <kernel>/var/tmp/.guestfs-1000/appliance.d/kernel</kernel>\n <initrd>/var/tmp/.guestfs-1000/appliance.d/initrd</initrd>\n <cmdline>panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=UUID=23e8de0a-4a72-4ffc-b711-b66d736f097d selinux=0 guestfs_verbose=1 TERM=xterm-256color</cmdline>\n <bios useserial="yes"/>\n </os>\n <on_reboot>destroy</on_reboot>\n <devices>\n <rng model="virtio">\n <backend model="random">/dev/urandom</backend>\n </rng>\n <controller type="scsi" index="0" model="virtio-scsi"/>\n <disk device="disk" type="file">\n <source file="/tmp/libguestfsBkGgeK/scratch1.img"/>\n <target dev="sda" bus="scsi"/>\n <driver name="qemu" type="raw" cache="unsafe"/>\n <address type="drive" controller="0" bus="0" target="0" unit="0"/>\n </disk>\n <disk type="file" device="disk">\n <source file="/tmp/libguestfsBkGgeK/overlay2.qcow2"/>\n <target dev="sdb" bus="scsi"/>\n <driver name="qemu" type="qcow2" cache="unsafe"/>\n <address type="drive" controller="0" bus="0" target="1" unit="0"/>\n </disk>\n <serial type="unix">\n <source mode="connect" path="/run/user/1000/libguestfsc6FGM1/console.sock"/>\n <target port="0"/>\n </serial>\n <channel type="unix">\n <source mode="connect" path="/run/user/1000/libguestfsc6FGM1/guestfsd.sock"/>\n <target type="virtio" name="org.libguestfs.channel.0"/>\n </channel>\n <controller type="usb" model="none"/>\n <memballoon model="none"/>\n 
</devices>\n <qemu:commandline>\n <qemu:env name="TMPDIR" value="/var/tmp"/>\n </qemu:commandline>\n</domain>\n libguestfs: command: run: ls libguestfs: command: run: \ -a libguestfs: command: run: \ -l libguestfs: command: run: \ -R libguestfs: command: run: \ -Z /var/tmp/.guestfs-1000 libguestfs: /var/tmp/.guestfs-1000: libguestfs: total 4 libguestfs: drwxr-xr-x. 3 cloud-user cloud-user unconfined_u:object_r:user_tmp_t:s0 37 May 25 04:12 . libguestfs: drwxrwxrwt. 8 root root system_u:object_r:tmp_t:s0 4096 May 25 04:12 .. libguestfs: drwxr-xr-x. 2 cloud-user cloud-user unconfined_u:object_r:user_tmp_t:s0 46 May 25 04:12 appliance.d libguestfs: -rw-r--r--. 1 cloud-user cloud-user unconfined_u:object_r:user_tmp_t:s0 0 May 25 04:12 lock libguestfs: libguestfs: /var/tmp/.guestfs-1000/appliance.d: libguestfs: total 311140 libguestfs: drwxr-xr-x. 2 cloud-user cloud-user unconfined_u:object_r:user_tmp_t:s0 46 May 25 04:12 . libguestfs: drwxr-xr-x. 3 cloud-user cloud-user unconfined_u:object_r:user_tmp_t:s0 37 May 25 04:12 .. libguestfs: -rw-r--r--. 1 cloud-user cloud-user unconfined_u:object_r:user_tmp_t:s0 7569408 May 25 04:12 initrd libguestfs: -rwxr-xr-x. 1 cloud-user cloud-user unconfined_u:object_r:user_tmp_t:s0 12429176 May 25 04:12 kernel libguestfs: -rw-r--r--. 1 cloud-user cloud-user unconfined_u:object_r:user_tmp_t:s0 4294967296 May 25 04:12 root libguestfs: command: run: ls libguestfs: command: run: \ -a libguestfs: command: run: \ -l libguestfs: command: run: \ -Z /run/user/1000/libguestfsc6FGM1 libguestfs: total 0 libguestfs: drwx------. 2 cloud-user cloud-user unconfined_u:object_r:user_tmp_t:s0 80 May 25 04:12 . libguestfs: drwx------. 5 cloud-user cloud-user unconfined_u:object_r:user_tmp_t:s0 120 May 25 04:12 .. libguestfs: srwxr-xr-x. 1 cloud-user cloud-user unconfined_u:object_r:user_tmp_t:s0 0 May 25 04:12 console.sock libguestfs: srwxr-xr-x. 
1 cloud-user cloud-user unconfined_u:object_r:user_tmp_t:s0 0 May 25 04:12 guestfsd.sock libguestfs: launch libvirt guest <------- hang $ cat ~/.cache/libvirt/qemu/log/guestfs-g04zn61ghazs33su.log 2023-05-25 08:12:45.952+0000: starting up libvirt version: 9.3.0, package: 2.el9 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2023-05-16-10:12:41, ), qemu version: 8.0.0qemu-kvm-8.0.0-3.el9, kernel: 5.14.0-316.el9.x86_64, hostname: yoguo-test LC_ALL=C \ PATH=/home/cloud-user/.local/bin:/home/cloud-user/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin \ HOME=/home/cloud-user \ USER=cloud-user \ LOGNAME=cloud-user \ XDG_CACHE_HOME=/home/cloud-user/.config/libvirt/qemu/lib/domain-1-guestfs-g04zn61ghazs/.cache \ TMPDIR=/var/tmp \ /usr/libexec/qemu-kvm \ -name guest=guestfs-g04zn61ghazs33su,debug-threads=on \ -S \ -object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/home/cloud-user/.config/libvirt/qemu/lib/domain-1-guestfs-g04zn61ghazs/master-key.aes"}' \ -machine pc-q35-rhel9.2.0,usb=off,dump-guest-core=off,memory-backend=pc.ram,graphics=off,hpet=off,acpi=off \ -accel kvm \ -cpu max,la57=off \ -m 1280 \ -object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":1342177280}' \ -overcommit mem-lock=off \ -smp 1,sockets=1,cores=1,threads=1 \ -uuid 5a9a35ba-e32f-450c-adfb-c32f44644822 \ -display none \ -no-user-config \ -nodefaults \ -chardev socket,id=charmonitor,fd=22,server=on,wait=off \ -mon chardev=charmonitor,id=monitor,mode=control \ -rtc base=utc,driftfix=slew \ -global kvm-pit.lost_tick_policy=delay \ -no-shutdown \ -boot strict=on \ -kernel /var/tmp/.guestfs-1000/appliance.d/kernel \ -initrd /var/tmp/.guestfs-1000/appliance.d/initrd \ -append 'panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=UUID=23e8de0a-4a72-4ffc-b711-b66d736f097d selinux=0 guestfs_verbose=1 TERM=xterm-256color' \ -device 
'{"driver":"pcie-root-port","port":8,"chassis":1,"id":"pci.1","bus":"pcie.0","multifunction":true,"addr":"0x1"}' \ -device '{"driver":"pcie-root-port","port":9,"chassis":2,"id":"pci.2","bus":"pcie.0","addr":"0x1.0x1"}' \ -device '{"driver":"pcie-root-port","port":10,"chassis":3,"id":"pci.3","bus":"pcie.0","addr":"0x1.0x2"}' \ -device '{"driver":"pcie-root-port","port":11,"chassis":4,"id":"pci.4","bus":"pcie.0","addr":"0x1.0x3"}' \ -device '{"driver":"virtio-scsi-pci","id":"scsi0","bus":"pci.1","addr":"0x0"}' \ -device '{"driver":"virtio-serial-pci","id":"virtio-serial0","bus":"pci.2","addr":"0x0"}' \ -blockdev '{"driver":"file","filename":"/tmp/libguestfsBkGgeK/scratch1.img","node-name":"libvirt-2-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"}' \ -blockdev '{"node-name":"libvirt-2-format","read-only":false,"cache":{"direct":false,"no-flush":true},"driver":"raw","file":"libvirt-2-storage"}' \ -device '{"driver":"scsi-hd","bus":"scsi0.0","channel":0,"scsi-id":0,"lun":0,"device_id":"drive-scsi0-0-0-0","drive":"libvirt-2-format","id":"scsi0-0-0-0","bootindex":1,"write-cache":"on"}' \ -blockdev '{"driver":"file","filename":"/var/tmp/.guestfs-1000/appliance.d/root","node-name":"libvirt-3-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"}' \ -blockdev '{"node-name":"libvirt-3-format","read-only":true,"cache":{"direct":false,"no-flush":true},"driver":"raw","file":"libvirt-3-storage"}' \ -blockdev '{"driver":"file","filename":"/tmp/libguestfsBkGgeK/overlay2.qcow2","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":true},"auto-read-only":true,"discard":"unmap"}' \ -blockdev '{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":false,"no-flush":true},"driver":"qcow2","file":"libvirt-1-storage","backing":"libvirt-3-format"}' \ -device 
'{"driver":"scsi-hd","bus":"scsi0.0","channel":0,"scsi-id":1,"lun":0,"device_id":"drive-scsi0-0-1-0","drive":"libvirt-1-format","id":"scsi0-0-1-0","write-cache":"on"}' \ -chardev socket,id=charserial0,path=/run/user/1000/libguestfsc6FGM1/console.sock \ -device '{"driver":"isa-serial","chardev":"charserial0","id":"serial0","index":0}' \ -chardev socket,id=charchannel0,path=/run/user/1000/libguestfsc6FGM1/guestfsd.sock \ -device '{"driver":"virtserialport","bus":"virtio-serial0.0","nr":1,"chardev":"charchannel0","id":"channel0","name":"org.libguestfs.channel.0"}' \ -audiodev '{"id":"audio1","driver":"none"}' \ -global ICH9-LPC.noreboot=off \ -watchdog-action reset \ -object '{"qom-type":"rng-random","id":"objrng0","filename":"/dev/urandom"}' \ -device '{"driver":"virtio-rng-pci","rng":"objrng0","id":"rng0","bus":"pci.3","addr":"0x0"}' \ -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \ -msg timestamp=on 2023-05-25 08:12:45.953+0000: Domain id=1 is tainted: custom-argv KVM: entry failed, hardware error 0x8 EAX=00000000 EBX=00000000 ECX=00000000 EDX=00080660 ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000 EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0000 00000000 0000ffff 00009300 CS =f000 ffff0000 0000ffff 00009b00 SS =0000 00000000 0000ffff 00009300 DS =0000 00000000 0000ffff 00009300 FS =0000 00000000 0000ffff 00009300 GS =0000 00000000 0000ffff 00009300 LDT=0000 00000000 0000ffff 00008200 TR =0000 00000000 0000ffff 00008b00 GDT= 00000000 0000ffff IDT= 00000000 0000ffff CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000 DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 DR6=00000000ffff0ff0 DR7=0000000000000400 EFER=0000000000000000 Code=04 66 41 eb f1 66 83 c9 ff 66 89 c8 66 5b 66 5e 66 5f 66 c3 <ea> 5b e0 00 f0 30 36 2f 32 33 2f 39 39 00 fc 00 ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? 
2023-05-25T08:13:15.885647Z qemu-kvm: terminating on signal 15 from pid 3101 (<unknown process>)
2023-05-25 08:13:16.086+0000: shutting down, reason=destroyed

$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 57 bits virtual
  Byte Order:            Little Endian
CPU(s):                  2
  On-line CPU(s) list:   0,1
Vendor ID:               GenuineIntel
  Model name:            Intel Xeon Processor (Icelake)
    CPU family:          6
    Model:               134
    Thread(s) per core:  1
    Core(s) per socket:  1
    Socket(s):           2
    Stepping:            0
    BogoMIPS:            4589.21
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves wbnoinvd arat avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq la57 md_clear arch_capabilities
Virtualization features:
  Virtualization:        VT-x
  Hypervisor vendor:     KVM
  Virtualization type:   full
Caches (sum of all):
  L1d:                   64 KiB (2 instances)
  L1i:                   64 KiB (2 instances)
  L2:                    8 MiB (2 instances)
  L3:                    32 MiB (2 instances)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0,1
Vulnerabilities:
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
  Retbleed:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced / Automatic IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected

Version-Release number of selected component (if applicable):
libguestfs-1.50.1-4.el9.x86_64
kernel-5.14.0-316.el9.x86_64
qemu-kvm-common-8.0.0-3.el9.x86_64
qemu-kvm-core-8.0.0-3.el9.x86_64
libvirt-libs-9.3.0-2.el9.x86_64

How reproducible:
100%

Steps:
1. Create a VM on OpenStack env with the latest rhel9.3 nightly compose
2. Run libguestfs-test-tool

Actual results:
As above

Expected results:
libguestfs-test-tool works.

Additional info:
1. No such issue with AMD EPYC-Rome Processor cpu model on OpenStack env
(In reply to YongkuiGuo from comment #0)
> Steps:
>
> 1. Create a VM on OpenStack env with the latest rhel9.3 nightly compose
> 2. Run libguestfs-test-tool
>
> [...]
>
> Expected results:
> libguestfs-test-tool works.

No, libguestfs has never been expected to work in L2, in a nested virtualization setup (when L1 is a KVM guest and L2, i.e. the libguestfs appliance, is also a KVM guest). Such attempts have always been defeated by non-deterministic nested virtualization bugs.

If you insist you can reassign to kernel/KVM. It's a problem that's potentially triggered by specifying "-cpu max" on the qemu cmdline that is invoked in L1, while the L0 CPU is Icelake.
Interestingly this is 16/32 bit code:

   0:   04 66                   add    al,0x66
   2:   41                      inc    ecx
   3:   eb f1                   jmp    0xfffffff6
   5:   66 83 c9 ff             or     cx,0xffff
   9:   66 89 c8                mov    ax,cx
   c:   66 5b                   pop    bx
   e:   66 5e                   pop    si
  10:   66 5f                   pop    di
  12:   66 c3                   retw
  14:   ea 5b e0 00 f0 30 36    jmp    0x3630:0xf000e05b
  1b:   2f                      das
  1c:   32 33                   xor    dh,BYTE PTR [ebx]
  1e:   2f                      das
  1f:   39 39                   cmp    DWORD PTR [ecx],edi
  21:   00 fc                   add    ah,bh

> No, libguestfs has never been expected to work in L2

My reading is that this would be an L1 guest, with the libguestfs appliance running as L2, which is something we do try to support (albeit, as you say, subject to various and multiple nested virtualization bugs).
(In reply to Richard W.M. Jones from comment #2)
> Interestingly this is 16/32 bit code:
>
>   14:   ea 5b e0 00 f0 30 36    jmp    0x3630:0xf000e05b

Hm. This raises vague memories.

According to the Intel SDM, this seems like "JMP ptr16:32", "Jump far, absolute, address given in operand"; valid in compat/legacy mode, invalid in 64-bit mode. This reminds me of early Linux boot code that performs the CPU mode switches, for entering long mode.

Something something about "unrestricted guest" support. This kind of jump had been very problematic in OVMF on KVM years ago (OVMF switches CPU modes), kept triggering KVM emulation failures. I don't remember the details.

Check "/sys/module/kvm_intel/parameters/unrestricted_guest" perhaps? If the processor lacks unrestricted guest support, I expect it will just not work; otherwise it should work fine. Seems like a pretty modern PCPU, so I'm unsure.
> Check "/sys/module/kvm_intel/parameters/unrestricted_guest" perhaps?

We don't have access to the L0 host (and are unlikely to be able to get it). However, in L1, where we run libguestfs, that file contains:

$ cat /sys/module/kvm_intel/parameters/unrestricted_guest
Y
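For anyone repeating this check, here is a minimal sketch that reads several kvm_intel module parameters in one go. Only unrestricted_guest comes from this thread; the other names in the example list (nested, vnmi, dump_invalid_vmcs) and the helper name are my own picks for illustration, and any parameter that is missing or unreadable on a given kernel is reported as None.

```python
from pathlib import Path

# Default sysfs location of the kvm_intel module parameters.
PARAM_DIR = Path("/sys/module/kvm_intel/parameters")

def read_kvm_intel_params(names, param_dir=PARAM_DIR):
    """Return {name: value or None} for each requested parameter."""
    values = {}
    for name in names:
        try:
            values[name] = (Path(param_dir) / name).read_text().strip()
        except OSError:
            # Module not loaded, parameter absent, or no permission.
            values[name] = None
    return values

if __name__ == "__main__":
    # Hypothetical selection of parameters worth looking at.
    wanted = ["unrestricted_guest", "nested", "vnmi", "dump_invalid_vmcs"]
    for name, value in read_kvm_intel_params(wanted).items():
        print(f"{name} = {value}")
```

Run inside L1, this reports what L1's KVM believes it can offer to L2; it says nothing about how L0 is configured.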
Created attachment 1966861: qemu command

Further information: TCG works (not surprising), which is another indication that the problem lies with nested virt. The qemu command is attached.
Confusing:

(In reply to YongkuiGuo from comment #0)
> EIP=0000fff0 [...]
> [...]
> CS =f000 ffff0000 0000ffff 00009b00

this points to the reset vector where the BIOS is supposed to start... I wouldn't expect a far jump there (to a different code segment), before setting up segment descriptors. I'm unsure how reliable this register dump from QEMU is.
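The arithmetic behind "this points to the reset vector" can be sketched as follows. It uses only the CS base and EIP values from the register dump; the function name is illustrative.

```python
def linear_address(segment_base, offset):
    """Linear address = cached segment base + offset, truncated to 32 bits."""
    return (segment_base + offset) & 0xFFFFFFFF

# Values from the QEMU register dump: CS base=ffff0000, EIP=0000fff0.
reset_vector = linear_address(0xFFFF0000, 0xFFF0)
print(hex(reset_vector))  # 0xfffffff0, i.e. 16 bytes below 4 GiB
```

Note that the base comes from the cached descriptor in the dump (ffff0000), not from the selector f000 shifted left by four, which is why the result lands just below 4 GiB rather than just below 1 MiB.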
Also same error when running qemu directly: /usr/libexec/qemu-kvm \ -global virtio-blk-pci.scsi=off \ -no-user-config \ -nodefaults \ -display none \ -machine q35,accel=kvm:tcg,graphics=off \ -cpu max,la57=off \ -m 1280 \ -no-reboot \ -rtc driftfix=slew \ -no-hpet \ -global kvm-pit.lost_tick_policy=discard \ -kernel /var/tmp/.guestfs-1000/appliance.d/kernel \ -initrd /var/tmp/.guestfs-1000/appliance.d/initrd \ -object rng-random,filename=/dev/urandom,id=rng0 \ -device virtio-rng-pci,rng=rng0 \ -device virtio-scsi-pci,id=scsi \ -drive file=/tmp/libguestfsfGeYfb/scratch1.img,cache=unsafe,format=raw,id=hd0,if=none \ -device scsi-hd,drive=hd0 \ -drive file=/var/tmp/.guestfs-1000/appliance.d/root,snapshot=on,id=appliance,cache=unsafe,if=none \ -device scsi-hd,drive=appliance \ -device virtio-serial-pci \ -serial stdio \ -chardev socket,path=/run/user/1000/libguestfsHuYYGU/guestfsd.sock,id=channel0 \ -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \ -append "panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=UUID=23e8de0a-4a72-4ffc-b711-b66d736f097d selinux=0 guestfs_verbose=1 TERM=xterm-256color" qemu-kvm: -no-hpet: warning: -no-hpet is deprecated, use '-machine hpet=off' instead KVM: entry failed, hardware error 0x8 EAX=00000000 EBX=00000000 ECX=00000000 EDX=00080660 ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000 EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0000 00000000 0000ffff 00009300 CS =f000 ffff0000 0000ffff 00009b00 SS =0000 00000000 0000ffff 00009300 DS =0000 00000000 0000ffff 00009300 FS =0000 00000000 0000ffff 00009300 GS =0000 00000000 0000ffff 00009300 LDT=0000 00000000 0000ffff 00008200 TR =0000 00000000 0000ffff 00008b00 GDT= 00000000 0000ffff IDT= 00000000 0000ffff CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000 DR0=0000000000000000 DR1=0000000000000000 
DR2=0000000000000000 DR3=0000000000000000 DR6=00000000ffff0ff0 DR7=0000000000000400 EFER=0000000000000000 Code=04 66 41 eb f1 66 83 c9 ff 66 89 c8 66 5b 66 5e 66 5f 66 c3 <ea> 5b e0 00 f0 30 36 2f 32 33 2f 39 39 00 fc 00 ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
(In reply to Richard W.M. Jones from comment #2)
>   14:   ea 5b e0 00 f0 30 36    jmp    0x3630:0xf000e05b
>   1b:   2f                      das
>   1c:   32 33                   xor    dh,BYTE PTR [ebx]

Wait, this disassembly is incorrect. The debugger / disassembler assumed an incorrect CPU mode, or it's a plain disassembler bug. The JMP instruction is actually

  EA5BE000F0  jmp 0xf000:0xe05b

(i.e., ptr16:16, not ptr16:32), meaning a jump to 0xfe05b in real mode. And such a jump instruction is indeed valid as the first (and only) instruction that the reset vector points at.

The constant 0xe05b is present in SeaBIOS's "src/romlayout.S":

        ORG 0xe05b
entry_post:
        cmpl $0, %cs:HaveRunPost                // Check for resume/reboot
        jnz entry_resume
        ENTRY_INTO32 _cfunc32flat_handle_post   // Normal entry point

so that's the code we're trying to jump *to*, but the L2 domain crashes on that very first jump instruction, at the reset vector. IOW, the CS:EIP info in the QEMU register dump is actually correct.
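The corrected decoding can be checked mechanically. The sketch below decodes the five bytes at the <ea> marker as a real-mode JMP ptr16:16 and computes the linear target; the helper name is made up for this example, and it handles only this one encoding with a 16-bit operand size.

```python
import struct

def decode_jmp_ptr16_16(code):
    """Decode a real-mode far jump (opcode 0xEA, 16-bit operand size).

    Encoding: EA, 16-bit offset (little endian), 16-bit segment (little endian).
    Returns (segment, offset, linear_target).
    """
    if len(code) < 5 or code[0] != 0xEA:
        raise ValueError("not a JMP ptr16:16 instruction")
    offset, segment = struct.unpack_from("<HH", code, 1)
    # Real-mode address translation: segment * 16 + offset.
    return segment, offset, (segment << 4) + offset

# Bytes at the <ea> marker in the QEMU "Code=" dump.
seg, off, target = decode_jmp_ptr16_16(bytes.fromhex("ea5be000f0"))
print(f"jmp {seg:#06x}:{off:#06x} -> linear {target:#x}")
```

The trailing bytes 30 36 2f 32 33 2f 39 39 that the disassembler tried to interpret as instructions are not part of the jump.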
hardware error 0x8 seems to be EXIT_REASON_NMI_WINDOW.
Setting kvm_intel.dump_invalid_vmcs=1 (in L1): [18499.856522] VMCS 00000000b780e336, last attempted VM-entry on CPU 0 [18499.856526] *** Guest State *** [18499.856527] CR0: actual=0x0000000000000030, shadow=0x0000000060000010, gh_mask=fffffffffffffff7 [18499.856530] CR4: actual=0x0000000000002040, shadow=0x0000000000000000, gh_mask=fffffffffffef871 [18499.856531] CR3 = 0x0000000000000000 [18499.856534] PDPTR0 = 0x0000000000000000 PDPTR1 = 0x0000000000000000 [18499.856538] PDPTR2 = 0x0000000000000000 PDPTR3 = 0x0000000000000000 [18499.856539] RSP = 0x0000000000000000 RIP = 0x000000000000fff0 [18499.856540] RFLAGS=0x00000002 DR7 = 0x0000000000000400 [18499.856545] Sysenter RSP=0000000000000000 CS:RIP=0000:0000000000000000 [18499.856550] CS: sel=0xf000, attr=0x0009b, limit=0x0000ffff, base=0x00000000ffff0000 [18499.856557] DS: sel=0x0000, attr=0x00093, limit=0x0000ffff, base=0x0000000000000000 [18499.856562] SS: sel=0x0000, attr=0x00093, limit=0x0000ffff, base=0x0000000000000000 [18499.856568] ES: sel=0x0000, attr=0x00093, limit=0x0000ffff, base=0x0000000000000000 [18499.856574] FS: sel=0x0000, attr=0x00093, limit=0x0000ffff, base=0x0000000000000000 [18499.856579] GS: sel=0x0000, attr=0x00093, limit=0x0000ffff, base=0x0000000000000000 [18499.856582] GDTR: limit=0x0000ffff, base=0x0000000000000000 [18499.856594] LDTR: sel=0x0000, attr=0x00082, limit=0x0000ffff, base=0x0000000000000000 [18499.856598] IDTR: limit=0x0000ffff, base=0x0000000000000000 [18499.856603] TR: sel=0x0000, attr=0x0008b, limit=0x0000ffff, base=0x0000000000000000 [18499.856605] EFER= 0x0000000000000000 [18499.856607] PAT = 0x0007040600070406 [18499.856611] DebugCtl = 0x0000000000000000 DebugExceptions = 0x0000000000000000 [18499.856613] Interruptibility = 00000000 ActivityState = 00000000 [18499.856614] InterruptStatus = 0000 [18499.856617] *** Host State *** [18499.856620] RIP = 0xffffffffc0a0e863 RSP = 0xff2b812dc0947cb0 [18499.856627] CS=0010 SS=0018 DS=0000 ES=0000 FS=0000 GS=0000 TR=0040 
[18499.856629] FSBase=00007fe67bdff640 GSBase=ff1ce2653ba00000 TRBase=fffffe0000003000 [18499.856633] GDTBase=fffffe0000001000 IDTBase=fffffe0000000000 [18499.856637] CR0=0000000080050033 CR3=0000000103f2c003 CR4=0000000000773ef0 [18499.856642] Sysenter RSP=fffffe0000003000 CS:RIP=0010:ffffffffaa0015f0 [18499.856644] EFER= 0x0000000000000d01 [18499.856646] PAT = 0x0407050600070106 [18499.856647] *** Control State *** [18499.856648] CPUBased=0xb5a06dfa SecondaryExec=0x000033eb TertiaryExec=0x0000000000000000 [18499.856649] PinBased=0x000000ff EntryControls=0000d1ff ExitControls=002befff [18499.856652] ExceptionBitmap=00060042 PFECmask=00000000 PFECmatch=00000000 [18499.856653] VMEntry: intr_info=00000000 errcode=00000000 ilen=00000000 [18499.856654] VMExit: intr_info=00000000 errcode=00000000 ilen=00000000 [18499.856654] reason=00000000 qualification=0000000000000000 [18499.856655] IDTVectoring: info=00000000 errcode=00000000 [18499.856657] TSC Offset = 0xffffd96342ad3040 [18499.856657] SVI|RVI = 00|00 TPR Threshold = 0x00 [18499.856660] APIC-access addr = 0x000000010776d000 virt-APIC addr = 0x0000000108c43000 [18499.856664] PostedIntrVec = 0xf2 [18499.856665] EPT pointer = 0x000000000122305e [18499.856667] Virtual processor ID = 0x0001 [18503.601139] VMCS 00000000cd5f9037, last attempted VM-entry on CPU 0 [18503.601144] *** Guest State *** [18503.601145] CR0: actual=0x0000000000000030, shadow=0x0000000060000010, gh_mask=fffffffffffffff7 [18503.601148] CR4: actual=0x0000000000002040, shadow=0x0000000000000000, gh_mask=fffffffffffef871 [18503.601149] CR3 = 0x0000000000000000 [18503.601152] PDPTR0 = 0x0000000000000000 PDPTR1 = 0x0000000000000000 [18503.601155] PDPTR2 = 0x0000000000000000 PDPTR3 = 0x0000000000000000 [18503.601155] RSP = 0x0000000000000000 RIP = 0x000000000000fff0 [18503.601157] RFLAGS=0x00000002 DR7 = 0x0000000000000400 [18503.601162] Sysenter RSP=0000000000000000 CS:RIP=0000:0000000000000000 [18503.601166] CS: sel=0xf000, attr=0x0009b, 
limit=0x0000ffff, base=0x00000000ffff0000 [18503.601173] DS: sel=0x0000, attr=0x00093, limit=0x0000ffff, base=0x0000000000000000 [18503.601178] SS: sel=0x0000, attr=0x00093, limit=0x0000ffff, base=0x0000000000000000 [18503.601184] ES: sel=0x0000, attr=0x00093, limit=0x0000ffff, base=0x0000000000000000 [18503.601189] FS: sel=0x0000, attr=0x00093, limit=0x0000ffff, base=0x0000000000000000 [18503.601195] GS: sel=0x0000, attr=0x00093, limit=0x0000ffff, base=0x0000000000000000 [18503.601198] GDTR: limit=0x0000ffff, base=0x0000000000000000 [18503.601204] LDTR: sel=0x0000, attr=0x00082, limit=0x0000ffff, base=0x0000000000000000 [18503.601207] IDTR: limit=0x0000ffff, base=0x0000000000000000 [18503.601213] TR: sel=0x0000, attr=0x0008b, limit=0x0000ffff, base=0x0000000000000000 [18503.601215] EFER= 0x0000000000000000 [18503.601216] PAT = 0x0007040600070406 [18503.601219] DebugCtl = 0x0000000000000000 DebugExceptions = 0x0000000000000000 [18503.601221] Interruptibility = 00000000 ActivityState = 00000000 [18503.601222] InterruptStatus = 0000 [18503.601225] *** Host State *** [18503.601228] RIP = 0xffffffffc0a0e863 RSP = 0xff2b812dc0c17ca0 [18503.601235] CS=0010 SS=0018 DS=0000 ES=0000 FS=0000 GS=0000 TR=0040 [18503.601237] FSBase=00007f93a6930640 GSBase=ff1ce2653ba00000 TRBase=fffffe0000003000 [18503.601240] GDTBase=fffffe0000001000 IDTBase=fffffe0000000000 [18503.601245] CR0=0000000080050033 CR3=0000000001316006 CR4=0000000000773ef0 [18503.601249] Sysenter RSP=fffffe0000003000 CS:RIP=0010:ffffffffaa0015f0 [18503.601251] EFER= 0x0000000000000d01 [18503.601253] PAT = 0x0407050600070106 [18503.601255] *** Control State *** [18503.601255] CPUBased=0xb5a06dfa SecondaryExec=0x000033eb TertiaryExec=0x0000000000000000 [18503.601256] PinBased=0x000000ff EntryControls=0000d1ff ExitControls=002befff [18503.601260] ExceptionBitmap=00060042 PFECmask=00000000 PFECmatch=00000000 [18503.601260] VMEntry: intr_info=00000000 errcode=00000000 ilen=00000000 [18503.601261] VMExit: 
intr_info=00000000 errcode=00000000 ilen=00000000 [18503.601262] reason=00000000 qualification=0000000000000000 [18503.601262] IDTVectoring: info=00000000 errcode=00000000 [18503.601264] TSC Offset = 0xffffd9612628d62e [18503.601265] SVI|RVI = 00|00 TPR Threshold = 0x00 [18503.601267] APIC-access addr = 0x000000000771c000 virt-APIC addr = 0x000000000114f000 [18503.601271] PostedIntrVec = 0xf2 [18503.601273] EPT pointer = 0x000000000117d05e [18503.601275] Virtual processor ID = 0x0002
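As a reading aid for the dumps above, here is a small sketch that names the CR0 flag bits in the "actual" and "shadow" values. The bit positions are the architectural ones from the Intel SDM; the helper itself is only illustrative.

```python
# Architectural CR0 flag bits (Intel SDM, Vol. 3).
CR0_BITS = {
    0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
    16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG",
}

def decode_cr0(value):
    """Return the names of the CR0 flags set in value, lowest bit first."""
    return [name for bit, name in sorted(CR0_BITS.items()) if value & (1 << bit)]

print("shadow:", decode_cr0(0x60000010))
print("actual:", decode_cr0(0x00000030))
```

In both values PE and PG are clear, so the L2 vCPU is in real mode with paging off, which matches a CPU sitting at the reset vector.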
The Intel SDM writes in appendix "VMX BASIC EXIT REASONS" (emphasis mine):

> Every VM exit writes a 32-bit exit reason to the VMCS (see Section
> 21.9.1). *Certain VM-entry failures also do this* (see Section 23.7)
> [...]
>
> 8 -- NMI window. At the beginning of an instruction, there was no
> virtual-NMI blocking; events were not blocked by MOV SS; and the
> "NMI-window exiting" VM-execution control was 1.

Can we perhaps disable the "vnmi" kvm_intel module parameter in L1? Or else (I assume: equivalently) remove the vmx-vnmi CPU model feature on the QEMU command line?

... I've tried the former, it doesn't help.
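In the 32-bit exit reason field the SDM describes, bits 15:0 hold the basic exit reason and bit 31 flags a VM-entry failure. A minimal sketch that splits a raw value along those lines follows. Whether the 0x8 printed by QEMU is an exit reason in this format at all is an assumption here; the thread does not establish it.

```python
def split_exit_reason(raw):
    """Split a 32-bit VMX exit reason into (basic_reason, vm_entry_failure)."""
    basic = raw & 0xFFFF                   # bits 15:0: basic exit reason
    entry_failure = bool((raw >> 31) & 1)  # bit 31: VM-entry failure
    return basic, entry_failure

basic, failed = split_exit_reason(0x8)
print(f"basic reason = {basic}, VM-entry failure bit = {failed}")
```

For 0x8 the VM-entry failure bit is clear, which may be worth keeping in mind when interpreting the number.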
Emanuele, any guesses on this one? It's a long shot, but could this be related to bug 2127128, even though the error there was related to invalid control fields?
(In reply to Bandan Das from comment #14) > Emanuele, any guesses on this one ? A long shot but could this be related to > bug 2127128 even > though the error there was related to invalid control fields ? That bug leads to some other bugs that appear more fitting: bug 2103118, bug 2105408, bug 2099216. What I find relatively annoying is that bug 2103118 -- which was originally encountered via virt-customize, and which produced identical symptoms, i.e., crashing at the SeaBIOS reset vector: <ea> 5b e0 00 f0 -- had never been *root-caused*. We only said "broken in 8.2, functional in 8.6, let's go shopping". We never located the actual fix! In such cases a reverse bisection is recommended, to see what precisely fixed the problem. If we had done that there, for RHEL-8, now we wouldn't be standing here, with our trousers around our ankles.
I agree, and added a comment to the other bug to alert people that the bug has probably not been fixed.
Is the latest RHEL 9.3 nightly compose the L1 guest VM? What about the host version? Does the workaround listed in https://bugzilla.redhat.com/show_bug.cgi?id=2103118#c50 work?
In L1: host kernel = 5.14.0-316.el9.x86_64 qemu = 8.0.0-3.el9 seabios = 1.16.1-1.el9 I don't think we have any information about the L0 host, but maybe Yongkui has access. The workaround probably only applies to OpenStack, but we're not using OpenStack to launch the L2 VM. The L2 VM is started with -cpu max so it'll have all features potentially enabled. If there is a particular CPU feature implicated then we could try modifying the -cpu parameter if you can give us guidance.
(In reply to Qinghua Cheng from comment #17) > Latest rhel9.3 nightly compose is the L1 guest VM ? Yes. > How about the host version? I created this VM on our OpenStack env, and I have no permission to access the L0 host.
OpenStack  Flavor                    L1 guest CPU model                    L1 guest         libguestfs-test-tool
-----------------------------------------------------------------------------------------------------------------
rhos-d     ci.standard.medium        Intel Xeon Processor (Icelake)        RHEL9.3 nightly  failed
rhos-d     ci.nested.virt.m1.medium  Intel Xeon Processor (Skylake, IBRS)  RHEL9.3 nightly  passed
rhos-01    ci.standard.medium        AMD EPYC-Rome Processor               RHEL9.3 nightly  passed
-----------------------------------------------------------------------------------------------------------------

1. In short, libguestfs-test-tool fails when a ci.xxx flavor (excluding ci.nested.xxx) is used and the L1 guest CPU model is Icelake in the rhos-d OpenStack env.
2. According to these docs [1][2], nested virtualization is enabled on ci.nested.xxx flavors but disabled on ci.xxx flavors in rhos-d.

[1] https://docs.engineering.redhat.com/pages/viewpage.action?spaceKey=KB&title=PSI+OpenStack+Onboarding#PSIOpenStackOnboarding-NestedVirtualizationinRHOS-D
[2] https://docs.engineering.redhat.com/display/KB/PSI+OpenStack+Use+Cases+-+Aggregates
With my pretty limited knowledge of CPU models, here's what I see as a summary so far.

The hardware error seems to be caused by a VM-entry failure: 0x8 (VM_INSTRUCTION_ERROR) means "VM entry with invalid host-state field(s)" (according to SDM 30.4). It got trapped in L0, then delivered to L1 KVM with the same error.

So some host state seems wrong in L2's VMCS when the hardware checks it, probably something that falls under SDM "26.2 CHECKS ON VMX CONTROLS AND HOST-STATE AREA".

I can try to read more of the VMCS dump in comment 12 later this week (it seems to be quite useful). Before that, I am curious about two things:

1. It seems that we're not able to access the host (so far I don't think anything has been mentioned about the host kernel version). Does that mean that even if we know it's a host KVM bug and we know a fix, it won't be fixed (because it would need an upgrade of the host KVM)?

2. Can we try a workaround similar to the one mentioned in comment 17 by Qinghua? Is "-cpu max" required? Given comment 21, where we do have a PASS case, I'd give it a shot starting with "-cpu Skylake-Server-IBRS".
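For readers decoding the error number, here is a minimal C sketch of the two VM-instruction error numbers that come up in this thread: 7 (invalid control fields, as in bug 2127128) and 8 (invalid host-state fields, as here). The enum constant names follow the Linux KVM naming used later in this bug; the helper functions `describe_vm_instruction_error` and `is_host_state_entry_failure` are hypothetical names introduced only for illustration, and the SDM defines more error numbers than the two listed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Two of the VM-instruction error numbers defined by the Intel SDM. */
enum vm_instruction_error {
    VMXERR_ENTRY_INVALID_CONTROL_FIELD    = 7,
    VMXERR_ENTRY_INVALID_HOST_STATE_FIELD = 8,
};

/* Hypothetical helper: map an error number to the SDM's wording. */
static const char *describe_vm_instruction_error(unsigned int err)
{
    switch (err) {
    case VMXERR_ENTRY_INVALID_CONTROL_FIELD:
        return "VM entry with invalid control field(s)";
    case VMXERR_ENTRY_INVALID_HOST_STATE_FIELD:
        return "VM entry with invalid host-state field(s)";
    default:
        return "other (not covered by this sketch)";
    }
}

/* Hypothetical helper: is this the failure class seen in this bug? */
static bool is_host_state_entry_failure(unsigned int err)
{
    return err == VMXERR_ENTRY_INVALID_HOST_STATE_FIELD;
}

/* Usage example: 0x8 is the value reported in this bug. */
static void example_usage(void)
{
    assert(is_host_state_entry_failure(0x8));
    assert(strcmp(describe_vm_instruction_error(0x8),
                  "VM entry with invalid host-state field(s)") == 0);
}
```

Calling `example_usage()` from a test program exercises both helpers with the value reported here.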
Also adding Vitaly (who sometimes has dealt with nested issues) to CC.
Getting information about L0 is crucial: most nested bugs end up being L0 KVM bugs, L1 hypervisor is usually innocent...
(In reply to Vitaly Kuznetsov from comment #25) > Getting information about L0 is crucial: most nested bugs end up being L0 > KVM bugs, L1 hypervisor is usually innocent... I am assuming this is something that reporter can provide. Hence, setting the needinfo.
(In reply to Nitesh Narayan Lal from comment #26) > (In reply to Vitaly Kuznetsov from comment #25) > > Getting information about L0 is crucial: most nested bugs end up being L0 > > KVM bugs, L1 hypervisor is usually innocent... > > I am assuming this is something that reporter can provide. Hence, setting > the needinfo. I cannot get any info about the L0 host (see comment 19). Currently, I use a ci.nested.xxx flavor or create the VM on rhos-01 as a workaround (see comment 21).
(In reply to Nitesh Narayan Lal from comment #29)
> (In reply to yduan from comment #28)
> > (In reply to Peter Xu from comment #23)
> > > With my pretty limited knowledge on cpu models... here's what I see as a
> > > summary so far.
> > >
> > > Hardware error seems to be caused by vmenter failure, which means 0x8
> > > (VM_INSTRUCTION_ERROR) -> VM entry with invalid host-state field(s)
> > > (according to SDM 30.4). It got trapped in L0 then delivered to L1 kvm with
> > > the same error.
> > >
> > > So some host state seems wrong in L2's VMCS when hardware checks, probably
> > > something falls into SDM "26.2 CHECKS ON VMX CONTROLS AND HOST-STATE AREA".
> > >
> > > I can try to read more in the vmsd dump in comment 12 in the latter days
> > > this week (which seems to be quite useful), Before that I am curious on two
> > > things:
> > >
> > > 1. It seems that we're not be able to access the host (even so far I don't
> > > think anything mentioned on the host kernel version), then does it mean that
> > > even if we know it's a host kvm bug and we know a fix, it won't be fixed
> > > (because it'll need an upgrade of host kvm)?
> > >
> >
> > IIUC, QE has no permission to touch the underlying host in PSI(PnT Shared
> > Infrastructure) OpenStack environment.
> >
> > Additional info:
> > The test environment, Production Cloud-D - RHOS 16.1[1], is running on RHEL
> > 8.2 [2] which is same as Red
> > Hat https://bugzilla.redhat.com/show_bug.cgi?id=2103118#c30.
>
> Hi, Since the 8.2 kernel is pretty old and we need L0 information for
> further investigation.
> Can you please reproduce this issue on the latest rhel9/rhel8 host/guest
> combination?
>

The problem, as mentioned above, is that this is RHOS 16.1 and it's tied to the 8.2 kernel. To fix this for RHOS, we would have to fix the 8.2 kernel (which brings us to your comment above that 8.2 is pretty old, and we probably don't want to be pushing nested virtualization fixes to it).
Coming back to this bug, I was able to reproduce it fairly easily with an Icelake system running 8.2 for L0 and RHEL 9 for L1. Unlike bug 2127128 that I linked above, it's not the hardware that's complaining. Rather, L0 KVM is crafting a message similar to what the host would do in such cases.

In my test, the error comes from L0 finding a mismatch between L1's CR4 (vmcs12->cr4) and the acceptable values that it has kept for cr4_fixed1, specifically CR4.LA57 (bit 12). I am pretty sure this could happen for any other CPUID bits that the 8.2 kernel doesn't know about.

to_vmx(vcpu)->nested.msrs.cr4_fixed1 is filled in by nested_vmx_cr_fixed1_bits_update(), which checks whether the guest CPUID has the bit set and whether the host supports it too. All would have been fine if we had this one line in the 8.2 kernel:

    cr4_fixed1_update(X86_CR4_LA57, ecx, bit(X86_FEATURE_LA57));

What this would have done is set LA57 in nested.msrs.cr4_fixed1 if the host supports 5-level page tables and the feature is also set in the guest's CPU model. (CR4_FIXED1 basically says that a bit in CR4 can be set to 1 if it's set in CR4_FIXED1, but if it's 0 there, setting that bit in CR4 would cause an exception.)

Anyway, without that one-liner, to_vmx(vcpu)->nested.msrs.cr4_fixed1[12] = 0. When L1 tries to read MSR_IA32_VMX_CR4_FIXED1, vmx_get_vmx_msr() returns the real hardware value of the MSR. (Another issue in the 8.2 kernel! The 8.2 kernel returns vmx_get_vmx_msr(&vmcs_config.nested, msr->index, &msr->data), whereas newer kernels return vmx_get_vmx_msr(&vmx->nested.msrs...).) Assuming the host has 5-level page tables, L1 would therefore want to set that bit in CR4.
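The mismatch described above can be illustrated with a small, self-contained C sketch. This is not the KVM code: `cr_bits_allowed` and `guest_cr4_fixed1` are hypothetical helpers written for this illustration, the base FIXED1 mask 0x773fff is an assumed value, and FIXED0 is assumed to require only CR4.VMXE. The CR4 value 0x773ef0 is the one shown under "Host State" in the VMCS dump earlier in this report, and it does have bit 12 (LA57) set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR4_LA57 (UINT64_C(1) << 12)   /* 5-level paging enable */
#define CR4_VMXE (UINT64_C(1) << 13)   /* VMX enable */

/*
 * A control-register value is acceptable when every bit required by
 * fixed0 is set and no bit outside fixed1 is set.
 */
static bool cr_bits_allowed(uint64_t val, uint64_t fixed0, uint64_t fixed1)
{
    assert((fixed0 & ~fixed1) == 0);   /* fixed0 must be a subset of fixed1 */
    return (val & fixed0) == fixed0 && (val & ~fixed1) == 0;
}

/*
 * Hypothetical model of how L0 derives the CR4 "allowed 1" mask for L1:
 * LA57 is only allowed if the kernel propagates that CPUID bit at all
 * and the guest CPU model has it.
 */
static uint64_t guest_cr4_fixed1(uint64_t base_fixed1,
                                 bool kernel_propagates_la57,
                                 bool guest_cpuid_has_la57)
{
    uint64_t fixed1 = base_fixed1 & ~CR4_LA57;

    if (kernel_propagates_la57 && guest_cpuid_has_la57)
        fixed1 |= CR4_LA57;
    return fixed1;
}

static void example_usage(void)
{
    const uint64_t host_cr4 = UINT64_C(0x773ef0); /* from the VMCS dump */
    const uint64_t base     = UINT64_C(0x773fff); /* assumed, illustrative */

    assert(host_cr4 & CR4_LA57);
    /* Kernel without the one-liner: LA57 is never allowed, check fails. */
    assert(!cr_bits_allowed(host_cr4, CR4_VMXE,
                            guest_cr4_fixed1(base, false, true)));
    /* Kernel with the one-liner: the same CR4 value is accepted. */
    assert(cr_bits_allowed(host_cr4, CR4_VMXE,
                           guest_cr4_fixed1(base, true, true)));
}
```

The point of the sketch is only that a single missing bit in the "allowed 1" mask is enough to reject an otherwise valid CR4 value; the real masks depend on the host hardware, which is unknown here.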
So, in the end, we would hit (on L0):

    if (CC(!nested_host_cr4_valid(vcpu, vmcs12->host_cr4)))
            return -EINVAL;
              |
              |
             \ /
    if (nested_vmx_check_host_state(vcpu, vmcs12))
            return nested_vmx_fail(vcpu, VMXERR_ENTRY_INVALID_HOST_STATE_FIELD);

Regarding a workaround, the only one I can think of is booting L1 with la57=off, but given that this is a managed instance, I don't know how feasible that would be.

Regarding a fix, I think the change is trivial. I am not too keen on it, though. This is nVMX after all, and I can't guarantee that the la57 feature that's affecting my setup applies to the reporter's environment. We would have to know what the host hardware is. Honestly, I am inclined to close this as WONTFIX.
Thanks - I very much agree that we should close this WONTFIX. I didn't initially realise that the L0 kernel was so ancient. By the way would you mind making your analysis in comment 30 public? That way if someone hits this error message (which to be honest is not a very good one) then they will find this bug and have an explanation of what is going on.
Thanks, Bandan, for the analysis. Closing this bug.