Tested with:
libvirt-daemon-8.0.0-9.module+el8.7.0+15830+85788ab7.x86_64
qemu-kvm-6.2.0-16.module+el8.7.0+15743+c774064d.x86_64
libtpms-0.9.1-0.20211126git1ff6fe1f43.module+el8.6.0+13725+61ae1949.x86_64
1. Start a guest with a vTPM device, and create an external snapshot immediately:
# virsh start rhel8.7-q35; virsh snapshot-create-as rhel8.7-q35 s2 --memspec file=/tmp/rhel8.7-q35.mem --diskspec vda,file=/tmp/rhel8.7-q35.s2
Domain 'rhel8.7-q35' started
Domain snapshot s2 created
2. Upgrade the host to RHEL 9.
3. Restore the guest from the snapshot:
# virsh restore /tmp/rhel8.7-q35.mem rhel8.7-q35.xml
error: Failed to restore domain from /tmp/rhel8.7-q35.mem
error: internal error: qemu unexpectedly closed the monitor: 2022-07-27T09:48:32.577737Z qemu-kvm: Machine type 'pc-q35-rhel8.6.0' is deprecated: machine types for previous major
2022-07-27T09:48:37.613513Z qemu-kvm: -device cirrus-vga,id=video0,bus=pcie.0,addr=0x1: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead
2022-07-27T09:48:37.640415Z qemu-kvm: warning: netdev channel1 has no peer
2022-07-27T09:48:38.663843Z qemu-kvm: tpm-emulator: Setting the stateblob (type 2) failed with a TPM error 0x3 a parameter is bad
2022-07-27T09:48:38.663869Z qemu-kvm: error while loading state for instance 0x0 of device 'tpm-emulator'
2022-07-27T09:48:38.848779Z qemu-kvm: load of migration failed: Input/output error
Actual results:
Failed to restore the VM
Additional info:
1. This bug can be reproduced on RHEL 8.7 with the steps in Bug #2035731.
2. The guest also cannot be restored from the snapshot before upgrading.
3. Since creating snapshots of the guests before upgrading is recommended:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/upgrading_from_rhel_8_to_rhel_9/index
I think this bug has a bigger impact in the RHEL 8 to RHEL 9 upgrade scenario.
+++ This bug was initially created as a clone of Bug #2035731 +++
Description of problem:
Start a VM with a vTPM device, then save the VM immediately. Restoring the VM will fail.
Version-Release number of selected component (if applicable):
# rpm -q libvirt qemu-kvm libtpms swtpm
libvirt-7.10.0-1.el9.x86_64
qemu-kvm-6.2.0-1.el9.x86_64
libtpms-0.9.1-0.20211126git1ff6fe1f43.el9.x86_64
swtpm-0.7.0-1.20211109gitb79fd91.el9.x86_64
How reproducible:
100%
Steps to Reproduce:
1. Start a VM with a vTPM device, and save it immediately:
# virsh start vm1; virsh managedsave vm1
<tpm model='tpm-crb'>
<backend type='emulator' version='2.0'/>
</tpm>
2. Restore the VM:
# virsh start vm1
error: Failed to start domain 'vm1'
error: internal error: qemu unexpectedly closed the monitor: 2021-12-27T08:52:14.261131Z qemu-kvm: tpm-emulator: Setting the stateblob (type 2) failed with a TPM error 0x3 a parameter is bad
2021-12-27T08:52:14.261145Z qemu-kvm: error while loading state for instance 0x0 of device 'tpm-emulator'
2021-12-27T08:52:14.261235Z qemu-kvm: load of migration failed: Input/output error
3. Remove the TPM state file, and try to restore the VM again:
# rm /var/lib/libvirt/swtpm/505ee98d-e9af-4597-9eff-168e66f5f6ce/tpm2/tpm2-00.permall
rm: remove regular file '/var/lib/libvirt/swtpm/505ee98d-e9af-4597-9eff-168e66f5f6ce/tpm2/tpm2-00.permall'? y
# virsh start vm1
error: Failed to start domain 'vm1'
error: internal error: qemu unexpectedly closed the monitor: 2021-12-27T09:05:28.928729Z qemu-kvm: tpm-emulator: Setting the stateblob (type 2) failed with a TPM error 0x3 a parameter is bad
2021-12-27T09:05:28.928743Z qemu-kvm: error while loading state for instance 0x0 of device 'tpm-emulator'
2021-12-27T09:05:28.928832Z qemu-kvm: load of migration failed: Input/output error
Actual results:
Failed to restore the VM
Expected results:
The VM is restored successfully
Additional info:
1. Cannot reproduce if the VM is saved after it has fully booted up.
2. swtpm log:
# cat /var/log/swtpm/libvirt/qemu/vm1-swtpm.log
libtpms/tpm2: STATE_RESET_DATA: s_ContextSlotMask has bad value: 0x0000
Data client disconnected
libtpms/tpm2: STATE_RESET_DATA: s_ContextSlotMask has bad value: 0x0000
Data client disconnected
--- Additional comment from Qinghua Cheng on 2022-01-04 10:21:03 UTC ---
We could not reproduce it with Win11 and Win2022 guests.
Environment:
libtpms-0.9.1-0.20211126git1ff6fe1f43.el9.x86_64
swtpm-0.7.0-1.20211109gitb79fd91.el9.x86_64
qemu-kvm-6.2.0-1.el9.x86_64
libvirt-7.10.0-1.el9.x86_64
edk2-ovmf-20210527gite1999b264f1f-7.el9.noarch
--- Additional comment from Stefan Berger on 2022-01-04 20:10:32 UTC ---
The fix for this is in a PR here now: https://github.com/stefanberger/libtpms/pull/287
--- Additional comment from Marc-Andre Lureau on 2022-01-06 14:17:07 UTC ---
Unfortunately, the series from Stefan B. is stuck at this point. Last iteration I could find is "[PATCH v3 1/3] selftests: tpm2: Probe for available PCR bank External"
--- Additional comment from Stefan Berger on 2022-01-06 14:19:49 UTC ---
Patches haven't received a verdict from the kernel test maintainer yet: https://lkml.org/lkml/2021/12/23/681
--- Additional comment from Yanan Fu on 2022-01-14 03:28:09 UTC ---
Hi there,
When I test migration with Win11 + tpm device, if the src and dst VMs use the same tpm daemon by mistake, I can hit the same error info:
18:23:27 INFO | [qemu output] qemu-kvm: tpm-emulator: Setting the stateblob (type 2) failed with a TPM error 0x84
18:23:27 INFO | [qemu output] qemu-kvm: error while loading state for instance 0x0 of device 'tpm-emulator'
18:23:27 INFO | [qemu output] qemu-kvm: load of migration failed: Input/output error
18:23:27 INFO | [qemu output] (Process terminated with status 1)
Just for a reference, thanks!
--- Additional comment from Stefan Berger on 2022-01-14 15:06:37 UTC ---
(In reply to Yanan Fu from comment #5)
> Hi there,
>
> We i test migration with Win11 + tpm device, if src and dst vm use same tpm
> daemon by mistake, can hit the same error info:
What is 'same tpm daemon by mistake'?
>
> 18:23:27 INFO | [qemu output] qemu-kvm: tpm-emulator: Setting the stateblob
> (type 2) failed with a TPM error 0x84
> 18:23:27 INFO | [qemu output] qemu-kvm: error while loading state for
> instance 0x0 of device 'tpm-emulator'
> 18:23:27 INFO | [qemu output] qemu-kvm: load of migration failed:
> Input/output error
> 18:23:27 INFO | [qemu output] (Process terminated with status 1)
>
>
> Just for a reference, thanks!
Is this with the fix applied to libtpms on both machines or without?
--- Additional comment from Yanan Fu on 2022-01-14 15:35:03 UTC ---
(In reply to Stefan Berger from comment #6)
> (In reply to Yanan Fu from comment #5)
> > Hi there,
> >
> > We i test migration with Win11 + tpm device, if src and dst vm use same tpm
> > daemon by mistake, can hit the same error info:
>
> What is 'same tpm daemon by mistake'?
It is for local migration test:
1. src vm, qemu cli:
-chardev socket,id=char_vtpm_tpm0,path=/tmp/avocado_5l5j2ys4/avocado-vt-vm1_tpm0_swtpm.sock \ <--- here
-tpmdev emulator,chardev=char_vtpm_tpm0,id=emulator_vtpm_tpm0 \
-device tpm-crb,id=tpm-crb_vtpm_tpm0,tpmdev=emulator_vtpm_tpm0 \
2. dst vm qemu cli:
-chardev socket,id=char_vtpm_tpm0,path=/tmp/avocado_5l5j2ys4/avocado-vt-vm1_tpm0_swtpm.sock \ <--- here
-tpmdev emulator,chardev=char_vtpm_tpm0,id=emulator_vtpm_tpm0 \
-device tpm-crb,id=tpm-crb_vtpm_tpm0,tpmdev=emulator_vtpm_tpm0 \
I hit the same error when using the same 'path' for both the src and dst VM.
This is incorrect usage, hence 'by mistake'.
The correct usage is a unique path for each one; in that way, it works normally.
>
> >
> > 18:23:27 INFO | [qemu output] qemu-kvm: tpm-emulator: Setting the stateblob
> > (type 2) failed with a TPM error 0x84
> > 18:23:27 INFO | [qemu output] qemu-kvm: error while loading state for
> > instance 0x0 of device 'tpm-emulator'
> > 18:23:27 INFO | [qemu output] qemu-kvm: load of migration failed:
> > Input/output error
> > 18:23:27 INFO | [qemu output] (Process terminated with status 1)
> >
> >
> > Just for a reference, thanks!
>
> Is this with the fix applied to libtpms on both machines or without?
Without any fix.
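For reference, the working configuration gives each side its own swtpm socket; a sketch with hypothetical placeholder paths (the original avocado-generated paths are not reproduced here):

```
# src VM (hypothetical path; each QEMU instance must talk to its own swtpm daemon)
-chardev socket,id=char_vtpm_tpm0,path=/tmp/src-vm_tpm0_swtpm.sock \
-tpmdev emulator,chardev=char_vtpm_tpm0,id=emulator_vtpm_tpm0 \
-device tpm-crb,id=tpm-crb_vtpm_tpm0,tpmdev=emulator_vtpm_tpm0 \

# dst VM (different socket path, backed by a separate swtpm instance)
-chardev socket,id=char_vtpm_tpm0,path=/tmp/dst-vm_tpm0_swtpm.sock \
-tpmdev emulator,chardev=char_vtpm_tpm0,id=emulator_vtpm_tpm0 \
-device tpm-crb,id=tpm-crb_vtpm_tpm0,tpmdev=emulator_vtpm_tpm0 \
```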
--- Additional comment from Stefan Berger on 2022-01-14 20:14:21 UTC ---
Anyway, thanks for finding the bug!
--- Additional comment from RHEL Program Management on 2022-06-20 13:13:11 UTC ---
The release+ flag was dropped due to a missing devel_ack+ and/or qa_ack+
--- Additional comment from Marc-Andre Lureau on 2022-06-20 13:14:12 UTC ---
please qa ack, thanks
--- Additional comment from errata-xmlrpc on 2022-06-21 12:03:35 UTC ---
This bug has been added to advisory RHBA-2022:96997 by Marc-Andre Lureau (mlureau)
--- Additional comment from errata-xmlrpc on 2022-06-21 12:03:52 UTC ---
Bug report changed to ON_QA status by Errata System.
A QE request has been submitted for advisory RHBA-2022:96997-01
https://errata.devel.redhat.com/advisory/96997
--- Additional comment from Yanan Fu on 2022-06-24 02:48:02 UTC ---
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.
--- Additional comment from Qinghua Cheng on 2022-06-28 03:20:26 UTC ---
Verified on RHEL 9.1:
kernel: 5.14.0-114.el9.x86_64
qemu: qemu-kvm-7.0.0-6.el9.x86_64
libtpms-0.9.1-2.20211126git1ff6fe1f43.el9.x86_64
swtpm-0.7.0-3.20211109gitb79fd91.el9.x86_64
edk2-ovmf-20220526git16779ede2d36-1.el9.noarch
guest: win11
Looped the following actions 40 times; the guest VM works normally:
virsh start <vm>       (save it immediately)
virsh managedsave <vm>
virsh start <vm>
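The verification loop above can be scripted; a minimal sketch, where the domain name "vm1" and the iteration count are placeholders, and VIRSH can be overridden (e.g. VIRSH=echo) for a dry run on a host without libvirt:

```shell
#!/bin/sh
# Stress the "start + immediate managedsave + start" sequence N times.
# "vm1" and N=40 are illustrative values, not taken from the bug report.
VIRSH="${VIRSH:-virsh}"

repro_loop() {
    dom="$1"; n="$2"
    "$VIRSH" start "$dom" || return 1            # initial boot
    i=1
    while [ "$i" -le "$n" ]; do
        "$VIRSH" managedsave "$dom" || return 1  # save right after start
        "$VIRSH" start "$dom" || return 1        # restore from managed save
        i=$((i + 1))
    done
}

# repro_loop vm1 40   # run on a host where domain vm1 (with a vTPM) is defined
```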
Set bug status to verified.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (Low: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:7472