Bug 1501715
Summary:          Run 'virsh managedsave' with '--bypass-cache' flag failed.
Product:          Red Hat Enterprise Linux 7
Component:        libvirt
Version:          7.5
Hardware:         Unspecified
OS:               Unspecified
Status:           CLOSED ERRATA
Severity:         high
Priority:         high
Target Milestone: rc
Keywords:         Automation, Regression
Fixed In Version: libvirt-3.9.0-1.el7
Reporter:         lcheng
Assignee:         Jiri Denemark <jdenemar>
QA Contact:       Yanqiu Zhang <yanqzhan>
CC:               dyuan, dzheng, jdenemar, jsuchane, junli, lcheng, rbalakri, xuzhang, yafu, zpeng
Type:             Bug
Last Closed:      2018-04-10 10:59:09 UTC
Description (lcheng, 2017-10-13 05:35:39 UTC)
Would you mind attaching libvirt's debug log? Thanks.

Created attachment 1339026 [details]
libvirtd log
Hi Jaroslav,

Sorry for the late reply. The attachment is the test result log for the following commands.
# virsh managedsave test --running --bypass-cache
error: Failed to save domain test state
error: operation failed: domain save job: unexpectedly failed
# virsh managedsave test --paused --bypass-cache
error: Failed to save domain test state
error: operation failed: domain save job: unexpectedly failed
# virsh dump test /tmp/test.d --bypass-cache
error: Failed to core dump domain test to /tmp/test.d
error: operation failed: domain core dump job: unexpectedly failed
Could you also attach the corresponding QEMU log from /var/log/libvirt/qemu/test.log?

Created attachment 1339109 [details]
libvirtd log and corresponding QEMU log
Hi Jiri,

I reran the commands. The attachments are the libvirtd log and the corresponding QEMU log.
Created attachment 1339110 [details]
corresponding QEMU log

Sorry, I didn't notice that only one attachment can be uploaded. The attachment in comment 6 is the libvirtd log; this attachment is the corresponding QEMU log.

Broken by the following commit:

commit 633b699bfda06d9fcdb7f9466e2d2c9b4bc3e63c
Author: Daniel P. Berrange <berrange>
Date:   Wed Sep 20 16:25:56 2017 +0100

    iohelper: avoid calling read() with misaligned buffers for O_DIRECT

    The iohelper currently calls saferead() to get data from the underlying
    file. This has a problem with O_DIRECT when hitting end-of-file.
    saferead() is asked to read 1MB, but the first read() it does may
    return only a few KB, so it'll try another read() to fill the remaining
    buffer. Unfortunately the buffer pointer passed into this 2nd read() is
    likely not aligned to the extent that O_DIRECT requires, so rather than
    seeing '0' for end-of-file, we'll get -1 + EINVAL due to misaligned
    buffer.

    The way the iohelper is currently written, it already handles getting
    short reads, so there is actually no need to use saferead() at all. We
    can simply call read() directly. The benefit of this is that we can now
    write() the data immediately so when we go into the subsequent reads()
    we'll always have a correctly aligned buffer.

    Technically the file position ought to be aligned for O_DIRECT too, but
    this does not appear to matter when at end-of-file.

    Tested-by: Nikolay Shirokovskiy <nshirokovskiy>
    Reviewed-by: Eric Blake <eblake>
    Signed-off-by: Daniel P. Berrange <berrange>

Fixed upstream by:

commit 05021e727d80527c4b53debed98b87b565780a16
Refs: v3.8.0-236-g05021e727
Author:     Nikolay Shirokovskiy <nshirokovskiy>
AuthorDate: Thu Sep 28 10:06:47 2017 +0300
Commit:     Jiri Denemark <jdenemar>
CommitDate: Tue Oct 24 10:53:18 2017 +0200

    iohelper: use saferead if later write with O_DIRECT

    One of the usecases of iohelper is to read from pipe and write to file
    with O_DIRECT. As we read from pipe we can have partial read and then
    we fail to write this data because output file is open with O_DIRECT
    and buffer size is not aligned.

    Signed-off-by: Jiri Denemark <jdenemar>
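To make the failure mode concrete, below is a minimal sketch, not libvirt's actual iohelper code, of copying a stream from a pipe to a file opened with O_DIRECT. The buffer size, the 4096-byte ALIGNMENT, the fill_buffer() helper, and the handling of the final short chunk are illustrative assumptions. The point it shows: a single read() on a pipe can return an arbitrary, unaligned number of bytes, and writing that length straight to an O_DIRECT file descriptor fails with EINVAL, which is what surfaced here as "domain save job: unexpectedly failed"; filling the buffer first, saferead-style, keeps every write except possibly the last one block-aligned.

/* Minimal sketch, NOT libvirt's actual iohelper code.  Assumes Linux,
 * a 4096-byte O_DIRECT alignment requirement, and that stdin is a pipe
 * carrying the save/managedsave data stream. */
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define ALIGNMENT 4096
#define BUFLEN    (1024 * 1024)

/* saferead-style loop: keep read()ing until the buffer is full or the
 * stream ends, so every O_DIRECT write below (except possibly the last
 * one) has a block-aligned length. */
static ssize_t fill_buffer(int fd, char *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t r = read(fd, buf + got, len - got);
        if (r < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (r == 0)                     /* end of stream */
            break;
        got += r;
    }
    return got;
}

int main(int argc, char **argv)
{
    void *buf;
    int out;

    if (argc < 2) {
        fprintf(stderr, "usage: %s OUTPUT-FILE < pipe\n", argv[0]);
        return 1;
    }
    if (posix_memalign(&buf, ALIGNMENT, BUFLEN) != 0)
        return 1;

    out = open(argv[1], O_WRONLY | O_CREAT | O_DIRECT, 0600);
    if (out < 0) {
        perror("open");
        return 1;
    }

    for (;;) {
        /* The regression: a single read() on a pipe may return only a
         * few KB, and write()ing that unaligned length to an O_DIRECT
         * fd fails with EINVAL.  fill_buffer() avoids that. */
        ssize_t got = fill_buffer(STDIN_FILENO, buf, BUFLEN);
        if (got < 0) {
            perror("read");
            return 1;
        }
        if (got == 0)
            break;
        if (got % ALIGNMENT != 0) {
            /* Final, short chunk: one possible way to handle it is to
             * drop O_DIRECT for this last write (libvirt's actual tail
             * handling is not shown here). */
            fcntl(out, F_SETFL, fcntl(out, F_GETFL) & ~O_DIRECT);
        }
        if (write(out, buf, got) != got) {
            perror("write");
            return 1;
        }
    }
    close(out);
    free(buf);
    return 0;
}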
Please see bz1240494 comment 4. The value '0600001' is expected on PPC64, as the numeric value of O_DIRECT differs between x86 and PPC.

Verified on x86_64 with the latest build:
libvirt-3.9.0-7.el7.x86_64
qemu-kvm-rhev-2.10.0-16.el7.x86_64

Steps:

1. dump
[terminal 1]# virsh dump V7.4-full-2 /tmp/V74.dump --bypass-cache
Domain V7.4-full-2 dumped to /tmp/V74.dump

[terminal 2]# while(true);do cat /proc/$(lsof -w /tmp/V74.dump|awk '/libvirt_i/{print $2}')/fdinfo/1 ;done
...
pos:    489684992
flags:  0140001
mnt_id: 63
...

2. save
# virsh save rhel7.4-2 /tmp/V74.save --bypass-cache
Domain rhel7.4-2 saved to /tmp/V74.save

# while(true);do cat /proc/$(lsof -w /tmp/V74.save|awk '/libvirt_i/{print $2}')/fdinfo/1 ;done
...
pos:    613416960
flags:  0140001
mnt_id: 63
...

3. restore
# virsh restore /tmp/V74.save --bypass-cache
Domain restored from /tmp/V74.save

# while(true);do cat /proc/$(lsof -w /tmp/V74.save|awk '/libvirt_i/{print $2}')/fdinfo/0 ;done
...
pos:    211812352
flags:  0140000
mnt_id: 63
...

4. managedsave
# virsh managedsave V7.4-full-2 --bypass-cache
Domain V7.4-full-2 state saved by libvirt

# while(true);do cat /proc/$(lsof -w /var/lib/libvirt/qemu/save/V7.4-full-2.save|awk '/libvirt_i/{print $2}')/fdinfo/1 ;done
...
pos:    441450496
flags:  0140001
mnt_id: 63
...

5. start
# virsh start V7.4-full-2 --bypass-cache
Domain V7.4-full-2 started

# while(true);do cat /proc/$(lsof -w /var/lib/libvirt/qemu/save/V7.4-full-2.save|awk '/libvirt_i/{print $2}')/fdinfo/0 ;done
...
pos:    463470592
flags:  0140000
mnt_id: 63
...

All operations complete without error, and the flags are correct.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with the resolution ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704
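A side note on reading the fdinfo output above: the 'flags' field is the descriptor's open flags in octal, and the bit that confirms --bypass-cache took effect is O_DIRECT, whose numeric value is architecture-specific. A trivial check, given here only as an illustration and not as part of the verification steps:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>

int main(void)
{
    /* Prints the architecture-specific O_DIRECT bit to look for in the
     * fdinfo flags values shown above. */
    printf("O_DIRECT = 0%o\n", O_DIRECT);
    return 0;
}

On x86_64 this prints 040000, so 0140001 decodes as O_LARGEFILE (0100000) | O_DIRECT (040000) | O_WRONLY (01), and 0140000 is the read-only variant. On PPC64, O_DIRECT is 0400000 and O_LARGEFILE is 0200000, giving the expected 0600001.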