Bug 1336649

Summary: [RHEL.7.3] Guest will not boot up when aio=native and snapshot=on are specified together
Product: Red Hat Enterprise Linux 7 Reporter: Yang Meng <meyang>
Component: qemu-kvm-rhev Assignee: Kevin Wolf <kwolf>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.3 CC: aliang, amit.shah, chayang, huding, juzhang, knoel, lmiksik, meyang, mrezanin, ngu, pezhang, pingl, virt-maint, xutian, xuwei
Target Milestone: rc Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: qemu-kvm-rhev-2.6.0-9.el7
Last Closed: 2016-11-07 21:09:29 UTC Type: Bug

Description Yang Meng 2016-05-17 06:54:42 UTC
Description of problem:

The guest does not boot when the following options are specified together on the disk:
aio=native,snapshot=on/off  // does not work (thanks to pezhang for testing)
aio=threads,snapshot=on/off // works

Version-Release number of selected component (if applicable):
kernel: kernel-3.10.0-396.el7.x86_64
qemu : qemu-kvm-rhev-2.6.0-1.el7.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Boot the guest with the following options:

-drive id=drive_image1,if=none,cache=none,aio=native,snapshot=on,format=qcow2,file=/usr/share/avocado/data/avocado-vt/images/RHEL-Server-7.3-64-virtio.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=03,disable-legacy=off,disable-modern=on \

2. My full command line:

/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults  \
    -vga cirrus  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20160516-052527-XpOtv1yH,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20160516-052527-XpOtv1yH,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=id76oG5J  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20160516-052527-XpOtv1yH,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20160516-052527-XpOtv1yH,path=/var/tmp/seabios-20160516-052527-XpOtv1yH,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20160516-052527-XpOtv1yH,iobase=0x402 \
    -device ich9-usb-ehci1,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 \
    -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=1d.0,firstport=0,bus=pci.0 \
    -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=1d.2,firstport=2,bus=pci.0 \
    -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=1d.4,firstport=4,bus=pci.0 \
    -drive id=drive_image1,if=none,cache=none,aio=native,snapshot=on,format=qcow2,file=/usr/share/avocado/data/avocado-vt/images/RHEL-Server-7.3-64-virtio.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=03,disable-legacy=off,disable-modern=on \
    -device virtio-net-pci,mac=9a:17:18:19:1a:1b,id=idFlFOIk,vectors=4,netdev=idRnJbnY,bus=pci.0,addr=04,disable-legacy=off,disable-modern=on  \
    -netdev tap,id=idRnJbnY,vhost=on \
    -m 8192  \
    -smp 4,maxcpus=4,cores=2,threads=1,sockets=2  \
    -cpu 'SandyBridge',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -monitor stdio \

3. The guest does not boot, and qemu reports this error:

qemu-kvm: -drive id=drive_image1,if=none,cache=none,aio=native,snapshot=on,format=qcow2,file=/usr/share/avocado/data/avocado-vt/images/RHEL-Server-7.3-64-virtio.qcow2: aio=native was specified, but it requires cache.direct=on, which was not specified.


Actual results:
The guest did not boot.

Expected results:

The guest boots with no error and no crash.

Additional info:

cpuinfo:

processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 58
model name	: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
stepping	: 9
microcode	: 0x1b
cpu MHz		: 1600.125
cache size	: 8192 KB
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
apicid		: 7
initial apicid	: 7
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt
bogomips	: 6784.27
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

Comment 1 Xu Tian 2016-05-17 07:56:11 UTC
From the code, it looks like when snapshot=on is set, qemu creates a temporary file
in the system TMPDIR. On most Linux hosts TMPDIR is on tmpfs, and tmpfs does not support O_DIRECT, so I guess it is expected that qemu fails to start with the error "id=drive_image1,if=none,cache=none,aio=native,snapshot=on,format=qcow2,file=/usr/share/avocado/data/avocado-vt/images/RHEL-Server-7.3-64-virtio.qcow2: aio=native was specified, but it requires cache.direct=on, which was not specified."


int get_tmp_filename(char *filename, int size)
{
#ifdef _WIN32
    char temp_dir[MAX_PATH];
    /* GetTempFileName requires that its output buffer (4th param)
       have length MAX_PATH or greater.  */
    assert(size >= MAX_PATH);
    return (GetTempPath(MAX_PATH, temp_dir)
            && GetTempFileName(temp_dir, "qem", 0, filename)
            ? 0 : -GetLastError());
#else
    int fd;
    const char *tmpdir;
    tmpdir = getenv("TMPDIR");
    if (!tmpdir)
        tmpdir = "/tmp";
    if (snprintf(filename, size, "%s/vl.XXXXXX", tmpdir) >= size) {
        return -EOVERFLOW;
    }
    fd = mkstemp(filename);
    if (fd < 0) {
        return -errno;
    }
    if (close(fd) != 0) {
        unlink(filename);
        return -errno;
    }
    return 0;
#endif
}
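
The tmpfs/O_DIRECT assumption above can be checked directly on a host. The following is only a minimal sketch, assuming Linux with glibc; dir_supports_o_direct() is a hypothetical helper written for this illustration and is not part of qemu. It creates a scratch file in the given directory and tries to reopen it with O_DIRECT.

```c
#define _GNU_SOURCE   /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of qemu.
 * Returns 1 if a file created in `dir` can be reopened with O_DIRECT,
 * 0 if the filesystem rejects O_DIRECT with EINVAL,
 * and a negative errno value on any other failure.
 */
static int dir_supports_o_direct(const char *dir)
{
    char path[4096];
    int fd, ret;

    if (snprintf(path, sizeof(path), "%s/odirect.XXXXXX", dir)
        >= (int)sizeof(path)) {
        return -EOVERFLOW;
    }

    /* Create the scratch file the same way get_tmp_filename() does. */
    fd = mkstemp(path);
    if (fd < 0) {
        return -errno;
    }
    close(fd);

    /* Reopen it with O_DIRECT and record the outcome before cleanup. */
    fd = open(path, O_RDWR | O_DIRECT);
    if (fd >= 0) {
        close(fd);
        ret = 1;
    } else if (errno == EINVAL) {
        ret = 0;
    } else {
        ret = -errno;
    }

    unlink(path);
    return ret;
}
```

Calling dir_supports_o_direct("/tmp") would be expected to return 0 on a host whose /tmp is a tmpfs that rejects O_DIRECT, and 1 on a host whose /tmp lives on a filesystem such as xfs. The result depends on the kernel and filesystem in use, so treat it as a diagnostic aid rather than proof of the root cause.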

mounts output on my host:

[root@xutian-dev]# (rhel7/master-1.5.3) cat /proc/mounts 
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,size=6117880k,nr_inodes=1529470,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/mapper/fedora--server_dhcp--10--17-root / xfs rw,relatime,attr2,inode64,noquota 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=33,pgrp=1,timeout=0,minproto=5,maxproto=5,direct 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
tmpfs /tmp tmpfs rw 0 0

Thanks,
Xu

Comment 2 Xu Tian 2016-05-17 08:08:28 UTC
Hi reporter, 

Please provide the output of the shell command "echo $TMPDIR; cat /proc/mounts".

thanks,
Xu

Comment 3 Yang Meng 2016-05-17 08:11:28 UTC
(In reply to xu from comment #2)
> Hi reporter, 
> 
> Please provide the output of the shell command "echo $TMPDIR; cat /proc/mounts".
> 
> thanks,
> Xu

Here is the output:

[root@hp-z220-01 home]# echo $TMPDIR

[root@hp-z220-01 home]# cat /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,size=8046840k,nr_inodes=2011710,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_prio,net_cls 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/mapper/rhel_hp--z220--01-root / xfs rw,relatime,attr2,inode64,noquota 0 0
rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=24,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
/dev/sda1 /boot xfs rw,relatime,attr2,inode64,noquota 0 0
/dev/mapper/rhel_hp--z220--01-home /home xfs rw,relatime,attr2,inode64,noquota 0 0
10.73.194.27:/vol/s2coredump /var/crash nfs rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.73.194.27,mountvers=3,mountport=4046,mountproto=udp,local_lock=none,addr=10.73.194.27 0 0
10.73.194.28:/vol/S2/kvmauto/windows_img /mnt/windows nfs rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.73.194.28,mountvers=3,mountport=4046,mountproto=udp,local_lock=none,addr=10.73.194.28 0 0
10.73.194.27:/vol/s2kvmauto/iso /home/kvm_autotest_root/iso nfs rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.73.194.27,mountvers=3,mountport=4046,mountproto=udp,local_lock=none,addr=10.73.194.27 0 0
10.73.194.28:/vol/S2/kvmauto/linux_img /mnt/linux nfs rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.73.194.28,mountvers=3,mountport=4046,mountproto=udp,local_lock=none,addr=10.73.194.28 0 0
tmpfs /run/user/0 tmpfs rw,nosuid,nodev,relatime,size=1614064k,mode=700 0 0

Comment 5 Pei Zhang 2016-05-17 08:40:14 UTC
I agree with Xu, the result is expected.

The info below may help us understand why this combination (aio=native,snapshot=on/off) works in qemu 2.3 but not in qemu 2.6.

http://wiki.qemu.org/Features/Block/Todo#raw-posix:_Error_out_on_aio.3Dnative_with_cache.direct.3Doff_instead_of_falling_back_to_aio.3Dthreads_.5BKevin.5D

...
raw-posix: Error out on aio=native with cache.direct=off instead of falling back to aio=threads [Kevin] 
Deprecated in 2.3 (commit 9651825), intend to make it an error in 2.5 
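
The behaviour change described in that entry can be sketched as a simple flag check. This is only an illustration: the DEMO_O_* bits and demo_check_aio_flags() are hypothetical names invented here, and qemu's real flag values and code path may differ. The idea is that aio=native without cache.direct=on is now rejected instead of silently falling back to aio=threads.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical flag bits for illustration; qemu's real values may differ. */
enum {
    DEMO_O_NOCACHE    = 1 << 0,  /* cache.direct=on */
    DEMO_O_NATIVE_AIO = 1 << 1,  /* aio=native */
};

/*
 * Returns 0 if the flag combination is accepted, or -EINVAL if
 * aio=native was requested without cache.direct=on.
 */
static int demo_check_aio_flags(int flags)
{
    if ((flags & DEMO_O_NATIVE_AIO) && !(flags & DEMO_O_NOCACHE)) {
        fprintf(stderr, "aio=native was specified, but it requires "
                        "cache.direct=on, which was not specified.\n");
        return -EINVAL;
    }
    return 0;
}
```

With this check, DEMO_O_NATIVE_AIO alone is refused, while DEMO_O_NATIVE_AIO | DEMO_O_NOCACHE (the equivalent of cache=none,aio=native) is accepted.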


-Pei

Comment 6 Amit Shah 2016-05-17 09:17:54 UTC
Commit 69bef7931e8880c709556f8444938d8bb9a16118 moves default location of snapshot images from /tmp to /var/tmp.  That is part of the qemu 2.6 release, so qemu-kvm-rhev should not be affected by this.

Comment 7 Ademar Reis 2016-05-31 19:11:08 UTC
(In reply to Amit Shah from comment #6)
> Commit 69bef7931e8880c709556f8444938d8bb9a16118 moves default location of
> snapshot images from /tmp to /var/tmp.  That is part of the qemu 2.6
> release, so qemu-kvm-rhev should not be affected by this.

But the reporter reproduced it with QEMU-2.6, so this is probably not the case.

Anyway, looks like this is expected behavior and with libvirt this should not impact users, so I'm tempted to close this as NOTABUG. Leaving this up to Kevin to verify and decide.

Comment 8 Amit Shah 2016-06-01 12:55:18 UTC
(In reply to Ademar Reis from comment #7)
> (In reply to Amit Shah from comment #6)
> > Commit 69bef7931e8880c709556f8444938d8bb9a16118 moves default location of
> > snapshot images from /tmp to /var/tmp.  That is part of the qemu 2.6
> > release, so qemu-kvm-rhev should not be affected by this.
> 
> But the reporter reproduced it with QEMU-2.6, so this is probably not the
> case.
> 
> Anyway, looks like this is expected behavior and with libvirt this should
> not impact users, so I'm tempted to close this as NOTABUG. Leaving this up
> to Kevin to verify and decide.

My comment was unclear, let me try another time:

Comment 1 mentions code which was changed by commit 69bef7931e8880c709556f8444938d8bb9a16118 to use /var/tmp instead of /tmp.  So this error is most likely not due to the location of the tmpfile (or TMPDIR is defined, and is not /var/tmp).  I just wanted to point out that the code referenced in comment 1 is not what it looks like anymore in qemu-kvm-rhev.
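
For reference, the directory selection after that commit can be sketched as follows. This is an assumption-laden illustration, not a copy of the qemu source: demo_snapshot_tmpdir() is a hypothetical name, and only the TMPDIR lookup and the /var/tmp fallback are taken from this discussion.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: picks the directory for the temporary snapshot
 * file. TMPDIR wins if it is set; otherwise fall back to /var/tmp.
 */
static const char *demo_snapshot_tmpdir(void)
{
    const char *tmpdir = getenv("TMPDIR");

    return tmpdir ? tmpdir : "/var/tmp";
}
```

So with TMPDIR unset the temporary file would land in /var/tmp, which is usually not tmpfs, and only an explicit TMPDIR would send it elsewhere.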

Comment 9 Kevin Wolf 2016-06-03 11:16:00 UTC
The problem is that bdrv_temp_snapshot_options() automatically sets
cache.direct=off for the temporary file, but it doesn't change the aio=native
setting, so we end up with a somewhat confusing error message.

The expected behaviour isn't completely clear, but I think I'm leaning towards
the position that on the command line, you configure the image file layer and
not the temporary snapshot layer. So aio=native shouldn't be set for the
snapshot layer and bdrv_temp_snapshot_options() must filter out the
corresponding flag.

After implementing this fix, you will get a temporary snapshot with
cache=unsafe,aio=threads on top of the real image with cache=none,aio=native.
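
As a rough sketch of the idea (not the actual patch), the flag handling for the temporary snapshot layer could look like this. The DEMO_O_* bits and demo_temp_snapshot_flags() are hypothetical names chosen for illustration; the real bdrv_temp_snapshot_options() works on qemu's own flags and options, which are not reproduced here.

```c
/* Hypothetical flag bits for illustration; qemu's real values may differ. */
enum {
    DEMO_O_NOCACHE    = 1 << 0,  /* cache.direct=on (as in cache=none) */
    DEMO_O_NO_FLUSH   = 1 << 1,  /* cache.no-flush=on (as in cache=unsafe) */
    DEMO_O_NATIVE_AIO = 1 << 2,  /* aio=native */
    DEMO_O_TEMPORARY  = 1 << 3,  /* marks the temporary snapshot overlay */
};

/*
 * Derives the flags for the temporary snapshot overlay from the flags
 * the user gave for the image. The parent's flags are left untouched.
 */
static int demo_temp_snapshot_flags(int parent_flags)
{
    int flags = parent_flags | DEMO_O_TEMPORARY;

    flags &= ~DEMO_O_NOCACHE;     /* overlay always uses cache.direct=off */
    flags |= DEMO_O_NO_FLUSH;     /* overlay behaves like cache=unsafe */
    flags &= ~DEMO_O_NATIVE_AIO;  /* the fix: overlay uses aio=threads */

    return flags;
}
```

Passing DEMO_O_NOCACHE | DEMO_O_NATIVE_AIO (cache=none,aio=native) yields overlay flags without the native AIO bit, so the aio=native/cache.direct check no longer trips on the temporary file while the real image keeps the settings from the command line.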

Comment 10 Miroslav Rezanina 2016-06-23 08:45:39 UTC
Fix included in qemu-kvm-rhev-2.6.0-9.el7

Comment 12 Yang Meng 2016-06-29 07:48:29 UTC
Per comment 9, verified on:
qemu: qemu-kvm-rhev-2.6.0-9.el7.x86_64
kernel: kernel-3.10.0-453.el7.x86_64

Steps:

1. Boot the guest with this command line:

cat testme.sh

/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -machine pc  \
    -nodefaults  \
    -vga cirrus  \
    -chardev socket,id=qmp_id_qmpmonitor1,path=/var/tmp/monitor-qmpmonitor1-20160516-052527-XpOtv1yH,server,nowait \
    -mon chardev=qmp_id_qmpmonitor1,mode=control  \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20160516-052527-XpOtv1yH,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=id76oG5J  \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20160516-052527-XpOtv1yH,server,nowait \
    -device isa-serial,chardev=serial_id_serial0  \
    -chardev socket,id=seabioslog_id_20160516-052527-XpOtv1yH,path=/var/tmp/seabios-20160516-052527-XpOtv1yH,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20160516-052527-XpOtv1yH,iobase=0x402 \
    -device ich9-usb-ehci1,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 \
    -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=1d.0,firstport=0,bus=pci.0 \
    -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=1d.2,firstport=2,bus=pci.0 \
    -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=1d.4,firstport=4,bus=pci.0 \
    -drive id=drive_image1,if=none,cache=none,aio=native,snapshot=on,format=qcow2,file=/usr/share/avocado/data/avocado-vt/images/RHEL-Server-7.3-64-virtio.qcow2 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=03,disable-legacy=off,disable-modern=on \
    -device virtio-net-pci,mac=9a:17:18:19:1a:1b,id=idFlFOIk,vectors=4,netdev=idRnJbnY,bus=pci.0,addr=04,disable-legacy=off,disable-modern=on  \
    -netdev tap,id=idRnJbnY,vhost=on \
    -m 8192  \
    -smp 4,maxcpus=4,cores=2,threads=1,sockets=2  \
    -cpu 'Opteron_G3',+kvm_pv_unhalt \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -monitor stdio \

2. The guest boots with no errors:

[root@hp-dl385g7-04 home]# sh testme.sh 
QEMU 2.6.0 monitor - type 'help' for more information
(qemu) info status
VM status: running

3. I also tried specifying "cache=unsafe,aio=threads", which works fine as well.

So I will mark this bug as verified, thanks.

Comment 13 Gu Nini 2016-07-20 07:54:24 UTC
Following up on comment #12, setting the bug to the correct status, VERIFIED.

Comment 15 errata-xmlrpc 2016-11-07 21:09:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html