
Bug 1642083

Summary: GlusterFS only - BackupAPI: Failure to start VM with snapshot disk attached: libvirtError: unsupported configuration: native I/O needs either no disk cache or directsync cache mode, QEMU will fallback to aio=threads
Product: [oVirt] ovirt-engine
Component: BLL.Storage
Version: 4.2.7.1
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Reporter: Avihai <aefrat>
Assignee: Tal Nisan <tnisan>
QA Contact: Elad <ebenahar>
Docs Contact:
CC: bugs, michal.skrivanek, ratamir, rbarry, sabose, sbose
Keywords: Automation, Regression
Target Milestone: ovirt-4.2.7
Target Release: ---
Flags: rule-engine: ovirt-4.2+, rule-engine: blocker+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ovirt-engine-4.2.7.4
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-02 14:33:17 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments: engine, vdsm, libvirt logs (flags: none)

Description Avihai 2018-10-23 15:02:05 UTC
Created attachment 1496713 [details]
engine ,vdsm,libvirt logs

Description of problem:
Started occurring while running Tier1 TestCase6169.
- Create source and backup VMs from bootable disks (rhel7.6_rhv4.2_guest_disk) on a GlusterFS SD.
- Attach the snapshot disk of the source VM to the backup VM.
- Start the source and backup VMs -> the backup VM fails to start.

Engine log:
2018-10-23 17:04:13,849+03 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmDevicesMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-49) [] VM 'a3fa319c-1018-4b67-b9c6-8e40463bee6b' managed non pluggable device was removed unexpectedly from libvirt: 'VmDevice:{id='VmDeviceId:{deviceId='f28d8063-6bd7-48dd-b5c4-076d546e5636', vmId='a3fa319c-1018-4b67-b9c6-8e40463bee6b'}', device='virtio-scsi', type='CONTROLLER', specParams='[ioThreadId=1]', address='', managed='true', plugged='false', readOnly='false', deviceAlias='', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}'
2018-10-23 17:04:19,621+03 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-3) [] VM 'a3fa319c-1018-4b67-b9c6-8e40463bee6b' was reported as Down on VDS '6749df87-f3bf-4319-a50e-a47c16d346d2'(host_mixed_1)
2018-10-23 17:04:19,627+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-3) [] START, DestroyVDSCommand(HostName = host_mixed_1, DestroyVmVDSCommandParameters:{hostId='6749df87-f3bf-4319-a50e-a47c16d346d2', vmId='a3fa319c-1018-4b67-b9c6-8e40463bee6b', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 637f6c1
2018-10-23 17:04:20,742+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-3) [] FINISH, DestroyVDSCommand, log id: 637f6c1
2018-10-23 17:04:20,743+03 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-3) [] VM 'a3fa319c-1018-4b67-b9c6-8e40463bee6b'(backup_vm_TestCase6169_2317015487) moved from 'WaitForLaunch' --> 'Down'
2018-10-23 17:04:20,799+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-3) [] EVENT_ID: VM_DOWN_ERROR(119), VM backup_vm_TestCase6169_2317015487 is down with error. Exit message: unsupported configuration: native I/O needs either no disk cache or directsync cache mode, QEMU will fallback to aio=threads.
2018-10-23 17:04:20,802+03 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-3) [] add VM 'a3fa319c-1018-4b67-b9c6-8e40463bee6b'(backup_vm_TestCase6169_2317015487) to rerun treatment
2018-10-23 17:04:20,811+03 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (ForkJoinPool-1-worker-3) [] Rerun VM 'a3fa319c-1018-4b67-b9c6-8e40463bee6b'. Called from VDS 'host_mixed_1'
2018-10-23 17:04:20,844+03 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-6177) [] EVENT_ID: USER_INITIATED_RUN_VM_FAILED(151), Failed to run VM backup_vm_TestCase6169_2317015487 on Host host_mixed_1.
2018-10-23 17:04:20,859+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-6177) [] EVENT_ID: USER_FAILED_RUN_VM(54), Failed to run VM backup_vm_TestCase6169_2317015487 (User: admin@internal-authz).

VDSM log:
2018-10-23 17:04:19,592+0300 ERROR (vm/a3fa319c) [virt.vm] (vmId='a3fa319c-1018-4b67-b9c6-8e40463bee6b') The vm start process failed (vm:948)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 877, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2898, in _run
    dom.createWithFlags(flags)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags
    if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
libvirtError: unsupported configuration: native I/O needs either no disk cache or directsync cache mode, QEMU will fallback to aio=threads


Libvirt.log:
2018-10-23 14:04:19.469+0000: 22934: error : qemuCheckDiskConfig:1270 : unsupported configuration: native I/O needs either no disk cache or directsync cache mode, QEMU will fallback to aio=threads
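
For reference, the rule libvirt enforces here (its actual implementation is C code in qemuCheckDiskConfig) is that io='native' is only valid with a cache mode that bypasses the host page cache. The sketch below is only an illustration of that rule; the enum and class names are made up and are not libvirt or oVirt code.

    // Illustrative sketch of the constraint behind the error above; not the
    // real libvirt check, just the rule it enforces.
    enum CacheMode { NONE, WRITETHROUGH, WRITEBACK, DIRECTSYNC, UNSAFE }

    final class NativeIoRule {
        // io='native' (aio=native) requires a cache mode that bypasses the
        // host page cache: either no cache at all or directsync.
        static boolean nativeIoAllowed(CacheMode cache) {
            return cache == CacheMode.NONE || cache == CacheMode.DIRECTSYNC;
        }

        public static void main(String[] args) {
            // Any other cache mode combined with io='native' is rejected with
            // exactly the "unsupported configuration" error quoted above.
            System.out.println(nativeIoAllowed(CacheMode.WRITEBACK)); // false
            System.out.println(nativeIoAllowed(CacheMode.NONE));      // true
        }
    }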

Version-Release number of selected component (if applicable):
4.2.7.3-0.1.el7ev
vdsm-4.20.43-1
libvirt-client-4.5.0-10
qemu-kvm-common-rhev-2.12.0-18
Red Hat Enterprise Linux Server release 7.6 (Maipo)

How reproducible:
100%

Steps to Reproduce:
1. Create source and backup VMs from bootable disks (rhel7.6_rhv4.2_guest_disk) on a GlusterFS SD.
2. Add an additional disk to the source VM.
3. Create a snapshot of the source VM.
4. Attach the snapshot disk of the source VM to the backup VM.
5. Start both the source and backup VMs.


Actual results:
Only the source VM starts successfully.
The backup VM fails to start.


Expected results:
Both VMs should start successfully.


Additional info:
Occurs only on Gluster.

Comment 1 Avihai 2018-10-23 15:05:30 UTC
The same TestCase6169 run on Gluster passed about 10 runs ago, on rhv-4.2.6-5, so this looks like a regression.

Comment 2 Michal Skrivanek 2018-10-24 05:27:52 UTC
Probably related to the recent change to aio=native on Gluster; see the code just a few lines below that change, at https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/builder/vminfo/LibvirtVmXmlBuilder.java#L1796

Comment 3 Sumit Bose 2018-10-24 07:09:10 UTC
Forwarding needinfo.

Comment 4 Sahina Bose 2018-10-24 08:32:29 UTC
Yes, it is introduced by the change to use aio=native. What's the behaviour with snapshot and other disk types that use aio=native?

Comment 5 Ryan Barry 2018-10-24 16:41:30 UTC
(In reply to Sahina Bose from comment #4)
> Yes, it is introduced by the change to use aio=native. What's the behaviour
> with snapshot and other disk types that use aio=native?

I'm running through the code, since I haven't touched this part of it yet, but:

    public String getDiskType(VM vm, DiskImage diskImage, VmDevice device) {
        if (device.getSnapshotId() != null) {
            return "file"; // transient disks are always files
        }

In this case, the condition:
           nativeIO = !"file".equals(diskType);

is false (if the device has a snapshot ID).
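
Putting the two fragments above together, here is a condensed, illustrative sketch of why a disk attached from a snapshot should end up with aio=threads. The class and helper names are invented for illustration and the storage-type handling is simplified; this is not the actual LibvirtVmXmlBuilder code.

    // Condensed, illustrative sketch of the decision discussed above; the
    // real logic lives in ovirt-engine (LibvirtVmXmlBuilder and friends).
    final class AioDecisionSketch {
        // Stand-in for the getDiskType() logic quoted above: a disk attached
        // from a snapshot is a transient file, so its disk type is "file".
        static String diskType(boolean hasSnapshotId) {
            if (hasSnapshotId) {
                return "file"; // transient disks are always files
            }
            // Other storage cases are elided in this sketch; "network" is just
            // a placeholder for a Gluster disk served via a non-file path.
            return "network";
        }

        public static void main(String[] args) {
            // A snapshot disk attached to the backup VM:
            String type = diskType(true);
            boolean nativeIO = !"file".equals(type); // the condition quoted above
            // nativeIO is false, so such a disk should get aio=threads rather
            // than aio=native, which avoids the libvirt error in this bug.
            System.out.println("diskType=" + type + ", nativeIO=" + nativeIO);
        }
    }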

Comment 6 Sahina Bose 2018-10-24 16:49:07 UTC
(In reply to Ryan Barry from comment #5)
> (In reply to Sahina Bose from comment #4)
> > Yes, it is introduced by the change to use aio=native. What's the behaviour
> > with snapshot and other disk types that use aio=native?
> 
> I'm running through the code, since I haven't touched this part of it yet,
> but:
> 
>     public String getDiskType(VM vm, DiskImage diskImage, VmDevice device) {
> 
>         if (device.getSnapshotId() != null) {                               
> 
>             return "file"; // transient disks are always files              
> 
>         } 
> 
> In this case, the condition:
>            nativeIO = !"file".equals(diskType);
> 
> Is false (if it has a snapshot ID)

Thanks!

Comment 7 Elad 2018-10-28 08:32:59 UTC
VM starts properly with snapshot disk attached on Gluster.

vdsm-4.20.43-1.el7ev.x86_64
ovirt-engine-4.2.7.4-0.1.el7ev.noarch
libvirt-4.5.0-10.el7_6.2.x86_64
qemu-img-rhev-2.12.0-18.el7_6.1.x86_64
glusterfs-3.12.2-18.el7.x86_64

Comment 8 Sandro Bonazzola 2018-11-02 14:33:17 UTC
This bug is included in the oVirt 4.2.7 release, published on November 2nd, 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.2.7 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.