Bug 1785615 - Failed to perform "Change CD" operation
Summary: Failed to perform "Change CD" operation
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.4.0
Assignee: Steven Rosenberg
QA Contact: Nisim Simsolo
URL:
Whiteboard:
Depends On: 1813858 1813962
Blocks:
 
Reported: 2019-12-20 13:12 UTC by Radek Duda
Modified: 2020-06-23 17:04 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, trying to mount an ISO domain (File -> Change CD) within the Console generated a "Failed to perform 'Change CD' operation" error due to the deprecation of REST API v3. The current release fixes this issue: It upgrades Remote Viewer to use REST API v4 so mounting an ISO domain within the console works.
Clone Of:
Environment:
Last Closed: 2020-05-20 20:01:59 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.4+
pm-rhel: blocker+
mtessun: planning_ack+
pm-rhel: devel_ack+


Attachments
vdsm.log (7.12 KB, text/plain)
2019-12-20 13:12 UTC, Radek Duda
REST_DEBUG=proxy remote-viewer console.vv --spice-debug (129.65 KB, text/plain)
2019-12-20 13:15 UTC, Radek Duda
VM log during change cd (73.65 KB, text/plain)
2019-12-25 13:17 UTC, Steven Rosenberg
Libvirt logs (300.08 KB, application/gzip)
2020-01-06 10:04 UTC, Steven Rosenberg
VDSM Log Created during Change CD (100.08 KB, application/gzip)
2020-01-06 10:08 UTC, Steven Rosenberg
Screen Shot of Error when choosing a CD (128.42 KB, image/png)
2020-01-06 10:15 UTC, Steven Rosenberg
Remote Viewer debug fails to connect (54.71 KB, image/png)
2020-03-11 09:14 UTC, Steven Rosenberg
Engine log when simulating Change CD (141.51 KB, text/plain)
2020-03-20 08:36 UTC, Steven Rosenberg
VDSM Log (11.94 MB, text/plain)
2020-03-20 08:50 UTC, Steven Rosenberg
qemu log (10.46 KB, text/plain)
2020-03-20 08:51 UTC, Steven Rosenberg
Legacy Bios with Q35 emulation still an issue (38.55 KB, image/png)
2020-03-23 15:44 UTC, Steven Rosenberg

Description Radek Duda 2019-12-20 13:12:12 UTC
Created attachment 1646799 [details]
vdsm.log

Description of problem:
A CD cannot be mounted on a RHEL guest.

Version-Release number of selected component (if applicable):
engine:
ovirt-engine-4.4.0-0.9.master.el7.noarch

host:
vdsm-4.40.0-164.git38a19bb.el8ev.x86_64
libvirt-5.0.0-7.module+el8+2887+effa3c42.x86_64

guest: rhel8.2

client: rhel8.2
spice-gtk3-0.37-1.el8.x86_64
virt-viewer-7.0-9.el8.x86_64


How reproducible:
always

Steps to Reproduce:
1. Launch a VM in RHV, with an NFS ISO storage domain present
2. Connect to it using remote-viewer
3. Try to mount the ISO domain (File->Change CD)

Actual results:
'Operation failed: [Failed to perform "Change CD" operation, CD might be still in use by the VM.
Please try to manually detach the CD from withing the VM:
  1. Log in to the VM
  2  For Linux VMs, un-mount the CD using umount command;
     For Windows VMs, right click on the CD drive and click 'Eject';]'
This message appears and no CD is mounted, even though no CD is currently mounted in the VM.

Expected results:
CD is mounted to the VM

Additional info:

Comment 1 Radek Duda 2019-12-20 13:15:40 UTC
Created attachment 1646801 [details]
REST_DEBUG=proxy remote-viewer console.vv --spice-debug

Comment 9 Steven Rosenberg 2019-12-25 13:10:39 UTC
I simulated this issue. The errors in the engine log come from the VDSM response:

VDSErrorException: Failed in vdscommand to ChangeDiskVDS, error = Failed to change disk image (Failed with error FAILED_CHANGE_CD_IS_MOUNTED and code 41)


The vdsm log contains the following error:

2019-12-25 12:25:40,750+0200 ERROR (jsonrpc/0) [virt.vm] (vmId='a2d4fffa-4f6a-448d-b584-3257d7c90a0f') forceful updateDeviceFlags failed (vm:4606)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 4603, in _changeBlockDev
    diskelem_xml, libvirt.VIR_DOMAIN_DEVICE_MODIFY_FORCE
  File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 2782, in updateDeviceFlags
    if ret == -1: raise libvirtError ('virDomainUpdateDeviceFlags() failed', dom=self)
libvirt.libvirtError: internal error: No device with bus 'ide' and target 'hdc'

Therefore the issue seems to be in libvirt.

The engine sends ide as the bus (iface):

2019-12-25 12:25:40,704+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ChangeDiskVDSCommand] (default task-2) [413fd180-b206-4ae7-bdf8-255cbe8daa7b] START, ChangeDiskVDSCommand(HostName = Host_44, ChangeDiskVDSCommandParameters:{hostId='34623bb4-2687-47e0-9f14-98fac559a922', vmId='a2d4fffa-4f6a-448d-b584-3257d7c90a0f', iface='ide', index='2', diskPath='/rhev/data-center/mnt/vserver-spider.eng.lab.tlv.redhat.com:_pub_srosenbe_iso/e522ee24-1469-407a-8efe-d681be2589bd/images/11111111-1111-1111-1111-111111111111/Win10_1909_English_x64.iso'}), log id: 6a57129e

The question may be why we are not supporting it. Please advise if further information is needed.
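
The two log entries above can be read together: the engine passes iface='ide' and index='2', and the failing libvirt call names target 'hdc'. The following is a minimal Python sketch of that mapping, assuming the usual Linux-style naming (a bus-specific prefix plus a letter derived from the index). The `DEV_PREFIX` table and the `target_dev` helper are hypothetical names used for illustration; this is not VDSM's actual code.

```python
import string

# Hypothetical table for illustration: bus name -> device name prefix.
DEV_PREFIX = {"ide": "hd", "sata": "sd", "scsi": "sd", "virtio": "vd"}


def target_dev(iface: str, index: int) -> str:
    """Return a device name such as 'hdc' for iface 'ide' and index 2."""
    if not 0 <= index < len(string.ascii_lowercase):
        raise ValueError("this sketch only handles single-letter suffixes")
    return DEV_PREFIX[iface] + string.ascii_lowercase[index]


# With the values from the engine log, the request addresses 'hdc' on 'ide'.
print(target_dev("ide", 2))
# The same index on a SATA bus would address 'sdc' instead.
print(target_dev("sata", 2))
```

If the running domain only has a SATA CD-ROM, a request built from iface='ide' cannot match any device, which is consistent with the "No device with bus 'ide' and target 'hdc'" error.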

Comment 10 Steven Rosenberg 2019-12-25 13:17:23 UTC
Created attachment 1647633 [details]
VM log during change cd

VM Log from the /var/log/libvirt/qemu directory after changecd. Please advise for further information.

Comment 11 Michal Privoznik 2020-01-06 09:05:32 UTC
Steven, can you please attach debug logs? I'm failing to reproduce this behaviour.

https://wiki.libvirt.org/page/DebugLogs

Comment 12 Steven Rosenberg 2020-01-06 10:04:26 UTC
Created attachment 1650075 [details]
Libvirt logs

Note: Failure happens for both VM1 and VMTest VMs.

Comment 13 Steven Rosenberg 2020-01-06 10:08:47 UTC
Created attachment 1650077 [details]
VDSM Log Created during Change CD

Accompanying VDSM Log with error:

2020-01-06 11:57:58,161+0200 ERROR (jsonrpc/5) [virt.vm] (vmId='5a4bdbf4-7cf1-4631-912e-689ec80e8eab') forceful updateDeviceFlags failed (vm:4606)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 4603, in _changeBlockDev
    diskelem_xml, libvirt.VIR_DOMAIN_DEVICE_MODIFY_FORCE
  File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 2782, in updateDeviceFlags
    if ret == -1: raise libvirtError ('virDomainUpdateDeviceFlags() failed', dom=self)
libvirt.libvirtError: internal error: No device with bus 'ide' and target 'hdc'
2020-01-06 11:57:58,161+0200 ERROR (jsonrpc/5) [api] FINISH changeCD error=Failed to change disk image (api:131)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 4603, in _changeBlockDev
    diskelem_xml, libvirt.VIR_DOMAIN_DEVICE_MODIFY_FORCE
  File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 2782, in updateDeviceFlags
    if ret == -1: raise libvirtError ('virDomainUpdateDeviceFlags() failed', dom=self)
libvirt.libvirtError: internal error: No device with bus 'ide' and target 'hdc'

Comment 14 Steven Rosenberg 2020-01-06 10:15:24 UTC
Created attachment 1650078 [details]
Screen Shot of Error when choosing a CD

Screen shot shows the actual error the user sees

Comment 15 Michal Privoznik 2020-01-06 10:52:34 UTC
Well, I think this is expected behaviour. You define and start your domain as:

<domain xmlns:ns0="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0" type="kvm">
    <name>VMTest</name>
    <uuid>5a4bdbf4-7cf1-4631-912e-689ec80e8eab</uuid>
    <devices>
        <disk device="cdrom" snapshot="no" type="file">
            <driver error_policy="report" name="qemu" type="raw" />
            <source file="/rhev/data-center/mnt/vserver-spider.eng.lab.tlv.redhat.com:_pub_srosenbe_iso/e522ee24-1469-407a-8efe-d681be2589bd/images/11111111-1111-1111-1111-111111111111/CentOS-8-x86_64-1905-dvd1.iso" startupPolicy="optional">
                <seclabel model="dac" relabel="no" type="none" />
            </source>
            <target bus="sata" dev="sdc" />
            <readonly />
            <alias name="ua-50a2fab1-5515-4119-957a-a4d45b7cc2e0" />
            <address bus="0" controller="0" target="0" type="drive" unit="2" />
            <boot order="2" />
        </disk>
        <disk device="disk" snapshot="no" type="file">
            <target bus="scsi" dev="sda" />
            <source file="/rhev/data-center/mnt/vserver-spider.eng.lab.tlv.redhat.com:_pub_srosenbe_engine__master/797bd8a1-625d-443d-b10f-cec4ed74b185/images/040e79d4-5f62-4a52-b5fd-6dd8afbff18c/f780b9fe-8683-441f-aa7e-0b0b121535cd">
                <seclabel model="dac" relabel="no" type="none" />
            </source>
            <driver cache="none" error_policy="stop" io="threads" name="qemu" type="qcow2" />
            <alias name="ua-040e79d4-5f62-4a52-b5fd-6dd8afbff18c" />
            <address bus="0" controller="0" target="0" type="drive" unit="0" />
            <boot order="1" />
            <serial>040e79d4-5f62-4a52-b5fd-6dd8afbff18c</serial>
        </disk>
    </devices>
</domain>

That is, without any IDE bus (the CD-ROM is attached to a SATA bus), but then you try to update the device as:

<disk device="cdrom" type="file">
  <source file="/rhev/data-center/mnt/vserver-spider.eng.lab.tlv.redhat.com:_pub_srosenbe_iso/e522ee24-1469-407a-8efe-d681be2589bd/images/11111111-1111-1111-1111-111111111111/Win10_1909_English_x64.iso" />
  <target bus="ide" dev="hdc" />
</disk>, flags=0x4


What do you think the expected behaviour should be?
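
One way to picture the mismatch described above is to read the CD-ROM target out of the running domain XML and build the update XML from it, rather than from a separately computed bus. This is only a sketch using the Python standard library; `find_cdrom_target` and `build_cdrom_update` are hypothetical helper names, and neither the engine nor VDSM is claimed to work this way.

```python
import xml.etree.ElementTree as ET


def find_cdrom_target(domain_xml: str):
    """Return (bus, dev) of the first cdrom disk in the domain XML, or None."""
    root = ET.fromstring(domain_xml)
    for disk in root.findall("./devices/disk"):
        if disk.get("device") == "cdrom":
            target = disk.find("target")
            if target is not None:
                return target.get("bus"), target.get("dev")
    return None


def build_cdrom_update(domain_xml: str, iso_path: str) -> str:
    """Build a <disk> update element that reuses the domain's own bus and dev."""
    found = find_cdrom_target(domain_xml)
    if found is None:
        raise LookupError("domain has no cdrom device")
    bus, dev = found
    disk = ET.Element("disk", {"device": "cdrom", "type": "file"})
    ET.SubElement(disk, "source", {"file": iso_path})
    ET.SubElement(disk, "target", {"bus": bus, "dev": dev})
    return ET.tostring(disk, encoding="unicode")


# Trimmed-down stand-in for the domain above (illustrative, not the full XML).
SAMPLE_DOMAIN = """
<domain type="kvm">
  <devices>
    <disk device="cdrom" type="file">
      <target bus="sata" dev="sdc" />
    </disk>
  </devices>
</domain>
"""

print(build_cdrom_update(SAMPLE_DOMAIN, "/path/to/example.iso"))
```

With the sample domain, the generated update targets bus "sata" and dev "sdc", so it addresses a device that actually exists in the domain.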

Comment 16 Steven Rosenberg 2020-01-12 15:34:14 UTC
(In reply to Michal Privoznik from comment #15)

I traced this issue to the engine. When loading the VM, we set the CD-ROM's bus to sata [1]. When we change the CD, the bus is sent as ide [2]. The ide value comes from the OS repository: the OS id is 30, which is referenced as "rhel_8x64" and uses the default OS configuration, whose CD interface is set to "ide,q35/sata" [3] [4]. The parsing extracts the ide prefix, and this is what is sent as the bus. According to Michal Privoznik, this discrepancy is what causes libvirt to fail. The question is: what are the correct values to send, both when the VM is loaded and when the CD is changed?


[1]    <disk type="file" device="cdrom" snapshot="no">
      <driver name="qemu" type="raw" error_policy="report"/>
      <source file="/rhev/data-center/mnt/vserver-spider.eng.lab.tlv.redhat.com:_pub_srosenbe_iso/e522ee24-1469-407a-8efe-d681be2589bd/images/11111111-1111-1111-1111-111111111111/CentOS-8-x86_64-1905-dvd1.iso" startupPolicy="optional">
        <seclabel model="dac" type="none" relabel="no"/>
      </source>
      <target dev="sdc" bus="sata"/>
      <readonly/>
      <alias name="ua-50a2fab1-5515-4119-957a-a4d45b7cc2e0"/>
      <address bus="0" controller="0" unit="2" type="drive" target="0"/>
      <boot order="2"/>
    </disk>

[2] <disk device="cdrom" type="file">
  <source file="/rhev/data-center/mnt/vserver-spider.eng.lab.tlv.redhat.com:_pub_srosenbe_iso/e522ee24-1469-407a-8efe-d681be2589bd/images/11111111-1111-1111-1111-111111111111/Win10_1909_English_x64.iso" />
  <target bus="ide" dev="hdc" />
</disk>, flags=0x4

[3] os.other.devices.cdInterface.value = ide,q35/sata
os.other.devices.diskInterfaces.value = i440fx/IDE, VirtIO_SCSI, VirtIO, q35/SATA
os.other.devices.floppy.support.value = false
os.other.devices.floppy.support.value.4.3 = true
os.other.devices.floppy.support.value.4.2 = true

[4] https://github.com/oVirt/ovirt-engine/blob/master/packaging/conf/osinfo-defaults.properties#L71

Comment 17 Ryan Barry 2020-01-12 22:14:40 UTC
Q35 should never use IDE

Comment 18 Steven Rosenberg 2020-01-13 09:38:27 UTC
(In reply to Ryan Barry from comment #17)
> Q35 should never use IDE

OK, the BIOS Type for the VM is "Cluster Default", and the Cluster's default BIOS type is "Q35 Chipset With Legacy BIOS". So the problem is that we are sending the prefix (ide) instead of detecting that the BIOS Type is Q35 and then sending the suffix, which is SATA.

Comment 19 Shmuel Melamud 2020-02-26 10:33:07 UTC
> the default os configuration which is set to the cd interface of "ide,q35/sata" [3] [4]. The parsing extracts the ide prefix and this is what is sent as the bus

This is the main problem. The parsing should get the BIOS type, that is q35, and choose SATA. Why does it happen? Maybe an incorrect parser is used, or an incorrect BIOS type is passed to it?
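
The selection described above can be sketched in Python as follows. The format assumptions are mine: the value is treated as a comma-separated list in which an entry of the form "chipset/iface" applies only to that chipset, and an entry without a prefix is the fallback. `cd_interface_for` is a hypothetical helper and not the engine's actual parser.

```python
def cd_interface_for(value: str, chipset: str) -> str:
    """Pick the CD interface for a chipset from a value like 'ide,q35/sata'."""
    default = None
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if "/" in entry:
            prefix, iface = entry.split("/", 1)
            # A chipset-specific entry wins as soon as it matches.
            if prefix.strip().lower() == chipset.lower():
                return iface.strip().lower()
        elif default is None:
            # The first entry without a chipset prefix is the fallback.
            default = entry.lower()
    if default is None:
        raise ValueError(f"no CD interface for chipset {chipset!r} in {value!r}")
    return default


print(cd_interface_for("ide,q35/sata", "q35"))
print(cd_interface_for("ide,q35/sata", "i440fx"))
```

Under these assumptions, a parser that is handed the wrong chipset (or none at all) falls through to the unprefixed "ide" entry, which matches the behaviour reported in this bug.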

Comment 20 Steven Rosenberg 2020-02-26 17:58:21 UTC
(In reply to Shmuel Melamud from comment #19)
> > the default os configuration which is set to the cd interface of "ide,q35/sata" [3] [4]. The parsing extracts the ide prefix and this is what is sent as the bus
> 
> This is the main problem. The parsing should get the BIOS type, that is q35,
> and choose SATA. Why does it happen? Maybe incorrect parser is used or
> incorrect BIOS type is passed to it?

On further review, my test was as follows:

1. Q35 chipset type with a Q35 custom emulation machine type sends the sata CD interface, and Change CD for this combination is solved.
2. Legacy chipset type with the pc-440fx 7.6.0 custom emulation machine type sends the ide CD interface, and Change CD for this combination is solved.
3. The user can choose Legacy chipset type with the default Q35 emulation machine type, which sends the sata CD interface; although the VM can load, Change CD fails.

There seems to be a conflict and it is not clear if the user should be able to choose Legacy chipset with a Q35 custom emulation machine type.

Please advise accordingly

Comment 21 Ryan Barry 2020-02-26 19:55:17 UTC
No, they should not. Q35 and i440fx are orthogonal

Comment 22 Steven Rosenberg 2020-03-08 09:59:30 UTC
Further checking reveals other scenarios related to Q35 / Legacy mixing. They arise because the BIOS type defaults to the cluster's default, which is Legacy, while the emulation machine type's default has been changed to Q35. For example, after launching and shutting down a VM, attempting to relaunch the VM fails due to the following errors:

2020-03-05 13:08:36,152+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-23) [] EVENT_ID: VM_DOWN_ERROR(119), VM VM3 is down with error. Exit message: XML error: Invalid PCI address 0000:06:00.0. slot must be >= 1.
2020-03-05 13:08:36,160+02 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-23) [] add VM '42a6ca73-357a-46eb-a6d9-413af1922be8'(VM3) to rerun treatment
2020-03-05 13:08:36,181+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (ForkJoinPool-1-worker-23) [] Rerun VM '42a6ca73-357a-46eb-a6d9-413af1922be8'. Called from VDS 'Host2'
2020-03-05 13:08:36,215+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-425) [] EVENT_ID: USER_INITIATED_RUN_VM_FAILED(151), Failed to run VM VM3 on Host Host2.

Changing, say, the emulation machine type to pci-i440fx rhel 7.6.0 resolves these issues.

As per Comment 21, there seems to be a need for some, if not all, of the following:

1. Validating the BIOS type against the emulation machine type at the back end, to ensure that Legacy BIOS types and Q35 emulation machine types are not mixed.
2. Changing the defaults so that a Legacy BIOS type combined with a Q35 emulation machine type does not occur by default.
3. On the Webadmin, when a Legacy BIOS type is chosen, automatically updating the emulation machine type to a legacy value such as the pci-i440 choices, while choosing a Q35 BIOS type automatically changes the emulation machine type to a Q35 value, and vice versa (when the emulation machine type is changed, for example).
4. On the Webadmin side, when the BIOS type is Legacy, displaying only Legacy emulation machine types such as pci-i440 in the drop-down and filtering out Q35 values, whereas for Q35 BIOS types only Q35 emulation machine type values are displayed.

Please advise accordingly.

Comment 23 Ryan Barry 2020-03-08 21:45:28 UTC
Ok. We are shipping with q35/legacybios as the default. Not i440fx/legacy. Not Q35/uefi. The change CD operation should be changed to address it as SATA on q35, including on change operations and in vdsm code.

Comment 24 Martin Tessun 2020-03-09 10:31:54 UTC
Fully agree with Ryan's assessment in c#23 here.

Depending on the path of the CD (which depends on the BIOS), the change operation needs to address the "correct" device-path (either SATA (q35) or IDE (i440fx)).
I believe we cannot define a CD to be connected to virtio-scsi or scsi? If that is the case the decision tree gets even more complicated.

Cheers,
Martin

Comment 25 Steven Rosenberg 2020-03-09 11:35:33 UTC
(In reply to Martin Tessun from comment #24)
> Fully agree with Ryan's assessment in c#23 here.
> 
> Depending on the path of the CD (which depends on the BIOS) - the change
> operations needs to address the "correct" device-path (either sata (q35) or
> IDE (i440fx)).
> I believe we cannot define a CD to be connected to virtio-scsi or scsi? If
> that is the case the decision tree gets even more complicated.
> 
> Cheers,
> Martin

OK, then are we supporting i440fx emulation machine type with Q35 Bios Type and should IDE also be sent for that? This combination also currently fails to launch the VM.

It seems there are a few issues in this area.

Comment 26 Steven Rosenberg 2020-03-09 11:46:46 UTC
Created attachment 1668651 [details]
Change CD is now blocked

After rebasing, the change CD functionality is now blocked. This as well seems to be a regression. Maybe someone can advise.

Comment 27 Ryan Barry 2020-03-09 12:14:17 UTC
(In reply to Steven Rosenberg from comment #25)
> OK, then are we supporting i440fx emulation machine type with Q35 Bios Type
> and should IDE also be sent for that? This combination also currently fails
> to launch the VM.
> 
> It seems there are a few issues in this area.

This doesn't make any sense. Q35 is not a BIOS type. i440fx is also not a BIOS type.

Both are chipsets, like virtualized motherboards. Q35 does not support IDE at all. 

Q35 supports UEFI or legacybios, no IDE.
i440fx supports legacybios only, with IDE

CD changing should be unblocked, then fixed

Comment 28 Steven Rosenberg 2020-03-09 13:07:39 UTC
(In reply to Ryan Barry from comment #27)
> (In reply to Steven Rosenberg from comment #25)
> > OK, then are we supporting i440fx emulation machine type with Q35 Bios Type
> > and should IDE also be sent for that? This combination also currently fails
> > to launch the VM.
> > 
> > It seems there are a few issues in this area.
> 
> This doesn't make any sense. Q35 is not a BIOS type. i440fx is also not a
> BIOS type.
> 
> Both are chipsets, like virtualized motherboards. Q35 does not support IDE
> at all. 
> 
> Q35 supports UEFI or legacybios, no IDE.
> i440fx supports legacybios only, with IDE
> 
> CD changing should be unblocked, then fixed

OK, then it seems validation as per Comment 22 is in order, to disallow, say, the "Q35 Chipset with UEFI BIOS" BIOS type with the "pc-i440fx-7.x.0" emulation machine type? Or should these combinations also have worked, and be sending SATA (because it is a Q35 chipset)? As per Comment 23, there does seem to be a need to document the correct device path, that is, which combinations are supported, in order to define which combinations are disallowed. For example, should the "Q35 Chipset with Legacy BIOS" BIOS type support i440fx, or only Q35 emulation machine types, and should it then send SATA? Currently there are many issues in this area, including BZ # 1649847.

Comment 29 Ryan Barry 2020-03-09 13:31:09 UTC
No, as said in comment#21, they are orthogonal. Q35 machine types/bios types should not be able to be mixed with i440fx machine types/bios types, and this is not a supported configuration.

Q35 with legacy bios (or UEFI) should support only Q35 emulation types and should send SATA. i440fx should support only i440fx emulation types and ide
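
The rule stated above (Q35 goes with Q35 machine types and SATA, i440fx goes with i440fx machine types and IDE, and no mixing) can be written down as a small validation sketch. The helper names, the substring-based detection of the chipset from the machine type name, and the example machine type strings are assumptions made for illustration; this is not the engine's validation code.

```python
# CD bus per chipset, as stated in the comment above.
CD_BUS = {"q35": "sata", "i440fx": "ide"}


def chipset_of_machine_type(machine_type: str) -> str:
    """Guess the chipset from a machine type name (substring match, assumed)."""
    name = machine_type.lower()
    for chipset in CD_BUS:
        if chipset in name:
            return chipset
    raise ValueError(f"cannot determine chipset of {machine_type!r}")


def cd_bus_for(configured_chipset: str, machine_type: str) -> str:
    """Return the CD bus, rejecting mixed chipset / machine type settings."""
    actual = chipset_of_machine_type(machine_type)
    if actual != configured_chipset.lower():
        raise ValueError(
            f"unsupported combination: {configured_chipset} chipset "
            f"with {machine_type} machine type"
        )
    return CD_BUS[actual]


# The firmware (legacy BIOS or UEFI) is deliberately not an input here:
# only the chipset decides the bus.
print(cd_bus_for("q35", "pc-q35-rhel8.1.0"))
print(cd_bus_for("i440fx", "pc-i440fx-rhel7.6.0"))
```

A mixed request such as `cd_bus_for("q35", "pc-i440fx-rhel7.6.0")` raises `ValueError` in this sketch, mirroring the "not a supported configuration" statement.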

Comment 30 Martin Tessun 2020-03-09 15:51:35 UTC
Hi Steve,

for further clarification:
1. machine-types of "pc"-type are based on i440fx chipset from Intel. This chipset does have ISA.
2. machine-types of "q35"-type are based on ICH9 (Q35) chipset from Intel. This chipset does not support ISA.

Think of it as hardware - they are using different mainboards - but the same BIOS (SeaBIOS) whereas the q35-based machines can also have a UEFI based BIOS. Still the device-paths for CD do not change:

1. q35 has SATA->CD
2. i440fx has ISA->CD

If you need more information on q35 vs i440fx in qemu, check this page: https://wiki.qemu.org/Features/Q35
Also an old presentation (2012) from KVM Forum might help: https://www.linux-kvm.org/images/0/06/2012-forum-Q35.pdf

That said you cannot even do a '"Q35 Chipset with UEFI BIOS" Bios Type with "pc-i440fx-7.x.0" emulation machine type?' because the machine-type pc-i440fx-7.x.0 is i440fx chipset and not q35. As Ryan spelled out several times they are orthogonal.

Cheers,
Martin

Comment 31 Steven Rosenberg 2020-03-09 16:15:51 UTC
(In reply to Martin Tessun from comment #30)

I understood they were orthogonal, but as per Comment 23, Ryan stated that the default would be q35/Legacybios. As I understand it, that is a Q35 emulation machine type with a Legacy BIOS type, which is what the engine currently does by default, but it leads to failures when changing the CD as well as when loading a VM after the first time: subsequent loading fails. It does send <target dev="sdc" bus="sata"/> for the cdrom device, as per your explanation. So the issue is this: if we allow this combination because we treat Legacy BIOS as Q35 when the emulation machine type is q35 (the default scenario), and we send sata in the domain XML every time we load the VM, then the failure of the second load is a libvirt bug.

This is really the question: whether to disallow the combination, or whether it is a problem in libvirt. I understand they are independent, but one needs to know which combination libvirt expects.

Comment 32 Ryan Barry 2020-03-09 19:46:49 UTC
This is not a libvirt bug. It works as expected with Q35 VMs outside of RHV (virt-manager, for example).

The engine manages libvirt XML, including on VM shutdown/start and CD changes, and we are in complete control of this. comment#19 (the parser) addresses the concerns. The engine knows everything about the VM, including what chipset and firmware is being used.

Again, i440fx should always use ide.
Q35 should always use SATA.
Mixing "Q35 chipset with i440fx machine type" should not be allowed.

It doesn't matter whether Q35 uses a legacy bios or not. It is still Q35. We should block the combination of "Q35 chipset with i440fx machine type", but users would need to manually set a custom machine type in that case anyway, and the VM won't start.

Please test the following:

* Q35 machine (doesn't matter what BIOS type) starts with SATA. CD change is sent as SATA. Shutting down/restarting the VM keeps the type as SATA.
* The same for i440fx

Blocking the combination can be done after we establish whether there are bugs anywhere here. And, to be clear, if there are bugs, they are almost certainly RHV, not libvirt. comment#15 clearly shows the root cause and what the engine needs to do.

Comment 33 Steven Rosenberg 2020-03-10 11:46:42 UTC
(In reply to Ryan Barry from comment #32)

OK, this answers some of the questions about what is supported. Shmuel did fix the Change CD issue by sending sata for Q35, as tested previously in Comment 20. The remaining issues concern mixing: as I understand it, the i440fx emulation machine type (emulating the hardware) will not be supported with Q35 chipset BIOS types.

It seems Q35 emulation machine types can be supported with Legacy BIOS types. In testing, this failed as per Comment 20, Item 3. Comment 22 shows there are other issues with this combination, which seem to involve an invalid slot. Comment 26 shows that the Remote Viewer no longer allows further testing. Upon some investigation, the Remote Viewer is not part of the engine, and this currently blocks further testing on this issue.

Comment 35 Eduardo Lima (Etrunko) 2020-03-10 20:27:52 UTC
(In reply to Steven Rosenberg from comment #26)
> Created attachment 1668651 [details]
> Change CD is now blocked
> 
> After rebasing, the change CD functionality is now blocked. This as well
> seems to be a regression. Maybe someone can advise.

This is a different message than the first one. Can you provide the logs for remote-viewer just as provided in comment #1?

$ REST_DEBUG=proxy remote-viewer console.vv --spice-debug

Comment 36 Steven Rosenberg 2020-03-11 09:14:52 UTC
Created attachment 1669197 [details]
Remote Viewer debug fails to connect

As per the screen shot, running the command suggested [1] fails due to a connection type failure. We think this may be part of the problem. 



[1] REST_DEBUG=proxy remote-viewer console.vv --spice-debug

Comment 37 Eduardo Lima (Etrunko) 2020-03-11 13:02:40 UTC
> Created attachment 1669197 [details]
> Remote Viewer debug fails to connect
> 
> As per the screen shot, running the command suggested [1] fails due to a
> connection type failure. We think this may be part of the problem. 


This error usually happens when the console.vv file is not found. Please make sure you have it in the directory provided. I suggest you pass the full path for the console.vv file as argument.

Comment 38 Steven Rosenberg 2020-03-11 18:14:33 UTC
(In reply to Eduardo Lima (Etrunko) from comment #37)
> > Created attachment 1669197 [details]
> > Remote Viewer debug fails to connect
> > 
> > As per the screen shot, running the command suggested [1] fails due to a
> > connection type failure. We think this may be part of the problem. 
> 
> 
> This error usually happens when the console.vv file is not found. Please
> make sure you have it in the directory provided. I suggest you pass the full
> path for the console.vv file as argument.

OK, now it worked by choosing show in folder. This is the error:

 GET /ovirt-engine/api HTTP/1.1
> Soup-Debug-Timestamp: 1583950293
> Soup-Debug: SoupSessionAsync 1 (0x55c1b8d4c100), SoupMessage 1 (0x55c1b8c8c0c0), SoupSocket 1 (0x7fee28008980)
> Host: localhost.localdomain:8443
> Content-Type: application/xml
> Version: 3
> All-Content: true
> Filter: false
> Authorization: Bearer jHRxchFgEb3lZ_gHnxt0VjGp3o4c6WB3KcGRY1c6lT6RzSQiHmOUaJKdIFoPvuib0up70Vk1G3aCG5vm_b-vlQ
> Connection: Keep-Alive
> 
> <action></action>
  
(remote-viewer:10441): GSpice-DEBUG: 20:11:33.056: spice-widget.c:1876 0:0 focus_out_event
(remote-viewer:10441): GSpice-DEBUG: 20:11:33.056: spice-widget.c:1484 0:0 release_keys
< HTTP/1.1 405 Method Not Allowed
< Soup-Debug-Timestamp: 1583950293
< Soup-Debug: SoupMessage 1 (0x55c1b8c8c0c0)
< Connection: keep-alive
< Content-Type: text/html;charset=UTF-8
< Content-Length: 103
< Correlation-Id: 924635c1-41c1-4703-b341-98e069c2fe5d
< Date: Wed, 11 Mar 2020 18:11:33 GMT
< 
< <html><head><title>Error</title></head><body>HTTP method GET is not supported by this URL</body></html>
  
(remote-viewer:10441): GSpice-DEBUG: 20:11:43.514: spice-gtk-session.c:587 clipboard_get_targets:
(remote-viewer:10441): GSpice-DEBUG: 20:11:43.514: spice-gtk-session.c:622  "TIMESTAMP"
(remote-viewer:10441): GSpice-DEBUG: 20:11:43.514: spice-gtk-session.c:622  "TARGETS"
(remote-viewer:10441): GSpice-DEBUG: 20:11:43.514: spice-gtk-session.c:622  "MULTIPLE"
(remote-viewer:10441): GSpice-DEBUG: 20:11:43.514: spice-gtk-session.c:622  "UTF8_STRING"
(remote-viewer:10441): GSpice-DEBUG: 20:11:43.514: spice-gtk-session.c:622  "COMPOUND_TEXT"
(remote-viewer:10441): GSpice-DEBUG: 20:11:43.514: spice-gtk-session.c:622  "TEXT"
(remote-viewer:10441): GSpice-DEBUG: 20:11:43.514: spice-gtk-session.c:622  "STRING"
(remote-viewer:10441): GSpice-DEBUG: 20:11:43.514: spice-gtk-session.c:622  "text/plain;charset=utf-8"
(remote-viewer:10441): GSpice-DEBUG: 20:11:43.514: spice-gtk-session.c:622  "text/plain"
(remote-viewer:10441): GSpice-DEBUG: 20:11:48.862: spice-gtk-session.c:587 clipboard_get_targets:
(remote-viewer:10441): GSpice-DEBUG: 20:11:48.862: spice-gtk-session.c:622  "TIMESTAMP"
(remote-viewer:10441): GSpice-DEBUG: 20:11:48.862: spice-gtk-session.c:622  "TARGETS"
(remote-viewer:10441): GSpice-DEBUG: 20:11:48.862: spice-gtk-session.c:622  "MULTIPLE"
(remote-viewer:10441): GSpice-DEBUG: 20:11:48.862: spice-gtk-session.c:622  "UTF8_STRING"
(remote-viewer:10441): GSpice-DEBUG: 20:11:48.862: spice-gtk-session.c:622  "COMPOUND_TEXT"
(remote-viewer:10441): GSpice-DEBUG: 20:11:48.862: spice-gtk-session.c:622  "TEXT"
(remote-viewer:10441): GSpice-DEBUG: 20:11:48.862: spice-gtk-session.c:622  "STRING"
(remote-viewer:10441): GSpice-DEBUG: 20:11:48.862: spice-gtk-session.c:622  "text/plain;charset=utf-8"
(remote-viewer:10441): GSpice-DEBUG: 20:11:48.863: spice-gtk-session.c:622  "text/plain"
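
The relevant facts in a trace like the one above (request line, the Version header, and the response status) are easy to lose among the GSpice-DEBUG lines. A minimal sketch for pulling them out, assuming the trace keeps the "> " and "< " prefixes shown above; the function name summarize_soup_trace is hypothetical:

```python
import re

REQUEST_LINE = re.compile(r"^(> )?(GET|POST|PUT|DELETE|HEAD|OPTIONS) \S+ HTTP/")


def summarize_soup_trace(text):
    """Extract request line, request headers and response status."""
    summary = {"request": None, "request_headers": {}, "status": None}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("< HTTP/"):
            # e.g. "< HTTP/1.1 405 Method Not Allowed"
            parts = line[2:].split(" ", 2)
            reason = parts[2] if len(parts) > 2 else ""
            summary["status"] = (int(parts[1]), reason)
        elif line.startswith("> ") and ": " in line:
            # Request header, e.g. "> Version: 3"
            name, value = line[2:].split(": ", 1)
            summary["request_headers"][name] = value
        elif REQUEST_LINE.match(line):
            # The request line may or may not carry the "> " prefix.
            summary["request"] = line[2:] if line.startswith("> ") else line
    return summary
```

Applied to the trace in this comment, the summary would show a request carrying "Version: 3" answered with status 405.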

Comment 41 Sandro Bonazzola 2020-03-16 08:47:29 UTC
Looks like libgovirt is using SDK v3 for connecting to oVirt. SDK v3 was deprecated in the past versions and has been removed in oVirt 4.4.
We need libgovirt to be refactored for using SDK v4.

Comment 42 Steven Rosenberg 2020-03-16 09:07:11 UTC
(In reply to Sandro Bonazzola from comment #41)
> Looks like libgovirt is using SDK v3 for connecting to oVirt. SDK v3 was
> deprecated in the past versions and has been removed in oVirt 4.4.
> We need libgovirt to be refactored for using SDK v4.


Sandro is referring to this error from the ovirt-engine when the Change CD on the Remote Viewer is invoked [1]


This is a blocker for the original issue.

Please advise.

[1] 2020-03-15 13:46:42,918+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-6) [7c6f1298] EVENT_ID: DEPRECATED_API(13,000), Client from address "0:0:0:0:0:0:0:1" is using version 3 of the API, which has been deprecated since version 4.0 of the engine, and will no longer be supported starting with version 4.3. It is highly recommended to update the client to use a supported version of the API and the SDKs, before upgrading to version 4.3 of the engine.
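
To illustrate the difference at the HTTP level: the trace in comment #38 shows the client sending "Version: 3". A minimal sketch of a request that asks for version 4 through the same header follows. This is not how libgovirt (a C library) implements it; the host name and token are placeholders, and build_api_request is a hypothetical helper. The request is only constructed here, not sent.

```python
import urllib.request


def build_api_request(base_url, token, api_version=4):
    """Build (but do not send) a GET request for the engine API entry point."""
    url = base_url.rstrip("/") + "/ovirt-engine/api"
    headers = {
        # The header that carried "3" in the failing trace.
        "Version": str(api_version),
        "Accept": "application/xml",
        # Placeholder token; a real one comes from the engine's SSO.
        "Authorization": f"Bearer {token}",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


# Placeholder engine address and token, for illustration only.
request = build_api_request("https://engine.example.com", "TOKEN")
```

Sending it with urllib.request.urlopen(request) would additionally require the engine CA certificate to be trusted by the client.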

Comment 52 Steven Rosenberg 2020-03-20 08:36:32 UTC
Created attachment 1671772 [details]
Engine log when simulating Change CD

Comment 53 Steven Rosenberg 2020-03-20 08:50:05 UTC
Created attachment 1671794 [details]
VDSM Log

Note: There are errors in the vdsm log from libvirt, but they happen too often to be related to the Change CD which was only tested twice somewhere between 10:22 and 10:26:

2020-03-20 10:23:06,475+0200 ERROR (jsonrpc/7) [virt.vm] (vmId='21830a63-f34f-4851-b206-b8d9b3224213') Operation failed (vm:4788)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 4753, in setCpuTuneQuota
    self._dom.setSchedulerParameters({'vcpu_quota': int(quota)})
  File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 2555, in setSchedulerParameters
    if ret == -1: raise libvirtError ('virDomainSetSchedulerParameters() failed', dom=self)
libvirt.libvirtError: Requested operation is not valid: cgroup CPU controller is not mounted
2020-03-20 10:23:06,475+0200 INFO  (jsonrpc/7) [api.virt] FINISH setCpuTuneQuota return={'status': {'code': 62, 'message': 'Requested operation is not valid: cgroup CPU controller is not mounted'}} from=::1,46100, vmId=21830a63-f34f-4851-b206-b8d9b3224213 (api:54)
2020-03-20 10:23:06,475+0200 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call VM.setCpuTuneQuota failed (error 62) in 0.00 seconds (__init__:312)
2020-03-20 10:23:06,478+0200 INFO  (jsonrpc/0) [api.virt] START setCpuTunePeriod(period=100000) from=::1,46100, vmId=21830a63-f34f-4851-b206-b8d9b3224213 (api:48)
2020-03-20 10:23:06,478+0200 ERROR (jsonrpc/0) [virt.vm] (vmId='21830a63-f34f-4851-b206-b8d9b3224213') Operation failed (vm:4788)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 4765, in setCpuTunePeriod
    self._dom.setSchedulerParameters({'vcpu_period': int(period)})
  File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 2555, in setSchedulerParameters
    if ret == -1: raise libvirtError ('virDomainSetSchedulerParameters() failed', dom=self)
libvirt.libvirtError: Requested operation is not valid: cgroup CPU controller is not mounted
2020-03-20 10:23:06,478+0200 INFO  (jsonrpc/0) [api.virt] FINISH setCpuTunePeriod return={'status': {'code': 62, 'message': 'Requested operation is not valid: cgroup CPU controller is not mounted'}} from=::1,46100, vmId=21830a63-f34f-4851-b206-b8d9b3224213 (api:54)
2020-03-20 10:23:06,478+0200 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call VM.setCpuTunePeriod failed (error 62) in 0.00 seconds (__init__:312)
2020-03-20 10:23:08,484+0200 INFO  (periodic/0) [vdsm.api] START repoStats(domains=()) from=internal, task_id=9dc8fbfd-4dd0-45f1-99b1-1f44a0490e3c (api:48)
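
The claim that these errors are unrelated can be checked by counting ERROR entries inside and outside the test window. A minimal sketch, assuming the vdsm.log line layout shown above (timestamp, level, thread, logger, message); the function name count_errors is hypothetical:

```python
import re
from collections import Counter
from datetime import datetime

LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}[+-]\d{4} "
    r"(?P<level>[A-Z]+)\s+\((?P<thread>[^)]*)\) "
    r"\[(?P<logger>[^\]]+)\] (?P<msg>.*)$"
)


def count_errors(lines, start, end):
    """Count ERROR messages inside and outside the window [start, end]."""
    inside, outside = Counter(), Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match or match.group("level") != "ERROR":
            continue  # traceback lines and non-ERROR entries are skipped
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
        # Drop the per-VM prefix so identical failures group together.
        msg = re.sub(r"^\(vmId='[^']*'\) ", "", match.group("msg"))
        bucket = inside if start <= ts <= end else outside
        bucket[msg] += 1
    return inside, outside
```

If the same message appears about as often outside the 10:22 to 10:26 window as inside it, it is unlikely to be caused by the Change CD test.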

Comment 54 Steven Rosenberg 2020-03-20 08:51:30 UTC
Created attachment 1671796 [details]
qemu log

Comment 57 Steven Rosenberg 2020-03-23 15:44:26 UTC
Created attachment 1672742 [details]
Legacy Bios with Q35 emulation still an issue

Installing libgovirt version 0.3.4-10.el8_1 on the engine did resolve the issue with the "Method Not Allowed". With this version, the Change CD does display the iso file list. Please ensure this version is installed with ovirt for further testing.

As per Comment 20, matching Legacy Bios Type with i440fx Emulation Machine Type works as does matching Q35 Bios Type with Q35 Emulation Machine Type.

The remaining issue is choosing Legacy Bios Type with Q35 Emulation Machine Type, which causes an error when double-clicking an iso file in the list, as shown in this screen shot.

Since this is an advanced feature, it is up to the user to choose a matching Bios Type and Emulation Machine Type. Therefore, this seems to resolve these issues.

Comment 58 Ryan Barry 2020-03-23 15:46:28 UTC
Great. ON_QA, then?

Comment 59 Radek Duda 2020-03-27 15:08:16 UTC
I tried changing the CD with the new libgovirt version 0.3.4-10.el8_1.
If I use a legacy BIOS type cluster, which is the default in RHV 4.4, Q35 emulation is chosen (Cluster default in the VM creation dialog), and the CD changing feature then does not work.
Shouldn't i440fx emulation be chosen as the Cluster parameter when choosing legacy BIOS?
I do not think this bug is properly solved when it does not work with default values.

Comment 60 Ryan Barry 2020-03-27 17:42:34 UTC
Is that a mix of machine types, or just BIOS? If it's just BIOS, then yes, this needs more work

Comment 61 Steven Rosenberg 2020-03-29 08:40:42 UTC
(In reply to Radek Duda from comment #59)
> I tried changing the CD with the new libgovirt version 0.3.4-10.el8_1.
> If I use a legacy BIOS type cluster, which is the default in RHV 4.4, Q35
> emulation is chosen (Cluster default in the VM creation dialog), and the CD
> changing feature then does not work.
> Shouldn't i440fx emulation be chosen as the Cluster parameter when choosing
> legacy BIOS?
> I do not think this bug is properly solved when it does not work with
> default values.

The default values are currently incorrect because the fix for them is still pending. To test this, make sure the Bios Type and the emulation machine type match: if the Bios Type is Legacy Bios, the emulation machine type must be i440fx; if the Bios Type is a Q35 Bios Type, the emulation machine type must be a Q35 emulation machine type.
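
The matching rule described here can be written down as a small check. This is only a sketch of the rule as stated in this bug, not engine code; the label strings are taken from the comments in this bug, and the helper names are hypothetical.

```python
def chipset_of_bios_type(bios_type):
    """Map a Bios Type label to its chipset family.

    Assumption: any label mentioning Q35 is a Q35 Bios Type; plain
    "Legacy BIOS" belongs to the i440fx family.
    """
    return "q35" if "q35" in bios_type.lower() else "i440fx"


def chipset_of_machine_type(machine_type):
    """Map an emulated machine type name to its chipset family."""
    name = machine_type.lower()
    if "q35" in name:
        return "q35"
    if "i440fx" in name:
        return "i440fx"
    raise ValueError(f"unrecognized machine type: {machine_type}")


def is_consistent(bios_type, machine_type):
    """True when the Bios Type and the emulated machine type match."""
    return chipset_of_bios_type(bios_type) == chipset_of_machine_type(machine_type)
```

For example, is_consistent("Legacy BIOS", "Q35-rhel-8.2.0") is False, which is the combination reported as failing in comment #59.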

Comment 62 Nisim Simsolo 2020-04-05 13:51:22 UTC
Verified: 
ovirt-engine-4.4.0-0.29.master.el8ev
vdsm-4.40.9-1.el8ev.x86_64
libvirt-daemon-6.0.0-16.module+el8.2.0+6131+4e715f3b.x86_64
qemu-kvm-4.2.0-17.module+el8.2.0+6131+4e715f3b.x86_64

Verification scenario:
For steps 1-4, set the RHEL 8 OS type in the edit VM dialog.
1. Run RHEL8 VM: Q35 chipset with legacy BIOS and Q35-rhel-8.2.0 custom emulated machine type
2. Run RHEL8 VM: Q35 chipset with UEFI BIOS and Q35-rhel-8.2.0 custom emulated machine type
3. Run RHEL8 VM: Q35 chipset with SecureBoot and Q35-rhel-8.2.0 custom emulated machine type
4. Run RHEL8 VM: Legacy BIOS with pc-i440fx-rhel7.6.0 custom emulated machine type

Steps 1-4 expected results:
Verify it is possible to change the CD more than 10 times (on each change, the CD should be readable from the VM).
Verify it is possible to remove and add the CD 10 times (on each change, the CD should be readable from the VM).
Observe vdsm.log and verify there are no errors related to this action and that the target bus in the VM domxml is never set to IDE (the bus should be sata, except for step 4).

5. Repeat steps 1-4 for Windows 10 VM.
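
The domxml part of the expected results can be checked mechanically. A minimal sketch, assuming the VM domain XML has been saved as text (for example from vdsm.log or from virsh dumpxml) and uses the usual libvirt layout of disk elements with a target child; the function name cdrom_buses is hypothetical:

```python
import xml.etree.ElementTree as ET


def cdrom_buses(domxml):
    """Return the target bus of every cdrom device in a domain XML."""
    root = ET.fromstring(domxml)
    buses = []
    for disk in root.findall("./devices/disk"):
        if disk.get("device") != "cdrom":
            continue  # only CD-ROM devices matter for this check
        target = disk.find("target")
        if target is not None:
            buses.append(target.get("bus"))
    return buses
```

For steps 1-3 the returned list should contain only "sata"; for step 4 (i440fx) an IDE bus is expected.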

Comment 63 Sandro Bonazzola 2020-05-20 20:01:59 UTC
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be
resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

