Bug 1560666 - Hosted Engine VM (deployed in the past) fails to reboot with 'libvirtError: internal error: failed to format device alias for PTY retrieval' due to an error in console device in libvirt XML generated by the engine
Summary: Hosted Engine VM (deployed in the past) fails to reboot with 'libvirtError: internal error: failed to format device alias for PTY retrieval' due to an error in console device in libvirt XML generated by the engine
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-ha
Classification: oVirt
Component: Agent
Version: 2.2.5
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ovirt-4.2.2
Target Release: 2.2.10
Assignee: Andrej Krejcir
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On: 1556971 1566072 1566111
Blocks: 1458711 1504606 1560976
 
Reported: 2018-03-26 16:59 UTC by Simone Tiraboschi
Modified: 2018-04-27 07:22 UTC
CC List: 9 users

Fixed In Version: ovirt-hosted-engine-ha-2.2.10-1.el7ev
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 1560976
Environment:
Last Closed: 2018-04-27 07:22:03 UTC
oVirt Team: SLA
Embargoed:
rule-engine: ovirt-4.2+
rule-engine: blocker+




Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 89483 0 master MERGED Ignore engine generated libvirt XML 2018-03-27 11:58:35 UTC
oVirt gerrit 89485 0 v2.2.z MERGED Ignore engine generated libvirt XML 2018-03-27 11:58:53 UTC
oVirt gerrit 89784 0 master MERGED ovf: consume engine XML eventually truncating console alias 2018-04-04 07:53:06 UTC
oVirt gerrit 89805 0 v2.2.z MERGED ovf: consume engine XML eventually truncating console alias 2018-04-04 08:39:10 UTC

Description Simone Tiraboschi 2018-03-26 16:59:08 UTC
Description of problem:
Since 4.2.2, ovirt-ha-agent extracts the libvirt XML generated by the engine and passes it to VDSM.
On systems deployed in the past (with the vintage otopi flow), the hosted-engine VM contains a console device that is rendered incorrectly in that XML.

        <console type="pty">
            <target port="0" type="virtio"/>
            <alias name="ua-c60aba6e-b6d8-448b-ab6e-8c7b5c29f351"/>
        </console>

Systems deployed with the ansible flow are not affected, since the engine VM was created by the engine via the REST API.
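
For illustration, one quick way to check whether a given engine-generated domain XML carries such an engine-assigned console alias is to parse it and inspect the console device. A minimal sketch, assuming the XML is available as a string; the helper name is hypothetical and is not part of ovirt-hosted-engine-ha:

    import xml.etree.ElementTree as ET

    def has_engine_console_alias(domain_xml):
        """Return True if any <console> device carries an engine-assigned 'ua-' alias."""
        root = ET.fromstring(domain_xml)
        for console in root.iter('console'):
            alias = console.find('alias')
            if alias is not None and alias.get('name', '').startswith('ua-'):
                return True
        return False

    domain_xml = """<domain type='kvm'>
      <devices>
        <console type='pty'>
          <target port='0' type='virtio'/>
          <alias name='ua-c60aba6e-b6d8-448b-ab6e-8c7b5c29f351'/>
        </console>
      </devices>
    </domain>"""
    print(has_engine_console_alias(domain_xml))  # True on an affected system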


Version-Release number of selected component (if applicable):
4.2.2-rc5 (beta3)

How reproducible:
?

Steps to Reproduce:
1. deploy hosted-engine with the vintage flow
2. wait for the engine to create OVF_STORE disks
3. try to restart the engine VM

Actual results:
the XML generated by the engine contains
        <console type="pty">
            <target port="0" type="virtio"/>
            <alias name="ua-c60aba6e-b6d8-448b-ab6e-8c7b5c29f351"/>
        </console>

the VM fails to start with:
2018-03-26 11:54:31,272-0400 ERROR (vm/eccb9cb6) [virt.vm] (vmId='eccb9cb6-affd-4806-8200-708370581227') The vm start process failed (vm:940)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 869, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2832, in _run
    dom.createWithFlags(flags)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1099, in createWithFlags
    if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
libvirtError: internal error: failed to format device alias for PTY retrieval

Expected results:
The engine VM on systems deployed in the past is able to restart, also when booting from the XML created by the engine.

Additional info:
It only affects systems deployed with the vintage flow.

Comment 1 jiyan 2018-03-28 07:02:29 UTC
Hi, Simone.

Version:
vdsm-4.20.23-1.el7ev.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.1.x86_64
kernel-3.10.0-862.el7.x86_64
libvirt-3.9.0-14.virtcov.el7_5.2.x86_64

Could you please help me with how to configure a "pty" console in the RHV web UI? I can only see a 'unix' console generated. Thank you very much.
...
    <console ** type='unix' **>
      <source mode='bind' path='/var/run/ovirt-vmconsole-console/df899f5c-db94-48b2-867a-e0c266b59b7a.sock'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
...

Another doubt: I could not see a 'ua-' alias for the console, serial, and channel devices when checking the dumped XML of the VM on the registered host.
...
    <serial type='unix'>
      <source mode='bind' path='/var/run/ovirt-vmconsole-console/df899f5c-db94-48b2-867a-e0c266b59b7a.sock'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
      ** <alias name='serial0'/>
    </serial>
    <console type='unix'>
      <source mode='bind' path='/var/run/ovirt-vmconsole-console/df899f5c-db94-48b2-867a-e0c266b59b7a.sock'/>
      <target type='serial' port='0'/>
      ** <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channels/df899f5c-db94-48b2-867a-e0c266b59b7a.ovirt-guest-agent.0'/>
      <target type='virtio' name='ovirt-guest-agent.0' state='disconnected'/>
      ** <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channels/df899f5c-db94-48b2-867a-e0c266b59b7a.org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/>
      ** <alias name='channel1'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
...

Comment 2 Simone Tiraboschi 2018-03-28 07:13:43 UTC
In the vintage hosted-engine deployment, hosted-engine-setup directly created a VM over vdsm and the engine imported it (adding aliases and so on); I'm not sure you can reproduce this properly by creating a VM from the admin UI.

To reproduce on hosted-engine:
1. deploy hosted-engine with the vintage flow (4.1, or 4.2 passing the --noansible option to ovirt-hosted-engine-setup)
2. connect to the engine and add another storage domain
3. wait for the engine VM to be imported by the engine
4. wait for the OVF_STORE disks to appear (normally after 60 minutes; you can force them to appear earlier by editing the engine VM configuration from the engine)

Comment 3 Andrej Krejcir 2018-03-28 09:42:54 UTC
I tried reproducing this with master vdsm and ovirt-hosted-engine (without the workaround patch) on CentOS 7.4.
The VM starts successfully, there is no error.

The XML generated by the engine contains:
<console type="pty">
   <target type="virtio" port="0" />
   <alias name="ua-ba3264b3-1a04-4e7b-a590-9c4528d63ac6" />
</console>

And after the HE VM starts, 'virsh -r dumpxml HostedEngine' shows:
<console type='pty' tty='/dev/pts/2'>
  <source path='/dev/pts/2'/>
  <target type='virtio' port='0'/>
  <alias name='console0'/>
</console>


Versions:
  libvirt - 3.2.0-14.el7_4.9
  vdsm - 4.30.0-176.gitb930fd4.el7.centos
  ovirt-hosted-engine-ha - 2.3.0-0.0.master.20180323105559.20180323105555.git558fa11.el7.centos
  ovirt-hosted-engine-setup - 2.3.0-0.0.master.20180323165102.git5a3d63d.el7.centos


I will try it again with a newer version of libvirt.

Comment 4 Simone Tiraboschi 2018-03-28 10:04:04 UTC
(In reply to Andrej Krejcir from comment #3)
> I tried reproducing this with master vdsm and ovirt-hosted-engine (without
> the workaround patch) on centos 7.4.

As far as I understood, it's specific to RHEL 7.5 with a fresher libvirt.

Comment 5 Martin Sivák 2018-04-03 15:38:08 UTC
The fix is wrong. We should fix the alias when starting the VM instead of disabling a feature that serves as a base for dozens of other fixes.
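
For illustration, this is the direction the follow-up patches took (gerrit 89784/89805, "ovf: consume engine XML eventually truncating console alias"). A minimal sketch of the idea, not the actual ovirt-hosted-engine-ha code; the 24-character cap is only inferred from the truncated alias visible later in comment 6 and is an assumption:

    import xml.etree.ElementTree as ET

    ALIAS_MAX_LEN = 24  # inferred from 'ua-633d97eb-5b89-4774-8c' in comment 6; an assumption

    def truncate_console_alias(domain_xml):
        """Shorten an over-long engine-assigned console alias before handing
        the XML to VDSM, so libvirt can map the console device to its PTY."""
        root = ET.fromstring(domain_xml)
        for console in root.iter('console'):
            alias = console.find('alias')
            if alias is None:
                continue
            name = alias.get('name', '')
            if name.startswith('ua-') and len(name) > ALIAS_MAX_LEN:
                alias.set('name', name[:ALIAS_MAX_LEN])
        return ET.tostring(root)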

Comment 6 Nikolai Sednev 2018-04-22 12:02:57 UTC
1. deployed hosted-engine with the vintage flow over iSCSI.
2. connected to the engine and added another NFS storage domain.
3. waited for the engine VM to be imported by the engine.
4. waited for the OVF_STORE disks to appear (I forced them to appear earlier by editing the engine VM and attaching 2 additional vCPUs from the UI).
5. set the host into global maintenance and powered off the engine.
6. removed global maintenance.
7. waited for the engine to get started by ha-agent.
8. the engine got started and I connected to its UI.

virsh -r dumpxml HostedEngine
    </interface>
    <console type='pty' tty='/dev/pts/1'>
      <source path='/dev/pts/1'/>
      <target type='virtio' port='0'/>
      <alias name='ua-633d97eb-5b89-4774-8c'/>
    </console>

libvirt-client-3.9.0-14.el7_5.3.x86_64
ovirt-hosted-engine-setup-2.2.18-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.10-1.el7ev.noarch
rhvm-appliance-4.2-20180420.0.el7.noarch
vdsm-common-4.20.26-1.el7ev.noarch
Linux 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

Comment 7 Nikolai Sednev 2018-04-22 12:09:50 UTC
Works for me using the vintage deployment flow on the latest components, as described in comment #6. Moving to verified.
Please feel free to reopen in case you are still able to reproduce.

Comment 8 Sandro Bonazzola 2018-04-27 07:22:03 UTC
This bugzilla is included in the oVirt 4.2.2 release, published on March 28th 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

