Bug 1605198 - [downstream clone - 4.2.5] Hit Xorg Segmentation fault while installing rhel7.4 release guest in RHV 4.2 with QXL
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.2.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.2.5
Target Release: ---
Assignee: Milan Zamazal
QA Contact: Liran Rotenberg
URL:
Whiteboard:
Depends On: 1520848
Blocks:
Reported: 2018-07-20 12:34 UTC by RHV bug bot
Modified: 2021-03-11 20:22 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
This update adjusts virtual machine video RAM settings to provide enough RAM for any modern Linux guest operating system, not just RHEL 7.
Clone Of: 1520848
Environment:
Last Closed: 2018-07-31 17:49:18 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2018:2318 | 0 | None | None | None | 2018-07-31 17:50:03 UTC
oVirt gerrit | 92145 | 0 | None | MERGED | core: Set vramMultiplier for all x86 Linux systems | 2019-11-27 02:27:10 UTC
oVirt gerrit | 92329 | 0 | None | MERGED | core: Set vramMultiplier for all x86 Linux systems | 2019-11-27 02:27:10 UTC

Description RHV bug bot 2018-07-20 12:34:11 UTC
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1520848 +++
======================================================================

Created attachment 1363136 [details]
X.log

Description of problem:
Installing a RHEL 7.4 release guest in RHV 4.2 with QXL fails with an Xorg segmentation fault.

Version-Release number of selected component (if applicable):
xorg-x11-drv-qxl-0.1.5-3.el7
vdsm-4.20.8-1.el7ev.x86_64
libvirt-daemon-3.2.0-14.el7_4.4.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.12.x86_64
ovirt-engine-dashboard-1.2.0-0.8.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Try to install a guest from the RHV 4.2 web console with Video Type: QXL. The relevant XML is:
    <video>
      <model type='qxl' ram='65536' vram='8192' vgamem='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>

2. The guest installation fails with the following backtrace:
[  9610.279] (EE) Backtrace:
[  9610.354] (EE) 0: /usr/bin/Xorg (xorg_backtrace+0x55) [0x558dca8a1655]
[  9610.354] (EE) 1: /usr/bin/Xorg (0x558dca6f5000+0x1b0369) [0x558dca8a5369]
[  9610.354] (EE) 2: /lib64/libpthread.so.0 (0x7fc1b36b4000+0xf5e0) [0x7fc1b36c35e0]
[  9610.354] (EE) 3: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fc1aeb76000+0x8f34) [0x7fc1aeb7ef34]
[  9610.354] (EE) 4: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fc1aeb76000+0x9456) [0x7fc1aeb7f456]
[  9610.354] (EE) 5: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fc1aeb76000+0x17075) [0x7fc1aeb8d075]
[  9610.354] (EE) 6: /usr/bin/Xorg (miCopyRegion+0x1ba) [0x558dca8824ea]
[  9610.354] (EE) 7: /usr/bin/Xorg (miDoCopy+0x470) [0x558dca882a90]
[  9610.354] (EE) 8: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fc1aeb76000+0x15de6) [0x7fc1aeb8bde6]
[  9610.354] (EE) 9: /usr/bin/Xorg (0x558dca6f5000+0x138c3f) [0x558dca82dc3f]
[  9610.354] (EE) 10: /usr/bin/Xorg (0x558dca6f5000+0x4f982) [0x558dca744982]
[  9610.354] (EE) 11: /usr/bin/Xorg (0x558dca6f5000+0x53a2b) [0x558dca748a2b]
[  9610.354] (EE) 12: /usr/bin/Xorg (0x558dca6f5000+0x57aca) [0x558dca74caca]
[  9610.354] (EE) 13: /lib64/libc.so.6 (__libc_start_main+0xf5) [0x7fc1b3312c05]
[  9610.354] (EE) 14: /usr/bin/Xorg (0x558dca6f5000+0x41bce) [0x558dca736bce]
[  9610.354] (EE)
[  9610.355] (EE) Segmentation fault at address 0x0
[  9610.355] (EE)
Fatal server error:
[  9610.355] (EE) Caught signal 11 (Segmentation fault). Server aborting
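
Since the qxl_drv.so frames above carry no symbols, the module-relative offsets are what a symbolizer (for example addr2line with matching debuginfo) would need. A minimal sketch for pulling them out of an Xorg log; the regular expression and the names FRAME_RE and parse_frames are hypothetical and assume the frame layout shown in this backtrace:

```python
import re

# Matches frames such as:
#   [  9610.354] (EE) 3: /path/qxl_drv.so (0x7fc1aeb76000+0x8f34) [0x7fc1aeb7ef34]
#   [  9610.354] (EE) 6: /usr/bin/Xorg (miCopyRegion+0x1ba) [0x558dca8824ea]
FRAME_RE = re.compile(
    r"\(EE\)\s+(?P<index>\d+):\s+(?P<module>\S+)\s+"
    r"\((?P<base>[^+()]+)\+(?P<offset>0x[0-9a-fA-F]+)\)\s+"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]"
)


def parse_frames(log_text):
    """Return one dict per backtrace frame found in log_text."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue
        base = match.group("base")
        frames.append({
            "index": int(match.group("index")),
            "module": match.group("module"),
            # A hex base means the frame is unsymbolized: the offset is
            # relative to the module load address, not to a function.
            "symbol": None if base.startswith("0x") else base,
            "offset": int(match.group("offset"), 16),
            "address": int(match.group("address"), 16),
        })
    return frames


SAMPLE = """\
[  9610.354] (EE) 3: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fc1aeb76000+0x8f34) [0x7fc1aeb7ef34]
[  9610.354] (EE) 6: /usr/bin/Xorg (miCopyRegion+0x1ba) [0x558dca8824ea]
"""

# List only the frames that still need symbol resolution.
for frame in parse_frames(SAMPLE):
    if frame["symbol"] is None:
        print(f'{frame["index"]}: {frame["module"]} +{frame["offset"]:#x}')
```

For unsymbolized frames the load base plus the offset equals the absolute address in brackets, which is a quick sanity check on the parse.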

Actual results:
Installation of the RHEL 7.4 release guest fails.

Expected results:
The guest is installed successfully.

Additional info:
X.log, anaconda.log

(Originally by Chenli Hu)

Comment 3 RHV bug bot 2018-07-20 12:34:28 UTC
Created attachment 1363137 [details]
anaconda.log

(Originally by Chenli Hu)

Comment 4 RHV bug bot 2018-07-20 12:34:34 UTC
Could you get a backtrace with symbols, or a core file?
The Xorg log has [  9610.271] (EE) qxl(0): error doing QXL_ALLOC
which corresponds to
    ret = drmIoctl(qxl->drm_fd, DRM_IOCTL_QXL_ALLOC, &alloc);
    if (ret) {
        xf86DrvMsg(qxl->pScrn->scrnIndex, X_ERROR,
                   "error doing QXL_ALLOC\n");
        free(bo);
        return NULL; // an invalid handle
    }

Is there anything in the kernel log when this happens?
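
To check whether the allocation failure precedes the crash in a given X.log, something like the following could be used. It is only a sketch: the names EE_LINE_RE and find_errors are hypothetical, and it assumes the "[ timestamp] (EE) message" layout seen in the attached log.

```python
import re

# Assumed layout of an Xorg error line: "[  9610.271] (EE) message"
EE_LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+\(EE\)\s*(?P<msg>.*)$")


def find_errors(log_text, needle="error doing QXL_ALLOC"):
    """Return (timestamp, message) pairs for (EE) lines containing needle."""
    hits = []
    for line in log_text.splitlines():
        match = EE_LINE_RE.match(line)
        if match and needle in match.group("msg"):
            hits.append((float(match.group("ts")), match.group("msg")))
    return hits


SAMPLE = """\
[  9610.271] (EE) qxl(0): error doing QXL_ALLOC
[  9610.355] (EE) Segmentation fault at address 0x0
[  9610.355] (EE) Caught signal 11 (Segmentation fault). Server aborting
"""

alloc_failures = find_errors(SAMPLE)
segfaults = find_errors(SAMPLE, needle="Segmentation fault at address")
if alloc_failures and segfaults:
    # The allocation error is logged before the NULL dereference is reported.
    print(alloc_failures[0][0] < segfaults[0][0])
```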

(Originally by Christophe Fergeau)

Comment 5 RHV bug bot 2018-07-20 12:34:40 UTC
Created attachment 1363452 [details]
ifcfg.log

(Originally by Chenli Hu)

Comment 6 RHV bug bot 2018-07-20 12:34:46 UTC
Created attachment 1363453 [details]
packaging.log

(Originally by Chenli Hu)

Comment 7 RHV bug bot 2018-07-20 12:34:53 UTC
Created attachment 1363454 [details]
program.log

(Originally by Chenli Hu)

Comment 8 RHV bug bot 2018-07-20 12:34:59 UTC
Created attachment 1363455 [details]
rpm-script.log

(Originally by Chenli Hu)

Comment 9 RHV bug bot 2018-07-20 12:35:05 UTC
Created attachment 1363456 [details]
rpm-script.log

(Originally by Chenli Hu)

Comment 10 RHV bug bot 2018-07-20 12:35:12 UTC
Created attachment 1363457 [details]
storage.log

(Originally by Chenli Hu)

Comment 11 RHV bug bot 2018-07-20 12:35:19 UTC
I'm sorry, this happened while installing the guest, before the installation finished, so I could not get a core file or a backtrace with symbols. I have uploaded all the logs I collected above.

(Originally by Chenli Hu)

Comment 12 RHV bug bot 2018-07-20 12:35:25 UTC
Do you have the full libvirt XML for the VM that crashes? I've tried to reproduce this in virt-manager (no access to a RHEV instance), but the installation is working fine so far. Does the crash happen when the graphical installer starts, or a bit later after you have interacted with it?

(Originally by Christophe Fergeau)

Comment 13 RHV bug bot 2018-07-20 12:35:31 UTC
I also tried to reproduce it with virt-manager before filing this bug, but failed. The crash happened a bit later, after I had interacted with the installer. The rpm-script.log shows:
...
iwl7265-firmware-22.0.7.0-56.el7.noarch (326/329)

I selected minimal installation.

(Originally by Chenli Hu)

Comment 14 RHV bug bot 2018-07-20 12:35:38 UTC
Can you provide the full libvirt XML from RHV? Maybe this would be helpful to recreate a VM which triggers that crash.

(Originally by Christophe Fergeau)

Comment 15 RHV bug bot 2018-07-20 12:35:43 UTC
Setting needinfo, moving to 7.6

(Originally by victortoso)

Comment 16 RHV bug bot 2018-07-20 12:35:48 UTC
Reran with the latest RHV 4.2: RHEL 7.4 release guests install successfully with QXL, and the Xorg segmentation fault no longer occurs.

Verified on packages:
vdsm-4.20.17-1.el7ev.x86_64
libvirt-3.2.0-14.el7_4.9.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.14.x86_64
ovirt-engine-4.2.1.1-0.1.el7.noarch
ovirt-engine-dashboard-1.2.1-1.el7ev.noarch

Test steps:
1. Installed a minimal guest from the latest RHV 4.2 web console with Video Type: QXL successfully; the related XML is in the attached file 7_4.xml

2. Installed a "Server with GUI" guest from the latest RHV 4.2 web console with Video Type: QXL successfully; the related XML is in the attached file 7_4-1.xml

As there is no Xorg segmentation fault and the guest installs successfully in the latest RHV 4.2, I think this bug can be changed to "Current release".

(Originally by Chenli Hu)

Comment 17 RHV bug bot 2018-07-20 12:35:55 UTC
Created attachment 1391793 [details]
7_4.xml

(Originally by Chenli Hu)

Comment 18 RHV bug bot 2018-07-20 12:36:01 UTC
Created attachment 1391794 [details]
7_4-1.xml

(Originally by Chenli Hu)

Comment 19 RHV bug bot 2018-07-20 12:36:07 UTC
(In reply to chhu from comment #0)
>       <model type='qxl' ram='65536' vram='8192' vgamem='16384' heads='1'
> primary='yes'/>

vram should be set to 32768 for a RHEL 7 guest with 1 QXL device, based on https://www.spice-space.org/multiple-monitors.html

If changing that makes it work, this is likely a bug in the RHEV initial setup. (8192 is for the UMS driver, RHEL 6.)
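
A minimal sketch for checking this on a dumped domain XML, using Python's standard xml.etree.ElementTree. The function name and the MIN_QXL_VRAM_KIB constant are hypothetical, and the 32768 threshold is only the value suggested here for a RHEL 7 guest with one QXL device. libvirt expresses these sizes in KiB, so 8192 is 8 MiB and 32768 is 32 MiB.

```python
import xml.etree.ElementTree as ET

# Assumed minimum, taken from the recommendation in this comment.
MIN_QXL_VRAM_KIB = 32768


def undersized_qxl_vram(domain_xml, minimum_kib=MIN_QXL_VRAM_KIB):
    """Return the vram values (KiB) of QXL video models below minimum_kib."""
    root = ET.fromstring(domain_xml)
    too_small = []
    # iter() also visits the root, so a bare <video> fragment works too.
    for video in root.iter("video"):
        model = video.find("model")
        if model is None or model.get("type") != "qxl":
            continue
        vram_kib = int(model.get("vram", "0"))
        if vram_kib < minimum_kib:
            too_small.append(vram_kib)
    return too_small


SAMPLE = """
<domain>
  <devices>
    <video>
      <model type='qxl' ram='65536' vram='8192' vgamem='16384' heads='1' primary='yes'/>
    </video>
  </devices>
</domain>
"""

print(undersized_qxl_vram(SAMPLE))  # prints [8192]
```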

(Originally by victortoso)

Comment 20 RHV bug bot 2018-07-20 12:36:14 UTC
Hi, Victor

I used the default settings in the latest RHV 4.2 web console with Video Type: QXL and changed nothing else. I just used `virsh dumpxml` to save the guest XML to the files 7_4.xml (minimal) and 7_4-1.xml (Server with GUI).

(Originally by Chenli Hu)

Comment 21 RHV bug bot 2018-07-20 12:36:20 UTC
Sorry, I was not clear in comment #18

For a RHEL 7 (guest) VM with 1 QXL device, the expected value for vram is 32768 instead of 8192. That could be causing the QXL_ALLOC error mentioned by Christophe in comment #3

Questions:
- Did you set 'Operating System' to RHEL 7.x while creating the VM in RHV 4.2? If yes, RHV might have a bug.
- If you set the vram to 32768, does the problem go away? (You should not see any error like `error doing QXL_ALLOC`.)

(Originally by victortoso)

Comment 22 RHV bug bot 2018-07-20 12:36:27 UTC
(In reply to Victor Toso from comment #20)
> Sorry, I was not clear in comment #18
> 
> For RHEL 7 (guest) VM with 1 QXL, the expected value for vram is 32768
> instead of 8192. That could be causing the error on QXL_ALLOC mentioned by
> Christhophe in comment #3
> 
> Questions:
> - Have you set the 'Operation systems' to RHEL 7.x while creating the VM in
> RHV4.2 ? If yes, RHV might have a bug.

No, I set 'Operating System' to Linux. If I set it to 'RHEL 7.x',
the vram is 32768 and the guest installs successfully.
    <video>
      <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>

But the machine type is pc-i440fx-rhel7.3.0; it should be pc-i440fx-rhel7.4.0:
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type>
    <smbios mode='sysinfo'/>
  </os>

The full XML is in the file redhat7-4.xml

> - If you set the vram to 32768, does the problem goes away? (you should not
> see any error like `error doing QXL_ALLOC`)

As per the test result in comment15, in the latest RHV 4.2, with the XML in 7_4.xml (vram='8192') and 'Operating System' set to Linux, I can install the guest successfully.

(Originally by Chenli Hu)

Comment 23 RHV bug bot 2018-07-20 12:36:33 UTC
Created attachment 1393011 [details]
redhat7-4.xml

(Originally by Chenli Hu)

Comment 24 RHV bug bot 2018-07-20 12:36:39 UTC
Hi,

> > For RHEL 7 (guest) VM with 1 QXL, the expected value for vram is 32768
> > instead of 8192. That could be causing the error on QXL_ALLOC mentioned by
> > Christhophe in comment #3
> > 
> > Questions:
> > - Have you set the 'Operation systems' to RHEL 7.x while creating the VM in
> > RHV4.2 ? If yes, RHV might have a bug.
> 
> No, I set the 'Operation systems': Linux. And if I set it to: 'RHEL 7.x', 
> the vram is 32768, and can install the guest successfully.

So it works with the RHEL 7.x profile but not with Linux, okay. It would be nice to know whether this is a regression. I'll move the bug to RHEVM.

>     <video>
>       <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1'
> primary='yes'/>
>       <alias name='video0'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x02'
> function='0x0'/>
>     </video>
> 
> But the machine type is: pc-i440fx-rhel7.3.0, it should be:
> pc-i440fx-rhel7.4.0
>   <os>
>     <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type>
>     <smbios mode='sysinfo'/>
>   </os>

Maybe it is the same bug (wrong config being used) or a different one (wrong value set in the config).

> The full xml are in file: redhat7-4.xml
> 
> > - If you set the vram to 32768, does the problem goes away? (you should not
> > see any error like `error doing QXL_ALLOC`)
> 
> As the test result in comment15, in latest rhv4.2, with the xml in
> 7_4.xml(vram='8192'), set the 'Operation systems': Linux, I can install the
> guest successfully.

That doesn't mean you can use it without issues.

(Originally by victortoso)

Comment 29 RHV bug bot 2018-07-20 12:37:06 UTC
*** Bug 1542214 has been marked as a duplicate of this bug. ***

(Originally by victortoso)

Comment 33 RHV bug bot 2018-07-20 12:37:33 UTC
I do not see anything wrong
RHEL 7 64bit is working correctly setting vram to 32MB, other guests are using 8MB
See https://bugzilla.redhat.com/show_bug.cgi?id=1275539#c12 for the result of discussions with SPICE team back then.
Are the recommendations different now? Christophe?

The machine type is the one we have been using in oVirt/RHV 4.2 at the time this bug was filed. Note since then we're using i440fx-rhel7.5.0 in RHV 4.2

Victor, any other issue?

(Originally by michal.skrivanek)

Comment 34 RHV bug bot 2018-07-20 12:37:40 UTC
(In reply to Michal Skrivanek from comment #32)
> I do not see anything wrong
> RHEL 7 64bit is working correctly setting vram to 32MB, other guests are
> using 8MB

The problem is when the user picks 'Linux' for a RHEL 7 installation. 8MB is not enough, as it needs 32MB. We can't avoid guest failures due to lack of memory...

> See https://bugzilla.redhat.com/show_bug.cgi?id=1275539#c12 for the result
> of discussions with SPICE team back then.
> Are the recommendations different now? Christophe?
> 
> The machine type is the one we have been using in oVirt/RHV 4.2 at the time
> this bug was filed. Note since then we're using i440fx-rhel7.5.0 in RHV 4.2
> 
> Victor, any other issue?

Not really. The problem here is the user installing RHEL 7 with the 'Linux' profile instead of the 'RHEL 7.x' profile. Should we close as WONTFIX in this case?

(Originally by victortoso)

Comment 35 RHV bug bot 2018-07-20 12:37:46 UTC
adding back needinfo from comment #12 and adding David as well to advise about any potential update of memory settings for QXL

(In reply to Victor Toso from comment #33)
> Not really. The problem here is user installing RHEL 7 with 'Linux' profile
> instead of 'RHEL 7.x' profile. Should we close as WONTFIX in this case?

Seems so. Unless there are updates from SPICE team

(Originally by michal.skrivanek)

Comment 36 RHV bug bot 2018-07-20 12:37:52 UTC
(In reply to Michal Skrivanek from comment #34)
> adding back needinfo from comment #12 and adding David as well to advise
> about any potential update of memory settings for QXL
> 
> (In reply to Victor Toso from comment #33)
> > Not really. The problem here is user installing RHEL 7 with 'Linux' profile
> > instead of 'RHEL 7.x' profile. Should we close as WONTFIX in this case?
> 
> Seems so. Unless there are updates from SPICE team

Martin?

(Originally by Yaniv Kaul)

Comment 37 RHV bug bot 2018-07-20 12:37:59 UTC
(In reply to Yaniv Kaul from comment #35)
> (In reply to Michal Skrivanek from comment #34)
> > adding back needinfo from comment #12 and adding David as well to advise
> > about any potential update of memory settings for QXL
> > 
> > (In reply to Victor Toso from comment #33)
> > > Not really. The problem here is user installing RHEL 7 with 'Linux' profile
> > > instead of 'RHEL 7.x' profile. Should we close as WONTFIX in this case?
> > 
> > Seems so. Unless there are updates from SPICE team
> 
> Martin?

Why does the Linux profile have 8MB instead of 32MB in the first place?
From my POV, not only RHEL will fail but also every other Linux using Xorg with the QXL driver.

So would it not be sensible to increase the memory to 32MB for every Linux-like profile?

I believe there is a Windows profile that covers all Windows right now, and I believe the RHEL and Linux profiles are all a bit different.

So as long as there is a reason for the 8MB for the Linux profile, I would say this is a somewhat expected behaviour, as the "wrong" profile was chosen.

In case there is no good reason for the 8MB we should adjust this to 32MB as well.

David?

(Originally by Martin Tessun)

Comment 38 RHV bug bot 2018-07-20 12:38:04 UTC
(In reply to Martin Tessun from comment #36)
> (In reply to Yaniv Kaul from comment #35)
> > (In reply to Michal Skrivanek from comment #34)
> > > adding back needinfo from comment #12 and adding David as well to advise
> > > about any potential update of memory settings for QXL
> > > 
> > > (In reply to Victor Toso from comment #33)
> > > > Not really. The problem here is user installing RHEL 7 with 'Linux' profile
> > > > instead of 'RHEL 7.x' profile. Should we close as WONTFIX in this case?
> > > 
> > > Seems so. Unless there are updates from SPICE team
> > 
> > Martin?
> 
> Why does the Linux profile has 8MB instead of 32MB in the first place?
> From my pov not only RHEL will fail but also every other Linux using XOrg
> with QXL driver.
> 
> So would it not be sensible increasing the Memory to 32MB for every Linux
> like profile?
> 
> I believe there is a Windows profile that covers all Windows right now, and
> I believe the RHEL and Linux profiles are all a bit different.
> 
> So as long as there is a reason for the 8MB for the Linux profile, I would
> say this is a somewhat expected behaviour, as the "wrong" profile was chosen.
> 
> In case there is no good reason for the 8MB we should adjust this to 32MB as
> well.
> 
> David?

The quick answer is: there is no reason to limit vram to 8MB.

SPICE supports only 3 types of VMs: Windows (XP, Windows 7 and Windows 10), RHEL 7.x and RHEL 6.x. Only the RHEL 7.x driver uses vram.

I hope it helps

dnb

(Originally by David Blechter)

Comment 39 RHV bug bot 2018-07-20 12:38:10 UTC
Michal, if there are no obvious drawbacks, let's change it for 4.3 and consider backport to 4.2.z

(Originally by Yaniv Kaul)

Comment 40 RHV bug bot 2018-07-20 12:38:17 UTC
Sure, but since we're changing it for all Linux guest OS types, QE must test for regressions on older guests like RHEL 5 and 6 and on at least one or two other common OSes.

(Originally by michal.skrivanek)

Comment 41 RHV bug bot 2018-07-20 12:38:23 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{'rhevm-4.2.z': '?'}', ]

For more info please contact: rhv-devops

(Originally by rhv-bugzilla-bot)

Comment 42 RHV bug bot 2018-07-20 12:38:29 UTC
please clone properly

(Originally by Dusan Fodor)

Comment 45 Liran Rotenberg 2018-07-26 09:57:10 UTC
Verified on:
ovirt-engine-4.2.5.2-0.1.el7ev.noarch
vdsm-4.20.35-1.el7ev.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.4.x86_64

Steps:
1. Create a new VM
2. Set Operating System: Linux, Console Video Type: QXL, SPICE
3. Run the VM and try to install OS.

Tested on OS:
RHEL6.9, RHEL7.4, RHEL7.5, Fedora26, CentOS7

Results:
On every test the engine xml shows:
    <video>
      <model type="qxl" vram="32768" heads="1" ram="65536" vgamem="16384"/>
      <alias name="ua-e29aba7f-0996-479c-861b-b1b0c956b9c5"/>
    </video>
in vdsm vm xml:
    <video>
      <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
      <alias name='ua-e29aba7f-0996-479c-861b-b1b0c956b9c5'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>

The vram is set correctly. 
The guest successfully installed the OS.
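
The values above are consistent with vram being a multiple of vgamem (2 x 16384 = 32768), which matches the title of the linked gerrit change, "core: Set vramMultiplier for all x86 Linux systems". The sketch below only illustrates that reading. It is an assumption and not taken from the engine code: the function name, the constant, the fallback behaviour and the multiplier of 2 are all hypothetical.

```python
# Hypothetical fallback: the 8192 KiB seen with the 'Linux' profile before the fix.
DEFAULT_VRAM_KIB = 8192


def vram_kib(vgamem_kib, vram_multiplier=0):
    """Assumed rule: multiplier * vgamem, or the default when no multiplier is set."""
    if vram_multiplier <= 0:
        return DEFAULT_VRAM_KIB
    return vram_multiplier * vgamem_kib


print(vram_kib(16384))     # no multiplier: prints 8192
print(vram_kib(16384, 2))  # assumed multiplier of 2: prints 32768
```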

Comment 47 errata-xmlrpc 2018-07-31 17:49:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2318

