Bug 1520848 - Hit Xorg Segmentation fault while installing rhel7.4 release guest in RHV 4.2 with QXL
Summary: Hit Xorg Segmentation fault while installing rhel7.4 release guest in RHV 4.2...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.2.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.3.0
Target Release: 4.3.0
Assignee: Milan Zamazal
QA Contact: Liran Rotenberg
URL:
Whiteboard: libvirt_RHV_INT
Duplicates: 1542214 1671186
Depends On:
Blocks: 1605198 1678350
Reported: 2017-12-05 10:00 UTC by chhu
Modified: 2023-09-15 00:05 UTC (History)

Fixed In Version: ovirt-engine-4.3.0_alpha
Doc Type: Bug Fix
Doc Text:
This release updates the VM video RAM settings to ensure enough RAM is present for any Linux guest operating system.
Clone Of:
Clones: 1605198
Environment:
Last Closed: 2019-05-08 12:36:59 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:


Attachments
X.log (20.87 KB, text/plain)
2017-12-05 10:00 UTC, chhu
anaconda.log (28.21 KB, text/plain)
2017-12-05 10:01 UTC, chhu
ifcfg.log (2.23 KB, text/plain)
2017-12-06 01:32 UTC, chhu
packaging.log (87.69 KB, text/plain)
2017-12-06 01:33 UTC, chhu
program.log (29.56 KB, text/plain)
2017-12-06 01:33 UTC, chhu
rpm-script.log (13.52 KB, text/plain)
2017-12-06 01:34 UTC, chhu
rpm-script.log (13.52 KB, text/plain)
2017-12-06 01:39 UTC, chhu
storage.log (125.89 KB, text/plain)
2017-12-06 01:41 UTC, chhu
7_4.xml (8.32 KB, text/plain)
2018-02-06 04:22 UTC, chhu
7_4-1.xml (8.31 KB, text/plain)
2018-02-06 04:23 UTC, chhu
redhat7-4.xml (8.26 KB, text/plain)
2018-02-08 03:36 UTC, chhu


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 3897021 | 0 | None | None | None | 2019-04-18 19:55:40 UTC
Red Hat Product Errata | RHEA-2019:1085 | 0 | None | None | None | 2019-05-08 12:37:22 UTC
oVirt gerrit | 92145 | 0 | 'None' | MERGED | core: Set vramMultiplier for all x86 Linux systems | 2020-09-22 02:18:58 UTC
oVirt gerrit | 92329 | 0 | 'None' | MERGED | core: Set vramMultiplier for all x86 Linux systems | 2020-09-22 02:18:58 UTC

Description chhu 2017-12-05 10:00:27 UTC
Created attachment 1363136 [details]
X.log

Description of problem:
Installing a RHEL 7.4 release guest in RHV 4.2 with QXL fails with an Xorg segmentation fault.

Version-Release number of selected component (if applicable):
xorg-x11-drv-qxl-0.1.5-3.el7
vdsm-4.20.8-1.el7ev.x86_64
libvirt-daemon-3.2.0-14.el7_4.4.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.12.x86_64
ovirt-engine-dashboard-1.2.0-0.8.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Try to install a guest in the RHV 4.2 web console with Video Type: QXL. The relevant XML is shown below:
    <video>
      <model type='qxl' ram='65536' vram='8192' vgamem='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>

2. The guest installation fails with the following backtrace:
[  9610.279] (EE) Backtrace:
[  9610.354] (EE) 0: /usr/bin/Xorg (xorg_backtrace+0x55) [0x558dca8a1655]
[  9610.354] (EE) 1: /usr/bin/Xorg (0x558dca6f5000+0x1b0369) [0x558dca8a5369]
[  9610.354] (EE) 2: /lib64/libpthread.so.0 (0x7fc1b36b4000+0xf5e0) [0x7fc1b36c35e0]
[  9610.354] (EE) 3: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fc1aeb76000+0x8f34) [0x7fc1aeb7ef34]
[  9610.354] (EE) 4: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fc1aeb76000+0x9456) [0x7fc1aeb7f456]
[  9610.354] (EE) 5: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fc1aeb76000+0x17075) [0x7fc1aeb8d075]
[  9610.354] (EE) 6: /usr/bin/Xorg (miCopyRegion+0x1ba) [0x558dca8824ea]
[  9610.354] (EE) 7: /usr/bin/Xorg (miDoCopy+0x470) [0x558dca882a90]
[  9610.354] (EE) 8: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fc1aeb76000+0x15de6) [0x7fc1aeb8bde6]
[  9610.354] (EE) 9: /usr/bin/Xorg (0x558dca6f5000+0x138c3f) [0x558dca82dc3f]
[  9610.354] (EE) 10: /usr/bin/Xorg (0x558dca6f5000+0x4f982) [0x558dca744982]
[  9610.354] (EE) 11: /usr/bin/Xorg (0x558dca6f5000+0x53a2b) [0x558dca748a2b]
[  9610.354] (EE) 12: /usr/bin/Xorg (0x558dca6f5000+0x57aca) [0x558dca74caca]
[  9610.354] (EE) 13: /lib64/libc.so.6 (__libc_start_main+0xf5) [0x7fc1b3312c05]
[  9610.354] (EE) 14: /usr/bin/Xorg (0x558dca6f5000+0x41bce) [0x558dca736bce]
[  9610.354] (EE)
[  9610.355] (EE) Segmentation fault at address 0x0
[  9610.355] (EE)
Fatal server error:
[  9610.355] (EE) Caught signal 11 (Segmentation fault). Server aborting

Actual results:
Installation of the RHEL 7.4 release guest fails.

Expected results:
The guest is installed successfully.

Additional info:
X.log, anaconda.log

Comment 2 chhu 2017-12-05 10:01:51 UTC
Created attachment 1363137 [details]
anaconda.log

Comment 3 Christophe Fergeau 2017-12-05 10:10:12 UTC
Could you get a backtrace with symbols, or a core file?
The Xorg log has [  9610.271] (EE) qxl(0): error doing QXL_ALLOC
which corresponds to
    ret = drmIoctl(qxl->drm_fd, DRM_IOCTL_QXL_ALLOC, &alloc);
    if (ret) {
        xf86DrvMsg(qxl->pScrn->scrnIndex, X_ERROR,
                   "error doing QXL_ALLOC\n");
        free(bo);
        return NULL; // an invalid handle
    }

Is there anything in the kernel log when this happens?
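
Until a symbolized backtrace is available, the module-relative offsets of the qxl_drv.so frames can be pulled out of the existing X.log. The following is a minimal Python sketch, assuming the frame format shown in the description; the function name `module_frames` is hypothetical, and the sample lines are copied from the backtrace above. The resulting offsets could then be resolved with a tool such as addr2line, provided matching debuginfo for the driver is installed.

```python
import re

# Matches frames of the form seen in X.log:
#   [  9610.354] (EE) 3: /path/module.so (0xBASE+0xOFFSET) [0xADDR]
# Frames that already carry a symbol name, such as (miCopyRegion+0x1ba),
# do not match and are skipped.
FRAME_RE = re.compile(
    r"\(EE\)\s+(\d+):\s+(\S+)\s+"
    r"\(0x([0-9a-f]+)\+0x([0-9a-f]+)\)\s+\[0x([0-9a-f]+)\]"
)


def module_frames(log_text, module_suffix):
    """Return (frame_no, offset, address) tuples for frames in one module."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None or not m.group(2).endswith(module_suffix):
            continue
        base = int(m.group(3), 16)
        offset = int(m.group(4), 16)
        addr = int(m.group(5), 16)
        # Sanity check: load base + offset must equal the reported address.
        if base + offset == addr:
            frames.append((int(m.group(1)), offset, addr))
    return frames


SAMPLE = """\
[  9610.354] (EE) 2: /lib64/libpthread.so.0 (0x7fc1b36b4000+0xf5e0) [0x7fc1b36c35e0]
[  9610.354] (EE) 3: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fc1aeb76000+0x8f34) [0x7fc1aeb7ef34]
[  9610.354] (EE) 4: /usr/lib64/xorg/modules/drivers/qxl_drv.so (0x7fc1aeb76000+0x9456) [0x7fc1aeb7f456]
[  9610.354] (EE) 6: /usr/bin/Xorg (miCopyRegion+0x1ba) [0x558dca8824ea]
"""

for frame_no, offset, _addr in module_frames(SAMPLE, "qxl_drv.so"):
    print(f"frame {frame_no}: qxl_drv.so+{offset:#x}")
```

Frame 3 is the first driver frame below the signal handler in libpthread, so it is probably the most useful one to resolve first.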

Comment 4 chhu 2017-12-06 01:32:32 UTC
Created attachment 1363452 [details]
ifcfg.log

Comment 5 chhu 2017-12-06 01:33:11 UTC
Created attachment 1363453 [details]
packaging.log

Comment 6 chhu 2017-12-06 01:33:41 UTC
Created attachment 1363454 [details]
program.log

Comment 7 chhu 2017-12-06 01:34:07 UTC
Created attachment 1363455 [details]
rpm-script.log

Comment 8 chhu 2017-12-06 01:39:17 UTC
Created attachment 1363456 [details]
rpm-script.log

Comment 9 chhu 2017-12-06 01:41:05 UTC
Created attachment 1363457 [details]
storage.log

Comment 10 chhu 2017-12-06 01:50:34 UTC
I'm sorry, this happened while installing the guest, and the installation did not finish. I didn't get a core file or a backtrace with symbols, so I have uploaded all the logs I collected, as above.

Comment 11 Christophe Fergeau 2017-12-07 09:51:02 UTC
Do you have the full libvirt XML for the VM which crashes? I've tried to reproduce this in virt-manager (no access to a RHEV instance), but the installation is working fine so far. Does the crash happen when the graphical installer starts? Or does it happen a bit later, after you have interacted with it?

Comment 12 chhu 2017-12-07 10:05:55 UTC
I also tried to reproduce it with virt-manager before filing this bug, but failed. It happened a bit later, after I interacted with it. The rpm-script.log shows:
...
iwl7265-firmware-22.0.7.0-56.el7.noarch (326/329)

I selected minimal installation.

Comment 13 Christophe Fergeau 2017-12-07 12:31:21 UTC
Can you provide the full libvirt XML from RHV? Maybe this would be helpful to recreate a VM which triggers that crash.

Comment 14 Victor Toso 2018-01-26 16:14:40 UTC
Setting needinfo, moving to 7.6

Comment 15 chhu 2018-02-06 04:10:57 UTC
Reran with the latest RHV 4.2: RHEL 7.4 release guests install successfully with QXL, and the Xorg segmentation fault no longer occurs.

Verified on packages:
vdsm-4.20.17-1.el7ev.x86_64
libvirt-3.2.0-14.el7_4.9.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.14.x86_64
ovirt-engine-4.2.1.1-0.1.el7.noarch
ovirt-engine-dashboard-1.2.1-1.el7ev.noarch

Test steps:
1. Install a minimal guest in the latest RHV 4.2 web console with Video Type: QXL. The installation succeeds; the related XML is in the attached file 7_4.xml.

2. Install a "Server with GUI" guest in the latest RHV 4.2 web console with Video Type: QXL. The installation succeeds; the related XML is in the attached file 7_4-1.xml.

As there is no Xorg segmentation fault and the guest can be installed successfully in the latest RHV 4.2, I think this bug can be changed to "Current release".

Comment 16 chhu 2018-02-06 04:22:52 UTC
Created attachment 1391793 [details]
7_4.xml

Comment 17 chhu 2018-02-06 04:23:27 UTC
Created attachment 1391794 [details]
7_4-1.xml

Comment 18 Victor Toso 2018-02-06 14:11:04 UTC
(In reply to chhu from comment #0)
>       <model type='qxl' ram='65536' vram='8192' vgamem='16384' heads='1'
> primary='yes'/>

vram should be set to 32768 for a RHEL 7 guest with one QXL device, based on https://www.spice-space.org/multiple-monitors.html

If changing that makes it work, this is likely a bug in the RHEV initial setup. (8192 is for the UMS driver, RHEL 6.)
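
The check suggested here can be sketched in Python against a dumped domain XML. This is a minimal sketch, assuming the 32768 value from this comment as the threshold and the libvirt convention that `vram` is given in KiB; the function name `qxl_vram_kib` and the trimmed sample document are hypothetical, with the `<model>` line taken from comment #0.

```python
import xml.etree.ElementTree as ET

# Value suggested in this bug for a RHEL 7 guest with one QXL device (KiB).
EXPECTED_VRAM_KIB = 32768


def qxl_vram_kib(domain_xml):
    """Return the vram value (KiB) of every QXL video device in the XML."""
    values = []
    for video in ET.fromstring(domain_xml).iter("video"):
        model = video.find("model")
        if model is None or model.get("type") != "qxl":
            continue
        vram = model.get("vram")
        if vram is not None:
            values.append(int(vram))
    return values


# Trimmed-down domain XML; only the <model> line comes from the report.
SAMPLE = """
<domain type='kvm'>
  <devices>
    <video>
      <model type='qxl' ram='65536' vram='8192' vgamem='16384' heads='1' primary='yes'/>
    </video>
  </devices>
</domain>
"""

for vram in qxl_vram_kib(SAMPLE):
    status = "ok" if vram >= EXPECTED_VRAM_KIB else "too small"
    print(f"vram={vram} KiB ({vram // 1024} MiB): {status}")
```

For the XML from comment #0 this reports 8 MiB, which is below the suggested 32 MiB.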

Comment 19 chhu 2018-02-07 01:40:40 UTC
Hi, Victor

I used the default settings in the latest RHV 4.2 web console with Video Type: QXL and changed nothing else. I just used `virsh dumpxml` to dump the guest XML to the files 7_4.xml (minimal) and 7_4-1.xml (Server with GUI).

Comment 20 Victor Toso 2018-02-07 07:15:20 UTC
Sorry, I was not clear in comment #18.

For a RHEL 7 guest VM with one QXL device, the expected value for vram is 32768 instead of 8192. That could be causing the QXL_ALLOC error mentioned by Christophe in comment #3.

Questions:
- Did you set 'Operating System' to RHEL 7.x while creating the VM in RHV 4.2? If yes, RHV might have a bug.
- If you set the vram to 32768, does the problem go away? (You should not see any error like `error doing QXL_ALLOC`.)

Comment 21 chhu 2018-02-08 03:33:41 UTC
(In reply to Victor Toso from comment #20)
> Sorry, I was not clear in comment #18
> 
> For RHEL 7 (guest) VM with 1 QXL, the expected value for vram is 32768
> instead of 8192. That could be causing the error on QXL_ALLOC mentioned by
> Christhophe in comment #3
> 
> Questions:
> - Have you set the 'Operation systems' to RHEL 7.x while creating the VM in
> RHV4.2 ? If yes, RHV might have a bug.

No, I set 'Operating System' to Linux. If I set it to 'RHEL 7.x' instead,
the vram is 32768 and the guest installs successfully:
    <video>
      <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>

However, the machine type is pc-i440fx-rhel7.3.0; it should be pc-i440fx-rhel7.4.0:
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type>
    <smbios mode='sysinfo'/>
  </os>

The full XML is in the file redhat7-4.xml.

> - If you set the vram to 32768, does the problem goes away? (you should not
> see any error like `error doing QXL_ALLOC`)

As shown by the test result in comment 15, with the latest RHV 4.2, the XML in 7_4.xml (vram='8192'), and 'Operating System' set to Linux, I can install the guest successfully.

Comment 22 chhu 2018-02-08 03:36:50 UTC
Created attachment 1393011 [details]
redhat7-4.xml

Comment 23 Victor Toso 2018-02-08 06:07:14 UTC
Hi,

> > For RHEL 7 (guest) VM with 1 QXL, the expected value for vram is 32768
> > instead of 8192. That could be causing the error on QXL_ALLOC mentioned by
> > Christhophe in comment #3
> > 
> > Questions:
> > - Have you set the 'Operation systems' to RHEL 7.x while creating the VM in
> > RHV4.2 ? If yes, RHV might have a bug.
> 
> No, I set the 'Operation systems': Linux. And if I set it to: 'RHEL 7.x', 
> the vram is 32768, and can install the guest successfully.

So it works as RHEL 7.x but not as Linux, okay. It would be nice to know whether this is a regression. I'll move the bug to RHEVM.

>     <video>
>       <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1'
> primary='yes'/>
>       <alias name='video0'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x02'
> function='0x0'/>
>     </video>
> 
> But the machine type is: pc-i440fx-rhel7.3.0, it should be:
> pc-i440fx-rhel7.4.0
>   <os>
>     <type arch='x86_64' machine='pc-i440fx-rhel7.3.0'>hvm</type>
>     <smbios mode='sysinfo'/>
>   </os>

Maybe the same bug (wrong config being used) or a different one (wrong value set in the config).

> The full xml are in file: redhat7-4.xml
> 
> > - If you set the vram to 32768, does the problem goes away? (you should not
> > see any error like `error doing QXL_ALLOC`)
> 
> As the test result in comment15, in latest rhv4.2, with the xml in
> 7_4.xml(vram='8192'), set the 'Operation systems': Linux, I can install the
> guest successfully.

That doesn't mean that you can use it without issues.

Comment 28 Victor Toso 2018-02-08 12:52:34 UTC
*** Bug 1542214 has been marked as a duplicate of this bug. ***

Comment 32 Michal Skrivanek 2018-03-27 08:30:07 UTC
I do not see anything wrong.
RHEL 7 64-bit works correctly, setting vram to 32 MB; other guests use 8 MB.
See https://bugzilla.redhat.com/show_bug.cgi?id=1275539#c12 for the result of discussions with the SPICE team back then.
Are the recommendations different now? Christophe?

The machine type is the one we were using in oVirt/RHV 4.2 at the time this bug was filed. Note that since then we have been using i440fx-rhel7.5.0 in RHV 4.2.

Victor, any other issue?

Comment 33 Victor Toso 2018-03-27 09:32:15 UTC
(In reply to Michal Skrivanek from comment #32)
> I do not see anything wrong
> RHEL 7 64bit is working correctly setting vram to 32MB, other guests are
> using 8MB

The problem arises when the user picks 'Linux' for a RHEL 7 installation. 8 MB is not enough, as it needs 32 MB. We can't avoid guest failures due to lack of memory...

> See https://bugzilla.redhat.com/show_bug.cgi?id=1275539#c12 for the result
> of discussions with SPICE team back then.
> Are the recommendations different now? Christophe?
> 
> The machine type is the one we have been using in oVirt/RHV 4.2 at the time
> this bug was filed. Note since then we're using i440fx-rhel7.5.0 in RHV 4.2
> 
> Victor, any other issue?

Not really. The problem here is the user installing RHEL 7 with the 'Linux' profile instead of the 'RHEL 7.x' profile. Should we close as WONTFIX in this case?

Comment 34 Michal Skrivanek 2018-03-27 09:48:15 UTC
adding back needinfo from comment #12 and adding David as well to advise about any potential update of memory settings for QXL

(In reply to Victor Toso from comment #33)
> Not really. The problem here is user installing RHEL 7 with 'Linux' profile
> instead of 'RHEL 7.x' profile. Should we close as WONTFIX in this case?

Seems so. Unless there are updates from SPICE team

Comment 35 Yaniv Kaul 2018-04-02 08:05:09 UTC
(In reply to Michal Skrivanek from comment #34)
> adding back needinfo from comment #12 and adding David as well to advise
> about any potential update of memory settings for QXL
> 
> (In reply to Victor Toso from comment #33)
> > Not really. The problem here is user installing RHEL 7 with 'Linux' profile
> > instead of 'RHEL 7.x' profile. Should we close as WONTFIX in this case?
> 
> Seems so. Unless there are updates from SPICE team

Martin?

Comment 36 Martin Tessun 2018-04-06 11:49:06 UTC
(In reply to Yaniv Kaul from comment #35)
> (In reply to Michal Skrivanek from comment #34)
> > adding back needinfo from comment #12 and adding David as well to advise
> > about any potential update of memory settings for QXL
> > 
> > (In reply to Victor Toso from comment #33)
> > > Not really. The problem here is user installing RHEL 7 with 'Linux' profile
> > > instead of 'RHEL 7.x' profile. Should we close as WONTFIX in this case?
> > 
> > Seems so. Unless there are updates from SPICE team
> 
> Martin?

Why does the Linux profile have 8 MB instead of 32 MB in the first place?
From my point of view, not only RHEL will fail but also every other Linux using Xorg with the QXL driver.

So would it not be sensible to increase the memory to 32 MB for every Linux-like profile?

I believe there is a Windows profile that covers all Windows versions right now, and I believe the RHEL and Linux profiles are all a bit different.

So as long as there is a reason for the 8 MB in the Linux profile, I would say this is somewhat expected behaviour, as the "wrong" profile was chosen.

In case there is no good reason for the 8 MB, we should adjust this to 32 MB as well.

David?

Comment 37 David Blechter 2018-04-18 11:25:06 UTC
(In reply to Martin Tessun from comment #36)
> (In reply to Yaniv Kaul from comment #35)
> > (In reply to Michal Skrivanek from comment #34)
> > > adding back needinfo from comment #12 and adding David as well to advise
> > > about any potential update of memory settings for QXL
> > > 
> > > (In reply to Victor Toso from comment #33)
> > > > Not really. The problem here is user installing RHEL 7 with 'Linux' profile
> > > > instead of 'RHEL 7.x' profile. Should we close as WONTFIX in this case?
> > > 
> > > Seems so. Unless there are updates from SPICE team
> > 
> > Martin?
> 
> Why does the Linux profile has 8MB instead of 32MB in the first place?
> From my pov not only RHEL will fail but also every other Linux using XOrg
> with QXL driver.
> 
> So would it not be sensible increasing the Memory to 32MB for every Linux
> like profile?
> 
> I believe there is a Windows profile that covers all Windows right now, and
> I believe the RHEL and Linux profiles are all a bit different.
> 
> So as long as there is a reason for the 8MB for the Linux profile, I would
> say this is a somewhat expected behaviour, as the "wrong" profile was chosen.
> 
> In case there is no good reason for the 8MB we should adjust this to 32MB as
> well.
> 
> David?

The quick answer is: there is no reason for limiting vram to 8 MB.

SPICE supports only 3 types of VMs: Windows (XP, Windows 7 and Windows 10), RHEL 7.x and RHEL 6.x. Only the RHEL 7.x driver uses vram.

I hope it helps

dnb

Comment 38 Yaniv Kaul 2018-04-18 15:37:17 UTC
Michal, if there are no obvious drawbacks, let's change it for 4.3 and consider backport to 4.2.z

Comment 39 Michal Skrivanek 2018-06-15 06:31:55 UTC
Sure, but since we're changing it for all Linux guest OS types, QE must test for regressions on older guests like RHEL 5 and 6 and at least one or two other common OSes.

Comment 40 RHV bug bot 2018-07-02 15:34:11 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{'rhevm-4.2.z': '?'}', ]

For more info please contact: rhv-devops

Comment 41 Dusan Fodor 2018-07-12 14:18:46 UTC
please clone properly

Comment 44 Liran Rotenberg 2018-11-14 14:22:36 UTC
Verified on:
ovirt-engine-4.3.0-0.0.master.20181101091940.git61310aa.el7.noarch
vdsm-4.30.1-52.git5426c0c.el7.x86_64
qemu-kvm-rhev-2.12.0-18.el7_6.2.x86_64

Steps:
1. Create a new VM
2. Set Operating System: Linux, Console Video Type: QXL, SPICE
3. Run the VM and try to install OS.

Tested on OS:
RHEL6.9, RHEL7.4, RHEL7.5, RHEL7.6, CentOS7, Fedora28

Results:
On every test the engine xml shows:
<video>
      <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
      <alias name='ua-b98dfc9d-5c01-40c8-88cf-68654f26f41a'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>

in vdsm vm xml:
<video>
    <model heads="1" ram="65536" type="qxl" vgamem="16384" vram="32768"/>
    <alias name="ua-b98dfc9d-5c01-40c8-88cf-68654f26f41a"/>
    <address bus="0x00" domain="0x0000" function="0x0" slot="0x02" type="pci"/>
</video>

The vram is set correctly. 
The guest successfully installed the OS.
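
The comparison of the two dumps above can be sketched in Python. This is a minimal sketch, assuming that only the `type`, `ram`, `vram`, `vgamem` and `heads` attributes matter for this check; the helper name `model_settings` is hypothetical. The engine XML also carries `primary='yes'`, which the vdsm XML omits, so that attribute is deliberately left out of the comparison.

```python
import xml.etree.ElementTree as ET

# <video> elements as reported in this comment (engine XML and vdsm XML).
ENGINE_VIDEO = """
<video>
  <model type='qxl' ram='65536' vram='32768' vgamem='16384' heads='1' primary='yes'/>
  <alias name='ua-b98dfc9d-5c01-40c8-88cf-68654f26f41a'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>
"""

VDSM_VIDEO = """
<video>
  <model heads="1" ram="65536" type="qxl" vgamem="16384" vram="32768"/>
  <alias name="ua-b98dfc9d-5c01-40c8-88cf-68654f26f41a"/>
  <address bus="0x00" domain="0x0000" function="0x0" slot="0x02" type="pci"/>
</video>
"""

# Attributes compared here; attribute order and quote style are irrelevant
# once the XML has been parsed.
COMPARED_ATTRS = ("type", "ram", "vram", "vgamem", "heads")


def model_settings(video_xml):
    """Return the compared <model> attributes of a <video> element."""
    model = ET.fromstring(video_xml).find("model")
    return {name: model.get(name) for name in COMPARED_ATTRS}


engine = model_settings(ENGINE_VIDEO)
vdsm = model_settings(VDSM_VIDEO)
print("settings match:", engine == vdsm)
print("vram (KiB):", vdsm["vram"])
```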

Comment 45 Victor Toso 2019-04-09 13:48:11 UTC
*** Bug 1671186 has been marked as a duplicate of this bug. ***

Comment 47 errata-xmlrpc 2019-05-08 12:36:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:1085

Comment 48 Red Hat Bugzilla 2023-09-15 00:05:30 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

