Bug 894360
| Summary: | starting a F18 install as a CentOS5 xen guest | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Florian La Roche <florian.laroche> |
| Component: | kernel-xen | Assignee: | Andrew Jones <drjones> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 5.9 | CC: | drjones, gansalmon, hhuang, itamar, jonathan, kernel-maint, leiwang, madhu.chinakonda, wshi, xen-maint |
| Target Milestone: | --- | | |
| Target Release: | 5.10 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | kernel-2.6.18-360.el5 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-09-30 23:45:55 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 514489 | | |
Description
Florian La Roche
2013-01-11 14:54:21 UTC
Hi, do F17 installs on the same host work?

Cannot test F17 here, but F16 did work. Also F18 continues to install ok, but not sure how much I can trust this system.

best regards,
Florian La Roche

This is likely due to

    commit 841e3604d35aa70d399146abdc526d8c89a2c2f5
    Author: Suresh Siddha <suresh.b.siddha>
    Date:   Fri Aug 24 14:13:00 2012 -0700

        x86, fpu: always use kernel_fpu_begin/end() for in-kernel FPU usage

        use kernel_fpu_begin/end() instead of unconditionally accessing cr0 and
        saving/restoring just the few used xmm/ymm registers.

which is in the F18 kernel.

RHEL5 Xen (and its clones) rely on cr0 changes to keep a consistent FPU state. The patch above removes those as an optimization and, unfortunately, doesn't supply any alternative paths as long as the AVX cpufeature is present. RHEL5 Xen exposes AVX to guests (until now that's been harmless, and possibly allowed guests to benefit from a small performance boost).

Upstream Xen has enhanced its FPU save/restore, so it's possible that running over a later Xen wouldn't have this problem. Either way, PV guests wouldn't have the problem, because I see in upstream code that AVX is masked for PV guests when the domain can't use XSAVE. No domain running over RHEL5 Xen can use XSAVE, as it's not supported and is already masked. So for the resolution we should also mask AVX from the guests in the hypervisor.
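
The masking rule proposed above (hide AVX whenever the domain cannot use XSAVE) can be sketched as a bit operation on CPUID leaf 1 ECX. This is only an illustration in Python, not the actual RHEL5 Xen patch: the function name and the `guest_can_use_xsave` flag are hypothetical, while the bit positions (XSAVE 26, OSXSAVE 27, AVX 28) are the architectural ones.

```python
# CPUID leaf 1, ECX feature bits (architectural bit positions).
XSAVE_BIT = 26
OSXSAVE_BIT = 27
AVX_BIT = 28


def has_bit(value: int, bit: int) -> bool:
    """Return True if the given bit is set in value."""
    return bool((value >> bit) & 1)


def mask_guest_leaf1_ecx(host_ecx: int, guest_can_use_xsave: bool) -> int:
    """Return the leaf 1 ECX value a guest would see.

    Hypothetical helper: if the guest cannot use XSAVE it has no way to
    enable YMM state, so XSAVE, OSXSAVE and AVX are hidden together.
    """
    ecx = host_ecx
    if not guest_can_use_xsave:
        for bit in (XSAVE_BIT, OSXSAVE_BIT, AVX_BIT):
            ecx &= ~(1 << bit)
    return ecx & 0xFFFFFFFF


# Example: a host that advertises SSE3 (bit 0), XSAVE and AVX.
host = (1 << 0) | (1 << XSAVE_BIT) | (1 << AVX_BIT)
guest = mask_guest_leaf1_ecx(host, guest_can_use_xsave=False)
```

On RHEL5 Xen, where no domain can use XSAVE, the flag would always be false, so under this sketch AVX would never be visible to a guest while unrelated feature bits are left untouched.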
A workaround for installing F18 and other distros using kernels >= v3.7-rc1 is to add the following parameter to the guest's kernel command line:

    clearcpuid=156

e.g. with virt-install use '-x clearcpuid=156'

(In reply to comment #4)
> A workaround for installing F18 and other distros using kernels >= v3.7-rc1
> is to add the following parameter to the guest's kernel command line
>
> clearcpuid=156
>
> e.g. with virt-install use '-x clearcpuid=156'

Launched a guest from the F18 ISO using the xm command. After adding the parameter "clearcpuid=156", the "Segmentation fault" disappears, but the installation progress stops at one step for 235s (see attachment); after that, it successfully launches anaconda.

Created attachment 685709
Installation progress screenshot

After 235s, the progress continues and anaconda finally launches.
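
For readers wondering where the 156 in the `clearcpuid=156` workaround from comment #4 comes from: Linux numbers x86 feature flags as word * 32 + bit, and AVX is bit 28 of CPUID leaf 1 ECX, which the kernel's cpufeature header stores as feature word 4. A small sketch of the arithmetic follows; the helper names are made up for illustration.

```python
BITS_PER_WORD = 32

# Linux cpufeature word that holds CPUID leaf 1, ECX.
CPUID_1_ECX_WORD = 4
# Bit position of AVX inside that register.
AVX_BIT = 28


def feature_number(word: int, bit: int) -> int:
    """Return the flat feature index used by the clearcpuid= parameter."""
    if not 0 <= bit < BITS_PER_WORD:
        raise ValueError("bit must be in the range 0..31")
    return word * BITS_PER_WORD + bit


def clearcpuid_arg(word: int, bit: int) -> str:
    """Build the kernel command line fragment for one feature flag."""
    return f"clearcpuid={feature_number(word, bit)}"


avx_arg = clearcpuid_arg(CPUID_1_ECX_WORD, AVX_BIT)
print(avx_arg)
```

The flat index is a kernel-internal numbering, so it is worth checking against the header of the kernel actually being booted before reusing the value for a different feature.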
(In reply to comment #5)
> Launched a guest from the F18 ISO using the xm command. After adding the
> parameter "clearcpuid=156", the "Segmentation fault" disappears, but the
> installation progress stops at one step for 235s (see attachment); after
> that, it successfully launches anaconda.

That's xenbus waiting for devices. I'm not sure what it's waiting for, but I would guess it's a different problem (possibly config related). Please open a new bug and attach your guest config file.

(In reply to comment #7)
> That's xenbus waiting for devices. I'm not sure what it's waiting for, but I
> would guess it's a different problem (possibly config related). Please open
> a new bug and attach your guest config file.

Sorry, it's my fault: I made a mistake with the disk in the config file (using tap:qcow for a raw image). There is no such problem now; the parameter "clearcpuid" is a workaround for this bug.

This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate, in the next release of Red Hat Enterprise Linux.

PM, kernel-xen is the kernel. So the component is scheduled to be updated and we need the pm_ack.

drew

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Patch(es) available in kernel-2.6.18-360.el5

You can download this test kernel (or newer) from http://people.redhat.com/plougher/el5/

Detailed testing feedback is always welcomed. If you require guidance regarding testing, please ask the bug assignee.

Reproduced: kernel-xen-2.6.18-359.el5

Booting a 64-bit HVM guest from RHEL-7.0-20130606.0-Server-x86_64-dvd1-ks.iso leads to a guest crash:

    [ 6.021764] Call Trace:
    [ 6.023606] [<ffffffffa00b5071>] do_xor_speed+0x71/0xc2 [xor]
    [ 6.027143] [<ffffffffa00b512d>] calibrate_xor_blocks+0x6b/0xf3e [xor]
    [ 6.031164] [<ffffffffa00b50c2>] ? do_xor_speed+0xc2/0xc2 [xor]
    [ 6.037366] [<ffffffff810020e2>] do_one_initcall+0xe2/0x190
    [ 6.040833] [<ffffffff810c5717>] load_module+0xf47/0x1400
    [ 6.044384] [<ffffffff81307600>] ? ddebug_proc_write+0xf0/0xf0
    [ 6.048178] [<ffffffff810c1e34>] ? copy_module_from_fd.isra.42+0x44/0x140
    [ 6.053059] [<ffffffff810c5d66>] SyS_finit_module+0x86/0xb0
    [ 6.057388] [<ffffffff8160f399>] system_call_fastpath+0x16/0x1b
    [ 6.061935] Code: 89 d4 53 48 89 f3 e8 80 a3 f6 e0 84 c0 0f 84 b9 01 00 00 e8 63 a4 f6 e0 4d 85 ed 49 8d 45 ff 0f 84 9b 01 00 00 66 0f 1f 44 00 00 <c4> c1 7d 6f 04 24 c5 fc 57 03 c5 fd 7f 03 c4 c1 7d 6f 4c 24 20
    [ 6.096859] RIP [<ffffffffa00afc60>] xor_avx_2+0x40/0x210 [xor]
    [ 6.101594] RSP <ffff88003f83fd28>
    [ 6.106464] ---[ end trace 85ff96b28d97c5f0 ]---
    dracut-pre-udev[200]: //lib/dracut/hooks/pre-udev/30-anaconda-modprobe.sh: line 32: 231 Segmentation fault modprobe $m &>/dev/null

Verified: kernel-xen-2.6.18-360.el5

Guest boots up successfully.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1348.html
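
As a side note on the oops quoted in the reproduction comment: the `Code:` line marks the faulting byte with angle brackets, and here that byte is `c4`. In 64-bit code, `0xC4` and `0xC5` are the VEX prefixes used by AVX instructions, which is consistent with the crash being reported in `xor_avx_2`. The sketch below, with hypothetical helper names and an excerpt of the `Code:` line, shows one way to check this when triaging similar reports; it assumes 64-bit guest code and does not decode the full instruction.

```python
import re

# First byte of a VEX-encoded instruction in 64-bit mode.
VEX_PREFIXES = {0xC4: "3-byte VEX", 0xC5: "2-byte VEX"}


def faulting_byte(code_line: str) -> int:
    """Return the byte that the oops marks as <xx> in its Code: line."""
    match = re.search(r"<([0-9a-fA-F]{2})>", code_line)
    if match is None:
        raise ValueError("no marked byte found in Code: line")
    return int(match.group(1), 16)


def describe_prefix(code_line: str) -> str:
    """Say whether the faulting instruction starts with a VEX prefix."""
    return VEX_PREFIXES.get(faulting_byte(code_line), "not a VEX prefix")


# Excerpt of the Code: line from the oops above (leading bytes omitted).
code_excerpt = "Code: 66 0f 1f 44 00 00 <c4> c1 7d 6f 04 24 c5 fc 57 03"
result = describe_prefix(code_excerpt)
print(result)
```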