Description of problem:

[ 3.534209] async_tx: api initialized (async)
[ 3.535048] xor: automatically using best checksumming function:
[ 3.536056] invalid opcode: 0000 [#1] SMP
[ 3.536060] Modules linked in: xor(+) async_tx raid1 raid0 iscsi_ibft iscsi_boot_sysfs scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi squashfs cramfs
[ 3.536073] CPU 0
[ 3.536075] Pid: 211, comm: modprobe Not tainted 3.6.10-4.fc18.x86_64 #1
[ 3.536079] RIP: e030:[<ffffffffa006bbfc>] [<ffffffffa006bbfc>] xor_avx_2+0x4c/0x250 [xor]
[ 3.536085] RSP: e02b:ffff880003871cc0 EFLAGS: 00010282
[ 3.536088] RAX: 0000000000000000 RBX: ffff880003874000 RCX: 0000000000000050
[ 3.536091] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 000000008005023b
[ 3.536094] RBP: ffff880003871d98 R08: 0000000000000000 R09: 00000000000000f0
[ 3.536097] R10: 0000000000007ff0 R11: 0720073a076e076f R12: ffff880003877000
[ 3.536099] R13: 0000000000000008 R14: 000000008005003b R15: ffff880003874000
[ 3.536105] FS: 00007f20ca066740(0000) GS:ffff88002fc00000(0000) knlGS:0000000000000000
[ 3.536108] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 3.536110] CR2: 00007fa0d58f39f0 CR3: 00000000277a2000 CR4: 0000000000000620
[ 3.536113] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3.536116] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
[ 3.536119] Process modprobe (pid: 211, threadinfo ffff880003870000, task ffff88002a441710)
[ 3.536122] Stack:
[ 3.536124] ffffffff8105e3f8 0000000000000000 0000000000000006 00000000ffffffff
[ 3.536129] 0000000000000000 0000000000000034 ffffffff8162ccab ffff880003874000
[ 3.536133] ffff880003877000 ffffffffa006e000 00000000fffb79ef ffffffff816242ca
[ 3.648235] Call Trace:
[ 3.648242] [<ffffffff8105e3f8>] ? console_unlock+0x1e8/0x440
[ 3.648247] [<ffffffff8162ccab>] ? xen_hypervisor_callback+0x1b/0x20
[ 3.648252] [<ffffffff816242ca>] ? error_exit+0x2a/0x60
[ 3.648255] [<ffffffff81623df8>] ? retint_restore_args+0x5/0x6
[ 3.648259] [<ffffffffa006c97a>] do_xor_speed+0x6e/0xc8 [xor]
[ 3.648263] [<ffffffffa0077075>] calibrate_xor_blocks+0x75/0x1000 [xor]
[ 3.648274] [<ffffffffa0077000>] ? 0xffffffffa0076fff
[ 3.648279] [<ffffffff8100212a>] do_one_initcall+0x12a/0x180
[ 3.648284] [<ffffffff810be400>] sys_init_module+0x140/0x21f0
[ 3.648289] [<ffffffff812fb480>] ? ddebug_proc_open+0xd0/0xd0
[ 3.648292] [<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b
[ 3.648295] Code: 49 89 d4 65 48 8b 04 25 28 00 00 00 48 89 84 24 98 00 00 00 31 c0 49 c1 ed 09 e8 30 7a f9 e0 66 90 49 89 c6 e8 c6 89 f9 e0 66 90 <c5> fc 29 04 24 c5 fc 29 4c 24 20 c5 fc 29 54 24 40 c5 fc 29 5c
[ 3.648345] RIP [<ffffffffa006bbfc>] xor_avx_2+0x4c/0x250 [xor]
[ 3.648349] RSP <ffff880003871cc0>
[ 3.648352] ---[ end trace eb2b88f802238019 ]---
dracut-pre-udev[191]: //lib/dracut/hooks/pre-udev/30-anaconda-modprobe.sh: line 32: 211 Segmentation fault modprobe $m &>/dev/null

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Hi, do F17 installs on the same host work?
Cannot test F17 here, but F16 did work. Also F18 continues to install ok, but not sure how much I can trust this system.

best regards,

Florian La Roche
This is likely due to

commit 841e3604d35aa70d399146abdc526d8c89a2c2f5
Author: Suresh Siddha <suresh.b.siddha>
Date: Fri Aug 24 14:13:00 2012 -0700

    x86, fpu: always use kernel_fpu_begin/end() for in-kernel FPU usage

    use kernel_fpu_begin/end() instead of unconditionally accessing cr0 and
    saving/restoring just the few used xmm/ymm registers.

which is in the f18 kernel.

RHEL5 Xen (and its clones) rely on cr0 changes to keep FPU state consistent. The patch above removes those changes as an optimization and, unfortunately, doesn't supply any alternative path as long as the AVX cpufeature is present. RHEL5 Xen exposes AVX to guests (until now that has been harmless, and possibly allowed guests to benefit from a small performance boost).

Upstream Xen has enhanced its FPU save/restore, so it's possible that running over a later Xen wouldn't have this problem. Either way, PV guests wouldn't have the problem there, because I see in the upstream code that AVX is masked for PV guests when the domain can't use XSAVE. No domain running over RHEL5 Xen can use XSAVE, as it's not supported and is already masked. So for the resolution we should also mask AVX from the guests in the hypervisor.
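The proposed resolution (hide AVX whenever the guest cannot use XSAVE) can be illustrated with a small sketch. This is not the hypervisor patch itself: `mask_guest_ecx` and its parameters are hypothetical names made up for illustration, and only the bit positions (XSAVE = 26, OSXSAVE = 27, AVX = 28 in ECX of CPUID leaf 1) are taken from the architecture documentation.

```python
# Bit positions in ECX of CPUID leaf 1 (architectural constants).
XSAVE_BIT = 26
OSXSAVE_BIT = 27
AVX_BIT = 28


def mask_guest_ecx(host_ecx: int, xsave_supported: bool) -> int:
    """Return the CPUID.1:ECX value a guest would be shown.

    Hypothetical helper: if the guest cannot use XSAVE, hide XSAVE,
    OSXSAVE and AVX together, since AVX state cannot be managed
    without XSAVE.
    """
    ecx = host_ecx & 0xFFFFFFFF
    if not xsave_supported:
        hidden = (1 << XSAVE_BIT) | (1 << OSXSAVE_BIT) | (1 << AVX_BIT)
        ecx &= ~hidden
    return ecx


def has_bit(value: int, bit: int) -> bool:
    """True if the given bit is set in value."""
    return bool((value >> bit) & 1)


if __name__ == "__main__":
    # Example host value with XSAVE and AVX set (made up for illustration).
    host = (1 << XSAVE_BIT) | (1 << AVX_BIT) | 1
    guest = mask_guest_ecx(host, xsave_supported=False)
    print("guest sees AVX:", has_bit(guest, AVX_BIT))
```

The point of the sketch is only that AVX is never advertised on its own: a guest kernel that sees AVX assumes it can rely on XSAVE-managed state.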
A workaround for installing F18 and other distros using kernels >= v3.7-rc1 is to add the following parameter to the guest's kernel command line

  clearcpuid=156

e.g. with virt-install use '-x clearcpuid=156'
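For readers wondering where 156 comes from: the kernel numbers x86 CPU features as word * 32 + bit, and AVX is bit 28 of word 4 (the word holding ECX of CPUID leaf 1). A minimal sketch of that arithmetic, with helper names that are made up for illustration:

```python
def feature_number(word: int, bit: int) -> int:
    """Combine a cpufeature word index and bit index into one number."""
    return word * 32 + bit


def split_feature_number(number: int) -> tuple:
    """Inverse of feature_number: return (word, bit)."""
    return divmod(number, 32)


# AVX lives in word 4 (CPUID.1:ECX), bit 28.
AVX = feature_number(4, 28)

if __name__ == "__main__":
    print("clearcpuid=%d" % AVX)
    print("word, bit =", split_feature_number(AVX))
```

The same arithmetic gives the number for any other feature bit, should a different one need to be cleared.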
(In reply to comment #4)
> A workaround for installing F18 and other distros using kernels >= v3.7-rc1
> is to add the following parameter to the guest's kernel command line
>
> clearcpuid=156
>
> e.g. with virt-install use '-x clearcpuid=156'

I launched a guest from the F18 ISO using the xm command. After adding the parameter "clearcpuid=156", the "Segmentation fault" disappears, but the installation stalls at one step for 235s (see attachment). After that, it successfully launches anaconda.
Created attachment 685709
Installation progress screenshot

After 235s, the installation continues and finally launches anaconda.
(In reply to comment #5)
>
> Launch a guest with F18 iso using xm command, after added parameter
> "clearcpuid=156", the "Segmentation fault" disappear but the installation
> progress will stopped at one step for 235s(see attachment), after that, it
> successfully launch anaconda.

That's xenbus waiting for devices. I'm not sure what it's waiting for, but I would guess it's a different problem (possibly config related). Please open a new bug and attach your guest config file.
(In reply to comment #7)
> (In reply to comment #5)
> >
> > Launch a guest with F18 iso using xm command, after added parameter
> > "clearcpuid=156", the "Segmentation fault" disappear but the installation
> > progress will stopped at one step for 235s(see attachment), after that, it
> > successfully launch anaconda.
>
> That's xenbus waiting for devices. I'm not sure what it's waiting for, but I
> would guess it's a different problem (possibly config related). Please open
> a new bug and attach your guest config file.

Sorry, that was my mistake: I misconfigured the disk in the config file (using tap:qcow for a raw image). With that corrected, the stall no longer occurs, and the "clearcpuid" parameter is a workaround for this bug.
This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate, in the next release of Red Hat Enterprise Linux.
PM, kernel-xen is the kernel, so the component is scheduled to be updated and we need the pm_ack.

drew
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.
Patch(es) available in kernel-2.6.18-360.el5 You can download this test kernel (or newer) from http://people.redhat.com/plougher/el5/ Detailed testing feedback is always welcomed. If you require guidance regarding testing, please ask the bug assignee.
Reproduced: kernel-xen-2.6.18-359.el5

Booting a 64-bit HVM guest from RHEL-7.0-20130606.0-Server-x86_64-dvd1-ks.iso leads to a guest crash:

[ 6.021764] Call Trace:
[ 6.023606] [<ffffffffa00b5071>] do_xor_speed+0x71/0xc2 [xor]
[ 6.027143] [<ffffffffa00b512d>] calibrate_xor_blocks+0x6b/0xf3e [xor]
[ 6.031164] [<ffffffffa00b50c2>] ? do_xor_speed+0xc2/0xc2 [xor]
[ 6.037366] [<ffffffff810020e2>] do_one_initcall+0xe2/0x190
[ 6.040833] [<ffffffff810c5717>] load_module+0xf47/0x1400
[ 6.044384] [<ffffffff81307600>] ? ddebug_proc_write+0xf0/0xf0
[ 6.048178] [<ffffffff810c1e34>] ? copy_module_from_fd.isra.42+0x44/0x140
[ 6.053059] [<ffffffff810c5d66>] SyS_finit_module+0x86/0xb0
[ 6.057388] [<ffffffff8160f399>] system_call_fastpath+0x16/0x1b
[ 6.061935] Code: 89 d4 53 48 89 f3 e8 80 a3 f6 e0 84 c0 0f 84 b9 01 00 00 e8 63 a4 f6 e0 4d 85 ed 49 8d 45 ff 0f 84 9b 01 00 00 66 0f 1f 44 00 00 <c4> c1 7d 6f 04 24 c5 fc 57 03 c5 fd 7f 03 c4 c1 7d 6f 4c 24 20
[ 6.096859] RIP [<ffffffffa00afc60>] xor_avx_2+0x40/0x210 [xor]
[ 6.101594] RSP <ffff88003f83fd28>
[ 6.106464] ---[ end trace 85ff96b28d97c5f0 ]---
dracut-pre-udev[200]: //lib/dracut/hooks/pre-udev/30-anaconda-modprobe.sh: line 32: 231 Segmentation fault modprobe $m &>/dev/null

Verified: kernel-xen-2.6.18-360.el5

The guest boots up successfully.
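One way to double-check from inside a guest whether AVX is still advertised is to look at the flags line of /proc/cpuinfo. The sketch below is only an illustration: the function names are made up, the sample text is invented rather than taken from this report, and it assumes that a masked or cleared AVX feature no longer appears as the "avx" flag.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the flags listed on the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def avx_without_xsave(flags: set) -> bool:
    """True for the risky combination: AVX advertised but XSAVE missing."""
    return "avx" in flags and "xsave" not in flags


if __name__ == "__main__":
    # Only meaningful on a Linux guest; the path is the standard procfs file.
    with open("/proc/cpuinfo") as f:
        flags = cpu_flags(f.read())
    print("avx advertised:", "avx" in flags)
    print("avx without xsave:", avx_without_xsave(flags))
```

If the second line prints True, the guest is in the configuration described in this bug.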
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2013-1348.html