The customer is running Windows 2008 guests on a RHEL 5.4 Xen system that uses Intel Xeon processors. When running a Cygwin bash shell on a Windows guest, the guest crashes with the following:

Exception: STATUS_ACCESS_VIOLATION at eip=7710FBA2
eax=00000001 ebx=00000000 ecx=00000000 edx=00000000 esi=0028C850 edi=004C5490
ebp=0028C7EC esp=0028C798 program=C:\cygwin\bin\bash.exe, pid 2456, thread main
cs=0023 ds=002B es=002B fs=0053 gs=002B ss=002B
Stack trace:
Frame     Function  Args
0028C7EC  7710FBA2  (0028C850, 0028C8C8, 00010000, 00000000)
0028C854  74CC7ED4  (004C50E8, 40000001, 00000000, 004C5490)
0028C888  76A31AE9  (0028C8C8, 40000001, 00000000, 00000001)
0028C9D8  61095772  (611588E0, 0028CA18, 0028CA14, 00010000)
0028CA28  61095BFD  (0028CA64, 00010000, 00010000, 00000000)
0028CA78  6109689A  (0028CAF0, 0028CB00, 0028CAFC, 00000000)
0028CB18  610B5178  (00DDECA0, 00000000, FFFFFFFF, FFFFFFFF)
0028CB48  00411DB3  (00DDECA0, 00000000, 004784B4, 0042F953)
0028CB78  00403868  (00000001, 00DD84C8, 004784B4, 72FB1748)
0028CD38  00403377  (0028D008, 00000002, 611A0CCE, 61006DDA)
0028CD78  61006DDA  (00000000, 0028CDB0, 610066E0, 7EFDE000)

This is discussed in:
http://www.mail-archive.com/kvm@vger.kernel.org/msg09173.html
http://sourceware.org/ml/cygwin/2008-01/msg00582.html
http://old.nabble.com/Xen-3.2.1---Win-2003-2008-Server-64-bit-guests:-cygwin-bash-builtin-%22test%22-crashes-td19001336.html
http://lists.xensource.com/archives/html/xen-devel/2009-11/msg00164.html

There is a problem in the processing of the “mov %gs...” instruction. On AMD processors the gs register is restored somehow, but on Intel processors it is not. Even the technical commenters do not sound like they know what restores the register on AMD.

It appears that it still crashes in 3.4.0, but something in 3.4.1 fixed the problem. However, it looks like one of those “something fixed it but we're not sure what” issues, and I have been told we have no plans to move to Xen 3.4.1.
Steps to reproduce, from the customer:
1. Build a 64-bit Windows 2008 Server on a Xen VM.
2. Install Cygwin with the bash shell.
3. Run the following command in a bash shell: test -a test
Well, I'm investigating this now and I found something relevant in the KVM kernel code at [1]. Unfortunately, it appears that the vmx_get_msr() function is KVM-only and not available in Xen. Judging by the path in the patch file (kernel/x86/vmx.c), the change is most probably made in the KVM kernel module itself, so this may be a kernel-xen bug, but I'm not 100% sure about this one. It may still be a problem in VMX Assist in Xen user space, which is used on Intel CPUs, but I think this is most probably a bug in the hypervisor, which belongs to the kernel-xen component.

Bill, you wrote that it still crashes with 3.4.0 but has been fixed in 3.4.1. Was the same version of the Xen kernel used in both cases? Which version was used with xen-3.4.0 and which with xen-3.4.1? This information could help a lot in determining the component.

Thanks,
Michal

[1] http://patchwork.kernel.org/patch/7092/
Created attachment 431139
Test fix for xen HV

(In reply to comment #1)

Well, I found some relevant information about this one. This has been fixed in xen-unstable.hg c/s 19953 (vmx: Fix handling of FS/GS base MSRs), available at [1]. Since the change is in the xen/arch/x86/hvm/vmx/vmx.c file, it is hypervisor related, and therefore the component is kernel-xen.

Also, I tried the code described at [2] to make it fail on a 64-bit Linux guest, but it didn't crash:

    #include <setjmp.h>

    jmp_buf env;

    int main(void)
    {
        if (setjmp(env))
            return 0;
        longjmp(env, 1);
    }

This didn't crash the application when compiled with gcc. I also tried with Windows 2003 x64, but it didn't crash there as well. I'm currently downloading and installing Windows 2008 to test the patch.

Michal

[1] http://xenbits.xensource.com/xen-unstable.hg?rev/fe4c6845a9d7
[2] http://lists.xensource.com/archives/html/xen-devel/2009-11/msg00164.html
Bill,
I tried installing Windows 2008 x64 edition with the following package versions and I was unable to reproduce it:

kernel-xen-2.6.18-194.3.1.el5
xen-3.0.3-113.el5(virttest30.g9810091)

Given comment #2 and the fact that I was unable to reproduce it, I guess this has already been fixed in the kernel-xen component. Just a note: Windows 2008 is *not* Windows 2008R2. Windows 2008R2 is the successor of Windows 2008 (R1).

Could you please guide the customers to try to reproduce it with the package versions listed above? The packages are available here:

kernel-xen - http://people.redhat.com/jwilson/el5/194.el5/
xen - http://people.redhat.com/mrezanin/xen/

These links are public, so the customers should have no problem accessing them. Could you ask them to download the appropriate versions for their architecture and try again?

Thanks,
Michal
Oh, sorry, my bad. The problem was with the Windows permissions on the C:\cygwin folder (even for the Administrator user), which is why I was unable to reproduce it. After moving to a read-write location (e.g. C:\Documents and Settings\Administrator\Data) I was able to reproduce it. I also tried it with my patch applied and it was working fine.

Test command: "test -e / ; echo hi"
Before my patch was applied: the bash shell just exited
After my patch was applied: the bash shell echoed "hi" and continued

So according to my testing, this looks like a plausible fix.

Michal
Bill,
could you please guide the customers to the build at [1], which I've just created and put on my people page? It's working fine for me, so I'd like the customers to test this kernel/hypervisor version. According to the IT they're using the x86_64 architecture, so there is an RPM for their architecture, and just kernel-xen should be enough. Could you please provide me with the results of their testing?

Thanks,
Michal

[1] http://people.redhat.com/minovotn/kernel-xen/
Bill,
any updates on this?

Thanks,
Michal
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
in kernel-2.6.18-219.el5

You can download this test kernel from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.
Test with:
host: x86_64 (Intel(R) Xeon(R) CPU W3520 @ 2.67GHz)
xen-3.0.3-120.el5
guest: Win2008-64

Test steps:
1. Install Cygwin with shells on the Win2008-64 guest.
2. Run the following command in the bash shell: test -e / ; echo hi

Reproduced the bug with kernel-xen-2.6.18-215.el5: no guest crash, but the bash shell just exited at step 2, the same as described in comment 5.

Verified the bug with kernel-xen-2.6.18-238.el5: the bash shell echoed "hi" and continued at step 2.

According to the test results above, move to VERIFIED.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html