Description of problem:
See http://post-office.corp.redhat.com/archives/virtualist/2007-April/msg00289.html and http://post-office.corp.redhat.com/archives/virtualist/2007-April/msg00290.html

Version-Release number of selected component (if applicable):
Any current xen kernel, apparently. Seems to stretch back as far as 2.6.19-1.2895, which is what I've been using.

How reproducible:
always

Steps to Reproduce:
1. Get the test program from the first mail. I don't want to attach it here, it's too big.
2. Run with zero, one, two, ... parameters, depending on how many cores the machine has.
3.

Actual results:
always 0

Expected results:
0, 1, 2, ... depending on parameters

Additional info:
Use strace to see what it's doing. It's pretty simple. The code uses sched_setaffinity to bind the process to one processor, depending on the parameter count. Then it calls the vgetcpu vsyscall. That's the code in the vdso. x86-64 does not have a real syscall.
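For readers without access to the linked mail, the reproduction described above can be approximated with a short C sketch. This is not the original test program: the helper name `pin_and_report_cpu` is hypothetical, and it goes through glibc's `sched_getcpu()` rather than calling the vsyscall directly, on the assumption that glibc routes the call to the same vgetcpu code on x86-64.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical helper, not taken from the original test program.
 * Pins the calling thread to `cpu`, then asks which CPU it is running on.
 * Returns the index reported by sched_getcpu(), or -1 on any failure. */
int pin_and_report_cpu(int cpu)
{
    cpu_set_t mask;

    /* Reject indices that do not fit in a cpu_set_t. */
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;

    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);

    /* pid 0 means "the calling thread". */
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
        return -1;

    /* After a successful pin, the reported CPU should equal `cpu`. */
    return sched_getcpu();
}
```

Calling `pin_and_report_cpu(argc - 1)` from `main` would mirror the parameter-count convention of the report. On a working kernel the return value should match the requested CPU; the bug described here is that the answer is always 0.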
I suspect this has never worked. The initialization code is all marked as #ifndef CONFIG_XEN. So, while vgetcpu does run successfully, it's effectively returning uninitialized data. I'm looking at what's needed to make it useful. Due to the lack of decent NUMA info, we won't get any good node data.
Node data is not important _right now_ (will be soon, though). The sched_getcpu() interface in libc only exposes the CPU index. If you cannot fix the code, you should at least make sure that the vsyscall fails and returns ENOSYS. Currently it returns zero, which is very wrong.
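To illustrate why a failing call is preferable to a bogus zero, here is a minimal sketch of how a caller might classify the result of `sched_getcpu()`. The enum and both function names are hypothetical and exist only for this example; the only libc call used is `sched_getcpu()`.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>

/* Hypothetical result categories for this example only. */
enum cpu_query_status {
    CPU_QUERY_OK,          /* a CPU index was returned */
    CPU_QUERY_UNSUPPORTED, /* the kernel reported ENOSYS */
    CPU_QUERY_ERROR        /* some other failure */
};

/* Classify a (return value, errno) pair from sched_getcpu(). */
enum cpu_query_status classify_cpu_query(int result, int err)
{
    if (result >= 0)
        return CPU_QUERY_OK;
    return err == ENOSYS ? CPU_QUERY_UNSUPPORTED : CPU_QUERY_ERROR;
}

/* Query the current CPU; *cpu_out is written only on success. */
enum cpu_query_status query_cpu(int *cpu_out)
{
    errno = 0;
    int cpu = sched_getcpu();
    int err = errno;

    if (cpu >= 0)
        *cpu_out = cpu;
    return classify_cpu_query(cpu, err);
}
```

Note that a wrong answer of 0 is classified as `CPU_QUERY_OK`: the caller has no way to tell it apart from a genuine CPU 0, whereas an ENOSYS failure lets it fall back to another method.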
Yeah, it's all wrong, I've fixed it.
Fixing this requires two patches. The first fixes a problem where we simply never launch the APs, so we end up with just a UP machine with all APs paused, waiting to be initialized and launched. The second fixes an issue with the vgetcpu vsyscall. It turns out this vsyscall was there, but none of its initialization code would run because it was compiled out with #ifndef CONFIG_XEN. When restoring that code, we have to make sure the GDT entry is updated via a hypercall, because writing to the GDT directly causes a write fault: the GDT page is write-protected under Xen.
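As background on what that GDT entry carries, the following user-space sketch models the packing that, to my recollection of the mainline x86_64 vsyscall code of this era, is stored in the per-CPU descriptor's limit field: the CPU index in the low 12 bits and the node number above it. This layout is an assumption, not something stated in this report or taken from the attached patches, and the function names are hypothetical. The sketch models only the packed value, not the descriptor bit layout or the hypercall.

```c
#include <stdint.h>

/* Assumed layout: low 12 bits hold the CPU index, the bits above hold the node. */
#define CPU_BITS 12u
#define CPU_MASK ((1u << CPU_BITS) - 1u)

/* What the kernel-side initialization would store for a given CPU and node. */
uint32_t pack_cpu_node(uint32_t cpu, uint32_t node)
{
    return (node << CPU_BITS) | (cpu & CPU_MASK);
}

/* What a vgetcpu-style reader would extract as the CPU index. */
uint32_t unpack_cpu(uint32_t packed)
{
    return packed & CPU_MASK;
}

/* What a vgetcpu-style reader would extract as the node number. */
uint32_t unpack_node(uint32_t packed)
{
    return packed >> CPU_BITS;
}
```

Under this assumed layout, a packed value of 0 decodes to CPU 0 and node 0.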
Created attachment 153558: Fix SMP for x86_64 XenLinux
First patch fixes SMP support for x86_64 XenLinux.
Created attachment 153559: x86_64 vgetcpu vsyscall fix
Second patch fixes x86_64 vgetcpu vsyscall for XenLinux.
The patches work well for me. They also corrected the issues relating to "xm vcpu-set 0": I can switch anywhere between 1 and 4 (the number of processors I have) without any problems or crashes.
The patches were included in Fedora Core 7, in build kernel-xen-2.6-2.6.20-2925.8.fc7.