Bug 513537

Summary: 32-bit PV guest on x86_64 host remains shut down after 'xm save' fails
Product: Red Hat Enterprise Linux 5
Component: xen
Version: 5.4
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: low
Reporter: Yufang Zhang <yuzhang>
Assignee: Michal Novotny <minovotn>
QA Contact: Virtualization Bugs <virt-bugs>
CC: areis, clalance, jbartlett01, lilu, llim, mrezanin, syeghiay, tao, xen-maint
Target Milestone: rc
Fixed In Version: xen-3.0.3-108.el5
Doc Type: Bug Fix
Clones: 537800
Last Closed: 2011-01-13 22:18:05 UTC
Bug Depends On: 486157
Bug Blocks: 514499
Attachments:
- xend.log
- Resume properly after 32-on-64 failure
- Updated fix for 32-on-64 failed save
- New version of this patch
- Updated fix for 32-on-64 failed save

Description Yufang Zhang 2009-07-24 03:31:25 UTC
Created attachment 354965 [details]
xend.log

Description of problem:
A 32-bit PV guest on an x86_64 host remains shut down after 'xm save' fails.

Version-Release number of selected component (if applicable):
xen-3.0.3-91.el5

How reproducible:
always

Steps to Reproduce:
1. Start a 32-bit paravirtualized guest with 512MB of memory on an x86_64 host.
   (The guest must not have a VFB device; otherwise it will stop running when
   'xm save' fails, due to bug 513335.)
2. Mount a 100MB disk partition on /mnt (too small to hold the 512MB save
   image, so the save is guaranteed to fail).
3. Run:
     # xm save <guest> /mnt/<guest>.save
   The save then fails with:
     Error: /usr/lib/xen/bin/xc_save 22 5 0 0 0 failed
     Usage: xm save <Domain> <CheckpointFile>

     Save a domain state to restore later.

Actual results:
Running:
   # xm list
shows:
   domain1 5 511 1 ---s-- 11.3
The guest remains shut down and cannot run again. This means that bug 486157
has not been fully fixed in xen-3.0.3-90.el5.


Expected results:
The guest should keep working as if no 'xm save' command had ever been issued,
as long as it does not have a VFB device.

Additional info:
A 32-bit PV guest (without a VFB device) on a 32-bit host works fine when
'xm save' fails, as if no 'xm save' command had ever been issued.

Comment 1 Michal Novotny 2009-08-21 13:38:49 UTC
Well, I see a strange line there in xend.log... This needs some investigation. I'll take care of this one...

The relevant xend.log lines, with my patch for BZ #513335 applied, are:

...
[2009-08-21 15:35:59 xend.XendDomainInfo 22775] INFO (XendDomainInfo:1204) Domain has shutdown: name=migrating-RH54PV id=5 reason=suspend.
[2009-08-21 15:35:59 xend.XendDomainInfo 22775] DEBUG (XendDomainInfo:2089) zombieDeviceCleanup: watch result is {'Count': 2, 'Done': 1}
[2009-08-21 15:35:59 xend.XendDomainInfo 22775] DEBUG (XendDomainInfo:2199) Domain renamed to RH54PV
[2009-08-21 15:35:59 xend.XendDomainInfo 22775] DEBUG (XendDomainInfo:1025) Storing domain details: {'console/ring-ref': '2206951', 'console/port': '2', 'name': 'RH54PV', 'console/limit': '1048576', 'vm': '/vm/ba42b61c-edd7-d9dd-da1c-c2d9e6629f92', 'domid': '5', 'cpu/0/availability': 'online', 'memory/target': '1048576', 'store/ring-ref': '2206952', 'store/port': '1'}
[2009-08-21 15:35:59 xend 22775] DEBUG (DevController:114) DevController: writing {'protocol': 'x86_32-abi', 'state': '1', 'backend-id': '0', 'backend': '/local/domain/0/backend/vkbd/5/0'} to /local/domain/5/device/vkbd/0.
[2009-08-21 15:35:59 xend 22775] DEBUG (DevController:116) DevController: writing {'frontend-id': '5', 'domain': 'RH54PV', 'frontend': '/local/domain/5/device/vkbd/0', 'state': '1', 'online': '1'} to /local/domain/0/backend/vkbd/5/0.
[2009-08-21 15:35:59 xend.XendDomainInfo 22775] DEBUG (XendDomainInfo:631) Checking for duplicate for uname: /var/lib/xen/images/mig/RH54PV.img [tap:aio:/var/lib/xen/images/mig/RH54PV.img], dev: xvda, mode: w
[2009-08-21 15:35:59 xend 22775] DEBUG (blkif:27) exception looking up device number for xvda: [Errno 2] No such file or directory: '/dev/xvda'
[2009-08-21 15:35:59 xend 22775] DEBUG (blkif:27) exception looking up device number for xvda: [Errno 2] No such file or directory: '/dev/xvda'
[2009-08-21 15:35:59 xend 22775] DEBUG (blkif:27) exception looking up device number for xvda: [Errno 2] No such file or directory: '/dev/xvda'
[2009-08-21 15:35:59 xend 22775] DEBUG (DevController:114) DevController: writing {'virtual-device': '51712', 'device-type': 'disk', 'protocol': 'x86_32-abi', 'backend-id': '0', 'state': '1', 'backend': '/local/domain/0/backend/tap/5/51712'} to /local/domain/5/device/vbd/51712.
[2009-08-21 15:35:59 xend 22775] DEBUG (DevController:116) DevController: writing {'domain': 'RH54PV', 'frontend': '/local/domain/5/device/vbd/51712', 'dev': 'xvda', 'state': '1', 'params': 'aio:/var/lib/xen/images/mig/RH54PV.img', 'mode': 'w', 'online': '1', 'frontend-id': '5', 'type': 'tap'} to /local/domain/0/backend/tap/5/51712.
[2009-08-21 15:35:59 xend 22775] DEBUG (DevController:114) DevController: writing {'mac': '00:16:36:0f:d2:e6', 'handle': '0', 'protocol': 'x86_32-abi', 'backend-id': '0', 'state': '1', 'backend': '/local/domain/0/backend/vif/5/0'} to /local/domain/5/device/vif/0.
[2009-08-21 15:35:59 xend 22775] DEBUG (DevController:116) DevController: writing {'bridge': 'virbr0', 'domain': 'RH54PV', 'handle': '0', 'script': '/etc/xen/scripts/vif-bridge', 'state': '1', 'frontend': '/local/domain/5/device/vif/0', 'mac': '00:16:36:0f:d2:e6', 'online': '1', 'frontend-id': '5'} to /local/domain/0/backend/vif/5/0.
[2009-08-21 15:35:59 xend 22775] DEBUG (DevController:114) DevController: writing {'protocol': 'x86_32-abi', 'state': '1', 'backend-id': '0', 'backend': '/local/domain/0/backend/vfb/5/0'} to /local/domain/5/device/vfb/0.
[2009-08-21 15:35:59 xend 22775] DEBUG (DevController:116) DevController: writing {'vncunused': '1', 'domain': 'RH54PV', 'frontend': '/local/domain/5/device/vfb/0', 'state': '1', 'keymap': 'en-us', 'online': '1', 'frontend-id': '5', 'type': 'vnc'} to /local/domain/0/backend/vfb/5/0.
[2009-08-21 15:35:59 xend 22775] DEBUG (vfbif:70) No VNC passwd configured for vfb access
[2009-08-21 15:35:59 xend 22775] DEBUG (vfbif:11) Spawn: ['/usr/lib64/xen/bin/qemu-dm', '-M', 'xenpv', '-d', '5', '-domain-name', 'RH54PV', '-vnc', '0.0.0.0:0', '-vncunused', '-k', 'en-us']
[2009-08-21 15:35:59 xend.XendDomainInfo 22775] DEBUG (XendDomainInfo:2206) XendDomainInfo.resumeDomain: devices created
[2009-08-21 15:35:59 xend.XendDomainInfo 22775] ERROR (XendDomainInfo:2211) XendDomainInfo.resume: xc.domain_resume failed on domain 5.
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2208, in resumeDomain
    xc.domain_resume(self.domid, fast)
Error: (1, 'Internal error', "Couldn't map p2m_frame_list_list")
[2009-08-21 15:35:59 xend 22775] DEBUG (XendCheckpoint:137) XendCheckpoint.save: resumeDomain
[2009-08-21 15:36:00 xend.XendDomainInfo 22775] INFO (XendDomainInfo:1204) Domain has shutdown: name=RH54PV id=5 reason=suspend.

Comment 2 Michal Novotny 2009-08-21 14:52:41 UTC
Well, after I added error-code logging to libxc, xend.log contains this line:

[2009-08-21 16:13:04 xend.XendDomainInfo 2169] ERROR (XendDomainInfo:2211) XendDomainInfo.resume: xc.domain_resume failed on domain 7.
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2208, in resumeDomain
    xc.domain_resume(self.domid, fast)
Error: (1, 'Internal error', "Couldn't map p2m_frame_list_list (errno 14)")

According to perror, error code 14 means Bad address (EFAULT):

# perror 14
OS error code  14:  Bad address

The function called here to map p2m_frame_list_list is xc_map_foreign_range() (which boils down to a hypervisor call, more specifically the IOCTL_PRIVCMD_MMAP ioctl). The issue is that it fails in the domain_resume() path, yet a successful save never takes this path, because in that case there is no domain_resume() call. When restoring a domain there also seems to be no domain_resume() call, so this is not that simple. Does anybody know when domain_resume() is called?
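
For reference, the declaration of the mapping helper in question, from the xen 3.x libxc header xenctrl.h (the exact signature in the RHEL 5 package may differ slightly):

/* Maps `size` bytes of domain `dom`'s memory, starting at machine frame
 * `mfn`, into the calling process; implemented on top of the
 * IOCTL_PRIVCMD_MMAP ioctl on /proc/xen/privcmd. */
void *xc_map_foreign_range(int xc_handle, uint32_t dom,
                           int size, int prot, unsigned long mfn);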

Thanks,
Michal

Comment 3 Michal Novotny 2009-09-02 11:19:55 UTC
Well, it seems that the foreign-range mapping is called with a zero-length argument, so this may be the issue. I need to investigate how to determine the required length.

Comment 4 Michal Novotny 2009-09-07 12:54:53 UTC
Well, investigation shows that in Python, resumeDomain() is called *only* when saving the domain, in XendCheckpoint.py. That's unfortunate, because the other candidate caller would be live checkpointing, which is not working yet; when I alter my XendDomainInfo.py to do a live checkpoint, it returns the same message.

I've done more investigation, and the result is that when we call resumeDomain() in Python, it calls xc_domain_resume() with fast = 0 for a PV guest, which in turn calls xc_domain_resume_any(). When I added some logging to this function, it kept returning sane information about the P2M frames for a non-checkpoint save, i.e. a normal save. A save with checkpoint (which resumes the domain) was the only place where resumeDomain() was called before my patch, and it keeps returning the same error as when the save fails.

The problem here is the max P2M and the P2M read via info.shared_info_frame, which comes back as 0 on the resumeDomain() call; this is why it returns error 14, Bad address. xc_map_foreign_range() cannot accept zero as its mfn argument, and that is the problem here. It seems that shared_info_frame holds something bad when doing a checkpoint, which behaves the same as resuming the domain after a failed save. This needs more investigation, and per the above, fixing it should fix the non-working checkpoint save as well...
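
For orientation, a condensed sketch of the start of xc_domain_resume_any() as the comment above describes it, based on upstream xen 3.x xc_resume.c (names and error handling trimmed; treat the details as approximate for the RHEL tree). The point of failure is the second mapping call being handed MFN 0:

#include <sys/mman.h>
#include <xenctrl.h>

static int map_p2m_frame_list_list(int xc_handle, uint32_t domid)
{
    xc_dominfo_t info;
    shared_info_t *shinfo;
    unsigned long *p2m_fll;

    if (xc_domain_getinfo(xc_handle, domid, 1, &info) != 1)
        return -1;

    /* Map the suspended domain's shared info frame. */
    shinfo = xc_map_foreign_range(xc_handle, domid, PAGE_SIZE,
                                  PROT_READ, info.shared_info_frame);
    if (shinfo == NULL)
        return -1;

    /* For a 32-bit guest on a 64-bit dom0 the shared_info layout differs,
     * so reading this field at the 64-bit offset yields garbage (0 here),
     * and the next call is asked to map MFN 0 and fails with EFAULT (14). */
    p2m_fll = xc_map_foreign_range(xc_handle, domid, PAGE_SIZE, PROT_READ,
                                   shinfo->arch.pfn_to_mfn_frame_list_list);
    return (p2m_fll != NULL) ? 0 : -1;
}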

Comment 5 Michal Novotny 2009-09-09 09:09:45 UTC
There's a strange bus error (SIGBUS) in libxc/xc_resume.c. The GDB output of the python xend process is:
...
Program received signal SIGBUS, Bus error.
[Switching to Thread 0x45ddc940 (LWP 6508)]
0x00002ad139f58103 in xc_domain_resume_any (xc_handle=7, domid=74) at xc_resume.c:263
263         memcpy( p2m_m, p2m, p2m_size * guest_width );
(gdb) p p2m_size
$1 = 264192
(gdb) p p2m_m
$2 = (xen_pfn_t *) 0x17a84c70
(gdb) p p2m
$3 = (xen_pfn_t *) 0x2aaaaaaf2000
(gdb) p memcpy( p2m_m, p2m, p2m_size * guest_width )

Program received signal SIGBUS, Bus error.
0x0000003f5547be5b in memcpy () from /lib64/libc.so.6
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (memcpy) will be abandoned.
(gdb) p p2m_size
$4 = 0
(gdb) p p2m_m
No symbol "p2m_m" in current context.
(gdb) p p2m
$5 = (xen_pfn_t *) 0x0
(gdb)
...

Comment 6 Michal Novotny 2009-09-22 14:58:50 UTC
This is really strange. While investigating, I found that after the p2m allocation in xc_resume.c there is a 0x2000 block of memory that was allocated via /proc/xen/privcmd (according to /proc/{xendPid}/maps), but I was unable to access any element of this block, not even the first one. It's mapped as:

2aaaabcf2000-2aaaabdf4000 r--s 00000000 00:03 4026534917                 /proc/xen/privcmd

(gdb) p p2m
$1 = (xen_pfn_t *) 0x2aaaabcf2000

But it's not accessible at all; accessing p2m[0] in the code raises SIGBUS:

Program received signal SIGBUS, Bus error.
[Switching to Thread 0x40bc7940 (LWP 30221)]
0x00002ae6734f6f95 in xc_domain_resume_any (xc_handle=7, domid=13) at xc_resume.c:325
325         fprintf(fp, "P2M[0]: %"PRIu32"\n",  ((uint32_t *)p2m)[0]);
(gdb)

...
Also, when I try to copy it in GDB, it returns the following:

(gdb) set $a=malloc(p2m_size * guest_width)
(gdb) p memcpy($a, p2m, p2m_size)
Program received signal SIGBUS, Bus error.
0x0000003852e7bf0b in memcpy () from /lib64/libc.so.6
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (memcpy) will be abandoned.
(gdb)

SIGBUS again, inside memcpy. Does anybody have an idea what's going on? Is it possible that the hypervisor doesn't allocate the memory correctly but returns a code as if everything were OK?

Chris, what do you think about that?

Comment 7 Chris Lalancette 2009-09-23 11:51:58 UTC
(In reply to comment #6)
> After the p2m allocation in xc_resume.c there is a 0x2000 block of memory
> that was allocated via /proc/xen/privcmd [...] but I was unable to access
> any element of this block [...] accessing p2m[0] in the code raises SIGBUS

Ah, I'm starting to remember.  You can't actually read the memory area that you've mmapped from privcmd.  There's this code in the kernel:

unsigned long privcmd_nopfn(struct vm_area_struct *vma,
				unsigned long address)
{
	return NOPFN_SIGBUS;
}

static struct vm_operations_struct privcmd_vm_ops = {
	.nopfn = privcmd_nopfn
};

Which basically means that as soon as you try to instantiate a real piece of memory behind this virtual address, you'll get a SIGBUS.

So the real problem is that you shouldn't be accessing the area mapped to /proc/xen/privcmd at all.

The questions to answer are:
1)  Why is it trying to read this area at all?
2)  Why don't you have the same problem when failing a 64-on-64 guest?

Chris Lalancette

Comment 8 Michal Novotny 2009-09-23 12:49:37 UTC
(In reply to comment #7)
> So the real problem is that you shouldn't be accessing the area mapped to
> /proc/xen/privcmd at all.
> 
> The questions to answer are:
> 1)  Why is it trying to read this area at all?
> 2)  Why don't you have the same problem when failing a 64-on-64 guest?

The thing is that I need to map and access the p2m structure (xen_pfn_t *). When I access it on a 64-on-64 machine, I get a non-NULL p2m structure, so I can access everything fine and set the store_mfn and console_mfn fields from p2m[GET_FIELD(&ctxt, store_mfn)] etc. This works fine, but for 32-on-64 it doesn't, and accessing p2m[{anything}] raises SIGBUS.

I need to access the p2m[] structure to get the current store_mfn and console_mfn. If I omit this, SSH works, but the {store|console}_mfn values are not set at all, so if I understand the values correctly, there would be a problem with the console and xenstore connections. So the p2m is mapped correctly for 64-on-64 but not for 32-on-64... Couldn't this be an HV issue, since I can't access the memory at all for 32-on-64 but it's accessible for 64-on-64?

Thanks,
Michal

Comment 9 Chris Lalancette 2009-09-23 13:22:44 UTC
(In reply to comment #8)
> I need to access the p2m[] structure to get the current store_mfn and
> console_mfn. [...] Couldn't this be an HV issue, since I can't access the
> memory at all for 32-on-64 but it's accessible for 64-on-64?

I'm not 100% sure, but I don't think so.  You can still access the p2m table; you just can't touch this portion of memory, and that's by design.  I could be wrong here, but what if you just hard-code to skip that particular region, and restore all of the rest of the memory?  Does it work then?  If so, at least we can concentrate on why we are accessing that particular region of memory.

Chris Lalancette

Comment 10 Michal Novotny 2009-09-23 13:28:29 UTC
(In reply to comment #9)
> I could be wrong here, but what if you just hard-code to skip that
> particular region, and restore all of the rest of the memory?  Does it
> work then?

Well, this is the problem. The p2m table is mapped using xc_map_foreign_range(), which issues the PRIVCMD_MMAP hypercall, and for 32-on-64 I cannot access any element of the p2m table, since any access raises SIGBUS immediately. If I skip the p2m mapping and all the p2m handling, it works fine, but then no store_mfn/console_mfn is set (because setting them requires p2m table access, which always raises SIGBUS in this case).

Michal

Comment 11 Michal Novotny 2009-09-23 13:40:51 UTC
(In reply to comment #10)
> If I skip the p2m mapping and all the p2m handling, it works fine, but
> then no store_mfn/console_mfn is set (because setting them requires p2m
> table access, which always raises SIGBUS in this case).

Oh, sorry, I misspoke. It's not calling xc_map_foreign_range() but xc_map_foreign_batch(). I added some logging of results to this mapping function... The function didn't fail, but even inside it, when I try to fprintf() the current value as uint32_t or uint64_t, it doesn't work at all, not even as uint32_t. It seems like the mapping is not set up correctly. xc_map_foreign_batch() issues the PRIVCMD_MMAPBATCH hypercall, which returns a good value (i.e. not < 0), but I can't access even the first element (no matter whether I cast it as a 32-bit or a 64-bit number). I think the hypercall fails on something but doesn't report the error, which is why the SIGBUS signal is received.
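
A sketch of how the batch mapping's per-page failures can be surfaced, assuming the xen 3.x behavior that PRIVCMD_MMAPBATCH reports failed frames by tagging entries in the MFN array it was passed rather than failing the ioctl (the 0xF0000000 tag below is an assumption; check the privcmd driver and XEN_DOMCTL_PFINFO_XTAB in your tree). A "successful" return can therefore still contain pages that raise SIGBUS on first touch, which matches what we see here:

#include <stdio.h>
#include <sys/mman.h>
#include <xenctrl.h>

/* p2m_frame_list holds the MFNs to map; on return, entries that failed
 * to map are assumed to be tagged in place by the privcmd driver. */
static xen_pfn_t *map_p2m_checked(int xc_handle, uint32_t domid,
                                  xen_pfn_t *p2m_frame_list, int nr_frames)
{
    int i;
    xen_pfn_t *p2m = xc_map_foreign_batch(xc_handle, domid, PROT_READ,
                                          p2m_frame_list, nr_frames);
    if (p2m == NULL)
        return NULL;

    for (i = 0; i < nr_frames; i++) {
        /* Assumed error tag; frames that failed are unusable and will
         * SIGBUS when the corresponding page of p2m is touched. */
        if ((p2m_frame_list[i] & 0xF0000000UL) == 0xF0000000UL)
            fprintf(stderr, "p2m frame %d failed to map\n", i);
    }
    return p2m;
}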

Comment 12 Chris Lalancette 2009-09-24 15:09:45 UTC
Quick update; we do need to do the store_mfn and console.domU.mfn restoration because of some wacky stuff the domU kernel does during save.  It basically stores the *pfn* in the mfn field during suspend, and then expects that to be back to the *mfn* after a restore.  So we do have to store it back there.  There's something else going on preventing that from really working, though, which we will have to track down.
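
For context, a condensed version of the fixup in question, as it appears in upstream xen 3.x xc_resume.c (the 32-on-64 patch additionally has to route these accesses through width-aware GET_FIELD/SET_FIELD style macros, since a 32-bit guest's start_info and vcpu context use a different layout than the 64-bit tools):

#include <sys/mman.h>
#include <xenctrl.h>

static int fixup_suspend_record(int xc_handle, uint32_t domid, xen_pfn_t *p2m)
{
    vcpu_guest_context_t ctxt;
    start_info_t *start_info;

    if (xc_vcpu_getcontext(xc_handle, domid, 0, &ctxt) != 0)
        return -1;

    /* On suspend the guest leaves the MFN of its start_info page in EDX. */
    start_info = xc_map_foreign_range(xc_handle, domid, PAGE_SIZE,
                                      PROT_READ | PROT_WRITE,
                                      ctxt.user_regs.edx);
    if (start_info == NULL)
        return -1;

    /* The domU kernel stored *PFNs* in these fields during suspend; a
     * resume after a failed save must translate them back into MFNs. */
    start_info->store_mfn        = p2m[start_info->store_mfn];
    start_info->console.domU.mfn = p2m[start_info->console.domU.mfn];
    munmap(start_info, PAGE_SIZE);
    return 0;
}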

Chris Lalancette

Comment 13 Chris Lalancette 2009-09-25 12:03:34 UTC
Created attachment 362663 [details]
Resume properly after 32-on-64 failure.

As I expected, the problem comes down to the fact that xc_resume_any() didn't know about 32-on-64 support.  Attached is an extremely rough patch that should make it smart enough to do that.  I got it to do what I consider the right thing, but unfortunately the guest domain now crashes after resume (instead of just being stuck in S state).  It's unclear at the moment whether that is a bug in the domU guest kernel, or a bug in the patch.

In any case, I can't spend any more time on this.  Michal, you'll have to clean up this patch and see if you can get it to work completely.

Chris Lalancette

Comment 14 Michal Novotny 2009-09-29 10:31:19 UTC
Hi Chris,
I will work on this one, thanks for the rough version of this patch.

Michal

Comment 15 Michal Novotny 2009-09-29 10:59:23 UTC
Well, I applied your patch to the -95 version of the xen package and tried a 32-bit RHEL 5.3 guest on a RHEL 5.4 dom0, and it was working perfectly. I did:

1. xm create (32-on-64-)guest
2. xm console guest
3. xm save guest /mnt/small
4. xm console guest

Everything worked perfectly, but a 32-bit RHEL 5.4 domU showed some error messages:
...
Starting sendmail: netfront: rx->offset: 12, size: 4294967295
netfront: rx->offset: 12, size: 4294967295
netfront: rx->offset: 12, size: 4294967295
netfront: rx->offset: 12, size: 4294967295
netfront: rx->offset: 12, size: 4294967295
netfront: rx->offset: 12, size: 4294967295
netfront: rx->offset: 12, size: 4294967295
netfront: rx->offset: 12, size: 4294967295
netfront: rx->offset: 12, size: 4294967295
netfront: rx->offset: 12, size: 4294967295
printk: 16 messages suppressed.
netfront: rx->offset: 12, size: 4294967295
printk: 80 messages suppressed.
netfront: rx->offset: 12, size: 4294967295

Since this happens only with a RHEL 5.4 domU and not with a RHEL 5.3 domU, couldn't this be a kernel-xen issue? I am not familiar with those netfront messages, but the dom0 was unchanged between the two tries (the triage with RHEL 5.3 and RHEL 5.4).

Michal

Comment 17 Chris Lalancette 2009-10-05 07:55:19 UTC
(In reply to comment #15)
> Everything worked perfectly, but a 32-bit RHEL 5.4 domU showed some error
> messages:
> [...]
> netfront: rx->offset: 12, size: 4294967295
> [...]
> Since this happens only with a RHEL 5.4 domU and not with a RHEL 5.3 domU,
> couldn't this be a kernel-xen issue?

It's possible, but not certain.  The problem is that we are diddling around with the domain memory from userspace, so it's not entirely clear where the problem is coming from.  This particular error comes about because the dom0 is putting error statuses on the shared ring (a size of 4294967295 is -1 interpreted as an unsigned 32-bit value) and the guest is reporting them.  So it's possible it's:

1)  A userspace bug, because we touched some memory we shouldn't have.
2)  A domU bug, because something in the ring structures wasn't re-initialized properly.
3)  A dom0 bug, since it didn't re-initialize the backend of the rings properly.

So, some things to look at:
1)  Does networking work at all when this is happening?  I suspect the answer is no, but if it eventually starts working then that would be interesting to know.

2)  Does this happen with a 64-bit domU before this patch is applied?  Does it happen with a 64-bit domU after this patch is applied?

3)  Take a look at the output from xenstore-ls, and see if you can see anything about either the frontend or the backend not being in the right state (i.e. XenbusInitialised, XenbusClosing, etc).

4)  Read the code-paths for the "failure" case, and see if we are forgetting to somehow re-initialize the netback portion of the rings.

Chris Lalancette

Comment 18 Michal Novotny 2009-10-05 08:04:58 UTC
(In reply to comment #17)
> [...]
> So it's possible it's:
>
> 1)  A userspace bug, because we touched some memory we shouldn't have.


Well, I have seen this netfront rx error in the past already. I don't know which BZ it was connected to, but I think it was about a NIC device.


> 2)  A domU bug, because something in the ring structures wasn't re-initialized
> properly.

Well, this is my guess.


> 3)  A dom0 bug, since it didn't re-initialize the backend of the rings
> properly.


You're right, this could be the issue as well: the backend device not being connected/initialized the right way... I'll try to check this with xenstore-ls...


> 
> So, some things to look at:
> 1)  Does networking work at all when this is happening?  I suspect the answer
> is no, but if it eventually starts working then that would be interesting to
> know.
> 

Oh, I need to test this one again. I don't know offhand, but if I remember correctly it doesn't. I need to double-check this one...


> 2)  Does this happen with a 64-bit domU before this patch is applied?  Does it
> happen with a 64-bit domU after this patch is applied?
> 

I never ran into such issues with 64-on-64 at all. But I'll test this with the patch applied, good point.


> 3)  Take a look at the output from xenstore-ls, and see if you can see anything
> about the state of the either the frontend or backend not being in the right
> state (i.e. XenbusInitialized, XenbusClosing, etc).

Good point, thanks.


> 
> 4)  Read the code-paths for the "failure" case, and see if we are forgetting to
> somehow re-initialize the netback portion of the rings.
> 

Code-paths? What do you mean by that? I am a little confused by this one...

Michal


Comment 19 Michal Novotny 2009-10-05 10:39:31 UTC
Created attachment 363651 [details]
Updated fix for 32-on-64 failed save

Well, this is the updated version of the patch. Chris's fix went in the right direction, but some more work was needed for the vif devices. Network devices were not properly connected after resume, so this version reconnects each device by setting its state to InitWait and then Connected, so that xenbus intercepts the change. With this patch applied, both the 32-on-64 and the 64-on-64 save-failure cases have been tested and work fine.
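
For illustration, a minimal sketch of the reconnect mechanics, written against libxenstore in C (the actual patch does this from xend's Python code, so treat the function and path handling here as hypothetical; only the xs_* calls and the numeric XenBus state values are standard):

#include <stdio.h>
#include <xs.h>

/* Flip a vif backend's xenstore state node to InitWait (2) and then to
 * Connected (4) so that xenbus observes a state transition and re-runs
 * the connect handshake for the device. */
static int reconnect_vif(const char *backend_path)
{
    char state_path[256];
    struct xs_handle *xsh = xs_daemon_open();

    if (xsh == NULL)
        return -1;
    snprintf(state_path, sizeof(state_path), "%s/state", backend_path);
    xs_write(xsh, XBT_NULL, state_path, "2", 1);  /* XenbusStateInitWait  */
    xs_write(xsh, XBT_NULL, state_path, "4", 1);  /* XenbusStateConnected */
    xs_daemon_close(xsh);
    return 0;
}

Called as, e.g., reconnect_vif("/local/domain/0/backend/vif/5/0"), matching the backend paths visible in the xend.log excerpt in comment 1.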

Michal

Comment 20 Michal Novotny 2009-10-05 14:45:55 UTC
Created attachment 363682 [details]
New version of this patch

Hi, this is the updated version of the patch, with some cleanup done; it also now looks up vif devices by their backend paths from the /vm/{uuid}/device tree in xenstore...

Comment 21 Michal Novotny 2009-10-06 04:38:20 UTC
Created attachment 363772 [details]
Updated fix for 32-on-64 failed save

This is the new and cleaned version of my patch

Comment 22 Michal Novotny 2009-10-20 16:47:39 UTC
Well, I studied the netback sources and found some evidence of vif devices being reinitialized when the frontend is in state XenbusStateInitialising (1) and the backend is in state XenbusStateClosed (6), but when I tried to apply this to my patch it was not working (the full XenBus state list is sketched below). Does anybody have an idea what to do about it now? Chris?
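
For reference, the XenBus states involved, as defined in Xen's public header xen/include/public/io/xenbus.h:

enum xenbus_state {
    XenbusStateUnknown      = 0,
    XenbusStateInitialising = 1,
    XenbusStateInitWait     = 2,  /* waiting for the peer's details */
    XenbusStateInitialised  = 3,
    XenbusStateConnected    = 4,
    XenbusStateClosing      = 5,
    XenbusStateClosed       = 6
};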

Thanks,
Michal

Comment 25 Jiri Denemark 2009-11-16 09:02:59 UTC
Sure. What about filing a bug against kernel-xen and describing your observations?

Comment 26 Michal Novotny 2009-11-16 11:42:31 UTC
(In reply to comment #25)
> Sure. What about filing a bug against kernel-xen and describing your
> observations?  

Good point, but I'd like to be sure about that; I still have a few doubts. But since it was working with neither the -80 version nor the -96 version, it *may* be a good idea to think of it as a kernel-xen one...

I'll try that.
Thanks,
Michal

Comment 33 Paolo Bonzini 2010-06-23 15:52:08 UTC
*** Bug 459728 has been marked as a duplicate of this bug. ***

Comment 35 Linqing Lu 2010-08-25 09:46:50 UTC
This bug has been verified on xen-3.0.3-115.el5.
The HVM guest runs well after 'xm save' fails.

However, the guest name gets a "Zombie-" prefix after it is shut down, instead of disappearing from "xm list", which is the same situation as in bug 589123.

Comment 36 Michal Novotny 2010-12-20 10:56:43 UTC
Linqing,
bug 589123 is a kernel-xen bug, and until that bug is fixed it will leave zombies. There's nothing more we can do about this in the user-space stack.

Michal

Comment 38 errata-xmlrpc 2011-01-13 22:18:05 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0031.html