Bug 504910 - Command "virsh save <domain> <file>" hang when trying to save a paused domain on xen
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: xen
Version: 5.4
Hardware: All Linux
Priority: medium Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Michal Novotny
QA Contact: Virtualization Bugs
Depends On:
Blocks: 5.4/TechnicalNotes
Reported: 2009-06-09 22:13 EDT by Edward Wang
Modified: 2014-02-02 17:37 EST
CC List: 11 users

See Also:
Fixed In Version: xen-3.0.3-95.el5
Doc Type: Bug Fix
Doc Text:
Save operations should not be attempted on paused Xen domains. This will cause Xend to hang.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-03-30 04:58:35 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
Xm save hang on saving paused domain fix (841 bytes, patch)
2009-06-22 10:13 EDT, Michal Novotny
Error message when shutting down/saving paused domain (911 bytes, patch)
2009-07-14 04:00 EDT, Michal Novotny

Description Edward Wang 2009-06-09 22:13:21 EDT
Description of problem:
I defined a domain on Xen, started it, and then paused it with the command "virsh suspend <domain>". After that, I tried to save the domain with "virsh save <domain> <file>" and found that the save process hangs indefinitely without any response.

Version-Release number of selected component (if applicable):
- libvirt: 0.6.3-6.el5

How reproducible:
100%, every time.

Steps to Reproduce:
1, virsh define rhel5u3.xml 
2, virsh start rhel5u3
3, virsh suspend rhel5u3
4, virsh save rhel5u3 /tmp/rhel5u3.save
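
The steps above can be scripted so that a hang shows up as a timeout rather than a stuck terminal. This is only a sketch: the helper names `repro_commands` and `run_steps` and the 60-second timeout are assumptions, not part of the report, and it needs Python 3 with `virsh` on the PATH.

```python
import subprocess


def repro_commands(domain, xml_path, save_path):
    """Return the four virsh invocations from the steps above, in order."""
    return [
        ["virsh", "define", xml_path],
        ["virsh", "start", domain],
        ["virsh", "suspend", domain],
        ["virsh", "save", domain, save_path],
    ]


def run_steps(commands, timeout_s=60):
    """Run each command in turn.

    Returns the first command that exceeds timeout_s seconds, or None if
    every command finishes in time. A command that exits with an error
    raises subprocess.CalledProcessError.
    """
    for argv in commands:
        try:
            subprocess.run(argv, check=True, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            return argv
    return None
```

Calling `run_steps(repro_commands("rhel5u3", "rhel5u3.xml", "/tmp/rhel5u3.save"))` on an affected host would be expected to return the `virsh save` command. The timeout only stops the virsh client; it does not by itself recover xend.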
  
Actual results:
In step 4, the save process hangs and never responds.
1) Running "virsh list --all" shows two domains with the SAME name: one is "paused", the other is "shut off". See below:

[root@dhcp-66-70-66 tcs]# virsh list --all
 Id Name                 State
----------------------------------
  0 Domain-0             running
123 rhel5u3              paused
  - rhel5u3              shut off

2) In virt-manager, the status of the domain (rhel5u3) keeps switching between "Paused" and "Shutoff".
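
The duplicate entry in the listing under 1) can be detected mechanically. The sketch below parses text in the layout shown there and reports any domain name that appears more than once; the function names are made up for illustration, and the parser assumes the three-column Id/Name/State layout of that listing.

```python
from collections import defaultdict

# Table copied from the "virsh list --all" output in this report.
SAMPLE = """\
 Id Name                 State
----------------------------------
  0 Domain-0             running
123 rhel5u3              paused
  - rhel5u3              shut off
"""


def parse_virsh_list(output):
    """Parse `virsh list --all` text into (id, name, state) tuples."""
    rows = []
    for line in output.splitlines():
        parts = line.split(None, 2)
        # Data rows have three fields and start with a numeric Id or "-";
        # this skips the header, the dashed separator and blank lines.
        if len(parts) == 3 and (parts[0].isdigit() or parts[0] == "-"):
            rows.append((parts[0], parts[1], parts[2].strip()))
    return rows


def duplicate_names(rows):
    """Map each domain name listed more than once to its list of states."""
    states = defaultdict(list)
    for _dom_id, name, state in rows:
        states[name].append(state)
    return {name: found for name, found in states.items() if len(found) > 1}
```

For the listing in this report, `duplicate_names(parse_virsh_list(SAMPLE))` yields one entry for rhel5u3 with the states "paused" and "shut off".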

Expected results:
The save process for a paused domain should succeed on xen.

Additional info:
1, system information (uname -a)
Linux dhcp-66-70-66.nay.redhat.com 2.6.18-151.el5xen #1 SMP Wed May 27 16:33:19 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
2, rhel version (cat /etc/redhat-release)
Red Hat Enterprise Linux Server release 5.4 Beta (Tikanga)
Comment 3 Daniel Berrange 2009-06-11 12:34:44 EDT
Can you try the same steps, but use 'xm save' instead of 'virsh save'? This will determine whether the bug is in libvirt or XenD.
Comment 4 Edward Wang 2009-06-11 21:48:06 EDT
Daniel,

I've tried 'xm save' (the other steps are the same as above), and the result is the same:
1, The 'xm save' process hangs
2, Two domains with the SAME name appear in the output of "virsh list --all": one is 'paused', the other is 'shut off'

If you need any other information, just let me know.

Thanks a lot
Comment 5 Bill Burns 2009-06-15 16:00:10 EDT
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
Save operations should not be attempted on paused Xen domains. This will cause Xend to hang.
Comment 6 Michal Novotny 2009-06-22 09:52:15 EDT
I am taking this one, patch will be coming soon...
Comment 7 Michal Novotny 2009-06-22 10:13:38 EDT
Created attachment 348908 [details]
Xm save hang on saving paused domain fix

This is the patch/workaround for this bug. The problem is that xend waits for the domain to shut down when doing a save, but that never happens because a domain cannot shut down while it is paused. The domain therefore has to be unpaused before the save: my patch checks the domain state and, if the domain is paused, unpauses it to allow the shutdown. After that, everything works fine and the domain is saved.
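
As a rough illustration of that logic (not the attached patch, and not xend's real code), the toy model below shows why the wait never finishes for a paused domain and how unpausing first lets it complete. `ToyDomain`, `save`, and the capped poll count are all hypothetical.

```python
class ToyDomain:
    """Hypothetical stand-in for a Xen guest; not xend's real domain class."""

    def __init__(self, paused):
        self.paused = paused
        self.shut_down = False

    def request_shutdown(self):
        # The guest itself carries out the shutdown, so a paused guest
        # never reacts to the request.
        if not self.paused:
            self.shut_down = True


def save(dom, unpause_first, max_polls=3):
    """Return True if the save completes, False if it would have hung."""
    if unpause_first and dom.paused:
        dom.paused = False  # the step the workaround adds
    dom.request_shutdown()
    # Per the description above, xend waits until the shutdown happens;
    # the poll count is capped here only so the toy model terminates.
    for _ in range(max_polls):
        if dom.shut_down:
            return True
    return False
```

With `unpause_first=False` a paused `ToyDomain` never reaches the shut-down state, which mirrors the hang; with `unpause_first=True` the save completes, at the cost of letting the guest run briefly.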

I've tried it about 10 times and xend never hung.
Michal
Comment 8 Paolo Bonzini 2009-07-09 09:06:34 EDT
I'm not sure whether this is a good fix, unfortunately.  The machine will run again for a very short while, but it was presumably paused for a reason... :-(
Comment 9 Chris Lalancette 2009-07-09 10:27:24 EDT
Yeah, we discussed this on the mailing list, I should have updated it here.

You are right, it's a bad idea to unpause the domain behind the administrator's back.  Basically, we should just fail with a message saying "Can't save since the domain is paused; unpause it first if you want to save", and just abort.

Chris Lalancette
Comment 10 Michal Novotny 2009-07-13 16:08:33 EDT
Yeah, we could emit an error message from XenD itself, and the current behavior does make perfect sense: when a domain is paused, it can't start shutting down, because a shutdown is just a shutdown request issued to the domain itself. Should I create a patch that just reports an error when trying to shut down a paused domain?

Well, it's paused for a reason, but what would an administrator do if he really wants to shut the domain down? He unpauses it and then issues the shutdown, so an error could be annoying: the domain has to be unpaused just to issue one last command, the shutdown. My patch does that one step for the administrator instead of telling him to unpause the domain first. If I imagine myself as a system administrator wanting to shut a domain down, I'd prefer xend to do this for me rather than tell me to unpause it manually before the shutdown is allowed.

Maybe, this should be configurable in xend-config.sxp file... something like:
(allow-domain-unpause-before-shutdown yes)

or something similar... Administrators could then choose whether to display only a message (when set to "no") or to unpause the domain automatically when a shutdown command is issued on a paused domain.
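
A sketch of how such a yes/no entry could be read, purely to make the proposal concrete. The option name is the one suggested above and is hypothetical; `read_bool_option` is an invented helper, not xend's configuration parser, and it only handles single-line `(name value)` entries with `#` comments.

```python
import re

# Option name proposed in this comment; hypothetical, not an existing setting.
OPTION = "allow-domain-unpause-before-shutdown"


def read_bool_option(config_text, name, default=False):
    """Return True/False for a single-line `(name yes|no)` entry."""
    pattern = re.compile(r"\(\s*%s\s+(\S+?)\s*\)$" % re.escape(name))
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # commented-out entry
        match = pattern.match(line)
        if match:
            return match.group(1).lower() == "yes"
    return default
```

Defaulting to False when the entry is missing keeps the conservative behavior (report an error, never unpause) unless the administrator opts in.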

What do you think about this suggestion ?
Comment 11 Paolo Bonzini 2009-07-14 02:54:43 EDT
> Well, it's paused for the reason but what administrator would do if he really
> wants to shut the domain down?

For example, kvm will save the machine and then restore it as paused (actually it's buggy, but I have that bug).  The key point here is that the domain is not shutdown, it is destroyed when you save (unless I'm mistaken).  So in theory it should be possible to make it work.

I agree with Chris that for Xen an error message is the best course of action.  Even just printing a message and hanging (the administrator can then unpause the VM from another terminal or after ^Z) does not seem too bad, though I'm not sure about it.
Comment 12 Michal Novotny 2009-07-14 03:21:37 EDT
(In reply to comment #11)
> > Well, it's paused for the reason but what administrator would do if he really
> > wants to shut the domain down?
> 
> For example, kvm will save the machine and then restore it as paused (actually
> it's buggy, but I have that bug).  The key point here is that the domain is not
> shutdown, it is destroyed when you save (unless I'm mistaken).  So in theory it
> should be possible to make it work.
> 
> I agree with Chris that for Xen an error message is the best course of action. 
> Even just printing a message and hanging (the administrator can then unpause
> the VM from another terminal or after ^Z) does not seem too bad, though I'm not
> sure about it.  

I see your point. I don't think it's a bad idea, but if I wanted to shut the domain down, I'd prefer it to be configurable: either show an error message or do the unpause for me.
Comment 13 Chris Lalancette 2009-07-14 03:25:26 EDT
(In reply to comment #12)
> (In reply to comment #11)
> > > Well, it's paused for the reason but what administrator would do if he really
> > > wants to shut the domain down?
> > 
> > For example, kvm will save the machine and then restore it as paused (actually
> > it's buggy, but I have that bug).  The key point here is that the domain is not
> > shutdown, it is destroyed when you save (unless I'm mistaken).  So in theory it
> > should be possible to make it work.
> > 
> > I agree with Chris that for Xen an error message is the best course of action. 
> > Even just printing a message and hanging (the administrator can then unpause
> > the VM from another terminal or after ^Z) does not seem too bad, though I'm not
> > sure about it.  
> 
> I see your point. I am not thinking it's bad but if I would like to shut it
> down, I'd prefer it configurable to either show error message or do the unpause
> for me.  

You don't want to do things behind the administrator's back.  While we could add an option to do it for the administrator, the fact of the matter is that very few people use this functionality, so it's complete overkill.  Just having a message that says "unpause before you save/shutdown" is safe, easy, and straightforward.

Chris Lalancette
Comment 14 Michal Novotny 2009-07-14 03:35:29 EDT
(In reply to comment #13)
> (In reply to comment #12)
> > (In reply to comment #11)
> > > > Well, it's paused for the reason but what administrator would do if he really
> > > > wants to shut the domain down?
> > > 
> > > For example, kvm will save the machine and then restore it as paused (actually
> > > it's buggy, but I have that bug).  The key point here is that the domain is not
> > > shutdown, it is destroyed when you save (unless I'm mistaken).  So in theory it
> > > should be possible to make it work.
> > > 
> > > I agree with Chris that for Xen an error message is the best course of action. 
> > > Even just printing a message and hanging (the administrator can then unpause
> > > the VM from another terminal or after ^Z) does not seem too bad, though I'm not
> > > sure about it.  
> > 
> > I see your point. I am not thinking it's bad but if I would like to shut it
> > down, I'd prefer it configurable to either show error message or do the unpause
> > for me.  
> 
> You don't want to do things behind the administrators back.  While we could add
> an option to do it for the administrator, the fact of the matter is that very
> few people use this functionality, so it's complete overkill.  Just having a
> message that says "unpause before you save/shutdown" is safe, easy, and
> straightforward.
> 
> Chris Lalancette  

Of course it depends on how often this is used. I can just display a message; that's correct and easy. The idea was only to let the administrator choose the behavior in the config file. But since it's not a heavily used feature, that would be overkill, so displaying the message is good.

Michal
Comment 15 Michal Novotny 2009-07-14 04:00:26 EDT
Created attachment 351557 [details]
Error message when shutting down/saving paused domain

This patch just displays an error message when a shutdown is issued on a paused domain; the shutdown is called when saving too. A domain can't shut down while it's paused, and rather than unpausing it behind the administrator's back (as the first version of this patch did), this version just shows the error message.
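
A minimal sketch of that fail-fast behavior, for illustration only; it is not the attached patch. `PausedDomainError`, `check_not_paused`, `shutdown`, and `save` are hypothetical names, the domain state is modelled as a plain string, and the message text is illustrative.

```python
class PausedDomainError(Exception):
    """Hypothetical error type; xend's real exception class is not shown here."""


def check_not_paused(state):
    """Refuse to continue when the domain is paused."""
    if state == "paused":
        raise PausedDomainError(
            "Can't shutdown/save the domain since the domain is paused; "
            "unpause it first"
        )


def shutdown(state):
    # Fail immediately instead of waiting for a shutdown that cannot happen.
    check_not_paused(state)
    return "shut off"


def save(state, path):
    # Saving goes through shutdown, so it inherits the same check.
    shutdown(state)
    return path
```

Because the check sits in the shutdown path, both operations report the error without a separate test in the save code, and the domain stays paused.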
Comment 20 Jiri Denemark 2009-09-22 05:32:05 EDT
Fix built into xen-3.0.3-95.el5
Comment 22 Yewei Shao 2009-12-25 04:18:08 EST
When I try to save a paused domain on Xen, it prints a message like: "Error: Can't shutdown/save the domain since the domain is paused; unpaused it first if you want to shutdown/save". Based on comment #15, this bug is verified in xen-3.0.3-102.el5.
Comment 24 errata-xmlrpc 2010-03-30 04:58:35 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0294.html
Comment 25 Paolo Bonzini 2010-04-08 11:49:55 EDT
This bug was closed during 5.5 development and it's being removed from the internal tracking bugs (which are now for 5.6).
