Bug 486291

Summary: xm console does not work when xm save fails
Product: Red Hat Enterprise Linux 5 Reporter: Jiri Denemark <jdenemar>
Component: xen Assignee: Jiri Denemark <jdenemar>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: low    
Version: 5.4 CC: mshao, xen-maint, yuzhang
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: xen-3.0.3-85.el5 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-09-02 10:09:27 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 486157    
Bug Blocks: 492187    
Attachments:
Description                        Flags
Patch fixing this bug              none
Corrected patch fixing this bug    none

Description Jiri Denemark 2009-02-19 09:13:36 UTC
Description of problem:

When xm save of a PV guest fails, xm console neither accepts input nor shows any output. However, the guest remains accessible over the network: one can ssh to it, and no error is shown in the guest's logs.

Version-Release number of selected component (if applicable):

xen-3.0.3-80.el5, xen-3.0.3-64.el5

How reproducible:

always

Steps to Reproduce:
1. create a PV guest
2. xm save guest /mnt/small/guest.save # so that it fails
3. xm console
  
Actual results:

The only thing one can do with the console is disconnect with Ctrl-]. The console shows no output and accepts no input.

Expected results:

xm console should work as if no xm save command was ever issued

Additional info:

Nothing suspicious is shown in xend.log

Comment 1 Jiri Denemark 2009-04-20 14:06:11 UTC
Created attachment 340346
Patch fixing this bug

Comment 2 Jiri Denemark 2009-04-21 10:22:59 UTC
Created attachment 340500
Corrected patch fixing this bug

To fix this, a new function, xc_evtchn_status(), has been backported to libxc from current upstream. Using this function, xenconsoled checks the status of the event channel associated with the guest and reopens it if necessary.

This patch is backported from upstream staging c/s 19561, which is a modified version of the original patch with some leaks fixed.

Comment 3 Jiri Denemark 2009-04-21 10:26:26 UTC
Background of this bug: when a guest is resumed after an aborted save, the event channel associated with its console is unbound, which prevents xenconsoled from receiving any data from, or sending any data to, the guest's console.
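
For readers who want the logic of the fix spelled out, here is a minimal Python model of the check described in comments 2 and 3. It is not the patch itself (the real change is in xenconsoled and libxc, both written in C). FakeHypervisor, ConsoleDomain, ChannelStatus and ensure_console_channel are hypothetical names invented for this sketch; only the idea of querying the channel status and rebinding when it is no longer bound comes from the comments above.

```python
from dataclasses import dataclass
from enum import Enum


class ChannelStatus(Enum):
    """Simplified event channel states (illustrative subset, not libxc's)."""
    CLOSED = 0
    INTERDOMAIN = 1


class FakeHypervisor:
    """Hypothetical stand-in for the hypervisor's event channel table."""

    def __init__(self):
        self._ports = {}
        self._next_port = 1

    def bind_interdomain(self, domid, remote_port):
        # Allocate a fresh local port bound to the guest's remote port.
        port = self._next_port
        self._next_port += 1
        self._ports[port] = ChannelStatus.INTERDOMAIN
        return port

    def status(self, port):
        # Plays the role that a status query such as xc_evtchn_status() has
        # in the real fix: report whether the local port is still bound.
        return self._ports.get(port, ChannelStatus.CLOSED)

    def unbind(self, port):
        # Models the console channel being unbound after an aborted save.
        self._ports[port] = ChannelStatus.CLOSED


@dataclass
class ConsoleDomain:
    domid: int
    remote_port: int
    local_port: int = -1  # -1 means "not bound yet"


def ensure_console_channel(hv, dom, remote_port):
    """Rebind the console channel unless it is unchanged and still bound.

    Returns True if a (re)bind was performed, False if nothing was needed.
    """
    still_bound = (
        remote_port == dom.remote_port
        and dom.local_port != -1
        and hv.status(dom.local_port) is ChannelStatus.INTERDOMAIN
    )
    if still_bound:
        return False
    dom.local_port = hv.bind_interdomain(dom.domid, remote_port)
    dom.remote_port = remote_port
    return True
```

Without the status check, a daemon that only compares the remote port would see it unchanged after the aborted save and skip the rebind, which matches the silent console described in this report.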

Comment 4 Jiri Denemark 2009-05-11 13:40:42 UTC
Fix built into xen-3.0.3-85.el5

Comment 6 Yufang Zhang 2009-07-20 08:26:01 UTC
The bug cannot be reproduced in xen-3.0.3-80.
Steps tried:
    (1) start a paravirtualized guest with 512MB memory
    (2) mount a 100MB disk partition on /mnt
    (3) run
        # xm save <guest> /mnt/<guest>.save
      then save will fail with:
        Error: /usr/lib/xen/bin/xc_save 22 5 0 0 0 failed
        Usage: xm save <Domain> <CheckpointFile>

        Save a domain state to restore later.
    (4) run
        # xm list
       shows:
        domain1 5 511 1 ---s-- 11.3
       the guest remains shut down and cannot run again.

The guest remains shut down. It cannot be reached via ssh or ping after xm save fails. This closely resembles the case in bz 486157.

Repeating the steps above on a system updated to xen-3.0.3-90.el5, the guest still remains shut down when the save fails. xm list shows the guest as shut down. The guest cannot be started again and cannot be connected to.
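
As an aid to reading the xm list line quoted above, the following Python sketch decodes its state column. It assumes the usual xm list column order (name, ID, memory, VCPUs, state, time) and the usual flag letters (r, b, p, s, c, d); parse_xm_list_line is a hypothetical helper written for this note, not part of the xm tool.

```python
# Flag letters used in the State column of xm list output.
STATE_FLAGS = {
    "r": "running",
    "b": "blocked",
    "p": "paused",
    "s": "shutdown",
    "c": "crashed",
    "d": "dying",
}


def parse_xm_list_line(line):
    """Split one data row of xm list output into named fields.

    Assumes whitespace-separated columns and a domain name without spaces.
    """
    name, domid, mem, vcpus, state, cpu_time = line.split()
    return {
        "name": name,
        "id": int(domid),
        "mem_mib": int(mem),
        "vcpus": int(vcpus),
        # Keep only the flags that are set; "-" marks an unset flag.
        "states": [STATE_FLAGS[ch] for ch in state if ch != "-"],
        "cpu_time_s": float(cpu_time),
    }


row = parse_xm_list_line("domain1 5 511 1 ---s-- 11.3")
print(row["name"], row["states"])
```

For the row from step (4), the only flag set is s, so the domain is reported as shut down rather than running or blocked.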

Comment 7 Jiri Denemark 2009-07-21 12:10:51 UTC
Ah, sorry about that. You would need to reproduce this bug on an older Xen package, for example xen-3.0.3-64.el5, because -80.el5 is affected by a regression: https://bugzilla.redhat.com/show_bug.cgi?id=486157.

Also, there's no point in trying to verify this bug before 486157 has been successfully verified.

Comment 8 Yewei Shao 2009-07-28 08:09:00 UTC
Verified on xen-3.0.3-91.el5

Comment 10 errata-xmlrpc 2009-09-02 10:09:27 UTC
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1328.html