Bug 567931

Summary: virt-top exits sometimes when the window is resized
Product: [Fedora] Fedora
Reporter: Richard W.M. Jones <rjones>
Component: libvirt
Assignee: Richard W.M. Jones <rjones>
Status: CLOSED UPSTREAM
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: rawhide
CC: berrange, clalance, crobinso, fedora-ocaml-list, itamar, jforbes, rjones, veillard, virt-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2010-02-24 12:33:09 EST
Bug Blocks: 568172
Attachments:
strace from Rawhide libvirt / Rawhide virt-top (flags: none)

Description Richard W.M. Jones 2010-02-24 06:25:23 EST
Description of problem:

virt-top exits sometimes (clean exit with code 1, not a segfault)
when the window is resized.

Version-Release number of selected component (if applicable):

virt-top-1.0.4-3.fc13.x86_64
Also observed with the Debian virt-top package.

How reproducible:

Sometimes.

Steps to Reproduce:
1. sudo virt-top -d 0.1 --debug /tmp/debug
2. Resize the window aggressively.
Actual results:

Occasionally virt-top exits.

Expected results:

Should not exit.

Additional info:

Original report: http://rwmj.wordpress.com/2010/02/23/virt-top-is-in-debian/#comment-1256
Comment 1 Richard W.M. Jones 2010-02-24 06:45:55 EST
This bug is quite hard to reproduce anyway, but it seems like it
doesn't happen at all when virt-top is run under gdb.  Possibly
gdb alters the way that signals are delivered.
Comment 2 Richard W.M. Jones 2010-02-24 06:50:03 EST
I think this is actually a libvirt bug.  The strace output
when it exits is:

18863 rt_sigaction(SIGTSTP, {0x3e77e192f0, [], SA_RESTORER|SA_RESTART, 0x3e75a337d0}, NULL, 8) = 0
18863 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3661, ...}) = 0
18863 poll([{fd=3, events=POLLOUT}, {fd=4, events=POLLIN}], 2, -1) = 1 ([{fd=3, revents=POLLOUT}])
18863 sendto(3, "\0\0\0\34 \0\200\206\0\0\0\1\0\0\0003\0\0\0\0\0\0\0r\0\0\0\0", 28, 0, NULL, 0) = 28
18863 poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}], 2, -1) = 1 ([{fd=3, revents=POLLIN}])
18863 recvfrom(3, "\0\0\1h", 4, 0, NULL, NULL) = 4
18863 recvfrom(3, " \0\200\206\0\0\0\1\0\0\0\25\0\0\0\1\0\0\0q\0\0\0\0\0\0\0\21\0\0\0\21RHEL6200910210x32\0\0\0\0\0\0\vTmpBZ552994\0\0\0\0\21RHEL6201002033x64\0\0\0\0\0\0\nDebian5x64\0\0\0\0\0\fUbuntu910x64\0\0\0\16RHEL6Alpha3x64\0\0\0\0\0\rRHEL54Betax64\0\0\0\0\0\0\6F10x32\0\0\0\0\0\nCentOS5x32\0\0\0\0\0\rF13Rawhidex64\0\0\0\0\0\0\16TmpDebFirewall\0\0\0\0\0\rF12x64preview\0\0\0\0\0\0\nWin2003x32\0\0\0\0\0\7VSphere\0\0\0\0\vWindows7x32\0\0\0\0\vWindows7x64\0\0\0\0\vFreeBSD8x64\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 356, 0, NULL, NULL) = 356
18863 write(2, "libvir: Remote error : no call waiting for reply with serial 113\n", 65) = 65
18863 write(1, "\33[17;1H\33[2J\33[?47l\0338\r\33[?1l\33>", 27) = 27
18863 ioctl(1, SNDCTL_TMR_STOP or TCSETSW, {B9600 opost isig icanon echo ...}) = 0
18863 write(2, "libvirt: VIR_ERR_RPC: VIR_FROM_REMOTE: no call waiting for reply with serial 113\n", 81) = 81
18863 exit_group(1)                     = ?

I wonder if this is a regression (see bug 484414).

My libvirt version is:
libvirt-0.7.5-3.fc13.x86_64
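
The mismatch behind the error can be seen by decoding the message
headers in the strace excerpt above. What follows is a minimal sketch,
assuming the libvirt remote protocol frames each message as a 4-byte
big-endian length word followed by six 32-bit big-endian header fields
(program, version, procedure, type, serial, status). The field names
and the decode_header helper are illustrative and not taken from this
report; the byte strings are the sendto and recvfrom payloads from the
excerpt, transcribed to hex.

```python
import struct

# Assumed header layout: six unsigned 32-bit big-endian fields.
HEADER = struct.Struct(">6I")
FIELDS = ("program", "version", "procedure", "type", "serial", "status")


def decode_header(raw: bytes) -> dict:
    """Decode the first 24 bytes of a message body into named fields."""
    return dict(zip(FIELDS, HEADER.unpack(raw[:HEADER.size])))


# Payload of the 28-byte sendto() call: length word plus header.
sent_frame = bytes.fromhex(
    "0000001c"  # length word
    "20008086" "00000001" "00000033"
    "00000000" "00000072" "00000000"
)

# First 24 bytes of the 356-byte recvfrom() payload. The length word
# ("\0\0\1h") was already consumed by the preceding 4-byte recvfrom().
reply_head = bytes.fromhex(
    "20008086" "00000001" "00000015"
    "00000001" "00000071" "00000000"
)

(frame_len,) = struct.unpack(">I", sent_frame[:4])
sent = decode_header(sent_frame[4:])
reply = decode_header(reply_head)

print("frame length:", frame_len)
print("sent  serial:", sent["serial"], "type:", sent["type"])
print("reply serial:", reply["serial"], "type:", reply["type"])
```

Under those assumptions, the excerpt shows a message with serial 114
going out and a message with serial 113 coming back, which is
consistent with the "no call waiting for reply with serial 113" error
that precedes the exit.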
Comment 3 Richard W.M. Jones 2010-02-24 06:52:40 EST
Also occurs with
libvirt-0.7.6-1.fc13.x86_64

The strace this time is roughly the same:

18952 recvfrom(3, "\0\0\1h", 4, 0, NULL, NULL) = 4
18952 recvfrom(3, " \0\200\206\0\0\0\1\0\0\0\25\0\0\0\1\0\0\6O\0\0\0\0\0\0\0\21\0\0\0\21RHEL6200910210x32\0\0\0\0\0\0\vTmpBZ552994\0\0\0\0\21RHEL6201002033x64\0\0\0\0\0\0\nDebian5x64\0\0\0\0\0\fUbuntu910x64\0\0\0\16RHEL6Alpha3x64\0\0\0\0\0\rRHEL54Betax64\0\0\0\0\0\0\6F10x32\0\0\0\0\0\nCentOS5x32\0\0\0\0\0\rF13Rawhidex64\0\0\0\0\0\0\16TmpDebFirewall\0\0\0\0\0\rF12x64preview\0\0\0\0\0\0\nWin2003x32\0\0\0\0\0\7VSphere\0\0\0\0\vWindows7x32\0\0\0\0\vWindows7x64\0\0\0\0\vFreeBSD8x64\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 356, 0, NULL, NULL) = 356
18952 write(2, "libvir: Remote error : no call waiting for reply with serial 1615\n", 66) = 66
18952 write(1, "\33[26;1H\33[2J\33[?47l\0338\r\33[?1l\33>", 27) = 27
18952 ioctl(1, SNDCTL_TMR_STOP or TCSETSW, {B9600 opost isig icanon echo ...}) = 0
18952 write(2, "libvirt: VIR_ERR_RPC: VIR_FROM_REMOTE: no call waiting for reply with serial 1615\n", 82) = 82
18952 exit_group(1)                     = ?
Comment 4 Richard W.M. Jones 2010-02-24 07:07:43 EST
Created attachment 396047
strace from Rawhide libvirt / Rawhide virt-top

Full strace output, requested by danpb.
Comment 5 Richard W.M. Jones 2010-02-24 08:08:57 EST
Patch posted upstream (libvirt) to fix this:

https://www.redhat.com/archives/libvir-list/2010-February/msg00824.html
Comment 6 Richard W.M. Jones 2010-02-24 12:33:09 EST
This change, which fixes the bug, has been pushed to
upstream libvirt:

http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=f4a43df52b7c84bda61863250d20135f044893da