Bug 677614

Summary: QEMU should report the PID of the process that sent it signals for troubleshooting purposes
Product: Red Hat Enterprise Linux 5
Reporter: Dan Yasny <dyasny>
Component: kvm
Assignee: Gleb Natapov <gleb>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Priority: unspecified
Version: 5.6
CC: abaron, acathrow, bcao, danken, gcosta, juzhang, knoel, mkenneth, tburke, virt-maint
Target Milestone: rc
Hardware: x86_64
OS: Linux
Fixed In Version: kvm-83-231.el5
Doc Type: Bug Fix
Clone Of:
: 735716
Last Closed: 2011-07-21 08:49:44 UTC
Bug Blocks: 580948, 677620, 735716

Description Dan Yasny 2011-02-15 11:28:55 UTC
Description of problem:
We have seen cases where a script or other software sends a kill signal to the qemu process, which looks as if a destroy had been issued. Because the signal was not issued by any VM management suite, it was never logged; the customer saw the VMs crashing and was alarmed.

The issue was resolved with the following patch:
diff --git a/qemu/vl.c b/qemu/vl.c
index e50ce1c..5667bd5 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -5089,8 +5089,9 @@ static void *qemu_alloc_physram(unsigned long memory)
 
 #ifndef _WIN32
 
-static void termsig_handler(int signal)
+static void termsig_handler(int signal, siginfo_t *info, void *c)
 {
+    printf("Got signal %d from pid %d\n", info->si_signo, info->si_pid);   // Vladik, here's si_pid printed out
     qemu_system_shutdown_request();
 }
 
@@ -5099,7 +5100,8 @@ static void termsig_setup(void)
     struct sigaction act;
 
     memset(&act, 0, sizeof(act));
-    act.sa_handler = termsig_handler;
+    act.sa_sigaction = termsig_handler;
+    act.sa_flags = SA_SIGINFO;
     sigaction(SIGINT,  &act, NULL);
     sigaction(SIGHUP,  &act, NULL);
     sigaction(SIGTERM, &act, NULL);
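The same idea can be sketched as a standalone fragment outside of qemu. The patch above calls printf() inside the handler, and printf() is not on the POSIX list of async-signal-safe functions, so this variant only records the signal number and sender PID in the handler and leaves the printing to normal context. The names record_termsig, termsig_setup and termsig_poll are hypothetical and are not taken from the qemu tree; the sketch also assumes that pid_t fits in sig_atomic_t, as it does on Linux. Note that si_pid is only meaningful for signals sent by a process (for example with kill()), not for ones generated by the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names for illustration; none of this is qemu code. */
static volatile sig_atomic_t pending_signo;   /* 0 means nothing recorded */
static volatile sig_atomic_t pending_sender;  /* assumes pid_t fits in sig_atomic_t */

/* Handler: only stores to sig_atomic_t variables, which is async-signal-safe. */
static void record_termsig(int signo, siginfo_t *info, void *ctx)
{
    (void)ctx;
    pending_sender = (sig_atomic_t)info->si_pid;
    pending_signo = signo;
}

/* Install the handler for the three signals the patch covers; 0 on success. */
int termsig_setup(void)
{
    static const int sigs[] = { SIGINT, SIGHUP, SIGTERM };
    struct sigaction act;
    size_t i;

    memset(&act, 0, sizeof(act));
    act.sa_sigaction = record_termsig;
    act.sa_flags = SA_SIGINFO;      /* selects the three-argument handler form */
    sigemptyset(&act.sa_mask);
    for (i = 0; i < sizeof(sigs) / sizeof(sigs[0]); i++)
        if (sigaction(sigs[i], &act, NULL) < 0)
            return -1;
    return 0;
}

/* Call from normal (non-handler) context. Returns 1 and fills the outputs
 * if a signal was recorded since the last call, otherwise returns 0. */
int termsig_poll(int *signo, pid_t *sender)
{
    if (pending_signo == 0)
        return 0;
    *signo = (int)pending_signo;
    *sender = (pid_t)pending_sender;
    pending_signo = 0;
    return 1;
}
```

A main loop could call termsig_poll() and, when it returns 1, print the same "Got signal %d from pid %d" line as the patch before requesting shutdown. If several signals arrive between polls, only the most recent one is kept, which is a simplification of this sketch.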


This seems like good practice in all environments, so I am requesting that this reporting be made permanent in qemu.

Version-Release number of selected component (if applicable):
%define kvmversion 83-maint-snapshot-20090205
%define pkgversion 83
%define pkgrelease 224


How reproducible:
always

Steps to Reproduce:
1. Run a VM
2. kill -s 15 $qemu_pid
3. vdsm in this case assumes that a user shutdown or destroy was called, when actually the PID killing the VM belongs to bash
  
Actual results:
$VM.stdout doesn't report anything useful for finding out why the VM went down

Expected results:
$VM.stdout reports the PID that sent the kill; this is aggregated into the logs and used to find the process that issued the destroy of the VM

Additional info:
Already agreed with the vdsm team to add PID resolution to the vdsm logs.
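As an illustration of what PID resolution could look like, here is a minimal sketch that maps a PID to its command line by reading /proc/<pid>/cmdline. This is not vdsm code; the helper name pid_to_cmdline is hypothetical, and the sketch assumes a Linux host with /proc mounted. The lookup is inherently racy: the sending process may already have exited by the time it runs, in which case the helper simply fails.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper, not taken from vdsm or qemu.
 * Copies the command line of `pid` into buf (space-separated, NUL-terminated).
 * Returns 0 on success, -1 if the process is gone, not visible, or has
 * an empty command line (as kernel threads do). */
int pid_to_cmdline(pid_t pid, char *buf, size_t len)
{
    char path[64];
    FILE *f;
    size_t n, i;

    if (len == 0)
        return -1;
    snprintf(path, sizeof(path), "/proc/%ld/cmdline", (long)pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    n = fread(buf, 1, len - 1, f);
    fclose(f);

    /* Arguments are NUL-separated in /proc; join them with spaces. */
    for (i = 0; i + 1 < n; i++)
        if (buf[i] == '\0')
            buf[i] = ' ';
    buf[n] = '\0';
    return n > 0 ? 0 : -1;
}
```

With the PID from the "Got signal 15 from pid ..." line, a log aggregator could call pid_to_cmdline() right away and record the result next to the PID, so that the entry stays useful after the sender has exited.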

Comment 2 Dor Laor 2011-02-17 08:26:02 UTC
Gleb, this seems right and would also help upstream and on rhel6.

Comment 3 Gleb Natapov 2011-02-17 10:27:33 UTC
I am not sure it is that simple. The patch above was a hack written in three minutes. If upstream accepts it, great, but a more correct solution would be to report this data through QMP. That would complicate the patch itself and require adjustments on the management side, and we all know how they like that.

Comment 4 Dor Laor 2011-02-20 21:57:42 UTC
(In reply to comment #3)
> I am not sure it is that simple. The patch above was a hack written in three
> minutes. If upstream accepts it, great, but a more correct solution would be
> to report this data through QMP. That would complicate the patch itself and
> require adjustments on the management side, and we all know how they like that.

Let's try first. If it becomes an issue, we can easily carry a private rhel5 version here.

Comment 15 Dan Kenigsberg 2011-04-10 07:26:24 UTC
*** Bug 677620 has been marked as a duplicate of this bug. ***

Comment 19 Mike Cao 2011-05-12 03:37:20 UTC
Reproduced on kvm-83-229.el5
Verified on kvm-83-232.el5

Steps to Reproduce:
1. Run a VM
2. kill -s 15 $qemu_pid


Actual Results:
On kvm-83-229.el5: no output
On kvm-83-232.el5: Got signal 15 from pid 11668

Based on the above, this issue has been fixed.

Comment 21 errata-xmlrpc 2011-07-21 08:49:44 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1068.html
