Bug 677620 - Blocked on QEMU - Add PID to process name resolution to vdsm logging
Summary: Blocked on QEMU - Add PID to process name resolution to vdsm logging
Keywords:
Status: CLOSED DUPLICATE of bug 677614
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: vdsm22
Version: 5.6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Dan Kenigsberg
QA Contact: yeylon@redhat.com
URL:
Whiteboard:
Depends On: 677614 735716
Blocks:
Reported: 2011-02-15 11:42 UTC by Dan Yasny
Modified: 2016-04-18 06:38 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-04-10 07:26:24 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 677614 0 unspecified CLOSED QEMU should report the PID of the process that sent it signals for troubleshooting purposes 2021-02-22 00:41:40 UTC

Internal Links: 677614

Description Dan Yasny 2011-02-15 11:42:36 UTC
Description of problem:
BZ#677614 asks qemu to print, on its stdout, the PID of the process that sent it a SIGKILL/SIGTERM.
Since a bare PID number in the logs is useless, we also need to resolve the PID to an actual process name (probably filtering out vdsm itself).

The following patch has been tested and works with the current vdsm22-4.5-63.9.el5 and kvm .224 patched with the qemu patch from BZ#677614. It resolved a case where a customer's custom script killed VMs and vdsm logged those events as "# Got shutdown request".

# diff -u vm.py.orig vm.py
--- vm.py.orig	2011-02-15 19:20:58.000000000 +0200
+++ vm.py	2011-02-15 19:20:45.000000000 +0200
@@ -1362,7 +1362,18 @@
         except:
             self.log.error(traceback.format_exc())
         try:
-            self.log.debug('qemu stdouterr: ' + file(self.dumpFile).read())
+            qemu_log = file(self.dumpFile).read()
+            qlog_arr = qemu_log.split()
+            for item in range(len(qlog_arr)):
+                if qlog_arr[item] == 'signal':
+                    try:
+                        pid_num = qlog_arr[item + 4]
+                        pid_cmd = '\nPID ' + pid_num + ' resolves to : ' + file('/proc/%s/cmdline' % pid_num).read().replace('\0', ' ') + '\n'
+                        qemu_log += pid_cmd
+                    except:
+                        pass
+            self.log.debug('qemu stdouterr: ' + qemu_log)
+                
         except:
             pass
         t = threading.Thread(target=self._prepostVmScript, args=['post_vm'])
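
The PID lookup in the patch above can also be written as a small standalone helper. This is only a sketch, not vdsm code: the name `resolve_pid` is hypothetical, it targets a current Python rather than the Python 2 of vdsm22, and it assumes a Linux `/proc` filesystem. It splits the NUL-separated `cmdline` into readable arguments and returns None when the process has already exited.

```python
import os


def resolve_pid(pid):
    """Return the command line of `pid` as one readable string, or None.

    Hypothetical helper for illustration; assumes Linux /proc.
    """
    try:
        with open('/proc/%d/cmdline' % int(pid), 'rb') as f:
            raw = f.read()
    except (IOError, OSError, ValueError, TypeError):
        # The process is gone, /proc is unavailable, or pid is not a number.
        return None
    # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
    args = [a.decode('utf-8', 'replace') for a in raw.split(b'\0') if a]
    # Kernel threads have an empty cmdline; report that as None.
    return ' '.join(args) or None


if __name__ == '__main__':
    print(resolve_pid(os.getpid()))
```

Note that the sender may have exited by the time the log is read, so a None result does not mean the PID was wrong.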



Version-Release number of selected component (if applicable):
vdsm22-4.5-63.9.el5


How reproducible:
always

Steps to Reproduce:
1. Install a host with the patched qemu and vm.py
2. kill -s 15 $qemu_pid
3. Watch the log
  
Actual results:
The log shows "# Got shutdown request", as if this were a proper shutdown of the VM.

Expected results:
The log shows the PID of the sending process, resolved to the actual command, for troubleshooting.

It would also make sense to filter out all instances where the PID resolves to vdsm itself (an actual destroy call, which is OK), so the logs keep only the external commands that interfered with qemu.
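
A minimal sketch of that filtering step, in Python. The function name `external_senders` and its parameters are hypothetical, and it assumes that vdsm delivers the signal from its own process, which this report does not confirm.

```python
import os


def external_senders(sender_pids, own_pid=None, ignored=()):
    """Drop PIDs that belong to the logging daemon itself.

    Hypothetical helper: `own_pid` defaults to the current process and
    `ignored` may list further PIDs (for example helper children).
    """
    own = os.getpid() if own_pid is None else own_pid
    skip = set(ignored)
    skip.add(own)
    return [pid for pid in sender_pids if pid not in skip]


if __name__ == '__main__':
    # Only PIDs other than our own are kept for the log.
    print(external_senders([1, os.getpid(), 4242]))
```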

Additional info:

Comment 1 Dan Kenigsberg 2011-02-17 09:55:54 UTC
Please attach what qemu's new stderr looks like, and say in which qemu version it will appear (please add a dependency on the relevant qemu bug).

Comment 2 Dan Yasny 2011-02-17 10:14:39 UTC
(In reply to comment #1)
> Please attach what qemu's new stderr looks like, and say in which qemu version
> it will appear (please add a dependency on the relevant qemu bug).

The QEMU BZ is https://bugzilla.redhat.com/show_bug.cgi?id=677614

The output currently looks like this:
  printf("Got signal %d from pid %d\n", info->si_signo, info->si_pid);   

How it will look when the BZ is finally resolved - don't know yet, will have to ask Dor or Gleb; same for the qemu version this will appear in.

adding dep.
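
For illustration, a line produced by the printf format quoted above could be parsed as follows. This is a sketch that assumes the interim "Got signal %d from pid %d" format; the final qemu output may differ, and `SIGNAL_RE` and `find_signal_senders` are hypothetical names.

```python
import re

# Matches the interim format quoted above; the final format is not settled.
SIGNAL_RE = re.compile(r'Got signal (\d+) from pid (\d+)')


def find_signal_senders(qemu_output):
    """Return a list of (signal_number, sender_pid) tuples found in the text."""
    return [(int(sig), int(pid)) for sig, pid in SIGNAL_RE.findall(qemu_output)]


if __name__ == '__main__':
    sample = 'some qemu output\nGot signal 15 from pid 1234\n'
    print(find_signal_senders(sample))
```

Matching the whole phrase is less fragile than indexing into a whitespace-split list, since unrelated output containing the word "signal" is ignored.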

Comment 3 Dan Kenigsberg 2011-04-10 07:26:24 UTC
According to https://bugzilla.redhat.com/show_bug.cgi?id=677614#c13 this should be solved by having an auditctl rule that logs all PIDs on creation. Such a rule would enable the customer to identify the guilty process.

*** This bug has been marked as a duplicate of bug 677614 ***

