Bug 1307160

Summary:	VirtualDomain can log unnecessary error when probing nonexistent domain
Product:	Red Hat Enterprise Linux 7	Reporter:	Ken Gaillot <kgaillot>
Component:	resource-agents	Assignee:	Oyvind Albrigtsen <oalbrigt>
Status:	CLOSED ERRATA	QA Contact:	cluster-qe <cluster-qe>
Severity:	low	Docs Contact:
Priority:	unspecified
Version:	7.2	CC:	agk, cluster-maint, fdinitto, mnovacek
Target Milestone:	rc
Target Release:	---
Hardware:	All
OS:	All
Whiteboard:
Fixed In Version:	resource-agents-3.9.5-69.el7	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2016-11-04 00:01:55 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Ken Gaillot 2016-02-12 21:14:21 UTC

Description of problem: Pacemaker startup probes of VirtualDomain resources can result in unnecessary and potentially confusing "Unable to determine emulator" log messages.

Version-Release number of selected component (if applicable): 3.9.5-54.el7_2.1


How reproducible: Reliably


Steps to Reproduce:
1. Call VirtualDomain's monitor action for a domain that does not exist and has not been seen before (i.e. no /run/resource-agents/VirtualDomain-$DOMAIN-emu.state file exists).
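
The precondition in step 1 (no cached emulator state for the domain) can be checked before calling the agent. The following is a minimal sketch: the helper names emu_state_file and has_emu_state and the STATE_DIR override are hypothetical, and only the file name pattern comes from this report.

```shell
# Hypothetical helpers, not part of the VirtualDomain agent.
# STATE_DIR defaults to the directory named in this report.

# Print the path of the emulator cache file for the domain given as $1.
emu_state_file() {
    printf '%s/VirtualDomain-%s-emu.state\n' "${STATE_DIR:-/run/resource-agents}" "$1"
}

# Succeed if the domain given as $1 already has a cached emulator state file.
has_emu_state() {
    [ -e "$(emu_state_file "$1")" ]
}
```

If has_emu_state fails for the domain, the monitor action should hit the code path described here. The agent is normally run by Pacemaker; running it by hand (on a typical install, /usr/lib/ocf/resource.d/heartbeat/VirtualDomain with OCF_ROOT and OCF_RESKEY_config set in the environment) is an assumption about the local setup, not something stated in this report.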

Actual results: The agent logs "ERROR: Unable to determine emulator for $DOMAIN".

Expected results: No errors logged for successful "not running" monitor of nonexistent domain.


Additional info: The VirtualDomain resource agent's get_emulator function can be called by update_emulator_cache or pid_status. While the above error message may be useful with the pid_status call, it is unnecessary with update_emulator_cache. Perhaps get_emulator could just return a status, and the caller could print the message or not.
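
The suggestion above could look roughly like the sketch below. It is not the agent's actual code and not the eventual patch: lookup_emulator and its hard-coded table are hypothetical stand-ins for the real libvirt query, and the function bodies are simplified so that only the "return a status, let the caller decide whether to log" pattern is shown.

```shell
# Sketch only: simplified stand-ins, not the real VirtualDomain functions.

# Hypothetical lookup: print the emulator path for a domain, or nothing.
# The real agent would ask libvirt; a fixed table stands in for that here.
lookup_emulator() {
    case "$1" in
        known-domain) echo "/usr/libexec/qemu-kvm" ;;
        *) : ;;
    esac
}

# Print the emulator and return 0, or return 1 without logging anything.
get_emulator() {
    emulator=$(lookup_emulator "$1")
    [ -n "$emulator" ] || return 1
    printf '%s\n' "$emulator"
}

# Cache refresh ($1 = domain, $2 = cache file): during a probe of a
# nonexistent domain a missing emulator is expected, so stay silent.
update_emulator_cache() {
    emulator=$(get_emulator "$1") || return 0
    printf '%s\n' "$emulator" > "$2"
}

# Process check: here a missing emulator is worth reporting.
pid_status() {
    emulator=$(get_emulator "$1") || {
        echo "ERROR: Unable to determine emulator for $1" >&2
        return 1
    }
    echo "checking for process $(basename "$emulator")"
}
```

With this split, update_emulator_cache for an unknown domain writes no cache file and prints nothing, while pid_status still reports the error on stderr.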

Comment 3 Ken Gaillot 2016-03-18 20:36:35 UTC

I was unable to reproduce this with a KVM guest, but it does affect at least LXC. I was hoping to spare you the steps for setting up an LXC guest node :) but here is the reproducer:

1. Set up a pacemaker cluster with at least two cluster nodes (the cluster nodes may be VMs).

2. Make sure you have these prerequisites on all nodes:
* yum install rsync libvirt-daemon libvirt-daemon-driver-lxc libvirt-daemon-lxc libvirt-login-shell pacemaker-remote
* SELinux must be enabled (permissive or enforcing)
* libvirtd must be enabled and running
* root must be able to ssh without a password between all nodes

3. Download http://people.redhat.com/kgaillot/bz1307160/lxc_autogen.sh (newer version than the one supplied with RHEL 7.2 pacemaker-cts package)

4. Run "./lxc_autogen.sh -v" on each node to verify the local environment (should print the PID of libvirtd and no errors).

5. Create an LXC guest node: "./lxc_autogen.sh -g -a -m -s -c 1 && crm_resource --wait"

6. At this point, the guest node will have been probed on all nodes. Any nodes that aren't running the container will have logs like this in their /var/log/messages:
Mar 18 15:10:35 rhel7-2 VirtualDomain(container1)[4736]: ERROR: Unable to determine emulator for lxc1

7. To clean up afterwards: "./lxc_autogen.sh -R -s"
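
To check a node for the message from step 6, something like the following could be used. The helper name count_emulator_errors is hypothetical, and the pattern is simply derived from the sample log line above.

```shell
# Hypothetical helper: count VirtualDomain "Unable to determine emulator"
# errors in a log file ($1, defaulting to /var/log/messages).
# In a basic regular expression the "(" is matched literally.
# Note: grep exits non-zero when the count is 0, and so does this function.
count_emulator_errors() {
    grep -c 'VirtualDomain(.*ERROR: Unable to determine emulator for' "${1:-/var/log/messages}"
}
```

A count of 0 on nodes that are not running the container would indicate that the unnecessary error is no longer logged during probes.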

Comment 4 Oyvind Albrigtsen 2016-04-05 10:30:25 UTC

Working patch:
https://github.com/ClusterLabs/resource-agents/pull/787/files

Comment 6 michal novacek 2016-09-12 10:11:51 UTC

I'm unable to verify this bug with qemu or lxc in resource-agents-3.9.5-81, but the patch is simple enough.

I have verified that the patch is present as 

bz1307160-virtualdomain-fix-unnecessary-error-when-probing-nonexistent-domain.patch 

and that resource-agents can be built correctly.

Comment 8 errata-xmlrpc 2016-11-04 00:01:55 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2174.html