Description of problem:
Pacemaker startup probes of VirtualDomain resources can result in unnecessary and potentially confusing "Unable to determine emulator" log messages.

Version-Release number of selected component (if applicable):
3.9.5-54.el7_2.1

How reproducible:
Reliably

Steps to Reproduce:
1. Call VirtualDomain's monitor action for a domain that does not exist and has not been seen before (i.e. no /run/resource-agents/VirtualDomain-$DOMAIN-emu.state file exists).

Actual results:
Log message "ERROR: Unable to determine emulator for" domain.

Expected results:
No errors logged for successful "not running" monitor of nonexistent domain.

Additional info:
The VirtualDomain resource agent's get_emulator function can be called by update_emulator_cache or pid_status. While the above error message may be useful with the pid_status call, it is unnecessary with update_emulator_cache. Perhaps get_emulator could just return a status, and the caller could print the message or not.
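The suggestion above could look roughly like the following sketch. It is not the agent's actual code or the eventual patch: lookup_emulator is a hypothetical stub standing in for the real libvirt query, STATE_DIR is an assumed variable, and the function bodies are reduced to the control flow under discussion.

```shell
#!/bin/sh
# Illustrative sketch only, not the real VirtualDomain agent.

# Assumed location for the cache files; the real agent uses /run/resource-agents.
STATE_DIR=${STATE_DIR:-${TMPDIR:-/tmp}}

# Hypothetical stub for the libvirt lookup. It only "knows" one domain
# so that the example is self-contained.
lookup_emulator() {
    case "$1" in
        known-domain) echo "/usr/bin/qemu-kvm" ;;
        *) return 1 ;;
    esac
}

# Print the emulator path and return 0, or return 1 without logging anything.
get_emulator() {
    emulator=$(lookup_emulator "$1") || return 1
    [ -n "$emulator" ] || return 1
    echo "$emulator"
}

# Cache refresh (e.g. during a probe): a missing emulator is expected for a
# nonexistent domain, so nothing is logged and the call still succeeds.
update_emulator_cache() {
    emulator=$(get_emulator "$1") || return 0
    echo "$emulator" > "$STATE_DIR/VirtualDomain-$1-emu.state"
}

# Process check: here the caller decides that the failure is worth an error.
pid_status() {
    emulator=$(get_emulator "$1") || {
        echo "ERROR: Unable to determine emulator for $1" >&2
        return 1
    }
    # The real check for a running emulator process is omitted in this sketch.
    return 0
}
```

With this split, "update_emulator_cache lxc1" for an unknown domain is silent, while "pid_status lxc1" still reports the error on stderr.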
I was unable to reproduce this with a KVM guest, but it does affect at least LXC. I was hoping to spare you the steps for setting up an LXC guest node :) but here is the reproducer:

1. Set up a pacemaker cluster with at least two cluster nodes (the cluster nodes may be VMs).
2. Make sure you have these prerequisites on all nodes:
   * yum install rsync libvirt-daemon libvirt-daemon-driver-lxc libvirt-daemon-lxc libvirt-login-shell pacemaker-remote
   * SELinux must be enabled (permissive or enforcing)
   * libvirtd must be enabled and running
   * root must be able to ssh without a password between all nodes
3. Download http://people.redhat.com/kgaillot/bz1307160/lxc_autogen.sh (newer version than the one supplied with RHEL 7.2 pacemaker-cts package)
4. Run "./lxc_autogen.sh -v" on each node to verify the local environment (should print the PID of libvirtd and no errors).
5. Create an LXC guest node: "./lxc_autogen.sh -g -a -m -s -c 1 && crm_resource --wait"
6. At this point, the guest node will have been probed on all nodes. Any nodes that aren't running the container will have logs like this in their /var/log/messages:
   Mar 18 15:10:35 rhel7-2 VirtualDomain(container1)[4736]: ERROR: Unable to determine emulator for lxc1
7. To clean up afterwards: "./lxc_autogen.sh -R -s"
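To check step 6 on each node without reading the log by eye, something like the following could be used. count_emulator_errors is a hypothetical helper, not part of lxc_autogen.sh, and the pattern assumes the message format shown in the sample log line above.

```shell
#!/bin/sh
# Hypothetical helper: count VirtualDomain "Unable to determine emulator"
# errors in a syslog-style file (defaults to /var/log/messages).
count_emulator_errors() {
    logfile=${1:-/var/log/messages}
    # In a basic regular expression "(" is a literal character, so this
    # matches lines such as "VirtualDomain(container1)[4736]: ERROR: ...".
    grep -c 'VirtualDomain(.*ERROR: Unable to determine emulator for' "$logfile"
}
```

Running "count_emulator_errors" prints the number of matching lines. On a node with the fix applied, the count should stop increasing after new probes.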
Working patch: https://github.com/ClusterLabs/resource-agents/pull/787/files
I'm unable to verify this bug with qemu or lxc in resource-agents-3.9.5-81, but the patch is simple enough. I have verified that the patch is present as bz1307160-virtualdomain-fix-unnecessary-error-when-probing-nonexistent-domain.patch and that resource-agents builds correctly.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-2174.html