Description of problem:
I had a working cluster, running off 3 VMs, very basic, standard. I'm not sure if recent updates broke it.
I see these over and over in the log file:
2017/01/24 22:20:05.025164 [recoverd: 3474]: Public IP '10.5.10.51' is assigned to us but not on an interface
2017/01/24 22:20:05.027571 [recoverd: 3474]: Trigger takeoverrun
2017/01/24 22:20:05.053386 [recoverd: 3474]: Takeover run starting
2017/01/24 22:20:05.106044 [ 3309]: Takeover of IP 10.5.10.51/28 on interface eth0
~]$ ctdb status
Number of nodes:3
pnn:0 10.5.10.55 OK
pnn:1 10.5.10.56 OK (THIS NODE)
pnn:2 10.5.10.57 OK
Recovery mode:NORMAL (0)
It seems that CTDB does not create/assign that public IP.
~]$ ctdb ip -v
Public IPs on node 1
10.5.10.51 node active[eth0] available[eth0] configured[eth0]
I suspect an update, since nothing else really changed except:
update to libvirt-2.0.0-10.el7_3.4.x86_64
Well... ctdb and the rest of the Samba suite were updated too, but the same versions (as on the VM cluster) are running fine on bare metal.
Version-Release number of selected component (if applicable):
Should be very easy to reproduce; my cluster runs on three VM guests on a host with libvirt-2.0.0-10.el7_3.4.x86_64
Set up a basic CTDB, in /etc/sysconfig/ctdb:
# CTDB-RA: Auto-generated by /usr/lib/ocf/resource.d/pawel/CTDB, backup is at /etc/sysconfig/ctdb.ctdb-ra-orig
You can also set CTDB_MANAGES_SAMBA=no, which I was curious about, but it still fails.
Steps to Reproduce:
Moving to ctdb component for an investigation.
A bit of an update: if I manually run 'ip addr add 10.5.10.51/28 dev eth0', ctdb quiets down and seems to be OK(?)
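For anyone hitting the same log message, here is a minimal sketch for checking whether a public IP is actually present on an interface. The function name addr_present is hypothetical (it is not part of ctdb or iproute2), and the sketch assumes the one-line-per-address format printed by 'ip -o -4 addr show':

```shell
# addr_present is a hypothetical helper, not part of ctdb or iproute2.
# It reads `ip -o -4 addr show` output on stdin and succeeds (exit 0)
# if the given IPv4 address appears as an inet address on any line.
addr_present() {
    awk -v ip="$1" '
        $3 == "inet" {
            split($4, a, "/")        # drop the /prefix length
            if (a[1] == ip) found = 1
        }
        END { exit !found }
    '
}
```

It could be used as: ip -o -4 addr show dev eth0 | addr_present 10.5.10.51 && echo present || echo missing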
I don't think it's SELinux; I cannot find any evidence that it is.
What happened on the VM ctdb cluster (all CentOS 7.3) was one of the following:
* an rpm update made a few event scripts lose their exec bit, including 10.interface and 11.routing, or
* the pcsd toolkit did it while I had ctdb under HA, and when the pcs resources were removed the ctdb scripts did not get their permission bits reset.
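A rough sketch for spotting event scripts that have lost their exec bit. The function name list_nonexec_scripts is hypothetical, and the default directory /etc/ctdb/events.d is an assumption about where this ctdb version keeps its event scripts; pass a different path if yours differs:

```shell
# list_nonexec_scripts is a hypothetical helper, not shipped with ctdb.
# It prints every regular file in the given directory that is not
# executable by the current user.
# The default path below is an assumption; adjust it for your install.
list_nonexec_scripts() {
    dir="${1:-/etc/ctdb/events.d}"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip directories and unmatched globs
        [ -x "$f" ] || printf '%s\n' "$f"
    done
}
```

Running 'chmod +x' on each file it reports should restore the bit, assuming those scripts are meant to be enabled.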
I'd say: please close the report