Description of problem:
A node with cloud provider vsphere fails to post status Ready when ipfailover is configured. atomic-openshift-node appears to duplicate the node's IPs (the ipfailover VIP is reported alongside the real node IP) and fails with:

"doesn't match $setElementOrder list: [map[type:ExternalIP] map[type:InternalIP] map[type:ExternalIP] map[type:InternalIP] map[type:Hostname]]"

Full logs here:

where node IP = 159.103.104.245 and VIP = 159.103.104.40:

vsphere.go:503] Find local IP address 159.103.104.245 and set type to
vsphere.go:503] Find local IP address 159.103.104.40 and set type to
vsphere.go:503] Find local IP address 172.17.0.1 and set type to
vsphere.go:503] Find local IP address 10.214.0.1 and set type to
round_trippers.go:405] PATCH https://console-ocptestinfra.julisbaer.com:443/api/v1/nodes/srp10556lx/status 500 Internal Server Error in 1 milliseconds
kubelet_node_status.go:380] Error updating node status, will retry: failed to patch status "{\"status\":{\"$setElementOrder/addresses\":[{\"type\":\"ExternalIP\"},{\"type\":\"InternalIP\"},{\"type\":\"ExternalIP\"},{\"type\":\"InternalIP\"},{\"type\":\"Hostname\"}],\"$setElementOrder/conditions\":[{\"type\":\"OutOfDisk\"},{\"type\":\"MemoryPressure\"},{\"type\":\"DiskPressure\"},{\"type\":\"Ready\"}],\"addresses\":[{\"address\":\"159.103.104.40\",\"type\":\"ExternalIP\"},{\"address\":\"159.103.104.245\",\"type\":\"ExternalIP\"},{\"address\":\"159.103.104.40\",\"type\":\"InternalIP\"},{\"address\":\"159.103.104.245\",\"type\":\"InternalIP\"}],\"conditions\":[{\"lastHeartbeatTime\":\"2018-03-06T21:43:42Z\",\"type\":\"OutOfDisk\"},{\"lastHeartbeatTime\":\"2018-03-06T21:43:42Z\",\"type\":\"MemoryPressure\"},{\"lastHeartbeatTime\":\"2018-03-06T21:43:42Z\",\"type\":\"DiskPressure\"},{\"lastHeartbeatTime\":\"2018-03-06T21:43:42Z\",\"type\":\"Ready\"}],\"volumesInUse\":[\"kubernetes.io/vsphere-volume/[t-zrh07-dc1-simple/t-zrh07-dc1-03-openshift] kubevols/kubernetes-dynamic-pvc-65f51bfb-215e-11e8-873c-005056a81271.vmdk\",\"kubernetes.io/vsphere-volume/[t-zrh07-dc1-simple/t-zrh07-dc1-03-openshift] kubevols/kubernetes-dynamic-pvc-4e384322-215e-11e8-873c-005056a81271.vmdk\"]}}" for node "srp10556lx": The order in patch list:
atomic-openshift-node[64170]: [map[address:159.103.104.40 type:ExternalIP] map[address:159.103.104.245 type:ExternalIP] map[address:159.103.104.40 type:InternalIP] map[address:159.103.104.245 type:InternalIP]]
atomic-openshift-node[64170]: doesn't match $setElementOrder list:
atomic-openshift-node[64170]: [map[type:ExternalIP] map[type:InternalIP] map[type:ExternalIP] map[type:InternalIP] map[type:Hostname]]

Version-Release number of selected component (if applicable):
3.7

How reproducible:
This happens every time the vSphere cloud provider is active on the node; when it is not, ipfailover works as expected.
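For anyone debugging this, the duplication is visible directly in the node's status addresses. A quick check (a sketch using a standard oc jsonpath query; the node name is taken from the logs above):

# List the address type/value pairs reported for the node. With ipfailover
# active, the VIP (159.103.104.40) appears alongside the real node IP for
# both ExternalIP and InternalIP, which is what breaks the strategic-merge
# patch of the node status.
oc get node srp10556lx -o jsonpath='{range .status.addresses[*]}{.type}{"\t"}{.address}{"\n"}{end}'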
Potentially related: https://github.com/kubernetes/contrib/issues/2761
Related bz: https://bugzilla.redhat.com/show_bug.cgi?id=1527315

There are a few vSphere-related fixes coming out in the upcoming 3.7 errata (scheduled to release tomorrow). I'd be interested to see if those fixes address this issue as well.
The yet-to-be-released version with the potential fixes is v3.7.36.
Thanks Borja, I am closing this bug now.
Hi, is this still likely to be the issue in 3.10.14? I have the exact same error and logs as above in this environment, using ipfailover with the vSphere provider. Everything works great until IP failover is added. Sorry to resurrect an old report; I can open another if needed. Thanks for your time.
FYI for anyone who comes across this. The cause is the nodeIP not being defined: the openshift_set_node_ip inventory option (and related variables) that handled this was removed in 3.10, although it worked fine in 3.9. https://bugzilla.redhat.com/show_bug.cgi?id=1624679 is the RFE for this. As a workaround, setting nodeIP: {{ ip_address }} manually in /etc/origin/node/node-config.yaml and quickly restarting atomic-openshift-node.service before node-config.yaml is overwritten again sorts this out; see the sketch below. However, every time atomic-openshift-node.service is restarted (manually or on a reboot), the issue reappears.
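A minimal sketch of that workaround (10.0.0.11 is a placeholder; substitute the node's real primary IP, never the ipfailover VIP):

# In /etc/origin/node/node-config.yaml, pin the node's real primary IP:
nodeIP: 10.0.0.11

# Then restart the node service quickly, before the sync pod regenerates
# node-config.yaml and drops the setting again:
systemctl restart atomic-openshift-node.service

As noted above, this only lasts until the next restart or reboot, when the managed node-config.yaml is regenerated without nodeIP.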
OpenShift 3.11 with cloud provider vsphere has the same problem: as soon as IP failover is added, we get this error. Is there a way to avoid it?

E1116 02:56:54.095820 1 status.go:64] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"The order in patch list:\n[map[address:10.6.0.197 type:ExternalIP] map[address:10.6.0.192 type:ExternalIP] map[address:10.6.0.197 type:InternalIP] map[address:10.6.0.192 type:InternalIP]]\n doesn't match $setElementOrder list:\n[map[type:ExternalIP] map[type:InternalIP] map[type:ExternalIP] map[type:InternalIP] map[type:Hostname]]\n"}
E1116 02:56:54.105421 1 status.go:64] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"The order in patch list:\n[map[address:10.6.0.197 type:ExternalIP] map[address:10.6.0.192 type:ExternalIP] map[type:InternalIP address:10.6.0.197] map[address:10.6.0.192 type:InternalIP]]\n doesn't match $setElementOrder list:\n[map[type:ExternalIP] map[type:InternalIP] map[type:ExternalIP] map[type:InternalIP] map[type:Hostname]]\n"}
E1116 02:56:54.133776 1 status.go:64] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"The order in patch list:\n[map[address:10.6.0.197 type:ExternalIP] map[address:10.6.0.192 type:ExternalIP] map[address:10.6.0.197 type:InternalIP] map[type:InternalIP address:10.6.0.192]]\n doesn't match $setElementOrder list:\n[map[type:ExternalIP] map[type:InternalIP] map[type:ExternalIP] map[type:InternalIP] map[type:Hostname]]\n"}
E1116 02:56:54.159910 1 status.go:64] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"The order in patch list:\n[map[address:10.6.0.197 type:ExternalIP] map[address:10.6.0.192 type:ExternalIP] map[type:InternalIP address:10.6.0.197] map[address:10.6.0.192 type:InternalIP]]\n doesn't match $setElementOrder list:\n[map[type:ExternalIP] map[type:InternalIP] map[type:ExternalIP] map[type:InternalIP] map[type:Hostname]]\n"}
E1116 02:56:54.183218 1 status.go:64] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"The order in patch list:\n[map[address:10.6.0.197 type:ExternalIP] map[address:10.6.0.192 type:ExternalIP] map[address:10.6.0.197 type:InternalIP] map[address:10.6.0.192 type:InternalIP]]\n doesn't match $setElementOrder list:\n[map[type:ExternalIP] map[type:InternalIP] map[type:ExternalIP] map[type:InternalIP] map[type:Hostname]]\n"}
E1116 02:57:04.192781 1 status.go:64] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"The order in patch list:\n[map[type:ExternalIP address:10.6.0.197] map[address:10.6.0.192 type:ExternalIP] map[address:10.6.0.197 type:InternalIP] map[address:10.6.0.192 type:InternalIP]]\n doesn't match $setElementOrder list:\n[map[type:ExternalIP] map[type:InternalIP] map[type:ExternalIP] map[type:InternalIP] map[type:Hostname]]\n"}
E1116 02:57:04.203555 1 status.go:64] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"The order in patch list:\n[map[address:10.6.0.197 type:ExternalIP] map[type:ExternalIP address:10.6.0.192] map[address:10.6.0.197 type:InternalIP] map[address:10.6.0.192 type:InternalIP]]\n doesn't match $setElementOrder list:\n[map[type:ExternalIP] map[type:InternalIP] map[type:ExternalIP] map[type:InternalIP] map[type:Hostname]]\n"}
E1116 02:57:04.232610 1 status.go:64] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"The order in patch list:\n[map[address:10.6.0.197 type:ExternalIP] map[address:10.6.0.192 type:ExternalIP] map[address:10.6.0.197 type:InternalIP] map[address:10.6.0.192 type:InternalIP]]\n doesn't match $setElementOrder list:\n[map[type:ExternalIP] map[type:InternalIP] map[type:ExternalIP] map[type:InternalIP] map[type:Hostname]]\n"}
@luxq We had to create a config map for every single node in the cluster, with the nodeIP configured as an "edit" part in the map (we're using 3.11 too, but this has been the case since 3.10):

https://access.redhat.com/solutions/3625721

Expanding slightly, we kept the first three default groups for master, compute and infra as-is, but added our own in (a single line in the real inventory, wrapped here for readability):

openshift_node_groups=[
    {'name': 'node-config-master', 'labels': ['node-role.kubernetes.io/master=true']},
    {'name': 'node-config-infra', 'labels': ['node-role.kubernetes.io/infra=true']},
    {'name': 'node-config-compute', 'labels': ['node-role.kubernetes.io/compute=true']},
    {'name': 'cm-osm01', 'labels': ['node-role.kubernetes.io/master=true'], 'edits': [{'key': 'nodeIP', 'value': '10.x.x.x'}]},
    {'name': 'cm-osm02', 'labels': ['node-role.kubernetes.io/master=true'], 'edits': [{'key': 'nodeIP', 'value': '10.x.x.x'}]},
    ..... etc

where "cm-osm01" is our first master, "cm-osm02" the second, etc. Edit the labels according to that server's role.

The next problem you'll hit is probably yet another PR we have lodged for the vSphere cloud provider, https://bugzilla.redhat.com/show_bug.cgi?id=1643348 in 3.11, where any more than 5 IPs on the primary interface (ipfailover, egressIP, etc.) will make the vSphere cloud provider go into a spin.
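To verify the per-node edit actually landed, one option (a sketch; cm-osm01 is the example group name from above) is to check the generated ConfigMap and the rendered file on the host:

# The node groups are rendered as ConfigMaps in the openshift-node namespace;
# confirm the per-node map carries the nodeIP edit:
oc -n openshift-node get configmap cm-osm01 -o yaml | grep nodeIP

# And on the node itself, after the sync pod has written the file:
grep nodeIP /etc/origin/node/node-config.yaml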
(In reply to cg from comment #24)
> @luxq We had to create a config map for every single node in the cluster,
> with the nodeIP configured as an "edit" part in the map (we're using 3.11
> too, but this has been the case since 3.10)
> [...]

@cg Is that right? Here the master and infra roles are put together on the same nodes; will that trigger this problem? ipfailover runs on the masters.

[OSEv3:vars]
...
...
openshift_node_groups=[
    {'name': 'node-config-master', 'labels': ['node-role.kubernetes.io/master=true']},
    {'name': 'node-config-infra', 'labels': ['node-role.kubernetes.io/infra=true']},
    {'name': 'node-config-compute', 'labels': ['node-role.kubernetes.io/compute=true']},
    {'name': 'node-master01', 'labels': ['node-role.kubernetes.io/master=true'], 'edits': [{'key': 'nodeIP', 'value': '10.x.x.1'}]},
    {'name': 'node-master02', 'labels': ['node-role.kubernetes.io/master=true'], 'edits': [{'key': 'nodeIP', 'value': '10.x.x.2'}]},
    {'name': 'node-master03', 'labels': ['node-role.kubernetes.io/master=true'], 'edits': [{'key': 'nodeIP', 'value': '10.x.x.3'}]}]

[nodes]
master1.openshift.sz.clio openshift_schedulable=true openshift_node_group_name='node-master01'
master2.openshift.sz.clio openshift_schedulable=true openshift_node_group_name='node-master02'
master3.openshift.sz.clio openshift_schedulable=true openshift_node_group_name='node-master03'
master[1:3].openshift.sz.clio openshift_schedulable=true openshift_node_group_name='node-config-infra'
node1.openshift.sz.clio openshift_schedulable=true openshift_node_group_name='node-config-compute'
node2.openshift.sz.clio openshift_schedulable=true openshift_node_group_name='node-config-compute'
node3.openshift.sz.clio openshift_schedulable=true openshift_node_group_name='node-config-compute'