Description of problem:

Version-Release number of selected component (if applicable):
oc v3.7.0-0.143.1
kubernetes v1.7.0+80709908fd
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://172.31.78.254:443
openshift v3.7.0-0.143.1
kubernetes v1.7.0+80709908fd

Steps to Reproduce:
1. Upgraded from relatively recent v3.7 to latest v3.7 (stage branch)

[root@free-stg-node-infra-70a4e ~]# systemctl status atomic-openshift-node
● atomic-openshift-node.service - OpenShift Node
   Loaded: loaded (/etc/systemd/system/atomic-openshift-node.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/atomic-openshift-node.service.d
           └─openshift-sdn-ovs.conf
   Active: activating (auto-restart) (Result: exit-code) since Thu 2017-10-05 21:19:13 UTC; 3s ago
     Docs: https://github.com/openshift/origin
  Process: 33010 ExecStopPost=/usr/bin/dbus-send --system --dest=uk.org.thekelleys.dnsmasq /uk/org/thekelleys/dnsmasq uk.org.thekelleys.SetDomainServers array:string: (code=exited, status=0/SUCCESS)
  Process: 33008 ExecStopPost=/usr/bin/rm /etc/dnsmasq.d/node-dnsmasq.conf (code=exited, status=0/SUCCESS)
  Process: 32984 ExecStart=/usr/bin/openshift start node --config=${CONFIG_FILE} $OPTIONS (code=exited, status=255)
  Process: 32981 ExecStartPre=/usr/bin/dbus-send --system --dest=uk.org.thekelleys.dnsmasq /uk/org/thekelleys/dnsmasq uk.org.thekelleys.SetDomainServers array:string:/in-addr.arpa/127.0.0.1,/cluster.local/127.0.0.1 (code=exited, status=0/SUCCESS)
  Process: 32979 ExecStartPre=/usr/bin/cp /etc/origin/node/node-dnsmasq.conf /etc/dnsmasq.d/ (code=exited, status=0/SUCCESS)
 Main PID: 32984 (code=exited, status=255)

Oct 05 21:19:13 ip-172-31-69-53.us-east-2.compute.internal systemd[1]: Failed to start OpenShift Node.
Oct 05 21:19:13 ip-172-31-69-53.us-east-2.compute.internal systemd[1]: Unit atomic-openshift-node.service entered failed state.
Oct 05 21:19:13 ip-172-31-69-53.us-east-2.compute.internal systemd[1]: atomic-openshift-node.service failed.
Journal from one of the nodes (all of them are in the same loop and reporting NotReady to oc get nodes):

Oct 05 21:20:25 ip-172-31-69-53.us-east-2.compute.internal systemd[1]: atomic-openshift-node.service: main process exited, code=exited, status=255/n/a
Oct 05 21:20:25 ip-172-31-69-53.us-east-2.compute.internal systemd[1]: Failed to start OpenShift Node.
Oct 05 21:20:25 ip-172-31-69-53.us-east-2.compute.internal systemd[1]: Unit atomic-openshift-node.service entered failed state.
Oct 05 21:20:25 ip-172-31-69-53.us-east-2.compute.internal systemd[1]: atomic-openshift-node.service failed.
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal systemd[1]: atomic-openshift-node.service holdoff time over, scheduling restart.
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal systemd[1]: Starting OpenShift Node...
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.409519 34256 panic.go:26] Process will log all panics and errors to Sentry.
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.444041 34256 start_node.go:252] Reading node configuration from /etc/origin/node/node-config.yaml
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.446625 34256 loader.go:357] Config loaded from file /etc/origin/node/system:node:ip-172-31-69-53.us-east-2.compute.internal.kubeconfig
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.448072 34256 loader.go:357] Config loaded from file /etc/origin/node/system:node:ip-172-31-69-53.us-east-2.compute.internal.kubeconfig
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.449529 34256 loader.go:357] Config loaded from file /etc/origin/node/system:node:ip-172-31-69-53.us-east-2.compute.internal.kubeconfig
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.470201 34256 round_trippers.go:405] GET https://internal.api.free-stg.openshift.com:443/oapi/v1/clusternetworks/default 200 OK in 19 milliseconds
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.470462 34256 node.go:142] Initializing SDN node of type "redhat/openshift-ovs-multitenant" with configured hostname "ip-172-31-69-53.us-east-2.compute.internal" (IP ""), iptables sync period "5m0s"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.472246 34256 network_config.go:138] DNS Bind to 127.0.0.1:53
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.473298 34256 loader.go:357] Config loaded from file /etc/origin/node/system:node:ip-172-31-69-53.us-east-2.compute.internal.kubeconfig
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.474982 34256 loader.go:357] Config loaded from file /etc/origin/node/system:node:ip-172-31-69-53.us-east-2.compute.internal.kubeconfig
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.485298 34256 mount_linux.go:192] Detected OS with systemd
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.485358 34256 client.go:72] Connecting to docker on unix:///var/run/docker.sock
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.485385 34256 client.go:92] Start docker client with request timeout=2m0s
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.502870 34256 aws.go:806] Building AWS cloudprovider
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.503007 34256 regions.go:74] found AWS region "us-east-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.503027 34256 aws_credentials.go:90] registering credentials provider for AWS region "us-east-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.503043 34256 plugins.go:41] Registered credential provider "aws-ecr-us-east-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.504507 34256 log_handler.go:33] AWS request: ec2 DescribeInstances
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.573189 34256 log_handler.go:33] AWS request: ec2 DescribeInstances
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608841 34256 tags.go:76] AWS cloud filtering on ClusterID: free-stg
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608892 34256 regions.go:74] found AWS region "ap-northeast-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608904 34256 aws_credentials.go:90] registering credentials provider for AWS region "ap-northeast-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608914 34256 plugins.go:41] Registered credential provider "aws-ecr-ap-northeast-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608924 34256 regions.go:74] found AWS region "ap-northeast-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608929 34256 aws_credentials.go:90] registering credentials provider for AWS region "ap-northeast-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608934 34256 plugins.go:41] Registered credential provider "aws-ecr-ap-northeast-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608939 34256 regions.go:74] found AWS region "ap-south-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608944 34256 aws_credentials.go:90] registering credentials provider for AWS region "ap-south-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608949 34256 plugins.go:41] Registered credential provider "aws-ecr-ap-south-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608954 34256 regions.go:74] found AWS region "ap-southeast-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608958 34256 aws_credentials.go:90] registering credentials provider for AWS region "ap-southeast-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608963 34256 plugins.go:41] Registered credential provider "aws-ecr-ap-southeast-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608969 34256 regions.go:74] found AWS region "ap-southeast-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608973 34256 aws_credentials.go:90] registering credentials provider for AWS region "ap-southeast-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608978 34256 plugins.go:41] Registered credential provider "aws-ecr-ap-southeast-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608984 34256 regions.go:74] found AWS region "ca-central-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608989 34256 aws_credentials.go:90] registering credentials provider for AWS region "ca-central-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.608999 34256 plugins.go:41] Registered credential provider "aws-ecr-ca-central-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609023 34256 regions.go:74] found AWS region "eu-central-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609033 34256 aws_credentials.go:90] registering credentials provider for AWS region "eu-central-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609042 34256 plugins.go:41] Registered credential provider "aws-ecr-eu-central-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609051 34256 regions.go:74] found AWS region "eu-west-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609060 34256 aws_credentials.go:90] registering credentials provider for AWS region "eu-west-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609068 34256 plugins.go:41] Registered credential provider "aws-ecr-eu-west-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609076 34256 regions.go:74] found AWS region "eu-west-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609081 34256 aws_credentials.go:90] registering credentials provider for AWS region "eu-west-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609086 34256 plugins.go:41] Registered credential provider "aws-ecr-eu-west-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609092 34256 regions.go:74] found AWS region "sa-east-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609096 34256 aws_credentials.go:90] registering credentials provider for AWS region "sa-east-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609101 34256 plugins.go:41] Registered credential provider "aws-ecr-sa-east-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609106 34256 regions.go:74] found AWS region "us-east-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609111 34256 aws_credentials.go:90] registering credentials provider for AWS region "us-east-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609116 34256 plugins.go:41] Registered credential provider "aws-ecr-us-east-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609121 34256 regions.go:70] found AWS region "us-east-2" again - ignoring
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609126 34256 regions.go:74] found AWS region "us-west-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609131 34256 aws_credentials.go:90] registering credentials provider for AWS region "us-west-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609135 34256 plugins.go:41] Registered credential provider "aws-ecr-us-west-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609140 34256 regions.go:74] found AWS region "us-west-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609145 34256 aws_credentials.go:90] registering credentials provider for AWS region "us-west-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609150 34256 plugins.go:41] Registered credential provider "aws-ecr-us-west-2"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609156 34256 regions.go:74] found AWS region "cn-north-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609161 34256 aws_credentials.go:90] registering credentials provider for AWS region "cn-north-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609166 34256 plugins.go:41] Registered credential provider "aws-ecr-cn-north-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609171 34256 regions.go:74] found AWS region "us-gov-west-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609175 34256 aws_credentials.go:90] registering credentials provider for AWS region "us-gov-west-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609180 34256 plugins.go:41] Registered credential provider "aws-ecr-us-gov-west-1"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609202 34256 node_config.go:137] Successfully initialized cloud provider: "aws" from the config file: "/etc/origin/cloudprovider/aws.conf"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609458 34256 start_node.go:348] Starting node ip-172-31-69-53.us-east-2.compute.internal (v3.7.0-0.126.6)
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609536 34256 client.go:72] Connecting to docker on unix:///var/run/docker.sock
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.609553 34256 client.go:92] Start docker client with request timeout=2m0s
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.613384 34256 node.go:114] Connecting to Docker at unix:///var/run/docker.sock
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.614442 34256 node.go:185] Replacing empty-dir volume plugin with quota wrapper
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.628722 34256 round_trippers.go:405] GET https://internal.api.free-stg.openshift.com:443/api/v1/namespaces/default/endpoints/kubernetes 200 OK in 4 milliseconds
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.628949 34256 feature_gate.go:144] feature gates: map[]
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.629107 34256 server.go:672] cloud provider determined current node name to be ip-172-31-69-53.us-east-2.compute.internal
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.630194 34256 loader.go:357] Config loaded from file /etc/origin/node/system:node:ip-172-31-69-53.us-east-2.compute.internal.kubeconfig
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.636490 34256 round_trippers.go:405] GET https://internal.api.free-stg.openshift.com:443/oapi/v1/clusternetworks/default 200 OK in 7 milliseconds
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: F1005 21:20:30.636714 34256 network.go:39] error: SDN node startup failed: failed to get network information: failed to parse ClusterNetwork CIDR : invalid CIDR address:
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal atomic-openshift-node[34256]: I1005 21:20:30.637507 34256 manager.go:144] cAdvisor running in container: "/system.slice/atomic-openshift-node.service"
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal systemd[1]: atomic-openshift-node.service: main process exited, code=exited, status=255/n/a
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal systemd[1]: Failed to start OpenShift Node.
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal systemd[1]: Unit atomic-openshift-node.service entered failed state.
Oct 05 21:20:30 ip-172-31-69-53.us-east-2.compute.internal systemd[1]: atomic-openshift-node.service failed.
^C
The node read the 'default' ClusterNetwork from etcd but failed to parse the CIDR entry in the 'ClusterNetworks' field. Can we get the output of 'oc get clusternetwork default -o yaml'?
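The fatal log line prints nothing after "failed to parse ClusterNetwork CIDR", which suggests the value handed to the CIDR parser was an empty string. The node is written in Go, so the following is only a minimal Python sketch of that failure mode using the standard ipaddress module. The helper name parse_cluster_cidr is hypothetical and not part of OpenShift.

```python
import ipaddress


def parse_cluster_cidr(cidr):
    """Parse a ClusterNetwork CIDR string (hypothetical helper).

    Re-raises ValueError with the offending value quoted, so an empty
    string is visible in the message instead of disappearing.
    """
    try:
        return ipaddress.ip_network(cidr, strict=True)
    except ValueError as exc:
        raise ValueError(
            f"failed to parse ClusterNetwork CIDR {cidr!r}: {exc}"
        ) from exc


# The CIDR stored in the persisted object parses cleanly.
network = parse_cluster_cidr("10.128.0.0/14")
print(network.prefixlen)

# An empty string, as the log line suggests the node saw, is rejected.
try:
    parse_cluster_cidr("")
except ValueError as exc:
    print(exc)
```

Quoting the value in the error message makes an empty input obvious, whereas the original log line leaves a blank that is easy to overlook.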
Might be related to https://github.com/openshift/origin/pull/16649
Ravi:

$ oc get clusternetwork default -o yaml
apiVersion: v1
clusterNetworks:
- CIDR: 10.128.0.0/14
  hostSubnetLength: 9
kind: ClusterNetwork
metadata:
  creationTimestamp: 2017-04-06T19:57:10Z
  name: default
  resourceVersion: "50883637"
  selfLink: /oapi/v1/clusternetworks/default
  uid: 38b68e4a-1b03-11e7-871b-0203ad7dfcd7
pluginName: redhat/openshift-ovs-multitenant
serviceNetwork: 172.30.0.0/16
One finding after initially submitting this: Due to a partial upgrade, the nodes were running 3.7.0-0.126.6 while the master was 3.7.0-0.143.1.
After completing the upgrade to bring the nodes into alignment with the master version, this problem went away. I'm removing DeliveryBlocker/Urgent, but this window of risk still seems dangerous.
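Since the failure only appeared while nodes and master were on different builds, a pre-flight check for version skew could narrow that window. This is a minimal Python sketch under stated assumptions: parse_version and skewed_nodes are hypothetical helpers, the version strings are the two from this report, and how the per-node versions get collected is left out.

```python
import re


def parse_version(version):
    """Turn 'v3.7.0-0.143.1' into (3, 7, 0, 0, 143, 1) for comparison.

    Only intended for OpenShift build strings like the ones in this
    report; strings carrying a commit hash would need different handling.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))


def skewed_nodes(master_version, node_versions):
    """Return the sorted names of nodes whose version differs from the master's."""
    master = parse_version(master_version)
    return sorted(
        name
        for name, version in node_versions.items()
        if parse_version(version) != master
    )


# Versions taken from this report: master at 0.143.1, node at 0.126.6.
nodes = {
    "ip-172-31-69-53.us-east-2.compute.internal": "v3.7.0-0.126.6",
}
print(skewed_nodes("v3.7.0-0.143.1", nodes))
```

A non-empty result would flag the partially upgraded state described above before the node service is restarted.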
The persisted ClusterNetwork object is valid and matches the new multiple-CIDR format. The node failing to parse the CIDR entry indicates that something is wrong with the conversions. So, as Ben pointed out, I believe https://github.com/openshift/origin/pull/16649 fixes the issue.
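One way a conversion problem could produce an empty CIDR is sketched below in Python. It is illustrative only and not taken from the linked pull request. The legacy top-level field names "network" and "hostsubnetlength" are assumptions about what an older reader expects, and legacy_cidr and with_legacy_fields are hypothetical helpers.

```python
# The persisted object from the 'oc get clusternetwork default -o yaml' output.
persisted = {
    "apiVersion": "v1",
    "kind": "ClusterNetwork",
    "clusterNetworks": [{"CIDR": "10.128.0.0/14", "hostSubnetLength": 9}],
    "pluginName": "redhat/openshift-ovs-multitenant",
    "serviceNetwork": "172.30.0.0/16",
}


def legacy_cidr(obj):
    """What a reader of the assumed single-CIDR field would see."""
    return obj.get("network", "")


def with_legacy_fields(obj):
    """Back-fill the assumed legacy fields from the first clusterNetworks entry."""
    out = dict(obj)
    entries = obj.get("clusterNetworks") or []
    if entries and not out.get("network"):
        out["network"] = entries[0]["CIDR"]
        out["hostsubnetlength"] = entries[0]["hostSubnetLength"]
    return out


# Without back-filling, the older reader gets an empty string to parse.
print(repr(legacy_cidr(persisted)))

# With back-filling, the same reader sees the first configured CIDR.
print(legacy_cidr(with_legacy_fields(persisted)))
```

Under these assumptions, the object is valid for a reader that understands the list form, yet yields an empty CIDR for one that only looks at the single field, which would match the symptom in the journal.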
*** This bug has been marked as a duplicate of bug 1502866 ***