Bug 1304582
Summary: | Node or Master will not start when /etc/hosts has 127.0.0.1 equal to hostname | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Ryan Howe <rhowe> |
Component: | Networking | Assignee: | Dan Williams <dcbw> |
Status: | CLOSED ERRATA | QA Contact: | Meng Bo <bmeng> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | | |
Version: | 3.1.0 | CC: | agoldste, aos-bugs, eparis, erich, jokerman, mmccomas, mmcgrath, rteabeault, tdawson |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2016-05-12 16:27:49 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Ryan Howe
2016-02-04 02:50:19 UTC
I'm surprised that this would ever work. Was this supported in the past? If you pass NodeIP in the node config then it should work. But what's the point of having the hostname set to 127.0.0.1? The Linux networking stack is smart enough to handle traffic to the local IP in a faster manner, so there should be no need to set it to 127.0.0.1.

You have a few options:

1) do not alias the hostname to 127.0.0.1/localhost, but allow the node's hostname to resolve to the node's IP address through DNS

2) set NodeName in the configuration to either an IP address or to some name that resolves via DNS/hosts to something other than 127.0.0.1

3) set NodeIP in the configuration to the IP address that the node is accessible by

This is not really related to either of the changes mentioned, it's simply that OpenShift/Kubernetes/openshift-sdn need to know the actual IP address of the node, and that is determined by:

1) if NodeIP is given, always use that

2) if NodeName is given and is a hostname, that is looked up via the glibc resolver and must resolve to something other than 127.0.0.1

3) if NodeName is given and is an IP address, use that

4) if no NodeName is given, and no NodeIP is given, take the machine name from 'uname -n' and look that up via the glibc resolver

(In reply to Dan Williams from comment #3)
> You have a few options:
>
> 1) do not alias the hostname to 127.0.0.1/localhost, but allow the node's
> hostname to resolve to the node's IP address through DNS
>
> 2) set NodeName in the configuration to either an IP address or to some name
> that resolves via DNS/hosts to something other than 127.0.0.1
>
> 3) set NodeIP in the configuration to the IP address that the node is
> accessible by
>
> This is not really related to either of the changes mentioned, it's simply
> that OpenShift/Kubernetes/openshift-sdn need to know the actual IP address
> of the node, and that is determined by:
>
> 1) if NodeIP is given, always use that
>
> 2) if NodeName is given and is a hostname, that is looked up via the glibc
> resolver and must resolve to something other than 127.0.0.1
>
> 3) if NodeName is given and is an IP address, use that
>

What would these options look like when passed to the installer?

> 4) if no NodeName is given, and no NodeIP is given, take the machine name
> from 'uname -n' and look that up via the glibc resolver

(In reply to Eric Rich from comment #4)
> (In reply to Dan Williams from comment #3)
> > You have a few options:
> >
> > 1) do not alias the hostname to 127.0.0.1/localhost, but allow the node's
> > hostname to resolve to the node's IP address through DNS
> >
> > 2) set NodeName in the configuration to either an IP address or to some name
> > that resolves via DNS/hosts to something other than 127.0.0.1
> >
> > 3) set NodeIP in the configuration to the IP address that the node is
> > accessible by
> >
> > This is not really related to either of the changes mentioned, it's simply
> > that OpenShift/Kubernetes/openshift-sdn need to know the actual IP address
> > of the node, and that is determined by:
> >
> > 1) if NodeIP is given, always use that
> >
> > 2) if NodeName is given and is a hostname, that is looked up via the glibc
> > resolver and must resolve to something other than 127.0.0.1
> >
> > 3) if NodeName is given and is an IP address, use that
> >
>
> What would these options look like when passed to the installer?

    # Configure nodeIP in the node config
    # This is needed in cases where node traffic is desired to go over an
    # interface other than the default network interface.
    #openshift_node_set_node_ip=True

For nodeName, I don't think you can override it with the Ansible installer, it will always be set to the system hostname.

But in the end, just don't alias the system hostname to 127.0.0.1. I'd love to know why that was done...

Upstream pull request to grab NodeIP off the default gateway interface (matching kubelet behavior): https://github.com/openshift/openshift-sdn/pull/256

os-sdn merged

Checked on OSE build 3.1.1.905, issue has been fixed.

    [root@ose-master ~]# ping ose-master.bmeng.local
    PING localhost (127.0.0.1) 56(84) bytes of data.
    64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.019 ms
    64 bytes from localhost (127.0.0.1): icmp_seq=2 ttl=64 time=0.032 ms
    ^C
    --- localhost ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 999ms
    rtt min/avg/max/mdev = 0.019/0.025/0.032/0.008 ms
    [root@ose-master ~]# ping -c1 ose-master.bmeng.local
    PING localhost (127.0.0.1) 56(84) bytes of data.
    64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.018 ms
    --- localhost ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 0.018/0.018/0.018/0.000 ms
    [root@ose-master ~]# systemctl restart atomic-openshift-master.service
    [root@ose-master ~]# systemctl status atomic-openshift-master.service
    ● atomic-openshift-master.service - Atomic OpenShift Master
    Loaded: loaded (/usr/lib/systemd/system/atomic-openshift-master.service; enabled; vendor preset: disabled)
    Active: active (running) since Wed 2016-02-24 18:12:43 CST; 6s ago
    Docs: https://github.com/openshift/origin
    Main PID: 2502 (openshift)
    CGroup: /system.slice/atomic-openshift-master.service
    └─2502 /usr/bin/openshift start master --config=/etc/origin/master/master-config.yaml --loglevel=2
    Feb 24 18:12:48 ose-master.bmeng.local atomic-openshift-master[2502]: ua5EspbpY/n75siWeQk/++e1CEYx9JbYW8qk8A7HtgMzyO2k09G1ZgrKJAouEZ4R
    Feb 24 18:12:48 ose-master.bmeng.local atomic-openshift-master[2502]: KLAWiG+T7rDj+AG6HVxxvb/QSF7/9XyV5aNxzCSOVV4UoxQvKB+PziBcykMZZBRh
    Feb 24 18:12:48 ose-master.bmeng.local atomic-openshift-master[2502]: MfZIJJsCgYEAuISl8QZmipueb7w6Bh+4yt8vohy+vcJv9Ydb4oNnfB/mcJmtrjPo
    Feb 24 18:12:48 ose-master.bmeng.local atomic-openshift-master[2502]: q0VOo06zNo1WQfw2YCHL8SdL0WjCzclQ7lcRrYid5hxRpgut4nBt9vCrnY5phMHi
    Feb 24 18:12:48 ose-master.bmeng.local atomic-openshift-master[2502]: Zvx53sPdHVpszBq5FcIPFEU5Ts9kX7GBWRzTapBVZr+z4rLFXglQEAo=
    Feb 24 18:12:48 ose-master.bmeng.local atomic-openshift-master[2502]: -----END RSA PRIVATE KEY-----
    Feb 24 18:12:48 ose-master.bmeng.local atomic-openshift-master[2502]: ValueFrom:<nil>} {Name:OPENSHIFT_MASTER Value:https://ose-master.bmeng.local:8443 Valu....io/ser
    Feb 24 18:12:48 ose-master.bmeng.local atomic-openshift-master[2502]: E0224 18:12:48.057521 2502 factory.go:340] Error scheduling default ipf-ha-1-lvt6r:...ny node
    Feb 24 18:12:48 ose-master.bmeng.local atomic-openshift-master[2502]: fit failure on node (ose-node1.bmeng.local): PodFitsPorts
    Feb 24 18:12:48 ose-master.bmeng.local atomic-openshift-master[2502]: ; retrying
    Hint: Some lines were ellipsized, use -l to show in full.

    [root@ose-node1 ~]# ping -c 1 ose-node1.bmeng.local
    PING localhost (127.0.0.1) 56(84) bytes of data.
    64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.026 ms
    --- localhost ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 0.026/0.026/0.026/0.000 ms
    [root@ose-node1 ~]# systemctl restart atomic-openshift-node.service
    [root@ose-node1 ~]# systemctl status atomic-openshift-node.service
    ● atomic-openshift-node.service - Atomic OpenShift Node
    Loaded: loaded (/usr/lib/systemd/system/atomic-openshift-node.service; disabled; vendor preset: disabled)
    Drop-In: /usr/lib/systemd/system/atomic-openshift-node.service.d
    └─openshift-sdn-ovs.conf
    Active: active (running) since Wed 2016-02-24 18:13:28 CST; 2s ago
    Docs: https://github.com/openshift/origin
    Main PID: 3248 (openshift)
    CGroup: /system.slice/atomic-openshift-node.service
    └─3248 /usr/bin/openshift start node --config=/etc/origin/node/node-config.yaml --loglevel=2
    Feb 24 18:13:28 ose-node1.bmeng.local atomic-openshift-node[3248]: I0224 18:13:28.299975 3248 proxier.go:477] Setting endpoints for "default/kubernetes:d...8.6:53]
    Feb 24 18:13:28 ose-node1.bmeng.local atomic-openshift-node[3248]: I0224 18:13:28.299984 3248 proxier.go:477] Setting endpoints for "default/kubernetes:h...6:8443]
    Feb 24 18:13:28 ose-node1.bmeng.local atomic-openshift-node[3248]: I0224 18:13:28.300001 3248 proxier.go:558] Not syncing iptables until Services and End... master
    Feb 24 18:13:28 ose-node1.bmeng.local atomic-openshift-node[3248]: I0224 18:13:28.300236 3248 proxier.go:414] Adding new service "default/kubernetes:http...443/TCP
    Feb 24 18:13:28 ose-node1.bmeng.local atomic-openshift-node[3248]: I0224 18:13:28.300302 3248 proxier.go:414] Adding new service "default/kubernetes:dns"...:53/UDP
    Feb 24 18:13:28 ose-node1.bmeng.local atomic-openshift-node[3248]: I0224 18:13:28.300325 3248 proxier.go:414] Adding new service "default/kubernetes:dns-...:53/TCP
    Feb 24 18:13:28 ose-node1.bmeng.local atomic-openshift-node[3248]: I0224 18:13:28.300342 3248 proxier.go:414] Adding new service "default/router:80-tcp" ...:80/TCP
    Feb 24 18:13:28 ose-node1.bmeng.local atomic-openshift-node[3248]: I0224 18:13:28.300360 3248 proxier.go:414] Adding new service "default/router:443-tcp"...443/TCP
    Feb 24 18:13:28 ose-node1.bmeng.local atomic-openshift-node[3248]: I0224 18:13:28.300376 3248 proxier.go:414] Adding new service "default/router:1936-tcp...936/TCP
    Feb 24 18:13:28 ose-node1.bmeng.local atomic-openshift-node[3248]: I0224 18:13:28.300397 3248 proxier.go:414] Adding new service "u1p1/ha-service:" at 17...736/TCP
    Hint: Some lines were ellipsized, use -l to show in full.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:1064
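For readers who want the node IP selection order from comment #3 in executable form, here is a minimal Python sketch. It is an illustration only, not the OpenShift or openshift-sdn code (which is written in Go). The function name `pick_node_ip` and its parameters are hypothetical, `socket.gethostbyname` and `socket.gethostname` stand in for the glibc resolver and `uname -n`, and rejecting a loopback result for the machine-name fallback is an assumption drawn from the symptom reported in this bug.

```python
import ipaddress
import socket


def pick_node_ip(node_ip=None, node_name=None,
                 resolve=socket.gethostbyname,
                 machine_name=socket.gethostname):
    """Hypothetical helper mirroring the order described in comment #3."""
    # 1) An explicit NodeIP always wins.
    if node_ip:
        return node_ip

    # 4) With neither NodeIP nor NodeName, fall back to the machine name
    #    (roughly what `uname -n` reports).
    name = node_name or machine_name()

    # 3) A NodeName that is already an IP address is used directly.
    try:
        return str(ipaddress.ip_address(name))
    except ValueError:
        pass  # not an IP literal, so treat it as a hostname

    # 2) Otherwise resolve the name; a loopback answer is unusable.
    #    (Assumption: the same check applies to the fallback in step 4.)
    addr = resolve(name)
    if ipaddress.ip_address(addr).is_loopback:
        raise ValueError(f"{name} resolves to loopback address {addr}")
    return addr


if __name__ == "__main__":
    # Fake resolver imitating an /etc/hosts that aliases the hostname to 127.0.0.1.
    hosts = {"ose-node1.bmeng.local": "127.0.0.1"}
    try:
        pick_node_ip(node_name="ose-node1.bmeng.local", resolve=hosts.__getitem__)
    except ValueError as err:
        print("node would fail to start:", err)
    # Setting NodeIP explicitly sidesteps the lookup entirely.
    print(pick_node_ip(node_ip="10.0.0.5", node_name="ose-node1.bmeng.local"))
```

The resolver and machine-name lookups are passed in as parameters so the order can be exercised without touching the real /etc/hosts or DNS. Note that the sketch reflects the behavior described before the upstream pull request, which instead takes NodeIP from the default gateway interface.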