Bug 1194471
| Summary: | openshift-sdn-node doesn't correctly detect hostname | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Erik M Jacobs <ejacobs> |
| Component: | Networking | Assignee: | Rajat Chopra <rchopra> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 3.0.0 | CC: | anli, bleanhar, dmcphers, jialiu, jokerman, libra-onpremise-devel, mmccomas, rchopra, xtian |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openshift-sdn-0.4-1.git.0.bc3855b.el7ose | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-11-23 14:43:25 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Erik M Jacobs
2015-02-19 22:23:30 UTC
Fixed with https://github.com/openshift/openshift-sdn/pull/25

If /etc/hosts has several names on the public IP entry, e.g.

192.168.133.2 ose3-master.example.com ose3-master

then 'hostname -f' picks up ose3-master.example.com, but 'os.Hostname' in golang picks ose3-master. This triggers a mismatch between the minion name as openshift-master sees it and the name openshift-sdn-node uses.

The pull request above reverts to what the documentation says: openshift-sdn-node will use 'hostname -f' if the -hostname option is not specified.

We'll make sure this gets built today in OSE. For now it can be tested in Origin as soon as the PR is merged.

Verified and pass.
1) The sdn-master can be started with only "-v=4"
[root@master222 sysconfig]# systemctl status openshift-sdn-master
openshift-sdn-master.service - OpenShift SDN Master
Loaded: loaded (/usr/lib/systemd/system/openshift-sdn-master.service; disabled)
Active: active (running) since Thu 2015-02-26 09:39:12 CST; 26min ago
Docs: https://github.com/openshift/openshift-sdn
Main PID: 13472 (openshift-sdn)
CGroup: /system.slice/openshift-sdn-master.service
└─13472 /usr/bin/openshift-sdn -v=4
Feb 26 09:43:00 master222.ose.com.cn openshift-sdn[13472]: I0226 09:43:00.452965 13472 registry.go:269] Issuing a minion event: &{compareAndSwap 0xc208004420 0xc208004540 25 106506 1}
Feb 26 09:43:00 master222.ose.com.cn openshift-sdn[13472]: I0226 09:43:00.457984 13472 registry.go:212] unmarshalling {"Minion":"192.168.0.224","Sub":"10.1.2.0/24"}
2) The sdn-node can connect to the master using the long hostname
osc get nodes
NAME LABELS STATUS
master222.ose.com.cn <none> Ready
node223.ose.com.cn <none> Ready
One more step to add to comment 5: add the following line to /etc/hosts

<IP> master222.ose.com.cn master222

When /etc/hosts has the following lines, this bug reproduces, so re-opening it.

# hostname
jialiu-node1.example.com
# hostname -f
jialiu-node1.example.com
# cat /etc/hosts
10.66.79.112 jialiu-node1 jialiu-node1.example.com

***NOTE:*** When the line is
10.66.79.112 jialiu-node1.example.com jialiu-node1
this issue does not happen.

Sorry, it is not clear how there is a mismatch again. The doc says it will pick 'hostname -f', and that is what it does irrespective of what is in /etc/hosts.

Version: 3.0/2015-05-30.1/
Verify:
There is no openshift-sdn-master service now, so check the openshift-master service instead.
[root@jia-master ~]# systemctl status openshift-master
openshift-master.service - OpenShift Master
Loaded: loaded (/usr/lib/systemd/system/openshift-master.service; enabled)
Active: active (running) since Mon 2015-06-01 17:41:17 CST; 15min ago
Docs: https://github.com/openshift/origin
Main PID: 3681 (openshift)
CGroup: /system.slice/openshift-master.service
└─3681 /usr/bin/openshift start master --config=/etc/openshift/master/master-config.yaml --loglevel=4
Jun 01 17:56:19 jia-master.v3-ose.com openshift-master[3681]: I0601 17:56:19.622990 3681 reflector.go:241] Watch close - *api.Pod tota...ceived
Jun 01 17:56:20 jia-master.v3-ose.com openshift-master[3681]: I0601 17:56:20.118436 3681 reflector.go:241] Watch close - *api.Namespac...ceived
Jun 01 17:56:21 jia-master.v3-ose.com openshift-master[3681]: I0601 17:56:21.760152 3681 nodecontroller.go:279] Nodes ReadyCondition u...Heartb
Jun 01 17:56:21 jia-master.v3-ose.com openshift-master[3681]: vs {Capacity:map[memory:{Amount:3975819264.000 Format:BinarySI} pods:{Amoun...kubele
Jun 01 17:56:26 jia-master.v3-ose.com openshift-master[3681]: I0601 17:56:26.450394 3681 reflector.go:241] Watch close - *api.Namespac...ceived
Jun 01 17:56:26 jia-master.v3-ose.com openshift-master[3681]: I0601 17:56:26.672080 3681 reflector.go:241] Watch close - *api.LimitRan...ceived
Jun 01 17:56:26 jia-master.v3-ose.com openshift-master[3681]: I0601 17:56:26.849378 3681 reflector.go:241] Watch close - *api.ServiceA...ceived
Jun 01 17:56:27 jia-master.v3-ose.com openshift-master[3681]: I0601 17:56:27.047428 3681 reflector.go:241] Watch close - *api.Secret t...ceived
Jun 01 17:56:27 jia-master.v3-ose.com openshift-master[3681]: I0601 17:56:27.247934 3681 reflector.go:241] Watch close - *api.Resource...ceived
Jun 01 17:56:28 jia-master.v3-ose.com openshift-master[3681]: I0601 17:56:28.004772 3681 reflector.go:241] Watch close - *api.ServiceA...ceived
Hint: Some lines were ellipsized, use -l to show in full.
[root@jia-master ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.14.6.125 jia-master.v3-ose.com
10.14.6.116 jia-minion.v3-ose.com
[root@jia-master ~]# osc get nodes
NAME LABELS STATUS
jia-minion.v3-ose.com region=primary,zone=east Ready
Sorry, it seems my steps were not enough.
Adding these steps for this bug:
[root@openshift-v3 training]# hostname
jia-master.v3-ose.com
[root@openshift-v3 training]# hostname -f
jia-master
[root@openshift-v3 training]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.14.6.125 jia-minion jia-minion.v3-ose.com
10.14.6.116 jia-master jia-master.v3-ose.com
[root@openshift-v3 training]#
[root@openshift-v3 training]#
[root@openshift-v3 training]# systemctl status openshift-master
openshift-master.service - OpenShift Master
Loaded: loaded (/usr/lib/systemd/system/openshift-master.service; enabled)
Active: active (running) since Mon 2015-06-01 18:38:42 CST; 1min 21s ago
Docs: https://github.com/openshift/origin
Main PID: 3562 (openshift)
CGroup: /system.slice/openshift-master.service
└─3562 /usr/bin/openshift start master --config=/etc/openshift/master/master-config.yaml --loglevel=4
Jun 01 18:38:44 jia-master.v3-ose.com openshift-master[3562]: I0601 18:38:44.612345 3562 endpoints_controller.go:258] Finished syncing service "default/kubernetes-ro" endpoints. (1.066µs)
Jun 01 18:38:54 jia-master.v3-ose.com openshift-master[3562]: I0601 18:38:54.816009 3562 trace.go:57] Trace "getFromCache" (started 2015-06-01 18:38:54.815757954 +0800 CST):
Jun 01 18:38:54 jia-master.v3-ose.com openshift-master[3562]: [12.887µs] [12.887µs] Raw get done
Jun 01 18:38:54 jia-master.v3-ose.com openshift-master[3562]: [206.035µs] [193.148µs] Deep copied
Jun 01 18:38:54 jia-master.v3-ose.com openshift-master[3562]: [209.15µs] [3.115µs] END
Jun 01 18:39:13 jia-master.v3-ose.com openshift-master[3562]: I0601 18:39:13.170072 3562 endpoints_controller.go:258] Finished syncing service "default/kubernetes" endpoints. (19.228µs)
Jun 01 18:39:13 jia-master.v3-ose.com openshift-master[3562]: I0601 18:39:13.170154 3562 endpoints_controller.go:258] Finished syncing service "default/kubernetes-ro" endpoints. (1.29µs)
Jun 01 18:39:44 jia-master.v3-ose.com openshift-master[3562]: I0601 18:39:44.159335 3562 endpoints_controller.go:258] Finished syncing service "default/kubernetes" endpoints. (13.724µs)
Jun 01 18:39:44 jia-master.v3-ose.com openshift-master[3562]: I0601 18:39:44.159381 3562 endpoints_controller.go:258] Finished syncing service "default/kubernetes-ro" endpoints. (853ns)
Jun 01 18:39:55 jia-master.v3-ose.com openshift-master[3562]: I0601 18:39:55.030710 3562 nodecontroller.go:252] Creating timestamp entry for newly observed Node jia-minion.v3-ose.com
Continuing from comment 10: this bug is specific to the node, not the master.

# hostname
jia-minion.v3-ose.com
# hostname -f
jia-minion
# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.14.6.125 jia-minion jia-minion.v3-ose.com
10.14.6.116 jia-master jia-master.v3-ose.com
# service openshift-node restart
Redirecting to /bin/systemctl restart openshift-node.service
# osc get nodes
NAME LABELS STATUS
jia-minion.v3-ose.com region=primary,zone=west Ready
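
A possible reason the order of names in /etc/hosts matters: in the hosts file format, the first name after the address is the canonical name and the rest are aliases, and 'hostname -f' reports the canonical name the resolver returns for the host name. The Go sketch below models only that rule. It is a simplification and not the real resolver: it assumes /etc/hosts is the only source consulted, and canonicalName is a hypothetical helper.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// canonicalName scans hosts-file content and, for the first entry that
// lists name (as canonical name or alias), returns the first name on
// that line. Simplified model only; not how glibc is implemented.
func canonicalName(hosts, name string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(hosts))
	for sc.Scan() {
		line := sc.Text()
		// Drop trailing comments.
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // blank line or address without names
		}
		for _, f := range fields[1:] {
			if strings.EqualFold(f, name) {
				return fields[1], true
			}
		}
	}
	return "", false
}

func main() {
	shortFirst := "10.14.6.125 jia-minion jia-minion.v3-ose.com\n"
	fqdnFirst := "10.14.6.125 jia-minion.v3-ose.com jia-minion\n"

	a, _ := canonicalName(shortFirst, "jia-minion.v3-ose.com")
	b, _ := canonicalName(fqdnFirst, "jia-minion.v3-ose.com")
	fmt.Println(a) // prints "jia-minion"
	fmt.Println(b) // prints "jia-minion.v3-ose.com"
}
```

Under this model, listing the fully qualified name first on the entry is what makes the long name come back.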