Bug 1997404
| Summary: | SSH'ing in to node does not result in annotation being applied to node | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Andy Bartlett <andbartl> |
| Component: | Documentation | Assignee: | Jessi <jmanthei> |
| Status: | CLOSED DEFERRED | QA Contact: | Rio Liu <rioliu> |
| Severity: | low | Docs Contact: | Latha S <lmurthy> |
| Priority: | low | ||
| Version: | 4.7 | CC: | aos-bugs, dornelas, jkaur, jkyros, mkrejci, pamoedom, rioliu, skumari |
| Target Milestone: | --- | ||
| Target Release: | 4.8.z | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-03-09 01:06:12 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 2010947 | ||
| Bug Blocks: | 1903544 | ||
Description
Andy Bartlett
2021-08-25 07:42:25 UTC
From the must-gather, in pod machine-config-daemon-8h8ck on master-0.ocp4-bare.andytest.lab, it looks like that node might be having some connectivity issues:

2021-08-22T09:45:58.388270376Z E0822 09:45:58.384638 6405 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://172.30.0.1:443/api/v1/nodes?limit=500&resourceVersion=0": dial tcp 172.30.0.1:443: i/o timeout
2021-08-22T09:46:00.523698766Z I0822 09:46:00.523244 6405 daemon.go:381] Node master-0.ocp4-bare.andytest.lab is part of the control plane
2021-08-22T09:46:01.170642137Z I0822 09:46:01.167861 6405 daemon.go:802] Current config: rendered-master-012acc289869be3f8becc00e86aec428
2021-08-22T09:46:01.170642137Z I0822 09:46:01.167889 6405 daemon.go:803] Desired config: rendered-master-754da41cc91abf2c7a0f19bc7e8745cf
2021-08-22T09:46:01.202362117Z I0822 09:46:01.201866 6405 update.go:1943] Disk currentConfig rendered-master-754da41cc91abf2c7a0f19bc7e8745cf overrides node's currentConfig annotation rendered-master-012acc289869be3f8becc00e86aec428
2021-08-22T09:46:01.216451220Z I0822 09:46:01.215039 6405 daemon.go:1085] Validating against pending config rendered-master-754da41cc91abf2c7a0f19bc7e8745cf
2021-08-22T09:46:01.298839675Z I0822 09:46:01.298727 6405 daemon.go:1096] Validated on-disk state
2021-08-22T09:46:01.512403291Z I0822 09:46:01.511434 6405 daemon.go:1151] Completing pending config rendered-master-754da41cc91abf2c7a0f19bc7e8745cf
2021-08-22T09:46:01.621500050Z I0822 09:46:01.621402 6405 update.go:1943] completed update for config rendered-master-754da41cc91abf2c7a0f19bc7e8745cf
2021-08-22T09:46:01.644233131Z I0822 09:46:01.642890 6405 daemon.go:1167] In desired config rendered-master-754da41cc91abf2c7a0f19bc7e8745cf
2021-08-22T09:57:47.016779861Z W0822 09:57:47.016703 6405 reflector.go:436] k8s.io/client-go/informers/factory.go:134: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
2021-08-22T09:57:47.016971691Z W0822 09:57:47.016720 6405 reflector.go:436] github.com/openshift/machine-config-operator/pkg/generated/informers/externalversions/factory.go:101: watch of *v1.MachineConfig ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding

If the MCD doesn't have a client connection to reach the node object, that would prevent the SSHAccessed annotation from being set on the node object.

Looking at the MCD pods on the other nodes, they appear to be reporting connectivity errors too.

In the host service logs, I see some OVS activity. Stop times are:

Aug 22 09:42:00.108073 master-0.ocp4-bare.andytest.lab systemd[1]: Stopped Open vSwitch Forwarding Unit. (this is before the error)
Aug 22 09:50:53.934653 master-1.ocp4-bare.andytest.lab systemd[1]: Stopped Open vSwitch Forwarding Unit.
Aug 22 09:57:14.952266 master-2.ocp4-bare.andytest.lab systemd[1]: Stopped Open vSwitch Forwarding Unit.

I also see:

Aug 22 09:44:33.312315 master-0.ocp4-bare.andytest.lab systemd[1]: ovs-configuration.service: Succeeded.
Aug 22 09:44:33.313290 master-0.ocp4-bare.andytest.lab systemd[1]: Started Configures OVS with proper host networking configuration.
Aug 22 09:44:33.313877 master-0.ocp4-bare.andytest.lab systemd[1]: ovs-configuration.service: Consumed 262ms CPU time

Was this just a clean cluster build, or were other things done to it? Were you by chance testing or doing anything that would have affected connectivity before this occurred?

*** Bug 1842603 has been marked as a duplicate of this bug. ***

https://github.com/openshift/openshift-docs/pull/54465
Starting deprecation notice.

OpenShift has moved to Jira for its defect tracking! This bug can now be found in the OCPBUGS project in Jira.

https://issues.redhat.com/browse/OCPBUGS-8958
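
For anyone triaging a similar must-gather, the connectivity check described in the description can be approximated with a small script. The following is only a sketch, not part of the MCO or any Red Hat tooling: the function name `find_connectivity_errors`, the regular expression, and the list of hint substrings are all assumptions derived from the two log lines quoted in this bug, and other clusters may log different messages.

```python
import re

# Header layout as seen in the MCD excerpts in this bug (an assumption, not a spec):
# <RFC3339 timestamp> <level><MMDD> <HH:MM:SS.micro> <pid> <file>.go:<line>] <message>
KLOG_LINE = re.compile(
    r"^(?P<ts>\S+Z)\s+"
    r"(?P<level>[IWEF])\d{4}\s+\d{2}:\d{2}:\d{2}\.\d+\s+"
    r"\d+\s+(?P<src>[\w.\-]+\.go):\d+\]\s+(?P<msg>.*)$"
)

# Substrings taken from the errors quoted in this bug; not an exhaustive list.
CONNECTIVITY_HINTS = ("i/o timeout", "client connection lost", "Failed to watch")

# Three lines copied from the description, used as sample input.
SAMPLE = """\
2021-08-22T09:45:58.388270376Z E0822 09:45:58.384638 6405 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://172.30.0.1:443/api/v1/nodes?limit=500&resourceVersion=0": dial tcp 172.30.0.1:443: i/o timeout
2021-08-22T09:46:00.523698766Z I0822 09:46:00.523244 6405 daemon.go:381] Node master-0.ocp4-bare.andytest.lab is part of the control plane
2021-08-22T09:57:47.016779861Z W0822 09:57:47.016703 6405 reflector.go:436] k8s.io/client-go/informers/factory.go:134: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
"""


def find_connectivity_errors(log_text):
    """Return (timestamp, level, source_file, message) tuples for warning or
    error lines whose message contains one of CONNECTIVITY_HINTS."""
    hits = []
    for raw in log_text.splitlines():
        match = KLOG_LINE.match(raw.strip())
        # Skip lines that do not parse, and info/fatal lines.
        if match is None or match.group("level") not in ("W", "E"):
            continue
        message = match.group("msg")
        if any(hint in message for hint in CONNECTIVITY_HINTS):
            hits.append(
                (match.group("ts"), match.group("level"), match.group("src"), message)
            )
    return hits


if __name__ == "__main__":
    for ts, level, src, _ in find_connectivity_errors(SAMPLE):
        print(ts, level, src)
```

Running the same function over the MCD log from each node gives the times at which the daemon lost its API connection, which can then be compared against the Open vSwitch stop times in the host service logs.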