Bug 1916843 - collect logs from openshift-sdn-controller pod
Summary: collect logs from openshift-sdn-controller pod
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Insights Operator
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.7.0
Assignee: Serhii Zakharov
QA Contact: Pavel Šimovec
URL:
Whiteboard:
Depends On:
Blocks: 1921743
Reported: 2021-01-15 16:34 UTC by Serhii Zakharov
Modified: 2021-02-24 15:54 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Clones: 1921743
Environment:
Last Closed: 2021-02-24 15:53:53 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift insights-operator pull 314 | 0 | None | closed | Bug 1916843: collect logs from openshift-sdn-controller pod | 2021-01-25 06:16:48 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:54:18 UTC

Description Serhii Zakharov 2021-01-15 16:34:06 UTC
sdn-controller (openshift-sdn namespace) emits important messages when it finds issues affecting Egress IPs. The important messages are the following:

    “Node %s is not Ready”: The node has been set offline for egress IPs because the API reports it as not ready.
    “Node %s may be offline... retrying”: An egress node has failed the egress IP health check once, so it is likely to be marked offline soon or, at the very least, there has been a connectivity glitch.
    “Node %s is offline”: An egress node has failed enough probes to be marked offline for egress IPs. If it has egress CIDRs assigned, its egress IPs have been moved to other nodes. This indicates a problem with either the node or the network between the master and the node.
    “Node %s is back online”: The node has recovered from the condition described in the previous message and is passing the egress IP health checks again. This is useful in case earlier “Node %s is offline” messages were lost, because it still shows that a failure occurred.

Because Insights Operator (IO) data is gathered every 2 hours, we want to gather the latest occurrences of these messages from the logs.
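
The filtering described above can be illustrated with a minimal Python sketch. This is not the Insights Operator implementation (the operator itself is written in Go, and its gathering code is not shown in this report). The function name `latest_egress_messages`, the `limit` parameter, and the sample log lines are hypothetical. Only the four message texts come from the description, with `%s` matched as any run of non-whitespace characters.

```python
import re
from collections import deque

# Message texts from the bug description; "%s" (the node) is matched as \S+.
# The regex form is an assumption made for this sketch.
EGRESS_PATTERNS = [
    re.compile(r"Node \S+ is not Ready"),
    re.compile(r"Node \S+ may be offline\.\.\. retrying"),
    re.compile(r"Node \S+ is offline"),
    re.compile(r"Node \S+ is back online"),
]


def latest_egress_messages(log_lines, limit=10):
    """Return the last `limit` log lines that match any egress IP message."""
    recent = deque(maxlen=limit)  # older matches fall off the left end
    for line in log_lines:
        if any(pattern.search(line) for pattern in EGRESS_PATTERNS):
            recent.append(line.rstrip("\n"))
    return list(recent)


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        "I0115 16:00:01 Node 10.0.0.5 may be offline... retrying\n",
        "I0115 16:00:02 unrelated message\n",
        "W0115 16:00:06 Node 10.0.0.5 is offline\n",
        "I0115 16:05:00 Node 10.0.0.5 is back online\n",
    ]
    for entry in latest_egress_messages(sample, limit=2):
        print(entry)
```

Keeping only the most recent matches fits the 2-hour gathering interval: a trailing “is back online” line survives even when the earlier “is offline” line has already been rotated out of the log.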

Comment 5 errata-xmlrpc 2021-02-24 15:53:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

