Bug 1851338
Summary: | Tests are failing due to constant etcd leader elections changes | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Maysa Macedo <mdemaced>
Component: | Networking | Assignee: | Maysa Macedo <mdemaced>
Networking sub component: | kuryr | QA Contact: | GenadiC <gcheresh>
Status: | CLOSED ERRATA | Docs Contact: |
Severity: | high | |
Priority: | urgent | CC: | cdaley, gcheresh, ltomasbo, openshift-bugzilla-robot, rlobillo
Version: | 4.5 | |
Target Milestone: | --- | |
Target Release: | 4.3.z | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | No Doc Update
Doc Text: | | Story Points: | ---
Clone Of: | 1849540 | Environment: |
Last Closed: | 2020-07-14 16:11:52 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1849540 | |
Bug Blocks: | | |
Attachments: | | |
Comment 3
Jan Safranek
2020-07-02 07:59:59 UTC
Created attachment 1700134 [details]
NP test results
Created attachment 1700135 [details]
ETCD metrics during test execution
Verified on OCP4.3.0-0.nightly-2020-07-06-074036 with OSP16.1 (RHOS-16.1-RHEL-8-20200701.n.0) with OVN.

The ingress rules to etcd are now split into two separate rules, one per port, instead of a single port range:

    (shiftstack) [stack@undercloud-0 ~]$ openstack security group show ostest-h5nsm-master | grep 10.196.0.0 | grep -e 2379 -e 2380
    | | created_at='2020-07-06T14:19:41Z', direction='ingress', ethertype='IPv4', id='45689162-6486-4a62-988e-7fc75f3b9178', port_range_max='2379', port_range_min='2379', protocol='tcp', remote_ip_prefix='10.196.0.0/16', updated_at='2020-07-06T14:19:41Z' |
    | | created_at='2020-07-06T14:19:41Z', direction='ingress', ethertype='IPv4', id='b7230eda-b467-4ea7-8b1e-1aa48fae8818', port_range_max='2380', port_range_min='2380', protocol='tcp', remote_ip_prefix='10.196.0.0/16', updated_at='2020-07-06T14:19:41Z' |

NP tests were run with parallelism set to 2 and gave the expected results.

No etcd leader change was observed during test execution (on 2020-07-06 from 17:00 onwards):

    (overcloud) [stack@undercloud-0 ~]$ for i in $(oc get pods -n openshift-etcd -l k8s-app=etcd -o NAME); do echo "# $i"; oc logs $i -n openshift-etcd -c etcd-member |grep 'became leader'; done
    # pod/etcd-member-ostest-h5nsm-master-0
    2020-07-06 14:17:22.082454 I | raft: 7e92ed1f2b132c63 became leader at term 8
    # pod/etcd-member-ostest-h5nsm-master-1
    # pod/etcd-member-ostest-h5nsm-master-2

No timeouts on port 2380 during test execution:

    (overcloud) [stack@undercloud-0 ~]$ for i in $(oc get pods -n openshift-etcd -l k8s-app=etcd -o NAME); do echo "# $i"; oc logs $i -n openshift-etcd -c etcd-member |grep 'timeout'; done
    # pod/etcd-member-ostest-h5nsm-master-0
    # pod/etcd-member-ostest-h5nsm-master-1
    # pod/etcd-member-ostest-h5nsm-master-2

Furthermore, etcd metrics show stable behaviour over the same period (attached).

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2872
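
The leader-change check in the verification comment greps the whole pod log, so the election at 14:17:22 shows up even though it predates the test window that started at 17:00. A minimal sketch of how the same check could be limited to a time window follows. The helper name `count_leader_changes` is hypothetical and not part of the bug report, and the sketch assumes each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp in the same timezone as the cutoff, as in the log line quoted in that comment.

```shell
# Hypothetical helper (not from the bug report): reads etcd member log text
# on stdin and prints how many "became leader" lines carry a timestamp at or
# after the cutoff given as "YYYY-MM-DD HH:MM:SS".
count_leader_changes() {
  awk -v since="$1" '
    /became leader/ {
      ts = $1 " " $2                   # date and time fields of the log line
      if ((ts "") >= (since "")) n++   # forced string comparison
    }
    END { print n + 0 }                # prints 0 when nothing matched
  '
}

# Usage sketch against a live cluster, reusing the pod selector and container
# name from the verification comment (requires oc and cluster access):
#   for i in $(oc get pods -n openshift-etcd -l k8s-app=etcd -o NAME); do
#     echo "# $i: $(oc logs "$i" -n openshift-etcd -c etcd-member | count_leader_changes '2020-07-06 17:00:00')"
#   done
```

A count of 0 for every pod would correspond to the "no leader change during test execution" result reported in the verification comment.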