Bug 1851338
| Summary: | Tests are failing due to constant etcd leader elections changes | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Maysa Macedo <mdemaced> |
| Component: | Networking | Assignee: | Maysa Macedo <mdemaced> |
| Networking sub component: | kuryr | QA Contact: | GenadiC <gcheresh> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | urgent | CC: | cdaley, gcheresh, ltomasbo, openshift-bugzilla-robot, rlobillo |
| Version: | 4.5 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.3.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1849540 | Environment: | |
| Last Closed: | 2020-07-14 16:11:52 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1849540 | | |
| Bug Blocks: | | | |
| Attachments: | | | |
Comment 3
Jan Safranek
2020-07-02 07:59:59 UTC
Created attachment 1700134 [details]
NP test results
Created attachment 1700135 [details]
ETCD metrics during test execution
Verified on OCP4.3.0-0.nightly-2020-07-06-074036 with OSP16.1 (RHOS-16.1-RHEL-8-20200701.n.0) with OVN.

The ingress rules to etcd are split into two single-port rules instead of one rule covering a port range:

(shiftstack) [stack@undercloud-0 ~]$ openstack security group show ostest-h5nsm-master | grep 10.196.0.0 | grep -e 2379 -e 2380
| | created_at='2020-07-06T14:19:41Z', direction='ingress', ethertype='IPv4', id='45689162-6486-4a62-988e-7fc75f3b9178', port_range_max='2379', port_range_min='2379', protocol='tcp', remote_ip_prefix='10.196.0.0/16', updated_at='2020-07-06T14:19:41Z' |
| | created_at='2020-07-06T14:19:41Z', direction='ingress', ethertype='IPv4', id='b7230eda-b467-4ea7-8b1e-1aa48fae8818', port_range_max='2380', port_range_min='2380', protocol='tcp', remote_ip_prefix='10.196.0.0/16', updated_at='2020-07-06T14:19:41Z' |

NP tests were run with parallelism set to 2 and gave the expected results.

No etcd leader change was observed during test execution (on 2020-07-06 from 17:00 onwards):

(overcloud) [stack@undercloud-0 ~]$ for i in $(oc get pods -n openshift-etcd -l k8s-app=etcd -o NAME); do echo "# $i"; oc logs $i -n openshift-etcd -c etcd-member |grep 'became leader'; done
# pod/etcd-member-ostest-h5nsm-master-0
2020-07-06 14:17:22.082454 I | raft: 7e92ed1f2b132c63 became leader at term 8
# pod/etcd-member-ostest-h5nsm-master-1
# pod/etcd-member-ostest-h5nsm-master-2

No timeouts on port 2380 during test execution:

(overcloud) [stack@undercloud-0 ~]$ for i in $(oc get pods -n openshift-etcd -l k8s-app=etcd -o NAME); do echo "# $i"; oc logs $i -n openshift-etcd -c etcd-member |grep 'timeout'; done
# pod/etcd-member-ostest-h5nsm-master-0
# pod/etcd-member-ostest-h5nsm-master-1
# pod/etcd-member-ostest-h5nsm-master-2

Furthermore, etcd metrics show stable behaviour during the same period (attached).

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2872
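
The leader-change check in the verification comment relies on reading the grep output by eye and comparing timestamps against the start of the test run. A minimal sketch of how that comparison could be automated is below. The function name `count_leader_changes` and its cutoff argument are hypothetical and not part of `oc` or etcd, and it assumes the log line layout shown in the comment (date, time with fractional seconds, then the message).

```shell
# Hypothetical helper: count etcd 'became leader' log lines at or after a cutoff.
# Reads etcd-member log text on stdin; the cutoff has the form "YYYY-MM-DD HH:MM:SS".
# Assumes each log line starts with "<date> <time.fraction>", as in the output above.
count_leader_changes() {
    cutoff="$1"
    awk -v cutoff="$cutoff" '
        /became leader/ {
            # Rebuild the timestamp without fractional seconds so it sorts
            # lexicographically against the cutoff string.
            ts = $1 " " substr($2, 1, 8)
            if (ts >= cutoff) n++
        }
        END { print n + 0 }
    '
}

# Sample run using the single leader election line quoted in the comment.
# The election at 14:17:22 predates the 17:00 test start, so it is not counted.
printf '%s\n' \
    '2020-07-06 14:17:22.082454 I | raft: 7e92ed1f2b132c63 became leader at term 8' \
    | count_leader_changes '2020-07-06 17:00:00'
```

Against a live cluster, the same function could take the output of `oc logs $i -n openshift-etcd -c etcd-member` on stdin inside the loop used above. A non-zero count for any pod would indicate a leader change during the test window.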