Bug 1878688 - [updates] 0.08% packet loss OSP16 Z updates after l3 connectivity check
Summary: [updates] 0.08% packet loss OSP16 Z updates after l3 connectivity check
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openvswitch
Version: 16.0 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: z3
Target Release: 16.0 (Train on RHEL 8.1)
Assignee: Sofer Athlan-Guyot
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-14 10:56 UTC by Ronnie Rasouli
Modified: 2020-10-05 12:47 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-05 12:47:50 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1895829 0 None None None 2020-09-16 12:36:05 UTC
OpenStack gerrit 752232 0 None MERGED Block on fip creation before starting the ping test. 2020-10-11 13:04:43 UTC

Description Ronnie Rasouli 2020-09-14 10:56:08 UTC
Description of problem:

Testing the Z update of OSP16 GA (IPv4, IPv6 composable) fails at the connectivity check that runs after stopping the l3 agent:

STDOUT:

12718 packets transmitted, 12709 received, 0.0707658% packet loss, time 13331ms
rtt min/avg/max/mdev = 0.394/0.897/35.319/0.773 ms
Ping loss higher than 0 seconds detected (9 seconds)


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Deploy RHOS16 GA
2. Update the undercloud
3. Update the overcloud
4. Run the ping test after the l3 agent stop

Actual results:

0.0707658% packet loss 
Expected results:
no packet loss

Additional info:

Comment 2 Sofer Athlan-Guyot 2020-09-14 11:55:31 UTC
Hi,

There is no special treatment for openvswitch in OSP16.0, so it gets updated along with everything else, and that almost certainly creates the small cut we see here:

95205:2020-09-11 22:55:07 | TASK [Update all packages] *****************************************************                                                                                                                                                                                                                  
95207:2020-09-11 22:55:07 | changed: [controller-2] => {"changed": true, "msg": "", "rc": 0, "results": ["Installed: iwl2030-firmware-18.168.6.1-95.el8_1.1.noarch", "Installed: ansible-pacemaker-1.0.4-0.20200324121423.5847167.el8ost.noarch", "Installed: iwl3945-firmware-15.32.2.9-95.el8_1.1.noarch", "Installed: iwl51
50-firmware-8.24.2.2-95.el8_1.1.noarch", "Installed: gnutls-utils-3.6.8-9.el8_1.x86_64", "Installed: gnutls-dane-3.6.8-9.el8_1.x86_64", "Installed: bind-export-libs-32:9.11.4-26.P2.el8_1.3.x86_64", "Installed: python3-neutronclient-6.14.0-0.20200221162537.115f60f.el8ost.noarch", "Installed: gnutls-3.6.8-9.el8_1.x86_6
4", "Installed: python3-novaclient-1:15.1.0-0.20200225115308.cd396b8.el8ost.noarch", "Installed: cloud-init-18.5-7.el8_1.2.noarch", "Installed: libvirt-daemon-driver-storage-scsi-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64", "Installed: libvirt-client-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64", "Installed: system
d-udev-239-18.el8_1.7.x86_64", "Installed: rpm-build-4.14.2-26.el8_1.x86_64", "Installed: certmonger-0.79.7-6.el8_1.x86_64", "Installed: systemd-libs-239-18.el8_1.7.x86_64", "Installed: systemd-pam-239-18.el8_1.7.x86_64", "Installed: netcf-libs-0.2.8-12.module+el8.1.1+5309+6d656f05.x86_64", "Installed: systemd-contai
ner-239-18.el8_1.7.x86_64", "Installed: libvirt-daemon-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64", "Installed: python3-openstackclient-4.0.0-0.20200221163859.aa64eb6.el8ost.noarch", "Installed: python3-openstacksdk-0.36.2-0.20200319144116.211bdc6.el8ost.noarch", "Installed: slirp4netns-0.4.2-3.git21fdece.module+el
8.1.1+5657+524a77d7.x86_64", "Installed: python3-os-client-config-1.33.0-0.20200225125413.d0eea17.el8ost.noarch", "Installed: systemd-239-18.el8_1.7.x86_64", "Installed: gdb-headless-8.2-6.el8_0.x86_64", "Installed: python3-os-service-types-1.7.0-0.20200225085439.0b2f473.el8ost.noarch", "Installed: microcode_ctl-4:20
190618-1.20200609.1.el8_1.x86_64", "Installed: libvirt-daemon-driver-interface-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64", "Installed: libvirt-daemon-config-nwfilter-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64", "Installed: libvirt-daemon-driver-storage-gluster-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64", "Inst
alled: dib-utils-0.0.11-0.20200224215429.51661c3.el8ost.noarch", "Installed: python3-osc-lib-1.14.1-0.20200221153720.a0d9746.el8ost.noarch", "Installed: libvirt-daemon-driver-qemu-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64", "Installed: python3-oslo-concurrency-3.30.0-0.20200225090342.610df38.el8ost.noarch", "Insta
lled: python3-oslo-config-2:6.11.2-0.20200221150642.22c286c.el8ost.noarch", "Installed: libiscsi-1.18.0-8.module+el8.1.1+5309+6d656f05.x86_64", "Installed: python3-oslo-context-2.23.0-0.20200225215428.07f068d.el8ost.noarch", "Installed: python3-oslo-i18n-3.24.0-0.20200221144708.91b39bb.el8ost.noarch", "Installed: pyt
hon3-oslo-log-3.44.1-0.20200225215430.3ff497d.el8ost.noarch", "Installed: python3-oslo-messaging-10.2.0-0.20200225104634.b7e9faf.el8ost.noarch", "Installed: python3-oslo-middleware-3.38.1-0.20200225095528.9bae80e.el8ost.noarch", "Installed: libvirt-bash-completion-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64", "Insta
lled: python3-oslo-serialization-2.29.2-0.20200225092603.fa399b6.el8ost.noarch", "Installed: python3-oslo-service-1.40.2-0.20200225103516.a7621c8.el8ost.noarch", "Installed: libnghttp2-1.33.0-3.el8_1.1.x86_64", "Installed: python3-oslo-utils-3.41.5-0.20200310140157.85cd57d.el8ost.noarch", "Installed: grub2-tools-efi-
1:2.02-87.el8_1.x86_64", "Installed: libvirt-5.6.0-10.module+el8.1.1+5309+6d656f05.x86_64", "Installed: python3-osprofiler-2.8.2-0.20200221151756.d431c7a.el8ost.noarch", "Installed: grub2-efi-x64-1:2.02-87.el8_1.x86_64", "Installed: kernel-tools-4.18.0-147.24.2.el8_1.x86_64", "Installed: grub2-pc-1:2.02-87.el8_1.x86_
64", "Installed: grub2-pc-modules-1:2.02-87.el8_1.noarch", "Installed: grub2-tools-1:2.02-87.el8_1.x86_64", "Installed: openvswitch-selinux-extra-policy-1.0-22.el8fdp.noarch", "Installed: kernel-tools-libs-4.18.0-147.24.2.el8_1.x86_64", "Installed: python3-paunch-5.3.2-0.20200320172310.ebc49c4.el8ost.noarch", 


We can see that openvswitch is updated during the usual yum upgrade task.

The versions are updated from:


openvswitch2.11-2.11.0-35.el8
openvswitch2.11-2.11.3-60.el8

and 

openvswitch-2.11-0.6.el8
openvswitch-2.11-0.5.el8

To confirm that this is what causes the issue, we could launch a test run with the backports from
https://bugzilla.redhat.com/show_bug.cgi?id=1858745 and https://bugzilla.redhat.com/show_bug.cgi?id=1863024
and verify that it's the same issue.

Comment 6 Sofer Athlan-Guyot 2020-09-14 16:55:33 UTC
Lowering the priority, as we're talking about a cut of less than 10 seconds over all the update run tasks.  We really have to ask ourselves whether that's already good enough before working on avoiding the reload of ovs.

Comment 7 Sofer Athlan-Guyot 2020-09-15 17:42:08 UTC
Hi, 

Lowering the priority again.

I've run the test with all four patches and it didn't help.  We got the special openvswitch handling but still see the small cut.

The good news is that, looking more closely at the ping test, there doesn't seem to be any issue during the openvswitch update.

This appears to be a *timing* issue between the start of the instance, the availability of its fip, and the start of the fip ping test:

See the start of that ping test:

PING 10.0.0.211 (10.0.0.211) 56(84) bytes of data.
[1600116667.740145] 64 bytes from 10.0.0.211: icmp_seq=9 ttl=63 time=2.91 ms

we start at icmp_seq 9, we have lost the first 8 packets.
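
The same check can be automated. The following is a minimal Python sketch, not part of the test tooling: the function name find_seq_gaps is hypothetical, and it assumes Linux iputils-style output with an icmp_seq= field and no sequence wrap-around. It reports every range of missing sequence numbers in a ping log:

```python
import re

# Matches the icmp_seq field of an iputils ping reply line.
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def find_seq_gaps(lines, first_expected=1):
    """Return (first_missing, last_missing) ranges of absent icmp_seq values."""
    gaps = []
    expected = first_expected
    for line in lines:
        match = SEQ_RE.search(line)
        if match is None:
            continue  # header, statistics, or blank line
        seq = int(match.group(1))
        if seq > expected:
            # Everything from `expected` up to `seq - 1` never got a reply.
            gaps.append((expected, seq - 1))
        expected = max(expected, seq + 1)
    return gaps


sample = [
    "PING 10.0.0.211 (10.0.0.211) 56(84) bytes of data.",
    "[1600116667.740145] 64 bytes from 10.0.0.211: icmp_seq=9 ttl=63 time=2.91 ms",
]
print(find_seq_gaps(sample))  # [(1, 8)]
```

A single gap at the very start of the log, as here, points at the test starting too early rather than at a cut during the update.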

There are no other cuts in the file:

grep -v '64 bytes from' ping_results_202009142050.log
PING 10.0.0.211 (10.0.0.211) 56(84) bytes of data.

--- 10.0.0.211 ping statistics ---
9195 packets transmitted, 9187 received, 0.0870038% packet loss, time 9717ms
rtt min/avg/max/mdev = 0.371/0.864/36.029/0.708 ms
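
As a cross-check, the 8 packets lost at the start account for the whole reported loss. This is a small Python sketch of the arithmetic (the helper name loss_percent is hypothetical), using only the counters printed by ping above and in the description:

```python
def loss_percent(transmitted, received):
    """Packet loss as a percentage of transmitted packets."""
    return (transmitted - received) / transmitted * 100


# This run: 9195 - 9187 = 8 packets lost.
print(f"{loss_percent(9195, 9187):.6g}%")
# Run from the description: 12718 - 12709 = 9 packets lost.
print(f"{loss_percent(12718, 12709):.6g}%")
```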

I still need to verify all the jobs, but given that they show a similar cut time, I expect them to have the same issue.

The solution is then to wait for the fip to be available before starting the ping logging.

So it will be in the tripleo-upgrade role.  I'll confirm asap.
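
To illustrate the idea of blocking until the fip answers, here is a minimal Python sketch. It is not the change that went into the tripleo-upgrade role; the names ping_once and wait_for_fip and the default timeout and interval are assumptions made for the example, and it relies on the Linux ping options -c (count) and -W (reply timeout in seconds):

```python
import subprocess
import time


def ping_once(host):
    """Send one ICMP echo request; True if a reply arrived within 1 second."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for_fip(host, timeout=120.0, interval=1.0, probe=ping_once, sleep=time.sleep):
    """Block until probe(host) succeeds; raise TimeoutError after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        if probe(host):
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{host} not reachable after {timeout} seconds")
        sleep(interval)
```

Calling wait_for_fip("10.0.0.211") before launching the long-running ping would keep the initial unanswered packets out of the logged statistics. The probe and sleep parameters exist only so the loop can be exercised without sending real traffic.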

