Bug 1852358

Summary: OSP16 | Ceph-RGW-Manila | OCP 4.5 installation fail due to OSP API connectivity | OVN none-DVR | geneve vlan support
Product: Red Hat OpenStack Reporter: Udi Shkalim <ushkalim>
Component: python-networking-ovn Assignee: Assaf Muller <amuller>
Status: CLOSED NOTABUG QA Contact: Eran Kuris <ekuris>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 16.0 (Train) CC: apevec, gcheresh, jdurgin, lhh, majopela, m.andre, pgrist, scohen, srangach, tbarron
Target Milestone: --- Keywords: TestBlockerForLayeredProduct
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-07-02 07:12:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Udi Shkalim 2020-06-30 08:59:39 UTC
Description of problem:
The OpenShift bootstrap node fails to fetch its Ignition data:
[  859.670841] ignition[776]: GET error: Get https://10.8.100.190:13292//v2/images/8b249e5a-60f9-4f5f-b238-cfb82ab62b16/file: dial tcp 10.8.100.190:13292: i/o timeout
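
The timeout above is a plain TCP connect failure, so it can be reproduced from any host or VM without running Ignition. A minimal Python sketch, assuming only the endpoint address and port from the log line; the helper name can_connect and the 5-second default timeout are illustrative and not part of any installer tooling:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False
```

Calling can_connect("10.8.100.190", 13292) from the affected network would be expected to return False while the problem is present.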


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Install OSP16 with Ceph-rgw and Manila
2. Run openshift installation
3.

Actual results:
OCP installation fails

Expected results:
OCP installation passes

Additional info:

Comment 4 Udi Shkalim 2020-07-01 08:29:19 UTC
Setup is OSP16 with OVN, geneve and VLAN support, non-DVR.
The network configuration was verified in previous installations.
OVN works with shiftstack installations.
Floating IP connectivity from the Red Hat network is not working (in or out).
Floating IP connectivity from the hypervisor is working.
The new network changes add VLAN support and geneve for the Manila network.
A bridge mapping for the VLAN-based network was added manually on the compute node (sudo ovs-vsctl set open . external_ids:ovn-bridge-mappings=tenant:br-isolated).

Comment 5 Martin André 2020-07-01 09:13:04 UTC
This seems to be unrelated to https://bugzilla.redhat.com/show_bug.cgi?id=1836963 after all.

I provisioned two VMs in this environment, one with a FIP and one without. From the VM without a FIP (session below), I can ping floating IPs (10.8.100.199) but cannot reach the OpenStack endpoint at 10.8.100.190 or IP addresses on the internet (8.8.8.8).

(shiftstack) [stack@undercloud-0 ~]$ openstack server list | grep mandre-test
| 06dabe4d-8b4d-41d3-a19e-e75f5a8deb26 | mandre-test-nofip           | ACTIVE | wj45ios630a-m4vr7-openshift=192.168.3.93                | cirros-0.4.0-x86_64     |        |
| 92b50b12-039e-42b3-ab00-498dfbbac304 | mandre-test                 | ACTIVE | wj45ios630a-m4vr7-openshift=192.168.3.130, 10.8.100.196 | cirros-0.4.0-x86_64     |        |


(shiftstack) [stack@undercloud-0 ~]$ ssh -J cirros.100.196 cirros.3.93
Warning: Permanently added '10.8.100.196' (ECDSA) to the list of known hosts.
cirros.100.196's password: 
Warning: Permanently added '192.168.3.93' (ECDSA) to the list of known hosts.
cirros.3.93's password: 
$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
^C
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
$ ping 10.8.100.190
PING 10.8.100.190 (10.8.100.190): 56 data bytes
^C
--- 10.8.100.190 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
$ ping 10.8.100.199
PING 10.8.100.199 (10.8.100.199): 56 data bytes
64 bytes from 10.8.100.199: seq=0 ttl=62 time=2.011 ms
64 bytes from 10.8.100.199: seq=1 ttl=62 time=1.133 ms
64 bytes from 10.8.100.199: seq=2 ttl=62 time=0.725 ms
^C
--- 10.8.100.199 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.725/1.289/2.011 ms

Comment 6 Tom Barron 2020-07-01 16:29:49 UTC
(In reply to Martin André from comment #5)

The external_subnet for the "nova" provider network lacked a gateway_ip, and its allocation pool (used for floating IPs) overlapped with the pool used for br-ex and the public endpoints' VIP on the controller nodes.

This is a deployment issue, not a bug.
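
The kind of overlap described above can be checked before deployment with a few lines of Python. This is only a sketch: the report does not give the actual pool boundaries, so the two ranges below are hypothetical placeholders, and only the VIP 10.8.100.190 comes from this bug. The function names are illustrative as well.

```python
from ipaddress import IPv4Address


def ranges_overlap(a_start, a_end, b_start, b_end):
    """True if the inclusive ranges [a_start, a_end] and [b_start, b_end] share an address."""
    a0, a1 = IPv4Address(a_start), IPv4Address(a_end)
    b0, b1 = IPv4Address(b_start), IPv4Address(b_end)
    return a0 <= b1 and b0 <= a1


def reserved_in_pool(pool_start, pool_end, reserved):
    """Return the reserved addresses that fall inside the inclusive pool range."""
    lo, hi = IPv4Address(pool_start), IPv4Address(pool_end)
    return [ip for ip in reserved if lo <= IPv4Address(ip) <= hi]


# Hypothetical ranges for illustration only; not taken from the report.
FIP_POOL = ("10.8.100.180", "10.8.100.220")    # floating IP allocation pool
INFRA_POOL = ("10.8.100.150", "10.8.100.195")  # pool used for br-ex / controller VIPs
PUBLIC_VIP = "10.8.100.190"                    # public endpoint VIP from this report

if __name__ == "__main__":
    print("pools overlap:", ranges_overlap(*FIP_POOL, *INFRA_POOL))
    print("reserved addresses inside FIP pool:",
          reserved_in_pool(*FIP_POOL, [PUBLIC_VIP]))
```

A non-empty result from reserved_in_pool means Neutron could hand out an address that is already in use by the overcloud, which matches the symptom of the public endpoint becoming unreachable.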