Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1974061

Summary: [OVN] Metadata issues on SR-IOV environment
Product: Red Hat OpenStack
Reporter: Roman Safronov <rsafrono>
Component: python-networking-ovn
Assignee: Jakub Libosvar <jlibosva>
Status: CLOSED CURRENTRELEASE
QA Contact: Eran Kuris <ekuris>
Severity: high
Priority: high
Version: 16.1 (Train)
Target Release: 16.2 (Train on RHEL 8.4)
Target Milestone: beta
Keywords: Regression, TestOnly, Triaged
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
CC: apevec, eolivare, ihrachys, jlibosva, lhh, lmadsen, lmartins, majopela, nusiddiq, scohen, spower, supadhya
Clones: 1974062 1986484 (view as bug list)
Type: Bug
Last Closed: 2022-02-22 13:53:15 UTC
Bug Depends On: 1974062
Bug Blocks: 1986484

Description Roman Safronov 2021-06-20 09:07:15 UTC
Description of problem:
The instance is not able to retrieve metadata on creation, so SSH to the instance does not work.

Version-Release number of selected component (if applicable):
RHOS-16.1-RHEL-8-20210604.n.0

How reproducible:
Happens very often, mainly on SR-IOV environments.

Steps to Reproduce:
1. Deploy an SR-IOV environment and make sure that an external network exists (a command-level sketch of steps 2-7 follows this list).
2. Create a security group that allows ICMP and SSH, and a keypair.
3. Create a new network, create a router, and connect the internal network to the external one through the router.
4. Launch a VM connected to the internal network.
5. Create a FIP and attach it to the VM's port.
6. Try to ping the VM's FIP.
Result: ping works - OK
7. Try to SSH to the VM.
Result: access over SSH fails - NOK (BUG)
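
A minimal command-level sketch of steps 2-7, reusing the resource names that appear later in this report (overcloud_sg, test-key, rhel-flavor, rhel-8, internal_A); the subnet range, router name, external network, and SSH user are assumptions or placeholders:

openstack security group create overcloud_sg
openstack security group rule create --protocol icmp overcloud_sg
openstack security group rule create --protocol tcp --dst-port 22 overcloud_sg
openstack keypair create --public-key ~/.ssh/id_rsa.pub test-key

openstack network create internal_A
openstack subnet create --network internal_A --subnet-range 192.168.2.0/24 internal_A_subnet
openstack router create router_A
openstack router set --external-gateway <external-net> router_A
openstack router add subnet router_A internal_A_subnet

openstack server create --flavor rhel-flavor --image rhel-8 \
    --security-group overcloud_sg --key-name test-key \
    --nic net-id=internal_A vm1
openstack floating ip create <external-net>
openstack server add floating ip vm1 <FIP>

ping <FIP>               # step 6: works
ssh cloud-user@<FIP>     # step 7: fails because no key was injected via metadata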

Actual results:
The metadata service is not accessible from the VM, so the SSH key cannot be obtained. It is not possible to connect to the VM using SSH.

Expected results:
The metadata service is accessible from the VM and it is possible to connect to the VM using SSH.

Additional info:

Run 'openstack console log show <VM UUID>'.
The console log shows that the VM is not able to reach the metadata service:
   35.102236] cloud-init[797]: 2021-06-18 18:11:07,494 - util.py[WARNING]: No active metadata service found
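
If the guest console is otherwise accessible, the same failure can be confirmed from inside the VM by querying the standard metadata endpoint directly (a generic check, not output captured in this report):

# From inside the guest; in the broken state the request times out because
# nothing answers on the metadata link-local address
curl -sv --max-time 10 http://169.254.169.254/openstack/latest/meta_data.json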

Connect to the compute node where the VM is running and try to ping the VM from the metadata namespace:
sudo ip netns exec ovnmeta-<DATAPATH UUID> ping 192.168.2.225
Result: no replies
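
For reference, a quick way to locate the right namespace on the compute node; in an ML2/OVN deployment the ovnmeta-* suffix should correspond to the Neutron network UUID (treat the exact naming and addressing as assumptions):

sudo ip netns | grep ovnmeta                        # list OVN metadata namespaces
openstack network show internal_A -f value -c id    # compare with the namespace suffix
sudo ip netns exec ovnmeta-<DATAPATH UUID> ip addr  # typically shows 169.254.169.254 plus an address on the tenant subnet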

Try to trace a packet from the VM to the metadata port:

[heat-admin@computesriov-1 ~]$ sudo ovs-appctl ofproto/trace br-int in_port=493,dl_src=fa:16:3e:35:7c:7d,dl_dst=fa:16:3e:52:37:e5
Flow: in_port=493,vlan_tci=0x0000,dl_src=fa:16:3e:35:7c:7d,dl_dst=fa:16:3e:52:37:e5,dl_type=0x0000

bridge("br-int")
----------------
 0. in_port=493, priority 100, cookie 0xcee76ab7
    set_field:0xd->reg13
    set_field:0xf->reg11
    set_field:0xe->reg12
    set_field:0x4->metadata
    set_field:0x3->reg14
    resubmit(,8)
 8. reg14=0x3,metadata=0x4,dl_src=fa:16:3e:35:7c:7d, priority 50, cookie 0x278e23bd
    resubmit(,9)
 9. metadata=0x4, priority 0, cookie 0x773414cc
    resubmit(,10)
10. metadata=0x4, priority 0, cookie 0x6c0ff7a9
    resubmit(,11)
11. metadata=0x4, priority 0, cookie 0x781aaddf
    resubmit(,12)
12. metadata=0x4, priority 0, cookie 0xf29f0a11
    resubmit(,13)
13. metadata=0x4, priority 0, cookie 0x29be1854
    resubmit(,14)
14. metadata=0x4, priority 0, cookie 0xe29979c2
    resubmit(,15)
15. metadata=0x4, priority 0, cookie 0x1ba260d6
    resubmit(,16)
16. ct_state=-trk,metadata=0x4, priority 5, cookie 0x3cd9fb46
    set_field:0x100000000000000000000000000/0x100000000000000000000000000->xxreg0
    set_field:0x200000000000000000000000000/0x200000000000000000000000000->xxreg0
    resubmit(,17)
17. metadata=0x4, priority 0, cookie 0xe50c1ff9
    resubmit(,18)
18. metadata=0x4, priority 0, cookie 0x32efad62
    resubmit(,19)
19. metadata=0x4, priority 0, cookie 0x12f09e82
    resubmit(,20)
20. metadata=0x4, priority 0, cookie 0x5c1d64dd
    resubmit(,21)
21. metadata=0x4, priority 0, cookie 0x7c74f24b
    resubmit(,22)
22. metadata=0x4, priority 0, cookie 0xbdcdc10b
    resubmit(,23)
23. metadata=0x4, priority 0, cookie 0x71ce481b
    resubmit(,24)
24. metadata=0x4, priority 0, cookie 0x9c0631be
    resubmit(,25)
25. metadata=0x4, priority 0, cookie 0x3337a67f
    resubmit(,26)
26. metadata=0x4, priority 0, cookie 0xe80589b0
    resubmit(,27)
27. metadata=0x4, priority 0, cookie 0xfceb4a8d
    resubmit(,28)
28. metadata=0x4, priority 0, cookie 0x3d2bd176
    resubmit(,29)
29. metadata=0x4, priority 0, cookie 0xa31797e7
    resubmit(,30)
30. metadata=0x4, priority 0, cookie 0xd33564d0
    resubmit(,31)
31. metadata=0x4,dl_dst=fa:16:3e:52:37:e5, priority 50, cookie 0xdbb8b7c4
    set_field:0x2->reg15
    resubmit(,37)
37. priority 0
    resubmit(,38)
38. reg15=0x2,metadata=0x4, priority 100, cookie 0xe2f53bb7
    set_field:0x1->reg15
    resubmit(,38)
38. reg15=0x1,metadata=0x4, priority 100
    set_field:0x10->reg13
    set_field:0xf->reg11
    set_field:0xe->reg12
    resubmit(,39)
39. priority 0
    set_field:0->reg0
    set_field:0->reg1
    set_field:0->reg2
    set_field:0->reg3
    set_field:0->reg4
    set_field:0->reg5
    set_field:0->reg6
    set_field:0->reg7
    set_field:0->reg8
    set_field:0->reg9
    resubmit(,40)
40. metadata=0x4, priority 0, cookie 0x2d2084a7
    resubmit(,41)
41. metadata=0x4, priority 0, cookie 0x9a0d473
    resubmit(,42)
42. metadata=0x4, priority 0, cookie 0xa37266fe
    resubmit(,43)
43. metadata=0x4, priority 0, cookie 0xbf5498f8
    resubmit(,44)
44. ct_state=-trk,metadata=0x4, priority 5, cookie 0x3af4b3cc
    set_field:0x100000000000000000000000000/0x100000000000000000000000000->xxreg0
    set_field:0x200000000000000000000000000/0x200000000000000000000000000->xxreg0
    resubmit(,45)
45. metadata=0x4, priority 0, cookie 0xa775f992
    resubmit(,46)
46. metadata=0x4, priority 0, cookie 0x76593a3
    resubmit(,47)
47. metadata=0x4, priority 0, cookie 0xb0395be2
    resubmit(,48)
48. metadata=0x4, priority 0, cookie 0x1ac6c088
    resubmit(,49)
49. metadata=0x4, priority 0, cookie 0x50392d97
    resubmit(,50)
50. reg15=0x1,metadata=0x4, priority 50, cookie 0x9de154a2
    resubmit(,64)
64. priority 0
    resubmit(,65)
65. reg15=0x1,metadata=0x4, priority 100, cookie 0x149767bd
    push_vlan:0x8100
    set_field:4418->vlan_vid
    output:487

    bridge("br-ext-int")
    --------------------
         0. priority 0
            NORMAL
             -> no learned MAC for destination, flooding

            bridge("br-int")
            ----------------
                 0. No match.
                    drop

        bridge("br-int")
        ----------------
         0. No match.
            drop
    pop_vlan

Final flow: reg0=0x300,reg11=0xf,reg12=0xe,reg13=0x10,reg14=0x3,reg15=0x1,metadata=0x4,in_port=493,vlan_tci=0x0000,dl_src=fa:16:3e:35:7c:7d,dl_dst=fa:16:3e:52:37:e5,dl_type=0x0000
Megaflow: recirc_id=0,ct_state=-new-est-rel-rpl-inv-trk,ct_label=0/0x1,eth,in_port=493,dl_src=fa:16:3e:35:7c:7d,dl_dst=fa:16:3e:52:37:e5,dl_type=0x0000
Datapath actions: push_vlan(vid=322,pcp=0),1,3
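
Reading the trace (table numbers and register semantics vary between OVN versions, so this is an interpretation, not output from the report): rather than being delivered locally to the metadata port, the packet is tagged with VLAN 322 and sent out of br-int through port 487 towards br-ext-int, where it is flooded and finally dropped when it loops back into br-int. A generic cross-check is to see which OpenFlow port on br-int the metadata tap for this network is actually plugged into (interface naming is an assumption):

sudo ovs-vsctl list-ports br-int          # list ports attached to br-int
sudo ovs-ofctl show br-int | grep -i tap  # map tap interfaces to OpenFlow port numbers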


Note: if I launch a VM on the same network with '--config-drive True', e.g.:
openstack server create --flavor rhel-flavor --security-group overcloud_sg --image rhel-8 --nic net-id=internal_A vm2 --key-name test-key --config-drive True
it succeeds in getting its metadata. After that, all new VMs on the same network are able to access the metadata service on this network.
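
A generic way to confirm the workaround's effect (not captured in this report) is to repeat the earlier namespace ping and retry SSH against a freshly created VM once vm2 has booted:

sudo ip netns exec ovnmeta-<DATAPATH UUID> ping -c 3 <VM fixed IP>   # expected to get replies now
ssh cloud-user@<new VM FIP>    # cloud-user: default RHEL cloud image user (assumption)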

Comment 2 Roman Safronov 2021-06-20 09:30:12 UTC
Package versions on the OSP16.1 SR-IOV environment:
python3-networking-ovn-7.3.1-1.20210409093428.4e24f4c.el8ost.noarch
ovn2.13-20.12.0-104.el8fdp.x86_64
openvswitch2.13-2.13.0-105.el8fdp.x86_64

Comment 3 Yaniv Kaul 2021-06-30 08:07:13 UTC
Is this a regression?

Comment 4 Roman Safronov 2021-06-30 08:14:53 UTC
(In reply to Yaniv Kaul from comment #3)
> Is this a regression?

Yes

Comment 5 Ihar Hrachyshka 2021-07-16 01:37:51 UTC
The underlying issue in OVN is now fixed (1974062 in MODIFIED). I believe no changes needed on Neutron side. Feel free to close the bug.

Comment 6 Jakub Libosvar 2021-07-16 07:17:03 UTC
(In reply to Ihar Hrachyshka from comment #5)
> The underlying issue in OVN is now fixed (1974062 in MODIFIED). I believe no
> changes needed on Neutron side. Feel free to close the bug.

Thanks! We'll want to get the new OVN version tested with OSP so we'll flip this to ON_QA once compose with your fix is available for OSP consumption.

Comment 12 Ihar Hrachyshka 2021-07-26 19:29:58 UTC
@Jakub, @Numan, it looks like this bug is considered to be about "normal" ports in contrast to 1974062 that is for SR-IOV ports. (The latter is VERIFIED now.) I believe the "normal" ports were supposed to be fixed by a fresh OVN package, were they not? I am vague on details though. Perhaps you guys could update here about which fix we believe addressed the normal ports case? Thanks in advance.

Comment 14 Jakub Libosvar 2021-07-28 10:29:28 UTC
(In reply to Ihar Hrachyshka from comment #12)
> @Jakub, @Numan, it looks like this bug is considered to be about "normal"
> ports in contrast to 1974062 that is for SR-IOV ports. (The latter is
> VERIFIED now.) I believe the "normal" ports were supposed to be fixed by a
> fresh OVN package, were they not? I am vague on details though. Perhaps you
> guys could update here about which fix we believe addressed the normal ports
> case? Thanks in advance.

My understanding is that this BZ is the OSP side of the SR-IOV metadata ports not working. I am not sure where the "normal" ports issue is being handled on the OSP side.

Comment 15 Eran Kuris 2021-08-04 06:23:20 UTC
Fix verified on RHOS-16.1-RHEL-8-20210727.n.1
https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/network/view/networking-ovn/job/DFG-network-networking-ovn-16.1_director-rhel-virthost-3cont_2comp-ipv4-vlan-sriov/lastCompletedBuild/testReport/
()[root@computesriov-0 /]# rpm -qa | grep ovn 
puppet-ovn-15.4.1-1.20210528102649.192ac4e.el8ost.noarch
rhosp-ovn-2.13-12.el8ost.noarch
ovn2.13-20.12.0-149.el8fdp.x86_64
rhosp-ovn-host-2.13-12.el8ost.noarch
ovn2.13-host-20.12.0-149.el8fdp.x86_64
()[root@computesriov-0 /]# rpm -qa | grep openvswi
python3-rhosp-openvswitch-2.13-12.el8ost.noarch
network-scripts-openvswitch2.13-2.13.0-116.el8fdp.x86_64
openvswitch-selinux-extra-policy-1.0-28.el8fdp.noarch
openvswitch2.13-2.13.0-116.el8fdp.x86_64
rhosp-openvswitch-2.13-12.el8ost.noarch
python3-openvswitch2.13-2.13.0-116.el8fdp.x86_64

Comment 19 Red Hat Bugzilla 2023-09-15 01:10:12 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days