Bug 1948546 - VM of worker is in error state when a network has port_security_enabled=False
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: egarcia
QA Contact: Itzik Brown
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-04-12 12:09 UTC by Itzik Brown
Modified: 2021-07-27 22:59 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 22:59:25 UTC
Target Upstream Version:
Embargoed:



Links
System ID | Private | Priority | Status | Summary | Last Updated
Github openshift cluster-api-provider-openstack pull 175 | 0 | None | open | Bug 1948546: Port create bugs | 2021-04-14 15:29:05 UTC
Github openshift cluster-api-provider-openstack pull 179 | 0 | None | open | Bug 1948546: Allow all networking interfaces to be defined as ports | 2021-04-26 20:04:31 UTC
Red Hat Product Errata RHSA-2021:2438 | 0 | None | None | None | 2021-07-27 22:59:46 UTC

Description Itzik Brown 2021-04-12 12:09:52 UTC
When creating a worker with the following machineset:

    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    metadata:
      annotations:
        machine.openshift.io/memoryMb: "16384"
        machine.openshift.io/vCPU: "4"
      labels:
        machine.openshift.io/cluster-api-cluster: ostest-x9tft
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
      name: ostest-x9tft-worker-1
      namespace: openshift-machine-api
      resourceVersion: "24397"
    spec:
      replicas: 1
      selector:
        matchLabels:
          machine.openshift.io/cluster-api-cluster: ostest-x9tft
          machine.openshift.io/cluster-api-machineset: ostest-x9tft-worker-0
      template:
        metadata:
          labels:
            machine.openshift.io/cluster-api-cluster: ostest-x9tft
            machine.openshift.io/cluster-api-machine-role: worker
            machine.openshift.io/cluster-api-machine-type: worker
            machine.openshift.io/cluster-api-machineset: ostest-x9tft-worker-0
        spec:
          metadata: {}
          providerSpec:
            value:
              apiVersion: openstackproviderconfig.openshift.io/v1alpha1
              availabilityZone: AZsriov-0
              cloudName: openstack
              cloudsSecret:
                name: openstack-cloud-credentials
                namespace: openshift-machine-api
              flavor: m4.xlarge
              image: ostest-x9tft-rhcos
              kind: OpenstackProviderSpec
              metadata:
                creationTimestamp: null
              networks:
              - filter: {}
                subnets:
                - filter:
                    name: ostest-x9tft-nodes
                    tags: openshiftClusterID=ostest-x9tft
     
              securityGroups:
              - filter: {}
                name: ostest-x9tft-worker
              ports:
                - networkID: 53e5f4b8-8dcd-4cb8-aea3-b76c478dbb32
                  nameSuffix: sriov
                  fixedIPs:
                    - subnet_id: ccbfb766-7f26-4be2-8371-183fc4dc25d4
                  tags:
                    - sriov
                  vnicType: direct
                  portSecurity: false
              primarySubnet: ffa7ae93-faac-4522-ad99-ff1696e9ee52
              serverMetadata:
                Name: ostest-x9tft-worker
                openshiftClusterID: ostest-x9tft
              tags:
              - openshiftClusterID=ostest-x9tft
              trunk: false
              userDataSecret:
                name: worker-user-data
The VM is in error state and I see the following in the compute log:

2021-04-08 08:15:11.417 8 ERROR nova.compute.manager [req-7c702456-73ac-4be1-8ce7-4fad40ddab48 80310651a4504249ba97c49e7bb1d01a d43f248317b44213a7b1cde8e01e8066 - default default] [instance: 7296bcc5-bada-4f5f-a12a-3812d16ea20b] Failed to build and run instance: nova.exception.SecurityGroupCannotBeApplied: Network requires port_security_enabled and subnet associated in order to apply security groups.

When setting the port_security_enabled=true 

Version:
OCP 4.8.0-0.nightly-2021-04-06-162113
OSP RHOS-16.1-RHEL-8-20210311.n.1


Platform:
OCP on OSP IPI

Comment 1 Adolfo Duarte 2021-04-12 16:33:51 UTC
It's possible that the security group:
              securityGroups:
              - filter: {}
                name: ostest-x9tft-worker

gets applied to all ports of the VM.

As a check, the same test should be run with the securityGroups entry removed.
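
One way to prepare that check, as a minimal sketch: it assumes the MachineSet has already been loaded into a plain Python dict (for example from the YAML above). The helper name is hypothetical and is not part of any OpenShift tooling.

```python
import copy


def without_instance_security_groups(machineset: dict) -> dict:
    """Return a copy of the MachineSet with the instance-level securityGroups removed.

    Hypothetical helper for illustration only.
    """
    result = copy.deepcopy(machineset)
    value = result["spec"]["template"]["spec"]["providerSpec"]["value"]
    # Drop only the instance-level entry; per-port settings are left untouched.
    value.pop("securityGroups", None)
    return result


# Trimmed-down stand-in for the MachineSet from the description.
machineset = {
    "spec": {"template": {"spec": {"providerSpec": {"value": {
        "securityGroups": [{"filter": {}, "name": "ostest-x9tft-worker"}],
        "ports": [{"nameSuffix": "sriov", "portSecurity": False}],
    }}}}}
}
trimmed = without_instance_security_groups(machineset)
```

The original dict is left unmodified, so both variants can be applied and compared.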

Comment 2 egarcia 2021-04-12 17:16:00 UTC
This worked in the same environment and created a node:

apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  annotations:
    machine.openshift.io/memoryMb: "16384"
    machine.openshift.io/vCPU: "4"
  creationTimestamp: "2021-04-12T15:05:48Z"
  generation: 2
  labels:
    machine.openshift.io/cluster-api-cluster: ostest-x9tft
    machine.openshift.io/cluster-api-machine-role: worker
    machine.openshift.io/cluster-api-machine-type: worker
  name: ostest-x9tft-worker-2
  namespace: openshift-machine-api
  resourceVersion: "2472380"
  uid: d9e25641-643a-4133-bee6-c29775cfd772
spec:
  replicas: 1
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-cluster: ostest-x9tft
      machine.openshift.io/cluster-api-machineset: ostest-x9tft-worker-0
  template:
    metadata:
      labels:
        machine.openshift.io/cluster-api-cluster: ostest-x9tft
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
        machine.openshift.io/cluster-api-machineset: ostest-x9tft-worker-0
    spec:
      metadata: {}
      providerSpec:
        value:
          apiVersion: openstackproviderconfig.openshift.io/v1alpha1
          availabilityZone: AZsriov-0
          cloudName: openstack
          cloudsSecret:
            name: openstack-cloud-credentials
            namespace: openshift-machine-api
          configDrive: true
          flavor: m4.xlarge
          image: ostest-x9tft-rhcos
          kind: OpenstackProviderSpec
          metadata:
            creationTimestamp: null
          networks:
          - subnets:
            - uuid: ffa7ae93-faac-4522-ad99-ff1696e9ee52
          ports:
          - fixedIPs:
            - subnet_id: ccbfb766-7f26-4be2-8371-183fc4dc25d4
            nameSuffix: sriov
            networkID: 53e5f4b8-8dcd-4cb8-aea3-b76c478dbb32
            portSecurity: false
            tags:
            - sriov
            vnicType: direct
          primarySubnet: ffa7ae93-faac-4522-ad99-ff1696e9ee52
          securityGroups:
          - filter: {}
            name: ostest-x9tft-worker
          serverMetadata:
            Name: ostest-x9tft-worker
            openshiftClusterID: ostest-x9tft
          tags:
          - openshiftClusterID=ostest-x9tft
          trunk: false
          userDataSecret:
            name: worker-user-data
status:
  availableReplicas: 1
  fullyLabeledReplicas: 1
  observedGeneration: 2
  readyReplicas: 1
  replicas: 1

Comment 7 egarcia 2021-04-19 14:39:37 UTC
I understand. When port security is disabled on a network, a user logically expects to be able to create a port on that network's subnet, with port security disabled, without a problem. The issue is that we disable the security groups and allowed address pairs for the ports we create on an update, not on the initial create, on the expectation that the ports will be created on a network with port security enabled. This causes the machine to enter an error state, since OpenStack can't create ports with security groups and allowed address pairs on such a network. Ideally, a user should not have to set the port security on the port for this use case and should just be able to use the nova default for a network. So we will have to modify the code to check the network's port security before creating the port and set the parameters accordingly.
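
A minimal sketch of that check, in Python rather than the provider's own code, and not the actual implementation. It works on plain dicts and makes no OpenStack call. The function name and the `default_security_groups` parameter are hypothetical. The output keys (`port_security_enabled`, `security_groups`, `allowed_address_pairs`) mirror Neutron port attribute names, and the input keys follow the `ports` entries in the machinesets in this report.

```python
def build_port_create_opts(port_spec: dict, network: dict,
                           default_security_groups: list) -> dict:
    """Build port-create parameters that respect the network's port security.

    Hypothetical helper: `network` is assumed to be a dict with an `id` and a
    `port_security_enabled` flag, as already fetched from OpenStack.
    """
    opts = {"network_id": network["id"], "name": port_spec.get("nameSuffix", "")}

    # An explicit portSecurity on the port wins; otherwise inherit the
    # setting from the network the port is created on.
    port_security = port_spec.get("portSecurity")
    if port_security is None:
        port_security = network.get("port_security_enabled", True)
    opts["port_security_enabled"] = port_security

    # Only attach security groups and allowed address pairs when port
    # security is on; with it off they are left out of the initial create.
    if port_security:
        opts["security_groups"] = port_spec.get("securityGroups",
                                                default_security_groups)
        opts["allowed_address_pairs"] = port_spec.get("allowedAddressPairs", [])
    return opts


# The port sets no portSecurity, but its network has port security disabled.
unsecured_network = {"id": "53e5f4b8-8dcd-4cb8-aea3-b76c478dbb32",
                     "port_security_enabled": False}
sriov_opts = build_port_create_opts({"nameSuffix": "sriov"},
                                    unsecured_network,
                                    ["ostest-x9tft-worker"])
```

With this shape the user does not need to set `portSecurity: false` on the port: the value is taken from the network unless the port overrides it.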

Comment 8 egarcia 2021-04-22 21:04:54 UTC
Update: OpenStack does not allow you to attach interfaces from networks that have port security disabled when a security group is set on an instance. It will always try to apply that security group to all attached interfaces, which causes the error. This is invalid usage, so as it stands users have to either disable port security for each individual port, which works, or not set security groups on the instance. Another option is to modify the code to allow all interfaces to be defined using only the ports API, allowing users to set security groups and allowed address pairs on a per-port basis. However, this is not currently supported and would require a moderate amount of work.
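
The invalid combination described in this comment could be caught before the machine is created. The following is a rough Python sketch under stated assumptions: `find_conflicting_ports` is a hypothetical helper, and `network_port_security` is an assumed mapping from network ID to that network's port security flag, which would have to be looked up from OpenStack beforehand.

```python
def find_conflicting_ports(provider_spec: dict, network_port_security: dict) -> list:
    """Return the nameSuffix of each port that would hit the conflict.

    A port conflicts when instance-level securityGroups are set, its network
    has port security disabled, and the port itself does not explicitly set
    portSecurity to false.
    """
    if not provider_spec.get("securityGroups"):
        return []  # no instance-level security group, nothing gets applied

    conflicts = []
    for port in provider_spec.get("ports", []):
        # Networks missing from the mapping are assumed to have port security on.
        network_secured = network_port_security.get(port["networkID"], True)
        explicitly_disabled = port.get("portSecurity") is False
        if not network_secured and not explicitly_disabled:
            conflicts.append(port.get("nameSuffix", port["networkID"]))
    return conflicts


provider_spec = {
    "securityGroups": [{"filter": {}, "name": "ostest-x9tft-worker"}],
    "ports": [
        {"networkID": "53e5f4b8-8dcd-4cb8-aea3-b76c478dbb32", "nameSuffix": "sriov"},
    ],
}
network_port_security = {"53e5f4b8-8dcd-4cb8-aea3-b76c478dbb32": False}
conflicts = find_conflicting_ports(provider_spec, network_port_security)
```

Either workaround from the comment clears the result: setting `portSecurity: false` on the port, or dropping the instance-level `securityGroups`.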

Comment 9 egarcia 2021-04-23 18:00:25 UTC
Setting this to prio and sev medium because the core functionality works, so it's non-blocking.

Comment 11 Itzik Brown 2021-05-05 13:00:35 UTC
OCP version: 4.8.0-0.nightly-2021-04-30-201824
OSP: RHOS-16.1-RHEL-8-20210323.n.0

Used the following machineset:

    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    metadata:
      annotations:
        machine.openshift.io/memoryMb: "32768"
        machine.openshift.io/vCPU: "4"
      generation: 1
      labels:
        machine.openshift.io/cluster-api-cluster: ostest-7qlx2
        machine.openshift.io/cluster-api-machine-role: worker
        machine.openshift.io/cluster-api-machine-type: worker
      name: ostest-7qlx2-worker-50
      namespace: openshift-machine-api
    spec:
      replicas: 2
      selector:
        matchLabels:
          machine.openshift.io/cluster-api-cluster: ostest-7qlx2
          machine.openshift.io/cluster-api-machineset: ostest-7qlx2-worker-50
      template:
        metadata:
          labels:
            machine.openshift.io/cluster-api-cluster: ostest-7qlx2
            machine.openshift.io/cluster-api-machine-role: worker
            machine.openshift.io/cluster-api-machine-type: worker
            machine.openshift.io/cluster-api-machineset: ostest-7qlx2-worker-50
        spec:
          metadata: {}
          providerSpec:
            value:
              apiVersion: openstackproviderconfig.openshift.io/v1alpha1
              availabilityZone: AZsriov-0
              cloudName: openstack
              cloudsSecret:
                name: openstack-cloud-credentials
                namespace: openshift-machine-api
              flavor: m4.worker  
              image: ostest-7qlx2-rhcos
              kind: OpenstackProviderSpec
              configDrive: True
              metadata:
                creationTimestamp: null
                #          networks:
                #            - subnets:
                #                - uuid: e658016a-e848-4a62-9677-01c5e5962ed2
              ports:
                - allowedAddressPairs:
                  - ipAddress: 10.196.0.5
                  - ipAddress: 10.196.0.7
                  fixedIPs:
                    - subnetID: e658016a-e848-4a62-9677-01c5e5962ed2
                  nameSuffix: nodes
                  networkID: de8f8ce7-bea7-4b9e-a880-30f1e8b0ea7d
                  securityGroups:
                      - ed336231-d8c6-4136-aef2-8e40d09db511
                - networkID: 5a732fa7-0320-402e-8df1-c1bec78f31cb
                  nameSuffix: sriov
                  fixedIPs:
                    - subnetID: e04b820d-7cad-4e50-acb2-68219e0a2ef8
                  tags:
                    - sriov
                  vnicType: direct
                  portSecurity: False
              primarySubnet: e658016a-e848-4a62-9677-01c5e5962ed2
              serverMetadata:
                Name: ostest-7qlx2-worker
                openshiftClusterID: ostest-7qlx2
              tags:
              - openshiftClusterID=ostest-7qlx2
              trunk: false
              userDataSecret:
                name: worker-user-data

Comment 14 errata-xmlrpc 2021-07-27 22:59:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

