Created attachment 1505597
templates

Description of problem:
An ODL-based deployment of ovs-dpdk ends up with 'group = "qemu"' in /var/lib/config-data/nova_libvirt/etc/libvirt/qemu.conf.

This works correctly if I define the Controller role first in the roles file and ComputeOvsDpdkSriov second. But when roles_data.yaml lists ComputeOvsDpdkSriov first and Controller second, qemu.conf gets the wrong entry, 'group = "qemu"'. This happens even though 'VhostuserSocketGroup' is defined under 'RoleParametersDefault' in roles_data.yaml.

Version-Release number of selected component (if applicable):
RHOS13

How reproducible:
Always

Steps to Reproduce:
1. Create ODL-based templates for ovs-dpdk.
2. Make sure the ComputeOvsDpdkSriov role is listed first, followed by the Controller role, in roles_data.yaml.
3. Deploy the overcloud.

Actual results:
Wrong group permission in qemu.conf on the ovs-dpdk compute node.

Expected results:
The group permission set in the environment file

    ComputeOvsDpdkSriovParameters:
      VhostuserSocketGroup: "hugetlbfs"

and the same value set in roles_data.yaml for the ComputeOvsDpdkSriov role

    RoleParametersDefault:
      VhostuserSocketGroup: "hugetlbfs"

should be applied in /var/lib/config-data/nova_libvirt/etc/libvirt/qemu.conf.

Additional info:
openstack overcloud deploy \
  --templates \
  -r /home/stack/templates/roles_data.yaml \
  -n /home/stack/templates/network_data.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/host-config-and-reboot.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/neutron-opendaylight.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/neutron-opendaylight-dpdk.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/neutron-opendaylight-sriov.yaml \
  -e /home/stack/templates/environments/dpdk_sriov-environment.yaml \
  -e /home/stack/templates/rhel-registration/environment-rhel-registration.yaml \
  -e /home/stack/templates/environments/network-environment.yaml \
  -e /home/stack/templates/environments/storage-environment.yaml \
  -e /home/stack/templates/environments/disable-telemetry.yaml \
  -e /home/stack/templates/environments/collectd-environment.yaml \
  -e /home/stack/templates/node-info.yaml \
  -e /home/stack/templates/overcloud_images2.yaml \
  --log-file overcloud_install.log
1. Deployment command

openstack overcloud deploy \
  --templates \
  -r /home/stack/templates/roles_data.yaml \
  -n /home/stack/templates/network_data.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/host-config-and-reboot.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/neutron-opendaylight.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/neutron-opendaylight-dpdk.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/neutron-opendaylight-sriov.yaml \
  -e /home/stack/templates/environments/dpdk_sriov-environment.yaml \
  -e /home/stack/templates/rhel-registration/environment-rhel-registration.yaml \
  -e /home/stack/templates/environments/network-environment.yaml \
  -e /home/stack/templates/environments/storage-environment.yaml \
  -e /home/stack/templates/environments/disable-telemetry.yaml \
  -e /home/stack/templates/environments/collectd-environment.yaml \
  -e /home/stack/templates/node-info.yaml \
  -e /home/stack/templates/overcloud_images2.yaml \
  --log-file overcloud_install.log

2. Roles file used:

    - name: ComputeOvsDpdkSriov
      description: |
        Compute OvS DPDK Sriov Role
      CountDefault: 1
      networks:
        - InternalApi
        - Tenant
        - Storage
      HostnameFormatDefault: '%stackname%-computeovsdpdk-%index%'
      RoleParametersDefault:
        VhostuserSocketGroup: "hugetlbfs"
        TunedProfileName: "cpu-partitioning"
      disable_upgrade_deployment: True
      ServicesDefault:
        ..
        - OS::TripleO::Services::ComputeNeutronCorePlugin
        - OS::TripleO::Services::ComputeNeutronL3Agent
        - OS::TripleO::Services::ComputeNeutronMetadataAgent
        - OS::TripleO::Services::ComputeNeutronOvsAgent
        - OS::TripleO::Services::ComputeNeutronOvsDpdk
        ...
        - OS::TripleO::Services::OpenDaylightOvs
        - OS::TripleO::Services::NovaCompute
        ..
    - name: Controller
      description: |
        Controller role that has all the controler services loaded and handles
        Database, Messaging and Network functions.
      CountDefault: 1
      tags:
        - primary
        - controller
      networks:
        - External
        - InternalApi
        - Storage
        - StorageMgmt
        - Tenant
      HostnameFormatDefault: 'controller-%index%'
      ServicesDefault:
        ..
        - OS::TripleO::Services::OpenDaylightOvs
        ...

3. The environment file has the right values set:

/home/stack/templates/environments/dpdk_sriov-environment.yaml

    parameter_defaults:
      ComputeOvsDpdkSriovParameters:
        VhostuserSocketGroup: "hugetlbfs"

4. End result on compute - configuration value

[root@overcloud-computeovsdpdk-0 ~]# grep ^group /var/lib/config-data/puppet-generated/nova_libvirt/etc/libvirt/qemu.conf
group = "qemu"

5. End result on compute - the hieradata found on the compute node is wrong.

/etc/puppet/hieradata/service_configs.json

    "nova::compute::libvirt::qemu::group": "qemu",
    ..
    "tripleo::profile::base::neutron::plugins::ovs::opendaylight::vhostuser_socket_group": "hugetlbfs",

## Now, how this value gets set in TripleO (in my understanding):

6. ODL environments set up the OVS configs using the following environment file, which is included in the deploy command.

/usr/share/openstack-tripleo-heat-templates/environments/services-docker/neutron-opendaylight.yaml

    OS::TripleO::Services::OpenDaylightOvs: ../../puppet/services/opendaylight-ovs.yaml

7. The opendaylight-ovs service yaml above sets the hieradata for 'nova::compute::libvirt::qemu::group'.

/usr/share/openstack-tripleo-heat-templates/puppet/services/opendaylight-ovs.yaml

    outputs:
      role_data:
        description: Role data for the OpenDaylight service.
        value:
          ..
          service_config_settings:
            nova_libvirt:
              nova::compute::libvirt::qemu::group: {get_attr: [RoleParametersValue, value, 'tripleo::profile::base::neutron::plugins::ovs::opendaylight::vhostuser_socket_group']}

The following command confirms this:

(undercloud) [stack@ocp-130-107 ~]$ openstack stack show -c outputs overcloud-ComputeOvsDpdkSriovServiceChain-kdwhbc4qvt6y-ServiceChain-3bl2edhfkld2-26-sjkzgebw77ft
+---------+------------------------------------------------------------------------------------------------------------
| Field   | Value
+---------+------------------------------------------------------------------------------------------------------------
| outputs | - description: Role data for the OpenDaylight service.
|         |   output_key: role_data
|         |   output_value:
|         |     config_settings:
|         |       neutron::agents::ml2::ovs::local_ip: tenant
|         |       neutron::plugins::ovs::opendaylight::allowed_network_types:
|         |       - local
|         |       - flat
|         |       - vlan
|         |       - vxlan
|         |       - gre
|         |       neutron::plugins::ovs::opendaylight::enable_dpdk: true
|         |       neutron::plugins::ovs::opendaylight::enable_hw_offload: false
|         |       neutron::plugins::ovs::opendaylight::odl_password: 26fZMPPtjCUYvHapgwugF4adR
|         |       neutron::plugins::ovs::opendaylight::odl_username: admin
|         |       neutron::plugins::ovs::opendaylight::provider_mappings:
|         |       - dpdk1:br-link1
|         |       - dpdk2:br-link2
|         |       - dpdk3:br-link3
|         |       - dpdk4:br-link4
|         |       neutron::plugins::ovs::opendaylight::vhostuser_mode: server
|         |       neutron::plugins::ovs::opendaylight::vhostuser_socket_dir: /var/lib/vhost_sockets
|         |       opendaylight::odl_rest_port: '8081'
|         |       opendaylight::password: 26fZMPPtjCUYvHapgwugF4adR
|         |       opendaylight::username: admin
|         |       opendaylight_check_url: restconf/operational/network-topology:network-topology/topology/netvirt:1
|         |       tripleo.opendaylight_ovs.firewall_rules:
|         |         118 neutron vxlan networks:
|         |           dport: 4789
|         |           proto: udp
|         |         136 neutron gre networks:
|         |           proto: gre
|         |       tripleo::profile::base::neutron::plugins::ovs::opendaylight::vhostuser_socket_group: hugetlbfs
|         |       tripleo::profile::base::neutron::plugins::ovs::opendaylight::vhostuser_socket_user: qemu
|         |       vswitch::dpdk::driver_type: vfio-pci
|         |       vswitch::dpdk::host_core_list: 0,1
|         |       vswitch::dpdk::memory_channels: '2'
|         |       vswitch::dpdk::pmd_core_list: 2-17
|         |       vswitch::dpdk::socket_mem: 16384,16384
|         |       vswitch::ovs::enable_hw_offload: false
|         |     metadata_settings: null
|         |     service_config_settings:
|         |       nova_libvirt:
|         |         nova::compute::libvirt::qemu::group: hugetlbfs     <<<<<<<<-----------
|         |     service_name: opendaylight_ovs
|         |     step_config: 'include tripleo::profile::base::neutron::plugins::ovs::opendaylight
|         |
|         |       '
|         |     update_tasks:
|         |     - block:
|         |       - name: store update level to update_level variable
|         |         set_fact:
|         |           odl_update_level: 1
|         |       name: Get ODL update level
|         |     - block:
|         |       - command: systemctl is-enabled openvswitch
|         |         ignore_errors: true
|         |         name: Check if openvswitch is deployed
|         |         register: openvswitch_enabled
|         |         tags: common
|         |       - command: systemctl is-active --quiet openvswitch
|         |         name: 'PreUpgrade step0,validation: Check service openvswitch is running'
|         |         tags: validation
|         |         when:
|         |         - step|int == 0
|         |         - openvswitch_enabled.rc == 0
|         |       - name: Delete OVS groups and ports
|         |         shell: 'sudo ovs-ofctl -O Openflow13 del-groups br-int; for tun_port in $(sudo
|         |           ovs-vsctl list-ports br-int | grep tun); do sudo ovs-vsctl del-port br-int
|         |           $tun_port; done
|         |
|         |           '
|         |         when:
|         |         - step|int == 0
|         |         - openvswitch_enabled.rc == 0
|         |       name: Run L2 update tasks that are similar to upgrade_tasks when update level
|         |         is 2
|         |       when: odl_update_level == 2
|         |     upgrade_tasks:
|         |     - ignore_errors: true
|         |       name: Check openvswitch version.
|         |       register: ovs_version
|         |       shell: rpm -qa | awk -F- '/^openvswitch-2/{print $2 "-" $3}'
|         |       when: step|int == 2
|         |     - ignore_errors: true
|         |       name: Check openvswitch packaging.
|         |       register: ovs_packaging_issue
|         |       shell: rpm -q --scripts openvswitch | awk '/postuninstall/,/*/' | grep -q "systemctl.*try-restart"
|         |       when: step|int == 2
|         |     - block:
|         |       - file:
|         |           path: /root/OVS_UPGRADE
|         |           state: absent
|         |         name: 'Ensure empty directory: emptying.'
|         |       - file:
|         |           group: root
|         |           mode: 488
|         |           owner: root
|         |           path: /root/OVS_UPGRADE
|         |           state: directory
|         |         name: 'Ensure empty directory: creating.'
|         |       - command: yum makecache
|         |         name: Make yum cache.
|         |       - command: yumdownloader --destdir /root/OVS_UPGRADE --resolve openvswitch
|         |         name: Download OVS packages.
|         |       - name: Get rpm list for manual upgrade of OVS.
|         |         register: ovs_list_of_rpms
|         |         shell: ls -1 /root/OVS_UPGRADE/*.rpm
|         |       - args:
|         |           chdir: /root/OVS_UPGRADE
|         |         name: Manual upgrade of OVS
|         |         shell: 'rpm -U --test {{item}} 2>&1 | grep "already installed" || \
|         |
|         |           rpm -U --replacepkgs --notriggerun --nopostun {{item}};
|         |
|         |           '
|         |         with_items:
|         |         - '{{ovs_list_of_rpms.stdout_lines}}'
|         |       when:
|         |       - step|int == 2
|         |       - '''2.5.0-14'' in ovs_version.stdout|default('''') or ovs_packaging_issue|default(false)|succeeded'
|         |     - block:
|         |       - command: systemctl is-enabled openvswitch
|         |         ignore_errors: true
|         |         name: Check if openvswitch is deployed
|         |         register: openvswitch_enabled
|         |         tags: common
|         |       - command: systemctl is-active --quiet openvswitch
|         |         name: 'PreUpgrade step0,validation: Check service openvswitch is running'
|         |         tags: validation
|         |         when:
|         |         - step|int == 0
|         |         - openvswitch_enabled.rc == 0
|         |       - name: Delete OVS groups and ports
|         |         shell: 'sudo ovs-ofctl -O Openflow13 del-groups br-int; for tun_port in $(sudo
|         |           ovs-vsctl list-ports br-int | grep tun); do sudo ovs-vsctl del-port br-int
|         |           $tun_port; done
|         |
|         |           '
|         |         when:
|         |         - step|int == 0
|         |         - openvswitch_enabled.rc == 0
|         |       name: ODL container L2 update and upgrade tasks
|         |
+---------+------------------------------------------------------------------------------------------------------------

8. By default, this value should come from the following resource:

/usr/share/openstack-tripleo-heat-templates/docker/services/nova-libvirt.yaml

    resources:
      RoleParametersValue:
        type: OS::Heat::Value
        properties:
          type: json
          value:
            map_replace:
              - map_replace:
                - vhostuser_socket_group: VhostuserSocketGroup
                - values: {get_param: [RoleParameters]}
              - values:
                  VhostuserSocketGroup: {get_param: VhostuserSocketGroup}

Point 7 already shows the exact value of the hieradata parameter that needed to be set, but if we still want to check whether the above resource took the right value, we can run:

(undercloud) [stack@ocp-130-107 ~]$ openstack stack resource show overcloud-ComputeOvsDpdkSriovServiceChain-kdwhbc4qvt6y-ServiceChain-3bl2edhfkld2-28-2sdzhq6rhbze RoleParametersValue
+------------------------+---------------------------------------------------------------------------------------------
| Field                  | Value
+------------------------+---------------------------------------------------------------------------------------------
| attributes             | {u'value': {u'vhostuser_socket_group': u'hugetlbfs'}}
| creation_time          | 2018-11-13T18:26:38Z
| description            |
| links                  | [{u'href': u'http://10.74.167.65:8004/v1/e6f2626ad9044f17b7c904274619d5ba/stacks/overcloud-ComputeOvsDpdkSriovServiceChain-kdwhbc4qvt6y-ServiceChain-3bl2edhfkld2-28-2sdzhq6rhbze/769dbf1e-67fa-477f-baaf-e65061f80f70/resources/RoleParametersValue', u'rel': u'self'}, {u'href': u'http://10.74.167.65:8004/v1/e6f2626ad9044f17b7c904274619d5ba/stacks/overcloud-ComputeOvsDpdkSriovServiceChain-kdwhbc4qvt6y-ServiceChain-3bl2edhfkld2-28-2sdzhq6rhbze/769dbf1e-67fa-477f-baaf-e65061f80f70', u'rel': u'stack'}]
| logical_resource_id    | RoleParametersValue
| parent_resource        | 28
| physical_resource_id   | overcloud-ComputeOvsDpdkSriovServiceChain-kdwhbc4qvt6y-ServiceChain-3bl2edhfkld2-28-2sdzhq6rhbze-RoleParametersValue-surcbkmafw2e
| required_by            | []
| resource_name          | RoleParametersValue
| resource_status        | CREATE_COMPLETE
| resource_status_reason | state changed
| resource_type          | OS::Heat::Value
| updated_time           | 2018-11-13T18:26:38Z
+------------------------+---------------------------------------------------------------------------------------------

9. Apart from 8, I think the 'RoleParametersDefault' that we defined in roles_data.yaml is expected to take precedence.

/usr/share/openstack-tripleo-heat-templates/overcloud.j2.yaml

    {% for role in roles %}
      # Resources generated for {{role.name}} Role
      {{role.name}}ServiceChain:
        type: OS::TripleO::Services
        properties:
          RoleParameters:
            map_merge:
              - {{role.RoleParametersDefault|default({})}}
              - get_param: {{role.name}}Parameters

To confirm 9, I pulled the swift data and found the following, which indicates that the VhostuserSocketGroup value was set in RoleParameters.

overcloud.yaml

      ComputeOvsDpdkSriovServiceChain:
        type: OS::TripleO::Services
        properties:
          Services:
            get_param: ComputeOvsDpdkSriovServices
          ServiceNetMap: {get_attr: [ServiceNetMap, service_net_map]}
          ServiceData:
            net_cidr_map: {get_attr: [NetCidrMapValue, value]}
          EndpointMap: {get_attr: [EndpointMapData, value]}
          DefaultPasswords: {get_attr: [DefaultPasswords, passwords]}
          RoleName: ComputeOvsDpdkSriov
          RoleParameters:
            map_merge:
              - {'TunedProfileName': 'cpu-partitioning', 'VhostuserSocketGroup': 'hugetlbfs'}
              - get_param: ComputeOvsDpdkSriovParameters
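
As a rough illustration of points 8 and 9, the sketch below models the two Heat intrinsics in plain Python. It is a simplified model under stated assumptions, not Heat's implementation: the helper names (map_merge, map_replace_values, resolve_socket_group) are hypothetical, and the global default of "qemu" for VhostuserSocketGroup is assumed from the value observed on the node.

```python
# Simplified model of the intrinsics quoted above; NOT Heat's own code.

def map_merge(*maps):
    """Merge maps left to right; keys in later maps override earlier ones."""
    merged = {}
    for m in maps:
        merged.update(m or {})
    return merged


def map_replace_values(source, values):
    """Replace each string value of `source` that appears as a key in `values`."""
    return {
        key: values.get(val, val) if isinstance(val, str) else val
        for key, val in source.items()
    }


def resolve_socket_group(role_parameters, global_default):
    """Mimic the nested map_replace in RoleParametersValue.

    The inner pass substitutes the role-level value; the outer pass falls
    back to the global parameter if the placeholder is still present.
    """
    placeholder = {"vhostuser_socket_group": "VhostuserSocketGroup"}
    inner = map_replace_values(placeholder, role_parameters)
    return map_replace_values(inner, {"VhostuserSocketGroup": global_default})


# RoleParametersDefault from roles_data.yaml and the role parameters
# from the environment file, as quoted in this report.
role_defaults = {"TunedProfileName": "cpu-partitioning",
                 "VhostuserSocketGroup": "hugetlbfs"}
env_parameters = {"VhostuserSocketGroup": "hugetlbfs"}

dpdk_role = map_merge(role_defaults, env_parameters)
print(resolve_socket_group(dpdk_role, "qemu"))  # {'vhostuser_socket_group': 'hugetlbfs'}
print(resolve_socket_group({}, "qemu"))         # {'vhostuser_socket_group': 'qemu'}
```

Under this model the DPDK role resolves to "hugetlbfs", which matches the RoleParametersValue attribute shown above, while a role with no role-level setting falls back to the assumed global default.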
The issue occurs in ODL deployments when a non-DPDK role is listed after the DPDK role in roles_data.yaml. The OpenDaylightOvs service is present in all the roles, so its "service_config_settings" are applied for the non-DPDK roles too. For those roles, "service_config_settings" carries "qemu" as the group instead of "hugetlbfs". The service_config_settings are merged in the order in which the roles are defined in roles_data.yaml, so the value from a later role overrides the value from an earlier one.

I think the solution is to move the VhostuserSocketGroup parameter to the nova-libvirt service to avoid this issue.
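
The ordering effect described above can be sketched with a toy last-wins merge. This is only an illustrative model, assuming each role contributes one nova_libvirt entry through OpenDaylightOvs; the function merge_service_config_settings and the role dictionaries are hypothetical and do not reproduce the actual template logic.

```python
QEMU_GROUP_KEY = "nova::compute::libvirt::qemu::group"


def merge_service_config_settings(roles):
    """Merge each role's nova_libvirt settings in list order; later roles win."""
    merged = {}
    for role in roles:
        merged.update(role["service_config_settings"].get("nova_libvirt", {}))
    return merged


# Hypothetical per-role data: the DPDK role resolves the group to
# "hugetlbfs", the Controller role falls back to "qemu".
dpdk = {
    "name": "ComputeOvsDpdkSriov",
    "service_config_settings": {"nova_libvirt": {QEMU_GROUP_KEY: "hugetlbfs"}},
}
controller = {
    "name": "Controller",
    "service_config_settings": {"nova_libvirt": {QEMU_GROUP_KEY: "qemu"}},
}

# DPDK role first, Controller second: the failing order from this report.
print(merge_service_config_settings([dpdk, controller])[QEMU_GROUP_KEY])  # qemu
# Controller first, DPDK role second: the working order.
print(merge_service_config_settings([controller, dpdk])[QEMU_GROUP_KEY])  # hugetlbfs
```

With the key set only by the role that actually runs nova-libvirt, there would be a single contributor and the role order would no longer matter.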
We will need a solution/fix that works irrespective of the order of roles in the file, even if the roles include non-DPDK compute nodes.

The suggestion provided by Saravanan worked well.

1. Removed the nova_libvirt setting from the opendaylight-ovs service yaml.

$ diff /usr/share/openstack-tripleo-heat-templates/puppet/services/opendaylight-ovs.yaml-bak /usr/share/openstack-tripleo-heat-templates/puppet/services/opendaylight-ovs.yaml
240,242c240,242
< service_config_settings:
<   nova_libvirt:
<     nova::compute::libvirt::qemu::group: {get_attr: [RoleParametersValue, value, 'tripleo::profile::base::neutron::plugins::ovs::opendaylight::vhostuser_socket_group']}
---
> # service_config_settings:
> #   nova_libvirt:
> #     nova::compute::libvirt::qemu::group: {get_attr: [RoleParametersValue, value, 'tripleo::profile::base::neutron::plugins::ovs::opendaylight::vhostuser_socket_group']}

2. Added the setting in the nova-libvirt service yaml.

$ diff /usr/share/openstack-tripleo-heat-templates/docker/services/nova-libvirt.yaml-bak /usr/share/openstack-tripleo-heat-templates/docker/services/nova-libvirt.yaml
208,210c208,210
< # service_config_settings:
< #   nova_libvirt:
< #     nova::compute::libvirt::qemu::group: {get_attr: [RoleParametersValue, value, 'tripleo::profile::base::neutron::plugins::ovs::opendaylight::vhostuser_socket_group']}
---
> service_config_settings:
>   nova_libvirt:
>     nova::compute::libvirt::qemu::group: {get_attr: [RoleParametersValue, value, 'vhostuser_socket_group']}

3. Hiera value is correctly set.

[root@overcloud-computeovsdpdk-0 ~]# grep nova::compute::libvirt::qemu::group -r /etc/puppet/
/etc/puppet/hieradata/service_configs.json: "nova::compute::libvirt::qemu::group": "hugetlbfs",

4. Group permission is set correctly.

[root@overcloud-computeovsdpdk-0 ~]# grep ^group /var/lib/config-data/puppet-generated/nova_libvirt/etc/libvirt/qemu.conf
group = "hugetlbfs"
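
For repeating the check in step 4 across several nodes or redeployments, a small helper along these lines could stand in for the manual grep. It is a minimal sketch: qemu_conf_group is a hypothetical name, and the sample text is made up for illustration rather than taken from a real qemu.conf.

```python
import re

# Matches uncommented lines such as: group = "hugetlbfs"
GROUP_RE = re.compile(r'^[ \t]*group[ \t]*=[ \t]*"([^"]*)"', re.MULTILINE)


def qemu_conf_group(text):
    """Return the last uncommented group value in qemu.conf text, or None."""
    matches = GROUP_RE.findall(text)
    return matches[-1] if matches else None


# Made-up sample content; commented-out defaults are ignored.
sample = '#user = "root"\n#group = "root"\ngroup = "hugetlbfs"\n'
print(qemu_conf_group(sample))  # hugetlbfs
```

On a compute node, the text would be read from the qemu.conf path used in step 4 and the result compared against the expected "hugetlbfs".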