Created attachment 1459162 [details]
rhos 10 initial templates

Description of problem:
After the FFU procedure finished, I tried to boot a fresh instance for testing. The instance booted successfully, but its network was not configured as expected:

openstack console log show dpdk_test_vm
[ 35.706679] cloud-init[766]: Stderr: ''
[ 35.716646] cloud-init[766]: ci-info: +++++++++++++++++++++++Net device info+++++++++++++++++++++++
[ 35.719563] cloud-init[766]: ci-info: +--------+------+-----------+-----------+-------------------+
[ 35.722448] cloud-init[766]: ci-info: | Device |  Up  |  Address  |    Mask   |     Hw-Address    |
[ 35.725579] cloud-init[766]: ci-info: +--------+------+-----------+-----------+-------------------+
[ 35.728385] cloud-init[766]: ci-info: |  lo:   | True | 127.0.0.1 | 255.0.0.0 |         .         |
[ 35.728615] cloud-init[766]: ci-info: | eth0:  | True |     .     |     .     | fa:16:3e:52:f1:44 |
[ 35.728903] cloud-init[766]: ci-info: +--------+------+-----------+-----------+-------------------+
[ 35.731155] cloud-init[766]: ci-info: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!Route info failed!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[ 35.964688] cloud-init[766]: 2018-07-16 07:10:11,311 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [0/120s]: request error [('Connection aborted.', error(101, 'Network is unreachable'))]
[ 36.968456] cloud-init[766]: 2018-07-16 07:10:12,315 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [1/120s]: request error [('Connection aborted.', error(101, 'Network is unreachable'))]
[ 37.976223] cloud-init[766]: 2018-07-16 07:10:13,322 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [2/120s]: request error [('Connection aborted.', error(101, 'Network is unreachable'))]

How reproducible:
Always

Steps to Reproduce:
1. Deploy RHOS 10 z5 with the attached templates.
2. Minor update to the latest RHOS 10z.
3. Upgrade to the latest RHOS 13.

Additional info:
Error logs for nova/neutron from the controller are attached.
Created attachment 1459171 [details]
neutron errors from controller
Created attachment 1459172 [details]
nova errors from controller
Also seeing these errors from OVS on the compute:

2018-07-16T13:07:47.368Z|02471|dpif_netlink|WARN|system@ovs-system: cannot create port `dpdk0' because it has unsupported type `dpdk'
2018-07-16T13:07:47.368Z|02472|bridge|WARN|could not add network device dpdk0 to ofproto (Invalid argument)
2018-07-16T13:07:47.368Z|02473|dpif_netlink|WARN|system@ovs-system: cannot create port `dpdk1' because it has unsupported type `dpdk'
2018-07-16T13:07:47.368Z|02474|bridge|WARN|could not add network device dpdk1 to ofproto (Invalid argument)
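These "unsupported type `dpdk'" warnings usually mean the bridge ended up on the kernel (system) datapath instead of the userspace one. A quick sketch of how to confirm that on the compute node (assuming br-link0 is the DPDK bridge of this deployment):

  # expected "true" when OVS-DPDK has been initialised
  ovs-vsctl get Open_vSwitch . other_config:dpdk-init
  # expected "netdev" for a DPDK bridge
  ovs-vsctl get bridge br-link0 datapath_type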
Also, in case it is relevant, I found this in /var/log/messages on the compute:

Jul 16 11:07:06 compute-0 cloud-init: 2018-07-16 11:07:06,469 - stages.py[WARNING]: Failed to rename devices: duplicate mac found! both 'p7p1' and 'bond_api' have mac 'f8:f2:1e:03:bc:40'
Have you proceeded with a reboot on the compute after ovs 2.9 minor update?
[root@compute-0 log]# ovs-vsctl get bridge br-link0 datapath_type
system
[root@compute-0 log]# ovs-vsctl --may-exist add-br br0 -- set bridge br-link0 datapath_type=netdev
[root@compute-0 log]# ovs-vsctl get bridge br-link0 datapath_type
netdev
[root@compute-0 log]# ifdown br-link0 && ifup br-link0
arping: recvfrom: Network is down
Cannot find device "br-link0"
ERROR : [/etc/sysconfig/network-scripts/ifup-eth] Error adding address 10.10.126.104 for br-link0.
arping: Device br-link0 not available.
Cannot find device "br-link0"
[root@compute-0 log]# ovs-vsctl get bridge br-link0 datapath_type
system
(In reply to Yolanda Robla from comment #6)
> Have you proceeded with a reboot on the compute after ovs 2.9 minor update?

Yes, and the NFV regression tempest tests pass.
(In reply to Yolanda Robla from comment #7)
> [root@compute-0 log]# ovs-vsctl get bridge br-link0 datapath_type
> system
> [root@compute-0 log]# ovs-vsctl --may-exist add-br br0 -- set bridge br-link0 datapath_type=netdev
> [root@compute-0 log]# ovs-vsctl get bridge br-link0 datapath_type
> netdev
> [root@compute-0 log]# ifdown br-link0 && ifup br-link0
> arping: recvfrom: Network is down
> Cannot find device "br-link0"
> ERROR : [/etc/sysconfig/network-scripts/ifup-eth] Error adding address 10.10.126.104 for br-link0.
> arping: Device br-link0 not available.
> Cannot find device "br-link0"
> [root@compute-0 log]# ovs-vsctl get bridge br-link0 datapath_type
> system

This looks close. May I ask whether there was any change to the ifcfg-br-link0 file during or before the overcloud FFU (either a manual edit or a change coming from the nic-configs templates)? Note that even a minor change, like adding a blank space, counts here.

If yes, then when the os-net-config command (os-net-config -c /etc/os-net-config/config.json -v) gets executed, br-link0 is deleted and recreated (because os-net-config triggers "ifdown br-link0 && ifup br-link0"), and the newly created br-link0 uses the 'system' datapath_type instead of 'netdev'.

Why does os-net-config get executed during FFU? The restart of the httpd service during the undercloud FFU upgrade causes os-net-config to be re-run on the overcloud nodes if there is any nic-config template change.
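If br-link0 did get recreated with the wrong datapath, a minimal sketch of how to check for it and put it back by hand (just a workaround/check, not the proper fix, assuming br-link0 from this deployment):

  ovs-vsctl get bridge br-link0 datapath_type          # should report netdev
  ovs-vsctl set bridge br-link0 datapath_type=netdev   # set it back if it reverted to system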
Ok, so the problem is isolated to os-net-config. When we use the version from OSP 13, the datapath type is changed to 'system'. When we downgrade the package to the OSP 10 version, the port is set correctly to 'netdev'. I performed a diff between the nic configs generated by the 10 and 13 os-net-config versions, and it reports only this:

diff -u /etc/sysconfig/network-scripts /etc/sysconfig/network-scripts-13

diff -u /etc/sysconfig/network-scripts/ifcfg-br-link0 /etc/sysconfig/network-scripts-13/ifcfg-br-link0
--- /etc/sysconfig/network-scripts/ifcfg-br-link0	2018-07-19 11:26:08.923619095 +0000
+++ /etc/sysconfig/network-scripts-13/ifcfg-br-link0	2018-07-19 11:25:22.983392104 +0000
@@ -9,4 +9,4 @@
 BOOTPROTO=static
 IPADDR=10.10.126.109
 NETMASK=255.255.255.0
-OVS_EXTRA="set port br-link0 tag=526 -- set bridge br-link0 fail_mode=standalone"
+OVS_EXTRA="set port br-link0 tag=526 -- set bridge br-link0 fail_mode=standalone -- del-controller br-link0 -- set bridge br-link0 fail_mode=standalone -- del-controller br-link0"

diff -u /etc/sysconfig/network-scripts/ifcfg-dpdkbond0 /etc/sysconfig/network-scripts-13/ifcfg-dpdkbond0
--- /etc/sysconfig/network-scripts/ifcfg-dpdkbond0	2018-07-19 11:26:08.923619095 +0000
+++ /etc/sysconfig/network-scripts-13/ifcfg-dpdkbond0	2018-07-19 11:25:22.983392104 +0000
@@ -9,4 +9,4 @@
 OVS_BRIDGE=br-link0
 BOND_IFACES="dpdk0 dpdk1"
 MTU=9000
-OVS_EXTRA="set interface dpdk0 mtu_request=$MTU -- set interface dpdk1 mtu_request=$MTU -- set interface dpdk0 options:n_rxq=2 -- set interface dpdk1 options:n_rxq=2 -- set Interface dpdk0 options:dpdk-devargs=0000:82:00.0 -- set Interface dpdk1 options:dpdk-devargs=0000:82:00.1 -- set interface dpdk0 mtu_request=$MTU -- set interface dpdk1 mtu_request=$MTU -- set interface dpdk0 options:n_rxq=2 -- set interface dpdk1 options:n_rxq=2"
+OVS_EXTRA="set interface dpdk0 mtu_request=$MTU -- set interface dpdk1 mtu_request=$MTU -- set interface dpdk0 options:n_rxq=2 -- set interface dpdk1 options:n_rxq=2 -- set Interface dpdk0 options:dpdk-devargs=0000:82:00.0 -- set Interface dpdk1 options:dpdk-devargs=0000:82:00.1 -- set Interface dpdk0 mtu_request=$MTU -- set Interface dpdk1 mtu_request=$MTU -- set interface dpdk0 mtu_request=$MTU -- set interface dpdk1 mtu_request=$MTU -- set interface dpdk0 options:n_rxq=2 -- set interface dpdk1 options:n_rxq=2"
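For reference, a hedged way to compare what the two os-net-config versions would write without actually touching the interfaces is the --noop flag, e.g.:

  os-net-config -c /etc/os-net-config/config.json --noop -v

which only prints the ifcfg contents it would generate, so the 10 vs 13 output can be diffed safely before anything is applied.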
Although that causes a config change and makes OVS restart, it still seems to be only a side effect. I downgraded os-collect-config to earlier versions and ensured that the nic configs were the same as in 10, and on reboots or OVS restarts I can still see the datapath type changed to 'system'.
New findings: the OVS restart was just a red herring. The issue seems to be that the neutron-ovs_agent container does not have the right datapath type. Looking at the config inside the container, it shows the following:

()[neutron@compute-0 /]$ cat /var/lib/kolla/config_files/src/etc/neutron/plugins/ml2/openvswitch_agent.ini
[ovs]
bridge_mappings=tenant:br-link0
integration_bridge=br-int
tunnel_bridge=br-tun
local_ip=10.10.126.109

[agent]
l2_population=False
arp_responder=False
enable_distributed_routing=False
drop_flows_on_start=False
extensions=qos
tunnel_types=vxlan
vxlan_udp_port=4789

[securitygroup]
firewall_driver=iptables_hybrid
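For comparison, when the DPDK agent service is mapped correctly the [ovs] section is expected to also carry the DPDK-specific keys (datapath_type, vhostuser_socket_dir). A hedged check from the compute host, assuming the usual OSP 13 location for the generated container config:

  grep -E 'datapath_type|vhostuser_socket_dir' /var/lib/config-data/puppet-generated/neutron/etc/neutron/plugins/ml2/openvswitch_agent.ini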
The latest findings show that the ovs.pp puppet module is used for this, and DPDK is not correctly enabled there: the way it populates openvswitch_agent.ini simply does not take the DPDK values into account. I modified the module manually, passing the DPDK parameters, and then I can see the difference. So some template changes may be needed as a preparation for going from 10 to 13, before starting FFU; we are looking at it.
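A hedged way to see whether the DPDK-specific values were delivered to the node's hieradata at all (path assumed for an OSP 13 overcloud node):

  grep -ril 'datapath_type\|vhostuser' /etc/puppet/hieradata/

If nothing matches, the puppet run has no DPDK options to put into openvswitch_agent.ini, regardless of which module is used.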
The prepare command for FFU was:

(undercloud) [stack@undercloud-0 ~]$ cat overcloud_upgrade_prepare.sh
#!/bin/env bash
#
# Setup HEAT's output
#
set -euo pipefail

source /home/stack/stackrc

echo "Running ffwd-upgrade upgrade prepare step"

openstack overcloud ffwd-upgrade prepare --stack overcloud \
--templates /usr/share/openstack-tripleo-heat-templates \
--yes \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dpdk.yaml \
-e /home/stack/ospd-10-vxlan-dpdk-two-ports-ctlplane-bonding/network-environment.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ovs-dpdk-permissions.yaml \
-e /home/stack/ffu_repos.yaml \
-e /home/stack/virt/docker-images.yaml \
2>&1
After discussing with Yolanda, I believe this could be the issue. In OSP 10 the ComputeNeutronOvsAgent service is mapped to the OVS-DPDK service, but in OSP 13 the same environment file provides the new DPDK service - ComputeNeutronOvsDpdk - instead. Adding an environment file with the content below will ensure that the original mapping is retained for FFU:

------------------
resource_registry:
  OS::TripleO::Services::ComputeNeutronOvsAgent: /usr/share/openstack-tripleo-heat-templates/docker/services/neutron-ovs-dpdk-agent.yaml
------------------
So Saravanan's comments made sense. I modified your /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dpdk.yaml and set it like the following:

# A Heat environment that can be used to deploy DPDK with OVS
# Deploying DPDK requires enabling hugepages for the overcloud nodes
resource_registry:
  OS::TripleO::Services::ComputeNeutronOvsDpdk: ../puppet/services/neutron-ovs-dpdk-agent.yaml
  OS::TripleO::Services::ComputeNeutronOvsAgent: /usr/share/openstack-tripleo-heat-templates/docker/services/neutron-ovs-dpdk-agent.yaml

parameter_defaults:
  NeutronDatapathType: "netdev"
  NeutronVhostuserSocketDir: "/var/lib/vhost_sockets"
  NovaSchedulerDefaultFilters: "RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,NUMATopologyFilter"
  OvsDpdkDriverType: "vfio-pci"

  #ComputeOvsDpdkParameters:
    ## Host configuration Parameters
    #TunedProfileName: "cpu-partitioning"
    #IsolCpusList: ""             # Logical CPUs list to be isolated from the host process (applied via cpu-partitioning tuned).
                                  # It is mandatory to provide isolated cpus for tuned to achive optimal performance.
                                  # Example: "3-8,12-15,18"
    #KernelArgs: ""               # Space separated kernel args to configure hugepage and IOMMU.
                                  # Deploying DPDK requires enabling hugepages for the overcloud compute nodes.
                                  # It also requires enabling IOMMU when using the VFIO (vfio-pci) OvsDpdkDriverType.
                                  # This should be done by configuring parameters via host-config-and-reboot.yaml environment file.
    ## Attempting to deploy DPDK without appropriate values for the below parameters may lead to unstable deployments
    ## due to CPU contention of DPDK PMD threads.
    ## It is highly recommended to to enable isolcpus (via KernelArgs) on compute overcloud nodes and set the following parameters:
    #OvsDpdkSocketMemory: ""      # Sets the amount of hugepage memory to assign per NUMA node.
                                  # It is recommended to use the socket closest to the PCIe slot used for the
                                  # desired DPDK NIC. Format should be comma separated per socket string such as:
                                  # "<socket 0 mem MB>,<socket 1 mem MB>", for example: "1024,0".
    #OvsPmdCoreList: ""           # List or range of CPU cores for PMD threads to be pinned to. Note, NIC
                                  # location to cores on socket, number of hyper-threaded logical cores, and
                                  # desired number of PMD threads can all play a role in configuring this setting.
                                  # These cores should be on the same socket where OvsDpdkSocketMemory is assigned.
                                  # If using hyperthreading then specify both logical cores that would equal the
                                  # physical core. Also, specifying more than one core will trigger multiple PMD
                                  # threads to be spawned, which may improve dataplane performance.
    #NovaVcpuPinSet: ""           # Cores to pin Nova instances to. For maximum performance, select cores
                                  # on the same NUMA node(s) selected for previous settings.
    #NumDpdkInterfaceRxQueues: 1

Note the new line I added there, mapping ComputeNeutronOvsAgent to the neutron-ovs-dpdk-agent service. The issue was that the service was mapped to the standard OVS agent, not to the OVS-DPDK one.

Ziv, can you modify your template and run FFU with that? This needs to be in place before the prepare step of FFU.
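Before re-running the prepare step, a trivial sanity check (just a sketch) that the service template referenced by the new mapping actually exists on the upgraded undercloud:

  ls /usr/share/openstack-tripleo-heat-templates/docker/services/neutron-ovs-dpdk-agent.yaml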
As per your request, I have added the definition below to my THTs and re-ran FFU end-to-end:

------------------
resource_registry:
  OS::TripleO::Services::ComputeNeutronOvsAgent: /usr/share/openstack-tripleo-heat-templates/docker/services/neutron-ovs-dpdk-agent.yaml
------------------

I checked the DPDK bond configuration; it no longer shows any error after the FFU:

        Port "dpdkbond0"
            Interface "dpdk0"
                type: dpdk
                options: {dpdk-devargs="0000:82:00.0", n_rxq="2"}
            Interface "dpdk1"
                type: dpdk
                options: {dpdk-devargs="0000:82:00.1", n_rxq="2"}

Immediately after the FFU I tried to boot a DPDK instance, but it stayed in spawning/build state for a long period of time. After a failed attempt to delete it, I decided to reboot the overcloud nodes, just to make sure that everything is in place. A new problem came up: the controllers (watched via virsh console) freeze or get stuck with a black screen and, instead of rebooting, they eventually power off after 10 minutes or so. Checking the OpenStack status still shows them as 'ACTIVE'. I powered them on manually and repeated the overcloud reboot; exactly the same issue happened. Please, any idea what could be causing this?
I can see this in the logs:

2018-07-23 09:26:03.812 19840 INFO neutron.agent.securitygroups_rpc [req-5c81362f-c289-47b9-b2c6-1027548f778f - - - - -] Refresh firewall rules
2018-07-23 09:26:03.843 19840 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-5c81362f-c289-47b9-b2c6-1027548f778f - - - - -] Configuration for devices up [] and devices down [] completed.
2018-07-23 09:26:07.812 19840 WARNING neutron.agent.rpc [req-5c81362f-c289-47b9-b2c6-1027548f778f - - - - -] Device Port(admin_state_up=True,allowed_address_pairs=[],binding=PortBinding,binding_levels=[],created_at=2018-07-23T09:26:01Z,data_plane_status=<?>,description='',device_id='869e8699-aa23-40f0-bad7-5a94a72ec7f9',device_owner='compute:nova',dhcp_options=[],distributed_binding=None,dns=None,fixed_ips=[IPAllocation],id=a64b01e7-ff4e-4c62-a00e-ec6917e258c6,mac_address=fa:16:3e:a3:60:ac,name='',network_id=bf95a4f1-58e3-4c84-bb7a-4ec3122fec7e,project_id='6f2cc2c3c52b4137aea7cab531a34336',qos_policy_id=None,revision_number=8,security=PortSecurity(a64b01e7-ff4e-4c62-a00e-ec6917e258c6),security_group_ids=set([f878e7de-6a57-4ae1-919c-3add9bf0504a]),status='DOWN',updated_at=2018-07-23T09:26:04Z) is not bound.
2018-07-23 09:26:07.814 19840 WARNING neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-5c81362f-c289-47b9-b2c6-1027548f778f - - - - -] Device a64b01e7-ff4e-4c62-a00e-ec6917e258c6 not defined on plugin or binding failed
2018-07-23 09:26:07.817 19840 INFO neutron.agent.securitygroups_rpc [req-5c81362f-c289-47b9-b2c6-1027548f778f - - - - -] Preparing filters for devices set([u'a64b01e7-ff4e-4c62-a00e-ec6917e258c6'])
2018-07-23 09:26:07.856 19840 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-5c81362f-c289-47b9-b2c6-1027548f778f - - - - -] Configuration for devices up [] and devices down [] completed.
2018-07-23 09:26:39.138 19840 ERROR oslo.messaging._drivers.impl_rabbit [-] Failed to process message ... skipping it.: DuplicateMessageError: Found duplicate message(72a4883df5d042628b0a76008082a55a). Skipping it.
2018-07-23 09:26:39.138 19840 ERROR oslo.messaging._drivers.impl_rabbit Traceback (most recent call last):
2018-07-23 09:26:39.138 19840 ERROR oslo.messaging._drivers.impl_rabbit   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 368, in _callback
2018-07-23 09:26:39.138 19840 ERROR oslo.messaging._drivers.impl_rabbit     self.callback(RabbitMessage(message))
2018-07-23 09:26:39.138 19840 ERROR oslo.messaging._drivers.impl_rabbit   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 244, in __call__
2018-07-23 09:26:39.138 19840 ERROR oslo.messaging._drivers.impl_rabbit     unique_id = self.msg_id_cache.check_duplicate_message(message)
2018-07-23 09:26:39.138 19840 ERROR oslo.messaging._drivers.impl_rabbit   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqp.py", line 121, in check_duplicate_message
2018-07-23 09:26:39.138 19840 ERROR oslo.messaging._drivers.impl_rabbit     raise rpc_common.DuplicateMessageError(msg_id=msg_id)
2018-07-23 09:26:39.138 19840 ERROR oslo.messaging._drivers.impl_rabbit DuplicateMessageError: Found duplicate message(72a4883df5d042628b0a76008082a55a). Skipping it.
2018-07-23 09:26:39.138 19840 ERROR oslo.messaging._drivers.impl_rabbit
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [-] Failed reporting state!: MessagingTimeout: Timed out waiting for a reply to message ID f876992db6ac4c3d897f6dc871abf6af
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent Traceback (most recent call last):
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 320, in _report_state
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     True)
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/rpc.py", line 93, in report_state
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     return method(context, 'report_state', **kwargs)
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 174, in call
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     retry=self.retry)
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 131, in _send
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     timeout=timeout, retry=retry)
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 559, in send
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     retry=retry)
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 548, in _send
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     result = self._waiter.wait(msg_id, timeout)
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 440, in wait
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     message = self.waiters.get(msg_id, timeout=timeout)
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 328, in get
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     'to message ID %s' % msg_id)
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent MessagingTimeout: Timed out waiting for a reply to message ID f876992db6ac4c3d897f6dc871abf6af
2018-07-23 09:26:59.271 19840 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent
2018-07-23 09:26:59.272 19840 WARNING oslo.service.loopingcall [-] Function 'neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent.OVSNeutronAgent._report_state' run outlasted interval by 30.01 sec
I could also find issues when trying to restart the controllers from the control plane. It seems they could not be restarted gracefully; they were showing problems with rabbit/galera/haproxy, etc. These are the logs: http://pastebin.test.redhat.com/621684
The problem with the locked VMs was an incorrect mapping of VhostuserSocketGroup. It was mapped as:

parameter_defaults:
  ComputeOvsDpdkParameters:
    VhostuserSocketGroup: "hugetlbfs"

But that role is not used. It needs to be:

parameter_defaults:
  ComputeParameters:
    VhostuserSocketGroup: "hugetlbfs"

as Compute is the default role used in this deployment.
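A couple of hedged checks on the compute node to confirm the group setting actually landed (socket directory taken from this deployment's NeutronVhostuserSocketDir; the qemu.conf check assumes the group is also applied there via the ovs-dpdk-permissions environment):

  ls -ld /var/lib/vhost_sockets        # group should be hugetlbfs
  grep '^group' /etc/libvirt/qemu.conf # qemu is expected to run with the same group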
(In reply to Yolanda Robla from comment #22)
> The problem with the locked VMs was an incorrect mapping of
> VhostuserSocketGroup. It was mapped as:
>
> parameter_defaults:
>   ComputeOvsDpdkParameters:
>     VhostuserSocketGroup: "hugetlbfs"
>
> But that role is not used. It needs to be:
>
> parameter_defaults:
>   ComputeParameters:
>     VhostuserSocketGroup: "hugetlbfs"
>
> as Compute is the default role used in this deployment.

The solution above has been verified.

Thanks,
Ziv
This bug is marked for inclusion in the errata but does not currently contain draft documentation text. To ensure the timely release of this advisory please provide draft documentation text for this bug as soon as possible. If you do not think this bug requires errata documentation, set the requires_doc_text flag to "-".

To add draft documentation text:

* Select the documentation type from the "Doc Type" drop down field.
* A template will be provided in the "Doc Text" field based on the "Doc Type" value selected. Enter draft text in the "Doc Text" field.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2574
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days