Description of problem:

When deploying a hw-offload setup, the deployment fails with the following error:

~~~
2022-07-05 12:14:58.355698 | 525400ae-78b5-fb35-2c87-000000006651 | FATAL | Create containers managed by Podman for /var/lib/tripleo-config/container-puppet-config/step_1 | computehwoffload-r740 | error={
    "changed": false,
    "invocation": {
        "module_args": {
            "concurrency": 6,
            "config_dir": "/var/lib/tripleo-config/container-puppet-config/step_1",
            "config_id": "tripleo_puppet_step1",
            "config_overrides": {},
            "config_patterns": "container-puppet-*.json",
            "debug": false,
            "log_base_path": "/var/log/containers/stdouts"
        }
    },
    "msg": "Failed containers: container-puppet-ovn_controller"
~~~

On computehwoffload-r740 I see the following error:

~~~
[root@computehwoffload-r740 stdouts]# cat container-puppet-ovn_controller.log
2022-07-05T12:14:43.080126021+00:00 stdout F include ::tripleo::packages
2022-07-05T12:14:43.080126021+00:00 stdout F include tripleo::profile::base::neutron::agents::ovn
2022-07-05T12:14:43.080126021+00:00 stdout F
2022-07-05T12:14:43.269214534+00:00 stdout F Running puppet
2022-07-05T12:14:43.270944478+00:00 stderr F + logger -s -t puppet-user
2022-07-05T12:14:43.283502144+00:00 stderr F + /usr/bin/puppet apply --summarize --detailed-exitcodes --color=false --modulepath=/etc/puppet/modules:/usr/share/openstack-puppet/modules --tags '"file,file_line,concat,augeas,cron,vs_config,exec"' /etc/config.pp
2022-07-05T12:14:50.035623604+00:00 stderr F <13>Jul 5 12:14:43 puppet-user: Warning: /etc/puppet/hiera.yaml: Use of 'hiera.yaml' version 3 is deprecated. It should be converted to version 5
2022-07-05T12:14:50.037551973+00:00 stderr F <13>Jul 5 12:14:50 puppet-user: (file: /etc/puppet/hiera.yaml)
2022-07-05T12:14:50.039158745+00:00 stderr F <13>Jul 5 12:14:50 puppet-user: Warning: Undefined variable '::deploy_config_name';
2022-07-05T12:14:50.039969777+00:00 stderr F <13>Jul 5 12:14:50 puppet-user: (file & line not available)
2022-07-05T12:14:50.104264665+00:00 stderr F <13>Jul 5 12:14:50 puppet-user: Warning: The function 'hiera' is deprecated in favor of using 'lookup'. See https://puppet.com/docs/puppet/7.10/deprecated_language.html
2022-07-05T12:14:50.104264665+00:00 stderr F <13>Jul 5 12:14:50 puppet-user: (file & line not available)
2022-07-05T12:14:50.528023670+00:00 stderr F <13>Jul 5 12:14:50 puppet-user: Notice: Compiled catalog for computehwoffload-r740.redhat.local in environment production in 0.53 seconds
2022-07-05T12:14:50.618343800+00:00 stderr F <13>Jul 5 12:14:50 puppet-user: Error: Found 1 dependency cycle:
2022-07-05T12:14:50.618343800+00:00 stderr F <13>Jul 5 12:14:50 puppet-user: (Service[openvswitch] => Vs_config[other_config:hw-offload] => Service[openvswitch])
Try the '--graph' option and opening the resulting '.dot' file in OmniGraffle or GraphViz
2022-07-05T12:14:50.627993309+00:00 stderr F <13>Jul 5 12:14:50 puppet-user: Error: Failed to apply catalog: One or more resource dependency cycles detected in graph
~~~

I will attach the sos report and the templates used.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
I forgot to add the puddle used: RHOS-17.0-RHEL-9-20220628.n.1
I think there is a loop in the puppet manifests. This is the error:

~~~
2022-07-13T13:45:52.172386522+00:00 stderr F <13>Jul 13 13:45:52 puppet-user: Error: Found 1 dependency cycle:
2022-07-13T13:45:52.172557445+00:00 stderr F <13>Jul 13 13:45:52 puppet-user: (Service[openvswitch] => Vs_config[other_config:hw-offload] => Service[openvswitch])
Try the '--graph' option and opening the resulting '.dot' file in OmniGraffle or GraphViz
2022-07-13T13:45:52.175890118+00:00 stderr F <13>Jul 13 13:45:52 puppet-user: Error: Failed to apply catalog: One or more resource dependency cycles detected in graph
~~~

So Service[openvswitch] depends on Vs_config[other_config:hw-offload], and Vs_config[other_config:hw-offload] depends on Service[openvswitch].

We can see it in the code.

/usr/share/openstack-tripleo-heat-templates/deployment/ovn/ovn-controller-container-puppet.yaml:

~~~
  # Merging role-specific parameters (RoleParameters) with the default parameters.
  # RoleParameters will have the precedence over the default parameters.
  RoleParametersValue:
    type: OS::Heat::Value
    properties:
      type: json
      value:
        map_replace:
          - map_replace:
              - ovn::controller::ovn_bridge_mappings: NeutronBridgeMappings
                ovn::controller::ovn_cms_options:
                  if:
                    - az_ovn_unset
                    - OVNCMSOptions
                    - list_join:
                        - ''
                        - - OVNCMSOptions
                          - ",availability-zones="
                          - {get_param: OVNAvailabilityZone}
                vswitch::ovs::enable_hw_offload: OvsHwOffload
              - values: {get_param: [RoleParameters]}
          - values:
              NeutronBridgeMappings: {get_param: NeutronBridgeMappings}
              OVNCMSOptions: {get_param: OVNCMSOptions}
              OvsHwOffload: {get_param: OvsHwOffload}
~~~

/usr/share/openstack-puppet/modules/vswitch/manifests/ovs.pp:

~~~
if $enable_hw_offload {
  vs_config { 'other_config:hw-offload':
    value  => 'true',
    notify => Service['openvswitch'],
    wait   => true,
  }
}
~~~
Modified container-puppet.sh and added the options -d -v --graph:

~~~
/usr/bin/puppet apply --summarize -d -v --graph
~~~

Checking the graph and debug files (last_run_report.yaml), there is a problem between lines 89 and 112:

~~~
resource_statuses:
  Service[openvswitch]:
    title: openvswitch
    file: "/etc/puppet/modules/vswitch/manifests/ovs.pp"
    line: 112
    resource: Service[openvswitch]
    resource_type: Service
    ....
    message: resource is part of a dependency cycle
    name: resource_error
  Vs_config[other_config:hw-offload]:
    title: other_config:hw-offload
    file: "/etc/puppet/modules/vswitch/manifests/ovs.pp"
    line: 89
    resource: Vs_config[other_config:hw-offload]
    resource_type: Vs_config
    ....
    message: resource is part of a dependency cycle
    name: resource_error
~~~

vi /etc/puppet/modules/vswitch/manifests/ovs.pp:

~~~
 88 if $enable_hw_offload {
 89   vs_config { 'other_config:hw-offload':
 90     value  => 'true',
 91     notify => Service['openvswitch'],
 92     wait   => true,
 93   }
 94 }
...
112 service { 'openvswitch':
113   ensure    => true,
114   enable    => true,
115   name      => $::vswitch::params::ovs_service_name,
116   status    => $::vswitch::params::ovs_status,
117   hasstatus => $::vswitch::params::ovs_service_hasstatus
118 }
~~~
I have tried to fix the cycle: after commenting out line 91, the container is able to start and offload is configured, so I think there is an issue in puppet.

vi /etc/puppet/modules/vswitch/manifests/ovs.pp:

~~~
 88 if $enable_hw_offload {
 89   vs_config { 'other_config:hw-offload':
 90     value => 'true',
 91     # notify => Service['openvswitch'],
 92     wait  => true,
 93   }
 94 }
...
112 service { 'openvswitch':
113   ensure    => true,
114   enable    => true,
115   name      => $::vswitch::params::ovs_service_name,
116   status    => $::vswitch::params::ovs_status,
117   hasstatus => $::vswitch::params::ovs_service_hasstatus
118 }
~~~

I have seen that in master it is done in a different way and the notify is not there any more. Could it be that our puppet files are outdated?
https://github.com/openstack/puppet-vswitch/blob/master/manifests/ovs.pp
Version installed is puppet-vswitch-14.4.2-0.20220317212602.3facbb3.el9ost.noarch

Replacing notify with require also works. I do not understand why it is notify:

~~~
 88 if $enable_hw_offload {
 89   vs_config { 'other_config:hw-offload':
 90     value   => 'true',
 91     require => Service['openvswitch'],
 92     wait    => true,
 93   }
 94 }
~~~
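For context, the difference between the two metaparameters can be sketched as follows. This is an illustration of standard Puppet relationship semantics, not the actual patch; the two declarations are mutually exclusive alternatives, never both in one manifest:

~~~puppet
# Alternative A: 'notify' orders Vs_config BEFORE the service and
# additionally refreshes (restarts) the service whenever the value
# changes. This adds the edge Vs_config -> Service to the graph.
vs_config { 'other_config:hw-offload':
  value  => 'true',
  notify => Service['openvswitch'],
}

# Alternative B: 'require' orders in the OPPOSITE direction
# (Service -> Vs_config) and carries no refresh: the service is
# managed first, and it is never restarted when the value changes.
vs_config { 'other_config:hw-offload':
  value   => 'true',
  require => Service['openvswitch'],
}
~~~

This is why swapping notify for require breaks the cycle (the edge direction flips) but also silently drops the restart-on-change behaviour.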
I've gone through the stable/wallaby code but could not find the actual trigger.

The issue was reported upstream and was fixed in xena and later:
https://review.opendev.org/c/openstack/puppet-vswitch/+/805549

At that time we regarded the issue as a regression caused by the ordering added by https://review.opendev.org/c/openstack/puppet-vswitch/+/805549, so we did not fix it in wallaby. However, puppet-ovn has that problematic ordering here:
https://github.com/openstack/puppet-ovn/blob/stable/wallaby/manifests/controller.pp#L223
and that is triggering the problem, it seems.

> Replacing notify by require also works. I do not understand why it is notify

We need to notify the service here because changing hw-offload requires restarting the openvswitch service.
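Putting the two pieces together, the reported cycle can be reproduced with a minimal manifest. This is a trimmed sketch, assuming the puppet-ovn ordering runs the service before the vs_config (which is what the reported cycle Service => Vs_config => Service implies), not a verbatim copy of either module:

~~~puppet
service { 'openvswitch':
  ensure => running,
}

# From puppet-vswitch ovs.pp: 'notify' adds the edge
# Vs_config -> Service (apply the config, then restart the service).
vs_config { 'other_config:hw-offload':
  value  => 'true',
  notify => Service['openvswitch'],
}

# Ordering declared elsewhere (puppet-ovn controller.pp) adds the
# opposite edge, Service -> Vs_config, which closes the loop:
# Service[openvswitch] => Vs_config[...] => Service[openvswitch]
Service['openvswitch'] -> Vs_config['other_config:hw-offload']
~~~

Each edge is harmless on its own; the cycle only appears when both modules contribute their ordering to the same catalog, which is why the trigger was hard to spot from the puppet-vswitch code alone.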
> I've gone through the stable/wallaby code but could not find the actual trigger.

Ignore this first line. I later found out the problem is triggered by the implementation in puppet-ovn.

So I've submitted two patches to stable/wallaby. These replace the notification from Vs_config to Service with a new Exec resource that triggers the restart command directly, so they should solve that dependency problem.
https://review.opendev.org/c/openstack/puppet-vswitch/+/849949
https://review.opendev.org/c/openstack/puppet-vswitch/+/849950
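The idea behind those patches can be sketched roughly like this. This is a hedged sketch of the approach (see the reviews above for the real change); the resource title and command are illustrative:

~~~puppet
# Drop the direct notify to Service['openvswitch'] that closed the
# cycle; the vs_config no longer has an edge into the service.
vs_config { 'other_config:hw-offload':
  value => 'true',
  wait  => true,
}

# A dedicated Exec restarts openvswitch directly. 'refreshonly'
# means it runs only when a subscribed resource changes, so the
# restart-on-change behaviour is preserved.
exec { 'restart openvswitch':
  command     => 'systemctl restart openvswitch.service',
  path        => ['/usr/bin', '/usr/sbin', '/bin', '/sbin'],
  refreshonly => true,
  subscribe   => Vs_config['other_config:hw-offload'],
}
~~~

Because the Exec is a separate vertex in the dependency graph, the restart still happens when the option changes, without re-introducing an edge back into Service['openvswitch'].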
Hi,

I applied those patches to the compute and controller nodes and it deployed successfully.

There are some files that were not patched because they are not in the installation. Why are they missing?
spec/classes/vswitch_dpdk_spec.rb
spec/classes/vswitch_ovs_spec.rb
(In reply to Miguel Angel Nieto from comment #14)
> I applied those patches to compute and controller nodes and it deployed
> sucessfully
>
> There are some files that were not patched because they are not in the
> installation. why are they missing?
> spec/classes/vswitch_dpdk_spec.rb
> spec/classes/vswitch_ovs_spec.rb

These are files for unit tests, so they are not installed or used in actual deployments.
(In reply to Takashi Kajinami from comment #12)
> ...
> > Replacing notify by require also works. I do not understand why it is notify
> We need to notify the service here because changing hw-offload requires
> restarting the openvswitch service.

Hmm... Looking at a few web articles, it is mentioned that the openvswitch service should be restarted after setting other_config:hw-offload = true. Example:
https://docs.openstack.org/neutron/latest/admin/config-ovs-offload.html#create-compute-virtual-functions

~~~
3. Restart Open vSwitch

# sudo systemctl enable openvswitch.service
# sudo ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
# sudo systemctl restart openvswitch.service
~~~

However, in current TripleO we do not enable the service resource type (and I don't think we can, as we run puppet from containers), so the openvswitch service is NOT restarted after other_config:hw-offload=true is set (*1).

(*1) The current failure is caused by the defined resources, but that does not necessarily mean all resources are executed, as we explicitly select enabled resources by tags.

I'm not quite familiar with this area, but is that expected? It was earlier mentioned this was tested in OSP16, so I assume the current implementation worked without problems. If we don't need the service restart then we'd be able to delete that notification.
Just for the record: in stable/train we use puppet-ovn to set the hw-offload option. This was later deprecated in favor of the capability we added to puppet-vswitch:
https://review.opendev.org/c/openstack/puppet-vswitch/+/779802/
https://review.opendev.org/c/openstack/puppet-ovn/+/779804

The old implementation in puppet-ovn did not notify the openvswitch service, so it did not cause this problem.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2022:6543
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days