Bug 1679007 - realtime-virtual-host profile apply remains stuck during initial compute deployment via openstack director
Keywords:
Status: CLOSED DUPLICATE of bug 1554851
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: tuned
Version: 7.6
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jaroslav Škarvada
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-02-20 05:22 UTC by Jaison Raju
Modified: 2019-02-28 05:25 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-02-28 05:25:11 UTC
Target Upstream Version:
Embargoed:



Description Jaison Raju 2019-02-20 05:22:47 UTC
Description of problem:
While deploying RT KVM compute nodes via OpenStack director, applying the realtime-virtual-host tuned profile hangs for more than 600 seconds and the deployment fails.
The qemu-kvm process started by the profile's script.sh keeps running:

root      26847  17321  0 23:42 ?        00:00:00 /bin/sh /usr/lib/tuned/realtime-virtual-host/script.sh start
root      26863      2  0 23:42 ?        00:00:00 [kworker/u900:1]
root      26864      2  0 23:42 ?        00:00:00 [kworker/u901:1]
root      26919  26847 99 23:42 ?        00:07:03 /usr/libexec/qemu-kvm -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio -device pci-testdev -kernel /usr/share/qemu-kvm/tscdeadline_latency.flat -cpu host
root      26920  26847  0 23:42 ?        00:00:00 grep latency
root      26921  26847  0 23:42 ?        00:00:00 cut -f 2 -d :
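
In the listing above, script.sh (PID 26847) is the parent of the qemu-kvm, grep and cut processes. As a minimal sketch, assuming `ps -ef` style columns (UID PID PPID C STIME TTY TIME CMD), the children of the profile script can be picked out as follows. The function name `children_of` and the `SAMPLE` variable are illustrative only and are not part of tuned.

```python
# Sample rows copied from the process listing in this report.
SAMPLE = """\
root      26847  17321  0 23:42 ?        00:00:00 /bin/sh /usr/lib/tuned/realtime-virtual-host/script.sh start
root      26863      2  0 23:42 ?        00:00:00 [kworker/u900:1]
root      26864      2  0 23:42 ?        00:00:00 [kworker/u901:1]
root      26919  26847 99 23:42 ?        00:07:03 /usr/libexec/qemu-kvm -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio -device pci-testdev -kernel /usr/share/qemu-kvm/tscdeadline_latency.flat -cpu host
root      26920  26847  0 23:42 ?        00:00:00 grep latency
root      26921  26847  0 23:42 ?        00:00:00 cut -f 2 -d :
"""


def children_of(ps_output, needle):
    """Return (pid, command) for every process whose parent command contains `needle`."""
    rows = []
    for line in ps_output.splitlines():
        # Split into at most 8 fields so the command keeps its own spaces.
        parts = line.split(None, 7)
        if len(parts) < 8 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue  # skip the header line and anything malformed
        rows.append((int(parts[1]), int(parts[2]), parts[7]))
    parent_pids = {pid for pid, _, cmd in rows if needle in cmd}
    return [(pid, cmd) for pid, ppid, cmd in rows if ppid in parent_pids]


if __name__ == "__main__":
    for pid, cmd in children_of(SAMPLE, "realtime-virtual-host/script.sh"):
        print(pid, cmd.split()[0])
```

The qemu-kvm child is the one accumulating CPU time (00:07:03 in the TIME column), which matches the description of the script waiting on it.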

Version-Release number of selected component (if applicable):
[root@overcloud-computeovsdpdkrt-0 ~]# rpm -qa | grep tuned
tuned-2.10.0-6.el7.noarch
tuned-profiles-realtime-2.10.0-6.el7.noarch
tuned-profiles-nfv-host-2.10.0-6.el7.noarch
tuned-profiles-cpu-partitioning-2.10.0-6.el7.noarch
[root@overcloud-computeovsdpdkrt-0 ~]# rpm -q kernel-rt
kernel-rt-3.10.0-957.5.1.rt56.916.el7.x86_64

(undercloud) [stack@ocp-130-107 ~]$ rpm -q openstack-tripleo-heat-templates
openstack-tripleo-heat-templates-8.0.7-21.el7ost.noarch


How reproducible:
Always

Steps to Reproduce:
1. Deploy an environment with RT KVM

Actual results:
The realtime-virtual-host tuned profile takes an indefinite time to finish and the stack deployment fails.
$ openstack stack failures list --long overcloud
overcloud.ComputeOvsDpdkSriovRT.0.PreNetworkConfig.HostParametersDeployment:
  resource_type: OS::Heat::SoftwareDeployment
  physical_resource_id: 9e484e3c-2baf-4a9f-bf8b-a8d42c586085
  status: CREATE_FAILED
  status_reason: |
    Error: resources.HostParametersDeployment: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
  deploy_stdout: |
    
    PLAY [Configuration to be applied before rebooting the node] *******************
    
    TASK [Gathering Facts] *********************************************************
    ok: [localhost]
    
    TASK [Ensure the kernel args ( default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=2-39 ) is present as TRIPLEO_HEAT_TEMPLATE_KERNEL_ARGS] ***
    changed: [localhost]
    
    TASK [Add TRIPLEO_HEAT_TEMPLATE_KERNEL_ARGS to the GRUB_CMDLINE_LINUX parameter] ***
    changed: [localhost]
    
    TASK [Generate grub config file] ***********************************************
    changed: [localhost]
    
    TASK [Tune-d Configuration] ****************************************************
    changed: [localhost]
    
    TASK [Tune-d profile activation] ***********************************************
    fatal: [localhost]: FAILED! => {"changed": true, "cmd": "tuned-adm profile realtime-virtual-host", "delta": "0:10:01.625060", "end": "2019-02-19 07:46:44.661856", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2019-02-19 07:36:43.036796", "stderr": "", "stderr_lines": [], "stdout": "Operation timed out after waiting 600 seconds(s), you may try to increase timeout by using --timeout command line option or using --async.", "stdout_lines": ["Operation timed out after waiting 600 seconds(s), you may try to increase timeout by using --timeout command line option or using --async."]}
    	to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/7e86f3ac-b22f-4a87-acd2-1dbeb2755d93_playbook.retry
    
    PLAY RECAP *********************************************************************
    localhost                  : ok=5    changed=4    unreachable=0    failed=1   
    
  deploy_stderr: |

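The `delta` value in the failed task above (0:10:01.625060) is just over the 600-second limit quoted in the tuned-adm message. A minimal sketch for checking this from the task result, assuming the delta is in plain `H:MM:SS.ffffff` form with no day component; `parse_delta` is a hypothetical helper, and the JSON below is a trimmed copy of the result shown above:

```python
import json
from datetime import timedelta

# Trimmed copy of the failed task result; only the fields used here are kept.
RESULT = json.loads(
    '{"cmd": "tuned-adm profile realtime-virtual-host",'
    ' "delta": "0:10:01.625060", "rc": 1}'
)

TIMEOUT = timedelta(seconds=600)  # limit quoted in the tuned-adm message


def parse_delta(delta):
    """Convert an 'H:MM:SS.ffffff' string into a timedelta (assumes no day part)."""
    hours, minutes, seconds = delta.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


if __name__ == "__main__":
    elapsed = parse_delta(RESULT["delta"])
    print("timed out:", RESULT["rc"] != 0 and elapsed >= TIMEOUT)
```

This only confirms that the command ran into the timeout. It does not explain why the profile script never finished.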

Expected results:
realtime-virtual-host tuned profile is applied successfully.


Additional info:

Comment 4 Marcelo Tosatti 2019-02-20 12:24:30 UTC
Looks like a duplicate of

https://bugzilla.redhat.com/show_bug.cgi?id=1670275

Can you confirm the workaround in that BZ fixes the problem?

Comment 6 Jaison Raju 2019-02-28 05:25:11 UTC
The following patch fixes the issue as suggested.
https://github.com/redhat-performance/tuned/commit/4790e570ce0e41bde4e1866ed6e3cba723b5f4d8
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
# guestmount -a /tmp/overcloud-realtime-compute.qcow2  -m /dev/sda /mnt/
# wget https://raw.githubusercontent.com/redhat-performance/tuned/4790e570ce0e41bde4e1866ed6e3cba723b5f4d8/profiles/realtime-virtual-host/script.sh
--2019-02-27 21:30:51--  https://raw.githubusercontent.com/redhat-performance/tuned/4790e570ce0e41bde4e1866ed6e3cba723b5f4d8/profiles/realtime-virtual-host/script.sh
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.152.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.152.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3550 (3.5K) [text/plain]
Saving to: ‘script.sh.1’

100%[=========================================================================================================================================================================>] 3,550       --.-K/s   in 0s      

2019-02-27 21:30:51 (95.1 MB/s) - ‘script.sh.1’ saved [3550/3550]

# cat script.sh.1 > /mnt/usr/lib/tuned/realtime-virtual-host/script.sh
# guestunmount /mnt/

$ openstack overcloud image upload --update-existing --os-image-name overcloud-realtime-compute.qcow2 
Image "overcloud-realtime-compute-vmlinuz" is up-to-date, skipping.
Image "overcloud-realtime-compute-initrd" is up-to-date, skipping.
Image "overcloud-realtime-compute" was uploaded.
+--------------------------------------+----------------------------+-------------+------------+--------+
|                  ID                  |            Name            | Disk Format |    Size    | Status |
+--------------------------------------+----------------------------+-------------+------------+--------+
| f2a5a793-f5bc-4227-8192-cd738310d57f | overcloud-realtime-compute |    qcow2    | 2642280448 | active |
+--------------------------------------+----------------------------+-------------+------------+--------+
Image "bm-deploy-kernel" is up-to-date, skipping.
Image "bm-deploy-ramdisk" is up-to-date, skipping.
Image file "/httpboot/agent.kernel" is up-to-date, skipping.
Image file "/httpboot/agent.ramdisk" is up-to-date, skipping.
Some images have been updated in Glance, make sure to rerun
	openstack overcloud node configure
to reflect the changes on the nodes
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
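
The steps above can be sketched as a command plan, for review before running anything. This is an illustration under assumptions, not a tested procedure: `workaround_plan` is a hypothetical helper, `wget -O` is used to name the download explicitly, and `cp` stands in for the `cat ... >` redirection used in the transcript. The commit hash, paths and image name are taken from this comment.

```python
import os
import shlex

# Commit and script path as given in this comment.
COMMIT = "4790e570ce0e41bde4e1866ed6e3cba723b5f4d8"
SCRIPT_PATH = "usr/lib/tuned/realtime-virtual-host/script.sh"


def workaround_plan(image, mountpoint="/mnt", download="script.sh.1"):
    """Return the workaround as a list of argv lists; nothing is executed."""
    url = (
        "https://raw.githubusercontent.com/redhat-performance/tuned/"
        f"{COMMIT}/profiles/realtime-virtual-host/script.sh"
    )
    target = f"{mountpoint}/{SCRIPT_PATH}"
    return [
        # Mount the overcloud image so the script inside it can be replaced.
        ["guestmount", "-a", image, "-m", "/dev/sda", mountpoint],
        # Fetch the patched script from the referenced commit.
        ["wget", "-O", download, url],
        # Overwrite the script in the image (assumption: cp instead of cat >).
        ["cp", download, target],
        ["guestunmount", mountpoint],
        # Re-upload the modified image to Glance.
        ["openstack", "overcloud", "image", "upload", "--update-existing",
         "--os-image-name", os.path.basename(image)],
    ]


if __name__ == "__main__":
    for argv in workaround_plan("/tmp/overcloud-realtime-compute.qcow2"):
        print(shlex.join(argv))
```

As the upload output notes, `openstack overcloud node configure` should be rerun afterwards so the nodes reflect the updated image.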

*** This bug has been marked as a duplicate of bug 1554851 ***

