Bug 1488517 - Manual reboot needed to apply tuned kernel arguments
Summary: Manual reboot needed to apply tuned kernel arguments
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z9
Target Release: 10.0 (Newton)
Assignee: Saravanan KR
QA Contact: Yariv
URL:
Whiteboard:
Depends On: 1488369
Blocks: 1563386 1563743
Reported: 2017-09-05 14:57 UTC by Ofer Blaut
Modified: 2018-09-26 20:01 UTC (History)

Fixed In Version:
Doc Type: Known Issue
Doc Text:
Clone Of: 1488369
Environment:
Last Closed: 2018-09-05 06:44:37 UTC
Target Upstream Version:
Embargoed:



Description Ofer Blaut 2017-09-05 14:57:07 UTC
+++ This bug was initially created as a clone of Bug #1488369 +++

Description of problem:

After deploying a DPDK and SR-IOV environment, a manual reboot is needed to apply the tuned kernel arguments.

The proper profile is active. It appears that the reboot happened before tuned's profile was activated:

[root@overcloud-compute-0 ~]# tuned-adm active
Current active profile: cpu-partitioning
[root@overcloud-compute-0 ~]# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-3.10.0-693.el7.x86_64 root=UUID=8a1ee696-e3f1-416c-a126-e1e3350310e7 ro console=tty0 console=ttyS0,115200n8 crashkernel=auto rhgb quiet intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt isolcpus=1-21,23-43,45-65,67-87

After a manual reboot:

[root@overcloud-compute-0 ~]# cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-3.10.0-693.el7.x86_64 root=UUID=8a1ee696-e3f1-416c-a126-e1e3350310e7 ro console=tty0 console=ttyS0,115200n8 crashkernel=auto rhgb quiet intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt isolcpus=1-21,23-43,45-65,67-87 nohz=on nohz_full=1-21,23-43,45-65,67-87 rcu_nocbs=1-21,23-43,45-65,67-87 tuned.non_isolcpus=00000004,00001000,00400001 intel_pstate=disable nosoftlockup
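
To see exactly which arguments only appear after the manual reboot, the two command lines can be compared word by word. The following is a minimal sketch, not part of the original report: the function name added_args and the shortened sample strings are assumptions for illustration, and both command lines are passed in as plain strings so nothing on the host is touched.

```shell
#!/bin/bash
# Hypothetical helper: print the kernel arguments that are present in the
# second command line but absent from the first, one per line.
added_args() {
  local before="$1" after="$2" arg
  local -a args
  read -ra args <<< "$after"          # split the "after" line on whitespace
  for arg in "${args[@]}"; do
    case " $before " in
      *" $arg "*) ;;                  # already present before the reboot
      *) printf '%s\n' "$arg" ;;      # only present after the reboot
    esac
  done
}

# Shortened sample strings (assumed for illustration, not full cmdlines).
before='ro quiet iommu=pt isolcpus=1-21,23-43'
after='ro quiet iommu=pt isolcpus=1-21,23-43 nohz=on nohz_full=1-21,23-43 nosoftlockup'
added_args "$before" "$after"
```

On a compute node the inputs would be the contents of /proc/cmdline saved before and after the manual reboot.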

Using the following templates as reference:
https://code.engineering.redhat.com/gerrit/gitweb?p=nfv-qe.git;a=tree;f=ospd-11-vlan-sriov-single-port-composable-roles;hb=docs


Version-Release number of selected component (if applicable):
OSPD 11
tuned 2.8

How reproducible:
Always

Steps to Reproduce:
1. Deploy the environment
2. Check /proc/cmdline
3. Reboot the system
4. Check /proc/cmdline

Actual results:
The tuned kernel parameters are missing from /proc/cmdline until a manual reboot is performed

Expected results:
The tuned kernel parameters should be present after deployment, without a manual reboot
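
Steps 2 and 4 can be scripted so that the missing parameters are listed by name. This is only a sketch under assumptions: the function name missing_args is hypothetical, and the list of expected names is taken from the post-reboot /proc/cmdline shown in this report, so it should be adjusted to the profile in use.

```shell
#!/bin/bash
# Hypothetical helper: report which expected kernel argument names are absent
# from a command line. Usage: missing_args "<cmdline>" name1 name2 ...
missing_args() {
  local cmdline="$1" name
  shift
  for name in "$@"; do
    case " $cmdline " in
      *" $name="*|*" $name "*) ;;     # found as name=value or as a bare flag
      *) printf '%s\n' "$name" ;;     # not found: report it
    esac
  done
}

# Names taken from the post-reboot command line in this report.
expected=(nohz nohz_full rcu_nocbs tuned.non_isolcpus intel_pstate nosoftlockup)

# Sample string for illustration; on a node, pass "$(cat /proc/cmdline)" instead.
missing_args 'ro quiet iommu=pt isolcpus=1-21 nohz=on nosoftlockup' "${expected[@]}"
```

An empty result means every expected name is present in the command line that was checked.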

Additional info:

--- Additional comment from Saravanan KR on 2017-09-05 13:24:38 IDT ---

/var/log/tuned/tuned.log (with manual reboot - 2 reboots, first on cloud-init, second is manual)
http://chunk.io/krsacme/7e7eeed6c1574f3488bbe04547a24994

The first reboot (cloud-init) produces the following log entry:
2017-09-05 05:14:48,657 INFO     tuned.daemon.daemon: terminating Tuned, rolling back all changes

When the reboot is triggered from the first-boot script (which is run by cloud-init), the command "systemctl is-system-running" returns "starting", which initiates the rollback of all changes.

Related change:
https://github.com/redhat-performance/tuned/commit/df9aa2f5c46e4db08a077081ca15b6da541b4514?diff=split#diff-ea63e0c5d4daa711fe01bc50e4db0145R151

Adding tuned team to comment on it.

--- Additional comment from Eyal Dannon on 2017-09-05 16:53:18 IDT ---

I've tested it on OSP10.
- The latest OSPd10 provides tuned 2.8, the same as OSPd11, and gives the same result
- OSPd10 z3 provides:
[root@compute-0 ~]# rpm -qa | grep tuned
tuned-2.7.1-3.el7_3.2.noarch
tuned-profiles-cpu-partitioning-2.7.1-5.el7fdp.noarch

Which gives this result:
[root@compute-0 ~]# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-3.10.0-514.21.1.el7.x86_64 root=UUID=fa9e939e-9e3c-4f1c-a07c-3f506756ad7b ro console=tty0 console=ttyS0,115200n8 crashkernel=auto rhgb quiet default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=1,2,3,4,5,16,17,18,19,20,21 nohz=on nohz_full=1,2,3,4,5,16,17,18,19,20,21 rcu_nocbs=1,2,3,4,5,16,17,18,19,20,21 intel_pstate=disable nosoftlockup

I suppose it's related to the tuned package.

Comment 3 Yariv 2017-09-05 18:53:51 UTC
Added the requires_doc_text? flag, as this is a known issue with a workaround.

Comment 4 Saravanan KR 2017-09-06 08:32:09 UTC
The issue is caused by patch [1] in tuned-2.8. Tuned tries to determine whether its daemon is going down because of a system reboot or because the user explicitly stopped the service. The former does not require a rollback, whereas the latter requires rolling back tuned's changes. The problem is that tuned's changes are rolled back during the reboot triggered from cloud-init (first-boot).

The system state is identified via the "systemctl is-system-running" command. The expected value in case of a reboot is "stopping", but when the reboot is called from the cloud-init (first-boot) scripts, the system state is still "starting" and does not change to "stopping" even on reboot. I am not sure whether this is expected systemd behavior or a bug in systemd [2].
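
The decision described above can be modelled in a few lines of shell. This is a simplified illustration of the behaviour as described in this comment, not tuned's actual code; the function name rollback_decision is hypothetical, and the state is passed in as a string rather than read from systemd so the sketch can run anywhere.

```shell
#!/bin/bash
# Simplified model (assumption): given a systemd state string, print "keep"
# when the daemon stop is attributed to a reboot or shutdown, and "rollback"
# when it is treated as an explicit service stop.
rollback_decision() {
  case "$1" in
    stopping) echo keep ;;       # system is going down: leave changes in place
    *)        echo rollback ;;   # any other state: undo tuned's changes
  esac
}

# On a live systemd host the state would come from:
#   state="$(systemctl is-system-running)"
for state in stopping running starting; do
  printf '%s -> %s\n' "$state" "$(rollback_decision "$state")"
done
```

The "starting" case is the one hit here: a reboot issued from cloud-init during first boot is still seen as "starting", so under this model the changes are rolled back even though the system is rebooting.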

As of now, I don't have a workaround for this.

[1] https://github.com/redhat-performance/tuned/commit/df9aa2f5c46e4db08a077081ca15b6da541b4514?diff=split#diff-ea63e0c5d4daa711fe01bc50e4db0145R151
[2] https://github.com/systemd/systemd/blob/d9ada1e4e122bbabd167589478e1e0735ca8b028/src/core/manager.c#L3448

Comment 5 Saravanan KR 2017-09-07 12:46:25 UTC
I have tried to find a workaround for this issue. It looks like the flow works fine if the reboot is done outside the cloud-init module (not sure why), and the tuned settings are retained.

              cat >/usr/lib/systemd/system/tripleo-reboot-epa.service<<EOF_CAT
            [Unit]
            Description=Reboot after system start-up
            [Service]
            Type=forking
            ExecStart=/root/test.sh
            EOF_CAT

              cat >/root/test.sh<<EOF_CAT
            #!/bin/bash
            set -x
            echo "EPA restart service, wait for system starting.."
            start() {
              set -x
              if [ ! -f /root/reboot_epa ]; then
                touch /root/reboot_epa;
                echo 'Restarting system';
                systemctl reboot; 
              fi
            }
            start&
            EOF_CAT
              chmod +x /root/test.sh
              systemctl daemon-reload
              systemctl start tripleo-reboot-epa


complete first-boot.yaml - http://chunk.io/krsacme/d4ff9b2a55f144799b4ed98681fbca81

I prefer this to be validated by QE before we conclude whether it could be used or not.

Comment 6 Ondřej Lysoněk 2017-09-08 14:37:32 UTC
This is indeed a regression in Tuned. The fix is available here:
https://github.com/redhat-performance/tuned/pull/66

(In reply to Saravanan KR from comment #5)
> I have tried to find a workaround to overcome this issue. It looks like if
> the reboot is done outside the cloud-init module, the flow works fine (not
> sure why).

I'm not familiar with cloud-init, so just a wild guess: is cloud-init run as a system service on bootup? If so, and systemd doesn't recognize the system as started up before cloud-init runs 'reboot', then 'systemctl is-system-running' will keep reporting the system state as 'starting', which will trigger the Tuned bug.

Comment 7 Saravanan KR 2017-09-11 07:57:50 UTC
(In reply to Ondřej Lysoněk from comment #6)
> This is indeed a regression in Tuned. The fix is available here:
> https://github.com/redhat-performance/tuned/pull/66
> 
> (In reply to Saravanan KR from comment #5)
> > I have tried to find a workaround to overcome this issue. It looks like if
> > the reboot is done outside the cloud-init module, the flow works fine (not
> > sure why).
> 
> I'm not familiar with cloud-init, so just a wild guess: is cloud-init run as
> a system service on bootup? If so, and systemd doesn't recognize it as
> started up before cloud-init run's 'reboot', then 'systemctl
> is-system-running' will keep reporting the system state as 'starting', which
> will trigger the Tuned bug.

Yes, this is the case, detailed in comment #4

Comment 8 atelang 2017-09-18 13:24:05 UTC
Marking as POST based on Ondřej's comment above: https://bugzilla.redhat.com/show_bug.cgi?id=1488517#c6

Comment 12 Yariv 2017-10-16 07:32:13 UTC
Waiting for Downstream Package

Comment 17 Mike Burns 2018-02-27 16:29:51 UTC
The tuned build with the fix is in 7.4.z now.  This can be tested with the latest puddles on OSP 10

Comment 20 Yariv 2018-06-21 22:32:00 UTC
After a fresh 10zAsync install:

Tuned looks good, no reboot needed 

BOOT_IMAGE=/boot/vmlinuz-3.10.0-862.3.3.el7.x86_64 root=UUID=68da77cc-ba27-41f2-ba68-fb165a0d503f ro console=tty0 console=ttyS0,115200n8 crashkernel=auto rhgb quiet default_hugepagesz=1GB hugepagesz=1G hugepages=32 iommu=pt intel_iommu=on isolcpus=1,2,3,4,5,6,7,9,10,17,18,19,20,21,22,23,11,12,13,14,15,25,26,27,28,29,30,31 skew_tick=1 nohz=on nohz_full=1,2,3,4,5,6,7,9,10,17,18,19,20,21,22,23,11,12,13,14,15,25,26,27,28,29,30,31 rcu_nocbs=1,2,3,4,5,6,7,9,10,17,18,19,20,21,22,23,11,12,13,14,15,25,26,27,28,29,30,31 tuned.non_isolcpus=01010101 intel_pstate=disable nosoftlockup

[heat-admin@compute-0 ~]$ sudo tuned-adm active
Current active profile: cpu-partitioning

In /var/log/tuned.log
2018-06-20 23:40:36,063 INFO     tuned.daemon.daemon: static tuning from profile 'cpu-partitioning' applied

Comment 21 Saravanan KR 2018-06-22 03:58:34 UTC
As confirmed in comment #20, no manual reboot is required.

