Bug 1571403

Summary: If grub.cfg is missing, then profile cleanup doesn't occur when switching tuned profiles.
Product: Red Hat Enterprise Linux 7 Reporter: Ryan Blakley <rblakley>
Component: tunedAssignee: Jaroslav Škarvada <jskarvad>
Status: CLOSED ERRATA QA Contact: Dominik Rehák <drehak>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.5CC: aaron_wilk, drehak, jeder, jskarvad, olysonek, tcerna
Target Milestone: rcKeywords: Patch, Upstream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: tuned-2.10.0-0.1.rc1.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-10-30 10:50:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Ryan Blakley 2018-04-24 17:17:44 UTC
Description of problem: If the grub.cfg file doesn't exist for some reason, then profile cleanup doesn't occur when switching profiles. Example if a custom profile has a [bootloader] and [script] section, when switching from the custom profile to say the virtual-guest profile. The script in the custom profile is executed for every profile switch after that.


Version-Release number of selected component (if applicable): tuned-2.9.0-1.el7


Steps to Reproduce:
1. # mkdir /etc/tuned/test

2. # cat > /etc/tuned/test/tuned.conf <<EOF
[main]
summary=Test profile

[bootloader]
cmdline="transparent_hugepages=never"

[script]
script=${i:PROFILE_DIR}/test.sh
EOF

3. # cat > /etc/tuned/test/test.sh <<EOF
#!/bin/bash

OPERATION=$1

if [ "$OPERATION" = "start" ]; then
	echo "start"
else
	echo "not start"
fi
EOF

4. # mv /boot/grub2/grub.cfg /boot/grub2/grub.cfg.bak
5. # tuned-adm profile test
6. # tuned-adm profile <virtual-guest/balanced>
7. # egrep "loading profile|applied|calling script" /var/log/tuned/tuned.log

Actual results:
2018-04-24 12:48:36,491 INFO     tuned.profiles.loader: loading profile: test
2018-04-24 12:48:36,512 INFO     tuned.plugins.plugin_script: calling script '/etc/tuned/test/cpu_tweaks.sh' with arguments '['start']'
2018-04-24 12:48:36,596 INFO     tuned.plugins.plugin_script: calling script '/etc/tuned/test/test.sh' with arguments '['start']'
2018-04-24 12:48:36,601 INFO     tuned.daemon.daemon: static tuning from profile 'test' applied
2018-04-24 12:48:51,619 INFO     tuned.plugins.plugin_script: calling script '/etc/tuned/test/test.sh' with arguments '['stop', 'full_rollback']'
2018-04-24 12:48:51,655 INFO     tuned.profiles.loader: loading profile: virtual-guest
2018-04-24 12:48:51,696 INFO     tuned.plugins.plugin_script: calling script '/etc/tuned/test/cpu_tweaks.sh' with arguments '['start']'
2018-04-24 12:48:51,786 INFO     tuned.plugins.plugin_script: calling script '/etc/tuned/test/test.sh' with arguments '['start']'
2018-04-24 12:48:51,854 INFO     tuned.daemon.daemon: static tuning from profile 'virtual-guest' applied


Expected results:
2018-04-24 13:02:17,080 INFO     tuned.profiles.loader: loading profile: test
2018-04-24 13:02:17,095 INFO     tuned.plugins.plugin_script: calling script '/etc/tuned/test/test.sh' with arguments '['start']'
2018-04-24 13:02:17,107 INFO     tuned.daemon.daemon: static tuning from profile 'test' applied
2018-04-24 13:02:29,126 INFO     tuned.plugins.plugin_script: calling script '/etc/tuned/test/test.sh' with arguments '['stop', 'full_rollback']'
2018-04-24 13:02:29,147 INFO     tuned.profiles.loader: loading profile: virtual-guest
2018-04-24 13:02:29,260 INFO     tuned.daemon.daemon: static tuning from profile 'virtual-guest' applied


Additional info:
From my testing I found that if the grub.cfg file is missing, it hits an error that isn't caught. That causes the cleanup scripts to not run, which flushes the array that holds any scripts from the config file. Since the array isn't flushed, the test.sh script will get ran over and over until the service is restarted. Below is the error traceback from the journal, and code snippet causing the issue.

1135]: Exception in thread Thread-7:
1135]: Traceback (most recent call last):
1135]: File "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
1135]: self.run()
1135]: File "/usr/lib64/python2.7/threading.py", line 765, in run
1135]: self.__target(*self.__args, **self.__kwargs)
1135]: File "/usr/lib/python2.7/site-packages/tuned/daemon/daemon.py", line 181, in _thread_code
1135]: self._unit_manager.stop_tuning(full_rollback)
1135]: File "/usr/lib/python2.7/site-packages/tuned/units/manager.py", line 113, in stop_tuning
1135]: instance.unapply_tuning(full_rollback)
1135]: File "/usr/lib/python2.7/site-packages/tuned/plugins/instance/instance.py", line 82, in unapply_t
1135]: self._plugin.instance_unapply_tuning(self, full_rollback)
1135]: File "/usr/lib/python2.7/site-packages/tuned/plugins/base.py", line 289, in instance_unapply_tuni
1135]: self._instance_unapply_static(instance, full_rollback)
1135]: File "/usr/lib/python2.7/site-packages/tuned/plugins/plugin_bootloader.py", line 104, in _instanc
1135]: self._remove_grub2_tuning()
1135]: File "/usr/lib/python2.7/site-packages/tuned/plugins/plugin_bootloader.py", line 96, in _remove_g
1135]: self._cmd.add_modify_option_in_file(self._grub2_cfg_file_name, {"set\s+" + consts.GRUB2_TUNED_VAR
1135]: File "/usr/lib/python2.7/site-packages/tuned/utils/commands.py", line 187, in add_modify_option_i
1135]: data = self.read_file(f)
1135]: File "/usr/lib/python2.7/site-packages/tuned/utils/commands.py", line 117, in read_file
1135]: f = open(f, "r")
1135]: TypeError: coercing to Unicode: need string or buffer, NoneType found

** Below is from /usr/lib/python2.7/site-packages/tuned/utils/commands.py

        def add_modify_option_in_file(self, f, d, add = True):
                data = self.read_file(f)--,
                ,-------------------------'
        def read_file(self, f, err_ret = "", no_error = False):
                old_value = err_ret
                try:
                        f = open(f, "r") <<<<----It errors out here.
                        old_value = f.read()
                        f.close()
                except (OSError,IOError) as e: <<<<----This except doesn't catch the error because it's a TypeError
                        if not no_error:
                                self._error("Error when reading file '%s': '%s'" % (f, e))
                self._debug("Read data from file: '%s' > '%s'" % (f, old_value))
                return old_value

If you fix the above by adding TypeError to the except, then it would just hit the same issue at the bottom of the add_modify_option_in_file function, as you can see below.

        def add_modify_option_in_file(self, f, d, add = True):
                data = self.read_file(f)
                for opt in d:
                        o = str(opt)
                        v = str(d[opt])
                        if re.search(r"\b" + o + r"\s*=.*$", data, flags = re.MULTILINE) is None:
                                if add:
                                        if len(data) > 0 and data[-1] != "\n":
                                                data += "\n"
                                        data += "%s=\"%s\"\n" % (o, v)
                        else:
                                data = re.sub(r"\b(" + o + r"\s*=).*$", r"\1" + "\"" + v + "\"", data, flags = re.MULTILINE)

                return self.write_to_file(f, data)----,
                  ,-----------------------------------'
        def write_to_file(self, f, data, makedir = False, no_error = False):
                self._debug("Writing to file: '%s' < '%s'" % (f, data))
                if makedir:
                        d = os.path.dirname(f)
                        if os.path.isdir(d):
                                makedir = False
                try:
                        if makedir:
                                os.makedirs(d)
                        fd = open(f, "w") <<<<<-----It would error here due to the TypeError
                        fd.write(str(data))
                        fd.close()
                        rc = True
                except (OSError,IOError) as e: <<<----It wouldn't catch here due to TypeError not being specified.
                        rc = False
                        if not no_error:
                                self._error("Writing to file '%s'" % (f))
                return rc

I fill there should be a check prior to calling the add_modify_option_in_file function to check if self._grub2_cfg_file_name actually contains a file name.

Comment 2 Ondřej Lysoněk 2018-04-25 06:46:24 UTC
Upstream report:
https://github.com/redhat-performance/tuned/issues/82

Pull request:
https://github.com/redhat-performance/tuned/pull/96

Comment 4 Aaron 2018-04-27 13:30:51 UTC
@Ondřej, reviewing the commit from PR 96 it seems the only change was to log an error and then return from the function _remove_grub2_tuning without doing anything to the grub2 file.

The problem described in this BZ is related to the optional script in the "[script]" from the profile we're moving away from getting called again with the "start" parameter after the grub2 failure. Tuned later prints a message in /var/log/tuned.log that the new profile we requested was successfully applied yet tuned re-applied the script from the previous profile.

Here is an condensed example of what's going on during the profile switch from a profile that had a script to one that doesn't (and grub2.cfg is missing). This was taken from logs included in https://access.redhat.com/support/cases/#/case/02018625

root@ventris-pa-arch-strtus02-45 ~$ tuned-adm profile balanced
root@ventris-pa-arch-strtus02-45 ~$ grep -Ei "applied|reapplying|plugin_bootloader|loading.profile|cpu_tweaks.sh|error" 
/var/log/tuned/tuned.log*
2018-04-17 16:47:53,120 INFO     tuned.plugins.plugin_script: calling script '/etc/tuned/vde-performance/cpu_tweaks.sh' with arguments '['stop', 'full_rollback']'
2018-04-17 16:47:53,164 INFO     tuned.plugins.plugin_bootloader: removing grub2 tuning previously added by Tuned
2018-04-17 16:47:53,172 INFO     tuned.profiles.loader: loading profile: balanced
2018-04-17 16:47:53,184 INFO     tuned.plugins.plugin_bootloader: installing additional boot command line parameters to grub2
2018-04-17 16:47:53,184 DEBUG    tuned.plugins.plugin_bootloader: patching grub.cfg
2018-04-17 16:47:53,184 ERROR    tuned.plugins.plugin_bootloader: cannot find grub.cfg to patch, you need to regenerate it by hand by grub2-mkconfig
2018-04-17 16:47:53,266 INFO     tuned.plugins.plugin_sysctl: reapplying system sysctl
2018-04-17 16:47:53,270 INFO     tuned.plugins.plugin_script: calling script '/etc/tuned/vde-performance/cpu_tweaks.sh' with arguments '['start']'
2018-04-17 16:47:53,353 ERROR    tuned.utils.commands: Error when reading file '/sys/class/drm/card0/device/power_method': '[Errno 2] No such file or directory: u'/sys/class/drm/card0/device/power_method''
2018-04-17 16:47:53,355 INFO     tuned.daemon.daemon: static tuning from profile 'balanced' applied


Can you confirm that PR 96 will prevent the script from the previous profile from executing again with a "start" parameter when grub2.cfg doesn't exist?

Comment 5 Aaron 2018-04-27 13:55:10 UTC
In addition to the above question regarding if PR 96 will fix the re-execution of the script with the start parameter...

I did some more looking because I noticed the added log message in PR 96 is already present in the output of the grep command in my previous post. I looked at the plugin_bootloader.py on git and I see it present on line 158:
https://github.com/olysonek/tuned/blob/dce9ad294149617f8dabd126c4a36e6015ca809e/tuned/plugins/plugin_bootloader.py#L158

Comment 6 Ondřej Lysoněk 2018-04-29 09:43:15 UTC
(In reply to Aaron from comment #4)
> Can you confirm that PR 96 will prevent the script from the previous profile
> from executing again with a "start" parameter when grub2.cfg doesn't exist?

Yes, it prevents that problem. The problem was that an exception was raised when the file grub2.cfg didn't exist, the exception was not handled, and as a result of that the profile was not properly cleaned up in memory. The patch prevents raising the exception in the first place.

Anyway, I created another patch to make the whole process more robust and avoid similar problems:
https://github.com/redhat-performance/tuned/pull/97

(In reply to Aaron from comment #5)
> I did some more looking because I noticed the added log message in PR 96 is
> already present in the output of the grep command in my previous post. I
> looked at the plugin_bootloader.py on git and I see it present on line 158:
> https://github.com/olysonek/tuned/blob/
> dce9ad294149617f8dabd126c4a36e6015ca809e/tuned/plugins/plugin_bootloader.
> py#L158

That is on a different code path. The code you're referring to is not reached when unapplying the tuning. We may want to refactor the code so that the check is done in one place, but this is not a priority at the moment.

Comment 10 errata-xmlrpc 2018-10-30 10:50:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3172