Bug 1592743

Summary: tuned fails due to "RuntimeError: Set changed size during iteration"
Product: Red Hat Enterprise Linux 7 Reporter: Renaud Métrich <rmetrich>
Component: tunedAssignee: Ondřej Lysoněk <olysonek>
Status: CLOSED ERRATA QA Contact: Robin Hack <rhack>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.5CC: fkrska, jeder, jskarvad, olysonek, rhack, tatsu, thozza
Target Milestone: rcKeywords: Patch, Upstream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: tuned-2.11.0-0.1.rc1.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1666680 (view as bug list) Environment:
Last Closed: 2019-08-06 13:04:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1655750, 1663257, 1630913, 1666680    
Attachments:
Description Flags
base.py hacked to reproduce using a hotplug "sdb" device (see "HACK TO REPRODUCE ISSUE") none

Description Renaud Métrich 2018-06-19 08:31:34 UTC
Created attachment 1452843 [details]
base.py hacked to reproduce using a hotplug "sdb" device (see "HACK TO REPRODUCE ISSUE")

Description of problem:

The plugins/base.py code iterates through devices to perform commands:

        def _execute_all_device_commands(self, instance, devices):
                for command in filter(lambda command: command["per_device"], self._commands.values()):
                        new_value = self._variables.expand(instance.options.get(command["name"], None))
                        if new_value is None:
                                continue
---->                   for device in devices:
                                self._execute_device_command(instance, command, device, new_value)

If for some reason the devices set is modified while being iterated, a "RuntimeError: Set changed size during iteration" is thrown and tuned dies. Typical use case is when a SCSI device is being discovered at boot.


Version-Release number of selected component (if applicable):

tuned-2.9.0-1.el7.noarch (RHEL7.5) but also seen on tuned-2.7.1-3.el7.noarch (RHEL7.3 EUS)


How reproducible:

Always, using the manual reproducer in attachment


Steps to Reproduce:
1. Set up a VM with 1 disk ("sda")
2. Install the hack as /usr/lib/python2.7/site-packages/tuned/plugins/base.py
3. Restart tuned and monitor the log

  # tail -f /var/log/tuned.log &
  # systemctl restart tuned

4. Attach a new disk ("sdb") to the VM


Actual results:

# tail -f /var/log/tuned/tuned.log 
2018-06-19 10:11:19,207 INFO     tuned.daemon.daemon: starting tuning
2018-06-19 10:11:19,211 INFO     tuned.plugins.base: instance cpu: assigning devices cpu0
2018-06-19 10:11:19,216 WARNING  tuned.plugins.plugin_cpu: your CPU doesn't support MSR_IA32_ENERGY_PERF_BIAS, ignoring CPU energy performance bias
2018-06-19 10:11:19,216 INFO     tuned.plugins.base: instance disk: assigning devices sda, dm-0, dm-1
2018-06-19 10:11:19,217 INFO     tuned.plugins.base: Applying command 'readahead' to device 'sda'
2018-06-19 10:11:19,218 INFO     tuned.plugins.base: Waiting for hotplug device 'sdb' to show up
2018-06-19 10:11:42,078 INFO     tuned.plugins.hotplug: device 'sdb' added
2018-06-19 10:11:42,078 INFO     tuned.plugins.hotplug: instance disk: adding new device sdb
2018-06-19 10:11:42,249 INFO     tuned.plugins.base: Device 'sdb' is now there, continuing loop

# journalctl -u tuned

systemd[1]: Starting Dynamic System Tuning Daemon...
tuned[1959]: Exception in thread Thread-1:
tuned[1959]: Traceback (most recent call last):
tuned[1959]: File "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
tuned[1959]: self.run()
tuned[1959]: File "/usr/lib64/python2.7/threading.py", line 765, in run
tuned[1959]: self.__target(*self.__args, **self.__kwargs)
tuned[1959]: File "/usr/lib/python2.7/site-packages/tuned/daemon/daemon.py", line 134, in _thread_code
tuned[1959]: self._unit_manager.start_tuning()
tuned[1959]: File "/usr/lib/python2.7/site-packages/tuned/units/manager.py", line 90, in start_tuning
tuned[1959]: instance.apply_tuning()
tuned[1959]: File "/usr/lib/python2.7/site-packages/tuned/plugins/instance/instance.py", line 73, in apply_tuning
tuned[1959]: self._plugin.instance_apply_tuning(self)
tuned[1959]: File "/usr/lib/python2.7/site-packages/tuned/plugins/base.py", line 250, in instance_apply_tuning
tuned[1959]: self._instance_apply_static(instance)
tuned[1959]: File "/usr/lib/python2.7/site-packages/tuned/plugins/base.py", line 298, in _instance_apply_static
tuned[1959]: self._execute_all_device_commands(instance, instance.devices)
tuned[1959]: File "/usr/lib/python2.7/site-packages/tuned/plugins/base.py", line 418, in _execute_all_device_commands
tuned[1959]: for device in devices:
tuned[1959]: RuntimeError: Set changed size during iteration
systemd[1]: tuned.service start operation timed out. Terminating.

Comment 2 Ondřej Lysoněk 2018-06-19 11:28:35 UTC
Yeah, Tuned doesn't really do a whole lot of locking when accessing shared resources. Thanks for the report.

Comment 5 Ondřej Lysoněk 2019-03-06 19:11:09 UTC
Upstream pull request:
https://github.com/redhat-performance/tuned/pull/167

Comment 15 errata-xmlrpc 2019-08-06 13:04:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2201