Bug 1516123
| Field | Value |
|---|---|
| Summary | tuned-adm timeout while adding the host in manager and the deployment will fail/take time to complete |
| Product | Red Hat Enterprise Virtualization Manager |
| Reporter | nijin ashok <nashok> |
| Component | redhat-release-rhev-hypervisor |
| Assignee | Yuval Turgeman <yturgema> |
| Status | CLOSED ERRATA |
| QA Contact | Petr Matyáš <pmatyas> |
| Severity | high |
| Docs Contact | |
| Priority | high |
| Version | 4.1.7 |
| CC | bgraveno, cshao, danken, dfediuck, didi, dougsland, fgarciad, lsurette, lsvaty, lveyde, mgoldboi, mkalinin, nashok, nsoffer, rbalakri, rbarry, Rhev-m-bugs, skudupud, srevivo, vanhoof, ycui, ykaul, yturgema |
| Target Milestone | ovirt-4.2.1 |
| Keywords | ZStream |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Linux |
| Whiteboard | |
| Fixed In Version | |
| Doc Type | Bug Fix |
| Doc Text | This update ensures that tuned.service is enabled by default to enable tuned-adm to set the active profile. |
| Story Points | --- |
| Clone Of | |
| Cloned As | 1553258 (view as bug list) |
| Environment | |
| Last Closed | 2018-05-15 17:57:44 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | Node |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | 1443142, 1523194 |
| Bug Blocks | 1523346, 1553258 |
Description (nijin ashok, 2017-11-22 06:04:56 UTC)
Comment 2 (Yaniv Kaul):

Why isn't this a tuned bug? Specifically, I think it may have to do with https://bugzilla.redhat.com/show_bug.cgi?id=1258868

(In reply to Yaniv Kaul from comment #2)
> Why isn't this a tuned bug?
> Specifically, I think it may have to do with
> https://bugzilla.redhat.com/show_bug.cgi?id=1258868

Somehow, I am unable to reproduce this on an RHEL server with the same set of packages; it only happens on RHV-H when I tried. That's the reason I opened it against RHV. I can certainly open one for tuned if that's needed.

Moving to the node team, since this seems to affect RHV-H only.

Reproduced this on a CentOS 7.4 system as well.

Comment 9 (Yedidyah Bar David):

Managed to reproduce this once, but not more than that. I had a CentOS 7.4 VM installed on Oct 25, rebooted it a few times, and didn't do much else on it. When I tried, it had been up for something like 9 days. When I started, tuned was up and had the (default?) virtual-guest profile. I did this:

    service tuned stop
    service tuned start
    tuned-adm profile virtual-host

It seemed stuck, but when I checked tuned.log it showed that it did accept the command and handled it, just like in the attached logs. I tried several times to reproduce, with and without strace on the tuned-adm command, and it was always quick, with no delay. Tried after a reboot: same. Tried rebooting the VM from a snapshot I took shortly after installing it: same. Perhaps it's a timing issue or something like that. Anyway, it was CentOS, not node/RHVH. If we have a clear and 100% reliable reproducer, we might be able to come up with some workaround, but there isn't a bug in host-deploy.
All it does is:

    self.services.state('tuned', True)
    rc, stdout, stderr = self.execute(
        (
            self.command.get('tuned-adm'),
            'profile',
            self._profile,
        ),
        raiseOnError=False,
    )
    if rc != 0:
        self.logger.warning(_('Cannot set tuned profile'))
    else:
        self.services.startup('tuned', True)

The last relevant change was to not fail if tuned-adm fails, about 5 years ago:

https://gerrit.ovirt.org/10444

So at some point someone decided it's not critical. A possible workaround is to call self.executePipeRaw instead of self.execute; it has a 'timeout' parameter, to which we could pass some value, say 30 seconds.

(In reply to Yedidyah Bar David from comment #9)
> If we have a clear and 100% reliable reproducer, we might be able to come up
> with some workaround - but there isn't a bug in host-deploy. All it does is:

Restarting dbus (if it's indeed the same issue I mentioned in comment 2) helps.

>     self.services.state('tuned', True)
>     rc, stdout, stderr = self.execute(
>         (
>             self.command.get('tuned-adm'),
>             'profile',
>             self._profile,
>         ),
>         raiseOnError=False,
>     )
>     if rc != 0:
>         self.logger.warning(_('Cannot set tuned profile'))
>     else:
>         self.services.startup('tuned', True)
>
> Last relevant change was to not fail if tuned-adm fails, ~ 5 years ago:
>
> https://gerrit.ovirt.org/10444
>
> So at some point someone decided it's not critical.
>
> A possible workaround is to call, instead of self.execute,
> self.executePipeRaw, which has a parameter 'timeout', and pass there some
> value, say 30 seconds.

Comment 11 (Yuval Turgeman):

I could only reproduce this with `tuned-adm off` before stopping tuned. IIUC, tuned-adm sends a message to dbus and waits for a "profile changed" response that never happens, I'm guessing because the daemon is still firing up, so dbus can't find it. I tried with --async, and it looks like the profile is set correctly (tuned-adm verify is ok).

(In reply to Yuval Turgeman from comment #11)
> I could only reproduce this with `tuned-adm off` before stopping tuned.
> IIUC tuned-adm sends a message to dbus and waits for a "profile changed"
> response that never happens I'm guessing because the daemon is firing up so
> dbus can't find it. I tried with --async and it looks like the profile is
> set correctly (tuned-adm verify is ok).

This is a tuned-adm bug; can we move this to tuned?

Comment 13 (Yuval Turgeman):

It's probably tuned or dbus, but we should definitely move this. The question is: why do we use tuned for the virtual-host profile if vdsm manages its own kernel params with /etc/sysctl.d/vdsm.conf? I mean, if we run `tuned-adm verify` after setting the virtual-host profile, it would fail, because vdsm.conf overrides some parameters. Not that I'm against tuned, but I think it's better to have one place (either tuned or sysctl.d) that sets those params.

(In reply to Yuval Turgeman from comment #13)
> It's probably tuned or dbus, but we should definitely move this. The
> question is, why do we use tuned for virtual-host profile if vdsm manages
> its own kernel params with /etc/sysctl.d/vdsm.conf? I mean, if we run
> `tuned-adm verify` after setting the virtual-host profile, it would fail,
> because vdsm.conf overrides some parameters. Not that I'm against tuned,
> but I think it's better to have one place (either tuned or sysctl.d) that
> sets those params.

I agree, and I think it should be tuned. Can you specify the different settings we have? I assume we have a good reason, which may not be applicable to others (OpenStack), for diverging from the virtual-host profile. In any case, can you move the bug to tuned?

Comment 15 (Yuval Turgeman):

Sure, only one difference:

    static/etc/sysctl.d/vdsm.conf:vm.dirty_background_ratio = 2
    usr/lib/tuned/virtual-host/tuned.conf:vm.dirty_background_ratio = 5

I opened bug 1523194 for tuned and added it here as "depends on". Do you want to move this bug to tuned as well?
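The single divergence quoted above, where /etc/sysctl.d/vdsm.conf overrides the tuned virtual-host profile, can be surfaced by diffing the two key/value sets. A minimal sketch, with a hypothetical `parse_sysctl` helper and the file contents inlined from the values quoted in this bug rather than read from disk:

```python
def parse_sysctl(text):
    """Parse 'key = value' lines as they appear in sysctl.d files
    and in the [sysctl] section of a tuned.conf profile."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(('#', '[')):
            continue  # skip blanks, comments, and section headers
        key, sep, value = line.partition('=')
        if sep:
            settings[key.strip()] = value.strip()
    return settings

# Values taken from comment 15 of this bug, not from the real files.
vdsm = parse_sysctl("vm.dirty_background_ratio = 2\n")
tuned = parse_sysctl("[sysctl]\nvm.dirty_background_ratio = 5\n")

conflicts = {key: (vdsm[key], tuned[key])
             for key in vdsm.keys() & tuned.keys()
             if vdsm[key] != tuned[key]}
print(conflicts)  # {'vm.dirty_background_ratio': ('2', '5')}
```

Running the same comparison against the real files would show exactly why `tuned-adm verify` fails after host deploy: vdsm.conf wins the last word on the overlapping key.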
(In reply to Yuval Turgeman from comment #15)
> Sure, only one difference:
>
> static/etc/sysctl.d/vdsm.conf:vm.dirty_background_ratio = 2
>
> usr/lib/tuned/virtual-host/tuned.conf:vm.dirty_background_ratio = 5

Nir, any idea why we have a different value in VDSM than in tuned for this parameter?

> I opened bug 1523194 for tuned, and added it here as "depends on", do you
> want to move this bug to tuned as well ?

We can probably close this bug, or use it as a 'create a dependency on tuned version XYZ' kind of bug.

(In reply to Yaniv Kaul from comment #16)
> Nir, any idea why we have a different value in VDSM than in tuned for this
> parameter?

These settings were added for bug 740887, suggested by the performance team for RHEL 6.x. I don't know whether they are still needed on RHEL 7, or whether they can be replaced by dynamic settings from tuned. Dan, what do you think?

Verified on redhat-release-virtualization-host-4.2-0.6.el7.x86_64.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1524
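The workaround proposed in comment 9, bounding the tuned-adm call with a timeout instead of letting it block host deployment, can be sketched with plain `subprocess`. The `run_with_timeout` helper below is purely illustrative; the real host-deploy code would use otopi's `executePipeRaw` and its 'timeout' parameter, which this snippet only approximates:

```python
import subprocess

def run_with_timeout(cmd, timeout=30):
    """Run a command but give up after `timeout` seconds, so a stuck
    tuned daemon cannot hang the caller indefinitely (sketch only)."""
    try:
        proc = subprocess.run(cmd, capture_output=True, timeout=timeout)
        return proc.returncode
    except subprocess.TimeoutExpired:
        # Mirror host-deploy's existing behaviour on failure: log a
        # 'Cannot set tuned profile' warning instead of failing the deploy.
        return None

# On a host this would be something like:
#   rc = run_with_timeout(['tuned-adm', 'profile', 'virtual-host'], timeout=30)
```

Note that this only caps the damage; the `--async` flag mentioned in comment 11 attacks the root cause by not waiting for the dbus "profile changed" response at all.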