Bug 1069119 - Stuck tuned service during host deployment
Summary: Stuck tuned service during host deployment
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: ovirt-host-deploy
Classification: oVirt
Component: Plugins.tune
Version: 1.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Alon Bar-Lev
QA Contact: Jiri Belka
URL:
Whiteboard: infra
Depends On: 1069245 1071453
Blocks: 1078981
Reported: 2014-02-24 09:18 UTC by Meital Bourvine
Modified: 2016-02-10 19:15 UTC
CC List: 13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1078981
Environment:
Last Closed: 2014-07-07 00:48:47 UTC
oVirt Team: Infra
Embargoed:


Attachments
logs (15.67 KB, application/x-gzip)
2014-02-24 09:34 UTC, Meital Bourvine

Description Meital Bourvine 2014-02-24 09:18:04 UTC
Description of problem:
The tuned service gets stuck during host deployment.

Version-Release number of selected component (if applicable):
ovirt-beta3

How reproducible:
80%

Steps to Reproduce:
Install rhevm and add a host.

Additional info (thanks to lbendar):

There is an error about "Existing lock /var/run/yum.pid"

You can track down the process that holds the lock by its PID. Follow the
instructions below:

{{{
ls -la /proc/$(cat /var/run/yum.pid)/fd | grep 'log$'
l-wx------. 1 root root 64 Feb 24 09:42 1 ->
/tmp/ovirt-host-deploy-20140220152021.log
}}}
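
The same lookup can be scripted. Below is a minimal Python sketch, assuming a Linux /proc layout. The function name `find_lock_holder_logs` and its parameters are made up for illustration and are not part of yum or ovirt-host-deploy:

```python
import os


def find_lock_holder_logs(pid_file="/var/run/yum.pid", proc_root="/proc"):
    """Return (pid, log_paths) for the process named in the yum lock file.

    Hypothetical helper: mirrors `ls -la /proc/$(cat /var/run/yum.pid)/fd | grep 'log$'`.
    """
    # The lock file contains the PID of the process holding the yum lock.
    with open(pid_file) as f:
        pid = int(f.read().strip())

    fd_dir = os.path.join(proc_root, str(pid), "fd")
    logs = []
    for name in sorted(os.listdir(fd_dir)):
        try:
            # Each entry in fd/ is a symlink to the file the descriptor refers to.
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor may have been closed between listdir and readlink.
            continue
        if target.endswith("log"):
            logs.append(target)
    return pid, logs
```

Calling `find_lock_holder_logs()` as root on the affected host should return the PID from the lock file together with any open files whose names end in `log`.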

As you can see, the lock is held by the rhevm bootstrap,
so we can look at its log to see what happened:

{{{
[root@puma29 ~]# tail /tmp/ovirt-host-deploy-20140220152021.log

2014-02-20 15:21:14 DEBUG otopi.plugins.otopi.services.rhel
plugin.executeRaw:366 execute: ('/sbin/service', 'tuned', 'start'),
executable='None', cwd='None', env=None
2014-02-20 15:21:15 DEBUG otopi.plugins.otopi.services.rhel
plugin.executeRaw:383 execute-result: ('/sbin/service', 'tuned',
'start'), rc=0
2014-02-20 15:21:15 DEBUG otopi.plugins.otopi.services.rhel
plugin.execute:441 execute-output: ('/sbin/service', 'tuned', 'start')
stdout:


2014-02-20 15:21:15 DEBUG otopi.plugins.otopi.services.rhel
plugin.execute:446 execute-output: ('/sbin/service', 'tuned', 'start')
stderr:


2014-02-20 15:21:15 DEBUG otopi.plugins.ovirt_host_deploy.tune.tuned
plugin.executeRaw:366 execute: ('/usr/bin/tuned-adm', 'profile',
'virtual-host'), executable='None', cwd='None', env=None
}}}

And here we are again: the trail leads to the tuned service.
Let's take a look at /var/log/tuned/tuned.log.

I cannot find anything interesting there:
{{{
....
2014-02-20 15:10:35,110 INFO     tuned: performing ktune conditional restart
....
}}}

Comment 1 Meital Bourvine 2014-02-24 09:34:39 UTC
Created attachment 866906
logs

Comment 2 Sandro Bonazzola 2014-02-24 09:41:20 UTC
Jaroslav, can you take a look? Maybe it's a tuned bug.

Comment 3 Jaroslav Škarvada 2014-02-24 10:37:54 UTC
(In reply to Sandro Bonazzola from comment #2)
> Jaroslav, can you take a look? Maybe it's a tuned bug.

I think this may be a race. RHEL-6 tuned is not race-free, and that cannot be fixed correctly without a redesign (which happened in RHEL-7).

Could you try adding a delay of, e.g., 5 seconds between '/sbin/service tuned start' and '/usr/bin/tuned-adm profile virtual-host'? If it helps, I can try to work around this specific problem.

Comment 4 Alon Bar-Lev 2014-02-24 10:55:33 UTC
(In reply to Jaroslav Škarvada from comment #3)
> (In reply to Sandro Bonazzola from comment #2)
> > Jaroslav, can you take a look? Maybe it's a tuned bug.
> 
> I think this may be race. RHEL-6 tuned is not race free and it's something
> that cannot be correctly fixed without re-design (which happened in RHEL-7).
> 
> Could you try to add e.g. 5 seconds delay between '/sbin/service tuned
> start' and '/usr/bin/tuned-adm profile virtual-host'? In case it helps, I
> can try to workaround this specific problem.

Whoever can reproduce it, please modify /usr/share/ovirt-host-deploy/plugins/ovirt-host-deploy/tune/tuned.py as follows:

    def _misc(self):
        # tuned-adm does not work if daemon is down!
        self.services.state('tuned', True)
+       import time
+       time.sleep(5)
        rc, stdout, stderr = self.execute(
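
For anyone who wants to try the ordering outside of ovirt-host-deploy, here is a standalone sketch of the same workaround: start tuned, wait, then select the profile. The function `apply_tuned_profile` and its `run`/`sleep` parameters are hypothetical helpers, not ovirt-host-deploy or otopi API. The two commands and the 5-second delay are the ones mentioned in this bug.

```python
import subprocess
import time


def apply_tuned_profile(profile="virtual-host", delay=5.0,
                        run=subprocess.check_call, sleep=time.sleep):
    """Start tuned, wait `delay` seconds, then switch to `profile`.

    Hypothetical helper; `run` and `sleep` are injectable so the ordering
    can be checked without touching the real services.
    """
    # tuned-adm does not work if the daemon is down, so start it first.
    run(["/sbin/service", "tuned", "start"])
    # Give the daemon time to finish starting before talking to it
    # (the suspected race on RHEL-6).
    sleep(delay)
    run(["/usr/bin/tuned-adm", "profile", profile])
```

Running `apply_tuned_profile()` as root performs the same three steps as the patched `_misc` above. A fixed delay only makes the race less likely; it does not remove it.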

Comment 5 Ilia Meerovich 2014-02-24 13:27:36 UTC
This bug blocks testing of ovirt 3.4

Comment 6 Alon Bar-Lev 2014-02-24 13:31:12 UTC
(In reply to Ilia Meerovich from comment #5)
> This bug blocks testing of ovirt 3.4

You are missing an important fact: this tuned version, which was probably distributed in rhel-6.5, will affect production versions since 3.2.

Blocking ovirt-3.4 tests is the least of our worries.

Comment 8 Meital Bourvine 2014-02-24 14:22:42 UTC
This workaround seems to work.
I ran it 3 times.

Comment 9 Sandro Bonazzola 2014-02-24 14:51:18 UTC
Removing from blockers since it's a tuned regression.
Meital, please open a BZ on tuned component.
While tuned waits to be fixed, please downgrade to the previous version.
Alon, it's up to you whether to close this as notabug or wontfix, or to add a conflict in the spec file on this specific tuned version, forcing a downgrade or an upgrade, just to have a working tuned installed.
Just let me know if you rebuild host-deploy before 09:00 UTC tomorrow, Feb 25th 2014.

Comment 10 Meital Bourvine 2014-02-24 14:57:47 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1069245

Comment 11 Alon Bar-Lev 2014-02-24 15:06:57 UTC
(In reply to Meital Bourvine from comment #10)
> https://bugzilla.redhat.com/show_bug.cgi?id=1069245

Per what Sandro wrote, this does not block ovirt-engine-3.4 from being released, as the problem is specific to rhel and also affects previous releases that are already out.

Action items for downstream are different.

Please open a bug against rhel tuned to track this issue; that bug should block this one.

Comment 12 Sandro Bonazzola 2014-02-25 06:53:43 UTC
Removing from blockers again as per comment #9.
Added tuned bug #1069245 to this bug's dependencies.

Comment 13 Sandro Bonazzola 2014-02-27 09:42:38 UTC
Removing AutomationBlocker, TestBlocker since downgrading tuned allows tests to be performed.

Comment 14 Sandro Bonazzola 2014-03-04 09:29:56 UTC
This is an automated message.
Re-targeting all non-blocker bugs still open on 3.4.0 to 3.4.1.

Comment 15 Sandro Bonazzola 2014-06-11 07:04:38 UTC
This is an automated message:
oVirt 3.4.2 has been released.
This bug has been re-targeted from 3.4.2 to 3.4.3 since priority or severity were high or urgent.

Comment 18 Alon Bar-Lev 2014-07-07 00:48:47 UTC
will be fixed in rhel-6.5, centos-6.5.

