Bug 1075013 - 'service ovirt-engine start' should return only after it is started
Summary: 'service ovirt-engine start' should return only after it is started
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Services
Version: 3.5.4
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Yedidyah Bar David
QA Contact: Petr Matyáš
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-03-11 11:09 UTC by Yedidyah Bar David
Modified: 2022-02-25 08:14 UTC (History)
CC List: 7 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-03-03 12:56:46 UTC
oVirt Team: Integration
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1294025 | 0 | unspecified | CLOSED | Spice vm console fails because servlet pki-resource is unavailable | 2021-02-22 00:41:40 UTC
Red Hat Issue Tracker | RHV-44896 | 0 | None | None | None | 2022-02-25 08:14:51 UTC
oVirt gerrit | 77266 | 0 | master | MERGED | packaging: Modified SysV service script for ovirt-engine | 2017-06-01 15:59:16 UTC

Internal Links: 1294025

Description Yedidyah Bar David 2014-03-11 11:09:31 UTC
Description of problem:

Currently, when 'service ovirt-engine start' is followed by a loop of 'service ovirt-engine status', the status check returns '0' (started) only after some time (up to a few seconds, or more on a busy system).

This:
1. Is sometimes annoying - you manually start the engine, then have to wait some time until it's really started
2. Might cause problems in scripts that do not expect that
3. Requires various workarounds in existing code that wait till the engine is up - e.g. allinone plugin, dwh setup in 3.3.

Instead, the command should return only after the service is started, working, ready to serve requests.

Comment 1 Doron Fediuck 2014-03-12 04:16:40 UTC
Didi how would you define really started?
What happens if the DB is down or a port is taken? This leads to a stuck
service. So blocking the daemon is not considered as the right way to go,
since it may slow down the boot process.

Comment 2 Yedidyah Bar David 2014-03-12 07:16:48 UTC
(In reply to Doron Fediuck from comment #1)
> Didi how would you define really started?

I defined in the description "working, ready to serve requests". 'service ovirt-engine status' already checks something - perhaps that's enough. Perhaps better to also check the health page (if not done already by 'status').

> What happens if the DB is down or a port is taken?

Then we simply fail. I didn't say we must succeed always, just return when we know what our status is.

I don't know the relevant code well, so I'm not sure the service knows very early whether it started successfully. In principle it can take a long time - e.g. suppose that it only tries to connect to the db on the first request that actually needs db access. So we might need to add a loop of attempts with some maximum. Not sure.
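
To make the "loop of attempts with some maximum" idea concrete, here is a minimal Python sketch. It is not taken from the engine's code; the names `wait_until_ready`, `is_ready`, `has_failed` and `StartResult` are hypothetical, and the two checks are passed in as callables because this thread does not settle what exactly should be probed. The point is only that the loop ends in one of three known states instead of blocking forever.

```python
import enum
import time


class StartResult(enum.Enum):
    READY = "ready"          # the readiness check passed
    FAILED = "failed"        # startup is known to have failed
    TIMED_OUT = "timed_out"  # no verdict within the attempt budget


def wait_until_ready(is_ready, has_failed, max_attempts=30, delay=1.0,
                     sleep=time.sleep):
    """Poll until the service is ready, has failed, or attempts run out.

    is_ready and has_failed are caller-supplied callables returning bool
    (both hypothetical; what they check is up to the caller).
    """
    for attempt in range(max_attempts):
        # Check for a definite failure first, so a dead service is
        # reported as FAILED rather than waited on.
        if has_failed():
            return StartResult.FAILED
        if is_ready():
            return StartResult.READY
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(delay)
    return StartResult.TIMED_OUT
```

For example, `is_ready` could be `lambda: subprocess.call(["service", "ovirt-engine", "status"]) == 0`, and `has_failed` could check whether the engine process has already exited. A start wrapper would then return success only on `READY` and a non-zero exit code otherwise.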

> This leads to a stuck
> service. So blocking the daemon is not considered as the right way to go,
> since it may slow down the boot process.

Modern init systems start services in parallel, so I do not expect a significant impact here. Obviously this will have to be tested.

Comment 3 Sandro Bonazzola 2014-03-12 15:54:05 UTC
(In reply to Yedidyah Bar David from comment #2)
> (In reply to Doron Fediuck from comment #1)
> > Didi how would you define really started?
> 
> I defined in the description "working, ready to serve requests". 'service
> ovirt-engine status' already checks something - perhaps that's enough.
> Perhaps better to also check the health page (if not done already by
> 'status').


If not, please avoid using the health page; it's deprecated. Check API availability instead.
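
A rough Python sketch of what "check API availability" could look like follows. The function name `api_is_available` is hypothetical and the URL is left to the caller, since the exact API endpoint is not given in this thread. Treating 401/403 as "available" is also an assumption: it presumes an unauthenticated probe against an API that requires credentials, where such a reply shows the application answered.

```python
import urllib.error
import urllib.request


def api_is_available(url, timeout=5.0):
    """Return True if an HTTP GET to url suggests the API is serving."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.HTTPError as err:
        # The server replied with an error status. Assumption: 401/403
        # mean the API is up and merely wants credentials.
        return err.code in (401, 403)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, TLS error, timeout, etc.
        return False
```

With an HTTPS URL and a self-signed engine certificate, `urlopen` fails certificate verification, so this sketch would report the API as unavailable; a real probe would need an appropriate SSL context.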

> 
> > What happens if the DB is down or a port is taken?
> 
> Then we simply fail. I didn't say we must succeed always, just return when
> we know what our status is.
> 
> I don't know the relevant code well, so I'm not sure the service knows very
> early whether it started successfully. In principle it can take a long time
> - e.g. suppose that it only tries to connect to the db on the first request
> that actually needs db access. So we might need to add a loop of attempts
> with some maximum. Not sure.
> 
> > This leads to a stuck
> > service. So blocking the daemon is not considered as the right way to go,
> > since it may slow down the boot process.
> 
> Modern init systems start services in parallel, so I do not expect a
> significant impact here. Obviously this will have to be tested.

Comment 4 Sandro Bonazzola 2015-09-04 08:59:21 UTC
This is an automated message.
This Bugzilla report has been opened on a version which is not maintained anymore.
Please check if this bug is still relevant in oVirt 3.5.4.
If it's not relevant anymore, please close it (you may use EOL or CURRENT RELEASE resolution)
If it's an RFE please update the version to 4.0 if still relevant.

Comment 5 Yedidyah Bar David 2015-09-16 07:09:18 UTC
Still relevant IMO. Moving to 4.0 as it's considered low priority/severity - workarounds seem to mostly work...

Comment 6 Sandro Bonazzola 2015-09-16 07:17:43 UTC
Moving to new classification.

Comment 7 Red Hat Bugzilla Rules Engine 2015-10-19 10:59:13 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 8 Yaniv Kaul 2017-10-15 08:20:55 UTC
Can it be moved to MODIFIED or are there additional patches needed?

Comment 9 Lev Veyde 2017-10-15 10:07:45 UTC
(In reply to Yaniv Kaul from comment #8)
> Can it be moved to MODIFIED or are there additional patches needed?

No, we are still checking whether we can fix it for systemd.

Comment 10 Yedidyah Bar David 2019-03-03 12:56:46 UTC
Closing this. No-one other than me seems interested, and it's a bit non-trivial to get this right - and searching the net for stuff like 'systemctl start returns immediately' seems to show that this behavior is quite common.

Also, the linked patch is for the SysV init script, which is no longer used: the engine is now supported only on el7 and Fedora, meaning systemd. Porting to other init systems should be trivial, so if anyone wants to reopen this bug, it would still be nice to keep the engine from becoming less compatible with other init systems.
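
For anyone picking this up under systemd, one general mechanism is a unit with `Type=notify`: systemd then treats the service as started only once the process sends `READY=1` to the socket named in the `NOTIFY_SOCKET` environment variable, so a plain `systemctl start` waits for that message. Nothing in this bug says the engine does this; the Python sketch below only illustrates the protocol, and `notify_ready` is a hypothetical name.

```python
import os
import socket


def notify_ready(env=None):
    """Send READY=1 to systemd's notification socket, if one is set.

    Returns False when NOTIFY_SOCKET is absent (for example, when the
    process was not started by systemd), True after sending.
    """
    env = os.environ if env is None else env
    address = env.get("NOTIFY_SOCKET")
    if not address:
        return False
    # A leading '@' denotes a Linux abstract-namespace socket, which is
    # addressed with a leading NUL byte instead.
    if address.startswith("@"):
        address = "\0" + address[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(b"READY=1", address)
    return True
```

The service would call this only after its own readiness check has passed, which is the hard part discussed above.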

