Bug 998366 - Power service provider - Possible race condition in code
Power service provider - Possible race condition in code
Status: CLOSED CANTFIX
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: openlmi-providers (Show other bugs)
7.0
Unspecified Unspecified
low Severity low
: rc
: ---
Assigned To: Radek Novacek
qe-baseos-daemons
:
Depends On:
Blocks: 922084
  Show dependency treegraph
 
Reported: 2013-08-19 03:24 EDT by Robin Hack
Modified: 2016-11-30 19:31 EST (History)
5 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-09-05 10:49:38 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Robin Hack 2013-08-19 03:24:12 EDT
Description of problem:
Possible race condition - unclean shutdown.

Step to reproduce:

In file power.c in function state_change_thread (on line 126)
173:             succeeded =  system("reboot --force &") == 0;
Initializes system reboot.

From line 210:
    MUTEX_LOCK(powerStateChangeJob->power);
    powerStateChangeJob->power->transitioningToPowerState = LMI_AssociatedPowerManagementService_TransitioningToPowerState_No_Change;

to line 230
is code, which should not be reached if restart is faster (SSD disks?).

Version-Release number of selected component (if applicable):
Last git commit:
commit 6532d453d6d25b816c3e0c08de3d3cea46dce543

How reproducible:
always

Actual results:
Small window for race condition.

Expected results:
Maybe.. no window for race condition? :)
Comment 2 Robin Hack 2013-08-20 04:38:21 EDT
Maybe you should use:
http://www.freedesktop.org/wiki/Software/systemd/inhibit/(In reply to Robin Hack from comment #0)
> Description of problem:
> Possible race condition - unclean shutdown.
> 
> Step to reproduce:
> 
> In file power.c in function state_change_thread (on line 126)
> 173:             succeeded =  system("reboot --force &") == 0;
> Initializes system reboot.
> 
> From line 210:
>     MUTEX_LOCK(powerStateChangeJob->power);
>     powerStateChangeJob->power->transitioningToPowerState =
> LMI_AssociatedPowerManagementService_TransitioningToPowerState_No_Change;
> 
> to line 230
> is code, which should not be reached if restart is faster (SSD disks?).
> 
> Version-Release number of selected component (if applicable):
> Last git commit:
> commit 6532d453d6d25b816c3e0c08de3d3cea46dce543
> 
> How reproducible:
> always
> 
> Actual results:
> Small window for race condition.
> 
> Expected results:
> Maybe.. no window for race condition? :)

Maybe you should use for systemd:
http://www.freedesktop.org/wiki/Software/systemd/inhibit/
Comment 3 Jan Safranek 2013-11-01 08:21:20 EDT
Maybe I miss point here, if system reboots, I don't see any possibility that the provider reaches line 230 and the machine is already restarted. The machine must turn off itself first, i.e. clearing all memory and loading fresh kernel, CIMOM and the provider.
Comment 4 Robin Hack 2013-11-01 08:33:35 EDT
1) 
system("reboot --force &") == 0; is nonsense and need to find solution.
Developers should solve this in time.

2)
This is theoretical possibility of race condition.
It's not urgent (maybe wont fix).
Comment 5 Robin Hack 2013-11-01 08:38:06 EDT
3) What about suspend? What if machine run to sleep before provider sends ACK to client?
Comment 6 Jan Safranek 2013-11-01 08:48:23 EDT
1) agreed

2) reboot is graceful, i.e. systemd stops the CIMOM and the CIMOM stops the provider, finishing all operations which are in progress.

3) sleep is not graceful, i.e. the machine could really shut down itself before sending a response, I am not sure if there is a sane way out of this problem.
Comment 7 Radek Novacek 2014-01-20 07:27:59 EST
I'm not sure what would be the proper solution of this bug. The provider should first reply to the client (via CIMOM) and then execute the operation. But I don't see any possibility for the provider to know when CIMOM finish processing the response.

In general, there are couple of cases that should be OK:
* graceful reboot
* graceful poweroff
Problematic cases are these:
* force reboot
* force poweroff
* sleep/hibernation

We can add some sleep to the working thread before executing the action, but is just workaround that will minimize the issue.

I'll move this to next release since it's not that urgent.
Comment 9 Radek Novacek 2014-09-05 10:49:38 EDT
I would say that not getting a reply when one want to execute 'reboot --force' is correct behaviour. If you want to know if the shutdown/reboot was successfully executed, don't use the '--force'.

Marking this as wontfix, it's not possible to fix this bug properly.

Note You need to log in before you can comment on or make changes to this bug.