998366 – Power service provider - Possible race condition in code



Summary: Power service provider - Possible race condition in code

Keywords:
Status:	CLOSED CANTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	openlmi-providers
Sub Component:
Version:	7.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	low
Severity:	low
Target Milestone:	rc
Target Release:	---
Assignee:	Radek Novacek
QA Contact:	qe-baseos-daemons
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	922084

Reported:	2013-08-19 07:24 UTC by Robin Hack
Modified:	2016-12-01 00:31 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2014-09-05 14:49:38 UTC
Target Upstream Version:
Embargoed:


Description Robin Hack 2013-08-19 07:24:12 UTC

Description of problem:
Possible race condition - unclean shutdown.

Steps to reproduce:

In file power.c in function state_change_thread (on line 126)
173:             succeeded =  system("reboot --force &") == 0;
This initiates the system reboot.

From line 210:
    MUTEX_LOCK(powerStateChangeJob->power);
    powerStateChangeJob->power->transitioningToPowerState = LMI_AssociatedPowerManagementService_TransitioningToPowerState_No_Change;

to line 230
there is code that may never be reached if the restart is fast enough (SSD disks?).

Version-Release number of selected component (if applicable):
Last git commit:
commit 6532d453d6d25b816c3e0c08de3d3cea46dce543

How reproducible:
always

Actual results:
Small window for race condition.

Expected results:
Maybe.. no window for race condition? :)

Comment 2 Robin Hack 2013-08-20 08:38:21 UTC

(In reply to Robin Hack from comment #0)

For systemd, maybe you should use inhibitor locks:
http://www.freedesktop.org/wiki/Software/systemd/inhibit/

Comment 3 Jan Safranek 2013-11-01 12:21:20 UTC

Maybe I'm missing the point here, but if the system reboots, I don't see how the provider could reach line 230 after the machine has already restarted. The machine must turn itself off first, i.e. clear all memory and load a fresh kernel, CIMOM and provider.

Comment 4 Robin Hack 2013-11-01 12:33:35 UTC

1)
system("reboot --force &") == 0; is nonsense and a proper solution needs to be found.
The developers should solve this in time.

2)
This is a theoretical possibility of a race condition.
It's not urgent (maybe WONTFIX).
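
One likely reason the check in point 1 is of little use, which the comment does not spell out: with a trailing "&" the shell starts the command in the background and exits with status 0 itself, so the result of system() says nothing about whether the command worked. A minimal sketch follows; run_status and looks_successful are hypothetical helper names, not code from power.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: run cmd through system() and return its exit
 * status, or -1 if the shell could not be started or the command did
 * not exit normally. */
int run_status(const char *cmd)
{
    assert(cmd != NULL);
    int rc = system(cmd);
    if (rc == -1 || !WIFEXITED(rc))
        return -1;
    return WEXITSTATUS(rc);
}

/* Mirrors the shape of the check quoted in the report:
 * "did system() return 0?" */
int looks_successful(const char *cmd)
{
    return run_status(cmd) == 0;
}
```

With a POSIX shell, looks_successful("false") is 0, but looks_successful("false &") is 1, because the exit status of a background (asynchronous) command list is always zero.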

Comment 5 Robin Hack 2013-11-01 12:38:06 UTC

3) What about suspend? What if the machine goes to sleep before the provider sends an ACK to the client?

Comment 6 Jan Safranek 2013-11-01 12:48:23 UTC

1) agreed

2) reboot is graceful, i.e. systemd stops the CIMOM and the CIMOM stops the provider, finishing all operations which are in progress.

3) sleep is not graceful, i.e. the machine could really shut itself down before sending a response. I am not sure there is a sane way out of this problem.

Comment 7 Radek Novacek 2014-01-20 12:27:59 UTC

I'm not sure what the proper solution to this bug would be. The provider should first reply to the client (via the CIMOM) and then execute the operation. But I don't see any way for the provider to know when the CIMOM has finished processing the response.

In general, there are a couple of cases that should be OK:
* graceful reboot
* graceful poweroff
Problematic cases are these:
* force reboot
* force poweroff
* sleep/hibernation

We could add a short sleep to the worker thread before executing the action, but that is just a workaround that would narrow the window, not close it.
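
The sleep workaround could look roughly like the sketch below. It is only an illustration under assumptions: struct delayed_job, start_delayed_job and mark_done are hypothetical names rather than the provider's real code, and the action callback stands in for the actual reboot or suspend call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical job description; not the provider's real structure. */
struct delayed_job {
    unsigned delay_ms;        /* grace period before acting */
    void (*action)(void *);   /* stand-in for the reboot/suspend call */
    void *arg;
};

/* Stand-in action for demonstration: sets an int flag. */
void mark_done(void *arg)
{
    *(int *)arg = 1;
}

static void *delayed_job_thread(void *p)
{
    struct delayed_job *job = p;
    struct timespec ts = {
        .tv_sec = job->delay_ms / 1000,
        .tv_nsec = (long)(job->delay_ms % 1000) * 1000000L,
    };
    /* Give the CIMOM a chance to deliver the response first.
     * Resume the sleep if a signal interrupts it. */
    while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
        ;
    job->action(job->arg);
    return NULL;
}

/* Start the job on a worker thread and return immediately so the
 * caller can send its reply. Returns 0 on success, else an error
 * number from pthread_create(). */
int start_delayed_job(struct delayed_job *job, pthread_t *tid)
{
    assert(job != NULL && job->action != NULL && tid != NULL);
    return pthread_create(tid, NULL, delayed_job_thread, job);
}
```

The caller would start the job, return the response to the CIMOM, and let the worker act after the grace period. Nothing here guarantees the response has actually left the machine by then, which is why it only narrows the window.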

I'll move this to the next release since it's not that urgent.

Comment 9 Radek Novacek 2014-09-05 14:49:38 UTC

I would say that not getting a reply when one wants to execute 'reboot --force' is correct behaviour. If you want to know whether the shutdown/reboot was successfully executed, don't use '--force'.

Marking this as wontfix; it's not possible to fix this bug properly.
