| Summary: | The showed status of the resource does not equal the target-role | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Raoul Scarazzini <rscarazz> |
| Component: | pacemaker | Assignee: | Ken Gaillot <kgaillot> |
| Status: | CLOSED NOTABUG | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | low | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.2 | CC: | abeekhof, cluster-maint, michele, rscarazz |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-02-02 14:39:34 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Raoul Scarazzini
2015-12-23 17:24:42 UTC
Highly unlikely that we would have started it. More likely, we found it running (i.e., it hadn't been stopped before the cleanup was run).

It does look like nova-compute was somehow started outside cluster control.

I see target-role set to Stopped multiple times (04:17:54, 05:32:09, 05:36:27, 05:49:28, and 06:45:25). After the last one, the cluster correctly ensures nova-compute is stopped on all compute nodes.

It appears that cleanup was run at 11:04:44. The cluster initiates probes (one-time monitor operations) on all 4 compute nodes, finds nova-compute running on overcloud-novacompute-1 and overcloud-novacompute-2, and stops them.

This implies that the service was started outside cluster control sometime between 06:45:25 and 11:04:44. (Pacemaker won't run the regular recurring monitor while the service is stopped, although it is possible to configure a recurring monitor for target-role=Stopped exactly for the purpose of catching cases like this sooner.)

If you want me to investigate further on the pacemaker side, let me know, but I think the issue is likely elsewhere.

Ok, further investigation of this issue also turned up additional problems with the resource agent, so I think this bug can be closed, also because that lab is no longer available.
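For anyone retracing the timeline reasoning in the analysis above, here is a minimal Python sketch of how the unmonitored window can be derived from the quoted timestamps. It assumes all times fall on the same day (the comments give no dates), and the function name `unmonitored_window` is hypothetical, not part of Pacemaker or any cluster tool.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S"  # assumption: same-day timestamps, as quoted in the comments


def unmonitored_window(stop_times, cleanup_time):
    """Return (start, end, duration) of the gap between the last
    target-role=Stopped change and the cleanup that triggered probes.

    Hypothetical helper for illustration only.
    """
    cleanup = datetime.strptime(cleanup_time, FMT)
    stops = [datetime.strptime(t, FMT) for t in stop_times]
    # Only changes made before the cleanup can open the window.
    earlier = [t for t in stops if t <= cleanup]
    if not earlier:
        raise ValueError("no target-role=Stopped change precedes the cleanup")
    last_stop = max(earlier)
    return last_stop.strftime(FMT), cleanup.strftime(FMT), cleanup - last_stop


# Timestamps taken from the analysis above.
stop_times = ["04:17:54", "05:32:09", "05:36:27", "05:49:28", "06:45:25"]
start, end, duration = unmonitored_window(stop_times, "11:04:44")
print(f"service was started outside cluster control between {start} and {end} ({duration})")
```

With no recurring monitor running while the resource is stopped, the cluster had no way to notice the service during that whole interval, which is why it only surfaced when the cleanup triggered probes.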