Bug 1697466
| Summary: | Provide guidance in order to get proper healthchecks for "cron" containers | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Cédric Jeanneret <cjeanner> |
| Component: | openstack-tripleo-common | Assignee: | Cédric Jeanneret <cjeanner> |
| Status: | CLOSED ERRATA | QA Contact: | Sasha Smolyak <ssmolyak> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 15.0 (Stein) | CC: | aschultz, emacchi, jcoufal, mburns, sbaker, slinaber, ssmolyak |
| Target Milestone: | beta | Keywords: | FutureFeature, Triaged |
| Target Release: | 15.0 (Stein) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-tripleo-common-10.7.1-0.20190423125010.2199eeb.el8ost | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-09-21 11:21:11 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Cédric Jeanneret
2019-04-08 13:17:05 UTC
oh great, BZ doing its stuff (drop all content when we change the component, how nice)..

So. This is a "research paper" in order to find the best way to get healthchecks for "cron" containers. We have to take into account that:
- the job is probably not in root's crontab
- it's probably not a "crontab" at all; some jobs have a dedicated file in the /etc/cron.* directories

We have to push ideas in here and test/validate them. So, currently, the state is:
[root@undercloud ~]# podman exec logrotate_crond crontab -l
# HEADER: This file was autogenerated at 2019-04-10 07:47:33 +0000 by puppet.
# HEADER: While it can still be managed manually, it is definitely not recommended.
# HEADER: Note particularly that the comments starting with 'Puppet Name' should
# HEADER: not be deleted, as doing so could cause duplicate cron jobs.
# Puppet Name: logrotate-crond
PATH=/bin:/usr/bin:/usr/sbin
SHELL=/bin/sh
0 * * * * sleep `expr ${RANDOM} \% 90`; /usr/sbin/logrotate -s /var/lib/logrotate/logrotate-crond.status /etc/logrotate-crond.conf 2>&1|logger -t logrotate-crond
[root@undercloud ~]# podman exec cinder_api_cron crontab -l
no crontab for root
exit status 1
[root@undercloud ~]# podman exec nova_api_cron crontab -l
no crontab for root
exit status 1
[root@undercloud ~]# podman exec keystone_cron crontab -l
no crontab for root
exit status 1
This means:
- we can't rely on "crontab -l", especially since Puppet adds a lot of header noise, so simply counting lines isn't reliable
- the container user doesn't necessarily own the job
We might want to use something like:
[root@undercloud ~]# podman exec cinder_api_cron ls /var/spool/cron
cinder
[root@undercloud ~]# podman exec nova_api_cron ls /var/spool/cron
nova
[root@undercloud ~]# podman exec keystone_cron ls /var/spool/cron
keystone
For instance:
[root@undercloud ~]# podman exec keystone_cron cat /var/spool/cron/keystone
# HEADER: This file was autogenerated at 2019-04-10 07:46:44 +0000 by puppet.
# HEADER: While it can still be managed manually, it is definitely not recommended.
# HEADER: Note particularly that the comments starting with 'Puppet Name' should
# HEADER: not be deleted, as doing so could cause duplicate cron jobs.
# Puppet Name: keystone-manage token_flush
PATH=/bin:/usr/bin:/usr/sbin
SHELL=/bin/sh
1 * * * * keystone-manage token_flush >>/var/log/keystone/keystone-tokenflush.log 2>&1
So a 2-step validation might be possible:
step 1: list the crontab files in /var/spool/cron
step 2: ensure we have something either in "root" or in container_name.split('_')[0] (which yields keystone, cinder, nova)
For the second step, we can use something like "grep -cEv '^#' <file>" to get the number of uncommented lines, which should be >=2 so far.
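The two steps above could be sketched roughly as follows. This is only an illustration of the idea, not the eventual tripleo-common script: the `cron_healthcheck` function name is made up here, and it assumes the container name maps to a spool file named after its first underscore-separated token, with "root" as a fallback, exactly as described above.

```shell
#!/bin/sh
# Hypothetical sketch of the proposed 2-step cron healthcheck.
# cron_healthcheck CONTAINER_NAME [SPOOL_DIR] -> exit 0 if a populated
# crontab exists for the derived user or for root, 1 otherwise.
cron_healthcheck() {
    name="$1"
    spool="${2:-/var/spool/cron}"
    # Step 2's owner guess: container_name.split('_')[0],
    # e.g. "keystone_cron" -> "keystone".
    owner="$(printf '%s' "$name" | cut -d_ -f1)"
    for user in "$owner" root; do
        if [ -f "$spool/$user" ]; then
            # Count non-comment, non-empty lines so the Puppet HEADER
            # noise is ignored; >=2 (environment settings plus at least
            # one job entry) counts as healthy.
            if [ "$(grep -cEv '^(#|$)' "$spool/$user")" -ge 2 ]; then
                return 0
            fi
        fi
    done
    return 1
}
```

Checking non-comment lines rather than the raw line count sidesteps the Puppet header problem noted earlier, and falling back to "root" covers containers whose job does live in root's crontab.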
So instead of guidance, the healthchecks are actually already added to the relevant containers. There are two patches: one in tripleo-common, creating the script, and one in tripleo-heat-templates, activating the healthcheck. All the needed healthchecks, including those for cron jobs, are present.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811