Bug 1697466 - Provide guidance in order to get proper healthchecks for "cron" containers
Summary: Provide guidance in order to get proper healthchecks for "cron" containers
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: beta
Target Release: 15.0 (Stein)
Assignee: Cédric Jeanneret
QA Contact: Sasha Smolyak
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-04-08 13:17 UTC by Cédric Jeanneret
Modified: 2019-09-26 10:49 UTC (History)

Fixed In Version: openstack-tripleo-common-10.7.1-0.20190423125010.2199eeb.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-21 11:21:11 UTC
Target Upstream Version:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 651456 | 0 | None | MERGED | New health check for cron containers | 2019-11-19 07:31:27 UTC
OpenStack gerrit | 651460 | 0 | None | ABANDONED | Add health check directory and script to the container via kolla | 2019-11-19 07:31:27 UTC
OpenStack gerrit | 651777 | 0 | None | MERGED | Activate health checks for cron containers | 2019-11-19 07:31:27 UTC
Red Hat Product Errata | RHEA-2019:2811 | 0 | None | None | None | 2019-09-21 11:21:35 UTC

Description Cédric Jeanneret 2019-04-08 13:17:05 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Cédric Jeanneret 2019-04-08 13:26:20 UTC
Oh great, BZ doing its thing: it drops all the content when we change the component. How nice.

So.

This bug is a "research paper" aimed at finding the best way to get healthchecks for "cron" containers.

We have to take into account that:
- the job is probably not in root's crontab
- it is probably not even a "crontab": some jobs have a dedicated file in the /etc/cron.* directories

We should collect ideas here and test/validate them.

Comment 2 Cédric Jeanneret 2019-04-10 08:13:44 UTC
So, currently, the state is:
[root@undercloud ~]# podman exec logrotate_crond crontab -l
# HEADER: This file was autogenerated at 2019-04-10 07:47:33 +0000 by puppet.
# HEADER: While it can still be managed manually, it is definitely not recommended.
# HEADER: Note particularly that the comments starting with 'Puppet Name' should
# HEADER: not be deleted, as doing so could cause duplicate cron jobs.
# Puppet Name: logrotate-crond
PATH=/bin:/usr/bin:/usr/sbin SHELL=/bin/sh
0 * * * * sleep `expr ${RANDOM} \% 90`; /usr/sbin/logrotate -s /var/lib/logrotate/logrotate-crond.status /etc/logrotate-crond.conf 2>&1|logger -t logrotate-crond
[root@undercloud ~]# podman exec cinder_api_cron crontab -l
no crontab for root
exit status 1
[root@undercloud ~]# podman exec nova_api_cron crontab -l
no crontab for root
exit status 1
[root@undercloud ~]# podman exec keystone_cron crontab -l
no crontab for root
exit status 1

This means:
- we can't rely on "crontab -l": it only shows the crontab of the calling user, and since puppet adds a lot of header comments, counting its lines isn't reliable
- the container user doesn't necessarily own the job

We might want to use something like:
[root@undercloud ~]# podman exec cinder_api_cron ls /var/spool/cron
cinder
[root@undercloud ~]# podman exec nova_api_cron ls /var/spool/cron
nova
[root@undercloud ~]# podman exec keystone_cron ls /var/spool/cron
keystone

For instance:
[root@undercloud ~]# podman exec keystone_cron cat /var/spool/cron/keystone
# HEADER: This file was autogenerated at 2019-04-10 07:46:44 +0000 by puppet.
# HEADER: While it can still be managed manually, it is definitely not recommended.
# HEADER: Note particularly that the comments starting with 'Puppet Name' should
# HEADER: not be deleted, as doing so could cause duplicate cron jobs.
# Puppet Name: keystone-manage token_flush
PATH=/bin:/usr/bin:/usr/sbin SHELL=/bin/sh
1 * * * * keystone-manage token_flush >>/var/log/keystone/keystone-tokenflush.log 2>&1

So a 2-step validation might be possible:
step 1: list all the crontabs in /var/spool/cron
step 2: ensure we have something either in "root" or in container_name.split('_')[0] (which returns keystone, cinder, nova)
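
The two steps above could be sketched roughly as follows. This is only an illustration, not the script that eventually landed: the function names candidate_owners and matching_crontabs are made up for this example, and the spool listing is passed in as a plain list (for instance the output of "ls /var/spool/cron") rather than being read from a container.

```python
def candidate_owners(container_name):
    """Return the crontab owners worth checking for a container.

    "root" is always a candidate; the second one is the service prefix,
    e.g. "keystone" for "keystone_cron".
    """
    service = container_name.split("_")[0]
    return ["root", service]


def matching_crontabs(spool_entries, container_name):
    """Keep only the spool files owned by one of the candidate users."""
    owners = candidate_owners(container_name)
    return [entry for entry in spool_entries if entry in owners]


if __name__ == "__main__":
    # Listing as returned by "ls /var/spool/cron" in the keystone_cron container.
    print(matching_crontabs(["keystone"], "keystone_cron"))
```

An empty result would mean that neither root nor the service user has a crontab in the spool directory, which is the case the healthcheck should flag.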

For the second step, we can use something like "grep -cEv '^#' <file>" to count the uncommented lines, which should be >= 2 so far.
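
For reference, here is a small sketch of what that grep count does, applied to a shortened copy of the keystone crontab shown above. The names count_active_lines and looks_healthy, as well as the "minimum" parameter, are hypothetical and only mirror the ">= 2" idea; like grep -v '^#', the count also includes blank lines.

```python
def count_active_lines(crontab_text):
    """Count lines that do not start with '#', like grep -cEv '^#'."""
    return sum(1 for line in crontab_text.splitlines() if not line.startswith("#"))


def looks_healthy(crontab_text, minimum=2):
    """Hypothetical threshold check: at least `minimum` uncommented lines."""
    return count_active_lines(crontab_text) >= minimum


# Shortened copy of the puppet-generated keystone crontab from this comment.
SAMPLE = """\
# HEADER: This file was autogenerated at 2019-04-10 07:46:44 +0000 by puppet.
# Puppet Name: keystone-manage token_flush
PATH=/bin:/usr/bin:/usr/sbin SHELL=/bin/sh
1 * * * * keystone-manage token_flush >>/var/log/keystone/keystone-tokenflush.log 2>&1
"""

if __name__ == "__main__":
    print(count_active_lines(SAMPLE), looks_healthy(SAMPLE))
```

The puppet header lines are ignored, so only the environment line and the job itself are counted.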

Comment 3 Cédric Jeanneret 2019-05-20 10:49:31 UTC
So instead of guidance, the health checks have actually already been added to the relevant containers. There are two patches: one in tripleo-common, creating the script, and one in tripleo-heat-templates, activating the healthcheck.

Comment 7 Sasha Smolyak 2019-07-07 08:53:08 UTC
All the needed healthchecks, including those for cron jobs, are present.

Comment 9 errata-xmlrpc 2019-09-21 11:21:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811

