Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1697466

Summary: Provide guidance in order to get proper healthchecks for "cron" containers
Product: Red Hat OpenStack Reporter: Cédric Jeanneret <cjeanner>
Component: openstack-tripleo-common Assignee: Cédric Jeanneret <cjeanner>
Status: CLOSED ERRATA QA Contact: Sasha Smolyak <ssmolyak>
Severity: medium Docs Contact:
Priority: medium    
Version: 15.0 (Stein) CC: aschultz, emacchi, jcoufal, mburns, sbaker, slinaber, ssmolyak
Target Milestone: beta Keywords: FutureFeature, Triaged
Target Release: 15.0 (Stein)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-tripleo-common-10.7.1-0.20190423125010.2199eeb.el8ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-09-21 11:21:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Cédric Jeanneret 2019-04-08 13:17:05 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Cédric Jeanneret 2019-04-08 13:26:20 UTC
oh great, BZ doing its stuff (drop all content when we change the component, how nice)..

So.

This is a "research paper" to find the best way to get proper healthchecks for "cron" containers.

We have to take into account that:
- the job is probably not in root's crontab
- it might not be a "crontab" at all; some services have a dedicated file in the /etc/cron.* directories

We have to push ideas in here and test/validate them.

Comment 2 Cédric Jeanneret 2019-04-10 08:13:44 UTC
So, currently, the state is:
[root@undercloud ~]# podman exec logrotate_crond crontab -l
# HEADER: This file was autogenerated at 2019-04-10 07:47:33 +0000 by puppet.
# HEADER: While it can still be managed manually, it is definitely not recommended.
# HEADER: Note particularly that the comments starting with 'Puppet Name' should
# HEADER: not be deleted, as doing so could cause duplicate cron jobs.
# Puppet Name: logrotate-crond
PATH=/bin:/usr/bin:/usr/sbin
SHELL=/bin/sh
0 * * * * sleep `expr ${RANDOM} \% 90`; /usr/sbin/logrotate -s /var/lib/logrotate/logrotate-crond.status /etc/logrotate-crond.conf 2>&1|logger -t logrotate-crond
[root@undercloud ~]# podman exec cinder_api_cron crontab -l
no crontab for root
exit status 1
[root@undercloud ~]# podman exec nova_api_cron crontab -l
no crontab for root
exit status 1
[root@undercloud ~]# podman exec keystone_cron crontab -l
no crontab for root
exit status 1

This means:
- we can't rely on "crontab -l", especially since Puppet adds a lot of header garbage, so counting lines isn't reliable
- the container user doesn't necessarily own the job

We might want to use something like:
[root@undercloud ~]# podman exec cinder_api_cron ls /var/spool/cron
cinder
[root@undercloud ~]# podman exec nova_api_cron ls /var/spool/cron
nova
[root@undercloud ~]# podman exec keystone_cron ls /var/spool/cron
keystone

For instance:
[root@undercloud ~]# podman exec keystone_cron cat /var/spool/cron/keystone
# HEADER: This file was autogenerated at 2019-04-10 07:46:44 +0000 by puppet.
# HEADER: While it can still be managed manually, it is definitely not recommended.
# HEADER: Note particularly that the comments starting with 'Puppet Name' should
# HEADER: not be deleted, as doing so could cause duplicate cron jobs.
# Puppet Name: keystone-manage token_flush
PATH=/bin:/usr/bin:/usr/sbin
SHELL=/bin/sh
1 * * * * keystone-manage token_flush >>/var/log/keystone/keystone-tokenflush.log 2>&1

So a 2-step validation might be possible:
step 1: list all the crontabs in /var/spool/cron
step 2: ensure we either have something in "root", or something in container_name.split('_')[0] (which returns keystone, cinder, nova)

For the second step, we can use something like "grep -cEv '^#' <file>" to get the number of non-comment lines, which should be >= 2 so far (the environment line plus at least one job entry).
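The two steps above could be sketched as a small shell function. This is a hypothetical illustration of the proposed logic, not the script that was later merged into tripleo-common; the function name and output messages are assumptions:

```shell
# Hypothetical healthcheck sketch for "cron" containers.
# Step 1: look for a crontab owned by root, or by the service user
# derived from the container name (keystone_cron -> keystone).
# Step 2: require at least 2 non-comment lines (env + one job).
check_cron_healthcheck() {
  local container="$1"   # e.g. keystone_cron
  local spool="$2"       # e.g. /var/spool/cron
  local user="${container%%_*}"
  local tab=""
  if [ -f "${spool}/root" ]; then
    tab="${spool}/root"
  elif [ -f "${spool}/${user}" ]; then
    tab="${spool}/${user}"
  else
    echo "KO: no crontab found for root or ${user}"
    return 1
  fi
  # Count lines that are neither comments nor empty
  local active
  active=$(grep -cEv '^(#|$)' "$tab")
  if [ "$active" -ge 2 ]; then
    echo "OK: ${active} active lines in ${tab}"
    return 0
  fi
  echo "KO: only ${active} active lines in ${tab}"
  return 1
}
```

Inside a container, the healthcheck would then be something like `check_cron_healthcheck keystone_cron /var/spool/cron`, exiting non-zero when the crontab is missing or effectively empty.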

Comment 3 Cédric Jeanneret 2019-05-20 10:49:31 UTC
So instead of mere guidance, the healthchecks have actually already been added to the relevant containers. There are two patches: one in tripleo-common, creating the script, and one in tripleo-heat-templates, activating the healthcheck.

Comment 7 Sasha Smolyak 2019-07-07 08:53:08 UTC
All the needed healthchecks, including those for cron jobs, are present.

Comment 9 errata-xmlrpc 2019-09-21 11:21:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811