Bug 1503833

Summary: ceilometer-expirer cron job does not run correctly
Product: Red Hat OpenStack
Reporter: nalmond
Component: instack-undercloud
Assignee: Pradeep Kilambi <pkilambi>
Status: CLOSED ERRATA
QA Contact: Sasha Smolyak <ssmolyak>
Severity: medium
Priority: low
Version: 8.0 (Liberty)
CC: augol, mburns, pkilambi, rhel-osp-director-maint, srevivo
Target Milestone: Upstream M2
Keywords: Reopened, Triaged
Target Release: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: instack-undercloud-8.3.1-0.20180304032746.fc5704f.el7ost
Last Closed: 2018-06-27 13:37:56 UTC
Type: Bug
Bug Blocks: 1598927, 1598931

Description nalmond 2017-10-18 21:08:04 UTC
Description of problem:
The ceilometer-expirer cron job that director configures (see https://access.redhat.com/solutions/2497621) fails to execute.

In /var/spool/mail/ceilometer, the following message is observed several times:
~~~
From ceilometer  Wed Oct 18 00:01:01 2017
Return-Path: <ceilometer>
X-Original-To: ceilometer
Delivered-To: ceilometer
Received: by overcloud-controller-0.localdomain (Postfix, from userid 166)
	id 63C172070DF2; Wed, 18 Oct 2017 00:01:01 +0000 (UTC)
From: "(Cron Daemon)" <ceilometer>
To: ceilometer
Subject: Cron <ceilometer@overcloud-controller-0> sleep $(($(od -A n -t d -N 3 /dev/urandom) 
Content-Type: text/plain; charset=UTF-8
Auto-Submitted: auto-generated
Precedence: bulk
X-Cron-Env: <XDG_SESSION_ID=1611>
X-Cron-Env: <XDG_RUNTIME_DIR=/run/user/166>
X-Cron-Env: <LANG=en_US.UTF-8>
X-Cron-Env: <PATH=/bin:/usr/bin:/usr/sbin SHELL=/bin/sh>
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/var/lib/ceilometer>
X-Cron-Env: <LOGNAME=ceilometer>
X-Cron-Env: <USER=ceilometer>
Message-Id: <20171018000101.63C172070DF2>
Date: Wed, 18 Oct 2017 00:01:01 +0000 (UTC)

/bin/sh: -c: line 0: unexpected EOF while looking for matching `)'
/bin/sh: -c: line 1: syntax error: unexpected end of file
~~~

The ceilometer user's cron contains:
1 0 * * * sleep $(($(od -A n -t d -N 3 /dev/urandom) % 86400)) && ceilometer-expirer

This seems to work if the % is escaped in the cron:
1 0 * * * sleep $(($(od -A n -t d -N 3 /dev/urandom) \% 86400)) && ceilometer-expirer
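For context, crontab(5) documents that an unescaped % in the command field ends the command: everything after the first % is passed to the command on standard input, with later % characters turned into newlines. The sketch below is a simplified model of that rule, not cron's actual code; the function name split_cron_command is made up for illustration, and dropping the backslash from \% is an assumption about the cron implementation in use. It shows that the unescaped entry is cut off at exactly the point where the Subject line in the mail above ends.

```python
from typing import Tuple


def split_cron_command(field: str) -> Tuple[str, str]:
    """Split a crontab command field into (shell command, stdin text).

    Simplified model of the rule in crontab(5): the first unescaped %
    ends the command, later unescaped % become newlines in the stdin
    text, and a backslash-escaped % is kept as a literal % (assumption:
    the backslash itself is dropped).
    """
    command = []
    stdin = []
    target = command
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field) and field[i + 1] == "%":
            target.append("%")  # escaped percent sign stays literal
            i += 2
            continue
        if ch == "%":
            if target is command:
                target = stdin  # first unescaped %: the command ends here
            else:
                target.append("\n")  # later unescaped %: newline in stdin
            i += 1
            continue
        target.append(ch)
        i += 1
    return "".join(command), "".join(stdin)


# The two crontab command fields quoted in this report.
BROKEN = "sleep $(($(od -A n -t d -N 3 /dev/urandom) % 86400)) && ceilometer-expirer"
FIXED = "sleep $(($(od -A n -t d -N 3 /dev/urandom) \\% 86400)) && ceilometer-expirer"

broken_cmd, broken_stdin = split_cron_command(BROKEN)
fixed_cmd, fixed_stdin = split_cron_command(FIXED)

print(repr(broken_cmd))  # what /bin/sh -c receives for the unescaped entry
print(repr(fixed_cmd))   # what /bin/sh -c receives for the escaped entry
```

Under this model the unescaped entry hands /bin/sh a command with an unclosed `$((` and `$(`, which is consistent with the "unexpected EOF while looking for matching `)'" error reported above.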

Version-Release number of selected component (if applicable):
openstack-puppet-modules-9.3.0-1.el7ost.noarch
Also observed in OSP 7, 10, and 11 (I don't have a recently deployed OSP 9 environment to check).

How reproducible:
Every time. I see this on several lab boxes and in a customer environment.

Steps to Reproduce:
1. Install undercloud
2. Deploy overcloud with director

Actual results:
The cron job does not run, and an error is seen in /var/spool/mail/ceilometer.

Expected results:
The cron job runs and does not produce an error.

Additional info:
This is observed on both the undercloud and the overcloud. The cron entry is set in /usr/share/instack-undercloud/puppet-stack-config/puppet-stack-config.pp and /usr/share/openstack-puppet/modules/tripleo/manifests/profile/base/ceilometer/expirer.pp

Comment 2 Julien Danjou 2017-10-19 14:00:25 UTC
I'm not sure Bugzilla is the best place to report a problem with a KCS article, but anyway, I've fixed the KCS and added the missing \ in front of %.

Comment 3 nalmond 2017-10-19 14:19:03 UTC
Reopening: this bug was opened because the puppet modules create the cron job incorrectly. My understanding is that the KCS was written to verify the value set by the puppet modules, but the KCS is not the target of this bug.

The issue lies within these files:
/usr/share/instack-undercloud/puppet-stack-config/puppet-stack-config.pp
and
/usr/share/openstack-puppet/modules/tripleo/manifests/profile/base/ceilometer/expirer.pp

Upon rereading this bug, I realized it was poorly worded. My apologies for any confusion. Let me know if I can clarify this further.

Comment 10 Sasha Smolyak 2018-03-06 12:41:38 UTC
In /var/spool/mail/ceilometer on the undercloud, we now see the following error:

From ceilometer.local  Tue Mar  6 00:32:46 2018
Return-Path: <ceilometer.local>
X-Original-To: ceilometer
Delivered-To: ceilometer.local
Received: by undercloud-0.redhat.local (Postfix, from userid 166)
	id 574CDC957E5; Tue,  6 Mar 2018 00:32:46 -0500 (EST)
From: "(Cron Daemon)" <ceilometer.local>
To: ceilometer.local
Subject: Cron <ceilometer@undercloud-0> sleep $(($(od -A n -t d -N 3 /dev/urandom) % 86400)) && ceilometer-expirer
Content-Type: text/plain; charset=UTF-8
Auto-Submitted: auto-generated
Precedence: bulk
X-Cron-Env: <XDG_SESSION_ID=346>
X-Cron-Env: <XDG_RUNTIME_DIR=/run/user/166>
X-Cron-Env: <LANG=en_US.UTF-8>
X-Cron-Env: <PATH=/bin:/usr/bin:/usr/sbin SHELL=/bin/sh>
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/var/lib/ceilometer>
X-Cron-Env: <LOGNAME=ceilometer>
X-Cron-Env: <USER=ceilometer>
Message-Id: <20180306053246.574CDC957E5.local>
Date: Tue,  6 Mar 2018 00:32:46 -0500 (EST)

/bin/sh: ceilometer-expirer: command not found
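One way to narrow down a "command not found" from cron is to check whether the command resolves on the PATH that cron actually used. The sketch below copies the PATH value verbatim from the X-Cron-Env header in the mail above; as printed, that value ends in " SHELL=/bin/sh", so the last search directory is not plain /usr/sbin. The helper name resolve_on_cron_path is hypothetical, and the sketch only reports whether the command resolves on that path; it does not establish why the command is missing.

```python
import shutil
from typing import Optional

# PATH value copied verbatim from the X-Cron-Env header in the mail above.
# Note that, as printed, the last entry is "/usr/sbin SHELL=/bin/sh".
CRON_PATH = "/bin:/usr/bin:/usr/sbin SHELL=/bin/sh"


def resolve_on_cron_path(command: str, path: str = CRON_PATH) -> Optional[str]:
    """Return the full path of `command` if found on `path`, else None."""
    return shutil.which(command, path=path)


if __name__ == "__main__":
    location = resolve_on_cron_path("ceilometer-expirer")
    print(location or "ceilometer-expirer: not found on the cron PATH")
```

Run on the affected host, this prints either the resolved path or a not-found message, which helps separate a PATH problem from a missing executable.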

Comment 16 errata-xmlrpc 2018-06-27 13:37:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086