Bug 1383802 - osp-director-10: After Upgrading osp9 to osp10 'openstack-ceilometer-api service' won't start after reboot. (unrecognized arguments: --logfile /var/log/ceilometer/api.log)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ceilometer
Version: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 10.0 (Newton)
Assignee: Pradeep Kilambi
QA Contact: Yurii Prokulevych
Reported: 2016-10-11 19:36 UTC by Omri Hochman
Modified: 2018-03-05 12:59 UTC (History)

Fixed In Version: openstack-ceilometer-7.0.0-2.1.el7ost
Last Closed: 2016-12-14 16:16:04 UTC




Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2016:2948 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 10 enhancement update | 2016-12-14 19:55:27 UTC
RDO | 3239 | None | None | None | 2016-10-12 08:06:19 UTC
RDO | 3256 | None | None | None | 2016-10-14 17:51:08 UTC

Description Omri Hochman 2016-10-11 19:36:07 UTC
osp-director-10: After upgrading OSP9 to OSP10, the 'openstack-ceilometer-api' service won't start after reboot. [unrecognized arguments: --logfile /var/log/ceilometer/api.log]


Environment: 
-------------
instack-undercloud-5.0.0-0.20160930175750.9d2a655.el7ost.noarch
instack-5.0.0-1.el7ost.noarch
openstack-tripleo-heat-templates-compat-2.0.0-34.3.el7ost.noarch
python-heatclient-1.5.0-1.el7ost.noarch
openstack-heat-common-7.0.0-0.20160926200847.dd707bc.el7ost.noarch
python-heat-tests-7.0.0-0.20160926200847.dd707bc.el7ost.noarch
openstack-tripleo-heat-templates-5.0.0-0.20161003064637.d636e3a.1.1.el7ost.noarch
python-heat-agent-0.0.1-0.20160920204709.f123aa1.el7ost.noarch
openstack-heat-templates-0.0.1-0.20160920204709.f123aa1.el7ost.noarch
openstack-heat-api-cfn-7.0.0-0.20160926200847.dd707bc.el7ost.noarch
puppet-heat-9.4.0-1.1.el7ost.noarch
openstack-heat-engine-7.0.0-0.20160926200847.dd707bc.el7ost.noarch
heat-cfntools-1.3.0-2.el7ost.noarch
openstack-heat-api-7.0.0-0.20160926200847.dd707bc.el7ost.noarch
openstack-ceilometer-central-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
python-ceilometer-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
openstack-ceilometer-api-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
openstack-ceilometer-common-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
puppet-ceilometer-9.4.0-1.el7ost.noarch
python-ceilometer-tests-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
openstack-ceilometer-collector-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
openstack-ceilometer-notification-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
python-ceilometerclient-2.6.1-1.el7ost.noarch
openstack-ceilometer-polling-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch


Steps:
--------
(1) Deploy OSP9 with 3 controllers and 1 compute node.
(2) Upgrade OSP9 to OSP10 following the guide:
https://gitlab.cee.redhat.com/sathlang/ospd-9-to-10-upgrade
(3) Reboot the undercloud and the overcloud.
(4) Check the openstack-ceilometer-api service on the undercloud.
(5) Attempt to start the service manually.

Results :
----------
[root@undercloud72 ~]# systemctl status openstack-ceilometer-api.service
● openstack-ceilometer-api.service - OpenStack ceilometer API service
   Loaded: loaded (/usr/lib/systemd/system/openstack-ceilometer-api.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Tue 2016-10-11 15:30:37 EDT; 2s ago
  Process: 23493 ExecStart=/usr/bin/ceilometer-api --logfile /var/log/ceilometer/api.log (code=exited, status=2)
 Main PID: 23493 (code=exited, status=2)

Oct 11 15:30:37 undercloud72.localdomain systemd[1]: openstack-ceilometer-api.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: Unit openstack-ceilometer-api.service entered failed state.
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: openstack-ceilometer-api.service failed.
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: openstack-ceilometer-api.service holdoff time over, scheduling restart.
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: start request repeated too quickly for openstack-ceilometer-api.service
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: Failed to start OpenStack ceilometer API service.
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: Unit openstack-ceilometer-api.service entered failed state.
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: openstack-ceilometer-api.service failed.


/var/log/messages :
---------------------
Oct 11 15:13:42 undercloud72 systemd: openstack-ceilometer-api.service holdoff time over, scheduling restart.
Oct 11 15:13:42 undercloud72 systemd: Started OpenStack ceilometer API service.
Oct 11 15:13:42 undercloud72 systemd: Starting OpenStack ceilometer API service...
Oct 11 15:13:43 undercloud72 ironic-inspector: 2016-10-11 15:13:43.027 728 DEBUG futurist.periodics [-] Submitting periodic function 'ironic_inspector.main.periodic_clean_up' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:614
Oct 11 15:13:43 undercloud72 ceilometer-api: usage: ceilometer-api [-h] [--port PORT]
Oct 11 15:13:43 undercloud72 ceilometer-api: ceilometer-api: error: unrecognized arguments: --logfile /var/log/ceilometer/api.log
Oct 11 15:13:43 undercloud72 systemd: openstack-ceilometer-api.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Oct 11 15:13:43 undercloud72 systemd: Unit openstack-ceilometer-api.service entered failed state.
Oct 11 15:13:43 undercloud72 systemd: openstack-ceilometer-api.service failed.
Oct 11 15:13:43 undercloud72 ironic-api: 2016-10-11 15:13:43.272 6214 INFO eventlet.wsgi.server [req-5dfe1370-a75b-4fd3-acab-33ae4e124f1f ironic service - - -] 192.168.0.1 "GET /v1/nodes HTTP/1.1" status: 200  len: 3112 time: 0.2385290
Oct 11 15:13:43 undercloud72 systemd: openstack-ceilometer-api.service holdoff time over, scheduling restart.
Oct 11 15:13:43 undercloud72 systemd: start request repeated too quickly for openstack-ceilometer-api.service
Oct 11 15:13:43 undercloud72 systemd: Failed to start OpenStack ceilometer API service.
Oct 11 15:13:43 undercloud72 systemd: Unit openstack-ceilometer-api.service entered failed state.
Oct 11 15:13:43 undercloud72 systemd: openstack-ceilometer-api.service failed.
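
Editorial note: the usage line in the log ("usage: ceilometer-api [-h] [--port PORT]") has the shape of Python argparse output, and argparse exits with status 2 on unrecognized arguments, which systemd reports as status=2/INVALIDARGUMENT. The sketch below reproduces that behavior. It assumes the Newton ceilometer-api entry point uses a plain argparse parser; build_parser and exit_status are hypothetical names, not ceilometer code.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the ceilometer-api parser: only the
    # options visible in the logged usage line are declared.
    parser = argparse.ArgumentParser(prog="ceilometer-api")
    parser.add_argument("--port", type=int)
    return parser


def exit_status(argv):
    """Return the exit status a process would end with for argv."""
    try:
        build_parser().parse_args(argv)
    except SystemExit as exc:
        # argparse prints the usage and error to stderr, then exits.
        return exc.code
    return 0


if __name__ == "__main__":
    # Same arguments the unit file passes in ExecStart.
    print(exit_status(["--logfile", "/var/log/ceilometer/api.log"]))
    print(exit_status(["--port", "8777"]))
```

With the ExecStart arguments from the unit file, the parser rejects --logfile before any service code runs, which would explain why the start attempts fail immediately and hit the start limit.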

Comment 1 Alexander Chuzhoy 2016-10-11 19:47:22 UTC
So this version of openstack-ceilometer-api supports neither the --logfile nor the --log-file argument.

Downgraded to 6.1.3-2.el7ost and was able to start the service, as well as run the command with this argument.

Comment 5 Marios Andreou 2016-10-12 07:43:19 UTC
So, to be clear there are 2 issues here:

1. *undercloud* ceilometer is disabled by default on the OSP9 to OSP10 upgrade unless the operator sets "enable_telemetry = true" in undercloud.conf, so we definitely need a docs bug for the rhos-docs team.

2. *undercloud* ceilometer fails to start even manually after the upgrade. This is a legitimate issue, which prad has fixed with https://review.rdoproject.org/r/#/c/3239/ (merged). (@prad would this potentially affect the overcloud nodes too? I assume they get the same versions.)

I'd propose we use this bug for 2 (since it has the trace above pointing at that). I filed https://bugzilla.redhat.com/show_bug.cgi?id=1383923 for the docs fix and assigned it to rhos-docs.

I think we can also move this BZ to POST since that RDO fix merged.
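
Editorial note: for issue 1, a pre-upgrade check of the setting could look like the sketch below. It assumes undercloud.conf is an INI file with enable_telemetry under a [DEFAULT] section, and it treats a missing option as disabled, following the upgrade default described above. telemetry_enabled is a hypothetical helper, not part of instack-undercloud.

```python
import configparser


def telemetry_enabled(conf_text):
    """Return True if enable_telemetry is set to a true value in [DEFAULT]."""
    # Interpolation is disabled so literal '%' characters in other
    # option values cannot break parsing.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    # Assumption: an absent option means telemetry is off after the upgrade.
    return parser.getboolean("DEFAULT", "enable_telemetry", fallback=False)


if __name__ == "__main__":
    print(telemetry_enabled("[DEFAULT]\nenable_telemetry = true\n"))
```

To check a real file, read its contents first and pass the text in, for example the undercloud.conf in the stack user's home directory.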

Comment 6 Marios Andreou 2016-10-12 08:54:16 UTC
Just reporting back to confirm that when I set enable_telemetry to true in /home/stack/undercloud.conf before running the OSP9-->OSP10 undercloud upgrade, I get to keep my ceilometer services (except for ceilometer-api, which still fails with the "--logfile" argument issue that this BZ reports; I haven't applied the fix yet):


        [stack@instack ~]$ grep enable_telemetry ./undercloud.conf 
        enable_telemetry = true
        [stack@instack ~]$ openstack-service status  | grep ceilo
        Failed to get properties: Unit name openstack-zaqar@.service is missing the instance name.
        MainPID=0 Id=openstack-ceilometer-api.service ActiveState=failed
        MainPID=12570 Id=openstack-ceilometer-central.service ActiveState=active
        MainPID=12504 Id=openstack-ceilometer-collector.service ActiveState=active
        MainPID=12491 Id=openstack-ceilometer-notification.service ActiveState=active
        [stack@instack ~]$ systemctl status -l openstack-ceilometer-api
        ● openstack-ceilometer-api.service - OpenStack ceilometer API service
           Loaded: loaded (/usr/lib/systemd/system/openstack-ceilometer-api.service; enabled; vendor preset: disabled)
           Active: failed (Result: start-limit) since Wed 2016-10-12 04:28:36 EDT; 18min ago
          Process: 12683 ExecStart=/usr/bin/ceilometer-api --logfile /var/log/ceilometer/api.log (code=exited, status=2)
         Main PID: 12683 (code=exited, status=2)
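
Editorial note: the "openstack-service status" output above can be filtered for failed units with a few lines of Python. This is a sketch that assumes each unit is reported on one line of space-separated KEY=VALUE fields, as in the session above; failed_units is a hypothetical helper.

```python
def failed_units(status_text, name_filter="ceilo"):
    """Return the Ids of failed units whose name contains name_filter."""
    failed = []
    for line in status_text.splitlines():
        # Keep only KEY=VALUE tokens; error lines without '=' yield {}.
        fields = dict(
            token.split("=", 1) for token in line.split() if "=" in token
        )
        if (fields.get("ActiveState") == "failed"
                and name_filter in fields.get("Id", "")):
            failed.append(fields["Id"])
    return failed


# Sample lines copied from the session above.
SAMPLE = """\
Failed to get properties: Unit name openstack-zaqar@.service is missing the instance name.
MainPID=0 Id=openstack-ceilometer-api.service ActiveState=failed
MainPID=12570 Id=openstack-ceilometer-central.service ActiveState=active
MainPID=12504 Id=openstack-ceilometer-collector.service ActiveState=active
MainPID=12491 Id=openstack-ceilometer-notification.service ActiveState=active
"""

if __name__ == "__main__":
    print(failed_units(SAMPLE))
```

On the sample, only openstack-ceilometer-api.service is reported, and the zaqar error line is skipped because it carries no KEY=VALUE fields.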

Comment 7 Yurii Prokulevych 2016-10-12 11:59:46 UTC
Hi Marios,

Looks like the issue is not just with '--logfile' argument but also with port binding. I've opened separate bz 1384005 to track it.

Comment 8 Marios Andreou 2016-10-12 12:19:50 UTC
(In reply to Yurii Prokulevych from comment #7)
> Hi Marios,
> 
> Looks like the issue is not just with '--logfile' argument but also with
> port binding. I've opened separate bz 1384005 to track it.

Thanks for taking a look, Yurii. I left a note there. I'm not saying there isn't another issue, but I'm not clear whether the undercloud config was properly applied; see https://bugzilla.redhat.com/show_bug.cgi?id=1384005#c1.

I guess we will also find out soon enough once we get the fix from Prad linked above into a puddle for testing.

Comment 9 Mike Burns 2016-10-14 17:41:57 UTC
7.0.0-2.1 does not yet include that fix.

Comment 14 errata-xmlrpc 2016-12-14 16:16:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html

