osp-director-10: After upgrading OSP9 to OSP10, the 'openstack-ceilometer-api' service won't start after reboot. [unrecognized arguments: --logfile /var/log/ceilometer/api.log]

Environment:
-------------
instack-undercloud-5.0.0-0.20160930175750.9d2a655.el7ost.noarch
instack-5.0.0-1.el7ost.noarch
openstack-tripleo-heat-templates-compat-2.0.0-34.3.el7ost.noarch
python-heatclient-1.5.0-1.el7ost.noarch
openstack-heat-common-7.0.0-0.20160926200847.dd707bc.el7ost.noarch
python-heat-tests-7.0.0-0.20160926200847.dd707bc.el7ost.noarch
openstack-tripleo-heat-templates-5.0.0-0.20161003064637.d636e3a.1.1.el7ost.noarch
python-heat-agent-0.0.1-0.20160920204709.f123aa1.el7ost.noarch
openstack-heat-templates-0.0.1-0.20160920204709.f123aa1.el7ost.noarch
openstack-heat-api-cfn-7.0.0-0.20160926200847.dd707bc.el7ost.noarch
puppet-heat-9.4.0-1.1.el7ost.noarch
openstack-heat-engine-7.0.0-0.20160926200847.dd707bc.el7ost.noarch
heat-cfntools-1.3.0-2.el7ost.noarch
openstack-heat-api-7.0.0-0.20160926200847.dd707bc.el7ost.noarch
openstack-ceilometer-central-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
python-ceilometer-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
openstack-ceilometer-api-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
openstack-ceilometer-common-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
puppet-ceilometer-9.4.0-1.el7ost.noarch
python-ceilometer-tests-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
openstack-ceilometer-collector-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
openstack-ceilometer-notification-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch
python-ceilometerclient-2.6.1-1.el7ost.noarch
openstack-ceilometer-polling-7.0.0-0.20160928024313.67bbd3f.el7ost.noarch

Steps:
--------
(1) Deploy OSP9 with 3 controllers, 1 compute
(2) Upgrade OSP9 to OSP10 following the guide: https://gitlab.cee.redhat.com/sathlang/ospd-9-to-10-upgrade
(3) Reboot the undercloud + the overcloud
(4) Check the openstack-ceilometer-api service on the undercloud
(5) Attempt to start the service manually.

Results:
----------
[root@undercloud72 ~]# systemctl status openstack-ceilometer-api.service
● openstack-ceilometer-api.service - OpenStack ceilometer API service
   Loaded: loaded (/usr/lib/systemd/system/openstack-ceilometer-api.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Tue 2016-10-11 15:30:37 EDT; 2s ago
  Process: 23493 ExecStart=/usr/bin/ceilometer-api --logfile /var/log/ceilometer/api.log (code=exited, status=2)
 Main PID: 23493 (code=exited, status=2)

Oct 11 15:30:37 undercloud72.localdomain systemd[1]: openstack-ceilometer-api.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: Unit openstack-ceilometer-api.service entered failed state.
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: openstack-ceilometer-api.service failed.
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: openstack-ceilometer-api.service holdoff time over, scheduling restart.
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: start request repeated too quickly for openstack-ceilometer-api.service
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: Failed to start OpenStack ceilometer API service.
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: Unit openstack-ceilometer-api.service entered failed state.
Oct 11 15:30:37 undercloud72.localdomain systemd[1]: openstack-ceilometer-api.service failed.

/var/log/messages:
---------------------
Oct 11 15:13:42 undercloud72 systemd: openstack-ceilometer-api.service holdoff time over, scheduling restart.
Oct 11 15:13:42 undercloud72 systemd: Started OpenStack ceilometer API service.
Oct 11 15:13:42 undercloud72 systemd: Starting OpenStack ceilometer API service...
Oct 11 15:13:43 undercloud72 ironic-inspector: 2016-10-11 15:13:43.027 728 DEBUG futurist.periodics [-] Submitting periodic function 'ironic_inspector.main.periodic_clean_up' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:614
Oct 11 15:13:43 undercloud72 ceilometer-api: usage: ceilometer-api [-h] [--port PORT]
Oct 11 15:13:43 undercloud72 ceilometer-api: ceilometer-api: error: unrecognized arguments: --logfile /var/log/ceilometer/api.log
Oct 11 15:13:43 undercloud72 systemd: openstack-ceilometer-api.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Oct 11 15:13:43 undercloud72 systemd: Unit openstack-ceilometer-api.service entered failed state.
Oct 11 15:13:43 undercloud72 systemd: openstack-ceilometer-api.service failed.
Oct 11 15:13:43 undercloud72 ironic-api: 2016-10-11 15:13:43.272 6214 INFO eventlet.wsgi.server [req-5dfe1370-a75b-4fd3-acab-33ae4e124f1f ironic service - - -] 192.168.0.1 "GET /v1/nodes HTTP/1.1" status: 200 len: 3112 time: 0.2385290
Oct 11 15:13:43 undercloud72 systemd: openstack-ceilometer-api.service holdoff time over, scheduling restart.
Oct 11 15:13:43 undercloud72 systemd: start request repeated too quickly for openstack-ceilometer-api.service
Oct 11 15:13:43 undercloud72 systemd: Failed to start OpenStack ceilometer API service.
Oct 11 15:13:43 undercloud72 systemd: Unit openstack-ceilometer-api.service entered failed state.
Oct 11 15:13:43 undercloud72 systemd: openstack-ceilometer-api.service failed.

So this version of openstack-ceilometer-api supports neither --logfile nor --log-file. After downgrading to 6.1.3-2.el7ost, I was able to start the service and also to run the command with this argument.
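
For anyone trying to reproduce the symptom outside the undercloud, here is a minimal sketch of the failure mode. It assumes the 7.0.0 entry point behaves like an argparse-style parser that defines only --port, which is what the usage line in /var/log/messages suggests. This is not ceilometer's actual code: build_parser and exit_status are hypothetical names, and the 8777 default is an assumption for illustration.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the 7.0.0 entry point. Per the usage line in
    # the log ("usage: ceilometer-api [-h] [--port PORT]"), only -h and
    # --port are accepted. The default port is an assumption.
    parser = argparse.ArgumentParser(prog="ceilometer-api")
    parser.add_argument("--port", type=int, default=8777)
    return parser


def exit_status(argv):
    """Return the exit status that parsing argv would produce (0 on success)."""
    try:
        build_parser().parse_args(argv)
    except SystemExit as exc:
        # argparse prints the usage and error to stderr, then exits.
        return exc.code
    return 0


if __name__ == "__main__":
    # Same arguments as the ExecStart line in the unit file.
    print(exit_status(["--logfile", "/var/log/ceilometer/api.log"]))
    print(exit_status(["--port", "8777"]))
```

A parser built this way rejects any option it does not declare and exits with status 2, which would match the status=2 that systemd reports above. Because the unit file passes the argument on every start, each restart attempt fails the same way until systemd hits the start limit.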

So, to be clear, there are 2 issues here:

1. *undercloud* ceilometer is disabled by default on the OSP9 to OSP10 upgrade unless the operator sets "enable_telemetry = true" in undercloud.conf, so we definitely need a docs bug for that, assigned to the rhos-docs team.

2. *undercloud* ceilometer fails to even start manually after the upgrade. There is a legitimate issue there that prad has fixed with https://review.rdoproject.org/r/#/c/3239/, which has merged. (@prad would this potentially affect the overcloud nodes too? I assume they get the same versions.)

I'd propose we use this bug for 2 (since it has the trace above pointing at that). I filed https://bugzilla.redhat.com/show_bug.cgi?id=1383923 for the docs fix and assigned it to rhos-docs. I think we can also move this BZ to POST since that RDO fix merged.
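
For issue 1, a pre-upgrade check along these lines might help operators notice the missing setting before they lose the services. It is only a sketch: it assumes enable_telemetry lives in the [DEFAULT] section of undercloud.conf and that an absent option means disabled, as described above. telemetry_enabled is a hypothetical helper name, not part of instack-undercloud.

```python
import configparser


def telemetry_enabled(conf_text):
    """Return True if enable_telemetry is set to a true value in [DEFAULT]."""
    # interpolation=None so '%' characters in other values (e.g. passwords)
    # are never interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    # Assumption: a missing option means telemetry is disabled.
    return parser.getboolean("DEFAULT", "enable_telemetry", fallback=False)


if __name__ == "__main__":
    with open("/home/stack/undercloud.conf") as handle:
        if not telemetry_enabled(handle.read()):
            print("enable_telemetry is not set to true in undercloud.conf")
```

The helper takes the file contents as a string so that it can be tried against a sample config without touching the real file.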

Just reporting back to confirm that when I set enable_telemetry to true in /home/stack/undercloud.conf before running the OSP9-->OSP10 undercloud upgrade, I get to keep my ceilometer services (except for the ceilometer-api issue that this BZ reports with the "--logfile" argument, for which I didn't apply/get the fix yet):

[stack@instack ~]$ grep enable_telemetry ./undercloud.conf
enable_telemetry = true

[stack@instack ~]$ openstack-service status | grep ceilo
Failed to get properties: Unit name openstack-zaqar@.service is missing the instance name.
MainPID=0 Id=openstack-ceilometer-api.service ActiveState=failed
MainPID=12570 Id=openstack-ceilometer-central.service ActiveState=active
MainPID=12504 Id=openstack-ceilometer-collector.service ActiveState=active
MainPID=12491 Id=openstack-ceilometer-notification.service ActiveState=active

[stack@instack ~]$ systemctl status -l openstack-ceilometer-api
● openstack-ceilometer-api.service - OpenStack ceilometer API service
   Loaded: loaded (/usr/lib/systemd/system/openstack-ceilometer-api.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Wed 2016-10-12 04:28:36 EDT; 18min ago
  Process: 12683 ExecStart=/usr/bin/ceilometer-api --logfile /var/log/ceilometer/api.log (code=exited, status=2)
 Main PID: 12683 (code=exited, status=2)
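
If it helps with verification after the upgrade, the status output above can be reduced to the failed units with a small sketch like the following. It assumes the "MainPID=... Id=... ActiveState=..." line layout shown in this comment; failed_units is a hypothetical helper name, not part of any OpenStack tooling.

```python
def failed_units(status_output):
    """Return the Id of every unit whose ActiveState is 'failed'."""
    failed = []
    for line in status_output.splitlines():
        # Each status line is a run of KEY=VALUE tokens; lines without any
        # such token (e.g. the zaqar warning) yield an empty dict.
        fields = dict(
            token.split("=", 1) for token in line.split() if "=" in token
        )
        if fields.get("ActiveState") == "failed" and "Id" in fields:
            failed.append(fields["Id"])
    return failed


if __name__ == "__main__":
    import sys

    # Example: openstack-service status | python failed_units.py
    for unit in failed_units(sys.stdin.read()):
        print(unit)
```

Fed the output from this comment, it should report only openstack-ceilometer-api.service, since the central, collector and notification services are active.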
Hi Marios, Looks like the issue is not just with '--logfile' argument but also with port binding. I've opened separate bz 1384005 to track it.

(In reply to Yurii Prokulevych from comment #7)
> Hi Marios,
>
> Looks like the issue is not just with '--logfile' argument but also with
> port binding. I've opened separate bz 1384005 to track it.

Thanks for taking a look, Yurii. I left a note there. I'm not saying there isn't another issue, but I'm not clear whether the undercloud config was properly applied; see https://bugzilla.redhat.com/show_bug.cgi?id=1384005#c1. I guess we will also find out soon enough once we get the fix from Prad linked above into a puddle for testing.
7.0.0-2.1 does not yet include that fix.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-2948.html