Bug 1315442 - rhel-osp-director: Upgrade undercloud 7.3->8.0. First run of "openstack undercloud install" fails, re-running completes fine.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: instack-undercloud
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ga
Target Release: 8.0 (Liberty)
Assignee: Marios Andreou
QA Contact: Alexander Chuzhoy
URL:
Whiteboard:
Duplicates: 1326809
Depends On:
Blocks:
Reported: 2016-03-07 18:19 UTC by Alexander Chuzhoy
Modified: 2022-08-16 14:06 UTC (History)
CC List: 16 users

Fixed In Version: instack-undercloud-2.2.7-4.el7ost
Doc Type: Bug Fix
Doc Text:
Cause: During the undercloud upgrade, the restart of the systemd-journald service can fail after the undercloud packages have been updated.
Consequence: "openstack undercloud upgrade" exits with an error such as "2016-03-07 09:55:52,925 INFO: ERROR: 2016-03-07 09:55:52,924 -- Hook FAILED.". Re-running "openstack undercloud upgrade" at this point completes without error.
Fix: Ensure that the systemd-journald service restart completes successfully, as done in https://review.openstack.org/#/c/300051/.
Result: "openstack undercloud upgrade" completes successfully on the first run.
Clone Of:
Environment:
Last Closed: 2016-04-15 14:31:11 UTC
Target Upstream Version:
Embargoed:


Attachments
install-undercloud.log (135.03 KB, application/x-gzip), 2016-03-07 18:21 UTC, Alexander Chuzhoy


Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 300051 | 0 | None | MERGED | Temporarily set +e on systemd-journald restart for +bug/1564471 | 2019-11-29 17:14:01 UTC
OpenStack gerrit | 300360 | 0 | None | MERGED | Temporarily set +e on systemd-journald restart for +bug/1564471 | 2019-11-29 17:14:01 UTC
Red Hat Issue Tracker | OSP-4548 | 0 | None | None | None | 2022-08-16 14:06:03 UTC
Red Hat Product Errata | RHBA-2016:0637 | 0 | normal | SHIPPED_LIVE | Red Hat OpenStack Platform 8 director release candidate Bug Fix Advisory | 2016-04-15 18:28:05 UTC

Description Alexander Chuzhoy 2016-03-07 18:19:11 UTC
rhel-osp-director: Upgrade undercloud 7.3->8.0. First run of "openstack undercloud install" fails, rerunning completes fine.


Environment:
instack-undercloud-2.2.4-1.el7ost.noarch

Steps to reproduce:
1. Deploy 7.3.
2. Get the yum repos for 8.0.
3. Run yum update on the undercloud node.
4. Run "openstack undercloud install".

Result: "openstack undercloud install" fails:


2016-03-07 09:55:38,018 INFO: + touch /etc/sysconfig/iptables
2016-03-07 09:55:38,023 INFO: + os-svc-enable -n iptables
2016-03-07 09:55:38,057 INFO: WARNING: map-services has been deprecated.  Please use the svc-map element.
2016-03-07 09:55:38,224 INFO: + os-svc-restart -n iptables
2016-03-07 09:55:38,261 INFO: WARNING: map-services has been deprecated.  Please use the svc-map element.
2016-03-07 09:55:38,532 INFO: + target_tag=01-iptables
2016-03-07 09:55:38,533 INFO: + date +%s.%N
2016-03-07 09:55:38,535 INFO: + output '01-iptables completed'
2016-03-07 09:55:38,535 INFO: ++ date
2016-03-07 09:55:38,537 INFO: + echo dib-run-parts Mon Mar 7 09:55:38 EST 2016 01-iptables completed
2016-03-07 09:55:38,537 INFO: dib-run-parts Mon Mar 7 09:55:38 EST 2016 01-iptables completed
2016-03-07 09:55:38,537 INFO: + for target in '$targets'
2016-03-07 09:55:38,537 INFO: + output 'Running /tmp/tmp5t46Kc/pre-install.d/01-persistent-journal'
2016-03-07 09:55:38,537 INFO: ++ date
2016-03-07 09:55:38,538 INFO: + echo dib-run-parts Mon Mar 7 09:55:38 EST 2016 Running /tmp/tmp5t46Kc/pre-install.d/01-persistent-journal
2016-03-07 09:55:38,538 INFO: dib-run-parts Mon Mar 7 09:55:38 EST 2016 Running /tmp/tmp5t46Kc/pre-install.d/01-persistent-journal
2016-03-07 09:55:38,538 INFO: + target_tag=01-persistent-journal
2016-03-07 09:55:38,538 INFO: + date +%s.%N
2016-03-07 09:55:38,539 INFO: + /tmp/tmp5t46Kc/pre-install.d/01-persistent-journal
2016-03-07 09:55:52,918 INFO: Job for systemd-journald.service failed. See "systemctl status systemd-journald.service" and "journalctl -xe" for details.
2016-03-07 09:55:52,925 INFO: INFO: 2016-03-07 09:55:52,923 -- ############### End stdout/stderr logging ###############
2016-03-07 09:55:52,925 INFO: ERROR: 2016-03-07 09:55:52,924 --     Hook FAILED.
2016-03-07 09:55:52,925 INFO: ERROR: 2016-03-07 09:55:52,924 -- Failed running command ['dib-run-parts', u'/tmp/tmp5t46Kc/pre-install.d']
2016-03-07 09:55:52,925 INFO:   File "/usr/lib/python2.7/site-packages/instack/main.py", line 163, in main
2016-03-07 09:55:52,926 INFO:     em.run()
2016-03-07 09:55:52,926 INFO:   File "/usr/lib/python2.7/site-packages/instack/runner.py", line 79, in run
2016-03-07 09:55:52,950 INFO:     self.run_hook(hook)
2016-03-07 09:55:52,951 INFO:   File "/usr/lib/python2.7/site-packages/instack/runner.py", line 172, in run_hook
2016-03-07 09:55:52,951 INFO:     raise Exception("Failed running command %s" % command)
2016-03-07 09:55:52,951 INFO: ERROR: 2016-03-07 09:55:52,940 -- None



Note: rerunning "openstack undercloud install" completes without issues.

Expected result:
Successful run of "openstack undercloud install" on first attempt.
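
When reading a log like the one above, the failing hook is the last script that dib-run-parts announced before "Hook FAILED." appears. A minimal sketch for pulling that out of an (uncompressed) install log follows. The function name failed_hook is hypothetical and not part of instack-undercloud, and the matching assumes the line format shown in the trace above.

```shell
#!/bin/bash
# Hypothetical helper (not part of instack-undercloud): print the last script
# that dib-run-parts reported as "Running" before "Hook FAILED." shows up.
# Reads the file given as $1, or standard input when no file is given.
failed_hook() {
    awk '
        / Running \// { last = $NF }                       # most recent script path
        /Hook FAILED/ { print last; found = 1; exit }      # stop at the first failure
        END           { exit (found ? 0 : 1) }             # non-zero if nothing failed
    ' "${1:--}"
}
```

Since the attached log is gzipped, it could be fed in with something like "zcat install-undercloud.log | failed_hook". For the trace in this report the expected answer is the 01-persistent-journal script under pre-install.d.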

Comment 2 Alexander Chuzhoy 2016-03-07 18:21:36 UTC
Created attachment 1133888 [details]
install-undercloud.log

Comment 3 Brad P. Crochet 2016-03-17 12:35:32 UTC
I am unable to reproduce this. Please retest on latest build.

Comment 4 Alexander Chuzhoy 2016-03-17 20:14:24 UTC
FailedQA.
Reproduced against the latest poodle.

Comment 5 Lukas Bezdicka 2016-03-29 12:14:15 UTC
I reproduced the journal issue on a memory-constrained system. Enabling swap on the undercloud node helps.

Comment 6 Andreas Karis 2016-03-29 15:44:15 UTC
I just reproduced this on a system with enough RAM and swap available, so it does not seem to be an issue with memory constraints:

[root@undercloud site-packages]# free -m
              total        used        free      shared  buff/cache   available
Mem:          15887        3518        3228         759        9139       11251
Swap:         32767           0       32767

Comment 7 Omri Hochman 2016-03-29 17:02:17 UTC
Reproduced with:
----------------
instack-0.0.8-2.el7ost.noarch
instack-undercloud-2.2.7-1.el7ost.noarch
puppet-3.6.2-2.el7.noarch
openstack-puppet-modules-7.0.17-1.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.5-1.el7ost.noarch
openstack-tripleo-heat-templates-kilo-0.8.14-1.el7ost.noarch
python-tripleoclient-0.3.4-1.el7ost.noarch
openstack-tripleo-image-elements-0.9.9-1.el7ost.noarch
openstack-tripleo-heat-templates-0.8.14-1.el7ost.noarch
openstack-tripleo-0.0.7-1.el7ost.noarch
openstack-tripleo-common-0.3.1-1.el7ost.noarch
openstack-tripleo-puppet-elements-0.0.5-1.el7ost.noarch

Comment 10 Alexander Chuzhoy 2016-03-30 22:39:37 UTC
Reproduced with "openstack undercloud upgrade".

Comment 11 Marios Andreou 2016-03-31 12:36:22 UTC
I am trying to understand more about this today. Per the trace above, it fails while running this file:
https://github.com/openstack/instack-undercloud/blob/master/elements/undercloud-install/pre-install.d/01-persistent-journal

I am also reproducing this on a setup today, using "openstack undercloud upgrade".

When I first deployed an overcloud and then upgraded my undercloud, it reproduced as above. When I just upgraded a fresh new undercloud, it completed OK.

Comment 12 Marios Andreou 2016-03-31 14:54:59 UTC
This is from the journal when the error just happened again:

Mar 31 09:30:57 instack.localdomain systemd[1]: Stopping Flush Journal to Persistent Storage...
Mar 31 09:30:57 instack.localdomain systemd-journal[393]: Journal stopped
Mar 31 09:30:57 instack.localdomain systemd-journal[23175]: Permanent journal is using 8.0M (max allowed 2.9G, trying to leave 4.0G free of 14.5G available → current limit 2.9G).
Mar 31 09:30:57 instack.localdomain systemd-journal[23175]: Permanent journal is using 8.0M (max allowed 2.9G, trying to leave 4.0G free of 14.5G available → current limit 2.9G).
Mar 31 09:31:07 instack.localdomain systemd-journal[23252]: Permanent journal is using 168.0M (max allowed 2.9G, trying to leave 4.0G free of 14.3G available → current limit 2.9G).
Mar 31 09:31:07 instack.localdomain systemd-journald[393]: Received SIGTERM from PID 1 (systemd).
Mar 31 09:31:07 instack.localdomain systemd-journald[23175]: Failed to create new runtime journal: No such file or directory
Mar 31 09:31:07 instack.localdomain systemd-journald[23175]: Assertion 'f' failed at src/journal/journal-file.c:132, function journal_file_close(). Aborting.
Mar 31 09:31:07 instack.localdomain systemd-journal[23252]: Journal started
Mar 31 09:31:02 instack.localdomain sudo[23221]:    stack : TTY=pts/2 ; PWD=/home/stack ; USER=root ; COMMAND=/bin/journalctl -fn 100
Mar 31 09:31:07 instack.localdomain polkitd[663]: Unregistered Authentication Agent for unix-process:23169:362368 (system bus name :1.37, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disconnected from bus)
Mar 31 09:31:07 instack.localdomain systemd[1]: Starting Flush Journal to Persistent Storage...
Mar 31 09:31:07 instack.localdomain systemd[1]: Started Flush Journal to Persistent Storage.


There is an assertion failure above (in journal_file_close()), and since we have -e set in https://github.com/openstack/instack-undercloud/blob/master/elements/undercloud-install/pre-install.d/01-persistent-journal the failed restart makes the undercloud install fail.

Normally the -e is a good thing: we want to make sure nothing fails here.

For now, could we set +e for that restart, and then add a check to make sure systemd-journald is indeed running?
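
The idea above (tolerate a failing restart, then confirm the service is actually up) could look roughly like the sketch below. It is only an illustration and not the change that was later merged upstream. The function restart_and_verify, its parameters, and the default of 10 checks are hypothetical. It also uses "|| true" on the single restart command rather than toggling set +e / set -e, which has the same effect on that one command in a script running under -e.

```shell
#!/bin/bash
# Sketch only. The restart and check commands are passed in as arguments so the
# logic can be exercised without systemd.
restart_and_verify() {
    local restart_cmd="$1"
    local check_cmd="$2"
    local attempts="${3:-10}"
    local i=0

    # Tolerate a non-zero exit from the restart itself; "|| true" keeps a
    # caller running under "set -e" from aborting at this point.
    $restart_cmd || true

    # Then insist that the service really is up before continuing.
    while [ "$i" -lt "$attempts" ]; do
        if $check_cmd >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done

    echo "service did not become active after ${attempts} checks" >&2
    return 1
}
```

On an undercloud this might be invoked as: restart_and_verify "systemctl restart systemd-journald" "systemctl is-active systemd-journald". The hook would then still fail, but only if journald is genuinely down after the restart.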

Comment 13 Marios Andreou 2016-03-31 15:56:43 UTC
I have a workaround for this and will be posting it shortly (filed an upstream bug at https://bugs.launchpad.net/tripleo/+bug/1564471).

Comment 14 Marios Andreou 2016-03-31 16:08:26 UTC
I just tested the workaround at https://review.openstack.org/300051 and it seems to work for me. I'd appreciate more testing, please. For now you can apply it manually to your environment like this:

sudo su
pushd /usr/share/instack-undercloud/undercloud-install/pre-install.d/
mv 01-persistent-journal 01-persistent-journal.ORIG

curl -o 01-persistent-journal "https://review.openstack.org/gitweb?p=openstack/instack-undercloud.git;a=blob_plain;f=elements/undercloud-install/pre-install.d/01-persistent-journal;h=237308a0b002c45fcd7d638d1f1e8b21be4bf8b9;hb=a583594bb967767888f7baf19b062875ecc71ad3"

chmod 755 01-persistent-journal
popd
exit

thanks!
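
One caveat with the manual steps above: if the download fails or returns an empty file, the hook directory is left without a usable 01-persistent-journal. A hedged sketch of a safer swap follows. The function replace_hook and the temporary file name used in the note after it are hypothetical and not part of the posted workaround.

```shell
#!/bin/bash
# Hypothetical helper: install a replacement hook, keeping a .ORIG copy of the
# shipped script and refusing to touch anything if the replacement is unusable.
replace_hook() {
    local hook="$1"
    local replacement="$2"

    # Validate first, so a failed download never clobbers the working script.
    if [ ! -s "$replacement" ]; then
        echo "replacement $replacement is missing or empty; leaving $hook untouched" >&2
        return 1
    fi

    cp -p "$hook" "$hook.ORIG"              # same backup name as the manual steps
    install -m 755 "$replacement" "$hook"   # copy into place with mode 755
}
```

The patched file would first be fetched with the curl command above into a temporary path (say /tmp/01-persistent-journal.new) and then passed as the second argument, with the hook path under /usr/share/instack-undercloud/undercloud-install/pre-install.d/ as the first.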

Comment 18 Alexander Chuzhoy 2016-04-12 17:33:04 UTC
Verified:
Environment:
instack-undercloud-2.2.7-4.el7ost.noarch

The issue doesn't reproduce with the latest puddle.

Comment 20 errata-xmlrpc 2016-04-15 14:31:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0637.html

Comment 21 Alex Schultz 2017-03-03 21:11:44 UTC
*** Bug 1326809 has been marked as a duplicate of this bug. ***

Comment 22 Steve Baker 2018-04-16 20:41:30 UTC
*** Bug 1395666 has been marked as a duplicate of this bug. ***

