Bug 1395666

Summary: heat stack-update of the overcloud stops executing the puppet modules after systemd-journald restart
Product: Red Hat OpenStack
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Keywords: Reopened, Triaged, ZStream
Reporter: Eduard Barrera <ebarrera>
Assignee: Alex Schultz <aschultz>
QA Contact: Omri Hochman <ohochman>
CC: aschultz, astupnik, athomas, dbecker, ebarrera, emacchi, jslagle, mburns, morazi, pablo.iranzo, pcaruana, rcernin, sbaker, shardy, srevivo
Last Closed: 2018-04-23 14:35:13 UTC
Type: Bug

Description Eduard Barrera 2016-11-16 11:49:44 UTC
Description of problem:
We use OSP director to deploy the overcloud and add our custom features and configuration via heat stack-update. Those changes are applied by custom puppet classes, one of which configures the journal to use persistent storage with the following code:

    # Keep journald running; the notify below restarts it whenever the
    # drop-in configuration changes.
    service { 'systemd-journald':
        ensure => 'running',
        enable => true,
    }

    file { '/etc/systemd/journald.conf.d':
        ensure => 'directory',
        owner  => 'root',
        group  => 'root',
        mode   => '0755',
    }

    # Drop-in that switches the journal to persistent storage; the file name
    # and storage value appear to be redacted here as xxxxxx.
    file { '/etc/systemd/journald.conf.d/xxxxxx.conf':
        ensure  => 'present',
        require => File['/etc/systemd/journald.conf.d'],
        owner   => 'root',
        group   => 'root',
        mode    => '0644',
        content => "[Journal]\nStorage=xxxxxx\n",
        notify  => Service['systemd-journald'],
    }
This puppet manifest executes successfully and restarts systemd-journald on every node, but the puppet execution then stops. Many heat resources are left in CREATE_IN_PROGRESS and make no progress, even though in a normal run they complete in a few seconds. The stack update eventually fails after a few hours, when the keystone token expires; those resources then go to CREATE_FAILED, but when I check a resource with heat deployment-output-show I get nothing back (see the attached commands.txt). While debugging the problem I noticed that if I remove the notify, the stack update finishes successfully.
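
For reference, the workaround amounts to managing the drop-in without the notify and restarting journald out of band once the stack update has completed. A minimal sketch of that change (my illustration of what is described above, not the customer's actual manifest):

    file { '/etc/systemd/journald.conf.d/xxxxxx.conf':
        ensure  => 'present',
        require => File['/etc/systemd/journald.conf.d'],
        owner   => 'root',
        group   => 'root',
        mode    => '0644',
        content => "[Journal]\nStorage=xxxxxx\n",
        # notify => Service['systemd-journald'],  # removed: the journald
        # restart it triggers is what hangs the stack update
    }

With this change, systemd-journald still has to be restarted manually on each node (systemctl restart systemd-journald) for the new Storage setting to take effect.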



Version-Release number of selected component (if applicable):

OSP7

$ grep heat */installed-rpms | awk '{print $1}'
overcloud-controller-0.localdomain/installed-rpms:openstack-heat-api-2015.1.4-1.el7ost.noarch
overcloud-controller-0.localdomain/installed-rpms:openstack-heat-api-cfn-2015.1.4-1.el7ost.noarch
overcloud-controller-0.localdomain/installed-rpms:openstack-heat-api-cloudwatch-2015.1.4-1.el7ost.noarch
overcloud-controller-0.localdomain/installed-rpms:python-heatclient-0.6.0-1.el7ost.noarch
undercloud/installed-rpms:heat-cfntools-1.2.8-2.el7.noarch
undercloud/installed-rpms:ncio-tripleo-heat-templates-1.2-1.el7.noarch
undercloud/installed-rpms:openstack-heat-api-2015.1.4-1.el7ost.noarch
undercloud/installed-rpms:openstack-heat-api-cfn-2015.1.4-1.el7ost.noarch
undercloud/installed-rpms:openstack-heat-api-cloudwatch-2015.1.4-1.el7ost.noarch
undercloud/installed-rpms:openstack-heat-common-2015.1.4-1.el7ost.noarch
undercloud/installed-rpms:openstack-heat-engine-2015.1.4-1.el7ost.noarch
undercloud/installed-rpms:openstack-heat-templates-0-0.8.20150605git.el7ost.noarch
undercloud/installed-rpms:openstack-tripleo-heat-templates-0.8.6-127.el7ost.noarch
undercloud/installed-rpms:python-heatclient-0.6.0-1.el7ost.noarch


How reproducible:
Always

Steps to Reproduce:
1. Create a heat stack containing the above puppet manifest that restarts journald (the stack I used to reproduce is attached; a minimal sketch of such a stack follows below)
2. Update the overcloud with heat stack-update
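
For illustration only (the actual reproducer stack is attached to this bug and not shown here), a minimal stack of this shape, assuming the standard heat puppet software-config hook and a hypothetical server_id parameter, would look roughly like:

    heat_template_version: 2015-04-30

    parameters:
      server_id:
        type: string
        description: Nova ID of an overcloud node (hypothetical parameter)

    resources:
      journald_config:
        type: OS::Heat::SoftwareConfig
        properties:
          group: puppet
          config: |
            # manifest from the description, including the notify that
            # restarts systemd-journald
            service { 'systemd-journald': ensure => 'running', enable => true }
            file { '/etc/systemd/journald.conf.d': ensure => 'directory' }
            file { '/etc/systemd/journald.conf.d/xxxxxx.conf':
              ensure  => 'present',
              require => File['/etc/systemd/journald.conf.d'],
              content => "[Journal]\nStorage=xxxxxx\n",
              notify  => Service['systemd-journald'],
            }

      journald_deployment:
        type: OS::Heat::SoftwareDeployment
        properties:
          config: {get_resource: journald_config}
          server: {get_param: server_id}

Updating a stack that applies this config reproduces the hang: the notify restarts journald mid-deployment and the remaining software deployments stay in progress.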

Actual results:
The update hangs

Expected results:
The update finishes.

Additional info:
Also reproduced with OSP9.
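
A quick way to see the stuck state (my own suggestion, not from the original report): on the undercloud, list the resources that stay in progress, and on an overcloud node check whether os-collect-config, the agent that polls heat and applies software deployments, survived the journald restart:

    # undercloud (heatclient 0.6.0 era CLI)
    $ heat resource-list overcloud | grep -i in_progress

    # overcloud node
    $ sudo systemctl status os-collect-config
    $ sudo journalctl -u os-collect-config --since "1 hour ago"

If os-collect-config died during the update, the in-flight deployments never report back to heat, which would match the behaviour described above.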

Comment 9 Steve Baker 2018-04-16 20:41:30 UTC

*** This bug has been marked as a duplicate of bug 1315442 ***

Comment 10 Alex Stupnikov 2018-04-17 06:50:25 UTC
Hello Steve. Please note that this bug reports the same issue as bug 1315442, but nominates it for a different RHOSP version. I am sorry about the confusion created by my comment. I have re-opened the bug.

BR, Alex.

Comment 11 Alex Schultz 2018-04-23 14:35:13 UTC
At this point we're not going to fix the issue in OSP7. If this is still occurring on one of the newer versions, please let us know.