Bug 1730435 - [RHOSP10]"openstack overcloud update" command fails because of timeout. The same messages are looped in heat-engine.log
Keywords:
Status: CLOSED DUPLICATE of bug 1678225
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-heat
Version: 10.0 (Newton)
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Alex Schultz
QA Contact: Victor Voronkov
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-07-16 17:49 UTC by Alex Stupnikov
Modified: 2019-07-19 22:05 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-19 11:05:14 UTC
Target Upstream Version:



Description Alex Stupnikov 2019-07-16 17:49:27 UTC
Description of problem:

The customer is unable to complete the minor update procedure because "openstack overcloud update stack -i overcloud" fails after a timeout without any useful output.

Observations:

- originally the customer issued the overcloud update command against an environment without an active subscription. We observed yum-related errors and fixed them by allocating the proper subscriptions and enabling the proper repos;
- the customer then successfully ran "openstack overcloud deploy" to restore the overcloud stack's health;
- currently the "openstack overcloud update stack -i overcloud" command fails because of a timeout.

In heat-engine.log we see the same messages repeating in a loop, for example:

2019-07-12 16:46:19.179 4410 DEBUG heat.engine.scheduler [req-037781ea-281c-401c-b710-ad2029b71a45 - - - - -] Task update_task from Stack "overcloud-Compute-jyjpqgykup3r-0-wlgyupxughgk" [30f3cb61-8a59-4c24-bbcb-63693a03e84f] running step /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:216
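A minimal sketch of how these repeating scheduler messages could be tallied per nested stack, to see which stack keeps looping. The regular expression is inferred only from the single log line quoted above, and the name count_running_steps is hypothetical; it is not part of Heat.

```python
import re
from collections import Counter

# Matches the "running step" DEBUG line format quoted in this report.
# The pattern is an assumption based on that one sample line.
STEP_RE = re.compile(
    r'Task (?P<task>\S+) from Stack "(?P<stack>[^"]+)" '
    r'\[(?P<stack_id>[0-9a-f-]+)\] running step'
)


def count_running_steps(lines):
    """Count 'running step' messages per (stack name, stack id)."""
    counts = Counter()
    for line in lines:
        match = STEP_RE.search(line)
        if match:
            counts[(match.group("stack"), match.group("stack_id"))] += 1
    return counts


# The log line from this report, repeated to mimic the loop.
LOOPING_LINE = (
    '2019-07-12 16:46:19.179 4410 DEBUG heat.engine.scheduler '
    '[req-037781ea-281c-401c-b710-ad2029b71a45 - - - - -] '
    'Task update_task from Stack "overcloud-Compute-jyjpqgykup3r-0-wlgyupxughgk" '
    '[30f3cb61-8a59-4c24-bbcb-63693a03e84f] running step '
    '/usr/lib/python2.7/site-packages/heat/engine/scheduler.py:216'
)
sample = [LOOPING_LINE, "some unrelated log line", LOOPING_LINE]

counts = count_running_steps(sample)
for (stack, stack_id), hits in counts.most_common():
    print(hits, stack, stack_id)
```

To run it against a real log, pass an open file object for heat-engine.log to count_running_steps instead of the sample list. A stack with a very high count compared with the others is a reasonable first candidate for the resource that never completes.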

The latest failed command was issued at 2019-07-12 12:42 (local time). Log files and a sosreport will be provided privately.

At this point we need help identifying the root cause.

Comment 2 Alex Schultz 2019-07-16 20:17:16 UTC
Traditionally, when it loops like that, a deployment task never completed on one of the systems. Was that compute node down at the time of deployment, or was os-collect-config not running? Please provide a sosreport from the compute node.

Comment 6 Alex Schultz 2019-07-17 17:10:24 UTC
They are likely hitting Bug 1678225, which was resolved in openstack-heat-7.0.6-6.el7ost, but according to the sosreports they have openstack-heat-7.0.6-4.el7ost.noarch.
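The version check above can be sketched as follows. This is a simplified comparison written for illustration, not rpm's actual rpmvercmp algorithm (it ignores epochs, tildes and carets), and the helper names split_nvr, compare_labels and has_fix are hypothetical.

```python
import re

# Architecture suffixes stripped from the release field (illustrative list).
KNOWN_ARCHES = ("noarch", "x86_64")


def split_nvr(package):
    """Split 'name-version-release[.arch]' into (name, version, release)."""
    name, version, release = package.rsplit("-", 2)
    for arch in KNOWN_ARCHES:
        suffix = "." + arch
        if release.endswith(suffix):
            release = release[: -len(suffix)]
    return name, version, release


def _segments(label):
    # Runs of digits or letters; separators such as '.' are dropped.
    return re.findall(r"\d+|[A-Za-z]+", label)


def compare_labels(a, b):
    """Return -1, 0 or 1 for a < b, a == b, a > b (simplified rules)."""
    seg_a, seg_b = _segments(a), _segments(b)
    for x, y in zip(seg_a, seg_b):
        if x.isdigit() and y.isdigit():
            if int(x) != int(y):
                return -1 if int(x) < int(y) else 1
        elif x.isdigit() != y.isdigit():
            # A numeric segment is treated as newer than an alphabetic one.
            return 1 if x.isdigit() else -1
        elif x != y:
            return -1 if x < y else 1
    if len(seg_a) == len(seg_b):
        return 0
    return -1 if len(seg_a) < len(seg_b) else 1


def has_fix(installed, fixed_in):
    """True if the installed package is at least as new as fixed_in."""
    name_i, ver_i, rel_i = split_nvr(installed)
    name_f, ver_f, rel_f = split_nvr(fixed_in)
    if name_i != name_f:
        raise ValueError("package names differ")
    by_version = compare_labels(ver_i, ver_f)
    if by_version != 0:
        return by_version > 0
    return compare_labels(rel_i, rel_f) >= 0


print(has_fix("openstack-heat-7.0.6-4.el7ost.noarch",
              "openstack-heat-7.0.6-6.el7ost"))
```

For the two package strings quoted in this comment the check reports that the installed build predates the fixed one, which matches the conclusion that the undercloud needs updating.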

Comment 7 Alex Schultz 2019-07-17 17:12:37 UTC
Please have them update the undercloud and try again. I will leave this bug open for a bit longer, but it likely needs to be marked as a duplicate of Bug 1678225

Comment 8 Alex Stupnikov 2019-07-19 09:32:37 UTC
Alex, thank you for your help, and sorry for the ambiguous report. The customer confirmed your conclusion and closed the case. I believe this bug can also be closed.

Regards, Alex S.

Comment 9 Alex Schultz 2019-07-19 22:05:39 UTC

*** This bug has been marked as a duplicate of bug 1678225 ***

