In scale lab testing with a 65-node overcloud, I was seeing it take 90 seconds to clear each node breakpoint on the UpdateDeployment resource. That is too long: it would take over an hour just to clear the breakpoints, before the update logic even starts.

Note that the heat client, used directly, can clear an individual breakpoint in just a few seconds. It can also clear all breakpoints on the compute nodes (60 of them) in around a minute:

    heat hook-clear --pre-update overcloud Compute/*/UpdateDeployment

So I suspect this issue is on the tripleo-common side:

    [stack@host03-rack02 ~]$ rpm -q openstack-tripleo-common
    openstack-tripleo-common-0.0.1.dev6-1.git49b57eb.el7ost.noarch
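As a rough check on the "over an hour" estimate, the sequential cost can be worked out from the two figures in this report. This is only a back-of-the-envelope sketch; the variable names are illustrative.

```python
# Back-of-the-envelope estimate using the figures from this report.
NODES = 65               # overcloud size in the scale lab
SECONDS_PER_CLEAR = 90   # observed time to clear one breakpoint

total_seconds = NODES * SECONDS_PER_CLEAR
total_minutes = total_seconds / 60.0

print("sequential clear: %d s (%.1f min)" % (total_seconds, total_minutes))
```

That works out to 5850 seconds, or 97.5 minutes, spent only on clearing breakpoints.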
Here's the captured output from the command:

[stack@host03-rack02 ~]$ time openstack overcloud update stack --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml overcloud -i
WAITING
on_breakpoint: [u'overcloud-compute-7', u'overcloud-compute-10', u'overcloud-compute-36', u'overcloud-compute-45', u'overcloud-compute-14',
    u'overcloud-compute-37', u'overcloud-compute-59', u'overcloud-compute-56', u'overcloud-compute-55', u'overcloud-compute-42',
    u'overcloud-compute-27', u'overcloud-compute-23', u'overcloud-compute-26', u'overcloud-compute-39', u'overcloud-compute-43',
    u'overcloud-compute-50', u'overcloud-compute-4', u'overcloud-compute-52', u'overcloud-compute-16', u'overcloud-compute-25',
    u'overcloud-compute-44', u'overcloud-compute-18', u'overcloud-compute-28', u'overcloud-compute-38', u'overcloud-compute-47',
    u'overcloud-compute-1', u'overcloud-cephstorage-1', u'overcloud-compute-0', u'overcloud-compute-3', u'overcloud-controller-0',
    u'overcloud-controller-2', u'overcloud-controller-1', u'overcloud-compute-21', u'overcloud-compute-8', u'overcloud-compute-40',
    u'overcloud-compute-48', u'overcloud-compute-5', u'overcloud-compute-12', u'overcloud-compute-51', u'overcloud-compute-24',
    u'overcloud-cephstorage-0', u'overcloud-compute-9', u'overcloud-compute-41', u'overcloud-compute-34', u'overcloud-compute-13',
    u'overcloud-compute-35', u'overcloud-compute-57', u'overcloud-compute-22', u'overcloud-compute-11', u'overcloud-compute-17',
    u'overcloud-compute-49', u'overcloud-compute-6', u'overcloud-compute-31', u'overcloud-compute-19', u'overcloud-compute-32',
    u'overcloud-compute-15', u'overcloud-compute-20', u'overcloud-compute-2', u'overcloud-compute-54', u'overcloud-compute-33',
    u'overcloud-compute-53', u'overcloud-compute-30', u'overcloud-compute-58', u'overcloud-compute-46', u'overcloud-compute-29']
Breakpoint reached, continue? Regexp or Enter=proceed, no=cancel update, C-c=quit interactive mode: .*
removing breakpoint on overcloud-compute-7
removing breakpoint on overcloud-compute-10
removing breakpoint on overcloud-compute-36
removing breakpoint on overcloud-compute-45
removing breakpoint on overcloud-compute-14
removing breakpoint on overcloud-compute-37
removing breakpoint on overcloud-compute-59
removing breakpoint on overcloud-compute-56
removing breakpoint on overcloud-compute-55
removing breakpoint on overcloud-compute-42
removing breakpoint on overcloud-compute-27
removing breakpoint on overcloud-compute-23
removing breakpoint on overcloud-compute-26
removing breakpoint on overcloud-compute-39
removing breakpoint on overcloud-compute-43
removing breakpoint on overcloud-compute-50
removing breakpoint on overcloud-compute-4
removing breakpoint on overcloud-compute-52
removing breakpoint on overcloud-compute-16
removing breakpoint on overcloud-compute-25
removing breakpoint on overcloud-compute-44
removing breakpoint on overcloud-compute-18

It was taking close to 90s for each "removing breakpoint ..." line to clear. That's when I switched to heatclient and bulk-cleared them all in one go.
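The interactive prompt accepts a regular expression to choose which nodes to release. How tripleo-common applies the pattern is not shown in this report, so the sketch below simply assumes a plain `re.match` against each node name; `select_nodes` is a hypothetical helper, not a tripleo-common function.

```python
import re

def select_nodes(pattern, nodes):
    """Return the nodes whose names match the regex (anchored at the start)."""
    regex = re.compile(pattern)
    return [name for name in nodes if regex.match(name)]

# A few node names taken from the on_breakpoint list in this report.
waiting = [
    "overcloud-compute-7",
    "overcloud-compute-10",
    "overcloud-cephstorage-1",
    "overcloud-controller-0",
]

everything = select_nodes(".*", waiting)                 # what the report typed
computes = select_nodes("overcloud-compute-", waiting)   # compute nodes only

print(everything)
print(computes)
```

Under this assumption a pattern such as `overcloud-compute-1` would also select `overcloud-compute-10`; adding a trailing `$` would restrict it to the single node.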
Created attachment 1069238
tripleo-common stack_update.py patch

Patch from Jan to test the performance improvement. It also adds debug logging to measure the time between each breakpoint clear call.
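The patch is only attached, not quoted, so its contents are not reproduced here. As a rough illustration of the kind of timing instrumentation described, the sketch below wraps a stand-in clear function with timestamps. `clear_breakpoint`, `clear_all` and the choice of `time.monotonic` are assumptions made for this example, not code from the attachment.

```python
import time

def clear_breakpoint(node):
    """Stand-in for the real Heat call; sleeps briefly to simulate work."""
    time.sleep(0.01)

def clear_all(nodes, clear=clear_breakpoint, clock=time.monotonic):
    """Clear each node in turn and record how long each call took."""
    durations = {}
    for node in nodes:
        started = clock()
        print("removing breakpoint on %s" % node)
        clear(node)
        durations[node] = clock() - started
    return durations

timings = clear_all(["overcloud-cephstorage-0", "overcloud-cephstorage-1"])
for node, seconds in timings.items():
    print("%s: %.3f s" % (node, seconds))
```

Passing the clear function and the clock as parameters keeps the loop easy to exercise without a running Heat service.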
Output when using the attached patch, showing a significant improvement. The full update completed in about 21 minutes, and according to the debug timestamps consecutive "removing breakpoint" lines are only about 0.2 to 1.4 seconds apart:

[stack@host03-rack02 ~]$ time openstack overcloud update stack --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e network-environment.yaml overcloud -i
starting package update on stack overcloud
WAITING
not_started: [u'overcloud-compute-7', u'overcloud-compute-10', u'overcloud-compute-36', u'overcloud-compute-45', u'overcloud-compute-14',
    u'overcloud-compute-37', u'overcloud-compute-59', u'overcloud-compute-56', u'overcloud-compute-55', u'overcloud-compute-42',
    u'overcloud-compute-27', u'overcloud-compute-23', u'overcloud-compute-26', u'overcloud-compute-39', u'overcloud-compute-43',
    u'overcloud-compute-50', u'overcloud-compute-4', u'overcloud-compute-52', u'overcloud-compute-16', u'overcloud-compute-25',
    u'overcloud-compute-44', u'overcloud-compute-18', u'overcloud-compute-28', u'overcloud-compute-38', u'overcloud-compute-47',
    u'overcloud-compute-1', u'overcloud-compute-0', u'overcloud-compute-3', u'overcloud-controller-0', u'overcloud-controller-2',
    u'overcloud-controller-1', u'overcloud-compute-21', u'overcloud-compute-8', u'overcloud-compute-40', u'overcloud-compute-48',
    u'overcloud-compute-5', u'overcloud-compute-12', u'overcloud-compute-51', u'overcloud-compute-24', u'overcloud-compute-9',
    u'overcloud-compute-41', u'overcloud-compute-34', u'overcloud-compute-13', u'overcloud-compute-35', u'overcloud-compute-57',
    u'overcloud-compute-22', u'overcloud-compute-11', u'overcloud-compute-17', u'overcloud-compute-49', u'overcloud-compute-6',
    u'overcloud-compute-31', u'overcloud-compute-19', u'overcloud-compute-32', u'overcloud-compute-15', u'overcloud-compute-20',
    u'overcloud-compute-2', u'overcloud-compute-54', u'overcloud-compute-33', u'overcloud-compute-53', u'overcloud-compute-30',
    u'overcloud-compute-58', u'overcloud-compute-46', u'overcloud-compute-29']
on_breakpoint: [u'overcloud-cephstorage-0', u'overcloud-cephstorage-1']
Breakpoint reached, continue? Regexp or Enter=proceed, no=cancel update, C-c=quit interactive mode: .*
xxx1 = 151.565850973
xxx2 = 151.565960884
removing breakpoint on overcloud-cephstorage-0
xxx3 = 151.566158056
xxx4 = 151.905207872
xxx2 = 151.90525198
removing breakpoint on overcloud-cephstorage-1
xxx3 = 151.905302048
xxx4 = 152.222728968
WAITING
completed: [u'overcloud-cephstorage-0', u'overcloud-cephstorage-1']
on_breakpoint: [u'overcloud-compute-7', u'overcloud-compute-10', u'overcloud-compute-36', u'overcloud-compute-45', u'overcloud-compute-14',
    u'overcloud-compute-37', u'overcloud-compute-59', u'overcloud-compute-56', u'overcloud-compute-55', u'overcloud-compute-42',
    u'overcloud-compute-27', u'overcloud-compute-23', u'overcloud-compute-26', u'overcloud-compute-39', u'overcloud-compute-43',
    u'overcloud-compute-50', u'overcloud-compute-4', u'overcloud-compute-52', u'overcloud-compute-16', u'overcloud-compute-25',
    u'overcloud-compute-44', u'overcloud-compute-18', u'overcloud-compute-28', u'overcloud-compute-38', u'overcloud-compute-47',
    u'overcloud-compute-1', u'overcloud-compute-0', u'overcloud-compute-3', u'overcloud-controller-0', u'overcloud-controller-2',
    u'overcloud-controller-1', u'overcloud-compute-21', u'overcloud-compute-8', u'overcloud-compute-40', u'overcloud-compute-48',
    u'overcloud-compute-5', u'overcloud-compute-12', u'overcloud-compute-51', u'overcloud-compute-24', u'overcloud-compute-9',
    u'overcloud-compute-41', u'overcloud-compute-34', u'overcloud-compute-13', u'overcloud-compute-35', u'overcloud-compute-57',
    u'overcloud-compute-22', u'overcloud-compute-11', u'overcloud-compute-17', u'overcloud-compute-49', u'overcloud-compute-6',
    u'overcloud-compute-31', u'overcloud-compute-19', u'overcloud-compute-32', u'overcloud-compute-15', u'overcloud-compute-20',
    u'overcloud-compute-2', u'overcloud-compute-54', u'overcloud-compute-33', u'overcloud-compute-53', u'overcloud-compute-30',
    u'overcloud-compute-58', u'overcloud-compute-46', u'overcloud-compute-29']
Breakpoint reached, continue? Regexp or Enter=proceed, no=cancel update, C-c=quit interactive mode: .*
xxx1 = 90.0696220398
xxx2 = 90.069685936
removing breakpoint on overcloud-compute-7
xxx3 = 90.0697479248
xxx4 = 90.3890998363
xxx2 = 90.3891439438
removing breakpoint on overcloud-compute-10
xxx3 = 90.389193058
xxx4 = 90.6645958424
xxx2 = 90.664634943
removing breakpoint on overcloud-compute-36
xxx3 = 90.6646769047
xxx4 = 91.0970318317
xxx2 = 91.0971038342
removing breakpoint on overcloud-compute-45
xxx3 = 91.0972080231
xxx4 = 91.4854798317
xxx2 = 91.485532999
removing breakpoint on overcloud-compute-14
xxx3 = 91.4855899811
xxx4 = 91.8707098961
xxx2 = 91.8707778454
removing breakpoint on overcloud-compute-37
xxx3 = 91.8708689213
xxx4 = 92.2763340473
xxx2 = 92.276419878
removing breakpoint on overcloud-compute-59
xxx3 = 92.2764949799
xxx4 = 92.6725299358
xxx2 = 92.6725840569
removing breakpoint on overcloud-compute-56
xxx3 = 92.6726579666
xxx4 = 93.0304319859
xxx2 = 93.0305030346
removing breakpoint on overcloud-compute-55
xxx3 = 93.0305988789
xxx4 = 93.5047240257
xxx2 = 93.5047729015
removing breakpoint on overcloud-compute-42
xxx3 = 93.5048408508
xxx4 = 94.1290340424
xxx2 = 94.1290860176
removing breakpoint on overcloud-compute-27
xxx3 = 94.1291520596
xxx4 = 94.5861740112
xxx2 = 94.5862300396
removing breakpoint on overcloud-compute-23
xxx3 = 94.5863089561
xxx4 = 95.2493729591
xxx2 = 95.2494459152
removing breakpoint on overcloud-compute-26
xxx3 = 95.2495479584
xxx4 = 95.8082299232
xxx2 = 95.8082900047
removing breakpoint on overcloud-compute-39
xxx3 = 95.8083748817
xxx4 = 96.3215780258
xxx2 = 96.3216228485
removing breakpoint on overcloud-compute-43
xxx3 = 96.3216750622
xxx4 = 96.8651218414
xxx2 = 96.8651909828
removing breakpoint on overcloud-compute-50
xxx3 = 96.8652789593
xxx4 = 97.3590559959
xxx2 = 97.3591089249
removing breakpoint on overcloud-compute-4
xxx3 = 97.3591680527
xxx4 = 97.9667749405
xxx2 = 97.9668149948
removing breakpoint on overcloud-compute-52
xxx3 = 97.9668600559
xxx4 = 98.317937851
xxx2 = 98.3179979324
removing breakpoint on overcloud-compute-16
xxx3 = 98.318073988
xxx4 = 98.7605190277
xxx2 = 98.7605609894
removing breakpoint on overcloud-compute-25
xxx3 = 98.760602951
xxx4 = 99.0217180252
xxx2 = 99.021766901
removing breakpoint on overcloud-compute-44
xxx3 = 99.0218298435
xxx4 = 99.274078846
xxx2 = 99.2741270065
removing breakpoint on overcloud-compute-18
xxx3 = 99.2741889954
xxx4 = 99.5340590477
xxx2 = 99.5341289043
removing breakpoint on overcloud-compute-28
xxx3 = 99.5342199802
xxx4 = 100.153893948
xxx2 = 100.15394187
removing breakpoint on overcloud-compute-38
xxx3 = 100.153992891
xxx4 = 100.483422995
xxx2 = 100.483454943
removing breakpoint on overcloud-compute-47
xxx3 = 100.48348999
xxx4 = 100.71407795
xxx2 = 100.714113951
removing breakpoint on overcloud-compute-1
xxx3 = 100.714148998
xxx4 = 100.956265926
xxx2 = 100.956300974
removing breakpoint on overcloud-compute-0
xxx3 = 100.956336021
xxx4 = 101.370429039
xxx2 = 101.370476961
removing breakpoint on overcloud-compute-3
xxx3 = 101.370525837
xxx4 = 101.855455875
xxx2 = 101.855515957
removing breakpoint on overcloud-controller-0
xxx3 = 101.855588913
xxx4 = 102.402540922
xxx2 = 102.40259099
removing breakpoint on overcloud-controller-2
xxx3 = 102.402644873
xxx4 = 103.054533005
xxx2 = 103.054605961
removing breakpoint on overcloud-controller-1
xxx3 = 103.054708958
xxx4 = 103.951122046
xxx2 = 103.951179981
removing breakpoint on overcloud-compute-21
xxx3 = 103.951247931
xxx4 = 104.784229994
xxx2 = 104.784288883
removing breakpoint on overcloud-compute-8
xxx3 = 104.784368992
xxx4 = 105.464069843
xxx2 = 105.464205027
removing breakpoint on overcloud-compute-40
xxx3 = 105.46438694
xxx4 = 106.120981932
xxx2 = 106.121058941
removing breakpoint on overcloud-compute-48
xxx3 = 106.121158838
xxx4 = 107.076825857
xxx2 = 107.076896906
removing breakpoint on overcloud-compute-5
xxx3 = 107.07699585
xxx4 = 107.631747961
xxx2 = 107.631821871
removing breakpoint on overcloud-compute-12
xxx3 = 107.631917953
xxx4 = 108.454469919
xxx2 = 108.454546928
removing breakpoint on overcloud-compute-51
xxx3 = 108.454638004
xxx4 = 109.316258907
xxx2 = 109.316310883
removing breakpoint on overcloud-compute-24
xxx3 = 109.316380024
xxx4 = 109.858588934
xxx2 = 109.858666897
removing breakpoint on overcloud-compute-9
xxx3 = 109.858757973
xxx4 = 110.428331852
xxx2 = 110.428400993
removing breakpoint on overcloud-compute-41
xxx3 = 110.428478956
xxx4 = 111.073166847
xxx2 = 111.073220015
removing breakpoint on overcloud-compute-34
xxx3 = 111.073269844
xxx4 = 111.430397987
xxx2 = 111.430450916
removing breakpoint on overcloud-compute-13
xxx3 = 111.430500031
xxx4 = 111.925314903
xxx2 = 111.925384998
removing breakpoint on overcloud-compute-35
xxx3 = 111.92544198
xxx4 = 113.081521034
xxx2 = 113.081616879
removing breakpoint on overcloud-compute-57
xxx3 = 113.081737041
xxx4 = 113.775181055
xxx2 = 113.775262833
removing breakpoint on overcloud-compute-22
xxx3 = 113.775378942
xxx4 = 114.350182056
xxx2 = 114.35027194
removing breakpoint on overcloud-compute-11
xxx3 = 114.350399971
xxx4 = 115.180938959
xxx2 = 115.181010962
removing breakpoint on overcloud-compute-17
xxx3 = 115.181087971
xxx4 = 115.866230965
xxx2 = 115.866313934
removing breakpoint on overcloud-compute-49
xxx3 = 115.866425991
xxx4 = 116.692327023
xxx2 = 116.692415953
removing breakpoint on overcloud-compute-6
xxx3 = 116.692517996
xxx4 = 118.110244989
xxx2 = 118.110321045
removing breakpoint on overcloud-compute-31
xxx3 = 118.110426903
xxx4 = 118.986992836
xxx2 = 118.987063885
removing breakpoint on overcloud-compute-19
xxx3 = 118.987154007
xxx4 = 119.587746859
xxx2 = 119.587826014
removing breakpoint on overcloud-compute-32
xxx3 = 119.587915897
xxx4 = 120.733727932
xxx2 = 120.733803988
removing breakpoint on overcloud-compute-15
xxx3 = 120.733897924
xxx4 = 121.408231974
xxx2 = 121.408298016
removing breakpoint on overcloud-compute-20
xxx3 = 121.408373833
xxx4 = 122.299252033
xxx2 = 122.299332857
removing breakpoint on overcloud-compute-2
xxx3 = 122.29944706
xxx4 = 123.18203187
xxx2 = 123.182126999
removing breakpoint on overcloud-compute-54
xxx3 = 123.182240963
xxx4 = 124.223815918
xxx2 = 124.223883867
removing breakpoint on overcloud-compute-33
xxx3 = 124.223957062
xxx4 = 125.608160019
xxx2 = 125.608237982
removing breakpoint on overcloud-compute-53
xxx3 = 125.608314991
xxx4 = 126.089272976
xxx2 = 126.08937788
removing breakpoint on overcloud-compute-30
xxx3 = 126.08948493
xxx4 = 126.686313868
xxx2 = 126.686420918
removing breakpoint on overcloud-compute-58
xxx3 = 126.686547041
xxx4 = 127.293951035
xxx2 = 127.29401803
removing breakpoint on overcloud-compute-46
xxx3 = 127.294095993
xxx4 = 127.890305996
xxx2 = 127.890382051
removing breakpoint on overcloud-compute-29
xxx3 = 127.890449047
xxx4 = 128.847321987
IN_PROGRESS
IN_PROGRESS
IN_PROGRESS
COMPLETE
update finished with status COMPLETE

real 20m55.636s
user 0m11.239s
sys 0m1.226s
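The per-call cost can be read off the debug output by pairing each xxx3 timestamp with the xxx4 that follows it. This is a minimal sketch, assuming xxx3 is printed just before the clear call and xxx4 just after it returns (the report does not state this explicitly). `clear_durations` is a hypothetical helper, and the sample is the first two entries from the log above.

```python
import re

SAMPLE = """\
xxx1 = 151.565850973
xxx2 = 151.565960884
removing breakpoint on overcloud-cephstorage-0
xxx3 = 151.566158056
xxx4 = 151.905207872
xxx2 = 151.90525198
removing breakpoint on overcloud-cephstorage-1
xxx3 = 151.905302048
xxx4 = 152.222728968
"""

# Matches either a "removing breakpoint" line or an xxx3/xxx4 timestamp.
TOKEN = re.compile(r"removing breakpoint on (\S+)|xxx([34]) = ([0-9.]+)")

def clear_durations(log_text):
    """Map node name -> seconds between its xxx3 and xxx4 timestamps."""
    durations = {}
    node = None
    start = None
    for match in TOKEN.finditer(log_text):
        name, marker, value = match.groups()
        if name:
            node, start = name, None
        elif marker == "3":
            start = float(value)
        elif marker == "4" and node is not None and start is not None:
            durations[node] = float(value) - start
    return durations

for node, seconds in clear_durations(SAMPLE).items():
    print("%s: %.3f s" % (node, seconds))
```

Because the parser scans for tokens rather than whole lines, it also works on output that has been pasted as one long line.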
upstream patch: https://review.openstack.org/#/c/219642/
https://code.engineering.redhat.com/gerrit/#/c/59753/
Verified: openstack-tripleo-common-0.0.1.dev6-5.git49b57eb.el7ost.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2015:2651