Bug 1855423 - [FFU13-16.1] Controller upgrade step failed for instanceHA environment due to missing port 2224 on remote nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 16.1 (Train)
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: z2
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Michele Baldessari
QA Contact: Ronnie Rasouli
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-07-09 20:18 UTC by MD Sufiyan
Modified: 2023-12-15 18:25 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-11.3.2-1.20200914170156.el8ost.noarch puppet-pacemaker-1.0.1-1.20200902203410.b501c08.el8ost.noarch puppet-tripleo-11.5.0-1.20200914161840.f716ef5.el8ost.noarch
Doc Type: Bug Fix
Doc Text:
This update fixes a bug that prevented fast forward upgrades (FFU) of instance HA environments from RHOSP 13 to RHOSP 16.1.
Clone Of:
Environment:
Last Closed: 2020-10-28 15:38:25 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1888398 0 None None None 2020-07-21 13:06:40 UTC
Red Hat Issue Tracker OSP-18246 0 None None None 2022-08-17 17:42:43 UTC
Red Hat Product Errata RHEA-2020:4284 0 None None None 2020-10-28 15:38:54 UTC

Description MD Sufiyan 2020-07-09 20:18:04 UTC
Description of problem:

The controller-0 upgrade [1] failed with error [2] while trying to authenticate the remote computeHA node by running command [3] during the Ansible workflow step "TASK [Wait for puppet host configuration to finish]".

[1] openstack overcloud upgrade run --stack msufiyan --limit controller-0
[2] Error

~~~
Jul  9 16:16:03 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-0", "
Jul  9 16:16:03 puppet-user: Error: Operation timed out", "Jul  9 16:17:03 puppet-user: Debug: try 4/5: pcs host auth msufiyan-novacomputeiha-0 addr=172.17.1.23 -u hacluster -p \"KG9eY7F6Mkf46wP8\"", "
Jul  9 16:18:04 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-0", "Jul  9 16:18:04 puppet-user: Error: Operation timed out", "
Jul  9 16:19:04 puppet-user: Debug: try 5/5: pcs host auth msufiyan-novacomputeiha-0 addr=172.17.1.23 -u hacluster -p \"KG9eY7F6Mkf46wP8\"", "
Jul  9 16:20:05 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-0", "
Jul  9 16:20:05 puppet-user: Error: Operation timed out", "
Jul  9 16:21:05 puppet-user: Error: pcs create failed: Error: Unable to communicate with msufiyan-novacomputeiha-0", "
Jul  9 16:21:05 puppet-user: Error: /Stage[main]/Tripleo::Profile::Base::Pacemaker/Pacemaker::Resource::Remote[msufiyan-novacomputeiha-0]/Pcmk_remote[msufiyan-novacomputeiha-0]/ensure: change from 'absent'to 'present' failed: pcs create failed: Error: Unable to communicate with msufiyan-novacomputeiha-0", "
Jul  9 16:21:05 puppet-user: Debug: Pacemaker::Resource::Remote[msufiyan-novacomputeiha-0]: Resource is being skipped, unscheduling all events", "
Jul  9 16:21:05 puppet-user: Notice: /Stage[main]/Tripleo::Profile::Base::Pacemaker/Exec[exec-wait-for-msufiyan-novacomputeiha-0]: Dependency Pcmk_remote[msufiyan-novacomputeiha-0] has failures: true", 
Jul  9 16:21:05 puppet-user: Warning: /Stage[main]/Tripleo::Profile::Base::Pacemaker/Exec[exec-wait-for-msufiyan-novacomputeiha-0]: Skipping because of failed dependencies", "
Jul  9 16:21:05 puppet-user: Debug: /Stage[main]/Tripleo::Profile::Base::Pacemaker/Exec[exec-wait-for-msufiyan-novacomputeiha-0]: Resource is being skipped, unscheduling all events", "
Jul  9 16:21:06 puppet-user: Debug: backup_cib: pcs cluster cib /var/lib/pacemaker/cib/puppet-cib-backup20200709-116249-17sbit4 returned ", "
Jul  9 16:21:06 puppet-user: Debug: pcs -f /var/lib/pacemaker/cib/puppet-cib-backup20200709-116249-17sbit4 resource show msufiyan-novacomputeiha-1 > /dev/null 2>&1", "
Jul  9 16:21:07 puppet-user: Debug: Exists: resource msufiyan-novacomputeiha-1 exists 1 resource deep_compare: false", "
Jul  9 16:21:07 puppet-user: Debug: try 1/5: pcs host auth msufiyan-novacomputeiha-1 addr=172.17.1.147 -u hacluster -p \"KG9eY7F6Mkf46wP8\"", "
Jul  9 16:22:07 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-1", "
Jul  9 16:22:07 puppet-user: Error: Operation timed out", "
Jul  9 16:23:07 puppet-user: Debug: try 2/5: pcs host auth msufiyan-novacomputeiha-1 addr=172.17.1.147 -u hacluster -p \"KG9eY7F6Mkf46wP8\"", "
Jul  9 16:24:08 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-1", "
Jul  9 16:24:08 puppet-user: Error: Operation timed out", "
Jul  9 16:25:08 puppet-user: Debug: try 3/5: pcs host auth msufiyan-novacomputeiha-1 addr=172.17.1.147 -u hacluster -p \"KG9eY7F6Mkf46wP8\"", "
Jul  9 16:26:09 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-1", "
Jul  9 16:26:09 puppet-user: Error: Operation timed out", "
Jul  9 16:27:09 puppet-user: Debug: try 4/5: pcs host auth msufiyan-novacomputeiha-1 addr=172.17.1.147 -u hacluster -p \"KG9eY7F6Mkf46wP8\"", "
Jul  9 16:28:10 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-1", "
Jul  9 16:28:10 puppet-user: Error: Operation timed out", "
~~~
 
From controller-0

~~~
[root@controller-0 ~]#  pcs host auth msufiyan-novacomputeiha-2 addr=172.17.1.96 -u hacluster -p KG9eY7F6Mkf46wP8 --debug                                                                     
Running: /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb auth                                                                                                                        
Environment:                                                                                                                                                                                  
  GEM_HOME=/usr/lib/pcsd/vendor/bundle/ruby                                                                                                                                                   
  HISTCONTROL=ignoredups                                                                                                                                                                      
  HISTSIZE=1000                                                                                                                                                                               
  HOME=/root                                                                                                                                                                                  
  HOSTNAME=controller-0                                                                                                                                                                       
  LANG=en_US.UTF-8                                                                                                                                                                            
  LC_ALL=C                                                                                                                                                                                    
  LESSOPEN=||/usr/bin/lesspipe.sh %s                                                                                                                                                          
  LOGNAME=root                                                                                                                                                                                
  LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01
;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;
31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;
31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=
01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:
*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01
;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01
;36:*.au=01;36:*.flac=01;36:*.m4a=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.oga=01;36:*.opus=01;36:*.spx=01;36:*.xspf=01;36:    
  MAIL=/var/spool/mail/root                                                                                                                                                                   
  PATH=/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:/root/bin                                                                                                                
  PCSD_DEBUG=true                                                                                                                                                                             
  PCSD_NETWORK_TIMEOUT=60                                                                                                                                                                     
  PWD=/root                                                                                                                                                                                   
  SHELL=/bin/bash                                                                                                                                                                             
  SHLVL=1                                                                                                                                                                                     
  SUDO_COMMAND=/bin/bash                                                                                                                                                                      
  SUDO_GID=1001                                                                                                                                                                               
  SUDO_UID=1000                                                                                                                                                                               
  SUDO_USER=heat-admin                                                                                                                                                                        
  TERM=screen                                                                                                                                                                                 
  USER=root                                                                                                                                                                                   
  _=/sbin/pcs                                                                                                                                                                                 
--Debug Input Start--                                                                                                                                                                         
{"nodes": {"msufiyan-novacomputeiha-2": {"dest_list": [{"addr": "172.17.1.96", "port": 2224}], "username": "hacluster", "password": "KG9eY7F6Mkf46wP8"}}}                                     
--Debug Input End--                                                                                                                                                                           
--Debug Stdout End--                                                                  
--Debug Stderr Start--                                                                
                                                                                      
--Debug Stderr End--                                                                  
                                                                                      
Error: Operation timed out                                                            
Error: Unable to communicate with msufiyan-novacomputeiha-2                           
[root@controller-0 ~]#                                                                
~~~

>> controller-0 also cannot connect to port 2224 on node msufiyan-novacomputeiha-2 (172.17.1.96):

~~~
[root@controller-0 ~]# telnet 172.17.1.96 2224
Trying 172.17.1.96...
~~~
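
The telnet attempt above hangs because nothing answers on port 2224. As a rough sketch (not part of the reported fix), the same reachability check can be scripted with an explicit timeout. The helper name `port_is_reachable` is hypothetical, and the default port 2224 is simply the pcsd port shown in the debug input above.

```python
import socket


def port_is_reachable(host, port=2224, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and applies the timeout to the connect call.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # OSError covers refused connections, timeouts and unreachable hosts.
        return False
```

In the state described in this report, `port_is_reachable("172.17.1.96")` would be expected to return False, since no pcsd is listening on the compute node.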

~~~
[root@controller-0 ~]# netstat -tulpn | grep -i 2224
tcp        0      0 172.17.1.142:2224       0.0.0.0:*               LISTEN      74390/platform-pyth 
[root@controller-0 ~]# systemctl | grep -i pcsd                                           
pcsd.service                                loaded active running   PCS GUI and remote configuration interface                                                         
~~~

>> From the computeHA node:

~~~
[heat-admin@msufiyan-novacomputeiha-2 ~]$ sudo -i 
[root@msufiyan-novacomputeiha-2 ~]# systemctl list-unit-files | grep -i pacemaker
pacemaker.service                                                       disabled
pacemaker_remote.service                                                enabled 
[root@msufiyan-novacomputeiha-2 ~]# systemctl list-unit-files | grep -i pcsd
pcsd.service                                                            disabled
[root@msufiyan-novacomputeiha-2 ~]# netstat -peanut | grep -i 2224
[root@msufiyan-novacomputeiha-2 ~]# 
~~~
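
The output above shows `pcsd.service` disabled on the compute node, which is consistent with nothing listening on port 2224 there. A minimal sketch for pulling unit states out of pasted `systemctl list-unit-files` output follows. The function name `unit_states` is hypothetical, and the sample text is copied from the listing above. This is only a diagnostic illustration, not the fix shipped in the erratum.

```python
def unit_states(listing):
    """Map service unit file name -> state from `systemctl list-unit-files` output."""
    states = {}
    for line in listing.splitlines():
        parts = line.split()
        # Keep only rows shaped like "<unit>.service <state>"; this skips
        # shell prompts, headers and blank lines.
        if len(parts) >= 2 and parts[0].endswith(".service"):
            states[parts[0]] = parts[1]
    return states


# Sample rows copied from the compute node output in this report.
sample = """\
pacemaker.service                                                       disabled
pacemaker_remote.service                                                enabled
pcsd.service                                                            disabled
"""
states = unit_states(sample)
```

Here `states["pcsd.service"]` is `"disabled"`, matching what the reporter observed on msufiyan-novacomputeiha-2.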

Comment 4 Sofer Athlan-Guyot 2020-07-13 13:13:59 UTC
Hi,

It seems that Pidone was running a test on Friday.

Let us know if you need help or assistance from Upgrades.

Thanks,

Comment 12 spower 2020-07-27 10:17:54 UTC
Removing the z1 flag because this bug does not have approval. Please follow the usual blocker process to have it included in z1. If it is approved as a blocker for 16.1.1, it must be in MODIFIED by July 29th to be included.

Comment 24 errata-xmlrpc 2020-10-28 15:38:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4284

