Description of problem:

Controller-0 upgrade[1] failed with error[2] while it was trying to authenticate the remote computeHA node by running command[3] during ansible workflow execution of "TASK [Wait for puppet host configuration to finish]"

[1] openstack overcloud upgrade run --stack msufiyan --limit controller-0

[2] Error
~~~
Jul 9 16:16:03 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-0
Jul 9 16:16:03 puppet-user: Error: Operation timed out
Jul 9 16:17:03 puppet-user: Debug: try 4/5: pcs host auth msufiyan-novacomputeiha-0 addr=172.17.1.23 -u hacluster -p "KG9eY7F6Mkf46wP8"
Jul 9 16:18:04 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-0
Jul 9 16:18:04 puppet-user: Error: Operation timed out
Jul 9 16:19:04 puppet-user: Debug: try 5/5: pcs host auth msufiyan-novacomputeiha-0 addr=172.17.1.23 -u hacluster -p "KG9eY7F6Mkf46wP8"
Jul 9 16:20:05 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-0
Jul 9 16:20:05 puppet-user: Error: Operation timed out
Jul 9 16:21:05 puppet-user: Error: pcs create failed: Error: Unable to communicate with msufiyan-novacomputeiha-0
Jul 9 16:21:05 puppet-user: Error: /Stage[main]/Tripleo::Profile::Base::Pacemaker/Pacemaker::Resource::Remote[msufiyan-novacomputeiha-0]/Pcmk_remote[msufiyan-novacomputeiha-0]/ensure: change from 'absent' to 'present' failed: pcs create failed: Error: Unable to communicate with msufiyan-novacomputeiha-0
Jul 9 16:21:05 puppet-user: Debug: Pacemaker::Resource::Remote[msufiyan-novacomputeiha-0]: Resource is being skipped, unscheduling all events
Jul 9 16:21:05 puppet-user: Notice: /Stage[main]/Tripleo::Profile::Base::Pacemaker/Exec[exec-wait-for-msufiyan-novacomputeiha-0]: Dependency Pcmk_remote[msufiyan-novacomputeiha-0] has failures: true
Jul 9 16:21:05 puppet-user: Warning: /Stage[main]/Tripleo::Profile::Base::Pacemaker/Exec[exec-wait-for-msufiyan-novacomputeiha-0]: Skipping because of failed dependencies
Jul 9 16:21:05 puppet-user: Debug: /Stage[main]/Tripleo::Profile::Base::Pacemaker/Exec[exec-wait-for-msufiyan-novacomputeiha-0]: Resource is being skipped, unscheduling all events
Jul 9 16:21:06 puppet-user: Debug: backup_cib: pcs cluster cib /var/lib/pacemaker/cib/puppet-cib-backup20200709-116249-17sbit4 returned
Jul 9 16:21:06 puppet-user: Debug: pcs -f /var/lib/pacemaker/cib/puppet-cib-backup20200709-116249-17sbit4 resource show msufiyan-novacomputeiha-1 > /dev/null 2>&1
Jul 9 16:21:07 puppet-user: Debug: Exists: resource msufiyan-novacomputeiha-1 exists 1 resource deep_compare: false
Jul 9 16:21:07 puppet-user: Debug: try 1/5: pcs host auth msufiyan-novacomputeiha-1 addr=172.17.1.147 -u hacluster -p "KG9eY7F6Mkf46wP8"
Jul 9 16:22:07 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-1
Jul 9 16:22:07 puppet-user: Error: Operation timed out
Jul 9 16:23:07 puppet-user: Debug: try 2/5: pcs host auth msufiyan-novacomputeiha-1 addr=172.17.1.147 -u hacluster -p "KG9eY7F6Mkf46wP8"
Jul 9 16:24:08 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-1
Jul 9 16:24:08 puppet-user: Error: Operation timed out
Jul 9 16:25:08 puppet-user: Debug: try 3/5: pcs host auth msufiyan-novacomputeiha-1 addr=172.17.1.147 -u hacluster -p "KG9eY7F6Mkf46wP8"
Jul 9 16:26:09 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-1
Jul 9 16:26:09 puppet-user: Error: Operation timed out
Jul 9 16:27:09 puppet-user: Debug: try 4/5: pcs host auth msufiyan-novacomputeiha-1 addr=172.17.1.147 -u hacluster -p "KG9eY7F6Mkf46wP8"
Jul 9 16:28:10 puppet-user: Debug: Error: Error: Unable to communicate with msufiyan-novacomputeiha-1
Jul 9 16:28:10 puppet-user: Error: Operation timed out
~~~

From controller-0
~~~
[root@controller-0 ~]# pcs host auth msufiyan-novacomputeiha-2 addr=172.17.1.96 -u hacluster -p KG9eY7F6Mkf46wP8 --debug
Running: /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb auth
Environment:
  GEM_HOME=/usr/lib/pcsd/vendor/bundle/ruby
  HISTCONTROL=ignoredups
  HISTSIZE=1000
  HOME=/root
  HOSTNAME=controller-0
  LANG=en_US.UTF-8
  LC_ALL=C
  LESSOPEN=||/usr/bin/lesspipe.sh %s
  LOGNAME=root
  LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.m4a=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.oga=01;36:*.opus=01;36:*.spx=01;36:*.xspf=01;36:
  MAIL=/var/spool/mail/root
  PATH=/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:/root/bin
  PCSD_DEBUG=true
  PCSD_NETWORK_TIMEOUT=60
  PWD=/root
  SHELL=/bin/bash
  SHLVL=1
  SUDO_COMMAND=/bin/bash
  SUDO_GID=1001
  SUDO_UID=1000
  SUDO_USER=heat-admin
  TERM=screen
  USER=root
  _=/sbin/pcs
--Debug Input Start--
{"nodes": {"msufiyan-novacomputeiha-2": {"dest_list": [{"addr": "172.17.1.96", "port": 2224}], "username": "hacluster", "password": "KG9eY7F6Mkf46wP8"}}}
--Debug Input End--
--Debug Stdout End--
--Debug Stderr Start--
--Debug Stderr End--
Error: Operation timed out
Error: Unable to communicate with msufiyan-novacomputeiha-2
[root@controller-0 ~]#
~~~

>> controller-0 is also not able to connect via port 2224 for node(msufiyan-novacomputeiha-2/172.17.1.96)
~~~
[root@controller-0 ~]# telnet 172.17.1.96 2224
Trying 172.17.1.96...
~~~
~~~
[root@controller-0 ~]# netstat -tulpn | grep -i 2224
tcp 0 0 172.17.1.142:2224 0.0.0.0:* LISTEN 74390/platform-pyth
[root@controller-0 ~]# systemctl | grep -i pcsd
pcsd.service loaded active running PCS GUI and remote configuration interface
~~~

>> from computeha node:-
~~~
[heat-admin@msufiyan-novacomputeiha-2 ~]$ sudo -i
[root@msufiyan-novacomputeiha-2 ~]# systemctl list-unit-files | grep -i pacemaker
pacemaker.service disabled
pacemaker_remote.service enabled
[root@msufiyan-novacomputeiha-2 ~]# systemctl list-unit-files | grep -i pcsd
pcsd.service disabled
[root@msufiyan-novacomputeiha-2 ~]# netstat -peanut | grep -i 2224
[root@msufiyan-novacomputeiha-2 ~]#
~~~
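The manual telnet check against port 2224 could be repeated for each computeHA address with a small helper. The following is only a sketch: `check_port` is a hypothetical function, not part of pcs or TripleO, and it assumes bash (for the `/dev/tcp` redirection) and the coreutils `timeout` command are available on the controller. It reports only whether a TCP connection could be opened within the wait time; it cannot tell a firewall drop apart from a host where nothing is listening, so the `systemctl` and `netstat` checks on the compute node are still needed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of pcs or TripleO).
# Tries to open a TCP connection to host:port using bash's /dev/tcp
# redirection, giving up after wait_secs seconds (default 5).
# Prints "open" if the connection succeeds, "unreachable" otherwise.
check_port() {
    local host="$1" port="$2" wait_secs="${3:-5}"
    # host and port are passed as positional arguments ($0 and $1) to the
    # inner shell so they are not interpolated into the command string.
    if timeout "$wait_secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
        echo "open"
    else
        echo "unreachable"
    fi
}
```

For example, `check_port 172.17.1.96 2224` run on controller-0 would be expected to print `unreachable` while pcsd is not listening on the compute node, matching the hanging telnet session in the description.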
Hi, it seems that Pidone was running a test on Friday. Let us know if you need help/assistance from Upgrades. Thanks,
Removing z1 flag as this bug does not have approval. Please follow the usual blocker process to have this included in z1. If it gets approved as a blocker for 16.1.1, then it will have to be in Modified by July 29th to be included.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.1 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:4284