Bug 1769231 - [tripleo] Undercloud minor upgrade in 15 breaks podman
Summary: [tripleo] Undercloud minor upgrade in 15 breaks podman
Keywords:
Status: ON_DEV
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 15.0 (Stein)
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: async
Target Release: 15.0 (Stein)
Assignee: Sofer Athlan-Guyot
QA Contact: Sofer Athlan-Guyot
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-11-06 08:43 UTC by Mauro Oddi
Modified: 2019-12-04 17:29 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
OpenStack gerrit 693387 'None' 'NEW' 'Special handling for podman update.' 2019-12-09 12:30:49 UTC
OpenStack gerrit 693394 'None' 'NEW' 'Special handling of podman update from 1.0.X to 1.1.X.' 2019-12-09 12:30:49 UTC
Launchpad 1851848 None None None 2019-11-08 14:33:13 UTC
Red Hat Knowledge Base (Solution) 4558001 None None None 2019-11-06 12:02:22 UTC

Internal Links: 1772541

Description Mauro Oddi 2019-11-06 08:43:53 UTC
Description of problem:
Performing a minor upgrade of the undercloud from RHOSP 15.0.0 to 15.0.1 breaks podman: after the upgrade, podman commands hang.


Version-Release number of selected component (if applicable):
RHOSP 15.0.1

How reproducible:
always

Steps to Reproduce:
1. install undercloud in 15 GA
2. run openstack undercloud upgrade
3. run podman

Actual results:
podman hangs

Expected results:
podman runs as usual

Additional info:

Comment 1 Mauro Oddi 2019-11-06 08:57:48 UTC
1. Original state:

[root@undercloud ~]# rpm -qa | grep podman
podman-1.0.5-1.gitf604175.module+el8.0.0+4017+bbba319f.x86_64
[root@undercloud ~]# podman ps
CONTAINER ID  IMAGE                                                                      COMMAND               CREATED      STATUS          PORTS  NAMES
c8aa67c95a2b  172.16.0.1:8787/rhosp15-rhel8/openstack-neutron-dhcp-agent:15.0-68         dumb-init --singl...  7 hours ago  Up 7 hours ago         neutron-dnsmasq-qdhcp-20a7c7f6-f9ff-4a6b-b865-3ca02db10288
4ac77063676f  172.16.0.1:8787/rhosp15-rhel8/openstack-nova-compute-ironic:15.0-70        dumb-init --singl...  3 days ago   Up 7 hours ago         nova_compute
dbc001bd2b0f  172.16.0.1:8787/rhosp15-rhel8/openstack-ironic-inspector:15.0-70           dumb-init --singl...  3 days ago   Up 7 hours ago         ironic_inspector_dnsmasq
b34745db1945  172.16.0.1:8787/rhosp15-rhel8/openstack-ironic-inspector:15.0-70           dumb-init --singl...  3 days ago   Up 7 hours ago         ironic_inspector
d18b29ee6f18  172.16.0.1:8787/rhosp15-rhel8/openstack-ironic-pxe:15.0-66                 dumb-init --singl...  3 days ago   Up 7 hours ago         ironic_pxe_http
4a4d0d5942d0  172.16.0.1:8787/rhosp15-rhel8/openstack-ironic-pxe:15.0-66                 dumb-init --singl...  3 days ago   Up 7 hours ago         ironic_pxe_tftp
7894e88f6b2c  172.16.0.1:8787/rhosp15-rhel8/openstack-ironic-neutron-agent:15.0-66       dumb-init --singl...  3 days ago   Up 7 hours ago         ironic_neutron_agent
5303dc4dfe44  172.16.0.1:8787/rhosp15-rhel8/openstack-ironic-conductor:15.0-68           dumb-init --singl...  3 days ago   Up 7 hours ago         ironic_conductor
54de20a08e83  172.16.0.1:8787/rhosp15-rhel8/openstack-mistral-api:15.0-68                dumb-init --singl...  3 days ago   Up 7 hours ago         mistral_api
36dfe0b816e7  172.16.0.1:8787/rhosp15-rhel8/openstack-neutron-openvswitch-agent:15.0-66  dumb-init --singl...  3 days ago   Up 7 hours ago         neutron_ovs_agent
4001c6cb636e  172.16.0.1:8787/rhosp15-rhel8/openstack-neutron-l3-agent:15.0-68           dumb-init --singl...  3 days ago   Up 7 hours ago         neutron_l3_agent
cf2cc4638640  172.16.0.1:8787/rhosp15-rhel8/openstack-neutron-dhcp-agent:15.0-68         dumb-init --singl...  3 days ago   Up 7 hours ago         neutron_dhcp
c85f4af7e83a  172.16.0.1:8787/rhosp15-rhel8/openstack-ironic-api:15.0-66                 dumb-init --singl...  3 days ago   Up 7 hours ago         ironic_api
5b426ff3672e  172.16.0.1:8787/rhosp15-rhel8/openstack-swift-proxy-server:15.0-68         dumb-init --singl...  3 days ago   Up 7 hours ago         swift_proxy
01561a29d64e  172.16.0.1:8787/rhosp15-rhel8/openstack-nova-api:15.0-69                   dumb-init --singl...  3 days ago   Up 7 hours ago         nova_metadata
23e28d8205b7  172.16.0.1:8787/rhosp15-rhel8/openstack-nova-api:15.0-69                   dumb-init --singl...  3 days ago   Up 7 hours ago         nova_api
1a66a90a25ee  172.16.0.1:8787/rhosp15-rhel8/openstack-glance-api:15.0-66                 dumb-init --singl...  3 days ago   Up 7 hours ago         glance_api
642257979cee  172.16.0.1:8787/rhosp15-rhel8/openstack-nova-placement-api:15.0-73         dumb-init --singl...  3 days ago   Up 7 hours ago         nova_placement
0848358484fc  172.16.0.1:8787/rhosp15-rhel8/openstack-zaqar-wsgi:15.0-67                 dumb-init --singl...  3 days ago   Up 7 hours ago         zaqar_websocket
641047f0ef04  172.16.0.1:8787/rhosp15-rhel8/openstack-zaqar-wsgi:15.0-67                 dumb-init --singl...  3 days ago   Up 7 hours ago         zaqar
b495b970871e  172.16.0.1:8787/rhosp15-rhel8/openstack-swift-object:15.0-67               dumb-init --singl...  3 days ago   Up 7 hours ago         swift_rsync
4c04d21c4daf  172.16.0.1:8787/rhosp15-rhel8/openstack-swift-object:15.0-67               dumb-init --singl...  3 days ago   Up 7 hours ago         swift_object_updater
6e218012a08f  172.16.0.1:8787/rhosp15-rhel8/openstack-swift-object:15.0-67               dumb-init --singl...  3 days ago   Up 7 hours ago         swift_object_server
898da7e2f571  172.16.0.1:8787/rhosp15-rhel8/openstack-swift-proxy-server:15.0-68         dumb-init --singl...  3 days ago   Up 7 hours ago         swift_object_expirer
d8464c1aa66d  172.16.0.1:8787/rhosp15-rhel8/openstack-swift-container:15.0-67            dumb-init --singl...  3 days ago   Up 7 hours ago         swift_container_updater
1930d798a108  172.16.0.1:8787/rhosp15-rhel8/openstack-swift-container:15.0-67            dumb-init --singl...  3 days ago   Up 7 hours ago         swift_container_server
e32d9ca7532c  172.16.0.1:8787/rhosp15-rhel8/openstack-swift-account:15.0-66              dumb-init --singl...  3 days ago   Up 7 hours ago         swift_account_server
82352693e76d  172.16.0.1:8787/rhosp15-rhel8/openstack-swift-account:15.0-66              dumb-init --singl...  3 days ago   Up 7 hours ago         swift_account_reaper
e1c3cab2e1f9  172.16.0.1:8787/rhosp15-rhel8/openstack-nova-scheduler:15.0-71             dumb-init --singl...  3 days ago   Up 7 hours ago         nova_scheduler
98cccbfb316f  172.16.0.1:8787/rhosp15-rhel8/openstack-nova-conductor:15.0-72             dumb-init --singl...  3 days ago   Up 7 hours ago         nova_conductor
d707a55e78c6  172.16.0.1:8787/rhosp15-rhel8/openstack-nova-api:15.0-69                   dumb-init --singl...  3 days ago   Up 7 hours ago         nova_api_cron
21cd71511cf1  172.16.0.1:8787/rhosp15-rhel8/openstack-neutron-server:15.0-68             dumb-init --singl...  3 days ago   Up 7 hours ago         neutron_api
999153afcc90  172.16.0.1:8787/rhosp15-rhel8/openstack-mistral-executor:15.0-67           dumb-init --singl...  3 days ago   Up 7 hours ago         mistral_executor
e92e59b852db  172.16.0.1:8787/rhosp15-rhel8/openstack-mistral-event-engine:15.0-69       dumb-init --singl...  3 days ago   Up 7 hours ago         mistral_event_engine
b2b1be205f49  172.16.0.1:8787/rhosp15-rhel8/openstack-mistral-engine:15.0-66             dumb-init --singl...  3 days ago   Up 7 hours ago         mistral_engine
d0148103e5d7  172.16.0.1:8787/rhosp15-rhel8/openstack-cron:15.0-74                       dumb-init --singl...  3 days ago   Up 7 hours ago         logrotate_crond
aaf68267b447  172.16.0.1:8787/rhosp15-rhel8/openstack-heat-engine:15.0-67                dumb-init --singl...  3 days ago   Up 7 hours ago         heat_engine
d2e68cc59ed2  172.16.0.1:8787/rhosp15-rhel8/openstack-heat-api:15.0-67                   dumb-init --singl...  3 days ago   Up 7 hours ago         heat_api_cron
a9ff37145e94  172.16.0.1:8787/rhosp15-rhel8/openstack-heat-api-cfn:15.0-67               dumb-init --singl...  3 days ago   Up 7 hours ago         heat_api_cfn
15f2f718a1d0  172.16.0.1:8787/rhosp15-rhel8/openstack-heat-api:15.0-67                   dumb-init --singl...  3 days ago   Up 7 hours ago         heat_api
a393fc586783  172.16.0.1:8787/rhosp15-rhel8/openstack-keystone:15.0-68                   dumb-init --singl...  3 days ago   Up 7 hours ago         keystone_cron
ed0a08ea8381  172.16.0.1:8787/rhosp15-rhel8/openstack-keystone:15.0-68                   dumb-init --singl...  3 days ago   Up 7 hours ago         keystone
2799cd44d8a3  172.16.0.1:8787/rhosp15-rhel8/openstack-iscsid:15.0-74                     dumb-init --singl...  3 days ago   Up 7 hours ago         iscsid
50432daa8b50  172.16.0.1:8787/rhosp15-rhel8/openstack-mariadb:15.0-80                    dumb-init -- koll...  3 days ago   Up 7 hours ago         mysql
00be2128d1a9  172.16.0.1:8787/rhosp15-rhel8/openstack-rabbitmq:15.0-78                   dumb-init --singl...  3 days ago   Up 7 hours ago         rabbitmq
d45628cb15d0  172.16.0.1:8787/rhosp15-rhel8/openstack-haproxy:15.0-76                    dumb-init --singl...  3 days ago   Up 7 hours ago         haproxy
9cca13367a6d  172.16.0.1:8787/rhosp15-rhel8/openstack-memcached:15.0-72                  dumb-init --singl...  3 days ago   Up 7 hours ago         memcached
dd7631c46b2c  172.16.0.1:8787/rhosp15-rhel8/openstack-keepalived:15.0-73                 dumb-init --singl...  3 days ago   Up 7 hours ago         keepalived


2. Tripleo client upgrade
[stack@undercloud ~]$ sudo yum update -y python3-tripleoclient* openstack-tripleo-common openstack-tripleo-heat-templates


3. Run the undercloud upgrade (it finishes successfully)

[stack@undercloud ~]$ openstack undercloud upgrade
...
########################################################

Deployment successful!

########################################################

Writing the stack virtual update mark file /var/lib/tripleo-heat-installer/update_mark_undercloud

##########################################################

The Undercloud has been successfully upgraded.

Useful files:

Password file is at ~/undercloud-passwords.conf
The stackrc file is at ~/stackrc

Use these files to interact with OpenStack services, and
ensure they are secured.

##########################################################


4. Reboot


5. Try to run podman again: it hangs, but runc shows that the containers are still there



[root@undercloud ~]# podman ps
r^C
[root@undercloud ~]# runc list
ID                                                                 PID         STATUS      BUNDLE                                                                                                                     CREATED                          OWNER
01561a29d64ea75599c28d7ff527bcc5df924623e51cafa1de352d52b512ac54   9210        running     /var/lib/containers/storage/overlay-containers/01561a29d64ea75599c28d7ff527bcc5df924623e51cafa1de352d52b512ac54/userdata   2019-11-05T23:19:11.65295134Z    root
02b595afa31ccb3a2684886e0a8272e1fd71250586558fd359f7535306c203a3   4081        running     /var/lib/containers/storage/overlay-containers/02b595afa31ccb3a2684886e0a8272e1fd71250586558fd359f7535306c203a3/userdata   2019-11-05T23:15:11.061476178Z   root
1a66a90a25ee646d8977b5afea1283ad5dc44748eed517c11f855037534361a1   9393        running     /var/lib/containers/storage/overlay-containers/1a66a90a25ee646d8977b5afea1283ad5dc44748eed517c11f855037534361a1/userdata   2019-11-05T23:19:13.014990741Z   root
43e3eac74718efaa903fd6203a902b868532f66795e68b00be2e086c61e9bfbb   4092        running     /var/lib/containers/storage/overlay-containers/43e3eac74718efaa903fd6203a902b868532f66795e68b00be2e086c61e9bfbb/userdata   2019-11-05T23:15:11.057431426Z   root
5303dc4dfe44ae0cf4c8d50acfd4b0b0daac901a7fd3712b4ac7a02595bdb2a8   4214        running     /var/lib/containers/storage/overlay-containers/5303dc4dfe44ae0cf4c8d50acfd4b0b0daac901a7fd3712b4ac7a02595bdb2a8/userdata   2019-11-05T23:15:11.111956937Z   root
642257979cee637857db27e641ceaa5854ffc976efe31c9cf7d96c0a38390f7b   9303        running     /var/lib/containers/storage/overlay-containers/642257979cee637857db27e641ceaa5854ffc976efe31c9cf7d96c0a38390f7b/userdata   2019-11-05T23:19:12.302354501Z   root
6d61dcde4fce5632d94524f2331b294a510ae669c3bff58121c7553d6a5a8598   4097        running     /var/lib/containers/storage/overlay-containers/6d61dcde4fce5632d94524f2331b294a510ae669c3bff58121c7553d6a5a8598/userdata   2019-11-05T23:15:11.361758278Z   root
7481d6e6bb42f38db7a4b44bb294a78448dbde4cc5f47cc825ff124ffe6356df   4108        running     /var/lib/containers/storage/overlay-containers/7481d6e6bb42f38db7a4b44bb294a78448dbde4cc5f47cc825ff124ffe6356df/userdata   2019-11-05T23:15:11.375680909Z   root
7a8324919696019dfa86fe1666f6b871cc8f95da2123f2a4aaf19987a0ee611a   4710        running     /var/lib/containers/storage/overlay-containers/7a8324919696019dfa86fe1666f6b871cc8f95da2123f2a4aaf19987a0ee611a/userdata   2019-11-05T23:15:11.964336897Z   root
7d0dcec4bc455e307b376bf88e63646d6e19d45c1e725649a36a6146b3ef92da   9471        created     /var/lib/containers/storage/overlay-containers/7d0dcec4bc455e307b376bf88e63646d6e19d45c1e725649a36a6146b3ef92da/userdata   2019-11-05T23:19:13.66504398Z    root
a9ff37145e944c2ee5632a5a4adb37523e901fd938144f1e08b31c3fc721059a   9333        running     /var/lib/containers/storage/overlay-containers/a9ff37145e944c2ee5632a5a4adb37523e901fd938144f1e08b31c3fc721059a/userdata   2019-11-05T23:19:12.674668578Z   root
b03e7692d26744dd2b34f54d7e1d186e75aaafdc411d59c7966edbbfd8b3a108   4663        running     /var/lib/containers/storage/overlay-containers/b03e7692d26744dd2b34f54d7e1d186e75aaafdc411d59c7966edbbfd8b3a108/userdata   2019-11-05T23:15:11.925206292Z   root
c57af0d210f85d59c2453e39151f7cc544c3606079f7b8edadeb61884db8742d   4318        running     /var/lib/containers/storage/overlay-containers/c57af0d210f85d59c2453e39151f7cc544c3606079f7b8edadeb61884db8742d/userdata   2019-11-05T23:15:11.587937912Z   root
c961c747891e111ecfb57e88cf3454545c45f40aa1f73f7ba14772e6c775b251   4321        running     /var/lib/containers/storage/overlay-containers/c961c747891e111ecfb57e88cf3454545c45f40aa1f73f7ba14772e6c775b251/userdata   2019-11-05T23:15:11.460578778Z   root
d840b0ae29b0212aa0a363f35f719833f2d6337d3b338a74d50fbc1631eb8b72   4610        running     /var/lib/containers/storage/overlay-containers/d840b0ae29b0212aa0a363f35f719833f2d6337d3b338a74d50fbc1631eb8b72/userdata   2019-11-05T23:15:11.889811616Z   root
e92e59b852dba5e73d0e5ec27fd1113570309438d7ac28c971f4df7932a17d78   0           stopped     /var/lib/containers/storage/overlay-containers/e92e59b852dba5e73d0e5ec27fd1113570309438d7ac28c971f4df7932a17d78/userdata   2019-11-05T23:19:11.988101424Z   root
f281e02e0b6cd1bceea81abe530e575f10fb095add60f453d79e0cec5378bfc9   4349        running     /var/lib/containers/storage/overlay-containers/f281e02e0b6cd1bceea81abe530e575f10fb095add60f453d79e0cec5378bfc9/userdata   2019-11-05T23:15:11.578415729Z   root
[root@undercloud ~]# 


gdb shows 



(gdb) info threads
  Id   Target Id                                  Frame 
* 1    Thread 0x7f4bdd564b80 (LWP 15506) "podman" 0x00007f4bdd1508a8 in __pthread_mutex_lock_full () from /lib64/libpthread.so.0
  2    Thread 0x7f4bd1e1b700 (LWP 15507) "podman" runtime.futex () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:536
  3    Thread 0x7f4bd161a700 (LWP 15508) "podman" 0x00007f4bdd1508a8 in __pthread_mutex_lock_full () from /lib64/libpthread.so.0
  4    Thread 0x7f4bd0e19700 (LWP 15509) "podman" runtime.futex () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:536
  5    Thread 0x7f4bcbfff700 (LWP 15510) "podman" runtime.epollwait () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:675
  6    Thread 0x7f4bcb7fe700 (LWP 15511) "podman" runtime.futex () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:536
  7    Thread 0x7f4bcaffd700 (LWP 15512) "podman" 0x00007f4bdd1508a8 in __pthread_mutex_lock_full () from /lib64/libpthread.so.0
  8    Thread 0x7f4bca7fc700 (LWP 15513) "podman" runtime.futex () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:536
  9    Thread 0x7f4bc9ffb700 (LWP 15514) "podman" syscall.Syscall6 () at /usr/lib/golang/src/syscall/asm_linux_amd64.s:53
  10   Thread 0x7f4bc97fa700 (LWP 15515) "podman" 0x00007f4bdd1508a8 in __pthread_mutex_lock_full () from /lib64/libpthread.so.0
  11   Thread 0x7f4baafff700 (LWP 15519) "podman" 0x00007f4bdd1508a8 in __pthread_mutex_lock_full () from /lib64/libpthread.so.0
  12   Thread 0x7f4baa7fe700 (LWP 15520) "podman" 0x00007f4bdd1508a8 in __pthread_mutex_lock_full () from /lib64/libpthread.so.0
  13   Thread 0x7f4ba9ffd700 (LWP 15521) "podman" 0x00007f4bdd1508a8 in __pthread_mutex_lock_full () from /lib64/libpthread.so.0
  14   Thread 0x7f4ba97fc700 (LWP 15522) "podman" 0x00007f4bdd1508a8 in __pthread_mutex_lock_full () from /lib64/libpthread.so.0
  15   Thread 0x7f4ba8ffb700 (LWP 15524) "podman" runtime.futex () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:536
  16   Thread 0x7f4babfff700 (LWP 16326) "podman" runtime.futex () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:536
  17   Thread 0x7f4b8ffff700 (LWP 16327) "podman" runtime.futex () at /usr/lib/golang/src/runtime/sys_linux_amd64.s:536
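
To quantify how many threads are stuck on the lock, the `info threads' output above can be filtered. This is a minimal sketch only: the helper name count_mutex_blocked_threads is hypothetical, and it simply counts lines mentioning __pthread_mutex_lock_full, the frame shown in the listing above.

```shell
# Hypothetical helper: count threads that gdb reports as waiting in
# __pthread_mutex_lock_full. Reads "info threads" output on stdin.
count_mutex_blocked_threads() {
    # grep -c prints the number of matching lines; "|| true" keeps the exit
    # status at 0 when there are no matches (grep would otherwise return 1).
    grep -c '__pthread_mutex_lock_full' || true
}

# Possible usage against a hung podman process (replace <PID> by hand):
#   gdb -batch -p <PID> -ex 'info threads' 2>/dev/null | count_mutex_blocked_threads
```

Applied to the listing above, this counts the threads whose frame is __pthread_mutex_lock_full; the remaining threads are in Go runtime waits or syscalls.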

Comment 2 Cédric Jeanneret 2019-11-06 09:03:16 UTC
I'll be investigating this one in order to try to understand what's going on.

The solution is probably to stop the containers, update podman, run `podman system migrate' and reboot. But...

Maybe not due to SELinux - need more info.

Comment 3 Julie Pichon 2019-11-06 09:21:45 UTC
Thanks Cedric. If you believe it to be a SELinux issue, please set the system to permissive, reproduce the issue and attach the audit.log that shows the denials to the bug. Thank you!

Comment 4 Cédric Jeanneret 2019-11-06 09:43:34 UTC
After some more tests, it appears that:
- it's NOT a SELinux issue
- the issue is fully on podman side
- the right way to handle that is to add some post-podman update task

In order to make it right, you need to:
- stop all the containers with `systemctl stop tripleo_*'
- update podman (or the system - whatever)
- run `podman system migrate'
- reboot

The `podman system migrate' does stop the containers, but since we have systemd managing the state, they will be restarted before the end of the migration. So we have to manually stop them using systemd first.
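
The steps above can be sketched as a small script. This is an illustration under stated assumptions, not a tested fix: the function names and the DRY_RUN switch are hypothetical, and `yum update -y podman' is assumed as the update step because yum is used elsewhere in this bug. By default the commands are only printed.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
# (DRY_RUN and run() are hypothetical helpers for this sketch.)
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

podman_migrate_procedure() {
    # 1. Stop the systemd-managed containers first, so that systemd cannot
    #    restart them while the migration is still in progress.
    run systemctl stop 'tripleo_*'
    # 2. Update podman (assumption: yum, as used earlier in this report).
    run yum update -y podman
    # 3. Migrate the existing containers to the new podman version.
    run podman system migrate
    # 4. Reboot.
    run reboot
}
```

Calling `podman_migrate_procedure' with the default settings prints the four commands in order; running it for real would need root and DRY_RUN=0.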

Comment 5 Sofer Athlan-Guyot 2019-11-06 12:59:09 UTC
Hi Mauro,

Podman jumped from 1.0.5 to 1.4.2 with the latest RHEL, so first, can you confirm that you get podman 1.4.2 after the update of the undercloud?

From Cedric:
10:56:56 cjeanner|fra:chem: we have to run those procedure BEFORE ANY REBOOT
10:57:01 cjeanner|fra:else.... you're doomed.
10:57:02 cjeanner|fra:really
10:57:23 cjeanner|fra:and the podman system migrate has to be launched manually yes - it's a new thing

So we get a breaking change in a minor update of RHEL.

The procedure looks complicated and is very disruptive:

- stop all the containers with `systemctl stop tripleo_*'

The sidecar containers are still up

- update podman
- podman system migrate

That will shut down the sidecar containers

- reboot

Oops! This won't do for the overcloud nodes. So we need to explore "systemctl start tripleo_*".

Overall this is a major disruption, and I'm surprised that it comes with a minor update of RHEL.
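
Whether an update needs the special handling could be decided from the old and new podman versions. The sketch below is an assumption based only on the title of the linked gerrit review ("Special handling of podman update from 1.0.X to 1.1.X."): it treats any move out of the 1.0.x series as requiring migration. The function name is hypothetical.

```shell
# Hypothetical helper: succeed (exit 0) when an update leaves the 1.0.x
# series, for example 1.0.5 -> 1.4.2; fail (exit 1) otherwise.
podman_update_needs_migrate() {
    # $1 = version before the update, $2 = version after the update
    case "$1" in
        1.0.*) ;;              # old version is in the 1.0.x series
        *) return 1 ;;         # anything else: no special handling assumed
    esac
    case "$2" in
        1.0.*) return 1 ;;     # still 1.0.x after the update
        *) return 0 ;;
    esac
}

# The installed version can be read with rpm, for instance:
#   rpm -q --queryformat '%{VERSION}\n' podman
```

With the versions reported in this bug, `podman_update_needs_migrate 1.0.5 1.4.2' succeeds.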

Comment 7 Mauro Oddi 2019-11-06 13:11:09 UTC
Hi Sofer,

I can confirm you get 1.4.2:
[root@undercloud services]# rpm -qa | grep podman
podman-1.4.2-5.module+el8.1.0+4240+893c1ab8.x86_64
podman-manpages-1.4.2-5.module+el8.1.0+4240+893c1ab8.noarch



Cheers,
---
Mauro S. Oddi

Comment 11 Sofer Athlan-Guyot 2019-12-04 17:29:55 UTC
Hi,

Lastly, I've been unable to reproduce the issue in downstream CI. Podman gets updated, nothing special is done, we reboot, and the environment seems to work fine. We have other issues that I need to clear out to be sure, and more testing to do, but currently we see no issue related to the podman upgrade from 1.0.5 to 1.4.2.

