Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 2227676

Summary: [RHOSP16.2.5] The glance_api_cron container is created and tripleo_glance_api_cron_healthcheck.service fails post FFU from RHOSP 13 to 16.2.5
Product: Red Hat OpenStack
Reporter: Anjana <anbs>
Component: openstack-tripleo-heat-templates
Assignee: Manoj Katari <mkatari>
Status: CLOSED ERRATA
QA Contact: msava
Severity: medium
Docs Contact:
Priority: medium
Version: 16.2 (Train)
CC: abishop, drosenfe, eshames, mburns, mkatari, nkawamot
Target Milestone: z6
Keywords: Triaged
Target Release: 16.2 (Train on RHEL 8.4)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-11.6.1-2.20230717085025.1608f56.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-11-08 19:19:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2243639, 2243643
Bug Blocks:

Description Anjana 2023-07-31 05:55:25 UTC
Description of problem:

In RHOSP 16.2.5, the glance_api_cron container is created even though the glance image cache feature is not enabled [1].

As a result, tripleo_glance_api_cron_healthcheck.service fails when the glance image cache is not enabled [2].

This appears to be because no cron jobs are created in the glance_api_cron container when the glance image cache is not enabled [3].


[1] 
[root@control01tbmoc ~]# podman ps|grep glance
f873fc92a78e  undercloudtb.ctlplane.localdomain:8787/rhosp-rhel8/openstack-glance-api:16.2                 kolla_start           2 weeks ago     Up 2 weeks ago             glance_api
3f9bfe1b37e0  undercloudtb.ctlplane.localdomain:8787/rhosp-rhel8/openstack-glance-api:16.2                 kolla_start           2 weeks ago     Up 2 weeks ago             glance_api_cron


/var/lib/config-data/puppet-generated/glance_api/etc/glance/glance-api.conf:
~~~
...
[paste_deploy]
...
flavor=keystone
...
~~~

[2]

[root@control01tbmoc ~]# systemctl status tripleo_glance_api_cron_healthcheck.service
● tripleo_glance_api_cron_healthcheck.service - glance_api_cron healthcheck
   Loaded: loaded (/etc/systemd/system/tripleo_glance_api_cron_healthcheck.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2023-07-26 10:32:33 +08; 31s ago
  Process: 418699 ExecStart=/usr/bin/podman exec --user root glance_api_cron /usr/share/openstack-tripleo-common/healthcheck/cron glance (code=exited, status=1/FAILURE)
 Main PID: 418699 (code=exited, status=1/FAILURE)

[3]
[root@control01tbmoc ~]# podman exec -it glance_api_cron crontab -l
no crontab for root
WARN[0000] Error resizing exec session 9ba6504bcfb042f73bb11837a07ab5ebd3fd0933b259fb5382a446f14160ec8c: could not open ctl file for terminal resize for container 3f9bfe1b37e0e1c8b51989b0151bde4382e41f14c8f841127eafa6be79ea451c: open /var/lib/containers/storage/overlay-containers/3f9bfe1b37e0e1c8b51989b0151bde4382e41f14c8f841127eafa6be79ea451c/userdata/9ba6504bcfb042f73bb11837a07ab5ebd3fd0933b259fb5382a446f14160ec8c/ctl: no such device or address

[root@control01tbmoc ~]# podman exec -it glance_api_cron cat /usr/share/openstack-tripleo-common/healthcheck/cron
#!/bin/bash


file="${1:-root}"
if [ -f /var/spool/cron/${file} ]; then
    nb_lines=$(grep -cEv '^#' /var/spool/cron/${file})
    if [ $nb_lines -ge 2 ]; then
        exit 0
    fi
fi
exit 1
WARN[0000] Error resizing exec session f40464e695091fc8a10cb04f34f5b8c5d3cad262af1c90cbec2859fd7ecd5551: could not open ctl file for terminal resize for container 3f9bfe1b37e0e1c8b51989b0151bde4382e41f14c8f841127eafa6be79ea451c: open /var/lib/containers/storage/overlay-containers/3f9bfe1b37e0e1c8b51989b0151bde4382e41f14c8f841127eafa6be79ea451c/userdata/f40464e695091fc8a10cb04f34f5b8c5d3cad262af1c90cbec2859fd7ecd5551/ctl: no such device or address
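For reference, the check the script above performs can be exercised standalone (a minimal sketch: a temporary directory stands in for /var/spool/cron, and the crontab contents are illustrative):

```shell
# Re-implementation of the healthcheck logic above, pointed at a
# temporary directory instead of /var/spool/cron (illustrative only).
spool=$(mktemp -d)

check_cron() {
    file="${1:-root}"
    if [ -f "${spool}/${file}" ]; then
        nb_lines=$(grep -cEv '^#' "${spool}/${file}")
        if [ "${nb_lines}" -ge 2 ]; then
            return 0
        fi
    fi
    return 1
}

# No crontab for the glance user yet -> healthcheck fails (exit 1)
check_cron glance && echo "healthy" || echo "unhealthy"

# Two non-comment cron lines -> healthcheck passes (exit 0)
printf '# comment\n* * * * * job1\n* * * * * job2\n' > "${spool}/glance"
check_cron glance && echo "healthy" || echo "unhealthy"
```

This mirrors the failure seen above: with the image cache disabled, no crontab entries exist for the glance user, so the script exits 1 and systemd marks the unit failed.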



Version-Release number of selected component (if applicable):
puppet-glance-15.5.0-2.20220804175403.d54e942.el8ost.noarch

How reproducible:

This always happens when the image cache is disabled in RHOSP 16.2.5.

Actual results:
tripleo_glance_api_cron_healthcheck.service fails

Expected results:
tripleo_glance_api_cron_healthcheck.service does not fail



Additional info:

Found the following Bugzilla with a similar error:

https://bugzilla.redhat.com/show_bug.cgi?id=2159566

However, that Bugzilla is for RHOSP 16.2.4, and its resolution is to upgrade to 16.2.5.

That solution cannot be applied here, as the customer environment is already on 16.2.5.

Comment 1 David Rosenfeld 2023-07-31 12:38:08 UTC
Is it this: https://bugzilla.redhat.com/show_bug.cgi?id=2142951 ? If so please mark as duplicate.

Comment 9 Manoj Katari 2023-08-02 07:32:40 UTC
@Takashi  Sure, assigned it to me.

Comment 14 msava 2023-10-05 10:53:17 UTC
Test failed.

openstack-tripleo-heat-templates-11.6.1-2.20230808225213.9adcac6.el8ost.noarch
Deploy glance with cache enabled.

1.glance-conf:
------------
[DEFAULT]
image_member_quota=128
show_image_direct_url=True
show_multiple_locations=True
enable_v2_api=True
node_staging_uri=file:///var/lib/glance/staging
enabled_import_methods=[web-download]
bind_host=172.17.1.95
bind_port=9292
workers=4
enabled_backends=default_backend:rbd
image_cache_max_size=10737418240
image_cache_stall_time=86400
image_cache_dir=/var/lib/glance/image-cache
registry_host=0.0.0.0
debug=True
log_file=/var/log/glance/api.log
log_dir=/var/log/glance
transport_url=rabbit://guest:YoHgDTqqXbLC0rsSKBAjo1WkA.redhat.local:5672,guest:YoHgDTqqXbLC0rsSKBAjo1WkA.redhat.local:5672,guest:YoHgDTqqXbLC0rsSKBAjo1WkA.redhat.local:5672/?ssl=0
cache_prefetcher_interval=300
enable_v1_api=False
[cinder]
[cors]
[database]
connection=mysql+pymysql://glance:Jre4yOvrHzWGcO8XRlgqiEkIS.1.84/glance?read_default_file=/etc/my.cnf.d/tripleo.cnf&read_default_group=tripleo
[file]
[glance.store.http.store]
[glance.store.rbd.store]
[glance.store.sheepdog.store]
[glance.store.swift.store]
[glance.store.vmware_datastore.store]
[glance_store]
default_backend=default_backend
os_region_name=regionOne
[image_format]
[keystone_authtoken]
www_authenticate_uri=http://10.0.0.103:5000
region_name=regionOne
memcached_servers=controller-0.internalapi.redhat.local:11211,controller-1.internalapi.redhat.local:11211,controller-2.internalapi.redhat.local:11211
memcache_use_advanced_pool=True
auth_type=password
auth_url=http://172.17.1.84:5000
username=glance
password=Jre4yOvrHzWGcO8XRlgqiEkIS
user_domain_name=Default
project_name=service
project_domain_name=Default
[oslo_concurrency]
lock_path=/var/lib/glance/tmp



2. Api cron healthcheck
------------------------
[heat-admin@controller-0 ~]$ systemctl status tripleo_glance_api_cron_healthcheck.service                                                                                                                                                    
● tripleo_glance_api_cron_healthcheck.service - glance_api_cron healthcheck
   Loaded: loaded (/etc/systemd/system/tripleo_glance_api_cron_healthcheck.service; disabled; vendor preset: disabled)
   Active: inactive (dead) since Thu 2023-10-05 10:13:10 UTC; 55s ago
  Process: 221326 ExecStart=/usr/bin/podman exec --user root glance_api_cron /usr/share/openstack-tripleo-common/healthcheck/cron glance (code=exited, status=0/SUCCESS)                                                                     
 Main PID: 221326 (code=exited, status=0/SUCCESS)


3.
[heat-admin@controller-0 ~]$ sudo podman ps|grep glance                                                                                                                                                                                      
21e1658033b2  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-glance-api:16.2_20230925.1                  kolla_start           About an hour ago  Up About an hour ago          glance_api                                
32e131a6da70  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-glance-api:16.2_20230925.1                  kolla_start           About an hour ago  Up About an hour ago          glance_api_cron  



4.Crontab check
---------------
[heat-admin@controller-0 ~]$ sudo podman exec -it glance_api_cron crontab -l
no crontab for root
WARN[0000] Error resizing exec session c8f76a7a5bd9df1289a85ea7f414baf26ed7a074f2658fc4fe2d8da331722e93: could not open ctl file for terminal resize for container 32e131a6da70622303d6a49c4f1e7f66bf33cbd9548cda36bb91febb5bb57962: open /var/lib/containers/storage/overlay-containers/32e131a6da70622303d6a49c4f1e7f66bf33cbd9548cda36bb91febb5bb57962/userdata/c8f76a7a5bd9df1289a85ea7f414baf26ed7a074f2658fc4fe2d8da331722e93/ctl: no such device or address 
[heat-admin@controller-0 ~]$ sudo podman exec -it glance_api_cron /bin/bash
[root@controller-0 /]# crontab -l
no crontab for root


5.Create image and cache image in controller
----------------------------------------------
bash-4.4$ glance-cache-manage --host=192.168.24.43 queue-image 455f343c-c7db-4484-bdcd-3f386b22ba18
Queue image 455f343c-c7db-4484-bdcd-3f386b22ba18 for caching? [y/N] y
Failed to queue the specified image for caching. Got error:
[Errno 111] Connection refused
bash-4.4$

Comment 15 Alan Bishop 2023-10-06 02:31:41 UTC
Max, I'm moving this back to ON_QA because of issues in your verification process.

First, the bug we're fixing is one in which the glance cache is NOT enabled prior to the FFU. In that situation, glance's cron job is not supposed to be enabled after the FFU. The problem (the bug) is that the cron job was created anyway, and it was failing because it had nothing to do while the cache was disabled.

For this BZ, I envision the test procedure to do something like this:

- Deploy OSP-13 with glance cache disabled.
- FFU to 16.2 and verify there's no glance_api_cron pod running at all.

Separate from verifying the BZ, I want to note that the crontab check you ran in item 4 is not correct. If you look carefully at tripleo_glance_api_cron_healthcheck.service, you will see it runs this command:

==> ExecStart=/usr/bin/podman exec --user root glance_api_cron /usr/share/openstack-tripleo-common/healthcheck/cron glance

Note the last argument is "glance". That gets passed into the /usr/share/openstack-tripleo-common/healthcheck/cron, and it represents the user associated with the cron job. In this case, the cron job runs as the glance user, not root. If you want to see what I mean, try this:

$ sudo podman exec -it glance_api_cron crontab -u glance -l

Just be sure to do that when glance cache is enabled (but verify this BZ with it disabled).
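To illustrate the distinction: the healthcheck script reads the per-user spool file, so root and glance can differ (a sketch with a temporary directory standing in for /var/spool/cron; the job names are illustrative):

```shell
spool=$(mktemp -d)   # stands in for /var/spool/cron (illustrative)

# A crontab exists for the glance user, but none for root
printf '* * * * * glance-cache-pruner\n* * * * * glance-cache-cleaner\n' > "${spool}/glance"

# Mirrors "no crontab for root" seen in item 4:
[ -f "${spool}/root" ] && echo "root has a crontab" || echo "no crontab for root"
# While the glance user's spool file is what the healthcheck actually inspects:
[ -f "${spool}/glance" ] && echo "glance has a crontab"
```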

Comment 22 errata-xmlrpc 2023-11-08 19:19:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.2.6 (Train) bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6307