Bug 1917261 - Satellite job failed on undercloud stage while uploading images to local registry
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Alex Schultz
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-01-18 08:30 UTC by Jad Haj Yahya
Modified: 2021-09-15 07:11 UTC (History)

Fixed In Version: openstack-tripleo-common-11.4.1-2.20210206004924.a42a18b.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-09-15 07:11:11 UTC
Target Upstream Version:
Embargoed:


Attachments
undercloud-install.log (136.19 KB, text/plain)
2021-01-18 15:47 UTC, Cédric Jeanneret


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1912645 0 None None None 2021-01-21 15:53:11 UTC
OpenStack gerrit 772719 0 None MERGED Check existence of expires_in for auth 2021-02-02 11:43:34 UTC
Red Hat Product Errata RHEA-2021:3483 0 None None None 2021-09-15 07:11:33 UTC

Description Jad Haj Yahya 2021-01-18 08:30:58 UTC
Description of problem:
Undercloud deployment failed during the image_uploader process. Here is an excerpt from the log:

http://pastebin.test.redhat.com/932018

Tripleo deploy command:

sudo --preserve-env openstack tripleo deploy --standalone --standalone-role Undercloud --stack undercloud --local-domain=redhat.local --local-ip=192.168.24.1/24 --templates=/usr/share/openstack-tripleo-heat-templates/ --networks-file=network_data_undercloud.yaml --heat-native -e /usr/share/openstack-tripleo-heat-templates/environments/undercloud.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/use-dns-for-vips.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml -e /home/stack/containers-prepare-parameter.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/services/masquerade-networks.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/services/mistral.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/services/zaqar-swift-backend.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/services/tempest.yaml --public-virtual-ip 192.168.24.2 --control-virtual-ip 192.168.24.3 -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/services/undercloud-haproxy.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/services/undercloud-keepalived.yaml --deployment-user stack --output-dir=/home/stack --cleanup -e /home/stack/tripleo-config-generated-env-files/undercloud_parameters.yaml --hieradata-override=/home/stack/hiera_override.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/tripleo-validations.yaml --log-file=install-undercloud.log -e /usr/share/openstack-tripleo-heat-templates/undercloud-stack-vstate-dropin.yam

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Deploy DFG-df-deployment-16.2-virthost-3cont_2comp_3ceph-ceph-ipv4-geneve-satellite-local-registry job
2.
3.

Actual results:
Undercloud deployment failed 

Expected results:
Undercloud deployment should pass

Additional info:
Job logs: https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-df-deployment-16.2-virthost-3cont_2comp_3ceph-ceph-ipv4-geneve-satellite-local-registry/5/artifact/.sh/install-undercloud-sat.log

Comment 1 Cédric Jeanneret 2021-01-18 15:47:10 UTC
Created attachment 1748482 [details]
undercloud-install.log

Attached is a "reformatted" log that should make the issue more obvious than the terrible original output.

It sounds like some internal timeout with the session, as we can see with the following stack-trace (it's the last block in the attached file):


ERROR:root:Image prepare failed: '<' not supported between instances of 'int' and 'NoneType'
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib64/python3.6/concurrent/futures/process.py", line 175, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/usr/lib64/python3.6/concurrent/futures/process.py", line 153, in _process_chunk
    return [fn(*args) for args in chunk]
  File "/usr/lib64/python3.6/concurrent/futures/process.py", line 153, in <listcomp>
    return [fn(*args) for args in chunk]
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 2678, in upload_task
    return uploader.upload_image(task)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 1564, in upload_image
    password=target_password
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 292, in wrapped_f
    return self.call(f, *args, **kw)
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 358, in call
    do = self.iter(retry_state=retry_state)
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 319, in iter
    return fut.result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 361, in call
    result = fn(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 732, in authenticate
    get_cached_bearer_token(self.lock, scope)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 316, in get_cached_bearer_token
    if (now - token_time).seconds < data.get('expires_in'):
TypeError: '<' not supported between instances of 'int' and 'NoneType'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/bin/tripleo-container-image-prepare", line 138, in <module>
    lock=lock)
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/kolla_builder.py", line 241, in container_images_prepare_multi
    uploader.upload()
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 573, in upload
    uploader.run_tasks()
  File "/usr/lib/python3.6/site-packages/tripleo_common/image/image_uploader.py", line 2620, in run_tasks
    for result in p.map(upload_task, self.upload_tasks):
  File "/usr/lib64/python3.6/concurrent/futures/process.py", line 366, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
TypeError: '<' not supported between instances of 'int' and 'NoneType'
], "stdout": "", "stdout_lines": []}
2021-01-14 17:43:50.962574 | 52540070-da86-bc14-3777-000000000c8d |     TIMING | Run tripleo-container-image-prepare logged to: /var/log/tripleo-container-image-prepare.log | undercloud-0 | 0:03:59.879271 | 7.20s
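
For readers following the trace: the failing comparison is the one at image_uploader.py line 316, where `data.get('expires_in')` returns None when the cached token data has no `expires_in` key, and Python 3 refuses to order an int against None. The sketch below is a minimal illustration of that failure and of one way to guard against it. It is not the tripleo-common code or the merged patch; `token_is_fresh` is a hypothetical helper, and treating a missing `expires_in` as "not fresh" is an assumption made here for illustration.

```python
from datetime import datetime


def token_is_fresh(data, token_time, now):
    """Return True if a cached token is still inside its reported lifetime.

    Hypothetical helper for illustration only.
    data       -- dict parsed from the token response
    token_time -- datetime at which the token was issued
    now        -- datetime to compare against
    """
    expires_in = data.get('expires_in')
    if expires_in is None:
        # Assumption: with no lifetime reported, treat the token as stale
        # instead of evaluating `int < None`, which raises TypeError.
        return False
    # The traceback uses `.seconds`; total_seconds() is used here so that
    # ages longer than one day are not wrapped around.
    return (now - token_time).total_seconds() < expires_in
```

With this guard, a response lacking `expires_in` leads to a fresh authentication attempt rather than an exception inside the worker process.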

Comment 2 Jad Haj Yahya 2021-01-19 09:11:23 UTC
New findings:

After explicitly setting the tag to 16.2 in containers-prepare-parameter.yaml, the undercloud deployment passed.

Note: in 16.1 there was no need to set the tag to 16.1 explicitly.

Comment 3 Alex Schultz 2021-01-21 15:08:24 UTC
That sounds like we might be missing the downstream tag patch. Will look this week.

Comment 4 Alex Schultz 2021-01-21 15:44:19 UTC
Adding tag bypasses the tag lookup so it doesn't exercise the code that is failing. I'll look into why the authentication code is unhappy.

Comment 7 Jad Haj Yahya 2021-03-22 06:21:13 UTC
Undercloud (UC) deployment via Satellite passed without an explicit tag.

Comment 11 errata-xmlrpc 2021-09-15 07:11:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform (RHOSP) 16.2 enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:3483

