Traceback in ironic/ironic-conductor.log during deployment: ~~~ [req-c53baaf7-21ff-40c6-8734-4c3dd7b20cdc 2e1561e8f87744bea311b6ce7734240e c242b9ae848e418caddcd3e04adf21db - default default] Node 93fe2702-3696-4278-9d15-507ab646aa74 failed deploy step {'step': 'deploy', 'priority': 100, 'argsinfo': None, 'interface': 'deploy'} with unexpected error: Invalid string format: 1eB: ValueError: Invalid string format: 1eB Traceback (most recent call last): File "/usr/lib/python3.6/site-packages/ironic/conductor/manager.py", line 3914, in _do_next_deploy_step result = interface.execute_deploy_step(task, step) File "/usr/lib/python3.6/site-packages/ironic/drivers/base.py", line 360, in execute_deploy_step return self._execute_step(task, step) File "/usr/lib/python3.6/site-packages/ironic/drivers/base.py", line 283, in _execute_step return getattr(self, step['step'])(task) File "/usr/lib/python3.6/site-packages/ironic_lib/metrics.py", line 60, in wrapped result = f(*args, **kwargs) File "/usr/lib/python3.6/site-packages/ironic/conductor/task_manager.py", line 148, in wrapper return f(*args, **kwargs) File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/iscsi_deploy.py", line 422, in deploy deploy_utils.cache_instance_image(task.context, node) File "/usr/lib/python3.6/site-packages/ironic_lib/metrics.py", line 60, in wrapped result = f(*args, **kwargs) File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/deploy_utils.py", line 1197, in cache_instance_image force_raw) File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/deploy_utils.py", line 531, in fetch_images cache.fetch_image(href, path, ctx=ctx, force_raw=force_raw) File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/image_cache.py", line 140, in fetch_image href, master_path, dest_path, ctx=ctx, force_raw=force_raw) File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/image_cache.py", line 167, in _download_image _fetch(ctx, href, tmp_path, force_raw) File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/image_cache.py", line 321, in _fetch required_space = images.converted_size(path_tmp) File "/usr/lib/python3.6/site-packages/ironic/common/images.py", line 396, in converted_size data = disk_utils.qemu_img_info(path) File "/usr/lib/python3.6/site-packages/ironic_lib/disk_utils.py", line 386, in qemu_img_info return imageutils.QemuImgInfo(out) File "/usr/lib/python3.6/site-packages/oslo_utils/imageutils.py", line 62, in __init__ details = self._parse(cmd_output or '') File "/usr/lib/python3.6/site-packages/oslo_utils/imageutils.py", line 178, in _parse details = self._extract_details(root, root_details, lines) File "/usr/lib/python3.6/site-packages/oslo_utils/imageutils.py", line 127, in _extract_details real_details = self._extract_bytes(root_details) File "/usr/lib/python3.6/site-packages/oslo_utils/imageutils.py", line 113, in _extract_bytes return_int=True) File "/usr/lib/python3.6/site-packages/oslo_utils/strutils.py", line 225, in string_to_bytes raise ValueError(msg) ValueError: Invalid string format: 1eB ~~~ Investigation: ~~~ (Pdb) n > / usr/lib/python3.6/site-packages/ironic/drivers/modules/image_cache.py(317)_fetch() -> path_tmp = "%s.part" % path (Pdb) l 312 return stat.f_frsize * stat.f_bavail 313 314 315 def _fetch(context, image_href, path, force_raw=False): 316 """Fetch image and convert to raw format if needed.""" 317 -> path_tmp = "%s.part" % path 318 images.fetch(context, image_href, path_tmp, force_raw=False) # Notes(yjiang5): If glance can provide the virtual size information, 319 320 # then we can firstly clean cache and then invoke images.fetch(). 321 if force_raw: 322 required_space = images.converted_size(path_tmp) (Pdb) p force_raw True (Pdb) n > / usr/lib/python3.6/site-packages/ironic/drivers/modules/image_cache.py(318)_fetch() -> images.fetch(context, image_href, path_tmp, force_raw=False) (Pdb) p force_raw True (Pdb) s --Call-- > / usr/lib/python3.6/site-packages/ironic/common/images.py(320)fetch() -> def fetch(context, image_href, path, force_raw=False): (Pdb) l 315 except processutils.ProcessExecutionError as e: 316 LOG.exception("Creating ISO image failed.") 317 raise exception.ImageCreationFailed(image_type='iso', error=e) 318 319 320 -> def fetch(context, image_href, path, force_raw=False): 321 # TODO(vish): Improve context handling and add owner and auth data 322 # when it is added to glance. Right now there is no 323 # auth checking in glance, so we assume that access was 324 # checked before we got here. 325 image_service = service.get_image_service(image_href, (Pdb) p force_raw False (Pdb) ~~~ As you can see here, after stepping into images.fetch we are immediately missing force_raw, and it's being reset to False by: ~~~ 318 images.fetch(context, image_href, path_tmp, force_raw=False) ~~~ This is just a pseudo code execution, but same logic(replace qe_test with force_raw) ~~~ def _qe(qe_test=False): qe(qe_test=False) def qe(qe_test=False): print(qe_test) _qe(qe_test=True) ~~~ As you can see even if we set it to True, it will always be False.bit_length ~~~ sh-4.4$ python qe_test.py False ~~~ We can fix this by changing: ~~~ > /usr/lib/python3.6/site-packages/ironic/drivers/modules/image_cache.py(318)_fetch() -> images.fetch(context, image_href, path_tmp, force_raw=False) ~~~ To ~~~ > /usr/lib/python3.6/site-packages/ironic/drivers/modules/image_cache.py(318)_fetch() -> images.fetch(context, image_href, path_tmp, force_raw) ~~~ However it introduces another problem because as you can see below path_tmp already contains .part. ~~~ 315 def _fetch(context, image_href, path, force_raw=False): 316 """Fetch image and convert to raw format if needed.""" 317 -> path_tmp="%s.part" % path 318 images.fetch(context, image_href, path_tmp, force_raw=False) ~~~ We feed path_tmp to fetch method below as path: ~~~ 320 def fetch(context, image_href, path, force_raw=False): ... 336 -> if force_raw: 337 image_to_raw(image_href, path, "%s.part" % path) ~~~ And if force_raw is True, which it is, path is appended another .part to it (Pdb) p path_tmp '/var/lib/ironic/master_images/tmp3bd03uc5/e48e1f98-2afe-40b6-bcf9-688c4203a5f1.part.part' (Pdb) We can fix this by changing ~~~ 336 -> if force_raw: 337 image_to_raw(image_href, path, "%s.part" % path) ~~~ To ~~~ 336 -> if force_raw: 337 image_to_raw(image_href, path, "%s" % path) ~~~ Next problem I can see is that we need to run qemu-img info to verify if the image is alreayd raw or has backing file: ~~~ 340 -> def image_to_raw(image_href, path, path_tmp): 341 with fileutils.remove_path_on_error(path_tmp): 342 data = disk_utils.qemu_img_info(path_tmp) 343 fmt = data.file_format 344 if fmt is None: ~~~ The qemu-img info on RHEL 8 returns: ~~~ 383 out, err = utils.execute('env', 'LC_ALL=C', 'LANG=C', 384 'qemu-img', 'info', path, 385 prlimit=QEMU_IMG_LIMITS) 386 -> return imageutils.QemuImgInfo(out) > /usr/lib/python3.6/site-packages/ironic_lib/disk_utils.py(386)qemu_img_info() -> return imageutils.QemuImgInfo(out) (Pdb) p out 'image: /var/lib/ironic/master_images/tmpj2m19d28/e48e1f98-2afe-40b6-bcf9-688c4203a5f1.part\nfile format: qcow2\nvirtual size: 3.84 GiB (4127195136 bytes)\ndisk size: 1e+03 MiB\ncluster_size: 65536\nFormat speci fic information:\n compat: 1.1\n lazy refcounts: false\n refcount bits: 16\n corrupt: false\n' ~~~ Which causes problem because parses doesn't recognize 'e' character. ~~~ disk size: 1e+03 MiB ~~~ In parser we have human read-able output: ~~~ (Pdb) > /usr/lib/python3.6/site-packages/oslo_utils/imageutils.py(62)__init__() -> details = self._parse(cmd_output or '') 61 else: 62 -> details = self._parse(cmd_output or '') 63 self.image = details.get('image') 64 self.backing_file = details.get('backing_file') 65 self.file_format = details.get('file_format') 66 self.virtual_size = details.get('virtual_size') 67 self.cluster_size = details.get('cluster_size') ~~~ We are sending this output to _parse: ~~~ (Pdb) p cmd_output 'image: /var/lib/ironic/master_images/tmphcz_ry4w/e48e1f98-2afe-40b6-bcf9-688c4203a5f1.part\nfile format: qcow2\nvirtual size: 3.84 GiB (4127195136 bytes)\ndisk size: 1e+03 MiB\ncluster_size: 65536\nFormat speci fic information:\n compat: 1.1\n lazy refcounts: false\n refcount bits: 16\n corrupt: false\n' (Pdb) ~~~ And we know the _parse is the one who fails here: ~~~ (Pdb) p self._parse(cmd_output or '') *** ValueError: Invalid string format: 1eB ~~~ Code of _parse: ~~~ 159 -> def _parse(self, cmd_output): 160 # Analysis done of qemu-img.c to figure out what is going on here 161 # Find all points start with some chars and then a ':' then a newline 162 # and then handle the results of those 'top level' items in a separate 163 # function. 164 # (Pdb) l 165 # TODO(harlowja): newer versions might have a json output format 166 # we should switch to that whenever possible. 167 # see: http://bit.ly/XLJXDX 168 contents = {} 169 lines = [x for x in cmd_output.splitlines() if x.strip()] 170 while lines: 171 line = lines.pop(0) 172 top_level = self.TOP_LEVEL_RE.match(line) 173 if top_level: 174 root = self._canonicalize(top_level.group(1)) 175 if not root: 176 continue 177 root_details = top_level.group(2).strip() 178 details = self._extract_details(root, root_details, lines) 179 contents[root] = details 180 return contents ~~~ We are parsing: ~~~ (Pdb) p line 'disk size: 1e+03 MiB' (Pdb) l 167 # see: http://bit.ly/XLJXDX 168 contents = {} 169 lines = [x for x in cmd_output.splitlines() if x.strip()] 170 while lines: 171 line = lines.pop(0) 172 -> top_level = self.TOP_LEVEL_RE.match(line) 173 if top_level: 174 root = self._canonicalize(top_level.group(1)) 175 if not root: 176 continue 177 root_details = top_level.group(2).strip() (Pdb) (Pdb) p self.TOP_LEVEL_RE re.compile('^([\\w\\d\\s\\_\\-]+):(.*)$') (Pdb) p top_level <_sre.SRE_Match object; span=(0, 20), match='disk size: 1e+03 MiB'> (Pdb) p root 'disk_size' (Pdb) p root_details '1e+03 MiB' ~~~ We are getting closer to the problem now: ~~~ [EOF] (Pdb) p self._extract_details(root, root_details, lines) *** ValueError: Invalid string format: 1eB (Pdb) (Pdb) > /usr/lib/python3.6/site-packages/oslo_utils/imageutils.py(117)_extract_details() -> if root_cmd == 'backing_file': (Pdb) p root_cmd 'disk_size' (Pdb) p real_details '1e+03 MiB' 122 -> elif root_cmd in ['virtual_size', 'cluster_size', 'disk_size']: 123 # Replace it with the byte amount (if we can convert it) 124 if root_details in ('None', 'unavailable'): 125 real_details = 0 126 else: 127 real_details = self._extract_bytes(root_details) (Pdb) p self._extract_bytes(root_details) *** ValueError: Invalid string format: 1eB ~~~ Even closer, we know the problem is in _extract_bytes. Code of _extract_bytes ~~~ 97 -> def _extract_bytes(self, details): 98 # Replace it with the byte amount 99 real_size = self.SIZE_RE.search(details) 100 if not real_size: 101 raise ValueError(_('Invalid input value "%s".') % details) 102 magnitude = real_size.group(1) (Pdb) 103 unit_of_measure = real_size.group(2) 104 bytes_info = real_size.group(3) 105 if bytes_info: 106 return int(real_size.group(4)) 107 elif not unit_of_measure: 108 return int(magnitude) 109 # Allow abbreviated unit such as K to mean KB for compatibility. 110 if len(unit_of_measure) == 1 and unit_of_measure != 'B': 111 unit_of_measure += 'B' 112 return strutils.string_to_bytes('%s%s' % (magnitude, unit_of_measure), 113 return_int=True) (Pdb) ~~~ And here is the real problem: ~~~ (Pdb) p self.SIZE_RE.search(details) <_sre.SRE_Match object; span=(0, 2), match='1e'> (Pdb) (Pdb) > /usr/lib/python3.6/site-packages/oslo_utils/imageutils.py(110)_extract_bytes() -> if len(unit_of_measure) == 1 and unit_of_measure != 'B': (Pdb) l 105 if bytes_info: 106 return int(real_size.group(4)) 107 elif not unit_of_measure: 108 return int(magnitude) 109 # Allow abbreviated unit such as K to mean KB for compatibility. 110 -> if len(unit_of_measure) == 1 and unit_of_measure != 'B': 111 unit_of_measure += 'B' 112 return strutils.string_to_bytes('%s%s' % (magnitude, unit_of_measure), 113 return_int=True) 114 115 def _extract_details(self, root_cmd, root_details, lines_after): (Pdb) p unit_of_measure 'e' (Pdb) ~~~ We will add 'B' to the unit_of_measure, hence 'eB' and return it together with magnitude as an int: ~~~ -> return strutils.string_to_bytes('%s%s' % (magnitude, unit_of_measure), (Pdb) l 107 elif not unit_of_measure: 108 return int(magnitude) 109 # Allow abbreviated unit such as K to mean KB for compatibility. 110 if len(unit_of_measure) == 1 and unit_of_measure != 'B': 111 unit_of_measure += 'B' 112 -> return strutils.string_to_bytes('%s%s' % (magnitude, unit_of_measure), 113 return_int=True) 114 115 def _extract_details(self, root_cmd, root_details, lines_after): 116 real_details = root_details 117 if root_cmd == 'backing_file': (Pdb) p magnitude '1' (Pdb) p unit_of_measure 'eB' ~~~ '1eB' And here is where we get the 'ValueError: Invalid string format: 1eB' So the problem seems in the qemu-img returning the disk size in '1e+03 MiB' ~~~ # qemu-img info /var/lib/ironic/master_images/tmptfe753fb/e48e1f98-2afe-40b6-bcf9-688c4203a5f1.part image: /var/lib/ironic/master_images/tmptfe753fb/e48e1f98-2afe-40b6-bcf9-688c4203a5f1.part file format: qcow2 virtual size: 3.84 GiB (4127195136 bytes) -> disk size: 1e+03 MiB cluster_size: 65536 Format specific information: compat: 1.1 lazy refcounts: false refcount bits: 16 corrupt: false ~~~ We don't have the same problem using 'json' ~~~ qemu-img info --output=json /var/lib/ironic/master_images/tmptfe753fb/e48e1f98-2afe-40b6-bcf9-688c4203a5f1.part { "virtual-size": 4127195136, "filename": "/var/lib/ironic/master_images/tmptfe753fb/e48e1f98-2afe-40b6-bcf9-688c4203a5f1.part", "cluster-size": 65536, "format": "qcow2", "actual-size": 1048510464, "format-specific": { "type": "qcow2", "data": { "compat": "1.1", "lazy-refcounts": false, "refcount-bits": 16, "corrupt": false } }, "dirty-flag": false } ~~~ I have forced qemu-img ofmt to be 'json' in /usr/lib/python3.6/site-packages/ironic_lib/disk_utils.py ~~~ 383 out, err = utils.execute('env', 'LC_ALL=C', 'LANG=C', 384 'qemu-img', 'info', path, 385 prlimit=QEMU_IMG_LIMITS) 386 -> return imageutils.QemuImgInfo(out) ~~~ To ~~~ 383 out, err = utils.execute('env', 'LC_ALL=C', 'LANG=C', 384 'qemu-img', 'info', '--output=json', path, 385 prlimit=QEMU_IMG_LIMITS) 386 -> return imageutils.QemuImgInfo(out, format='json') ~~~ DISCLAIMER: All the changes were done in the lab to investigate the issue and were never recommended. This is only to help you figure out the real fix. Environment information: ~~~ # cat /etc/*release NAME="Red Hat Enterprise Linux" VERSION="8.1 (Ootpa)" ID="rhel" ID_LIKE="fedora" VERSION_ID="8.1" PLATFORM_ID="platform:el8" PRETTY_NAME="Red Hat Enterprise Linux 8.1 (Ootpa)" ANSI_COLOR="0;31" CPE_NAME="cpe:/o:redhat:enterprise_linux:8.1:GA" HOME_URL="https://www.redhat.com/" BUG_REPORT_URL="https://bugzilla.redhat.com/" REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8" REDHAT_BUGZILLA_PRODUCT_VERSION=8.1 REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux" REDHAT_SUPPORT_PRODUCT_VERSION="8.1" Red Hat Enterprise Linux release 8.1 (Ootpa) Red Hat OpenStack Platform release 16.0.1 (Train) Red Hat Enterprise Linux release 8.1 (Ootpa) ~~~ Images on Undercloud: ~~~ # podman ps | grep ironic 476ec58cc733 undercloud-0.ctlplane.redhat.local:8787/rhosp-rhel8/openstack-ironic-conductor:16.0-77 kolla_start 6 hours ago Up 8 minutes ago ironic_conductor 493e6a47ea70 undercloud-0.ctlplane.redhat.local:8787/rhosp-rhel8/openstack-nova-compute-ironic:16.0-77 kolla_start 2 days ago Up 6 hours ago nova_compute ~~~
This is being tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1807375 and the fix is available, marking as duplicate. *** This bug has been marked as a duplicate of bug 1807375 ***