Bug 1472836

Summary: Deployment fails when IPA fails to set up iSCSI target
Product: Red Hat OpenStack Reporter: David Juran <djuran>
Component: documentationAssignee: RHOS Documentation Team <rhos-docs>
Status: CLOSED CURRENTRELEASE QA Contact: RHOS Documentation Team <rhos-docs>
Severity: medium Docs Contact:
Priority: medium    
Version: 11.0 (Ocata)CC: amcleod, bfournie, djuran, dmacpher, mburns, slinaber, srevivo
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-12-02 13:37:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
ramdisk log none

Description David Juran 2017-07-19 13:44:17 UTC
Created attachment 1301121 [details]
ramdisk log

Description of problem:
Trying to deploy OSP11 on KVM (I know, but it's a home lab...) the deployment fails in an early stage when ironic-python-agent fails to set up the iSCSI target to which the overcloud image will be written. From the ramdisk logs, the following can be seen:

Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.306 532 DEBUG ironic_python_agent.extensions.iscsi [-] Starting ISCSI target with iqn iqn.2008-10.org.openstack:5c22206b-6b1f-424a-aeb0-d1fc785e3957 on device /dev/vda start_iscsi_target /usr/lib/python2.7/site-packages/ironic_python_agent/extensions/iscsi.py:168
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root [-] Command execution error: Error starting iSCSI target: Error starting iSCSI target: Failed to create a target: Storage object block/iqn.2008-10.org.openstack:5c22206b-6b1f-424a-aeb0-d1fc785e3957 exists
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root Traceback (most recent call last):
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root   File "/usr/lib/python2.7/site-packages/ironic_python_agent/extensions/base.py", line 252, in execute_command
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root     result = ext.execute(command_part, **kwargs)
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root   File "/usr/lib/python2.7/site-packages/ironic_python_agent/extensions/base.py", line 205, in execute
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root     return cmd(**kwargs)
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root   File "/usr/lib/python2.7/site-packages/ironic_python_agent/extensions/base.py", line 321, in wrapper
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root     result = func(self, **command_params)
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root   File "/usr/lib/python2.7/site-packages/ironic_python_agent/extensions/iscsi.py", line 183, in start_iscsi_target
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root     _start_lio(iqn, portal_port, device)
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root   File "/usr/lib/python2.7/site-packages/ironic_python_agent/extensions/iscsi.py", line 97, in _start_lio
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root     raise errors.ISCSIError(msg)
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root ISCSIError: Error starting iSCSI target: Error starting iSCSI target: Failed to create a target: Storage object block/iqn.2008-10.org.openstack:5c22206b-6b1f-424a-aeb0-d1fc785e3957 exists
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.312 532 ERROR root
Jul 19 08:00:55 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 08:00:55.315 532 INFO root [-] Command iscsi.start_iscsi_target completed: Command name: iscsi.start_iscsi_target, params: {u'wipe_disk_metadata': True, u'iqn': u'iqn.2008-10.org.openstack:5c22206b-6b1f-424a-aeb0-d1fc785e3957', u'portal_port': 3260}, status: FAILED, result: None.


Attached is the full ramdisk log


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Bob Fournier 2017-08-04 16:54:00 UTC
David - is this still occurring? If so, it would be useful to also see the logs from the undercloud side - either an sosreport or /var/logs/...   

from the ramdisk log it looks like there is an issue with iscsi connections:
Jul 19 07:56:59 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 07:56:59.563 532 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): iscsistart -f execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:355
Jul 19 07:56:59 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 07:56:59.603 532 DEBUG oslo_concurrency.processutils [-] CMD "iscsistart -f" returned: 21 in 0.040s execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:385
Jul 19 07:56:59 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 07:56:59.604 532 DEBUG oslo_concurrency.processutils [-] u'iscsistart -f' failed. Not Retrying. execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:433
Jul 19 07:56:59 host-192-168-24-12 ironic-python-agent[532]: 2017-07-19 07:56:59.605 532 DEBUG root [-] No iscsi connection detected. Skipping iscsi. Error: Unexpected error while running command.
Jul 19 07:56:59 host-192-168-24-12 ironic-python-agent[532]: Command: iscsistart -f
Jul 19 07:56:59 host-192-168-24-12 ironic-python-agent[532]: Exit code: 21
Jul 19 07:56:59 host-192-168-24-12 ironic-python-agent[532]: Stdout: u'Could not get list of targets from firmware.\n'
Jul 19 07:56:59 host-192-168-24-12 ironic-python-agent[532]: Stderr: u'' _check_for_iscsi /usr/lib/python2.7/site-packages/ironic_python_agent/hardware.py:89

Comment 2 David Juran 2017-08-09 09:28:55 UTC
Ahhh, not staring oneself blind at the ramdisk logs was the key! Looking at the ironic-conductor log, the following message could be seen:

2017-08-09 09:52:32.342 16004 ERROR ironic.drivers.modules.deploy_utils [req-44b89c98-2a1c-480d-ba2c-84c9b03a5f9c - - - - -] Command: sudo ironic-rootwrap /etc/ironic/rootwrap.conf parted -a optimal -s /dev/disk/by-path/ip-10.0.0.26:3260-iscsi-iqn.2008-10.org.openstack:64ce5a82-21ec-48b7-bf91-efa63e8ed448-lun-1 -- unit MiB mklabel msdos mkpart primary  1 2 mkpart primary  2 40962 set 2 boot on
2017-08-09 09:52:32.344 16004 ERROR ironic.drivers.modules.deploy_utils [req-44b89c98-2a1c-480d-ba2c-84c9b03a5f9c - - - - -] StdOut: u''
2017-08-09 09:52:32.354 16004 ERROR ironic.drivers.modules.deploy_utils [req-44b89c98-2a1c-480d-ba2c-84c9b03a5f9c - - - - -] StdErr: u'Error: The location 40962 is outside of the device /dev/sda.\n'

And 40962 MB _is_ (slightly) more then the 40 GiB I had allocated for the disk. Increasing disk size to  50 GB solved the problem.

So now the question is: Is this a docs-bug where we say that 40 GB is enough for the root disk or is I-p-a setting up a larger partition then it should?

Comment 3 Bob Fournier 2017-08-11 14:03:58 UTC
Thanks for finding this David! I don't believe we doc anywhere that 40 GB is large enough, it should be dependent on the overcloud image size.  I'm not sure what else we can do in this case.  I think we should probably close this.

Comment 4 David Juran 2017-08-11 15:27:18 UTC
Actually we do document[1] that 40 GB is a minimum requirement. On the other hand, the overcloud image is only 5 GB in size[2].
So why is the i-p-a trying to create a partition of just over 40 GB? Or am I misreading something?

[1]
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/11/html/director_installation_and_usage/chap-requirements#sect-Compute_Node_Requirements

[2]
[stack@OSPd11 images(undercloud_admin)]$ qemu-img info overcloud-full.qcow2
image: overcloud-full.qcow2
file format: qcow2
virtual size: 5.0G (5396234240 bytes)
disk size: 1.2G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

Comment 5 Dmitry Tantsur 2017-08-15 14:08:17 UTC
Hi!

IPA tries to create the partition with the size from the flavor [1], which is exacly 40 GiB. Then, due to an unclear long-standing parted issue, it actually occupies slightly more. Finally, we also need a few (50?) MiB for a config drive. For UEFI, also a boot partition. This results in more than 40 GiB required. I think the documentation should be fixed to mention 50 GiB minimum.

[1] https://github.com/openstack/instack-undercloud/blob/master/instack_undercloud/undercloud.py#L1373

Comment 6 Bob Fournier 2017-08-15 14:16:47 UTC
Thanks Dmitry.  I will change this to a doc bug.

Comment 7 Bob Fournier 2017-08-15 18:05:40 UTC
Doc team - as indicated in Comment 4, the docs should be changed to set the Disk Space to a minimum of 50 Gb (not 40).

Comment 9 Dan Macpherson 2020-11-24 14:20:37 UTC
port to OSP16.1, OSP16.0, OSP13 too