Description of problem:
Overcloud BM instance on unique tenant network fails to boot with the console reporting "Could not open san device, Connection timed out"

Version-Release number of selected component (if applicable):
RHOS-16.1-RHEL-8-20201021.n.0

iscsi-initiator-utils.x86_64 6.2.0.878-4.gitd791ce0.el8 @rhosp-rhel-8.2-baseos
iscsi-initiator-utils-iscsiuio.x86_64 6.2.0.878-4.gitd791ce0.el8 @rhosp-rhel-8.2-baseos
libiscsi.x86_64 1.18.0-8.module+el8.2.0+4793+b09dd2fb @rhosp-rhel-8.2-av
libvirt-daemon-driver-storage-iscsi.x86_64 6.0.0-25.4.module+el8.2.1+8060+c0c58169 @rhosp-rhel-8.2-av
libvirt-daemon-driver-storage-iscsi-direct.x86_64 6.0.0-25.4.module+el8.2.1+8060+c0c58169 @rhosp-rhel-8.2-av
openstack-ironic-python-agent-builder.noarch 2.1.1-1.20200914175356.65d0f80.el8ost @rhelosp-16.1
puppet-cinder.noarch 15.4.1-1.20200831153422.ff571a9.el8ost @rhelosp-16.1
puppet-ironic.noarch 15.4.1-1.20200814153354.39f97cc.el8ost @rhelosp-16.1
puppet-nova.noarch 15.6.1-1.20200814103355.51a6857.el8ost @rhelosp-16.1
python3-cinderclient.noarch 5.0.1-0.20200326130221.8fa0882.el8ost @rhelosp-16.1
python3-ironic-inspector-client.noarch 3.7.1-0.20200522054325.3a41127.el8ost @rhelosp-16.1
python3-ironicclient.noarch 3.1.2-0.20200522053422.1220d76.el8ost @rhelosp-16.1
python3-novaclient.noarch 1:15.1.1-0.20200629073413.79959ab.el8ost @rhelosp-16.1
qemu-kvm-block-iscsi.x86_64 15:4.2.0-29.module+el8.2.1+7990+27f1e480.4 @rhosp-rhel-8.2-av

docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-ironic-neutron-agent:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-nova-scheduler:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-nova-api:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-ironic-pxe:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-nova-conductor:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-ironic-api:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-iscsid:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-nova-compute-ironic:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-ironic-conductor:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-ironic-inspector:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-api:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-nova-novncproxy:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-scheduler:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-nova-compute:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-nova-libvirt:16.1_20201020.1
docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-volume:16.1_20201020.1

How reproducible:
Consistently

Steps to Reproduce:
1. Deploy 3cont_2comp_2ironic, update rhel8 image to include scsi boot utils, create cinder volume and connect to ironic node, boot fails.

Actual results:
Boot fails with node console reporting "Could not open san device, boot timed out" (this loops perpetually)

Expected results:
Instance should boot successfully

Additional info:
Jenkins testing
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name         | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| f947288b-6efc-4e03-8cb5-037162f097f8 | compute-0    | f8c722d7-f230-4561-b987-88821bc4b060 | power on    | active             | False       |
| 29a8583c-a9a8-43e1-a19b-890095b173e5 | compute-1    | 556260da-bbe7-425f-9faf-70ff7eac6b9d | power on    | active             | False       |
| 44c3233a-4f00-42af-9c61-a264514d6a4c | controller-0 | 133205fc-8dc3-40e4-b27f-97207442bb6e | power on    | active             | False       |
| c95e6dbe-f1fe-4bf3-903a-e69a2ec2197a | controller-1 | f30bdd1c-a20f-47aa-b1a0-02c8a7e0b173 | power on    | active             | False       |
| f9ca8bd3-c1f6-4d51-9893-e1757e3fcaf5 | controller-2 | 6a946da5-006c-4a6d-a7e2-acb6aa62fcb2 | power on    | active             | False       |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+

+--------------------------------------+--------------+--------+------------------------+----------------+------------+
| ID                                   | Name         | Status | Networks               | Image          | Flavor     |
+--------------------------------------+--------------+--------+------------------------+----------------+------------+
| f30bdd1c-a20f-47aa-b1a0-02c8a7e0b173 | controller-2 | ACTIVE | ctlplane=192.168.24.11 | overcloud-full | controller |
| 6a946da5-006c-4a6d-a7e2-acb6aa62fcb2 | controller-1 | ACTIVE | ctlplane=192.168.24.45 | overcloud-full | controller |
| 133205fc-8dc3-40e4-b27f-97207442bb6e | controller-0 | ACTIVE | ctlplane=192.168.24.35 | overcloud-full | controller |
| 556260da-bbe7-425f-9faf-70ff7eac6b9d | compute-1    | ACTIVE | ctlplane=192.168.24.16 | overcloud-full | compute    |
| f8c722d7-f230-4561-b987-88821bc4b060 | compute-0    | ACTIVE | ctlplane=192.168.24.48 | overcloud-full | compute    |
+--------------------------------------+--------------+--------+------------------------+----------------+------------+

+--------------------------------------+----------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name     | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+----------+--------------------------------------+-------------+--------------------+-------------+
| 47b32327-8005-4805-8f25-c7875d21061b | ironic-0 | 01f8d194-84be-4d9d-ab6a-cfde6779351d | power on    | active             | False       |
| 47dfc4e4-e413-4415-9b7c-320b793dd07e | ironic-1 | b5a0bdb4-d54b-4b5f-b581-44472c0750ee | power on    | active             | False       |
+--------------------------------------+----------+--------------------------------------+-------------+--------------------+-------------+

+--------------------------------------+---------------+--------+-------------------------+----------------+--------+
| ID                                   | Name          | Status | Networks                | Image          | Flavor |
+--------------------------------------+---------------+--------+-------------------------+----------------+--------+
| 01f8d194-84be-4d9d-ab6a-cfde6779351d | bfv-instance1 | ACTIVE | baremetal=192.168.24.80 |                |        |
| b5a0bdb4-d54b-4b5f-b581-44472c0750ee | instance2     | ACTIVE | baremetal=192.168.24.89 | overcloud-full |        |
+--------------------------------------+---------------+--------+-------------------------+----------------+--------+

bfv-instance1 is the cinder boot node:
os_dcf_diskconfig="MANUAL"
os_ext_az_availability_zone="nova"
os_ext_srv_attr_host="controller-1.redhat.local"
os_ext_srv_attr_hostname="bfv-instance1"
os_ext_srv_attr_hypervisor_hostname="47b32327-8005-4805-8f25-c7875d21061b"
os_ext_srv_attr_instance_name="instance-00000008"
os_ext_srv_attr_kernel_id=""
os_ext_srv_attr_launch_index="0"
os_ext_srv_attr_ramdisk_id=""
os_ext_srv_attr_reservation_id="r-qqrmjq97"
os_ext_srv_attr_root_device_name="/dev/sda"
os_ext_srv_attr_user_data="None"
os_ext_sts_power_state="Running"
os_ext_sts_task_state="None"
os_ext_sts_vm_state="active"
os_srv_usg_launched_at="2020-10-30T00:31:25.000000"
os_srv_usg_terminated_at="None"
accessipv4=""
accessipv6=""
addresses="baremetal=192.168.24.80"
config_drive="True"
created="2020-10-30T00:30:20Z"
description="None"
flavor="disk='20', ephemeral='0', extra_specs.baremetal='true', extra_specs.resources:CUSTOM_BAREMETAL='1', extra_specs.resources:DISK_GB='0', extra_specs.resources:MEMORY_MB='0', extra_specs.resources:VCPU='0', original_name='baremetal', ram='2048', swap='0', vcpus='1'"
hostid="c026ce08b0b35d88fe16182861e3b6e2dfed4f8adfb95280cf125cf7"
host_status="UP"
id="01f8d194-84be-4d9d-ab6a-cfde6779351d"
image=""
key_name="stack-key"
locked="False"
locked_reason="None"
name="bfv-instance1"
progress="0"
project_id="8ba6a449dac6476896f9106f2d11a398"
properties=""
security_groups="name='default'"
server_groups="[]"
status="ACTIVE"
tags="[]"
trusted_image_certificates="None"
updated="2020-10-30T00:32:27Z"
user_id="14e80c9af6bd40dfbb44ceeee3f022b7"
volumes_attached="delete_on_termination='False', id='86b1c7df-ad30-4059-811a-b3f0c298141d'"

Volume:
attachments="[{'id': '86b1c7df-ad30-4059-811a-b3f0c298141d', 'attachment_id': 'fda08c68-cc9b-4750-8887-0d4a91a15847', 'volume_id': '86b1c7df-ad30-4059-811a-b3f0c298141d', 'server_id': '01f8d194-84be-4d9d-ab6a-cfde6779351d', 'host_name': '192.168.24.80', 'device': '/dev/sda', 'attached_at': '2020-10-30T00:30:33.000000'}]"
availability_zone="nova"
bootable="true"
consistencygroup_id="None"
created_at="2020-10-30T00:28:54.000000"
description="None"
encrypted="False"
id="86b1c7df-ad30-4059-811a-b3f0c298141d"
migration_status="None"
multiattach="False"
name="rhel-test-volume"
os_vol_host_attr_host="hostgroup@tripleo_iscsi#tripleo_iscsi"
os_vol_mig_status_attr_migstat="None"
os_vol_mig_status_attr_name_id="None"
os_vol_tenant_attr_tenant_id="8ba6a449dac6476896f9106f2d11a398"
properties="{}"
replication_status="None"
size="10"
snapshot_id="None"
source_volid="None"
status="in-use"
type="tripleo"
updated_at="2020-10-30T00:30:33.000000"
user_id="14e80c9af6bd40dfbb44ceeee3f022b7"
volume_image_metadata="{'signature_verified': 'False', 'image_id': 'd91d7b7e-5fc2-42f8-92f5-b82da1d46fcf', 'image_name': 'rhel-bfv', 'checksum': '98dad0abb0894ddd27c81b983373af33', 'container_format': 'bare', 'disk_format': 'qcow2', 'min_disk': '0', 'min_ram': '0', 'size': '1259601920'}"

+--------------------------------------+-----------+--------------------------------------+
| ID                                   | Name      | Subnets                              |
+--------------------------------------+-----------+--------------------------------------+
| a51b4765-9227-41c7-8378-d8f5b49e5b25 | baremetal | 5774ae3b-0763-4991-b80b-f95bbca66658 |
+--------------------------------------+-----------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack subnet show 5774ae3b-0763-4991-b80b-f95bbca66658
+-------------------+--------------------------------------------------------------+
| Field             | Value                                                        |
+-------------------+--------------------------------------------------------------+
| allocation_pools  | 192.168.24.71-192.168.24.100                                 |
| cidr              | 192.168.24.0/24                                              |
| created_at        | 2020-10-29T23:57:21Z                                         |
| description       |                                                              |
| dns_nameservers   | 10.0.0.1                                                     |
| enable_dhcp       | True                                                         |
| gateway_ip        | 192.168.24.250                                               |
| host_routes       |                                                              |
| id                | 5774ae3b-0763-4991-b80b-f95bbca66658                         |
| ip_version        | 4                                                            |
| ipv6_address_mode | None                                                         |
| ipv6_ra_mode      | None                                                         |
| location          | cloud='', project.domain_id=, project.domain_name='Default', |
|                   | project.id='8ba6a449dac6476896f9106f2d11a398',               |
|                   | project.name='admin', region_name='regionOne', zone=         |
| name              | baremetal-subnet                                             |
| network_id        | a51b4765-9227-41c7-8378-d8f5b49e5b25                         |
| prefix_length     | None                                                         |
| project_id        | 8ba6a449dac6476896f9106f2d11a398                             |
| revision_number   | 2                                                            |
| segment_id        | None                                                         |
| service_types     |                                                              |
| subnetpool_id     | None                                                         |
| tags              |                                                              |
| updated_at        | 2020-10-30T00:29:31Z                                         |
+-------------------+--------------------------------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack port list
+--------------------------------------+------+-------------------+-------------------------------------------------------------------------------+--------+
| ID                                   | Name | MAC Address       | Fixed IP Addresses                                                            | Status |
+--------------------------------------+------+-------------------+-------------------------------------------------------------------------------+--------+
| 056aded0-ea3e-4698-bbc0-72bd0722eaf9 |      | 52:54:00:f7:1b:7b | ip_address='192.168.24.80', subnet_id='5774ae3b-0763-4991-b80b-f95bbca66658'  | DOWN   |
| 1abbad93-f1ea-400e-b3ae-37e235b457c0 |      | 52:54:00:47:6b:66 | ip_address='192.168.24.89', subnet_id='5774ae3b-0763-4991-b80b-f95bbca66658'  | DOWN   |
| 5edead48-0e2a-4ccb-95f1-7c617e52f97a |      | fa:16:3e:58:c3:8e | ip_address='192.168.24.71', subnet_id='5774ae3b-0763-4991-b80b-f95bbca66658'  | DOWN   |
| 76411b09-c512-4b49-8acd-741786e44f92 |      | fa:16:3e:d1:d3:5f | ip_address='192.168.24.72', subnet_id='5774ae3b-0763-4991-b80b-f95bbca66658'  | ACTIVE |
| a88990d7-80e2-4f5d-9169-73c60fbb86f7 |      | fa:16:3e:50:bb:0f | ip_address='192.168.24.250', subnet_id='5774ae3b-0763-4991-b80b-f95bbca66658' | ACTIVE |
| cdd01aa9-adba-4b68-b70c-fdeff7fdf1d5 |      | fa:16:3e:76:a0:7c | ip_address='192.168.24.73', subnet_id='5774ae3b-0763-4991-b80b-f95bbca66658'  | ACTIVE |
| f53fe0a1-da4e-47ed-a51c-090b75ab8ad2 |      | fa:16:3e:b8:93:4d | ip_address='192.168.24.74', subnet_id='5774ae3b-0763-4991-b80b-f95bbca66658'  | ACTIVE |
+--------------------------------------+------+-------------------+-------------------------------------------------------------------------------+--------+
Which iSCSI is that? Please provide a bit more detail about the configuration. Do you know if it works with other drivers (Ceph)?
Please ignore my last question, as this is iSCSI only.
The issue is routing. The node boots with 192.168.24.80 as its IP; the iSCSI target is at 172.17.3.147.

controller-0 has 192.168.24.29/32 and 192.168.24.35/24
controller-1 has 192.168.24.45/24
controller-2 has 192.168.24.11/24
control_virtual_ip is 192.168.24.29 (on controller-0)

Router r1 is built with baremetal-subnet using 192.168.24.250 as a gateway, allocation pool 192.168.24.71 -> 192.168.24.100, and a route to 172.17.3.0/24 via control_virtual_ip (192.168.24.29 on controller-0). In this state the booting node fails with "Could not open san device, Connection timed out".

Changing the route to 172.17.3.0/24 via 192.168.24.45 (controller-1, where the iSCSI target lives) allows the booting node to connect to the target, load the image, and boot. Controller-0 is not forwarding traffic from the router into vlan30 (172.17.3.0/24).
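For reference, the route change described above could be expressed with the standard `openstack router unset --route` / `openstack router set --route` commands. The sketch below is a dry run that only prints those commands; the router name and addresses are the ones quoted in this comment, while `print_route_swap` is a hypothetical helper name used for illustration.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the commands that would move a static route on a
# Neutron router from one next hop to another. Nothing is executed.
# print_route_swap is a hypothetical helper, not part of any OpenStack tool.
print_route_swap() {
    local router="$1" dest="$2" old_gw="$3" new_gw="$4"
    echo "openstack router unset --route destination=${dest},gateway=${old_gw} ${router}"
    echo "openstack router set --route destination=${dest},gateway=${new_gw} ${router}"
}

# Values from the comment: next hop moves from the control VIP on
# controller-0 (192.168.24.29) to controller-1 (192.168.24.45).
print_route_swap r1 172.17.3.0/24 192.168.24.29 192.168.24.45
```

Printing rather than executing keeps the sketch safe to run anywhere; the printed lines would need an overcloud admin environment to take effect.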
Setting net.ipv4.conf.all.rp_filter=2 (loose reverse-path filtering) on controller-0/1/2 resolves the communication issue. The node can now pull the image and boot.
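A minimal sketch of how the setting could be inspected and written down, assuming a standard Linux sysctl layout. The kernel uses the higher of the `all` and per-interface `rp_filter` values, so setting `all` to 2 yields loose mode on every interface. The helper names `rp_filter_dropin` and `show_rp_filter` are hypothetical; how the value is persisted in a given deployment is not covered here.

```shell
#!/usr/bin/env bash
# rp_filter_dropin prints a sysctl.d-style line for the requested mode
# (0 = off, 1 = strict, 2 = loose). It only prints; it changes nothing.
rp_filter_dropin() {
    local mode="${1:-2}"
    printf 'net.ipv4.conf.all.rp_filter = %s\n' "$mode"
}

# show_rp_filter lists the per-interface values found under a proc-style
# tree (default: /proc/sys/net/ipv4/conf), one "iface = value" per line.
show_rp_filter() {
    local root="${1:-/proc/sys/net/ipv4/conf}" f
    for f in "$root"/*/rp_filter; do
        [ -r "$f" ] || continue
        printf '%s = %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}

rp_filter_dropin 2
# To apply at runtime on a controller (as root), one could run:
#   sysctl -w net.ipv4.conf.all.rp_filter=2
```

Running `show_rp_filter` before and after the change on each controller is one way to confirm which interfaces were previously in strict mode.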
Final footnote on the job failure:

LIBGUESTFS_BACKEND=direct virt-customize -a /tmp/images/{{ dib_image }} --run-command 'echo "nameserver 10.11.5.19" |tee /etc/resolv.conf && yum localinstall -y http://rhos-release.virt.bos.redhat.com/repos/rhos-release/rhos-release-latest.noarch.rpm && rhos-release -P {{ ospversion.stdout|float }} -p passed_phase1 && yum install -y iscsi-initiator-utils cloud-init openssh && for i in $(awk -F"vmlinuz-" "/linux16/ && ! /rescue/ {print \$2}" /etc/grub2.cfg|awk "{print \$1}") ; do (dracut --force --add "network iscsi" /boot/initramfs-$i.img $i) ; done && sed -i "s/GRUB_CMDLINE_LINUX=\"/GRUB_CMDLINE_LINUX=\"rd.iscsi.firmware=1 /g" /etc/default/grub && /sbin/grub2-mkconfig -o /boot/grub2/grub.cfg && adduser cloud-user && mkdir /home/cloud-user/.ssh && chown -R cloud-user:cloud-user /home/cloud-user && chmod 700 /home/cloud-user/.ssh && touch /home/cloud-user/.ssh/authorized_keys && chmod 600 /home/cloud-user/.ssh/authorized_keys' --root-password password:redhat --ssh-inject cloud-user --selinux-relabel

The above command completes without failure but results in an unbootable image because of this loop:

for i in $(awk -F"vmlinuz-" "/linux16/ && ! /rescue/ {print \$2}" /etc/grub2.cfg|awk "{print \$1}") ; do (dracut --force --add "network iscsi" /boot/initramfs-$i.img $i) ; done

There are no longer any lines with "linux16" in /etc/grub2.cfg, so awk prints nothing, the loop body never runs, and the initramfs is not rebuilt with the network and iscsi modules. I switched to ls /lib/modules for a list of installed kernels.
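A minimal sketch of the /lib/modules approach mentioned above, written as a dry run so it is safe to execute outside the guest image. It echoes the same dracut invocation used in the original loop, once per kernel directory. The helper names `list_kernels` and `print_dracut_cmds` are hypothetical, and the exact replacement used in the job is not shown in this report.

```shell
#!/usr/bin/env bash
# list_kernels prints one installed kernel version per line, taken from
# the directory names under a modules tree (default: /lib/modules).
list_kernels() {
    local root="${1:-/lib/modules}" d
    for d in "$root"/*/; do
        [ -d "$d" ] || continue
        basename "$d"
    done
}

# print_dracut_cmds is a dry run: for each kernel it echoes the dracut
# command from the original loop instead of executing it.
print_dracut_cmds() {
    local kver
    list_kernels "$1" | while read -r kver; do
        echo "dracut --force --add \"network iscsi\" /boot/initramfs-${kver}.img ${kver}"
    done
}

print_dracut_cmds /lib/modules
```

Deriving the kernel list from directory names avoids any dependency on the layout of the grub configuration, which is what broke the awk-based loop.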
So is this an issue with the network setup?
Or a baremetal provisioning issue? Either way it does not appear to be a storage issue for the Cinder squad. Can we get this reassigned?
I'm going to mark this as a duplicate of bug #1892773, since we're assuming it's the same problem, and that bug is also against 16.1.

*** This bug has been marked as a duplicate of bug 1892773 ***