
Bug 1750966

Summary: Nodes are stuck in clean_wait status and introspection fails
Product: Red Hat OpenStack
Reporter: Eliad Cohen <elicohen>
Component: openstack-ironic
Assignee: RHOS Maint <rhos-maint>
Status: CLOSED DUPLICATE
QA Contact: mlammon
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 15.0 (Stein)
CC: acanan, bfournie, dtantsur, hjensas, mburns, rpittau
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-09-11 17:43:04 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Undercloud folders (flags: none)

Description Eliad Cohen 2019-09-10 20:50:59 UTC
Created attachment 1613792
Undercloud folders

Description of problem:


Version-Release number of selected component (if applicable):
RHOS_TRUNK-15.0-RHEL-8-20190830.n.0
python3-ironicclient-2.7.2-0.20190529060404.266a700.el8ost.noarch
puppet-ironic-14.4.1-0.20190423121513.cd9417e.el8ost.noarch
python3-ironic-inspector-client-3.5.0-0.20190313131319.9bb1150.el8ost.noarch

How reproducible:
Erratic

Steps to Reproduce:
1. Deploy a virtual environment with 3 controllers and 3 hyperconverged compute-ceph nodes

Actual results:
Introspection does not complete within a reasonable amount of time. The node list shows the output in [1].

Attempting to set the provision state to 'provide' fails:
ironic node-set-provision-state ee3827cf-f4ae-4594-b842-ccfbf068a77e provide
The command cannot proceed while the node is in the clean_wait state.

Later, all nodes move to clean_failed [2].
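
For anyone triaging a similar run, here is a minimal sketch for picking out the nodes stuck in the cleaning states named above. It assumes (this is not from the original report) that the node list was exported as JSON, for example with `openstack baremetal node list -f json`, and that each entry carries "UUID" and "Provisioning State" keys. The function name and the second sample node are hypothetical.

```python
import json

# States named in this report. Ironic itself reports them with a space
# ("clean wait"), while the report writes an underscore, so both are accepted.
STUCK_STATES = {"clean wait", "clean failed"}


def find_stuck_nodes(node_list_json):
    """Return (uuid, state) pairs for nodes whose state is in STUCK_STATES.

    Assumes a JSON array of objects with "UUID" and "Provisioning State"
    keys; adjust the key names if your client output differs.
    """
    stuck = []
    for node in json.loads(node_list_json):
        state = (node.get("Provisioning State") or "").replace("_", " ").lower()
        if state in STUCK_STATES:
            stuck.append((node.get("UUID"), state))
    return stuck


if __name__ == "__main__":
    # Illustrative input only; the second entry is a made-up placeholder.
    sample = json.dumps([
        {"UUID": "ee3827cf-f4ae-4594-b842-ccfbf068a77e",
         "Provisioning State": "clean wait"},
        {"UUID": "hypothetical-node-2", "Provisioning State": "manageable"},
    ])
    for uuid, state in find_stuck_nodes(sample):
        print(uuid, state)
```

This only reports which nodes are affected; it does not explain why cleaning stalled, which needs the undercloud logs.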

Expected results:
Introspection should succeed.

Additional info:
[1] http://pastebin.test.redhat.com/796102
[2] http://pastebin.test.redhat.com/796105

Comment 3 Eliad Cohen 2019-09-11 12:30:01 UTC
Recreated. The failure is starting to look consistent.

Comment 6 Harald Jensås 2019-09-11 17:36:33 UTC
From https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/ceph/view/rhos/job/DFG-ceph-rhos-15_director-rhel-virthost-3cont_3hcicephall-ipv4-geneve-hcicephall-rgw/26/artifact/undercloud-0.tar.gz

2019-09-10 20:05:55.288 60812 ERROR neutron.agent.dhcp.agent + nsenter --net=/run/netns/qdhcp-f46e51ea-7e35-4ec4-9850-a2e42998ab10 --preserve-credentials -m -t 1 podman run --detach --log-driver json-file --log-opt path=/var/log/containers/stdouts/neutron-dnsmasq-qdhcp-f46e51ea-7e35-4ec4-9850-a2e42998ab10.log -v /var/lib/config-data/puppet-generated/neutron/etc/neutron:/etc/neutron:ro -v /run/netns:/run/netns:shared -v /var/lib/neutron:/var/lib/neutron:z,shared -v /dev/log:/dev/log --net host --pid host --privileged -u root --name neutron-dnsmasq-qdhcp-f46e51ea-7e35-4ec4-9850-a2e42998ab10 192.168.24.1:8787/rhosp15/openstack-neutron-dhcp-agent:20190829.2 /usr/sbin/dnsmasq -k --no-hosts --no-resolv --pid-file=/var/lib/neutron/dhcp/f46e51ea-7e35-4ec4-9850-a2e42998ab10/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/f46e51ea-7e35-4ec4-9850-a2e42998ab10/host --addn-hosts=/var/lib/neutron/dhcp/f46e51ea-7e35-4ec4-9850-a2e42998ab10/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/f46e51ea-7e35-4ec4-9850-a2e42998ab10/opts --dhcp-leasefile=/var/lib/neutron/dhcp/f46e51ea-7e35-4ec4-9850-a2e42998ab10/leases --dhcp-match=set:ipxe,175 --dhcp-userclass=set:ipxe6,iPXE --local-service --bind-dynamic --dhcp-range=set:tag0,192.168.24.0,static,255.255.255.0,86400s --dhcp-option-force=option:mtu,1500 --dhcp-lease-max=256 --conf-file= --domain=localdomain
2019-09-10 20:05:55.288 60812 ERROR neutron.agent.dhcp.agent container create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"write /proc/self/attr/keycreate: permission denied\""
2019-09-10 20:05:55.288 60812 ERROR neutron.agent.dhcp.agent : internal libpod error


I think we can mark this as a duplicate of: https://bugzilla.redhat.com/show_bug.cgi?id=1751300
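
To check other undercloud logs for the same signature, something like the following could be used. It is a rough sketch that assumes the DHCP agent log lines look like the excerpt above; the function name is hypothetical, and the match strings are copied from that excerpt rather than from any documented log format.

```python
import re

# Error text copied from the log excerpt in this comment.
KEYCREATE_ERROR = re.compile(
    r"write /proc/self/attr/keycreate: permission denied"
)
# qdhcp namespace names are "qdhcp-" followed by a 36-character UUID.
NETNS = re.compile(r"qdhcp-[0-9a-f-]{36}")


def scan_dhcp_agent_log(lines):
    """Scan neutron DHCP agent ERROR lines.

    Returns (hits, namespaces): the number of ERROR lines containing the
    keycreate failure, and the sorted qdhcp namespaces mentioned on any
    DHCP agent ERROR line (not only on the keycreate lines).
    """
    hits = 0
    namespaces = set()
    for line in lines:
        if "ERROR neutron.agent.dhcp.agent" not in line:
            continue
        namespaces.update(NETNS.findall(line))
        if KEYCREATE_ERROR.search(line):
            hits += 1
    return hits, sorted(namespaces)
```

A nonzero hit count suggests the dnsmasq container never started, in which case nodes booting on the provisioning network would presumably not get a DHCP lease.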

Comment 7 Bob Fournier 2019-09-11 17:43:04 UTC

*** This bug has been marked as a duplicate of bug 1751300 ***