Hide Forgot
Created attachment 1229768 [details] log from one ramdisk Description of problem: When deploying the nodes boot on the ramdisk but are not provisioned because ironic python agent is not able to start the ISCSI daemon and just give up. Version-Release number of selected component (if applicable): rhosp-director-images-ipa-10.0-20161130.1.el7ost.noarch rhosp-director-images-10.0-20161130.1.el7ost.noarch How reproducible: Every time Steps to Reproduce: 1. Run introspection 2. Run deployment 3. Actual results: Deployment failed with a bunch of Message: No valid host was found. There are not enough hosts available., Code: 500" All the nodes are affected. Expected results: Deployment working. Additional info: I've added the log from a ramdisk. The last part is the interesting one: Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.661 1373 INFO ironic_python_agent.agent [-] Lookup succeeded, node UUID is 8fe14370-2007-4733-9a23-2a55eef3847c Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.662 1373 INFO ironic_python_agent.agent [-] starting heartbeater Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.702 1373 INFO ironic_python_agent.agent [-] heartbeat successful Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.702 1373 INFO ironic_python_agent.agent [-] sleeping before next heartbeat, interval: 128.252775721 Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.797 1373 INFO ironic_lib.disk_utils [-] Disk metadata on /dev/sda successfully destroyed for node 8fe14370-2007-4733-9a23-2a55eef3847c Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.802 1373 WARNING ironic_python_agent.extensions.iscsi [-] Linux-IO is not available, falling back to TGT. Error: Cannot mount configfs. Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.807 1373 ERROR ironic_python_agent.extensions.iscsi [-] Unable to start the ISCSI daemon Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root [-] Command execution error: 'exceptions.OSError' object has no attribute 'exit_code' Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root Traceback (most recent call last): Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root File "/usr/lib/python2.7/site-packages/ironic_python_agent/extensions/base.py", line 251, in execute_command Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root result = ext.execute(command_part, **kwargs) Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root File "/usr/lib/python2.7/site-packages/ironic_python_agent/extensions/base.py", line 204, in execute Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root return cmd(**kwargs) Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root File "/usr/lib/python2.7/site-packages/ironic_python_agent/extensions/base.py", line 320, in wrapper Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root result = func(self, **command_params) Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root File "/usr/lib/python2.7/site-packages/ironic_python_agent/extensions/iscsi.py", line 181, in start_iscsi_target Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root _start_tgtd(iqn, portal_port, device) Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root File "/usr/lib/python2.7/site-packages/ironic_python_agent/extensions/iscsi.py", line 57, in _start_tgtd Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root _execute(['tgtd'], "Unable to start the ISCSI daemon") Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root File "/usr/lib/python2.7/site-packages/ironic_python_agent/extensions/iscsi.py", line 42, in _execute Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root raise errors.ISCSICommandError(error_msg, e.exit_code, Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root AttributeError: 'exceptions.OSError' object has no attribute 'exit_code' Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 ERROR root Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 2016-12-08 21:15:32.808 1373 INFO root [-] Command iscsi.start_iscsi_target completed: Command name: iscsi.start_iscsi_target, params: {u'iqn': u'iqn.2008-10.org.openstack:8fe14370-2007-4733-9a23-2a55eef3847c', u'wipe_disk_metadata': True}, status: FAILED, result: None. Dec 08 21:15:32 host-172-17-9-207 ironic-python-agent[1373]: 172.17.9.254 - - [08/Dec/2016 21:15:32] "POST /v1/commands?wait=true HTTP/1.1" 200 407
Note, this is using a test NetworkManager build. I'm seeing the same problem with my build, I think it the way the rpm's were added to the ramdisk image and the file permissions set up. Prior to the errors above I see these NetworkManager errors. Dec 08 21:31:56 localhost.localdomain nm-dispatcher[428]: find-scripts: Cannot execute '/etc/NetworkManager/dispatcher.d/04-iscsi': not owned by root. Dec 08 21:31:56 localhost.localdomain NetworkManager[370]: <info> [1481250716.3660] Loaded device plugin: NMBondFactory (internal) Dec 08 21:31:56 localhost.localdomain nm-dispatcher[428]: find-scripts: Cannot execute '/etc/NetworkManager/dispatcher.d/20-chrony': not owned by root. Dec 08 21:31:56 localhost.localdomain nm-dispatcher[428]: find-scripts: Cannot execute '/etc/NetworkManager/dispatcher.d/11-dhclient': not owned by root. Dec 08 21:32:07 host-10-8-146-13 nm-dispatcher[2102]: find-scripts: Cannot execute '/etc/NetworkManager/dispatcher.d/00-netreport': not owned by root. Dec 08 21:32:07 host-10-8-146-13 nm-dispatcher[2102]: find-scripts: Cannot execute '/etc/NetworkManager/dispatcher.d/04-iscsi': not owned by root. Dec 08 21:32:07 host-10-8-146-13 nm-dispatcher[2102]: find-scripts: Cannot execute '/etc/NetworkManager/dispatcher.d/20-chrony': not owned by root. Dec 08 21:32:07 host-10-8-146-13 nm-dispatcher[2102]: find-scripts: Cannot execute '/etc/NetworkManager/dispatcher.d/11-dhclient': not owned by root. Dec 08 21:32:07 host-10-8-146-13 nm-dispatcher[2102]: req:1 'connectivity-change': new request (0 scripts) Dec 08 21:32:07 host-10-8-146-13 nm-dispatcher[2102]: req:1 'connectivity-change': completed: no scripts Dec 08 21:32:07 host-10-8-146-13 nm-dispatcher[2102]: find-scripts: Cannot execute '/etc/NetworkManager/dispatcher.d/00-netreport': not owned by root. Dec 08 21:32:07 host-10-8-146-13 nm-dispatcher[2102]: find-scripts: Cannot execute '/etc/NetworkManager/dispatcher.d/04-iscsi': not owned by root.
I think this should be closed as notabug as its just due to the way the NetworkManager packages were added to the image for test.
Closing this as it was an issue with the way the ramdisk image was generated using new NetworkManager rpms.