Bug 1483756
Summary: | Overcloud deploy fails with: IOError: [Errno 26] Text file busy: '/var/lib/docker-puppet/docker-puppet.sh' | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Dan Yasny <dyasny> |
Component: | openstack-tripleo | Assignee: | Derek Higgins <derekh> |
Status: | CLOSED DUPLICATE | QA Contact: | Dan Yasny <dyasny> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 12.0 (Pike) | CC: | aschultz, bfournie, derekh, dprince, dyasny, m.andre, mburns, mlammon, ohochman, racedoro, rhel-osp-director-maint, sasha, tvignaud |
Target Milestone: | --- | Keywords: | TestBlocker |
Target Release: | --- | Flags: | tvignaud: needinfo+ |
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2017-09-26 20:28:25 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1434060 |
Description
Dan Yasny
2017-08-21 21:35:42 UTC
Looking at the logs on your controller node, I came across these errors:

Aug 21 21:16:39 controller-0 puppet-user[25106]: (/Stage[main]/Ironic::Pxe/Ironic::Pxe::Tftpboot_file[pxelinux.0]/File[/var/lib/ironic/tftpboot/pxelinux.0]) Could not evaluate: Could not retrieve information from environment production source(s) file:/usr/share/syslinux/pxelinux.0
Aug 21 21:16:39 controller-0 puppet-user[25106]: (/Stage[main]/Ironic::Pxe/Ironic::Pxe::Tftpboot_file[chain.c32]/File[/var/lib/ironic/tftpboot/chain.c32]) Could not evaluate: Could not retrieve information from environment production source(s) file:/usr/share/syslinux/chain.c32

I think that the image you're using for DockerIronicPxeImage might not have the syslinux package installed, and as a result these files can't be found by puppet inside the container.

I'm told syslinux-tftpboot is blacklisted and doesn't get installed in the image.

We're missing syslinux, which I think isn't getting installed because we relied on it being pulled in by syslinux-tftpboot, and it no longer is. So if syslinux-tftpboot is being blacklisted, then syslinux needs to be explicitly installed.

Thierry, can we check again whether syslinux-tftpboot is available in the repos, and otherwise install the syslinux dependency in the ironic_pxe image?
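One way to test the missing-syslinux theory is to check directly for the two source files named in the puppet errors. The following is only a minimal sketch, assuming it is run with Python inside the ironic-pxe container (or against an unpacked image filesystem passed as `root`); the helper name `missing_files` is hypothetical and not part of TripleO.

```python
import os

# Source paths taken from the puppet errors quoted above.
SYSLINUX_FILES = [
    "/usr/share/syslinux/pxelinux.0",
    "/usr/share/syslinux/chain.c32",
]


def missing_files(paths, root="/"):
    """Return the paths that do not exist as regular files under root.

    `root` is a hypothetical convenience so the check can also be pointed
    at an unpacked image filesystem instead of the live container.
    """
    missing = []
    for path in paths:
        candidate = os.path.join(root, path.lstrip("/"))
        if not os.path.isfile(candidate):
            missing.append(path)
    return missing


if __name__ == "__main__":
    for path in missing_files(SYSLINUX_FILES):
        print("missing: " + path)
```

If both paths are reported missing, that is consistent with syslinux not being installed in the image, although it does not by itself show why the package was left out.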
Currently, deployment succeeds, but in the logs on the controllers I see the following:

[root@controller-0 log]# grep -R 'Ironic::Pxe/Ironic::Pxe::Tftpboot_file' *
Binary file journal/fd80cd1d15fa450fa2dcca5ca256cb95/system.journal matches
messages:Aug 25 10:21:40 localhost puppet-user[12]: (/Stage[main]/Ironic::Pxe/Ironic::Pxe::Tftpboot_file[pxelinux.0]/File[/var/lib/ironic/tftpboot/pxelinux.0]) Could not evaluate: Could not retrieve information from environment production source(s) file:/usr/share/syslinux/pxelinux.0
messages:Aug 25 10:21:40 localhost puppet-user[12]: (/Stage[main]/Ironic::Pxe/Ironic::Pxe::Tftpboot_file[chain.c32]/File[/var/lib/ironic/tftpboot/chain.c32]) Could not evaluate: Could not retrieve information from environment production source(s) file:/usr/share/syslinux/chain.c32
messages:Aug 25 10:21:40 localhost journal: Error: /Stage[main]/Ironic::Pxe/Ironic::Pxe::Tftpboot_file[pxelinux.0]/File[/var/lib/ironic/tftpboot/pxelinux.0]: Could not evaluate: Could not retrieve information from environment production source(s) file:/usr/share/syslinux/pxelinux.0
messages:Aug 25 10:21:40 localhost journal: Error: /Stage[main]/Ironic::Pxe/Ironic::Pxe::Tftpboot_file[chain.c32]/File[/var/lib/ironic/tftpboot/chain.c32]: Could not evaluate: Could not retrieve information from environment production source(s) file:/usr/share/syslinux/chain.c32

I tried to reproduce this today using containers from the 2017-08-18.2 tag:

rhosp12/openstack-ironic-api-docker 2017-08-18.2 0b75b70186b8 8 days ago 653.4 MB
rhosp12/openstack-ironic-pxe-docker 2017-08-18.2 43bfae3afb8e 8 days ago 657.3 MB

-----

What I did was slice out the relevant elements of /var/lib/docker-puppet/docker-puppet.json into an Ironic-specific file containing just this:

[
  {
    "config_image": "172.19.0.3:8787/rhosp12/openstack-ironic-api-docker:2017-08-18.2",
    "step_config": "include ::tripleo::profile::base::ironic::api\n\ninclude ::tripleo::profile::base::database::mysql::client",
    "config_volume": "ironic_api",
    "puppet_tags": "ironic_config"
  },
  {
    "config_image": "172.19.0.3:8787/rhosp12/openstack-ironic-pxe-docker:2017-08-18.2",
    "step_config": "include ::tripleo::profile::base::ironic::conductor\n\ninclude ::tripleo::profile::base::database::mysql::client",
    "config_volume": "ironic",
    "puppet_tags": "ironic_config"
  }
]

---

I called the file ironic.json. Then I manually executed a docker-puppet run like this:

CONFIG=ironic.json NET_HOST=true python docker-puppet.py

It ran successfully, which I think means that the Ironic containers themselves should be fine as far as generating configuration files goes.

Derek: I'm curious if this satisfies your syslinux package concerns as well.

With regard to the general issue here, I'm actually wondering if this is related to concurrency issues in how we execute docker-puppet containers. There is another BZ and an upstream patch that might help in both cases:

https://bugzilla.redhat.com/show_bug.cgi?id=1456986
https://review.openstack.org/#/c/498139/

Dan Yasny reminded that this is still blocking all the OSP 12 QA tests for testing Ironic in the Overcloud. What's the recommended next step?

I can see that the patch in BZ#1456986 (https://review.openstack.org/#/c/498139/) is merged. Dan, are you suggesting rerunning this with that patch applied? Could the errors described in comment #6 be related to the same issue, or should those be tracked separately as a different one? Many thanks.

This bug, as per the summary, should be fixed by https://bugzilla.redhat.com/show_bug.cgi?id=1456986; if so, it should be closed as a duplicate. Once you are able to retest, the problem Derek was seeing in comment #1 may arise; if so, please open a new bug for that.

Dan, what's the status of this bug? Can it be closed as per comment 12?

The problem has no longer been seen and the containerized deployment completed, so closing this out. It seems most likely this was fixed elsewhere, so marking as a dup.
*** This bug has been marked as a duplicate of bug 1456986 ***

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
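For anyone repeating the reproduction from the thread, the step of slicing the Ironic entries out of /var/lib/docker-puppet/docker-puppet.json could be scripted roughly as follows. This is a sketch, not the reporter's actual method: it assumes the full file is a JSON list of objects shaped like the two entries quoted in the thread, and the function names `slice_config` and `write_slice` are hypothetical.

```python
import json


def slice_config(entries, volumes):
    """Keep only the entries whose config_volume is in `volumes`."""
    return [e for e in entries if e.get("config_volume") in volumes]


def write_slice(src, dst, volumes):
    """Read the full config list from src and write the selected entries to dst."""
    with open(src) as f:
        entries = json.load(f)
    selected = slice_config(entries, volumes)
    with open(dst, "w") as f:
        json.dump(selected, f, indent=2)
    return selected


if __name__ == "__main__":
    # Volume names taken from the ironic.json excerpt in the thread.
    write_slice(
        "/var/lib/docker-puppet/docker-puppet.json",
        "ironic.json",
        {"ironic_api", "ironic"},
    )
```

The resulting ironic.json can then be passed to docker-puppet.py through the CONFIG environment variable, as in the command quoted in the thread.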