Bug 1226097
Summary: | rhel-osp-director: The overcloud deployment times out. | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Alexander Chuzhoy <sasha> |
Component: | rhosp-director | Assignee: | Lucas Alvares Gomes <lmartins> |
Status: | CLOSED ERRATA | QA Contact: | Alexander Chuzhoy <sasha> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | unspecified | CC: | bnemec, calfonso, dmacpher, jprovazn, kbasil, mburns, rhel-osp-director-maint, sasha |
Target Milestone: | ga | ||
Target Release: | Director | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | overcloud-full-7.0-18 | Doc Type: | Bug Fix |
Doc Text: | The grub configuration set the kernel parameters to redirect the console to a serial port that might not be present. As a result, the node failed to boot. This fix disables console redirection to the serial port by default. The node now boots successfully. |
Last Closed: | 2015-08-05 13:52:10 UTC | Type: | Bug |
Description
Alexander Chuzhoy
2015-05-29 02:13:12 UTC
I'm hitting the same issue when using prebuilt images: http://download.devel.redhat.com/brewroot/work/tasks/5732/9275732/overcloud-full.tar

ssh to the OC nodes doesn't work because cloud-init failed. Here is the suspect part of /var/log/messages from the OC controller node:

May 29 08:42:22 localhost systemd: Started D-Bus System Message Bus.
May 29 08:42:22 localhost systemd: cloud-init-local.service: main process exited, code=exited, status=209/STDOUT
May 29 08:42:22 localhost systemd: Failed to start Initial cloud-init job (pre-networking).
May 29 08:42:22 localhost systemd: Dependency failed for Cloud-config availability.
May 29 08:42:22 localhost systemd: Dependency failed for Execute cloud user/final scripts.
May 29 08:42:22 localhost systemd:
May 29 08:42:22 localhost systemd: Dependency failed for Apply the settings specified in cloud-config.
May 29 08:42:22 localhost systemd:
May 29 08:42:22 localhost systemd:
May 29 08:42:22 localhost systemd: Unit cloud-init-local.service entered failed state.

Related BZ can be found here: https://bugzilla.redhat.com/show_bug.cgi?id=974285

If I compare /etc/default/grub on an older VM that works with the new one that fails, I can see that:

GRUB_CMDLINE_LINUX="crashkernel=auto console=tty0 no_timer_check net.ifnames=0 console=ttyS0,115200n8"

is now:

GRUB_CMDLINE_LINUX="crashkernel=auto console=tty0 console=ttyS0,115200 rhgb quiet"

I'm not saying this causes the problem, but https://bugzilla.redhat.com/show_bug.cgi?id=974285#c2 suggests the issue might be related to the console setting, so it is worth investigating this path further.

This can be worked around by disabling localboot, which results in a kernel cmdline of:

[root@ov-xz5nsxuty5-0-wnvnwultiimx-novacompute-t4ezwe664dah ~]# cat /proc/cmdline
root=UUID=79711e89-b049-4d78-bd01-2b4f5042af08 ro text nofb nomodeset vga=normal

I also confirmed that just manually removing all of the console params from the grub cmdline at boot fixes the problem. This resulted in the following cmdline:

[root@ov-bh6esbx7vo-0-7gcluaz4jocg-novacompute-cmi5uggriiux heat-admin]# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-3.10.0-229.4.2.el7.x86_64 root=UUID=79711e89-b049-4d78-bd01-2b4f5042af08 ro crashkernel=auto

I don't know what implications that would have for baremetal though.

As Ben wrote, updating CMDLINE in /etc/default/grub fixes the issue, in particular removing "console=ttyS0,115200" from the line. It seems that the content of /etc/default/grub depends on (is generated from) the host where the overcloud images are built, so this file can look different on different machines. In tripleo, a "vm" element is used when building VM images to make sure that working params are set: https://github.com/openstack/diskimage-builder/blob/master/elements/vm/finalise.d/51-bootloader#L146 But I can't confirm these params work for baremetal too.

After discussing this with the ironic folks, I re-assigned the BZ to Lucas because this is ironic-related (localboot option).

Can you please retest this now? Also, is this only on virtual machines? If so, I don't think this would be a blocker. Can you confirm?

The issue didn't reproduce for me on the last puddle. Also, the command to deploy the overcloud has changed to: openstack overcloud postconfig "[Overcloud IP]"

Verified: Environment: instack-undercloud-2.1.2-1.el7ost.noarch

Resolving based on comment #9.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2015:1549
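
For anyone stuck on an older image, the workaround discussed in this report (removing the serial console argument from GRUB_CMDLINE_LINUX in /etc/default/grub) could be scripted roughly as follows. This is only a sketch, not the change that shipped in the erratum: the helper name strip_serial_console and the regular expression are assumptions made for illustration, and the function only rewrites a single line of text rather than touching any file.

```python
import re

# Assumption: a serial console argument looks like "console=ttyS0,115200n8".
# "console=tty0" (the virtual terminal) is deliberately not matched.
SERIAL_CONSOLE = re.compile(r"^console=ttyS\d+(,\S*)?$")

PREFIX = 'GRUB_CMDLINE_LINUX="'


def strip_serial_console(grub_line: str) -> str:
    """Return the line with serial console arguments removed.

    Only a GRUB_CMDLINE_LINUX="..." assignment is rewritten; any other
    line is returned unchanged. A trailing newline is preserved.
    """
    body = grub_line.rstrip("\n")
    newline = grub_line[len(body):]
    if not (body.startswith(PREFIX) and body.endswith('"') and len(body) > len(PREFIX)):
        return grub_line
    args = body[len(PREFIX):-1].split()
    kept = [arg for arg in args if not SERIAL_CONSOLE.match(arg)]
    return PREFIX + " ".join(kept) + '"' + newline


if __name__ == "__main__":
    failing = ('GRUB_CMDLINE_LINUX="crashkernel=auto console=tty0 '
               'console=ttyS0,115200 rhgb quiet"')
    print(strip_serial_console(failing))
```

Editing /etc/default/grub alone does not change the running boot configuration. On RHEL 7 the grub configuration normally has to be regenerated afterwards (for example with grub2-mkconfig) and the node rebooted. Whether dropping the serial console is acceptable on baremetal was left open in the comments above.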