Bug 1563072 - ipxe-script endpoint returns script with empty URLs
Summary: ipxe-script endpoint returns script with empty URLs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Beaker
Classification: Retired
Component: openstack
Version: 25
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: 25.1
Assignee: Roman Joost
QA Contact: tools-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-03 06:40 UTC by Roman Joost
Modified: 2018-04-11 05:20 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-11 05:20:06 UTC
Embargoed:


Links
Beaker Project Gerrit 6054: 'Lookup iPXE compatible URLs to generate script' (Status: MERGED, Priority: None, Last Updated: 2019-11-18 11:30:11 UTC)

Description Roman Joost 2018-04-03 06:40:54 UTC
Description of problem:

There is a regression in our IPXE script endpoint introduced by: 

https://gerrit.beaker-project.org/c/5927/30/Server/bkr/server/systems.py#1516

Version-Release number of selected component (if applicable):
25.0

How reproducible:
always

Steps to Reproduce:
1. Provision a machine in OpenStack
2. Check whether the ipxe-script points to the boot images using full HTTP URLs
3. Currently it looks as if the iPXE script downloads kernel images very, very slowly
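
Step 2 can be checked mechanically. The following is a minimal sketch, not part of Beaker: the helper name non_http_image_lines is hypothetical, and it assumes the script uses the standard iPXE "kernel" and "initrd" commands with the image URL as the first argument.

```python
from urllib.parse import urlsplit


def non_http_image_lines(script_text):
    """Return the kernel/initrd lines whose image is not an absolute http:// URL.

    Hypothetical helper for illustration only; it does not understand
    iPXE variables or any command other than 'kernel' and 'initrd'.
    """
    bad = []
    for line in script_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] in ("kernel", "initrd"):
            url = urlsplit(parts[1])
            # A relative path has neither a scheme nor a host part.
            if url.scheme != "http" or not url.netloc:
                bad.append(line)
    return bad


# Example script body (made up; lab.example.com is a placeholder host).
example_script = (
    "#!ipxe\n"
    "kernel images/pxeboot/vmlinuz console=ttyS0\n"
    "initrd http://lab.example.com/os/images/pxeboot/initrd.img\n"
    "boot\n"
)
print(non_http_image_lines(example_script))
```

An empty list means every image line already uses a full HTTP URL.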

Actual results:
OpenStack instances are all aborted.

Expected results:
OpenStack recipes passing

Additional info:

Comment 1 Dan Callaghan 2018-04-04 02:43:45 UTC
I guess it's not that the URLs are "empty", but that they are wrong (not absolute URLs pointing at the lab mirror, but just the relative paths like "images/pxeboot/vmlinuz").

Comment 2 Roman Joost 2018-04-04 03:07:23 UTC
I've looked into what actually happens, since the machine just seems to be sitting there "downloading" the kernel by printing dots. But is it really downloading something?

The access_log on beaker-devel shows (host shortened):

Apr  4 02:34:47 beaker-devel httpd[6759]: 10.8.248.114 - - [04/Apr/2018:02:34:47 +0000] "GET /systems/by-uuid/21c85716-4f3e-4e0d-81d8-4eb8c00d7a8b/ipxe-script HTTP/1.1" 200 75 "-" "iPXE/1.0.0+ (6366fa7a)"
Apr  4 02:34:47 beaker-devel httpd[6759]: 10.8.248.114 - - [04/Apr/2018:02:34:47 +0000] "GET /systems/by-uuid/21c85716-4f3e-4e0d-81d8-4eb8c00d7a8b/images/pxeboot/vmlinuz HTTP/1.1" 302 405 "-" "iPXE/1.0.0+ (6366fa7a)"

It downloads the script via HTTP, then due to the bug tries to download the kernel from beaker-devel as well. When I try this with curl and print the response headers I get:

*   Trying 10.16.101.20...
* TCP_NODELAY set
*   Trying 2620:52:0:1065:5054:ff:fe22:b7d9...
* TCP_NODELAY set
* Immediate connect fail for 2620:52:0:1065:5054:ff:fe22:b7d9: Network is unreachable
* Connected to beaker-devel.app.eng.bos.redhat.com (10.16.101.20) port 80 (#0)
> GET /systems/by-uuid/21c85716-4f3e-4e0d-81d8-4eb8c00d7a8b/images/pxeboot/vmlinuz HTTP/1.1
> Host: beaker-devel.app.eng.bos.redhat.com
> User-Agent: curl/7.55.1
> Accept: */*
>
< HTTP/1.1 302 Found
< Date: Wed, 04 Apr 2018 02:50:38 GMT
< Server: Apache/2.2.15 (Red Hat)
< Location: https://beaker-devel.app.eng.bos.redhat.com/systems/by-uuid/21c85716-4f3e-4e0d-81d8-4eb8c00d7a8b/images/pxeboot/vmlinuz
< Content-Length: 405
< Connection: close
< Content-Type: text/html; charset=iso-8859-1
<
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="https://beaker-devel.app.eng.bos.redhat.com/systems/by-uuid/21c85716-4f3e-4e0d-81d8-4eb8c00d7a8b/images/pxeboot/vmlinuz">here</a>.</p>
<hr>
<address>Apache/2.2.15 (Red Hat) Server at beaker-devel.app.eng.bos.redhat.com Port 80</address>
</body></html>
* Closing connection 0
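
The request path in the access log is what one would expect if the client resolves a relative image path against the URL of the script it just fetched. A minimal Python sketch of that resolution follows. It uses the standard RFC 3986 relative-reference rules from urllib as a stand-in for iPXE's own behaviour (an assumption), and the relative path is inferred from the log rather than copied from the generated script.

```python
from urllib.parse import urljoin

# URL the instance fetched the script from (taken from the access log above).
script_url = (
    "http://beaker-devel.app.eng.bos.redhat.com/systems/by-uuid/"
    "21c85716-4f3e-4e0d-81d8-4eb8c00d7a8b/ipxe-script"
)

# Assumed relative kernel path, inferred from the second log line.
relative_kernel = "images/pxeboot/vmlinuz"

# Resolve the relative reference against the script URL.
kernel_url = urljoin(script_url, relative_kernel)
print(kernel_url)
```

The result points at beaker-devel rather than at the lab mirror, which matches the second GET in the access log.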

Comment 3 Roman Joost 2018-04-04 03:23:51 UTC
Looking at https://kimizhang.wordpress.com/2013/08/26/create-pxe-boot-image-for-openstack/ (from Bug 1395512) and at ipxe_image.py, it seems the images are not built with our root certificates (see http://ipxe.org/crypto), which means there is no way for iPXE to follow that HTTPS location.

Maybe we should?

Apart from the HTTPS problem, I guess this bug should just fix the regression at hand. Dan and I looked over the code which introduced the regression and agreed to do two things:

a) a test which uses a distro with an NFS URL provisioning OpenStack (I think there is currently no way to force this, so will have to think something up)
b) well... the fix, which will reintroduce distro_tree.url_in_lab into our code base.
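
The idea behind (b) can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Beaker code: pick_http_url and render_ipxe_script are hypothetical names, the list of tree URLs stands in for whatever distro_tree.url_in_lab looks up, and lab.example.com is a placeholder host.

```python
from urllib.parse import urljoin, urlsplit


def pick_http_url(tree_urls):
    """Return the first plain-HTTP URL in tree_urls, or None (hypothetical helper)."""
    for url in tree_urls:
        if urlsplit(url).scheme == "http":
            return url
    return None


def render_ipxe_script(tree_urls, kernel_path, initrd_path, kernel_options=""):
    """Build an iPXE script whose image URLs are absolute HTTP URLs."""
    base = pick_http_url(tree_urls)
    if base is None:
        raise ValueError("no HTTP URL available for this distro tree")
    # urljoin drops the last path segment unless the base ends with a slash.
    if not base.endswith("/"):
        base += "/"
    kernel_url = urljoin(base, kernel_path)
    initrd_url = urljoin(base, initrd_path)
    lines = [
        "#!ipxe",
        ("kernel %s %s" % (kernel_url, kernel_options)).rstrip(),
        "initrd %s" % initrd_url,
        "boot",
    ]
    return "\n".join(lines) + "\n"


# A tree that is available over both NFS and HTTP (made-up URLs).
urls = [
    "nfs://lab.example.com/distros/f27/os/",
    "http://lab.example.com/distros/f27/os",
]
script = render_ipxe_script(
    urls, "images/pxeboot/vmlinuz", "images/pxeboot/initrd.img", "console=ttyS0"
)
print(script)
```

Raising an error when no HTTP URL exists is a design choice of this sketch; it avoids silently falling back to a relative path, which is the failure mode described in this bug.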

Comment 4 Dan Callaghan 2018-04-04 04:10:24 UTC
There is no point implementing proper SSL certificate checking in the ipxe images. Everything needed for installation is already available over (insecure) plain HTTP for this reason. For example even if ipxe can validate the CA cert, Anaconda will then have the same problem (refuse to accept the internal CA cert for downloading stage2.img and packages). It is easier to just stick to plain HTTP, which is what this is *supposed* to be using.

Comment 6 Roman Joost 2018-04-10 00:58:16 UTC
I've run another self-test which used a few OpenStack instances. They provisioned fine and the recipes ran through without a problem. Examples:

https://beaker-devel.app.eng.bos.redhat.com/recipes/21723#task153116
https://beaker-devel.app.eng.bos.redhat.com/jobs/14176#set18843
https://beaker-devel.app.eng.bos.redhat.com/jobs/14176#set18841

This one provisioned fine, yet its tasks failed (which would be a separate problem):
https://beaker-devel.app.eng.bos.redhat.com/recipes/21734#task153159

Comment 7 Roman Joost 2018-04-11 05:20:06 UTC
Beaker 25.1 has been released:

https://beaker-project.org/docs/whats-new/release-25.html#beaker-25-1

