Description of problem:
There is a regression in our iPXE script endpoint introduced by:
https://gerrit.beaker-project.org/c/5927/30/Server/bkr/server/systems.py#1516

Version-Release number of selected component (if applicable):
25.0

How reproducible:
always

Steps to Reproduce:
1. Provision a machine in OpenStack
2. Check whether the ipxe-script points to boot images as full HTTP URLs
3. Currently it looks as if the iPXE script downloads kernel images very, very slowly

Actual results:
OpenStack instances are all aborted.

Expected results:
OpenStack recipes passing

Additional info:
I guess it's not that the URLs are "empty", but they are wrong (not absolute URLs pointing at the lab mirror, but just the relative paths like "pxeboot/images/vmlinuz").
I've looked into what actually happens, since the machine just seems to be sitting there "downloading" the kernel by printing dots. But is it really downloading something? The access_log on beaker-devel shows (host shortened):

Apr 4 02:34:47 beaker-devel httpd[6759]: 10.8.248.114 - - [04/Apr/2018:02:34:47 +0000] "GET /systems/by-uuid/21c85716-4f3e-4e0d-81d8-4eb8c00d7a8b/ipxe-script HTTP/1.1" 200 75 "-" "iPXE/1.0.0+ (6366fa7a)"
Apr 4 02:34:47 beaker-devel httpd[6759]: 10.8.248.114 - - [04/Apr/2018:02:34:47 +0000] "GET /systems/by-uuid/21c85716-4f3e-4e0d-81d8-4eb8c00d7a8b/images/pxeboot/vmlinuz HTTP/1.1" 302 405 "-" "iPXE/1.0.0+ (6366fa7a)"

It downloads the script via HTTP, then due to the bug tries to download the kernel from beaker-devel as well. When I try this with curl and print the response headers I get:

* Trying 10.16.101.20...
* TCP_NODELAY set
* Trying 2620:52:0:1065:5054:ff:fe22:b7d9...
* TCP_NODELAY set
* Immediate connect fail for 2620:52:0:1065:5054:ff:fe22:b7d9: Network is unreachable
* Connected to beaker-devel.app.eng.bos.redhat.com (10.16.101.20) port 80 (#0)
> GET /systems/by-uuid/21c85716-4f3e-4e0d-81d8-4eb8c00d7a8b/images/pxeboot/vmlinuz HTTP/1.1
> Host: beaker-devel.app.eng.bos.redhat.com
> User-Agent: curl/7.55.1
> Accept: */*
>
< HTTP/1.1 302 Found
< Date: Wed, 04 Apr 2018 02:50:38 GMT
< Server: Apache/2.2.15 (Red Hat)
< Location: https://beaker-devel.app.eng.bos.redhat.com/systems/by-uuid/21c85716-4f3e-4e0d-81d8-4eb8c00d7a8b/images/pxeboot/vmlinuz
< Content-Length: 405
< Connection: close
< Content-Type: text/html; charset=iso-8859-1
<
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="https://beaker-devel.app.eng.bos.redhat.com/systems/by-uuid/21c85716-4f3e-4e0d-81d8-4eb8c00d7a8b/images/pxeboot/vmlinuz">here</a>.</p>
<hr>
<address>Apache/2.2.15 (Red Hat) Server at beaker-devel.app.eng.bos.redhat.com Port 80</address>
</body></html>
* Closing connection 0
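To illustrate why the kernel request lands on beaker-devel: the access log above suggests that iPXE resolves a relative image path against the URL it fetched the script from. The sketch below models that with Python's urljoin (RFC 3986 reference resolution). The lab mirror hostname is a made-up placeholder, and this is only a model of the behaviour seen in the log, not iPXE's actual code.

```python
from urllib.parse import urljoin

# URL the instance fetched the script from (taken from the access log above).
SCRIPT_URL = ("http://beaker-devel.app.eng.bos.redhat.com/systems/by-uuid/"
              "21c85716-4f3e-4e0d-81d8-4eb8c00d7a8b/ipxe-script")


def resolve(script_url, image_ref):
    """Resolve an image reference the way a relative URI is resolved
    against the URL of the document that contains it."""
    return urljoin(script_url, image_ref)


# Relative path, as emitted by the regressed endpoint.
relative = resolve(SCRIPT_URL, "images/pxeboot/vmlinuz")

# Absolute URL pointing at a (hypothetical) lab mirror, as expected.
absolute = resolve(
    SCRIPT_URL,
    "http://lab-mirror.example.com/distro/images/pxeboot/vmlinuz")

print(relative)  # ends up under /systems/by-uuid/<uuid>/ on the Beaker server
print(absolute)  # unchanged, still points at the mirror
```

The relative case reproduces the second request in the access log, which Apache then answers with the 302 to HTTPS.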
Looking at https://kimizhang.wordpress.com/2013/08/26/create-pxe-boot-image-for-openstack/ (from Bug 1395512) and into ipxe_image.py, it seems the images are not built with our root certificates as described in http://ipxe.org/crypto, which means there is no way for ipxe to follow that HTTPS location. Maybe we should?

Apart from the HTTPS problem, I guess this bug should just fix the regression at hand. Dan and I looked over the code which introduced the regression and agreed to do two things:

a) a test which uses a distro with an NFS URL provisioning OpenStack (I think there is currently no way to force this, so will have to think something up)
b) well.. the fix, which will introduce distro_tree.url_in_lab back into our code base.
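A rough sketch of what the fix and the NFS test case are getting at, under loud assumptions: the function names, the plain list of URLs, and the example hostnames below are all hypothetical stand-ins, not Beaker's actual distro_tree.url_in_lab implementation or data model. The idea is to pick the tree's HTTP URL in the lab and build an absolute image URL from it, and to fail clearly when the tree only has an NFS URL.

```python
from urllib.parse import urljoin, urlsplit


def url_in_lab(tree_urls, scheme="http"):
    """Hypothetical stand-in: return the first URL with the given scheme
    from the URLs a distro tree has in one lab, or None."""
    for url in tree_urls:
        if urlsplit(url).scheme == scheme:
            return url
    return None


def absolute_image_url(tree_urls, image_path):
    """Build an absolute HTTP URL for a boot image (kernel or initrd)."""
    base = url_in_lab(tree_urls, scheme="http")
    if base is None:
        # An NFS-only tree cannot be fetched by iPXE over HTTP.
        raise ValueError("no HTTP URL for this distro tree in the lab")
    if not base.endswith("/"):
        base += "/"  # otherwise urljoin would replace the last path segment
    return urljoin(base, image_path)


# Tree with both NFS and HTTP URLs (hostnames are placeholders).
urls = ["nfs://lab.example.com:/distros/F27/",
        "http://lab.example.com/distros/F27"]
print(absolute_image_url(urls, "images/pxeboot/vmlinuz"))
```

With an NFS-only list, absolute_image_url raises instead of silently emitting a relative path, which is the situation the test in a) is meant to exercise.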
There is no point implementing proper SSL certificate checking in the ipxe images. Everything needed for installation is already available over (insecure) plain HTTP for this reason. For example even if ipxe can validate the CA cert, Anaconda will then have the same problem (refuse to accept the internal CA cert for downloading stage2.img and packages). It is easier to just stick to plain HTTP, which is what this is *supposed* to be using.
I've run another self-test which used a few OpenStack instances. They provisioned fine and the recipes ran through without a problem. Examples:

https://beaker-devel.app.eng.bos.redhat.com/recipes/21723#task153116
https://beaker-devel.app.eng.bos.redhat.com/jobs/14176#set18843
https://beaker-devel.app.eng.bos.redhat.com/jobs/14176#set18841

This one provisioned fine, yet its tasks failed (which would be another problem):
https://beaker-devel.app.eng.bos.redhat.com/recipes/21734#task153159
Beaker 25.1 has been released: https://beaker-project.org/docs/whats-new/release-25.html#beaker-25-1