This is a rather strange bug that's also a rather big problem for our openQA deployment.

For our openQA tests we have some base hard disk images that are produced with virt-install. They're supposed to be re-generated every two weeks to prevent them being too old. However, when the script that produces them tries to rebuild them, it usually fails. If I reboot the openQA server box and run the script right away, though, it works.

I've finally got a bit more time to look at this today. The box has been up for a while and the script is failing, so I tried simply running a virt-install interactively...and it seems like what goes wrong is it never even manages to reach anaconda. It dies in dracut, failing to find /dev/root. I'll attach the virt-manager debug output, and the associated system logs.

The weird thing is that this doesn't seem to be happening on the virtually-identical openQA *staging* server - it only seems to happen on the *production* server. But I can't figure out what the difference could possibly be. Both boxes are running fairly up to date Fedora 24, with libvirt 1.3.3.2-1.fc24 and virt-install-1.4.0-3.fc24.
Created attachment 1212979: virt-install command and output (with -d)
Created attachment 1212983: journal output after starting the virt-install command
Hmm, so I think it may be a network issue: the VM doesn't seem to have network access, so it can't go out and download the installer from the repo, and never gets that far (we're doing a direct kernel boot with an `inst.repo` parameter to tell it where to get the installer from, here). If I run `ip addr` from the dracut prompt in the VM, I see:

dracut:/# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:34:43:9a brd ff:ff:ff:ff:ff:ff
    inet6 fe80::5054:ff:fe34:439a/64 scope link
       valid_lft forever preferred_lft forever

and if I try running dhclient, I get:

dracut:/# dhclient ens2
dhcp: PREINIT ens2 up
dhcp: FAIL
dracut:/#

So that seems to be the problem: ens2 is up but has no IPv4 address, and DHCP fails. But I don't see anything obviously wrong with the libvirt networking config. `virsh net-list` shows:

[root@openqa01 libvirt][PROD]# virsh net-list
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 default              active     yes           yes

[root@openqa01 libvirt][PROD]#
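For anyone wanting to spot the same symptom quickly, here is a minimal sketch of a filter for `ip addr` output. It prints each non-loopback interface that is UP but has no IPv4 (`inet`) address, which is the state ens2 is in above. The function name `ifaces_without_ipv4` is made up for illustration, and the sketch assumes `awk` is available, which may not be true in a dracut emergency shell; it can just as well be run on saved output from another machine.

```shell
#!/bin/sh
# Hypothetical helper (not part of dracut or iproute2): reads `ip addr`
# output on stdin and prints every non-loopback interface that is UP
# but has no IPv4 ("inet") address assigned.
ifaces_without_ipv4() {
    awk '
        # Print the interface seen so far if it was up and had no inet line.
        function report() {
            if (name != "" && up && !has_v4) print name
        }
        # Interface header, e.g. "2: ens2: <BROADCAST,MULTICAST,UP,LOWER_UP> ..."
        /^[0-9]+: / {
            report()
            name = $2
            sub(/:$/, "", name)
            up = ($3 ~ /[<,]UP[,>]/)
            has_v4 = 0
            # Loopback is skipped: it is not expected to get a DHCP lease.
            if ($3 ~ /LOOPBACK/) name = ""
            next
        }
        # "inet" (as opposed to "inet6") marks an IPv4 address.
        $1 == "inet" { has_v4 = 1 }
        END { report() }
    '
}
```

Usage would be something like `ip addr | ifaces_without_ipv4`. An empty result means every interface that is up has an IPv4 address; it says nothing about why a lease was refused, so it only confirms the symptom.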
Indeed, adding `--network user` to the virt-install command seems to make it fly. But I don't know why it's failing to work with the 'default' libvirt network.
Note to self: rwmjones gave me https://bugzilla.redhat.com/show_bug.cgi?id=1271183#c14 as a reference for debugging this further, whenever I can get to it.
Adam, are you still seeing this? Is it F24-specific?
I implemented the workaround I described in #c3 on the openQA boxes, so I can't tell if this is still a problem. I've got quite a lot of other stuff on my plate right now so I'm not sure I want to switch them back and wait for this to happen again...
Okay, let's close this then. Please reopen if you ever take a stab at reproducing.