Bug 2068361
| Summary: | Can't set static IP for all network ports with --mac option when windows guest which has multiple networks | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | mxie <mxie> |
| Component: | virt-v2v | Assignee: | Virtualization Maintenance <virt-maint> |
| Status: | CLOSED MIGRATED | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 9.1 | CC: | chhu, hongzliu, juzhou, lersek, marcandre.lureau, mkedzier, rjones, tyan, tzheng, vwu, xiaodwan |
| Target Milestone: | rc | Keywords: | MigratedToJIRA, Triaged |
| Target Release: | --- | Flags: | pm-rhel: mirror+ |
| Hardware: | x86_64 | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-07-07 21:01:45 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
mxie@redhat.com
2022-03-25 04:34:27 UTC
mxie, is this a new test that we've recently added, or is it a bug/regression in existing working functionality that happened with virt-v2v in 9.0 or 9.1?

(In reply to Richard W.M. Jones from comment #4)
> mxie, is this a new test that we've recently added, or is it a
> bug/regression in existing working functionality that happened with
> virt-v2v in 9.0 or 9.1?

I didn't find the same scenario in the existing test cases and bugs. We used to test the --mac option only for one network port of a Windows guest. Sorry, the testing of --mac was not very comprehensive, so this is a new test case.

Thanks - I just wanted to exclude the recent changes we made to firstboot scripts.

But having said that, the scripts are running in an unexpected order (refer to firstboot-log.png screenshot):

2500-0005-v2v-netcf-ps1.bat # configure the static IP addrs
5000-0001-wait-pnp.bat # run wait-pnp
5000-0002-install-qemu-ga-x86_64-msi-ps1.bat # install qemu-ga

Laszlo, shouldn't wait-pnp run before any network configuration since (AIUI) that is responsible for waiting for virtio-net drivers to be installed?

Without any testing, my intuition would be something like this:

diff --git a/convert/convert_windows.ml b/convert/convert_windows.ml
index b53312a9ab..4731d75234 100644
--- a/convert/convert_windows.ml
+++ b/convert/convert_windows.ml
@@ -410,7 +410,12 @@ let convert (g : G.guestfs) _ inspect _ static_ips =
(String.replace_char pnp_wait_path '/' '\\')
reg_restore_str in
- Firstboot.add_firstboot_script g inspect.i_root "wait pnp" fb_script;
+ (* Set the priority to something low because we want to wait
+ * for drivers to be installed before doing any other
+ * firstboot scripts.
+ *)
+ Firstboot.add_firstboot_script g inspect.i_root "wait pnp"
+ ~prio:1000 fb_script;
(* add_firstboot_script has created the path already. *)
g#upload tool_path (g#case_sensitive_path pnp_wait_path)
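
For reference, the intended effect of the `~prio` argument in the diff above can be sketched as follows. This is a minimal Python model, not virt-v2v code. It assumes that firstboot script names have the form `PPPP-NNNN-name.bat` (as in the listing from the firstboot log) and that the firstboot service runs them in lexical filename order. `script_name` and `run_order` are hypothetical helpers, and the sequence numbers are copied from the listing rather than computed.

```python
def script_name(prio: int, seq: int, name: str) -> str:
    """Build a firstboot file name of the assumed form PPPP-NNNN-name.bat."""
    return f"{prio:04d}-{seq:04d}-{name}.bat"


def run_order(names):
    """Assumption: the firstboot service runs scripts in lexical filename order."""
    return sorted(names)


# Names as observed in the firstboot log: wait-pnp carries priority 5000.
before = [
    script_name(2500, 5, "v2v-netcf-ps1"),
    script_name(5000, 1, "wait-pnp"),
    script_name(5000, 2, "install-qemu-ga-x86_64-msi-ps1"),
]

# The same scripts with wait-pnp moved to priority 1000, as the diff proposes.
after = [
    script_name(2500, 5, "v2v-netcf-ps1"),
    script_name(1000, 1, "wait-pnp"),
    script_name(5000, 2, "install-qemu-ga-x86_64-msi-ps1"),
]

for label, names in (("before", before), ("after", after)):
    print(label)
    for n in run_order(names):
        print("  " + n)
```

Under these assumptions, any priority below 2500 places wait-pnp ahead of the network configuration script. The model only covers ordering; it does not show whether wait-pnp actually waits long enough.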
(In reply to Richard W.M. Jones from comment #6)
> Thanks - I just wanted to exclude the recent changes we made to firstboot
> scripts.
> But having said that, the scripts are running in an unexpected order (refer
> to firstboot-log.png screenshot):
>
> 2500-0005-v2v-netcf-ps1.bat # configure the static IP addrs
> 5000-0001-wait-pnp.bat # run wait-pnp
> 5000-0002-install-qemu-ga-x86_64-msi-ps1.bat # install qemu-ga

This order is not unexpected -- please see:

- https://bugzilla.redhat.com/show_bug.cgi?id=1788823#c24 (my testing)
- https://bugzilla.redhat.com/show_bug.cgi?id=1788823#c33 (mxie's testing)

In case virt-v2v installs the virtio-net driver, "v2vnetcf.ps1" includes the following snippet:

(* If virtio-net was added to the registry, we must wait for
 * it to be installed at runtime.
 *)
if net_driver = Virtio_net then (
  add "# Wait for the netkvm (virtio-net) driver to become active.";
  add "$adapters = @()";
  add "While (-Not $adapters) {";
  add "  Start-Sleep -Seconds 5";
  add "  $adapters = Get-NetAdapter -Physical \
        | Where DriverFileName -eq \"netkvm.sys\"";
  add "  Write-Host \"adapters = '$adapters'\"";
  add "}";
  add ""
);

This was supposed to wait until the virtio-net device (singular, I guess...) becomes available. Only thereafter is the rest of the script reached, which sets the IP address(es).

In the generated OVF, we have:

<Item>
  <rasd:InstanceId>924ebb56-b5d7-4a4a-b42c-7967ccec6c03</rasd:InstanceId>
  <rasd:Caption>Ethernet adapter on ovirtmgmt</rasd:Caption>
  <rasd:ResourceType>10</rasd:ResourceType>
  <rasd:ResourceSubType>3</rasd:ResourceSubType>
  <Type>interface</Type>
  <rasd:Connection>ovirtmgmt</rasd:Connection>
  <rasd:Name>eth0</rasd:Name>
  <rasd:MACAddress>00:50:56:83:6b:2e</rasd:MACAddress>
</Item>
<Item>
  <rasd:InstanceId>bba4928d-6678-466f-96a2-f213793ad633</rasd:InstanceId>
  <rasd:Caption>Ethernet adapter on ovirtmgmt</rasd:Caption>
  <rasd:ResourceType>10</rasd:ResourceType>
  <rasd:ResourceSubType>3</rasd:ResourceSubType>
  <Type>interface</Type>
  <rasd:Connection>ovirtmgmt</rasd:Connection>
  <rasd:Name>eth1</rasd:Name>
  <rasd:MACAddress>00:50:56:83:43:74</rasd:MACAddress>
</Item>
<Item>
  <rasd:InstanceId>effa389c-0631-4420-90f0-7a9b45c7fefa</rasd:InstanceId>
  <rasd:Caption>Ethernet adapter on ovirtmgmt</rasd:Caption>
  <rasd:ResourceType>10</rasd:ResourceType>
  <rasd:ResourceSubType>3</rasd:ResourceSubType>
  <Type>interface</Type>
  <rasd:Connection>ovirtmgmt</rasd:Connection>
  <rasd:Name>eth2</rasd:Name>
  <rasd:MACAddress>00:50:56:83:94:21</rasd:MACAddress>
</Item>

with ResourceSubType=3 in all three <Item>s, so that's three virtio-net devices.

I guess the "-Not $adapters" condition in the "while" statement of the powershell script fragment above evaluates to false as soon as there is just one adapter driven by "netkvm.sys", so the loop doesn't wait for the other two.

To confirm this, we'd need Program Files\Guestfs\Firstboot\log.txt from the destination domain -- mxie, can you please attach that? (Setting needinfo for this.) In particular, we should be looking for the messages "setting IP address of adapter at ...". For an example, refer to the attachment in <https://bugzilla.redhat.com/show_bug.cgi?id=1788823#c25>.

>
> Laszlo, shouldn't wait-pnp run before any network configuration since (AIUI)
> that is responsible for waiting for virtio-net drivers to be installed?

We only recently restored wait-pnp (for the same bug 1788823, in the same patch series), so this is not a "regression due to running wait-pnp in the wrong spot" (as we wouldn't run wait-pnp previously at all).

Also this test case, with multiple interfaces, is new -- the original report in bug 1788823 describes a single interface only --, so I don't think the symptom is a regression for any reason at all. We don't have a multi-interface baseline to compare against.

Now, under the *previous* symptom (that is, when, according to bug 1788823, "v2vnetcf.ps1" would run simply too late, even if there was just one interface), this generally unwanted delay could have masked the issue with the "-Not $adapters" loop condition. Because, by the time one interface had been found, all other interfaces too would have been found.

Now that we eliminated the delay for bug 1788823, the insufficiency of "-Not $adapters" (when using multiple adapters) has been laid bare, and we're not compensating for that with wait-pnp either.

I have no idea how to fix "-Not $adapters", as we can't check for "all" interfaces (we don't know how many of them exist). We can try moving wait-pnp to the front (per comment 7), but it's just a guess (and in case it does work, then the "-Not $adapters" while loop becomes redundant).

I guess I'll try to prepare a scratch build for mxie to test.

Hi Laszlo, the attachment logs are for scenario2 of comment0

(In reply to mxie from comment #9)
> Hi Laszlo, the attachment logs are for scenario2 of comment0

Thanks, Firstboot/log.txt indeed *only* reports:

adapters = 'MSFT_NetAdapter (CreationClassName = "MSFT_NetAdapter", DeviceID = "{5FA53030-26A7-460F-B260-329AD8958322}", SystemCreationClassName = "CIM_NetworkPort", SystemName = "DESKTOP-GSJ1QBH")'
setting IP address of adapter at 7

so it does not look beyond the first adapter that's driven by netkvm.sys.

I'm unable to fix this bug, and I'm tempted to say that fixing it is impossible.

I've prepared two patches:

> commit 9da272effab9a4dcb93709e1761753e36c206e75
> Author: Richard W.M. Jones <rjones>
> Date: Mon Mar 28 10:52:58 2022 +0200
>
> convert_windows: move "wait pnp" guest script to the front
>
> We want to wait for drivers to be installed before doing any other
> firstboot scripts.
>
> Original patch by Rich; I've replaced priority 1000 with 1250 (by halving
> the 2500 that we use with "v2vnetcf.ps1").
>
> https://bugzilla.redhat.com/show_bug.cgi?id=2068361
> Signed-off-by: Laszlo Ersek <lersek>
>
> diff --git a/convert/convert_windows.ml b/convert/convert_windows.ml
> index b53312a9aba4..d891776a7a0f 100644
> --- a/convert/convert_windows.ml
> +++ b/convert/convert_windows.ml
> @@ -410,7 +410,12 @@ let convert (g : G.guestfs) _ inspect _ static_ips =
>  (String.replace_char pnp_wait_path '/' '\\')
>  reg_restore_str in
>
> - Firstboot.add_firstboot_script g inspect.i_root "wait pnp" fb_script;
> + (* Set the priority to something low because we want to wait
> +  * for drivers to be installed before doing any other
> +  * firstboot scripts.
> +  *)
> + Firstboot.add_firstboot_script g inspect.i_root "wait pnp"
> +   ~prio:1250 fb_script;
>  (* add_firstboot_script has created the path already. *)
>  g#upload tool_path (g#case_sensitive_path pnp_wait_path)

and

> commit f5b8cd0cac57e0cd516bebb43631332d8e6b5f6e (HEAD -> wait-for-all-virtio-nics-rhbz-2068361)
> Author: Laszlo Ersek <lersek>
> Date: Mon Mar 28 10:55:28 2022 +0200
>
> convert_windows: wait for all virtio-net NICs in "v2vnetcf.ps1"
>
> Currently the loop at the top of "v2vnetcf.ps1" waits only until the first
> virtio-net NIC shows up. Because on the host side we know the precise set
> of virtio-net NICs (by MAC) that we want to assign static IP addresses to,
> generate a separate wait loop for each MAC. At the end of the last loop,
> all NICs will have shown up.
>
> While at it, don't needlessly wait 5 seconds before checking each NIC.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=2068361
> Signed-off-by: Laszlo Ersek <lersek>
>
> diff --git a/convert/convert_windows.ml b/convert/convert_windows.ml
> index d891776a7a0f..d80a0a3a20c1 100644
> --- a/convert/convert_windows.ml
> +++ b/convert/convert_windows.ml
> @@ -689,15 +689,24 @@ let convert (g : G.guestfs) _ inspect _ static_ips =
>   * it to be installed at runtime.
>   *)
>  if net_driver = Virtio_net then (
> - add "# Wait for the netkvm (virtio-net) driver to become active.";
> - add "$adapters = @()";
> - add "While (-Not $adapters) {";
> - add "  Start-Sleep -Seconds 5";
> - add "  $adapters = Get-NetAdapter -Physical \
> -   | Where DriverFileName -eq \"netkvm.sys\"";
> - add "  Write-Host \"adapters = '$adapters'\"";
> - add "}";
> - add ""
> + add "# Wait for the netkvm (virtio-net) driver to become active";
> + add "# on each NIC for which we have a static IP address.";
> + List.iter (
> +   fun { if_mac_addr } ->
> +     let psmac = (String.replace if_mac_addr ":" "-") in
> +     let fetch_adapters =
> +       sprintf
> +         "$adapters = Get-NetAdapter -Physical \
> +          | Where {($_.DriverFileName -eq \"netkvm.sys\") -and \
> +          ($_.MacAddress -eq \"%s\")}" psmac in
> +     add fetch_adapters;
> +     add "While (-Not $adapters) {";
> +     add "  Start-Sleep -Seconds 5";
> +     add (sprintf "  %s" fetch_adapters);
> +     add "}";
> +     add (sprintf "Write-Host \"Found %s\"" if_mac_addr);
> +     add ""
> + ) static_ips;
>  );
>
>  List.iter (

My command line is:

> $ VIRT_TOOLS_DATA_DIR=/usr/i686-w64-mingw32/sys-root/mingw/bin r-virt-v2v ./run \
>   virt-v2v -i libvirt -o libvirt -on converted \
>   --mac 52:54:00:32:05:e9:ip:192.168.122.197,192.168.122.1,24 \
>   --mac 52:54:00:98:d2:52:ip:192.168.122.198,192.168.122.1,24 \
>   --mac 52:54:00:ac:85:e6:ip:192.168.122.199,192.168.122.1,24 \
>   win2019

and the generated script ("v2vnetcf.ps1") is:

> # Uncomment this line for lots of debug output.
> # Set-PSDebug -Trace 1
>
> # Wait for the netkvm (virtio-net) driver to become active
> # on each NIC for which we have a static IP address.
> $adapters = Get-NetAdapter -Physical | Where {($_.DriverFileName -eq "netkvm.sys") -and ($_.MacAddress -eq "52-54-00-32-05-e9")}
> While (-Not $adapters) {
>   Start-Sleep -Seconds 5
>   $adapters = Get-NetAdapter -Physical | Where {($_.DriverFileName -eq "netkvm.sys") -and ($_.MacAddress -eq "52-54-00-32-05-e9")}
> }
> Write-Host "Found 52:54:00:32:05:e9"
>
> $adapters = Get-NetAdapter -Physical | Where {($_.DriverFileName -eq "netkvm.sys") -and ($_.MacAddress -eq "52-54-00-98-d2-52")}
> While (-Not $adapters) {
>   Start-Sleep -Seconds 5
>   $adapters = Get-NetAdapter -Physical | Where {($_.DriverFileName -eq "netkvm.sys") -and ($_.MacAddress -eq "52-54-00-98-d2-52")}
> }
> Write-Host "Found 52:54:00:98:d2:52"
>
> $adapters = Get-NetAdapter -Physical | Where {($_.DriverFileName -eq "netkvm.sys") -and ($_.MacAddress -eq "52-54-00-ac-85-e6")}
> While (-Not $adapters) {
>   Start-Sleep -Seconds 5
>   $adapters = Get-NetAdapter -Physical | Where {($_.DriverFileName -eq "netkvm.sys") -and ($_.MacAddress -eq "52-54-00-ac-85-e6")}
> }
> Write-Host "Found 52:54:00:ac:85:e6"
>
> $mac_address = '52-54-00-32-05-e9'
> $ifindex = (Get-NetAdapter -Physical | Where MacAddress -eq $mac_address).ifIndex
> if ($ifindex) {
>   Write-Host "setting IP address of adapter at $ifindex"
>   New-NetIPAddress -InterfaceIndex $ifindex -IPAddress '192.168.122.197' -DefaultGateway '192.168.122.1' -PrefixLength 24
> }
>
> $mac_address = '52-54-00-98-d2-52'
> $ifindex = (Get-NetAdapter -Physical | Where MacAddress -eq $mac_address).ifIndex
> if ($ifindex) {
>   Write-Host "setting IP address of adapter at $ifindex"
>   New-NetIPAddress -InterfaceIndex $ifindex -IPAddress '192.168.122.198' -DefaultGateway '192.168.122.1' -PrefixLength 24
> }
>
> $mac_address = '52-54-00-ac-85-e6'
> $ifindex = (Get-NetAdapter -Physical | Where MacAddress -eq $mac_address).ifIndex
> if ($ifindex) {
>   Write-Host "setting IP address of adapter at $ifindex"
>   New-NetIPAddress -InterfaceIndex $ifindex -IPAddress '192.168.122.199' -DefaultGateway '192.168.122.1' -PrefixLength 24
> }
>

The script does *not* work at first boot (it has no effect, but more on this later). If I run it manually, at second boot, it works *perfectly*.

Here are the various ways in which the script does not work at first boot:

(1) The contents of the "/Program Files/Guestfs/Firstboot/scripts-done" directory are as follows (after first boot, i.e. after the IP assignments fail, and I power down the guest, and check its files with guestfish):

> -rwxrwxrwx 1 root root 112 Mar 28 13:46 1250-0001-wait-pnp.bat
> -rwxrwxrwx 2 root root 123 Mar 28 2022 1250-0001-wait-pnp.log
> -rwxrwxrwx 1 root root 112 Mar 28 13:47 2500-0003-v2vnetcf-ps1.bat
> -rwxrwxrwx 1 root root 130 Mar 28 13:46 5000-0002-install-qemu-ga-x86_64-msi-ps1.bat

(2) The file "1250-0001-wait-pnp.log" has the following contents:

> start waiting for PnP to complete @ Mon Mar 28 13:53:52 2022
> done waiting for PnP to complete @ Mon Mar 28 13:53:52 2022

Note that it runs first and completes within a second.

(3) The contents of the log file at /Program Files/Guestfs/Firstboot/log.txt are:

> starting firstboot service
> running "C:\Program Files\Guestfs\Firstboot\scripts\1250-0001-wait-pnp.bat"
> 1 file(s) moved.
> Wait for PnP to complete
> .... exit code 0
> running "C:\Program Files\Guestfs\Firstboot\scripts\2500-0003-v2vnetcf-ps1.bat"
> 1 file(s) moved.
> .... exit code 1
> running "C:\Program Files\Guestfs\Firstboot\scripts\5000-0002-install-qemu-ga-x86_64-msi-ps1.bat"
> 1 file(s) moved.
> Removing any previously scheduled qemu-ga installation
> ERROR: The network address is invalid.
> Scheduling delayed installation of qemu-ga from qemu-ga-x86_64.msi
> ERROR: The network address is invalid.
> .... exit code 0
> uninstalling firstboot service
> OpenSCManager failed (1115)
> starting firstboot service
> uninstalling firstboot service
> Service uninstalled successfully

Note that:

(3a) "2500-0003-v2vnetcf-ps1.bat" exits with status 1 (that is, it fails),

(3b) the two messages

> ERROR: The network address is invalid

which come presumably from "v2vnetcf.ps1", are *intermixed* with the messages

> Removing any previously scheduled qemu-ga installation
> Scheduling delayed installation of qemu-ga from qemu-ga-x86_64.msi

which come from the *next* (serially launched!) script called "install-qemu-ga-x86_64.msi.ps1". This suggests that these scripts are somehow executed in parallel, despite the fact that everything in /Program Files/Guestfs/Firstboot/firstboot.bat is strictly serial.

(4) Even if I enable (uncomment)

> Set-PSDebug -Trace 1

at the top of "v2vnetcf.ps1", I get *zero* debug output.

I have tried everything possible to get *any* debug output from the script, during *first* boot. All the output is produced successfully when I *don't* need it, that is, during second boot; it's impossible to get any debug output during first boot. Here's what I've tried:

(4a) Write-Host in "v2vnetcf.ps1" (obviously). No effect, in /Program Files/Guestfs/Firstboot/log.txt

(4b) implementing a redirection like ">foobar.txt 2>&1" in the *wrapper* batch file "2500-0003-v2vnetcf-ps1.bat". That is, implement separate redirection in the BAT file that directly calls powershell.exe. No effect, the file is not created.

(4c) Wrap the entire guts of "v2vnetcf.ps1" into the following "wildcard" redirection operator:

> &{
> ...
> } *>> c:\foobar.log

(I also attempted just \foobar.log; IOW, removing the C: drive reference.) No effect, the file is not created.

What's funniest (in a bad way) is that this "wildcard" redirection operator does not cover the "Set-PSDebug -Trace 1" output even when I run the script in the *dedicated powershell development environment* (PowerShell ISE)!
The trace messages are sent to the separate console window in the ISE. How fricking broken is your "power" shell (lol) when *> and *>> are documented to redirect *ALL* streams, including the Debug stream, but then it does not cover the trace messages enabled by "Set-PSDebug"?! And yes, I've seen this question on StackOverflow too. Windows is a dumpster fire.

(4d) in "v2vnetcf.ps1", something like (note specifically the relative pathname):

> Write-Output | Out-File -FilePath foobar.log -Encoding ASCII -append

No effect. The file is not created.

(5) I've tried re-adding the (seemingly superfluous) 5 seconds delay to the top of the first wait loop. Does not help.

(6) I've repeatedly witnessed the "internal reboot" of the converted Windows 2019 domain that Tomáš Golembiovský described in commit dc66e78fa37d ("windows: delay installation of qemu-ga MSI", 2020-03-10). We worked around this internal reboot for the QGA installation by delaying the QGA installation by two minutes (i.e., by scheduling it for 120 seconds later). I'm fairly convinced that this internal reboot, or the state that *precedes* it, is what breaks "v2vnetcf.ps1"; although I'm unsure how exactly (see again above -- *zero* debug output is possible to capture).

Now, funnily enough, I cannot even *attempt* delaying "v2vnetcf.ps1" for two minutes, because in bug 1788823, QE's specific request was that we move "v2vnetcf.ps1" to the *front* of the firstboot scripts. That's exactly what has brought us here, where we now stand.

So I can't delay the iface configuration because that's not right from an end-user perspective, but I also can't move it to the front, because then Windows aborts the script due to the internal reboot (or breaks it in some other obscure way). (With a single interface, things likely work now only because the configuration for a single interface completes before Windows reaches the internal reboot point.)

And note that "wait pnp" is absolutely useless: it seems to succeed OK, but then "v2vnetcf.ps1" gets killed just the same.

And let me repeat -- if, upon second boot, I simply move "2500-0003-v2vnetcf-ps1.bat" from "scripts-done" to "scripts", and re-run "firstboot.bat" manually, then the IP assignments (all three of them) complete flawlessly.

So, I'm sorry, but Windows is just a trash fire, and I can't do anything about this problem. (Spent all of today struggling with this.)

*** Bug 2114809 has been marked as a duplicate of this bug. ***
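
The difference between the original wait loop and the per-MAC wait loops discussed in this bug can be sketched outside Windows. The following is a minimal Python model, not the generated PowerShell. It assumes that each entry of `snapshots` stands in for one result of `Get-NetAdapter` filtered to "netkvm.sys", with the adapters showing up one poll at a time. `ps_mac`, `wait_for_any` and `wait_for_each` are hypothetical names, and the sleep between polls is left out.

```python
def ps_mac(mac: str) -> str:
    """Rewrite a colon-separated MAC with dashes, as the patch does for PowerShell."""
    return mac.replace(":", "-")


def wait_for_any(snapshots):
    """Model of the original loop: stop at the first poll that returns any adapter."""
    for polls, adapters in enumerate(snapshots, start=1):
        if adapters:
            return polls, set(adapters)
    raise TimeoutError("no virtio-net adapter appeared")


def wait_for_each(snapshots, wanted):
    """Model of the patched loops: one wait per MAC, executed one after another."""
    it = iter(snapshots)
    current = next(it, None)
    polls = 1
    for mac in wanted:
        while current is None or mac not in current:
            if current is None:
                raise TimeoutError(f"adapter {mac} never appeared")
            current = next(it, None)
            polls += 1
    return polls, set(current)


# MAC addresses taken from the command line quoted in this bug.
wanted = [ps_mac(m) for m in (
    "52:54:00:32:05:e9",
    "52:54:00:98:d2:52",
    "52:54:00:ac:85:e6",
)]

# Assumed timeline: nothing at first, then one more adapter per poll.
snapshots = [[], wanted[:1], wanted[:2], wanted[:3]]

print(wait_for_any(snapshots))
print(wait_for_each(snapshots, wanted))
```

In this model the original loop returns while only one adapter is visible, whereas the per-MAC loops return only once all three are. It models the wait condition alone: as reported above, the generated script still failed at first boot with the per-MAC loops in place, and the sketch says nothing about that failure.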