Description of problem:
Creating a new Content Host using a VMware Compute Resource fails with an error. The network information coming from the Compute Profile appears correct in the web UI, but submitting the form generates an error (after which we can see that a different dvport group is selected).
Version-Release number of selected component (if applicable):
How reproducible:
100% in the customer environment
Steps to Reproduce:
1. Create a Compute Resource
2. Create a Compute Profile
3. Create a Hostgroup
4. Create a new Content Host and use the hostgroup
Fails with the error below:
Unable to save
Failed to create a compute RAILINC-Cary-DEV (VMware) instance vrh7csnellchqd05.railinc.com: ResourceNotAvailable: The resource vim.dvs.DistributedVirtualPort is not available in vim.dvs.DistributedVirtualPortgroup CHQ_Cisco_B_Dswi-DVUplinks-12035.
Expected: the Content Host is created without issues.
Chris, could you please take a look? My quick guess is that this happens when the user tries to use a wrong port and portgroup combination. Could you configure something similar in your VMware instance so we have a reproducer?
Waldirio, could you test whether it works without a host group or compute profile, when you select everything manually? My guess is that something overrides the interface network during the host form update. I've seen something similar with oVirt. A reproducer would also help greatly if Chris can't build one.
Waldirio, could you please get the following from the customer (changing the credentials and URL to match their Satellite, of course)?
1) curl -u admin:changeme -H "Content-Type: application/json" -H "Accept: application/json" -k https://satellite.example.tst/api/v2/hostgroups/11
2) if the host group has a parent, please also get the same for the parent (change 11 to whatever the parent host group id is)
3) curl -u admin:changeme -H "Content-Type: application/json" -H "Accept: application/json" -k https://satellite.example.tst/api/v2/compute_profiles/6
4) record a video of the flow
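The host group requests above could also be scripted so that the whole parent chain is collected in one pass. This is only a minimal sketch, not a supported tool: the function names are made up for illustration, and it assumes the host group JSON carries a `parent_id` field that is null for a top-level group.

```python
import base64
import json
import ssl
import urllib.request


def fetch_json(base_url, path, user, password, verify_tls=False):
    """GET base_url + path and decode the JSON body (mirrors the curl calls)."""
    request = urllib.request.Request(base_url.rstrip("/") + path)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    request.add_header("Accept", "application/json")
    context = ssl.create_default_context()
    if not verify_tls:  # same effect as curl -k
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)


def collect_hostgroup_chain(hostgroup_id, fetch):
    """Return the host group and all of its ancestors, child first.

    `fetch` is any callable mapping an API path to a decoded JSON dict.
    Assumption: each host group has a `parent_id` field (None at the root).
    """
    chain = []
    seen = set()  # guards against an unexpected parent loop
    current = hostgroup_id
    while current is not None and current not in seen:
        seen.add(current)
        group = fetch(f"/api/v2/hostgroups/{current}")
        chain.append(group)
        current = group.get("parent_id")
    return chain
```

With the example values from the curl commands, the call would look like `collect_hostgroup_chain(11, lambda p: fetch_json("https://satellite.example.tst", p, "admin", "changeme"))`.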
We still can't reproduce this, and without a reproducer we can't work on a fix for 6.4. The new version will limit resource loading to available resources only, so the bug will go away, but that is a rather big change that can't be backported.
Upstream bug assigned to oezr
Good news, we believe Ondrej found the cause. We were able to reproduce the problem when we set the subnet's vlanid to a number that is also part of the VMware network name. There's some logic that tries to match a similarly named network with the subnet vlanid, but it only updated part of the form. The fix is simple to cherry-pick and can be backported. This automatching logic was introduced in 6.4, which explains why it worked on 6.3. The patch has been merged upstream today. I'm sorry it took us a long time, but without a reproducer it wasn't easy.
I investigated the problem a bit more. As Marek said, the matching introduced in 6.4 looks for the vlanid of the selected subnet in the name of the VMware network. The matching didn't work properly, and that is now fixed. Unfortunately, there are two more problems with it. First, the matching takes precedence over any other VMware network selection. I have opened a proposal to fix that, so the suggestion based on vlanid would only take effect if the user hasn't selected the network in another way (in a compute profile or manually). Second, there is no reasonable way to propagate the real vlanid from a VMware StandardPortGroup to Satellite, so the matching is based only on the name of the network. That works only if the user follows the VLAN<vlanid> naming convention on the VMware side, and it can select undesired values if the numbers are found in other network names.
Please let us know if you have any insight into what behavior the customer would want, so that we can cover all the use cases.
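To illustrate why name-based matching can pick undesired networks, here is a small sketch. It is not the Satellite implementation: the two functions and the network names are hypothetical, and the only thing taken from the discussion above is the VLAN<vlanid> naming convention.

```python
import re


def naive_match(vlanid, network_names):
    """Return every network whose name contains the VLAN ID as a substring."""
    needle = str(vlanid)
    return [name for name in network_names if needle in name]


def strict_match(vlanid, network_names):
    """Return networks named VLAN<vlanid>, rejecting longer numbers."""
    # (?!\d) stops VLAN120 from matching inside VLAN1200.
    pattern = re.compile(rf"VLAN{vlanid}(?!\d)", re.IGNORECASE)
    return [name for name in network_names if pattern.search(name)]


# Hypothetical network names, not taken from the customer environment.
networks = ["VLAN120-Prod", "DVUplinks-12035", "VLAN1200-Test"]

print(naive_match(120, networks))   # all three names contain "120"
print(strict_match(120, networks))  # only "VLAN120-Prod" follows the convention
```

Even the stricter variant still depends entirely on the naming convention, which is the second problem described above.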
It would also be good to cherry-pick the patch from https://projects.theforeman.org/issues/26307, which autoselects the network only when the subnet is changed by user interaction, either in the modal window or by selecting a host group when no compute profile is involved. Selecting a compute profile sets the network according to how the profile is defined. This should work well even for customers who do not share the vlanid in the names of the subnet and network.
Therefore, moving back and linking the second issue.
Build: Satellite 6.5 snap 24
When a network is associated in the compute profile, the compute profile network is selected.
However, when we manually change the subnet to one that has a VLAN ID associated, the network matching that VLAN ID is selected. Otherwise, the selection falls back to the network from the compute profile.
Also, the networks are now shown by their human-readable names from VMware rather than by network IDs.
Created attachment 1556111
Readable network name
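The selection order verified above can be summarized in a short sketch. The function and parameter names are hypothetical and this is not the actual Satellite code; it only restates the observed behavior as a decision rule.

```python
def choose_network(compute_profile_network, vlan_matched_network,
                   subnet_changed_by_user):
    """Pick the VMware network for an interface.

    All three parameters are hypothetical names for this sketch:
    - compute_profile_network: network defined in the compute profile, or None
    - vlan_matched_network: network suggested from the subnet's VLAN ID, or None
    - subnet_changed_by_user: True when the user changed the subnet in the form
    """
    # The VLAN-based suggestion applies only after an explicit subnet change
    # and only when a matching network was actually found.
    if subnet_changed_by_user and vlan_matched_network is not None:
        return vlan_matched_network
    # Otherwise keep (or fall back to) the compute profile network.
    return compute_profile_network
```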
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.