Bug 1785014 - We have a class B network (192.168.0.0/16) and it is not compatible with oVirt
Summary: We have a class B network (192.168.0.0/16) and it is not compatible with oVirt
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-setup
Classification: oVirt
Component: Documentation
Version: 2.4.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: medium
Target Milestone: ovirt-4.4.2
Target Release: ---
Assignee: Steve Goodman
QA Contact: Eli Marcus
URL:
Whiteboard:
Depends On: 1849517
Blocks:
Reported: 2019-12-18 22:02 UTC by administrator
Modified: 2020-09-03 15:14 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-03 15:14:50 UTC
oVirt Team: Integration
Embargoed:
sbonazzo: ovirt-4.4?
mtessun: planning_ack+
sbonazzo: devel_ack+
sbonazzo: testing_ack?



Description administrator 2019-12-18 22:02:30 UTC
Description of problem:

We have a class B network (192.168.0.0/16) in our lab.
When we deploy hosted-engine (with an IP of 192.168.67.2/16) from Cockpit, we get a loop of:

[ INFO ] TASK [ovirt.hosted_engine_setup : Define 3rd chunk]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Set 3rd chunk]
[ INFO ] skipping: [localhost]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Get ip route]
[ INFO ] skipping: [localhost]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Fail if can't find an available subnet]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Set new IPv4 subnet prefix]
[ INFO ] skipping: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Search again with another prefix]


and the deployment fails.

Once we change the file
/usr/share/ansible/roles/ovirt.hosted-engine-setup/defaults/main.yml

from he_ipv4_subnet_prefix: "192.168.222" to he_ipv4_subnet_prefix: "10.0.90", the deployment succeeds.
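
A minimal sketch of why the change helps, assuming the three-octet prefix denotes a /24 subnet: the default prefix falls inside the lab network, while the replacement does not. The function name is illustrative and is not part of oVirt.

```python
import ipaddress


def prefix_overlaps(prefix: str, lab_network: str) -> bool:
    """Return True if the /24 built from a three-octet prefix overlaps lab_network."""
    candidate = ipaddress.ip_network(f"{prefix}.0/24")
    return candidate.overlaps(ipaddress.ip_network(lab_network))


print(prefix_overlaps("192.168.222", "192.168.0.0/16"))  # True
print(prefix_overlaps("10.0.90", "192.168.0.0/16"))      # False
```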




Version-Release number of selected component (if applicable):
Software Version:4.3.6.7-1.el7




How reproducible:


Steps to Reproduce:
1. 
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Sandro Bonazzola 2020-01-08 08:38:27 UTC
Class B addresses should go within range 128.1.0.1 to 191.255.255.254, so you shouldn't use a class C (192.0.1.1 to 223.255.254.254) address as class B.
I would consider closing this as not a bug.
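
For reference, the historical classful scheme derives the class from the first octet alone, independently of any CIDR prefix length such as /16. A minimal sketch of that rule (the function name is illustrative):

```python
import ipaddress


def classful_class(address: str) -> str:
    """Return the historical address class (A-E) based on the first octet."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        return "A"
    if first_octet < 192:
        return "B"
    if first_octet < 224:
        return "C"
    if first_octet < 240:
        return "D"
    return "E"


print(classful_class("192.168.67.2"))  # C
print(classful_class("172.16.0.1"))    # B
```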

Comment 2 Sandro Bonazzola 2020-01-08 08:42:04 UTC
Marking this as a documentation bug; we may want to enhance the documentation to clarify that IP address classes must be respected.

Comment 3 administrator 2020-01-08 09:53:59 UTC
Hi Sandro,

Is it OK that the hosted-engine subnet is 10.0.90 and not 192.168.222?
And I agree that documenting this requirement is the right solution (that is, if you have a 192.168.0.0/16 network, it interferes with the internal hosted-engine subnet).

Comment 4 Steve Goodman 2020-06-18 09:24:43 UTC
I'd like to make sure that we document this correctly, and in the logical place.

Where would you expect to find this info?

Also, from what I understand above, the request is to add the following:

[NOTE]
====
Class B addresses should go within the range 128.1.0.1 to 191.255.255.254. So don't use a class C (i.e. 192.0.1.1 to 223.255.254.254) address in the range reserved for class B.

For example, if you have a network of 192.168.0.0/16 it will prevent a self-hosted engine deployment.
====

Is that correct?

Comment 5 administrator 2020-06-22 11:05:02 UTC
Hi Steve,

Yes, the example is correct.
I would suggest adding to the network requirements of the self-hosted engine that the default engine bridge network is 192.168.222.0/24.
This network must therefore not be in use on the deployment network; if it is in use, you need to change the self-hosted engine bridge (in the file /usr/share/ansible/roles/ovirt.hosted-engine-setup/defaults/main.yml)
to use a different network that is not in use.

Another option is to let the user set the engine bridge network during the self-hosted engine deployment process.

Comment 6 Steve Goodman 2020-06-22 14:44:49 UTC
Marking as high priority because it's a customer bug.

Comment 7 Yedidyah Bar David 2020-06-28 05:55:23 UTC
I don't think the bug/issue has anything to do with Classful Networks [1] or anything of the kind.

The issue is specific to 192.168.

The behavior, as far as I can understand from reading the code is:

We use by default 192.168.222.0/24.

If this is in use, we loop over all ranges 192.168.N.0/24, where N is from 1 to 253.
We use the first one we find that is not in use.

"In use", here, means "Has output that includes 'via' for the command 'ip route get 192.168.N.1'".

If we find none, we emit an error message:

          "Cannot find an available subnet for internal Libvirt network"
          "Please set it to an unused subnet by adding the variable 'he_ipv4_subnet_prefix'"
          "to the variable-file ( e.g. he_ipv4_subnet_prefix: '123.123.123' )."

In particular, this indeed should happen if you use 192.168/16.
Did you get this error message? If so, then everything is working as designed.
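
The search described above can be sketched roughly as follows. This is an illustrative Python rendering of the description in this comment, not the role's actual Ansible code; the function names and the injectable `lookup` parameter are assumptions, added so the logic can be tried without touching real routes.

```python
import subprocess
from typing import Callable, Optional

DEFAULT_PREFIX = "192.168.222"  # default named in this comment


def route_output(ip: str) -> str:
    """Return the text printed by `ip route get <ip>` (needs iproute2)."""
    result = subprocess.run(
        ["ip", "route", "get", ip], capture_output=True, text=True, check=False
    )
    return result.stdout


def in_use(prefix: str, lookup: Callable[[str], str] = route_output) -> bool:
    """A prefix counts as in use when the route to <prefix>.1 contains 'via'."""
    return "via" in lookup(f"{prefix}.1")


def find_free_prefix(lookup: Callable[[str], str] = route_output) -> Optional[str]:
    """Try the default first, then 192.168.N for N = 1..253; None if all are in use."""
    if not in_use(DEFAULT_PREFIX, lookup):
        return DEFAULT_PREFIX
    for n in range(1, 254):
        candidate = f"192.168.{n}"
        if not in_use(candidate, lookup):
            return candidate
    return None


# Stand-in lookup (made-up output) for a host that sees every 192.168 address as routed.
def everything_routed(ip: str) -> str:
    return f"{ip} via 192.168.0.1 dev eth0"


print(find_free_prefix(everything_routed))  # None
```

A result of `None` corresponds to the case where the error message quoted above is emitted.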

The error message above does not say _how_ to set the default subnet.
The way you used, by editing the defaults file, is probably the simplest, right now, without more code changes.

We have bug 1849517 and bug 1851677 for allowing passing this on the CLI.

Considering this as a doc bug, I suggest simply describing the behavior. The suggested solution, for now, is indeed to edit the defaults file, until we fix the above bugs.

I also agree that it makes sense to add another question to the user interaction (CLI and cockpit) about the subnet to use, although some people claim we already have too many questions :-)

[1] https://en.wikipedia.org/wiki/Classful_network

Comment 8 Steve Goodman 2020-07-14 13:26:17 UTC
(In reply to Yedidyah Bar David from comment #7)
 
> The error message above does not say _how_ to set the default subnet.
> The way you used, by editing the defaults file, is probably the simplest,
> right now, without more code changes.
> 
> We have bug 1849517 and bug 1851677 for allowing passing this on the CLI.
> 
> Considering this as a doc bug, I suggest to simply describe the behavior.
> The suggested solution, for now, is indeed to edit the defaults file, until
> we fix above bugs.

What is the path/name of the defaults file?
> 
> I also agree that it makes sense to add another question to the user
> interaction (CLI and cockpit) about the subnet to use, although some people
> claim we already have too many questions :-)

This should be a bug for the deployment script.

Comment 9 Steve Goodman 2020-07-14 13:53:02 UTC
administrator, you might have missed this question in comment 7 for you:

...
If we find none, we emit an error message:

          "Cannot find an available subnet for internal Libvirt network"
          "Please set it to an unused subnet by adding the variable 'he_ipv4_subnet_prefix'"
          "to the variable-file ( e.g. he_ipv4_subnet_prefix: '123.123.123' )."

In particular, this indeed should happen if you use 192.168/16.
Did you get this error message? If so, then everything is working as designed.
...

Comment 10 Yedidyah Bar David 2020-07-16 13:07:36 UTC
The patch for bug 1849517 is now merged, so I am setting the current bug to depend on it, and I suggest updating the documentation to say something like:

The hosted-engine deploy process temporarily uses a /24 network under 192.168. It defaults to 192.168.222.0/24, and if this one is found to be in use, it tries other /24 ranges under 192.168.0, until it finds a non-used one, or until it exhausts this range and then it fails. If you want the deploy process to use some other /24 range for the temporary network, you can run it as:

    hosted-engine --deploy --ansible-extra-vars=he_ipv4_subnet_prefix=PREFIX

where PREFIX is the prefix you want to use - e.g. '192.168.222' to make it use its default range.

Please note that we haven't yet fixed bug 1851677 (for cockpit), so the above is only possible using the CLI.
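
As a rough sketch, a wrapper could check that PREFIX really is a three-octet IPv4 prefix before building the command line quoted above. The helper name is hypothetical; only the `hosted-engine` flags shown in this comment are used, and the sketch only builds the command without running it.

```python
import ipaddress
from typing import List


def build_deploy_command(prefix: str) -> List[str]:
    """Build the CLI invocation quoted above for a three-octet IPv4 prefix."""
    if len(prefix.split(".")) != 3:
        raise ValueError(f"expected three dotted octets, got {prefix!r}")
    # Raises ValueError if the octets do not form a valid /24 network.
    ipaddress.IPv4Network(f"{prefix}.0/24")
    return [
        "hosted-engine",
        "--deploy",
        f"--ansible-extra-vars=he_ipv4_subnet_prefix={prefix}",
    ]


print(" ".join(build_deploy_command("10.0.90")))
# hosted-engine --deploy --ansible-extra-vars=he_ipv4_subnet_prefix=10.0.90
```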

Comment 11 Yedidyah Bar David 2020-07-16 13:20:56 UTC
BTW, didn't try to verify that this works. Steve - perhaps try this yourself, or ask QE. It's enough to pass this argument and search the logs for 'he_ipv4_subnet_prefix' to see that it matches what you passed, and 'local_vm_ip' to see that the engine indeed got an address in the range you specified.
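
The check suggested here could be sketched like this. The log line format below is invented for illustration (the real setup logs will look different); only the two search keys come from this comment, and both helper names are hypothetical.

```python
import ipaddress
from typing import List


def lines_mentioning(log_text: str, key: str) -> List[str]:
    """Return the log lines that contain the given key."""
    return [line for line in log_text.splitlines() if key in line]


def address_in_prefix(address: str, prefix: str) -> bool:
    """Return True if address lies in the /24 built from a three-octet prefix."""
    network = ipaddress.ip_network(f"{prefix}.0/24")
    return ipaddress.ip_address(address) in network


# Hypothetical log excerpt, not real hosted-engine output.
sample_log = (
    "he_ipv4_subnet_prefix: 10.0.90\n"
    "local_vm_ip: 10.0.90.57\n"
)
print(lines_mentioning(sample_log, "local_vm_ip"))
print(address_in_prefix("10.0.90.57", "10.0.90"))
```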

Comment 12 Yedidyah Bar David 2020-07-28 09:07:38 UTC
(In reply to Yedidyah Bar David from comment #10)
> Now merged the patch for bug 1849517, so I am setting current to depend on
> it, and suggest to update the documentation, to say something like:
> 
> The hosted-engine deploy process temporarily uses a /24 network under
> 192.168. It defaults to 192.168.222.0/24, and if this one is found to be in
> use, it tries other /24 ranges under 192.168.0, until it finds a non-used

Sorry, this should be: "under 192.168". I think I may have meant "starting
with 192.168.0" (which is also true).

> one, or until it exhausts this range and then it fails. If you want the
> deploy process to use some other /24 range for the temporary network, you
> can run it as:
> 
>     hosted-engine --deploy --ansible-extra-vars=he_ipv4_subnet_prefix=PREFIX
> 
> where PREFIX is the prefix you want to use - e.g. '192.168.222' to make it
> use its default range.
> 
> Please note that we didn't yet fix bug 1851677 (for cockpit), so above is
> only possible using the CLI.

Comment 13 administrator 2020-07-30 08:27:25 UTC
(In reply to Steve Goodman from comment #9)
> administrator, you might have missed this question in comment
> 7 for you:
> 
> ...
> If we find none, we emit an error message:
> 
>           "Cannot find an available subnet for internal Libvirt network"
>           "Please set it to an unused subnet by adding the variable
> 'he_ipv4_subnet_prefix'"
>           "to the variable-file ( e.g. he_ipv4_subnet_prefix: '123.123.123'
> )."
> 
> In particular, this indeed should happen if you use 192.168/16.
> Did you get this error message? If so, then everything is working as
> designed.
> ...

Hi Steve,
Yes, I get this error message.

Comment 14 Nikolai Sednev 2020-08-13 13:38:49 UTC
On Cisco routers, if a subnet already exists on another interface, you get an appropriate message; that is why ARP and RARP exist. Whether IP is classless or classful is unrelated here; the issue is that we use this subnet as reserved for the initial libvirt NAT during deployment, and it is released once the HE-VM is deployed.
I already mentioned in the past that the documentation should cover the range as reserved.

Comment 17 Steve Goodman 2020-09-01 15:06:25 UTC
Eli,

Please do peer review.

Comment 18 Eli Marcus 2020-09-03 14:18:06 UTC
(In reply to Steve Goodman from comment #17)
> Eli,
> 
> Please do peer review.

Looks fine to me

Comment 19 Steve Goodman 2020-09-03 15:10:27 UTC
Merged.

