Bug 1371119 - Null ipv6 in IpConfiguration after 3.6->4.0 upgrade
Summary: Null ipv6 in IpConfiguration after 3.6->4.0 upgrade
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Backend.Core
Version: 4.0.2.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: urgent
Target Milestone: ovirt-4.0.4
Target Release: 4.0.4.4
Assignee: Martin Mucha
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks: 1373112
Reported: 2016-08-29 11:46 UTC by nicolas
Modified: 2016-09-26 12:40 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-09-26 12:40:53 UTC
oVirt Team: Network
Embargoed:
rule-engine: ovirt-4.0.z+
rule-engine: exception+
ylavi: planning_ack+
danken: devel_ack+
myakove: testing_ack+


Attachments
vdsCaps output, UI log with full trace, ifcfgs (17.95 KB, application/x-gzip)
2016-08-31 12:31 UTC, nicolas
Videocapture of the error (1.85 MB, application/x-gzip)
2016-09-07 17:39 UTC, nicolas
Output of network_attachments (6.38 KB, application/x-gzip)
2016-09-13 11:31 UTC, nicolas


Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 64069 | 0 | master | MERGED | core: replace null valued ipv6_boot_protocol with 'NONE' | 2020-02-24 10:08:05 UTC
oVirt gerrit | 64070 | 0 | master | MERGED | core: disallow boot protocol to be null | 2020-02-24 10:08:05 UTC
oVirt gerrit | 64071 | 0 | ovirt-engine-4.0 | MERGED | core: replace null valued ipv6_boot_protocol with 'NONE' | 2020-02-24 10:08:05 UTC
oVirt gerrit | 64072 | 0 | ovirt-engine-4.0 | MERGED | core: disallow boot protocol to be null | 2020-02-24 10:08:05 UTC
oVirt gerrit | 64073 | 0 | ovirt-engine-4.0.4 | MERGED | core: replace null valued ipv6_boot_protocol with 'NONE' | 2020-02-24 10:08:05 UTC
oVirt gerrit | 64074 | 0 | ovirt-engine-4.0.4 | ABANDONED | core: disallow boot protocol to be null | 2020-02-24 10:08:05 UTC

Description nicolas 2016-08-29 11:46:11 UTC
Description of problem:

An Uncaught Exception dialog is shown when trying to edit any network from within the Hosts tab.

Version-Release number of selected component (if applicable):

4.0.2-7

How reproducible:

Always

Steps to Reproduce:
1. Click on the Hosts tab
2. Click on the Network interfaces subtab
3. Click on the Setup Host Networks button
4. Try to edit any of the VLANs that are configured on the host

Actual results:

The uncaught exception is thrown

Expected results:

The network editing dialog should be shown

Additional info:

I de-obfuscated the error and it corresponds to the symbolMap 40DB0D639FCFF1481CE2D629A86BE514 file, token 'Ev':

Ev,java.lang.Throwable::fillInStackTrace()Ljava/lang/Throwable;,java.lang.Throwable,fillInStackTrace,com/google/gwt/emul/java/lang/Throwable.java,114,0

Comment 1 Dan Kenigsberg 2016-08-31 08:16:39 UTC
Nicolas, how did you create the VLANs on the host?

I'd like to see more UI debug information (full stack); I hope Michael can explain how to obtain it.

Comment 2 nicolas 2016-08-31 08:34:21 UTC
I just manually edited the physical interface on which the ovirtmgmt network should run, and put in a config like this:

DEVICE="..."
ONBOOT=yes
NETBOOT=yes
UUID="..."
IPV6INIT=yes
BOOTPROTO=static
IPADDR=10...
NETMASK=255...
GATEWAY=10...
HWADDR="XX:XX:XX:XX:XX:XX"
TYPE=Ethernet
NAME="..."

So this one has no bridge and no 'ifcfg-ovirtmgmt' file exists under /etc/sysconfig/network-scripts. The rest of them were created via the oVirt engine: I defined them under the Networks tab (most of them configured as not required) and then I attached them under the 'Hosts' tab, 'Network interfaces' subtab.

If I can provide any further information don't hesitate to ask.

Comment 3 Michael Burman 2016-08-31 08:44:23 UTC
I can't reproduce this report on 4.0.3-0.1.el7ev.
Editing VLAN networks on the host via the Setup Networks dialog works as expected.

http://old.ovirt.org/OVirt_Engine_Debug_Obfuscated_UI

Comment 4 nicolas 2016-08-31 09:09:06 UTC
Just to add some more info, we have 3 oVirt infrastructures, all of them with the very same version.

Two of them are a bit "complex" and have plenty of Networks configured. This issue is reproducible on them.

However, the last one, which only has ovirtmgmt and 2 additional networks, works well and I'm unable to reproduce the issue there.

So this does not seem to be a global bug in this version; it seems to depend on the network configuration.

Comment 5 Michael Burman 2016-08-31 09:22:31 UTC
Hi nicolas, can you please provide us with the full stack? It is well explained at the link I attached in comment 3.

Also, please provide the contents of the ifcfg-* files and the output of vdsClient -s 0 getVdsCaps

Comment 6 nicolas 2016-08-31 12:31:44 UTC
Created attachment 1196340 [details]
vdsCaps output, UI log with full trace, ifcfgs

Comment 7 nicolas 2016-08-31 12:34:13 UTC
Uploaded all the requested info

Comment 8 Michael Burman 2016-08-31 13:58:52 UTC
Thank you, nicolas. danken, can you take a look?

Comment 9 Dan Kenigsberg 2016-08-31 15:25:04 UTC
Martin, can you look at the full UI trace of attachment 1196340 [details]?

Comment 10 Martin Mucha 2016-09-05 10:35:34 UTC
a) Unable to reproduce; please add info on how one can reproduce this issue.

b) From reading the log provided by dan, it seems that https://gerrit.ovirt.org/#/c/58292/1 would fix the error. This patch was abandoned by decision of alona & yevgeny, but I still believe that the frontend should not fail with an NPE under any circumstances, and definitely not when a totally different layer contains the bug.

c) If I'm not mistaken, this is the patch which fixes backend:
https://gerrit.ovirt.org/#/c/52686/
please check if this is contained in your branch.

Comment 11 Dan Kenigsberg 2016-09-05 11:48:15 UTC
a) adding needinfo to reporter

c) Martin, the patch you refer to is well inside engine-4.0.2. (see `git log --grep If38510619c76e4d1ea7f3a5749f4f0811facadf5 ovirt/ovirt-engine-4.0.2`) Could you supply more info about what you think is the bug seen by nicolas?

Comment 12 nicolas 2016-09-05 14:30:38 UTC
I'm not sure what additional details I can provide.

As said, each of the two infrastructures that throw the exception has lots of networks defined, some of them tagged with VLAN IDs. The other infrastructure, which works well, does not have VLAN IDs defined on its interfaces; that's the only difference.

The supplied files are for one of the oVirt infrastructures that fail. The exact steps to reproduce are specified in the first comment.

If you need any additional specific details just ask for them.

Comment 13 Yaniv Lavi 2016-09-07 09:29:10 UTC
Can you capture a video of the issue or give us access to the system?

Comment 14 nicolas 2016-09-07 17:39:54 UTC
Created attachment 1198790 [details]
Videocapture of the error

Added a video capture. If you determine that a remote access would be useful for you we can create an administrative account and a VPN access so you can debug further.

Comment 15 Martin Mucha 2016-09-13 11:09:15 UTC
Hi, 

Thanks for the video. Watching it, I'd assume that the record in the network_attachments table related to the clicked icon is not what the UI expects. I suspect that the 'ipv6_boot_protocol' column is null.

Can you run
select * from network_attachments;
and tell us the result?

If I create a new network and attach it, this column is not null. So if it is null on your system, it might be an error in the upgrade. Did you upgrade from 3.6, or is this a clean 4.0 installation?

———

About the abandoned patch (which should fix the failing UI):
it really should be merged regardless of anything else, since this:

private IpV6Address iPv6Address;


public FromNetworkAttachment(NetworkAttachment networkAttachment, HostNetworkQos networkQos) {
    …

    this.iPv6Address = networkAttachment.getIpConfiguration() != null
            && networkAttachment.getIpConfiguration().hasIpv6PrimaryAddressSet()
                    ? networkAttachment.getIpConfiguration().getIpv6PrimaryAddress()
                    : null;
    …
}

*assumes* that "this.iPv6Address" *can* be null, while 

public Ipv6BootProtocol getIpv6BootProtocol() {
    return iPv6Address.getBootProtocol();
}

*does not*. The (incorrectly named) field iPv6Address is null, hence the NPE.

Regardless of all possible causes and explanations, *this* is wrong and should be fixed. I'm surprised it wasn't discovered by the otherwise overly hyperactive static code analysis tools.
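
The mismatch described above can be illustrated with a minimal, self-contained sketch. It is not the abandoned patch itself: the outer class `NullSafeGetterSketch`, the nested `AttachmentParams` class, and the simplified enum are hypothetical stand-ins, and falling back to `NONE` when the address is missing is an assumption.

```java
// Hypothetical, simplified stand-ins for the engine classes quoted above; not the real oVirt code.
final class NullSafeGetterSketch {
    // Simplified enum; the real engine enum may define more values.
    enum Ipv6BootProtocol { NONE, DHCP, STATIC_IP }

    static final class IpV6Address {
        private final Ipv6BootProtocol bootProtocol;

        IpV6Address(Ipv6BootProtocol bootProtocol) {
            this.bootProtocol = bootProtocol;
        }

        Ipv6BootProtocol getBootProtocol() {
            return bootProtocol;
        }
    }

    static final class AttachmentParams {
        // May be null, just as the constructor quoted above allows.
        private final IpV6Address ipv6Address;

        AttachmentParams(IpV6Address ipv6Address) {
            this.ipv6Address = ipv6Address;
        }

        // Null-safe getter: a missing address or protocol is reported as NONE
        // (an assumption) instead of causing a NullPointerException.
        Ipv6BootProtocol getIpv6BootProtocol() {
            if (ipv6Address == null || ipv6Address.getBootProtocol() == null) {
                return Ipv6BootProtocol.NONE;
            }
            return ipv6Address.getBootProtocol();
        }
    }

    public static void main(String[] args) {
        // An attachment without any IPv6 address no longer breaks the getter.
        System.out.println(new AttachmentParams(null).getIpv6BootProtocol());
    }
}
```

With a getter like this, the constructor and the accessor agree that the address may be absent, whatever the backend delivers.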

Comment 16 nicolas 2016-09-13 11:31:39 UTC
Created attachment 1200472 [details]
Output of network_attachments

I'm providing the requested output of the psql statement as an attachment. Indeed, that column seems to be blank for all interfaces.

Both failing infrastructures are the product of upgrading from 3.6.6 directly to 4.0.0, without 3.6.7 in between (not sure whether this matters).

Comment 17 Martin Mucha 2016-09-13 14:36:06 UTC
Alright, great.
So it seems that I might have been right about the cause and culprit.

You can then try the following as a workaround:
update network_attachments set ipv6_boot_protocol='NONE' where ipv6_boot_protocol is null;

In my engine I have exactly this db setting: ipv6_boot_protocol is 'NONE' and all other ipv6 columns are null. So that should be fine.

I hope this will solve your issues. Please confirm if you try it.
——
Note: the reason why the boot protocol is 'important' is this:
final String v6BootProtocol = rs.getString("ipv6_boot_protocol");
if (v6BootProtocol != null) {
    final IpV6Address ipV6Address = createIpV6Address(rs, v6BootProtocol);
    ipConfiguration.getIpV6Addresses().add(ipV6Address);
}

I.e. if boot_protocol is null in the db, the address (whose non-nullity is currently assumed by the UI) won't be created at all; null will be passed to the UI, which will fail because of this betrayal.
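
As a hedged illustration of the mapping issue described above, the sketch below shows one defensive option: treating a NULL `ipv6_boot_protocol` as `NONE` when a row is read. This is not the merged fix (the merged change listed under Links is titled "replace null valued ipv6_boot_protocol with 'NONE'"). The class `RowMappingSketch`, the `Map`-based row standing in for the JDBC `ResultSet`, and the simplified enum are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in; not the actual oVirt DAO code.
final class RowMappingSketch {
    // Simplified enum; the real engine enum may define more values.
    enum Ipv6BootProtocol { NONE, DHCP, STATIC_IP }

    // Assumption: a NULL column left over from the upgrade means "no IPv6 configuration".
    // A Map stands in for the JDBC ResultSet so the example runs without a database.
    static Ipv6BootProtocol readBootProtocol(Map<String, String> row) {
        String dbValue = row.get("ipv6_boot_protocol");
        return dbValue == null ? Ipv6BootProtocol.NONE : Ipv6BootProtocol.valueOf(dbValue);
    }

    public static void main(String[] args) {
        Map<String, String> upgradedRow = new HashMap<>();
        upgradedRow.put("ipv6_boot_protocol", null); // NULL column, as reported in this bug
        System.out.println(readBootProtocol(upgradedRow));
    }
}
```

Defaulting on read keeps the UI working even for rows that were never migrated, at the cost of hiding the inconsistent data rather than repairing it.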


Questions to yevgeny (owner)/danken: what is the design here? Is it possible for a NetworkAttachment IpConfiguration to have an ipv6 address without a boot protocol (and other properties, which will fail equally) or not? The REST and DB designs allow null values, while the UI expects non-nullity. If null is allowed, please merge my fix (or fix it any other way); if it's not allowed, please update the REST entities and the db definition.

Comment 18 Dan Kenigsberg 2016-09-13 15:29:10 UTC
As far as I understand, IPv6 may be "None", but never null. REST should not allow setting such IpConfigurations, and upgrade scripts must make sure this is maintained when 4.0.z is installed.

Yevgeni would know better.

Michael, have you seen something like that after Engine upgrades?
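
The invariant stated in this comment (the IPv6 boot protocol may be NONE, but never null) could be enforced at the entity level roughly as follows. This is only a sketch: `NonNullBootProtocolSketch`, its nested `IpV6Address`, and the simplified enum are hypothetical stand-ins, not the code of the merged "disallow boot protocol to be null" change.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in; not the real oVirt entity.
final class NonNullBootProtocolSketch {
    // Simplified enum; the real engine enum may define more values.
    enum Ipv6BootProtocol { NONE, DHCP, STATIC_IP }

    static final class IpV6Address {
        // Defaults to NONE so a freshly created address never carries a null protocol.
        private Ipv6BootProtocol bootProtocol = Ipv6BootProtocol.NONE;

        Ipv6BootProtocol getBootProtocol() {
            return bootProtocol;
        }

        // Rejects null early, so the bad value cannot reach the database or the UI.
        void setBootProtocol(Ipv6BootProtocol bootProtocol) {
            this.bootProtocol = Objects.requireNonNull(
                    bootProtocol, "ipv6 boot protocol must not be null; use NONE");
        }
    }

    public static void main(String[] args) {
        IpV6Address address = new IpV6Address();
        address.setBootProtocol(Ipv6BootProtocol.DHCP);
        System.out.println(address.getBootProtocol());
    }
}
```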

Comment 19 Michael Burman 2016-09-14 09:03:21 UTC
Yes, I did see it after upgrading from 3.6.9 >> 4.0.4.2, and it is indeed caused by the upgrade.

It is not possible to edit any network (not only VLANs) that is attached to the host.
Once pressing the pencil to edit a ui exception thrown in the back-end.

engine=# select * from network_attachments;
-[ RECORD 1 ]------+-------------------------------------
id                 | 9442f41f-f754-4991-a55f-6d9e21a4fce7
network_id         | c7b79400-3103-4657-8005-92efe6766c7e
nic_id             | 8f5d0055-7dae-43b8-802c-973b9444e538
boot_protocol      | DHCP
address            | 
netmask            | 
gateway            | 
custom_properties  | 
_create_date       | 2016-09-13 09:56:31.446486+03
_update_date       | 
ipv6_boot_protocol | 
ipv6_address       | 
ipv6_prefix        | 
ipv6_gateway       | 
-[ RECORD 2 ]------+-------------------------------------
id                 | 83dcbdac-26ce-4385-9b90-697cc5278807
network_id         | f32b7f12-cba3-4270-9cfc-c1ee4a85c221
nic_id             | 8d8b7d82-fb23-42fd-a274-92c72c3daa68
boot_protocol      | NONE
address            | 
netmask            | 
gateway            | 
custom_properties  | 
_create_date       | 2016-09-13 10:14:11.342828+03
_update_date       | 
ipv6_boot_protocol | 
ipv6_address       | 
ipv6_prefix        | 
ipv6_gateway       | 
-[ RECORD 3 ]------+-------------------------------------
id                 | ca8b2b35-7164-4794-bef7-26f74eb724d4
network_id         | 18df588d-2719-466d-9cdd-1f2737424730
nic_id             | c1956588-4bbf-465b-af94-100c90cfae95
boot_protocol      | NONE
address            | 
netmask            | 
gateway            | 
custom_properties  | 
_create_date       | 2016-09-13 10:14:11.342828+03
_update_date       | 
ipv6_boot_protocol | 
ipv6_address       | 
ipv6_prefix        | 
ipv6_gateway

Comment 20 Martin Mucha 2016-09-14 10:38:43 UTC
-->Once pressing the pencil to edit a ui exception thrown in the back-end.

you probably meant in the front-end.

Comment 21 nicolas 2016-09-14 11:15:51 UTC
(In reply to Martin Mucha from comment #17)
> 
> You can then try to use following as a workaround:
> update network_attachments set ipv6_boot_protocol='NONE' where
> ipv6_boot_protocol is null;
> 

I confirm the workaround works, after altering all rows I can edit any interface. I can also confirm that newly created interfaces have the "NONE" value correctly set, so editing them doesn't throw the exception. Thanks.

Comment 22 Martin Mucha 2016-09-14 12:55:37 UTC
(In reply to nicolas from comment #21)
> (In reply to Martin Mucha from comment #17)
> > 
> > You can then try to use following as a workaround:
> > update network_attachments set ipv6_boot_protocol='NONE' where
> > ipv6_boot_protocol is null;
> > 
> 
> I confirm the workaround works, after altering all rows I can edit any
> interface. I can also confirm that newly created interfaces have the "NONE"
> value correctly set, so editing them doesn't throw the exception. Thanks.

Good to hear that, thanks for reporting this issue.

Comment 23 Michael Burman 2016-09-21 05:25:56 UTC
Fixed in version?

Comment 24 Michael Burman 2016-09-21 07:58:48 UTC
Verified on - 4.0.4.4-0.1.el7ev
Upgraded from 3.6.9 >> to 4.0.4.4-0.1.el7ev

- It is possible to edit every kind of network via the setup networks dialog as it should be. 
- NO null ipv6 in ipConfiguration. 

engine=# select * from network_attachments;
-[ RECORD 1 ]------+-------------------------------------
id                 | 9442f41f-f754-4991-a55f-6d9e21a4fce7
network_id         | c7b79400-3103-4657-8005-92efe6766c7e
nic_id             | 8f5d0055-7dae-43b8-802c-973b9444e538
boot_protocol      | DHCP
address            | 
netmask            | 
gateway            | 
custom_properties  | 
_create_date       | 2016-09-13 09:56:31.446486+03
_update_date       | 
ipv6_boot_protocol | NONE
ipv6_address       | 
ipv6_prefix        | 
ipv6_gateway       | 
-[ RECORD 2 ]------+-------------------------------------
id                 | 83dcbdac-26ce-4385-9b90-697cc5278807
network_id         | f32b7f12-cba3-4270-9cfc-c1ee4a85c221
nic_id             | 8d8b7d82-fb23-42fd-a274-92c72c3daa68
boot_protocol      | NONE
address            | 
netmask            | 
gateway            | 
custom_properties  | 
_create_date       | 2016-09-13 10:14:11.342828+03
_update_date       | 
ipv6_boot_protocol | NONE
ipv6_address       | 
ipv6_prefix        | 
ipv6_gateway       | 
-[ RECORD 3 ]------+-------------------------------------
id                 | ca8b2b35-7164-4794-bef7-26f74eb724d4
network_id         | 18df588d-2719-466d-9cdd-1f2737424730
nic_id             | c1956588-4bbf-465b-af94-100c90cfae95
boot_protocol      | NONE
address            | 
netmask            | 
gateway            | 
custom_properties  | 
_create_date       | 2016-09-13 10:14:11.342828+03
_update_date       | 
ipv6_boot_protocol | NONE
ipv6_address       | 
ipv6_prefix        | 
ipv6_gateway       | 
-[ RECORD 4 ]------+-------------------------------------
id                 | 2caa7964-38dc-48bd-a3b1-fb7b60e9deab
network_id         | 7ff7ba2e-ac97-41f5-ae57-48966ad0ca89
nic_id             | e2fe771c-2839-47fa-bf5a-a751cfe3390b
boot_protocol      | NONE
address            | 
netmask            | 
gateway            |

