Bug 1566059 - Scoped link local IPv6 addresses break VM listing (happens when ovirt-guest-agent is not installed but qemu-guest-agent is)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Network
Version: 4.2.2.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.2.3
Assignee: eraviv
QA Contact: Petr Matyáš
URL:
Whiteboard:
Duplicates: 1573830 1626220
Depends On:
Blocks: 1551350
Reported: 2018-04-11 12:45 UTC by Tomáš Golembiovský
Modified: 2019-04-15 16:17 UTC (History)

Fixed In Version: ovirt-engine-4.2.3.4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-14 15:11:53 UTC
oVirt Team: Network
rule-engine: ovirt-4.2+
ykaul: blocker+


Attachments
Engine log (117.18 KB, text/plain)
2018-04-11 12:45 UTC, Tomáš Golembiovský


Links
System ID Priority Status Summary Last Updated
oVirt gerrit 90661 master MERGED core: validate ipv6 address from guest 2018-04-29 11:24:13 UTC
oVirt gerrit 90664 master MERGED core: validate ipv4 address from guest 2018-04-29 11:24:15 UTC
oVirt gerrit 90763 ovirt-engine-4.2 MERGED core: validate ipv6 address from guest 2018-04-30 11:58:22 UTC
oVirt gerrit 90764 ovirt-engine-4.2 MERGED core: validate ipv4 address from guest 2018-04-30 11:59:00 UTC
oVirt gerrit 90767 ovirt-engine-4.2.3.z MERGED core: validate ipv6 address from guest 2018-04-30 12:08:23 UTC
oVirt gerrit 90768 ovirt-engine-4.2.3.z MERGED core: validate ipv4 address from guest 2018-04-30 12:08:26 UTC

Description Tomáš Golembiovský 2018-04-11 12:45:44 UTC
Created attachment 1420285
Engine log

Description of problem:

IPv6 addresses may contain a scope [1]. Such addresses have the form: <address>%<zone>
 
When the guest agent reports an IPv6 address with a scope in its stats to the engine, VM listing in the UI breaks. engine.log contains the following error:

2018-04-11 14:35:25,080+02 ERROR [org.ovirt.engine.core.bll.SearchQuery] (default task-10) [794ba721-326f-469f-b251-896172ea1519] Query 'SearchQuery' failed: StatementCallback; SQL [SELECT * FROM ((SELECT  vms.* FROM  vms  )  ORDER BY vm_name ASC ) as T1 OFFSET (1 -1) LIMIT 100]; ERROR: invalid input syntax for type inet: "fe80::343d:ee87:2a50:daf5%4"

[1] https://tools.ietf.org/html/rfc4007#section-11

Comment 1 Michal Skrivanek 2018-04-13 08:08:25 UTC
I do not mind an alternative temporary solution of trimming the report, but one way or the other it should be resolved by 4.2.3.
How/where do you suggest to fix this?

Comment 2 Dan Kenigsberg 2018-04-15 09:19:51 UTC
Michal, is this new in any way? What happens when the guest reports % to the rhv-4.1?

Regardless, Engine must never trust the guest and quote its data.

Comment 3 Tomáš Golembiovský 2018-04-15 18:29:11 UTC
(In reply to Dan Kenigsberg from comment #2)
> Michal, is this new in any way?

It is new in the sense that such IPs are reported from QEMU Guest Agent and QEMU Guest Agent polling is a new feature in 4.2.

> What happens when the guest reports % to the
> rhv-4.1?

I did not try this, but I expect the same problem there. That being said, it's probably not an issue. IPs reported by oVirt Guest Agent don't contain the scope (the source of information is different from QEMU Guest Agent). Unless something changes inside Windows in the future I don't think older RHV versions are in danger.

> 
> Regardless, Engine must never trust the guest and quote its data.

Comment 4 Yaniv Kaul 2018-04-17 19:57:41 UTC
(In reply to Tomáš Golembiovský from comment #3)
> (In reply to Dan Kenigsberg from comment #2)
> > Michal, is this new in any way?
> 
> It is new in the sense that such IPs are reported from QEMU Guest Agent and
> QEMU Guest Agent polling is a new feature in 4.2.

What is the possibility to revert back to ovirt-guest-agent until we can fix it?
Sounds somewhat easier than fix in engine, for the time being?
> 
> > What happens when the guest reports % to the
> > rhv-4.1?
> 
> I did not try this, but I expect the same problem there. That being said,
> it's probably not an issue. IPs reported by oVirt Guest Agent don't contain
> the scope (the source of information is different from QEMU Guest Agent).
> Unless something changes inside Windows in the future I don't think older
> RHV versions are in danger.
> 
> > 
> > Regardless, Engine must never trust the guest and quote its data.

So true!

Comment 5 Tomáš Golembiovský 2018-04-18 06:45:27 UTC
(In reply to Yaniv Kaul from comment #4)
> (In reply to Tomáš Golembiovský from comment #3)
> > (In reply to Dan Kenigsberg from comment #2)
> > > Michal, is this new in any way?
> > 
> > It is new in the sense that such IPs are reported from QEMU Guest Agent and
> > QEMU Guest Agent polling is a new feature in 4.2.
> 
> What is the possibility to revert back to ovirt-guest-agent until we can fix
> it?
> Sounds somewhat easier than fix in engine, for the time being?

This is only a problem when ovirt-guest-agent is not installed. If both are installed and running, the data from ovirt-guest-agent is used.

> > 
> > > What happens when the guest reports % to the
> > > rhv-4.1?
> > 
> > I did not try this, but I expect the same problem there. That being said,
> > it's probably not an issue. IPs reported by oVirt Guest Agent don't contain
> > the scope (the source of information is different from QEMU Guest Agent).
> > Unless something changes inside Windows in the future I don't think older
> > RHV versions are in danger.
> > 
> > > 
> > > Regardless, Engine must never trust the guest and quote its data.
> 
> So true!

Comment 7 eraviv 2018-04-23 07:58:45 UTC
After looking into the details, here are my observations:

Engine:
-------
Initially engine saves all reported ips (both ipv4 and ipv6) to vm_dynamic.vm_ip as a single string in a 'text' type (e.g. "192.168.122.1 fe80::1").

When the 'vms' view is called it executes the stored procedure "fn_get_comparable_ip_list" which breaks the string into individual ips and tries to build an array of type 'inet' with each ip in a separate item of the array. 

'inet' is a built-in postgres type which stores only valid ip addresses and supports sorting thereof, but which does not support zone ids for ipv6 addresses. Engine has been using the inet type in the db for its sorting functionality.

The ipv6 address with the '%' is rejected when the inet array is being populated by the stored procedure, and an exception is thrown back to engine. 
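
The failure described above can be imitated outside the database. The following is a minimal Python sketch, not the engine's code and not the actual stored procedure: `to_inet` and `comparable_ip_list` are hypothetical names, and the '%' check only stands in for the stricter parsing that the postgres 'inet' type performs.

```python
import ipaddress


def to_inet(text):
    """Hypothetical stand-in for a cast to postgres 'inet'.

    Like 'inet', it refuses an address that carries a zone id.
    """
    if "%" in text:
        raise ValueError(f'invalid input syntax for type inet: "{text}"')
    return ipaddress.ip_address(text)


def comparable_ip_list(vm_ip):
    """Split the space-separated vm_ip string and convert every item.

    One bad item makes the whole conversion fail, which is why a single
    scoped address is enough to break the listing of all VMs.
    """
    return [to_inet(item) for item in vm_ip.split()]


print(comparable_ip_list("192.168.122.1 fe80::1"))

try:
    comparable_ip_list("192.168.122.1 fe80::343d:ee87:2a50:daf5%4")
except ValueError as err:
    print(err)
```

The second call fails even though the IPv4 address in the same string is valid, because the conversion is applied to the whole list at once.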


IPv6:
-----
- According to [1] non-global-scope ipv6 addresses may have a zone suffix in one of the formats:
        <address>%<zone_id>
        <address>%<zone_id>/<prefix>  

- According to [2] IPv6 requires a link-local address on every network interface on which the IPv6 protocol is enabled, even when routable addresses are also assigned. The link-local address is required for [...] IPv6-based protocols, such as DHCPv6.

- According to [3] link-Local addresses are designed to be used for addressing on a single link for purposes such as automatic address configuration, neighbour discovery, or when no routers are present.

- According to the bug, qemu-guest-agent already reports non-global addresses with their zone_id appended to the address, and vdsm forwards them as-is.

- According to [4] and to postgres 'inet' type documentation there are no plans to support the zone_id in this type.
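
To make the two textual formats from [1] concrete, here is a small Python sketch that splits such a string into its parts. The function name `split_scoped` is an assumption made for illustration; it only separates the pieces and does not validate the zone id.

```python
import ipaddress


def split_scoped(text):
    """Split '<address>[%<zone_id>][/<prefix>]' into (address, zone_id, prefix).

    Parts that are absent are returned as None.
    """
    rest, slash, prefix = text.partition("/")
    address, percent, zone_id = rest.partition("%")
    return (
        address,
        zone_id if percent else None,
        int(prefix) if slash else None,
    )


address, zone_id, prefix = split_scoped("fe80::343d:ee87:2a50:daf5%4")
print(address, zone_id, prefix)

# The bare address is still an ordinary link-local IPv6 address.
print(ipaddress.ip_address(address).is_link_local)  # True
```

For the address from the engine log, the zone id is "4" and no prefix is present.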

Conclusion:
-----------
Addresses with a zone_id suffix cannot be saved to the db because the 'inet' type does not support them, but they cannot be totally ignored because a reporting interface may only have a link-local address. 

So the suggested solution is to strip the ipv6 addresses of the zone_id on entering the engine, and saving them in the db without it.

As far as we currently know, addresses are not reported with their prefix attached, so no loss of information is expected.

When the addresses are requested from the engine via the REST API, they are reported by engine under their respective interfaces, so the zone_id is redundant in that use case. 
In the web-admin, only the first ipv4 and ipv6 addresses of the guests are actually visible (without hovering on the field) so it is assumed that a duplicate ipv6 will not appear unless hovered over.
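
The suggested stripping could look roughly like the Python sketch below. It is an illustration only, not the merged engine patch: `strip_zone_id` and `to_vm_ip` are hypothetical names, and the sketch assumes, as stated above, that addresses arrive without a prefix.

```python
import ipaddress


def strip_zone_id(address):
    """Drop a trailing '%<zone_id>' and validate what is left.

    Raises ValueError if the remainder is not a valid IPv4/IPv6 address,
    so untrusted guest data never reaches the database unchecked.
    """
    bare = address.split("%", 1)[0]
    return str(ipaddress.ip_address(bare))


def to_vm_ip(reported):
    """Build the space-separated string that would be stored in vm_ip."""
    return " ".join(strip_zone_id(item) for item in reported)


print(to_vm_ip(["192.168.122.1", "fe80::343d:ee87:2a50:daf5%4"]))
# 192.168.122.1 fe80::343d:ee87:2a50:daf5
```

Two addresses that differ only in their zone id become identical after stripping, which is the duplicate case mentioned above; this sketch keeps such duplicates rather than guessing how they should be merged.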

----------------------------------------------------
[1] https://tools.ietf.org/html/rfc4007#section-6
[2] https://en.wikipedia.org/wiki/Link-local_address#IPv6
[3] https://tools.ietf.org/html/rfc4291#section-2.5.6
[4] http://www.postgresql-archive.org/IPv6-link-local-addresses-and-init-data-type-td5905510.html

Comment 8 Tomáš Golembiovský 2018-04-23 10:18:54 UTC
(In reply to eraviv from comment #7)

> So the suggested solution is to strip the ipv6 addresses of the zone_id on
> entering the engine, and saving them in the db without it.

Seems good enough for me until we have a valid use-case for preserving it.

Comment 9 Martin Perina 2018-05-02 10:56:47 UTC
*** Bug 1573830 has been marked as a duplicate of this bug. ***

Comment 10 msheena 2018-05-07 08:23:27 UTC
Tomáš,
What setup and steps are required to reproduce this?
I would also appreciate help with checking which zone_id the QEMU guest agent is reporting.

Comment 11 Tomáš Golembiovský 2018-05-07 10:18:55 UTC
You need a Windows VM with a recent QEMU-GA (ideally use the RHV-toolsSetup ISO from 4.2). You also need to disable the oVirt guest agent service to reproduce the issue.

Comment 12 Petr Matyáš 2018-05-11 10:46:37 UTC
Verified on ovirt-engine-4.2.3.5-0.1.el7.noarch

Comment 13 Sandro Bonazzola 2018-05-14 15:11:53 UTC
This bugzilla is included in oVirt 4.2.3 release, published on May 4th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Comment 14 Dominik Holler 2019-04-15 16:15:54 UTC
*** Bug 1626220 has been marked as a duplicate of this bug. ***

