Bug 1571154 - vdsm reports to engine the local host network address IPv4 and IPv6 during the VM launch
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Frontend.WebAdmin
Version: 4.2.3.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ovirt-4.2.4
Assignee: Arik
QA Contact: Michael Burman
Reported: 2018-04-24 08:14 UTC by Michael Burman
Modified: 2018-06-26 08:46 UTC (History)

Fixed In Version: ovirt-engine-4.2.4
Last Closed: 2018-06-26 08:46:08 UTC
oVirt Team: Network
rule-engine: ovirt-4.2+
ylavi: blocker-
ylavi: exception+


Attachments
record (2.38 MB, application/x-gzip)
2018-04-24 08:14 UTC, Michael Burman


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 90756 0 master MERGED core: improve monitoring of vm guest agent nics 2018-05-10 12:07:40 UTC
oVirt gerrit 90758 0 master MERGED core: filter loopback nics 2018-05-10 13:16:48 UTC
oVirt gerrit 90759 0 master MERGED core: filter invalid ip addresses in vm-analyzer 2018-05-10 13:16:51 UTC
oVirt gerrit 90760 0 master MERGED core: filter blacklisted nic names 2018-05-10 13:16:46 UTC
oVirt gerrit 90761 0 master MERGED core: simplify construction of vm ips 2018-05-10 13:16:54 UTC
oVirt gerrit 91139 0 ovirt-engine-4.2 MERGED core: improve monitoring of vm guest agent nics 2018-05-14 08:28:10 UTC
oVirt gerrit 91140 0 ovirt-engine-4.2 MERGED core: filter blacklisted nic names 2018-05-14 12:09:17 UTC
oVirt gerrit 91141 0 ovirt-engine-4.2 MERGED core: filter loopback nics 2018-05-14 12:09:25 UTC
oVirt gerrit 91142 0 ovirt-engine-4.2 MERGED core: filter invalid ip addresses in vm-analyzer 2018-05-14 12:09:32 UTC
oVirt gerrit 91143 0 ovirt-engine-4.2 MERGED core: simplify construction of vm ips 2018-05-14 12:09:41 UTC
oVirt gerrit 91144 0 master MERGED core: adding loopback nics in windows to the blacklist 2018-05-14 06:32:49 UTC
oVirt gerrit 91206 0 ovirt-engine-4.2 MERGED core: adding loopback nics in windows to the blacklist 2018-05-14 12:13:10 UTC

Description Michael Burman 2018-04-24 08:14:52 UTC
Created attachment 1425862
record

Description of problem:
[UI] - IP Address column - engine reports the local net and wrong IPv4 address during the VM launch.

For a few seconds, the IP address column shows a wrong IPv4 address as well as the local net IP 127.0.0.1, which shouldn't be there.


- The same happens when the VM shows the exclamation mark for "guest agent is not the latest version"; in this case the wrong IPs are shown forever.

Version-Release number of selected component (if applicable):
4.2.3.2-0.1.el7

How reproducible:
100%

Steps to Reproduce:
1. Start a VM, or run a VM with the exclamation mark (guest agent not the latest version)

Actual results:
The local net IP is shown and a wrong IPv4 address is displayed, for example:
127.0.0.1
15.0.0.2::1

Expected results:
Correct IPv4 and no local net address displayed

This is new behaviour in the latest version.

Comment 1 Dan Kenigsberg 2018-04-24 08:35:20 UTC
Tomas, isn't this a result of your integration with qemu guest agent? Can you please filter-out the local addresses (they are uninteresting to the end user)?
Any idea why 15.0.0.2 is being reported?

Comment 2 Red Hat Bugzilla Rules Engine 2018-04-24 08:36:24 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 3 Michael Burman 2018-04-24 08:42:11 UTC
(In reply to Dan Kenigsberg from comment #1)
> Tomas, isn't this a result of your integration with qemu guest agent? Can
> you please filter-out the local addresses (they are uninteresting to the end
> user)?
> Any idea why 15.0.0.2 is being reported?

Dan, 15.0.0.2 is my OVN IP and it's OK, but it has ::1 at the end, which is wrong!
The same would happen with other addresses, e.g. 10.35.128.0::1

Comment 4 Tomáš Golembiovský 2018-04-24 11:48:12 UTC
Yes, this is probably the result of the QEMU-GA integration. But let's think of it as a feature. I am not entirely convinced this should be filtered at the VDSM level, nor am I convinced this is always uninteresting to the end user. The fact that we don't have a suitable UI for such information today does not mean it will always stay that way. If engine does not want to show localhost/link-local addresses etc. in the IP column, why not filter them in engine?

Comment 5 Michal Skrivanek 2018-04-25 04:42:28 UTC
Do you run ovirt-guest-agent then?

Comment 6 Michael Burman 2018-04-25 05:02:20 UTC
(In reply to Michal Skrivanek from comment #5)
> Do you run ovirt-guest-agent then?

Of course I do; if it weren't running, I couldn't see these IPs in my engine UI -
ovirt-guest-agent-common-1.0.14-3.el7ev.noarch

Comment 7 Dan Kenigsberg 2018-04-25 06:16:29 UTC
(In reply to Michael Burman from comment #6)
> (In reply to Michal Skrivanek from comment #5)
> > Do you run ovirt-guest-agent then?
> 
> Of course i do, if i wasn't running, i couldn't see this IP's on my engine
> UI  - 
> ovirt-guest-agent-common-1.0.14-3.el7ev.noarch

Actually, as of the recent changes by Tomas, if ovirt-guest-agent is off, Vdsm collects the addresses from qemu-guest-agent. So you are expected to see IPs even if ovirt-guest-agent is off.

The bug here is that qemu-guest-agent reports boring local addresses, as well as the odd-looking 15.0.0.2::1.

Tomas, what is this 15.0.0.2::1 address, and why is it reported by qemu-guest-agent?

Comment 8 Michal Skrivanek 2018-04-25 07:19:58 UTC
(In reply to Dan Kenigsberg from comment #7)
> Tomas, what is this 15.0.0.2::1 address, and why is it reported by
> qemu-guest-agent?

Looking at the video there's no "15.0.0.2::1" address. There is "15.0.0.2" and "::1" which is probably all correct. You see both IPv4 and IPv6 addresses, and AFAIK there is no localhost filtering in qemu-ga like we have in ovirt-ga.

Michael, please attach output of "ip" and ovirt-guest-agent logs

Comment 9 Michal Skrivanek 2018-04-25 07:24:11 UTC
(In reply to Michal Skrivanek from comment #8)
> (In reply to Dan Kenigsberg from comment #7)
> > Tomas, what is this 15.0.0.2::1 address, and why is it reported by
> > qemu-guest-agent?
> 
> Looking at the video there's no "15.0.0.2::1" address. There is "15.0.0.2"
> and "::1" which is probably all correct. You see both IPv4 and IPv6
> addresses, and AFAIK there is no localhost filtering in qemu-ga like we have
> in ovirt-ga.
> 
> Michael, please attach output of "ip" and ovirt-guest-agent logs

Or rather the vdsm logs. ovirt-ga data take precedence over qemu-ga data, but the two are polled differently, so you may see qemu-ga data temporarily.
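
The precedence described in this comment could be pictured roughly as follows. This is only a sketch, not vdsm code: the function name `effective_net_ifaces` and the shortened sample reports are hypothetical, and `None` stands for "ovirt-ga has not reported yet".

```python
def effective_net_ifaces(ovirt_ga_report, qemu_ga_report):
    """Hypothetical helper: prefer ovirt-ga data, fall back to qemu-ga data."""
    if ovirt_ga_report is not None:
        return ovirt_ga_report
    return qemu_ga_report


# Illustrative reports only: qemu-ga lists every interface,
# while ovirt-ga already filters the loopback device.
qemu_ga = [
    {'name': 'lo', 'inet': ['127.0.0.1'], 'inet6': ['::1']},
    {'name': 'eth0', 'inet': ['10.46.17.29'], 'inet6': []},
]
ovirt_ga = [
    {'name': 'eth0', 'inet': ['10.46.17.29'], 'inet6': []},
]

# Right after VM start only qemu-ga has answered, so 'lo' is visible.
early = effective_net_ifaces(None, qemu_ga)
# Once ovirt-ga reports, its data wins and 'lo' disappears.
later = effective_net_ifaces(ovirt_ga, qemu_ga)
```

Under this assumption, loopback addresses would be visible only in the window before the first ovirt-ga report, or permanently when ovirt-ga never reports.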

Comment 10 Michael Burman 2018-04-25 07:32:04 UTC
(In reply to Michal Skrivanek from comment #8)
> (In reply to Dan Kenigsberg from comment #7)
> > Tomas, what is this 15.0.0.2::1 address, and why is it reported by
> > qemu-guest-agent?
> 
> Looking at the video there's no "15.0.0.2::1" address. There is "15.0.0.2"
> and "::1" which is probably all correct. You see both IPv4 and IPv6
> addresses, and AFAIK there is no localhost filtering in qemu-ga like we have
> in ovirt-ga.
> 
> Michael, please attach output of "ip" and ovirt-guest-agent logs

In the video there is indeed 15:0:0:2::1, and not "15.0.0.2" and "::1".
This is not correct: ::1 is added to all IPv4 addresses. This is shown in the video as well.

Comment 11 Raz Tamir 2018-04-25 08:32:49 UTC
Also affecting our automation

Comment 12 Michal Skrivanek 2018-04-25 08:37:14 UTC
(In reply to Michael Burman from comment #10)

> In the video there is 15:0:0:2::1 indeed and not "15.0.0.2" and "::1
> this is not correct, ::1 added to all IPv4 addresses. This is sown in the
> video as well.

no, nothing is "added" to anything. There's just "::1" as a localhost IPv6 address. Are you looking at a different video perhaps?:)

Comment 13 Michal Skrivanek 2018-04-25 08:39:41 UTC
(In reply to Raz Tamir from comment #11)
> Also affecting our automation

Hm, how exactly? IIUC from the report (and the video), the ovirt-guest-agent report takes over once it properly starts. So if you're checking for certain strings to be there, you still get them, and in fact at the same time as before. It's just that before that point you can see the qemu-ga report, which includes all interfaces and may differ from what ovirt-ga reports.
If you have tests checking for "IP address not being reported when ovirt-guest-agent is not running" then they need to be fixed instead.

Comment 14 Yaniv Kaul 2018-04-25 08:47:25 UTC
(In reply to Michal Skrivanek from comment #13)
> (In reply to Raz Tamir from comment #11)
> > Also affecting our automation
> 
> hm, how exactly? IIUC the report (and the video) the ovirt-guest-agent
> report takes over once it properly starts. So if you're checking certain
> strings to be there you still get them, and in fact at the same time as
> before. It's just that sooner than that you can see qemu-ga report which
> includes all interfaces potentially different from what ovirt-ga reports.
> If you have tests checking for "IP address not being reported when
> ovirt-guest-agent is not running" then they need to be fixed instead.

I think we should remove the report of IPs from qemu-guest-agent (in VDSM?) until this issue is fully and successfully resolved.

Comment 15 Michal Skrivanek 2018-04-25 08:49:51 UTC
(In reply to Yaniv Kaul from comment #14)

> I think we should remove the report of IPs from qemu-guest-agent (in VDSM?)
> until this issue is fully and successfully resolved.

why? what's wrong with reporting all addresses?

Comment 16 Yaniv Kaul 2018-04-25 08:52:12 UTC
(In reply to Michal Skrivanek from comment #15)
> (In reply to Yaniv Kaul from comment #14)
> 
> > I think we should remove the report of IPs from qemu-guest-agent (in VDSM?)
> > until this issue is fully and successfully resolved.
> 
> why? what's wrong with reporting all addresses?

It's buggy. There's no need to report 127.0.0.1, for example.
It's buggy - Engine can't handle the % thing (Windows only).

The whole feature of moving from o-g-a to q-g-a was not properly done - it was not in the scope initially for 4.2, the design was not there, other teams were not involved, etc.
Let's avoid such features next time.
I believe it also landed somewhat late.

Comment 17 Michael Burman 2018-04-25 08:55:51 UTC
(In reply to Michal Skrivanek from comment #12)
> (In reply to Michael Burman from comment #10)
> 
> > In the video there is 15:0:0:2::1 indeed and not "15.0.0.2" and "::1
> > this is not correct, ::1 added to all IPv4 addresses. This is sown in the
> > video as well.
> 
> no, nothing is "added" to anything. There's just "::1" as a localhost IPv6
> address. Are you looking at a different video perhaps?:)

I understand it's the IPv6 local net, but in the UI it looks like part of the IPv4 address, that's all.

Comment 18 Michal Skrivanek 2018-04-25 08:59:41 UTC
(In reply to Yaniv Kaul from comment #16)
> (In reply to Michal Skrivanek from comment #15)
> > (In reply to Yaniv Kaul from comment #14)
> > 
> > > I think we should remove the report of IPs from qemu-guest-agent (in VDSM?)
> > > until this issue is fully and successfully resolved.
> > 
> > why? what's wrong with reporting all addresses?
> 
> It's buggy. There's no need to report 127.0.0.1, for example.

But that's not incorrect. That is a valid interface with a valid address.

> It's buggy - Engine can't handle the % thing (Windows only).

It's not Windows-only; it's a valid address, and indeed the bug is on the engine side: the database field designed for IP addresses cannot hold a valid IPv6 address.


> The whole feature of moving from o-g-a to q-g-a was not properly done - it
> was not in the scope initially for 4.2, the design was not there, other
> teams were not involved, etc.
> Let's avoid such features next time.
> I believe it also landed somewhat late.

I disagree on all above

Comment 19 Raz Tamir 2018-04-25 09:00:08 UTC
(In reply to Michal Skrivanek from comment #13)
> (In reply to Raz Tamir from comment #11)
> > Also affecting our automation
> 
> hm, how exactly? IIUC the report (and the video) the ovirt-guest-agent
> report takes over once it properly starts. So if you're checking certain
> strings to be there you still get them, and in fact at the same time as
> before. It's just that sooner than that you can see qemu-ga report which
> includes all interfaces potentially different from what ovirt-ga reports.
> If you have tests checking for "IP address not being reported when
> ovirt-guest-agent is not running" then they need to be fixed instead.

When we execute vdsm-client VM.getStats we get 

'netIfaces': [{'name': 'lo', 'hw': '00:00:00:00:00:00', 'inet': ['127.0.0.1'], 'inet6': ['::1']}, {'name': 'eth0', 'hw': '00:1a:4a:16:88:1d', 'inet': ['10.46.17.29'], 'inet6': ['2620:52:0:2e10:21a:4aff:fe16:881d', 'fe80::21a:4aff:fe16:881d']}]

It is a quick fix in our code to exclude 127.*, but I just want to make sure you are aware that this issue is not UI-only, as the summary suggests.
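
A client-side exclusion of this kind might look like the sketch below. It is not the automation's actual code: the helper name `drop_loopback` is hypothetical, and the sample data is the VM.getStats output quoted in this comment. It uses the standard `ipaddress` module so that both 127.* and ::1 are caught.

```python
import ipaddress

# Sample copied from the VM.getStats output quoted in this comment.
net_ifaces = [
    {'name': 'lo', 'hw': '00:00:00:00:00:00',
     'inet': ['127.0.0.1'], 'inet6': ['::1']},
    {'name': 'eth0', 'hw': '00:1a:4a:16:88:1d',
     'inet': ['10.46.17.29'],
     'inet6': ['2620:52:0:2e10:21a:4aff:fe16:881d',
               'fe80::21a:4aff:fe16:881d']},
]


def is_loopback(address):
    """True for 127.0.0.0/8 and ::1."""
    return ipaddress.ip_address(address).is_loopback


def drop_loopback(ifaces):
    """Hypothetical helper: strip loopback addresses, then drop
    interfaces that are left without any address."""
    result = []
    for iface in ifaces:
        inet = [a for a in iface.get('inet', []) if not is_loopback(a)]
        inet6 = [a for a in iface.get('inet6', []) if not is_loopback(a)]
        if inet or inet6:
            result.append(dict(iface, inet=inet, inet6=inet6))
    return result


filtered = drop_loopback(net_ifaces)
```

Filtering by address rather than by the interface name `lo` is a design choice made for this sketch; it also covers guests where the loopback device has a different name.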

Comment 20 Michal Skrivanek 2018-04-25 09:08:21 UTC
either way, disabling is not difficult. It can be done either globally (in vdsm.conf for the whole qemu-guest-agent reporting) or only the network piece can be disabled in code

Comment 21 Michal Skrivanek 2018-04-25 09:16:06 UTC
(In reply to Raz Tamir from comment #19)
> (In reply to Michal Skrivanek from comment #13)
> > (In reply to Raz Tamir from comment #11)
> > > Also affecting our automation
> > 
> > hm, how exactly? IIUC the report (and the video) the ovirt-guest-agent
> > report takes over once it properly starts. So if you're checking certain
> > strings to be there you still get them, and in fact at the same time as
> > before. It's just that sooner than that you can see qemu-ga report which
> > includes all interfaces potentially different from what ovirt-ga reports.
> > If you have tests checking for "IP address not being reported when
> > ovirt-guest-agent is not running" then they need to be fixed instead.
> 
> When we execute vdsm-client VM.getStats we get 
> 
> 'netIfaces': [{'name': 'lo', 'hw': '00:00:00:00:00:00', 'inet':
> ['127.0.0.1'], 'inet6': ['::1']}, {'name': 'eth0', 'hw':
> '00:1a:4a:16:88:1d', 'inet': ['10.46.17.29'], 'inet6':
> ['2620:52:0:2e10:21a:4aff:fe16:881d', 'fe80::21a:4aff:fe16:881d']}]

right. So that's a complete list which looks alright to me. But once ovirt-guest-agent sends its report you should get the same one
Looking at one of the VMs from the video I see:
"netIfaces": [{"name": "eth0","inet6": ["fe80::200:ff:fe00:20"],"inet": ["10.35.130.57"],"hw": "00:00:00:00:00:20"}]


> It is a quick fix in our code to exclude the 127.* but just want to make
> sure you are aware that this issue is not a UI only as stated in the summary

Sure, it's not UI-only. It also somewhat breaks the blacklist feature (bug 1437145), which was implemented as a quick fix only in ovirt-guest-agent, so that feature works or not depending on which agent is used.

Comment 22 Michal Skrivanek 2018-04-25 09:32:06 UTC
(In reply to Michal Skrivanek from comment #20)
I also believe a proper fix is not that hard either. Chopping off "%" in DAO sounds like a trivial short term fix until we fix it properly in database and report/show complete address.
Since we always filtered loopback in ovirt-guest-agent, filtering out ::1 and 127.0.0.1 in engine sounds good enough to me as well, if that's the price I have to pay for showing guest IP information without an additional agent.
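
The two short-term measures mentioned here (chopping off "%" and dropping loopback addresses) can be sketched as follows. This is an illustration in Python, not the engine's DAO code; the helper names and the `%eth0` zone suffix in the usage example are assumptions made for the sketch.

```python
import ipaddress


def strip_zone_id(address):
    """Hypothetical helper: drop the '%<zone>' suffix of a scoped
    IPv6 address; other addresses are returned unchanged."""
    return address.split('%', 1)[0]


def displayable(addresses):
    """Strip zone ids, then drop loopback addresses (127.0.0.0/8, ::1)."""
    cleaned = [strip_zone_id(a) for a in addresses]
    return [a for a in cleaned
            if not ipaddress.ip_address(a).is_loopback]


# Illustrative input; the zone suffix '%eth0' is an assumed example.
shown = displayable([
    '127.0.0.1',
    '::1',
    '10.46.17.29',
    'fe80::21a:4aff:fe16:881d%eth0',
])
```

Note that stripping the zone id loses information for link-local addresses, which is why the comment treats it only as a stopgap until the complete address can be stored and shown.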

Comment 23 Yaniv Kaul 2018-04-25 09:34:31 UTC
From my perspective:
{'name': 'lo', 'hw': '00:00:00:00:00:00', 'inet': ['127.0.0.1'], 'inet6': ['::1']}
- There is no need to collect or report the loopback device. No one cares about it, at all. It's a qemu-guest-agent and/or ovirt-guest-agent bug.

Comment 24 Michal Skrivanek 2018-04-25 09:39:34 UTC
(In reply to Michal Skrivanek from comment #18)

> > I believe it also landed somewhat late.
> 
> I disagree on all above

no, let me take that back. The NIC address reporting itself landed very late indeed. That was done under the assumption that ovirt-guest-agent report supersedes it anyway. Which it does.

Comment 25 Michal Skrivanek 2018-04-25 09:45:23 UTC
(In reply to Yaniv Kaul from comment #23)
> From my perspective:
> {'name': 'lo', 'hw': '00:00:00:00:00:00', 'inet': ['127.0.0.1'], 'inet6':
> ['::1']}
> - There is no need to collect or report the loopback device. No one cares
> about it, at all. It's a qemu-guest-agent and/or ovirt-guest-agent bug.

I think so too. Likewise, we should have the same interface blacklisting facility as implemented in ovirt-guest-agent, e.g. for the docker0 interface. I believe that should happen inside the guest, to be able to control it based on the specific OS.

Comment 26 Yaniv Kaul 2018-04-25 09:47:03 UTC
(In reply to Michal Skrivanek from comment #25)
> (In reply to Yaniv Kaul from comment #23)
> > From my perspective:
> > {'name': 'lo', 'hw': '00:00:00:00:00:00', 'inet': ['127.0.0.1'], 'inet6':
> > ['::1']}
> > - There is no need to collect or report the loopback device. No one cares
> > about it, at all. It's a qemu-guest-agent and/or ovirt-guest-agent bug.
> 
> I think so too. Same as we should have the same interface blacklisting
> facility as implemented in ovirt-guest-agent for e.g. docker0 interface. I
> believe that should happen inside the guest to be able to control that based
> on specific OS

Probably need to add something like:
if (ifa->ifa_flags & (IFF_LOOPBACK))
        continue;

To https://github.com/qemu/qemu/blob/master/qga/commands-posix.c#L1722

Comment 27 Tomáš Golembiovský 2018-04-25 09:58:22 UTC
(In reply to Yaniv Kaul from comment #26)
> (In reply to Michal Skrivanek from comment #25)
> > (In reply to Yaniv Kaul from comment #23)
> > > From my perspective:
> > > {'name': 'lo', 'hw': '00:00:00:00:00:00', 'inet': ['127.0.0.1'], 'inet6':
> > > ['::1']}
> > > - There is no need to collect or report the loopback device. No one cares
> > > about it, at all. It's a qemu-guest-agent and/or ovirt-guest-agent bug.
> > 
> > I think so too. Same as we should have the same interface blacklisting
> > facility as implemented in ovirt-guest-agent for e.g. docker0 interface. I
> > believe that should happen inside the guest to be able to control that based
> > on specific OS
> 
> Probably need to add something like:
> if (ifa->ifa_flags & (IFF_LOOPBACK))
>         continue;
> 
> To https://github.com/qemu/qemu/blob/master/qga/commands-posix.c#L1722

Uhm, hold on guys. I think we're assuming too much here. The fact that such information is of no use to oVirt engine does not mean it is uninteresting to everyone else. We're not talking about *oVirt* guest agent anymore.

If we don't care about the info then let's filter it out in engine.

Comment 28 Yaniv Kaul 2018-04-25 10:03:46 UTC
(In reply to Tomáš Golembiovský from comment #27)
> (In reply to Yaniv Kaul from comment #26)
> > (In reply to Michal Skrivanek from comment #25)
> > > (In reply to Yaniv Kaul from comment #23)
> > > > From my perspective:
> > > > {'name': 'lo', 'hw': '00:00:00:00:00:00', 'inet': ['127.0.0.1'], 'inet6':
> > > > ['::1']}
> > > > - There is no need to collect or report the loopback device. No one cares
> > > > about it, at all. It's a qemu-guest-agent and/or ovirt-guest-agent bug.
> > > 
> > > I think so too. Same as we should have the same interface blacklisting
> > > facility as implemented in ovirt-guest-agent for e.g. docker0 interface. I
> > > believe that should happen inside the guest to be able to control that based
> > > on specific OS
> > 
> > Probably need to add something like:
> > if (ifa->ifa_flags & (IFF_LOOPBACK))
> >         continue;
> > 
> > To https://github.com/qemu/qemu/blob/master/qga/commands-posix.c#L1722
> 
> Uhm, hold on guys. I think we're assuming too much here. The fact that such
> information is of no use to oVirt engine does not mean it is uninteresting
> to everyone else. We're not talking about *oVirt* guest agent anymore.

Ask in qemu upstream if anyone cares about the loopback address.

> 
> If we don't care about the info then let's filter it out in engine.

Or in VDSM? Why send non-needed data to Engine? Anything we can filter in VDSM is blessed.

Comment 29 Michal Skrivanek 2018-04-25 10:43:52 UTC
(In reply to Yaniv Kaul from comment #28)

> Or in VDSM? Why send non-needed data to Engine? Anything we can filter in
> VDSM is blessed.

but it's also more difficult to control. With a simple vdc_option we can have custom blacklist filtering easily
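
A configurable name blacklist of the kind suggested here might be sketched as below. The option name, its default value, and the use of shell-style patterns are all assumptions for illustration; the thread does not specify how the vdc_option is named or matched.

```python
from fnmatch import fnmatchcase

# Assumed blacklist value for the sketch; the real option name and
# default are not given in this thread.
NIC_NAME_BLACKLIST = ('lo', 'docker*')


def filter_blacklisted(ifaces, patterns=NIC_NAME_BLACKLIST):
    """Hypothetical helper: drop interfaces whose name matches any
    shell-style pattern in the blacklist (case-sensitive)."""
    return [
        iface for iface in ifaces
        if not any(fnmatchcase(iface['name'], p) for p in patterns)
    ]


# Illustrative interface list.
reported = [
    {'name': 'lo', 'inet': ['127.0.0.1']},
    {'name': 'docker0', 'inet': ['172.17.0.1']},
    {'name': 'eth0', 'inet': ['10.46.17.29']},
]
kept = filter_blacklisted(reported)
```

Because the patterns are passed in as data, changing the blacklist would only require changing the configured value, which is the kind of control this comment argues is easier on the engine side.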

Comment 30 Michael Burman 2018-05-27 13:56:56 UTC
Verified on - 4.2.4-0.1.el7

Comment 31 Sandro Bonazzola 2018-06-26 08:46:08 UTC
This bugzilla is included in oVirt 4.2.4 release, published on June 26th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.4 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

