Bug 1286132 - [Admin Portal] Cannot edit DC properties after upgrade from 3.5 - MAC Address Pool complains about invalid MAC address
[Admin Portal] Cannot edit DC properties after upgrade from 3.5 - MAC Address...
Status: CLOSED CURRENTRELEASE
Product: ovirt-engine
Classification: oVirt
Component: Frontend.WebAdmin (Show other bugs)
3.6.1
Unspecified Unspecified
high Severity medium (vote)
: ovirt-3.6.2
: 3.6.2.5
Assigned To: Martin Mucha
Meni Yakove
: Regression
Depends On:
Blocks: RHEV3.6Upgrade
  Show dependency treegraph
 
Reported: 2015-11-27 06:43 EST by Jiri Belka
Modified: 2016-02-18 06:08 EST (History)
10 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: error in MAC address generation in file ./packaging/setup/plugins/ovirt-engine-setup/config/macrange.py caused possible generation of MAC address of illegal form, for example '00:1a:4a:1:f:ff'. Consequence: engine silently used different mac address range than required. If generated range was 00:1a:4a:1:f:00-00:1a:4a:1:f:ff, engine actually used range was: 00:00:1a:4a:1f:00-00:00:1a:4a:1f:ff. Fix: Badly generated MAC addresses stored in vdc_option table (which were then moved to mac_pool_ranges table) were fixed in same manner as engine misinterpreted them for several years, to avoid sudden change. So after this change, your setup will use still the same range, but it will be reported and stored in DB correctly (see 'Consequence' part for demonstration, how is bad range fixed). Result: DB now contains MAC address in correct form, and every user is highly recommended to verify his/her setup, if it actually uses MAC address range he/she expects to be used. If there's something odd it can be easily identified; if MAC address has more than expected (by one or two) leading zeroes, then your setup contained an error, which was fixed. Please update your setup if reported MAC address range is not the one you want to use.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-18 06:08:03 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Network
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
rule-engine: ovirt‑3.6.z+
rule-engine: blocker+
ylavi: planning_ack+
rule-engine: devel_ack+
rule-engine: testing_ack+


Attachments (Terms of Use)
mac pool dialog (128.72 KB, image/png)
2015-11-27 06:45 EST, Jiri Belka
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 49711 master MERGED core: fix badly generated macs in db 2015-12-17 07:46 EST
oVirt gerrit 50659 ovirt-engine-3.6 ABANDONED core: fix badly generated macs in db 2015-12-29 09:12 EST
oVirt gerrit 51037 master MERGED core: fix badly generated macs in db 2016-01-04 05:43 EST
oVirt gerrit 51289 ovirt-engine-3.6.2 MERGED core: fix badly generated macs in db 2016-01-04 08:58 EST
oVirt gerrit 51290 ovirt-engine-3.6 MERGED core: fix badly generated macs in db 2016-01-04 08:57 EST
oVirt gerrit 51693 ovirt-engine-3.6 MERGED network:fix invalid regexp syntax 2016-01-12 05:18 EST
oVirt gerrit 51694 ovirt-engine-3.6.2 MERGED network:fix invalid regexp syntax 2016-01-12 05:19 EST

  None (edit)
Description Jiri Belka 2015-11-27 06:43:20 EST
Description of problem:

Cannot edit DC properties after upgrade from 3.5 - MAC Address Pool complains about invalid MAC address. Discovered while I wanted to bump up DC level to 3.6 after upgrade from 3.5.6 to 3.6.0-22 (build).

In Configure link (top-right) -> Mac Pool Address - there's Default and it has mac pool range defined and it _IS_ the same as one in DC properties dialog which should have been invalid. Odd.

Version-Release number of selected component (if applicable):
rhevm-3.6.1-0.2.el6.noarch

How reproducible:
100%

Steps to Reproduce:
1. upgrade from 3.5.6 env (I had engine, 2 hosts, some VMs running with nfs 
   data domain)
2. upgrade engine to 3.6 and then hosts
3. after bumping up cluster level, try to bump up DC level to 3.6

Actual results:
unable to edit DC, mac pool range is highlighted as invalid (even the same mac pool range is on Configure -> Mac Address Pool defined)

Expected results:
should work

Additional info:
Comment 1 Jiri Belka 2015-11-27 06:45 EST
Created attachment 1099663 [details]
mac pool dialog
Comment 3 Jiri Belka 2015-11-27 06:48:21 EST
engine=# select * from mac_pool_ranges ;
             mac_pool_id              |     from_mac     |      to_mac      
--------------------------------------+------------------+------------------
 0000001c-001c-001c-001c-0000000000e7 | 00:1a:4a:1:4f:00 | 00:1a:4a:1:4f:ff
(1 row)
Comment 4 Jiri Belka 2015-11-27 06:53:11 EST
Probably 03_06_0090_add_mac_pool_ranges_to_storage_pool.sql is the issue.
Comment 5 Red Hat Bugzilla Rules Engine 2015-11-29 06:53:18 EST
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
Comment 6 Martin Mucha 2015-11-30 07:29:24 EST
I 'tested' upgrade script using:

select mac_range[1], mac_range[2] from
(SELECT string_to_array(unnest(string_to_array('00:1a:4a:1a:25:00-00:1a:4a:1a:25:ff', ',')),'-') as mac_range ) as mac_ranges;

which works as expected. So it seemsw, that if there was valid range in vdc_options table, it will be correctly moved to new data structure. We need to retest this making sure, whether there's valid range in MacPoolRanges vdc option before upgrading. Jiri already has it in his queue of tasks.
Comment 7 Dan Kenigsberg 2015-11-30 09:23:43 EST
If 00:1a:4a:1a:25:0 was considered as a valid address by former releases, the upgrade script should be protective and add leading zeros when needed.
Comment 8 Yaniv Kaul 2015-11-30 09:40:56 EST
(In reply to Dan Kenigsberg from comment #7)
> If 00:1a:4a:1a:25:0 was considered as a valid address by former releases,
> the upgrade script should be protective and add leading zeros when needed.

Why? Do we have customers with such issue? Looks like such a far fetched corner case, that the fix might be more complex and risky.
Comment 9 Jiri Belka 2015-11-30 09:43:41 EST
I did not touch anything in that 3.5.x engine. Anyway, I'll be re-trying the flow again soon (some hour).
Comment 10 Jiri Belka 2015-11-30 11:07:03 EST
It could be that 3.5.6's /usr/share/ovirt-engine/setup/plugins/ovirt-engine-setup/ovirt-engine/config/macrange.py could put single hexadecimal digit... But I'm not pythonista.
Comment 11 Jiri Belka 2015-11-30 11:26:04 EST
Only suspicious cause must be some internal code, as engine-config prevents putting single hexa digit in mac address group:

# engine-config -s MacPoolRanges=00:1a:4a:1:25:00-00:1a:4a:1:25:ff                                        
Cannot set value 00:1a:4a:1:25:00-00:1a:4a:1:25:ff to key MacPoolRanges. The range start/end is an invalid MAC address. 00:1a:4a:1:25:00 should be in a format of AA:AA:AA:AA:AA:AA


I could not reproduce with valid mac range (ie. valid mac range in 3.5.6 -> 3.6.x). Unfortunately I didn't save "setup/ovirt-engine-setup-*.log" from original VM :/

This is a part from non-problematic VM:

~~~
2015-11-20 14:11:06 DEBUG otopi.context context._executeMethod:138 Stage misc METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine.config.macrange.Plugin._misc
2015-11-20 14:11:06 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:164 Database: 'None', Statement: '
                    select count(*) as count
                    from vdc_options
                    where
                        option_name=%(name)s and
                        version=%(version)s
                ', args: {'version': 'general', 'name': 'MacPoolRanges'}
2015-11-20 14:11:06 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:214 Result: [{'count': 1L}]
2015-11-20 14:11:06 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:164 Database: 'None', Statement: '
                        update vdc_options
                        set
                            option_value=%(value)s
                        where
                            option_name=%(name)s and
                            version=%(version)s
                    ', args: {'version': 'general', 'name': 'MacPoolRanges', 'value': '00:1a:4a:1a:25:00-00:1a:4a:1a:25:ff'}
2015-11-20 14:11:06 DEBUG otopi.ovirt_engine_setup.engine_common.database database.execute:214 Result: []

~~~

Please inspect setup tool if it was possible to put such bad mac range. If so, it could cause problem for users...
Comment 12 Martin Mucha 2015-12-03 04:11:18 EST
what's the required action? Given python script was deleted in 3.6, so it can issue problem only when updating from version when this script existed. I cannot estimate the chance of this script is generating wrong macs. We can either ignore it as a not probable situation (someone should probably estimate that probability then) or we can write db upgrade script, which fixes this  wrong macs.
Comment 13 Yaniv Kaul 2015-12-03 11:13:02 EST
I suggest close-wontfix.
Comment 14 Jiri Belka 2015-12-03 11:19:25 EST
(In reply to Yaniv Kaul from comment #13)
> I suggest close-wontfix.

Nobody confirmed that the code could generate such mac pool (#10). If this could appear in wild (it appeared on my env by itself), we should at least know if the code was/is wrong in 3.5 engine-setup and at least document some solution for this.
Comment 15 Martin Mucha 2015-12-04 06:57:27 EST
I don't know the circumstances in which the python script is called. I got confirmed, that bug is there. Provided, that there are 2 random generated fields which gets badly aligned, it means 40% probability of bad MAC? Depending on pseudo-random number generator used in python. So if this script was likely to be executed, it's quite probable, that there's user with defective settings.

In 3.5, prior to changes to mac pool the code behaves like this: insufficient checks of input validity are done, and code flow changes original MAC "00:1a:4a:1:4f:00" to "00:01:a4:a1:4f:00" as a coincidence. There may be scenarios, when the code might fail, I did not check. So if python generated bad macs and code did not fail, wrong range was used. After upgrade the code behaves in similar way — what's once stored in db is considered as has-to-be-valid value.

possible update script posted to gerrit.
Comment 22 Red Hat Bugzilla Rules Engine 2016-01-12 04:54:33 EST
Target release should be placed once a package build is known to fix a issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for a oVirt release.
Comment 23 Sandro Bonazzola 2016-01-14 03:26:31 EST
Sorry for the noise with assignee, something went wrong while setting target release.
Comment 24 Michael Burman 2016-01-25 02:51:54 EST
Hi, 

Our team(network) couldn't reproduce this report, not even one time. 
I performed upgrade flows multiple times 3.5>3.6 and 3.4>3.5>3.6 and we can't get to situation that the MAC address contains an invalid MAC address range like described above^^ 

On 3.4 and 3.5 it's not possible to configure an invalid MAC address range like described above^^ via the engine-config.
Comment 25 Jiri Belka 2016-01-25 05:34:23 EST
In comment #11 I stated I could not reproduce, keep in mind that mac range was somehow generated (python code).

In comment #20, you can see that bad mac ranges exist in wild, so I did not fabricate the issue.
Comment 26 Michael Burman 2016-01-27 07:34:11 EST
Finished a full upgrade cycle 3.4(latest av) > 3.5.7 > 3.6.2.6 with success, this report is not reproducible. DC properties are editable after upgrade.
Suggesting to move to VERIFIED on 3.6.2.6

Yaniv, pleas ack
Comment 27 Yaniv Lavi 2016-01-27 12:49:26 EST
Please consulate with assignee on how to mod the database to get the state we saw in the field and then test.
Comment 28 Martin Mucha 2016-01-28 02:58:33 EST
"… state we saw in the field and then test." — I'm not sure if I understand that. But if you want to get DB to state which led to inability to edit DC properties, then simple update of records in mac_pool_ranges, which makes value in column 'from_mac' or 'to_mac' defective (single char on secord or third group from the right) should bring you to defective state. Fix only treated DB data, and did not enforce check constraint on aforementioned columns, so you can easily do this.
Comment 29 Michael Burman 2016-01-28 09:00:05 EST
With Martin's help we had an invalid MAC address range before the upgrade --> 

engine-config -g MacPoolRanges
MacPoolRanges: 00:1a:4a:b:7:00-00:1a:4a:b:7:ff version: general

Run upgrade from 3.5.7 > 3.6.3 -->

engine=# select * from mac_pool_ranges;
             mac_pool_id              |     from_mac      |      to_mac       
--------------------------------------+-------------------+-------------------
 0000001c-001c-001c-001c-00000000030f | 00:00:1a:4a:b7:00 | 00:00:1a:4a:b7:ff

Verified on - 3.6.3-0.1.el6

Note You need to log in before you can comment on or make changes to this bug.