Bug 1732808

Summary: db:migrate fails: 34000000000001 is out of range for ActiveModel::Type::Integer with limit 4 bytes
Product: Red Hat CloudForms Management Engine
Reporter: Jaroslav Henner <jhenner>
Component: Appliance
Assignee: Joe Rafaniello <jrafanie>
Status: CLOSED ERRATA
QA Contact: Jaroslav Henner <jhenner>
Severity: high
Docs Contact: Red Hat CloudForms Documentation <cloudforms-docs>
Priority: high
Version: 5.10.7
CC: abellott, duhlmann, jrafanie, obarenbo, simaishi, yrudman
Target Milestone: GA
Keywords: Regression
Target Release: 5.11.0
Hardware: x86_64
OS: Linux
Fixed In Version: 5.11.0.19
Doc Type: If docs needed, set a value
Last Closed: 2019-12-12 13:36:50 UTC
Type: Bug
Category: Bug
Cloudforms Team: CFME Core
Bug Blocks: 1703278
Attachments:
  db:migrate (flags: none)
  rake db:migrate (flags: none)

Description Jaroslav Henner 2019-07-24 12:05:50 UTC
Created attachment 1593135 [details]
db:migrate

Description of problem:
The migration of a customer DB to 5.11 fails due to a number being out of range.

Version-Release number of selected component (if applicable):
cfme-5.11.0.15-1.el8cf.x86_64


How reproducible:
2/2

Steps to Reproduce:
1. Prepare a VM with >15G in /var/lib/psql
2. pg_restore -U root -j 4 -d vmdb_production /net/$NFS/srv/export/customer_db_dump
3. fix_auth --databaseyml
4. fix_auth --v2 --invalid bogus
5. vmdb; rake db:migrate


Actual results:

== 20181016140921 MigrateOrchStacksToHaveOwnershipConcept: migrating ==========
-- Migrating existing orchestration stacks to have direct owners, groups, tenant
rake aborted!
StandardError: An error has occurred, this and all later migrations canceled:

34000000000001 is out of range for ActiveModel::Type::Integer with limit 4 bytes
/opt/rh/cfme-gemset/gems/activemodel-5.1.7/lib/active_model/type/integer.rb:51:in `ensure_in_range'
/opt/rh/cfme-gemset/gems/activemodel-5.1.7/lib/active_model/type/integer.rb:27:in `serialize'
/opt/rh/cfme-gemset/gems/activerecord-5.1.7/lib/active_record/attribute.rb:51:in `value_for_database'
/opt/rh/cfme-gemset/gems/activerecord-5.1.7/lib/active_record/attribute.rb:63:in `forgetting_assignment'
/opt/rh/cfme-gemset/gems/activerecord-5.1.7/lib/active_record/attribute_set/builder.rb:21:in `transform_values'


Expected results:
migration complete

Additional info:
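For context, a simplified sketch (illustrative only, not the actual ActiveModel source in active_model/type/integer.rb) of the range check that produces this error: a limit of 4 bytes means a signed 32-bit range, which the id value exceeds.

```ruby
# Simplified sketch of the check behind ensure_in_range: a 4-byte
# (32-bit signed) integer column can hold -2**31 .. 2**31 - 1.
limit = 4                          # bytes, as reported in the error
min   = -(2**(limit * 8 - 1))      # -2_147_483_648
max   =  2**(limit * 8 - 1) - 1    #  2_147_483_647
value = 34_000_000_000_001

in_range = value.between?(min, max)
puts in_range                      # false -> "out of range ... with limit 4 bytes"
```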

Comment 3 Jaroslav Henner 2019-07-24 12:27:58 UTC
Created attachment 1593140 [details]
rake db:migrate

Comment 4 Jaroslav Henner 2019-07-24 12:34:10 UTC
Indeed, this shows that we need at least 6 bytes to encode a value this big:
In [3]: math.log(34000000000001)/math.log(2)/8
Out[3]: 5.618824997487342
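The same calculation in Ruby (a sketch mirroring the Python above): the failing id needs 6 bytes, two more than the 4-byte column limit.

```ruby
# Bytes needed to represent the failing value, vs. the 4-byte limit
# reported in the migration error.
value = 34_000_000_000_001
bytes_needed = (Math.log2(value) / 8).ceil
puts bytes_needed                  # 6 -- does not fit in a 4-byte integer column
```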

Comment 5 Jaroslav Henner 2019-07-24 13:12:25 UTC
I think I can mark this as Regression. When I migrated the same db to 5.10, it worked.

Comment 6 Brandon Dunne 2019-07-24 15:10:47 UTC
Hi Jaroslav,

In the description it says "How reproducible: 2/2"; do you have a reproducer environment that I can look at?  I'm not able to reproduce this with my database.

In Comment 5 it says "When I migrated the same db to 5.10, it worked.", but these migrations aren't in v5.10, so I don't think this is relevant.


As you can see in the relevant section of the attached log, all three of the columns that we are attempting to update were created as :bigint (limit: 8), so I'm interested to see how to reproduce this.

== 20181010134649 AddEvmOwnerToOrchestrationStacks: migrating =================
-- add_reference(:orchestration_stacks, :evm_owner, {:type=>:bigint, :index=>true})
   -> 0.0077s
-- add_reference(:orchestration_stacks, :miq_group, {:type=>:bigint, :index=>true})
   -> 0.0076s
-- add_reference(:orchestration_stacks, :tenant, {:type=>:bigint, :index=>true})
   -> 0.0063s
== 20181010134649 AddEvmOwnerToOrchestrationStacks: migrated (0.0219s) ========

== 20181016140921 MigrateOrchStacksToHaveOwnershipConcept: migrating ==========
-- Migrating existing orchestration stacks to have direct owners, groups, tenant
rake aborted!
StandardError: An error has occurred, this and all later migrations canceled:

34000000000001 is out of range for ActiveModel::Type::Integer with limit 4 bytes

Comment 8 Joe Rafaniello 2019-07-24 21:32:28 UTC
Testing with the database, we were able to binary-search it down to this:
rake db:migrate VERSION=20180409120422; rake db:migrate

This migrates back to just after that version and then tries to migrate forward.  It looks like the following migrations must all be run, where the first somehow pollutes the in-memory state of the last (the reported failing migration).

20180409120422_create_physical_network_ports.rb  (where we start)

The following migrations are then run:
20180424141617_azure_backslash_to_forward_slash.rb  (Seems to be the culprit)
20180425123859_add_index_on_all_type_columns.rb
20180426163655_create_physical_storage.rb
20180507134810_azure_normalize_image_name.rb
20180525111220_add_connected_port_to_physical_network_port.rb
20180525171150_add_deleted_on_to_service_template.rb
20180530160321_add_report_base_model_to_chargeback_rate.rb
20180605135438_add_resource_to_miq_schedule.rb
20180605210436_add_title_cves_to_openscap_rule_results.rb
20180606083431_convert_quadicon_settings_keys.rb
20180606155924_move_ansible_container_secrets_into_database.rb
20180606201908_rename_towhat_to_resource_type.rb
20180607084710_nuage_subclass_l3_cloud_subnet.rb
20180607134817_add_network_router_id_to_floating_ip.rb
20180611131314_add_status_to_storages.rb
20180613200937_add_firmware_type_to_hardware.rb
20180618083035_add_visible_to_zone.rb
20180618084054_init_zones_visibility.rb
20180618084757_add_zone_before_pause_id_to_ext_management_system.rb
20180618212608_create_cloud_volume_types.rb
20180620170052_add_aws_region_to_file_depot.rb
20180625120055_add_ems_ref_to_lans.rb
20180705190447_add_chassis_and_switch_to_event_stream.rb
20180706115011_add_loc_led_name_asset_details.rb
20180712122000_remove_host_provisioning.rb
20180713194201_create_canister.rb
20180713194229_add_canister_id_to_hardwares.rb
20180713201539_add_physical_chassis_id_to_physical_storage.rb
20180718132840_remove_transformation_product_setting.rb
20180719162710_add_owner_and_group_to_auth.rb
20180719163110_add_internal_to_service_template.rb
20180726142030_create_physical_disk.rb
20180807152553_drop_vim_performance_tag_values.rb
20180807153714_add_conversion_host_table.rb
20180810144738_update_default_internal_attribute.rb
20180813141056_add_cancelation_status_to_miq_request.rb
20180817152200_add_network_router_id_to_security_group.rb
20180817152201_add_cloud_subnet_id_to_security_group.rb
20180817212259_add_api_version_and_domain_id_and_security_protocol_and_openstack_region_to_file_depot.rb
20180821112856_create_service_catalog_tables.rb
20180823111741_move_location_led_state_to_asset_details_table.rb
20180827083140_create_service_instances_table.rb
20180827145819_add_link_to_notification_types.rb
20180828092111_add_parent_physical_chassis_id_to_physical_chassis.rb
20180830121026_add_port_status_to_physical_network_ports.rb
20180905144610_remove_report_base_model_from_chargeback_rate.rb
20180914214707_drop_duplicate_indexes.rb
20180917151300_add_limits_to_conversion_host.rb
20180920085721_add_maintenance_zone_id_to_region.rb
20180924144957_rename_physical_disk_type.rb
20180926152238_fix_default_tenant_group.rb
20181001131632_add_conversion_host_id_to_miq_request_tasks.rb
20181002123523_add_total_space_to_physical_storages.rb
20181002192054_fix_conversion_host_resource_type.rb
20181003122633_add_canister_id_and_ems_ref_to_physical_disks.rb
20181010134649_add_evm_owner_to_orchestration_stacks.rb
20181012160010_remove_special_characters_from_ansible_rabbitmq_password.rb
20181012161000_remove_special_characters_from_ansible_rabbitmq_password_two.rb
20181016140921_migrate_orch_stacks_to_have_ownership_concept.rb  (the failing migration)

Will need to look into this some more.

Comment 11 drew uhlmann 2019-07-30 20:13:58 UTC
Yup, sorry, please ignore comment 9, it's not the way to go.

Comment 12 CFME Bot 2019-08-07 16:01:59 UTC
New commit detected on ManageIQ/manageiq-schema/master:

https://github.com/ManageIQ/manageiq-schema/commit/cdfce63d2bc668d3e619519db5449ecb431e3ed8
commit cdfce63d2bc668d3e619519db5449ecb431e3ed8
Author:     Joe Rafaniello <jrafanie>
AuthorDate: Thu Jul 25 14:24:34 2019 -0400
Commit:     Joe Rafaniello <jrafanie>
CommitDate: Thu Jul 25 14:24:34 2019 -0400

    Migrate with cleared schema cache

    Wrap each migration with a clearing of the schema cache to ensure the
    schema/column information is not cached.  Since migrations very often
    change schema, these caches are likely to get busted.

    Migration authors don't realize they may need to reset the column
    information whenever they change the schema so it's often overlooked.
    It often doesn't cause a problem, until it does:

    20180424141617_azure_backslash_to_forward_slash.rb
     - loads stub: class OrchestrationStack < ActiveRecord::Base
     - orchestration_stacks table column information is cached by table name

    20181010134649_add_evm_owner_to_orchestration_stacks.rb
     - adds evm_owner, miq_group, tenant references (_id columns)

    20181016140921_migrate_orch_stacks_to_have_ownership_concept.rb
     - loads stub: class OrchestrationStack < ActiveRecord::Base
     - uses cached table column information
     - Blows up trying to store bigint into what rails thinks is an integer column:
       stack.update_attributes(:evm_owner_id => user.id, :tenant_id =>
       user.current_tenant.id, :miq_group_id => user.current_group.id)
       '34000000000001 is out of range for ActiveModel::Type::Integer with
       limit 4 bytes'

    This commit eliminates the need to reset column information for any of
    these migrations as we clear the schema cache for all tables before each
    migration.

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1732808

 lib/manageiq/schema/engine.rb | 5 +
 lib/manageiq/schema/migrate_with_cleared_schema_cache.rb | 22 +
 2 files changed, 27 insertions(+)
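The failure mode the commit describes can be sketched with a toy column cache (names are illustrative only; the real fix lives in migrate_with_cleared_schema_cache.rb and clears ActiveRecord's schema cache before each migration):

```ruby
# Toy model of the stale-schema-cache bug and the fix. A "schema cache"
# maps table name -> column info; migrations that change the schema leave
# earlier cache entries stale unless the cache is cleared.
SchemaCache = {}

def columns(table, live_schema)
  # Like ActiveRecord, consult the cache before hitting the database.
  SchemaCache[table] ||= live_schema[table].dup
end

live = { "orchestration_stacks" => { "id" => :integer } }

# Migration A loads a stub model, caching the table's column info.
columns("orchestration_stacks", live)

# Migration B adds a bigint reference column directly in the database.
live["orchestration_stacks"]["evm_owner_id"] = :bigint

# Migration C reads columns again -- without clearing, it sees stale data
# and would serialize ids using the old (4-byte integer) type info.
stale = columns("orchestration_stacks", live)
puts stale.key?("evm_owner_id")    # false: the new column is invisible

# The fix: clear the cache before each migration so columns are reloaded.
SchemaCache.clear
fresh = columns("orchestration_stacks", live)
puts fresh.key?("evm_owner_id")    # true
```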

Comment 13 CFME Bot 2019-08-07 22:32:55 UTC
New commit detected on ManageIQ/manageiq-schema/ivanchuk:

https://github.com/ManageIQ/manageiq-schema/commit/ed95ecf9b91de89c4de30702d2efc81ba0a5438d
commit ed95ecf9b91de89c4de30702d2efc81ba0a5438d
Author:     Brandon Dunne <bdunne>
AuthorDate: Wed Aug  7 12:01:45 2019 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Wed Aug  7 12:01:45 2019 -0400

    Merge pull request #401 from jrafanie/migrate_with_cleared_schema_cache

    Migrate with cleared schema cache

    (cherry picked from commit 44bbd51ddb5a8566ebe5c75da4c4175d726e0774)

    https://bugzilla.redhat.com/show_bug.cgi?id=1732808

 lib/manageiq/schema/engine.rb | 5 +
 lib/manageiq/schema/migrate_with_cleared_schema_cache.rb | 22 +
 2 files changed, 27 insertions(+)

Comment 14 Jaroslav Henner 2019-08-21 14:10:57 UTC
5.11.0.19 worked. cfme-5.11.0.18-1.el8cf.x86_64 with the same steps gives the error in the bug description.

This looks to be fixed.

Comment 16 errata-xmlrpc 2019-12-12 13:36:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:4199