Created attachment 1612252 [details]
screenshot

Description of problem:
After restoring the DB on 5.11 I don't see any zone or server there, although I restored an environment with two evmserverds running connected to the same DB.

Version-Release number of selected component (if applicable):
Version 5.11.0.22.20190827200559_e618ece

How reproducible:
I created the two-machine environment and then:
* dumped with pg_dumpall (from https://bugzilla.redhat.com/show_bug.cgi?id=1726467#c7)
* also tried it the old way: pg_dump --format custom --file pg_dump-5.10 vmdb_production

Steps to Reproduce:
1. Create the environment with two evmserverds running connected to the same DB. You will need to use the appliance console option (Join Region in External Database).
2. Create two machines with CFME 5.11.
3. Follow the steps in https://bugzilla.redhat.com/show_bug.cgi?id=1726467 with small changes: copy the UUIDs of both appliances and make sure both database.yml files have a valid key, but also make sure the evm-only appliance has the correct IP in the database.yml > production > host field.
4. Start the evmserverds.

Actual results:
Both appliances seem to work quite fine, but we cannot access the configuration of the servers, as there is nothing in Configuration > Zones.

[----] I, [2019-09-06T04:47:11.391130 #47579:2af7ba7f45a4]  INFO -- : MIQ(AssignedServerRole#activate) Activating Role <user_interface> on Server <EVM>
[----] I, [2019-09-06T04:47:11.396195 #47579:2af7ba7f45a4]  INFO -- : MIQ(AssignedServerRole#activate) Activating Role <web_services> on Server <EVM>
[----] I, [2019-09-06T04:47:11.402323 #47579:2af7ba7f45a4]  INFO -- : MIQ(AssignedServerRole#activate) Activating Role <remote_console> on Server <EVM>
[----] E, [2019-09-06T04:47:11.412251 #47579:2af7ba7f45a4] ERROR -- : MIQ(MiqServer#monitor) comparison of Array with Array failed
[----] E, [2019-09-06T04:47:11.412732 #47579:2af7ba7f45a4] ERROR -- : [ArgumentError]: comparison of Array with Array failed  Method:[block (2 levels) in <class:LogProxy>]
[----] E, [2019-09-06T04:47:11.412897 #47579:2af7ba7f45a4] ERROR -- : /var/www/miq/vmdb/app/models/miq_server.rb:277:in `sort_by'
/var/www/miq/vmdb/app/models/miq_server.rb:277:in `log_active_servers'
/var/www/miq/vmdb/app/models/miq_server.rb:345:in `block in monitor'
/opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-ca1c762f8036/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-ca1c762f8036/lib/gems/pending/util/extensions/miq-benchmark.rb:28:in `realtime_block'
/var/www/miq/vmdb/app/models/miq_server.rb:345:in `monitor'
/var/www/miq/vmdb/app/models/miq_server.rb:387:in `block (2 levels) in monitor_loop'
/opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-ca1c762f8036/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-ca1c762f8036/lib/gems/pending/util/extensions/miq-benchmark.rb:35:in `realtime_block'
/var/www/miq/vmdb/app/models/miq_server.rb:387:in `block in monitor_loop'
/var/www/miq/vmdb/app/models/miq_server.rb:386:in `loop'
/var/www/miq/vmdb/app/models/miq_server.rb:386:in `monitor_loop'
/var/www/miq/vmdb/app/models/miq_server.rb:248:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:27:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:48:in `start'
/var/www/miq/vmdb/lib/workers/bin/evm_server.rb:4:in `<main>'
[----] I, [2019-09-06T04:47:11.413021 #47579:2af7ba7f45a4]  INFO -- : MIQ(MiqServer#monitor) Reconnecting to database after error...
[----] I, [2019-09-06T04:47:11.424482 #47579:2af7ba7f45a4]  INFO -- : MIQ(MiqServer#monitor) Reconnecting to database after error...Successful
[----] I, [2019-09-06T04:47:17.620734 #48145:2af7ba7f45a4]  INFO -- : MIQ(MiqScheduleWorker::Runner#do_work) Number of scheduled items to be processed: 1.

Expected results:
One zone, two servers in it, no error in the log.

Additional info:
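For context, "comparison of Array with Array failed" is Ruby's generic error when sort keys contain incomparable elements, e.g. a nil mixed with strings, which is the failure mode `sort_by` in `log_active_servers` hits here. A minimal standalone illustration (the data is hypothetical, not the actual server attributes):

```ruby
# sort compares element arrays pairwise; "zone" == "zone", then nil <=> "server-a"
# returns nil, so Array#<=> returns nil and sort raises ArgumentError.
rows = [["zone", nil], ["zone", "server-a"]]
begin
  rows.sort
rescue ArgumentError => e
  puts e.message  # prints "comparison of Array with Array failed"
end
```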
Created attachment 1612257 [details]
on-5.11-created

I tried to create the two-evmserverd-with-one-db environment on 5.11 (no dump+restore, just create) and it doesn't suffer from the problem.
I also tried to restore the 5.10 DB on 5.11 without preserving the GUID. I started only one of the two evmserverds. The zones being empty still holds: no zone, no server. But I also didn't see any ERROR in the log.
Hi Jaroslav,

Thanks for testing this. My first thought is that Nick didn't mention configuring the database with the same region. Can you use psql to run "select id, name, guid from miq_servers" and "select id, region from miq_regions"?

If that's not it, please provide the IP addresses of the machines so I can take a look.

Thanks!

Note, Nick also didn't include the steps of "configure internal db", copy the GUID, then dropdb vmdb_production, createdb vmdb_production, THEN psql (restore). Can you confirm that we also need to do the dropdb and createdb vmdb_production?
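For reference, those two checks as psql one-liners (assuming the default vmdb_production database name, run on the appliance hosting the DB):

```shell
psql vmdb_production -c 'SELECT id, name, guid FROM miq_servers;'
psql vmdb_production -c 'SELECT id, region FROM miq_regions;'
```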
Note, we often neglect stuff that we do without thinking so it's great for you to point out missing details. So, thanks!!!
I tried restoring a 5.9 single-appliance deployment on 5.10 and on 5.11 and found no issue.
(In reply to Joe Rafaniello from comment #4)
> Hi Jaroslav,
>
> Thanks for testing this. My first thought is Nick didn't mention to
> configure the database with the same region.

I am configuring them with the same region.

> Can you use psql to see what
> select id, name, guid from miq_servers and select id, region from
> miq_regions?
>
> If that's not it, please provide ip address of the machines so I can take a
> look.

Well, I have already deleted my machines, but as I apparently need to check whether this is a regression, I am preparing new ones.

> Thanks!
>
> Note, Nick also didn't include the step of "configure internal db", copy
> GUID, then dropdb vmdb_production, createdb vmdb_production.

I do that stuff, except the createdb vmdb_production, which doesn't seem to be required.

> THEN psql
> (restore)..
>
> Can you confirm that we also need to do the dropdb and createdb
> vmdb_production?
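For reference, the restore sequence under discussion, sketched as shell commands assembled from the comments in this thread (the dump filename is illustrative; the full procedure is in bug 1726467):

```shell
# After "configure internal db" in the appliance console and copying the GUID:
dropdb vmdb_production
createdb vmdb_production               # possibly optional, per the discussion above
psql vmdb_production < vmdb-pg_dumpall.pg
```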
In my tests I am checking the effect of preserving the GUID versus using a new GUID. So far, when I didn't preserve the GUIDs, I got either duplicates of the servers or just nothing at all in the Zones, depending on whether the bug emerges. It doesn't seem to me that the GUID is the important thing here. I think I should be getting either a new server in the Zones or the previous one per appliance, not nothing at all.
>> If that's not it, please provide ip address of the machines so I can take a
>> look.
> Well I have already deleted my machines but as I apparently need to check is
> this a regression, I am preparing new ones.

Can you keep machines for reported issues at least 1 week? Logs (vmdb/log/*) for this issue are also helpful, as I would like to see the boot seeding and our logging of the id range for the various objects. We're missing information to help solve this issue.

> So far when I didn't preserve GUIDs i get either double of the servers or
> just comple nothing in the Zones, depending on whether the bug emerges. It
> doesn't seem to me the GUID is important thing. I think I should be getting
> either new server in the Zones or the previous one per appliance, not
> complete nothing.

Yes, if you copy over the GUID, you should get 2 servers (miq_servers table) and whatever zones you had originally (1 default zone is the default, zones table). If you don't preserve the GUID, you should get 4 servers (2 duplicates of the same-name server, each with a different GUID) and the same zones as previously.

If you are seeing no zones after restore, I need to see the logs and the miq_servers and zones tables, as there's clearly a bug we need to fix. Thanks!
With Jaroslav's environment, I found the Zone is not marked as visible. I believe this column was added going from 5.10 to 5.11 and we didn't change any existing zones to be visible => true in a migration. Looking at solutions.

I was able to "fix" it by marking the default Zone as visible:

Loading production environment (Rails 5.1.7)
irb(main):001:0> Zone.first.visible
PostgreSQLAdapter#log_after_checkout, connection_pool: size: 5, connections: 1, in use: 1, waiting_in_queue: 0
=> nil
irb(main):002:0> Zone.first.update(:visible => true)
=> true
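The symptom follows from nil being falsey in Ruby: any UI code that filters zones on the visible attribute drops a zone whose visible is nil exactly as if it were false. A self-contained sketch of that failure mode (a plain Struct standing in for the real Zone model):

```ruby
# Stand-in for the Zone model; only the attributes relevant here.
Zone = Struct.new(:name, :visible)

zones = [Zone.new("default", nil), Zone.new("other", true)]

# A visibility filter like the UI's treats nil the same as false,
# so the default zone silently disappears from the list.
shown = zones.select(&:visible)
puts shown.map(&:name).inspect  # prints ["other"]
```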
We have a migration to set the existing zone as visible => true, added here: https://github.com/ManageIQ/manageiq-schema/pull/275 Looking into why this didn't work.
Note, the migration 20180618084054_init_zones_visibility.rb was run in the logs:

[----] I, [2019-09-06T11:35:27.959207 #2521:2b1688a585c4]  INFO -- : Migrating to AddVisibleToZone (20180618083035)
[----] I, [2019-09-06T11:35:28.174954 #2521:2b1688a585c4]  INFO -- : Migrating to InitZonesVisibility (20180618084054)

Somehow, the default zone's visible column was still nil.

Note2, this column and data migration were added in hammer/5.10, NOT ivanchuk/5.11, so the reported database came from 5.9. If we migrated from 5.9 directly to 5.11, I don't think that's supported; see https://bugzilla.redhat.com/show_bug.cgi?id=1726467#c8

Will need to look further at this. Even if a 5.9 -> 5.11 direct upgrade isn't supported, I would think the database migration should work, so I'm still looking at this.
(In reply to Joe Rafaniello from comment #12)
> Note, the migration 20180618084054_init_zones_visibility.rb was run in the
> logs:
>
> [----] I, [2019-09-06T11:35:27.959207 #2521:2b1688a585c4] INFO -- :
> Migrating to AddVisibleToZone (20180618083035)
> [----] I, [2019-09-06T11:35:28.174954 #2521:2b1688a585c4] INFO -- :
> Migrating to InitZonesVisibility (20180618084054)
>
> Somehow, the default zone's visible column was still nil.
>
> Note2, this column and data migration was added in hammer/5.10, NOT
> ivanchuk/5.11 so the reported database came from 5.9. If we migrated from
> 5.9 directly to 5.11, I don't think that's supported, see
> https://bugzilla.redhat.com/show_bug.cgi?id=1726467#c8

I think you see the migration in the log because, I assume, all the migrations are run at the time the DB is being created with the appliance console. I am pretty sure the VM you got was restored from a 5.10 DB.

> Will need to look further at this. Even if 5.9 -> 5.11 direct upgrade isn't
> supported, I would think the database migration should work so I'm still
> looking at this.
Looking at the database dump that was imported:

-- Data for Name: zones; Type: TABLE DATA; Schema: public; Owner: root
--
COPY public.zones (id, name, description, created_on, updated_on, settings, log_file_depot_id, visible) FROM stdin;
1	default	Default Zone	2019-09-06 15:45:03.290445	2019-09-06 15:45:03.290445	\N	\N	\N
\.

* It came from at least 5.10 since the visible flag is there
* The visible value was exported as NULL
* This value should have been migrated from nil to true in the migration
5.9 -> 5.10 with two evmserverds and one DB: dump+restore works without issues.
Ah, I got it: the existing database that produced this export is bad.

1) The database dump has a nil value for the visible flag for the default zone in the zones table (see comment #10)
2) The migration to fix this nil has already been run (according to the database dump), meaning it's a 5.10 database that incorrectly has a nil visible flag

vmdb_production=# select * from schema_migrations where version = '20180618084054';
    version
----------------
 20180618084054
(1 row)

[root@dhcp-8-198-97 vmdb]# grep 20180618084054 ~/single_region_two_apps/510z/vmdb-pg_dumpall.pg
...
678	20180618084054	2019-09-06 15:44:16.062897

Therefore, this database dump was already a problem. Note, we'll only run the migration 20180618084054_init_zones_visibility.rb if the schema_migrations table doesn't already have it. In this case, it's already in that table, so we don't run it.

Note2, I was confused because I saw it in the logs in comment #12:

[----] I, [2019-09-06T11:35:27.959207 #2521:2b1688a585c4]  INFO -- : Migrating to AddVisibleToZone (20180618083035)
[----] I, [2019-09-06T11:35:28.174954 #2521:2b1688a585c4]  INFO -- : Migrating to InitZonesVisibility (20180618084054)

This was done when configuring the internal database, before it was blown away by the restore. It's not run after this.
Note, migrations that occurred from 5.10 to 5.11 happened nearly 40 minutes later and DO NOT include the migration that will fix the nil visible default zone (20180618084054):

[----] I, [2019-09-06T12:14:53.836650 #4564:2aae417045bc]  INFO -- : Migrating to AddTotalSpaceToPhysicalStorages (20181002123523)
[----] I, [2019-09-06T12:14:54.059273 #4564:2aae417045bc]  INFO -- : Migrating to AddCanisterIdAndEmsRefToPhysicalDisks (20181003122633)
[----] I, [2019-09-06T12:14:54.244986 #4564:2aae417045bc]  INFO -- : Migrating to AddEvmOwnerToOrchestrationStacks (20181010134649)
[----] I, [2019-09-06T12:14:54.464618 #4564:2aae417045bc]  INFO -- : Migrating to MigrateOrchStacksToHaveOwnershipConcept (20181016140921)
[----] I, [2019-09-06T12:14:54.684739 #4564:2aae417045bc]  INFO -- : Migrating to AddCommentsToVmsTable (20181025230931)
[----] I, [2019-09-06T12:14:54.897453 #4564:2aae417045bc]  INFO -- : Migrating to SetVmConnectionState (20181119154009)
[----] I, [2019-09-06T12:14:55.136144 #4564:2aae417045bc]  INFO -- : Migrating to ClassificationParentNull (20181130203334)
[----] I, [2019-09-06T12:14:55.340988 #4564:2aae417045bc]  INFO -- : Migrating to RemoveRenamedAnsibleTowerConfigurationManagerRefreshWorkerRows (20181203224640)
[----] I, [2019-09-06T12:14:55.535358 #4564:2aae417045bc]  INFO -- : Migrating to RenameWebsocketToRemoteConsole (20190108135546)
[----] I, [2019-09-06T12:14:55.750219 #4564:2aae417045bc]  INFO -- : Migrating to RemoveCinderManagerEventWorkerRows (20190108163812)
[----] I, [2019-09-06T12:14:55.973939 #4564:2aae417045bc]  INFO -- : Migrating to AddMultiAttachmentToCloudVolume (20190110201414)
[----] I, [2019-09-06T12:14:56.177379 #4564:2aae417045bc]  INFO -- : Migrating to UseViewsForMetrics (20190122213042)
[----] I, [2019-09-06T12:14:56.479288 #4564:2aae417045bc]  INFO -- : Migrating to AddAccessibleToHostStorages (20190123210452)
[----] I, [2019-09-06T12:14:56.725488 #4564:2aae417045bc]  INFO -- : Migrating to AddHostIdToSwitch (20190201173247)
[----] I, [2019-09-06T12:14:56.969282 #4564:2aae417045bc]  INFO -- : Migrating to AddMissingEmsIdToSwitch (20190201173316)
[----] I, [2019-09-06T12:14:57.224848 #4564:2aae417045bc]  INFO -- : Migrating to AddCommentsToConversionHostsTable (20190213184307)
[----] I, [2019-09-06T12:14:57.460033 #4564:2aae417045bc]  INFO -- : Migrating to AddCommentsToMiqTasksTable (20190214125845)
[----] I, [2019-09-06T12:14:57.664961 #4564:2aae417045bc]  INFO -- : Migrating to AddLastInventoryDateToEms (20190225142729)
[----] I, [2019-09-06T12:14:57.884324 #4564:2aae417045bc]  INFO -- : Migrating to CheckGuidUniqueness (20190226184255)
[----] I, [2019-09-06T12:14:58.137090 #4564:2aae417045bc]  INFO -- : Migrating to ServiceTemplatesShouldHaveNames (20190228210310)
[----] I, [2019-09-06T12:14:58.362463 #4564:2aae417045bc]  INFO -- : Migrating to RemoveRssFeeds (20190301174502)
[----] I, [2019-09-06T12:14:58.593290 #4564:2aae417045bc]  INFO -- : Migrating to AddProviderGuids (20190304192641)
[----] I, [2019-09-06T12:14:58.837911 #4564:2aae417045bc]  INFO -- : Migrating to AddProviderServicesSupportedToAvailabilityZone (20190305181255)
[----] I, [2019-09-06T12:14:59.063163 #4564:2aae417045bc]  INFO -- : Migrating to DropContainerDeploymentsAndNodes (20190306235417)
[----] I, [2019-09-06T12:14:59.281422 #4564:2aae417045bc]  INFO -- : Migrating to RemovingAuthenticationForContainerDeployments (20190307131832)
[----] I, [2019-09-06T12:14:59.528678 #4564:2aae417045bc]  INFO -- : Migrating to CreateJoinTableServiceTemplateTenant (20190314110421)
[----] I, [2019-09-06T12:14:59.788503 #4564:2aae417045bc]  INFO -- : Migrating to AddZoneToServiceTemplates (20190318190517)
[----] I, [2019-09-06T12:15:00.001317 #4564:2aae417045bc]  INFO -- : Migrating to AddStateToServices (20190325160127)
[----] I, [2019-09-06T12:15:00.209922 #4564:2aae417045bc]  INFO -- : Migrating to DialogFieldLoadValuesOnInit (20190327132620)
[----] I, [2019-09-06T12:15:00.490020 #4564:2aae417045bc]  INFO -- : Migrating to AddPriceToServiceTemplates (20190513155251)
[----] I, [2019-09-06T12:15:00.709424 #4564:2aae417045bc]  INFO -- : Migrating to CreateFirmwareRegistries (20190514115219)
[----] I, [2019-09-06T12:15:00.933837 #4564:2aae417045bc]  INFO -- : Migrating to CreateFirmwareBinaries (20190514124322)
[----] I, [2019-09-06T12:15:01.203998 #4564:2aae417045bc]  INFO -- : Migrating to CreateFirmwareTargets (20190517093412)
[----] I, [2019-09-06T12:15:01.459263 #4564:2aae417045bc]  INFO -- : Migrating to CreateFirmwareBinaryFirmwareTargets (20190517093516)
[----] I, [2019-09-06T12:15:01.732305 #4564:2aae417045bc]  INFO -- : Migrating to AddPriceToServices (20190520174739)
[----] I, [2019-09-06T12:15:01.965035 #4564:2aae417045bc]  INFO -- : Migrating to RenameServiceState (20190521172822)
[----] I, [2019-09-06T12:15:02.203652 #4564:2aae417045bc]  INFO -- : Migrating to AddEmsLicenses (20190528160313)
[----] I, [2019-09-06T12:15:02.448740 #4564:2aae417045bc]  INFO -- : Migrating to AddEmsExtensions (20190528161746)
[----] I, [2019-09-06T12:15:02.671423 #4564:2aae417045bc]  INFO -- : Migrating to AddGuidToMiqDatabases (20190531161732)
[----] I, [2019-09-06T12:15:02.916436 #4564:2aae417045bc]  INFO -- : Migrating to CreateExternalUrls (20190604090631)
[----] I, [2019-09-06T12:15:03.168178 #4564:2aae417045bc]  INFO -- : Migrating to AddAnsibleColumnsToAuthentications (20190613214747)
[----] I, [2019-09-06T12:15:03.489919 #4564:2aae417045bc]  INFO -- : Migrating to MoveAwxCredentialsToAuthentications (20190617191109)
[----] I, [2019-09-06T12:15:03.757443 #4564:2aae417045bc]  INFO -- : Migrating to FixUnserializableNotificationOptions (20190708192323)
[----] I, [2019-09-06T12:15:03.983170 #4564:2aae417045bc]  INFO -- : Migrating to SetProvisionedStateToServices (20190712135032)
[----] I, [2019-09-06T12:15:04.202999 #4564:2aae417045bc]  INFO -- : Migrating to AddGitRepositoryToConfigurationScriptSource (20190716210326)
[----] I, [2019-09-06T12:15:04.443629 #4564:2aae417045bc]  INFO -- : Migrating to AddAuthenticationIdToGitRepository (20190723023214)
[----] I, [2019-09-06T12:15:04.676650 #4564:2aae417045bc]  INFO -- : Migrating to RemoveLocalDefaultEmbeddedAnsibleRepos (20190726204302)
[----] I, [2019-09-06T12:15:04.913009 #4564:2aae417045bc]  INFO -- : Migrating to MoveEmbeddedAnsibleProxySettingToGitRepositoryProxySettings (20190809193031)
Hi Jaroslav,

To summarize my last comment:

1) We won't fix the zone's visibility flag because the migration has already been run
2) The existing database export is wrong, so the bug/problem occurred before you ran pg_dumpall

I couldn't find the vmdb_production database on the .132 appliance that you said this database came from. Can you verify the zones table (it shouldn't have a nil visible value) before you export the 5.10 database and try the upgrade again?

Note, we added the visible column and migrated it in the same pull request, so there's no chance for you to add the column and not migrate it: https://github.com/ManageIQ/manageiq-schema/commit/32838a47f68769ceee62ff088205aff14e821bcd
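A quick way to run that verification on the 5.10 appliance before exporting (assuming the default vmdb_production database name):

```shell
# visible should be t for every zone; a blank (NULL) value reproduces this bug
psql vmdb_production -c 'SELECT id, name, visible FROM zones;'
```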
Before and after the migration from 5.10 to 5.11 and automate:reset I see:

[root@dhcp-8-198-97 vmdb]# psql vmdb_production -c 'select * from zones;'
 id |  name   | description  |         created_on         |         updated_on         | settings | log_file_depot_id | visible
----+---------+--------------+----------------------------+----------------------------+----------+-------------------+---------
  1 | default | Default Zone | 2019-09-09 11:23:17.077841 | 2019-09-09 11:23:17.077841 |          |                   |
(1 row)

After starting evmserverd I see:

[root@dhcp-8-198-97 vmdb]# psql vmdb_production -c 'select * from zones;'
 id |                        name                         |   description    |         created_on         |         updated_on         | settings | log_file_depot_id | visible
----+-----------------------------------------------------+------------------+----------------------------+----------------------------+----------+-------------------+---------
  1 | default                                             | Default Zone     | 2019-09-09 11:23:17.077841 | 2019-09-09 11:23:17.077841 |          |                   |
  2 | __maintenance__a19f5523-1bc3-49ea-985d-90427138da9a | Maintenance Zone | 2019-09-09 12:35:51.582649 | 2019-09-09 12:35:51.582649 |          |                   | f
(2 rows)

Sometimes the WebUI was available only after stopping and then starting evmserverd again. When the WebUI is available, I see no zone in Zones.
Looking for details regarding how the exported (with the bad visible value) DB was created. Was it the result of a 5.9 to 5.10 upgrade, was this a fresh 5.10 deployment, was it used for provider pause/resume testing, etc.
(In reply to dmetzger from comment #19)
> Looking for details regarding how the exported (with the bad visible value)
> DB was created. Was it the result of a 5.9 to 5.10 upgrade, was this a fresh
> 5.10 deployment, was it used for provider pause/resume testing, etc.

I took 2 fresh unconfigured CFME 5.10 appliances. I configured one, creating the DB with region 0 using the appliance console, and then for the other I picked the "Join Region in External Database" menu option in appliance_console and pointed it at the first appliance. I made sure they are using the same v2_key.

Fresh 5.10 environment.
https://github.com/ManageIQ/manageiq-schema/pull/413
New commit detected on ManageIQ/manageiq-schema/master:
https://github.com/ManageIQ/manageiq-schema/commit/87f08813032ba391754ba50a68acd9517915644a

commit 87f08813032ba391754ba50a68acd9517915644a
Author:     Joe Rafaniello <jrafanie>
AuthorDate: Mon Sep 9 16:43:31 2019 -0400
Commit:     Joe Rafaniello <jrafanie>
CommitDate: Mon Sep 9 16:43:31 2019 -0400

    Set zone's visible column default true and update existing rows

    Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1749694

    With hammer (rails 5.0), Zone.seed was not creating the default zone
    with visible => true, causing it to remain nil. The hammer UI code
    wasn't checking this when displaying zones in the UI. When we upgraded
    to Ivanchuk/rails 5.1, the UI now checks for visible and these
    visible => nil zones created in hammer would not be visible.

    Note, Ivanchuk (rails 5.1), Zone.seed does create the default zone with
    visible => true. To be safe, this migration not only relies on
    postgresql's column default to set Zone visible but also updates
    existing records from visible => nil to visible => true.

 db/migrate/20190909195908_set_zone_visible_column_default_true.rb           |  9 +
 spec/migrations/20190909195908_set_zone_visible_column_default_true_spec.rb | 30 +
 2 files changed, 39 insertions(+)
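The shape of the fix, per the commit message above: set a column default for new rows and repair existing nil rows in the same migration. A sketch of what such a migration looks like (not the verbatim contents of db/migrate/20190909195908_set_zone_visible_column_default_true.rb):

```ruby
class SetZoneVisibleColumnDefaultTrue < ActiveRecord::Migration[5.1]
  # Minimal model scoped to the migration, the usual pattern for data migrations.
  class Zone < ActiveRecord::Base; end

  def up
    # Rely on PostgreSQL's column default for zones created going forward...
    change_column_default :zones, :visible, true
    # ...and update existing records that were seeded with visible => nil.
    Zone.where(:visible => nil).update_all(:visible => true)
  end
end
```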
New commit detected on ManageIQ/manageiq-schema/ivanchuk:
https://github.com/ManageIQ/manageiq-schema/commit/3860ccd17e44391eb7e365931265f88a717e3117

commit 3860ccd17e44391eb7e365931265f88a717e3117
Author:     Brandon Dunne <bdunne>
AuthorDate: Mon Sep 9 18:10:32 2019 -0400
Commit:     Brandon Dunne <bdunne>
CommitDate: Mon Sep 9 18:10:32 2019 -0400

    Merge pull request #413 from jrafanie/set_zone_visible_column_default_true

    Set zone's visible column default true and update existing rows

    (cherry picked from commit 859479cadb40cae2dfe58b4d1cf6a6c2e437017a)

    https://bugzilla.redhat.com/show_bug.cgi?id=1749694

 db/migrate/20190909195908_set_zone_visible_column_default_true.rb           |  9 +
 spec/migrations/20190909195908_set_zone_visible_column_default_true_spec.rb | 30 +
 2 files changed, 39 insertions(+)
I can see the zone as well as the EVMs there