Description of problem:
While importing a database to look into an issue where objects couldn't be deleted, I ran into a "stack too deep" trace running fix_auth.

Version-Release number of selected component (if applicable):
5.9.7

How reproducible:
All the time

Steps to Reproduce:
1. Create an appliance based on 5.9.7
2. Create a database using appliance_console and /dev/vdb
3. Destroy and recreate the database to get ready for importing the dump
4. Import the database
5. Fix authentication in database.yml
6. Fix authentication in the database

Actual results:
fixing authentications.password, auth_key
fixing miq_databases.registration_http_proxy_server, session_secret_token, csrf_secret_token
fixing miq_ae_values.value
fixing miq_ae_fields.default_value
fixing miq_requests.options
fixing miq_request_tasks.options
bundler: failed to load command: tools/fix_auth.rb (tools/fix_auth.rb)
SystemStackError: stack level too deep
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `block in walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `each'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:26:in `block in walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `each'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:26:in `block in walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `each'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:26:in `block in walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `each'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:26:in `block in walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `each'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:26:in `block in walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `each'
[...]
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:26:in `block in walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `each'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:26:in `block in walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `each'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:29:in `block (2 levels) in walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:28:in `each'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:28:in `each_with_index'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:28:in `block in walk'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `each'
/var/www/miq/vmdb/lib/vmdb/settings_walker.rb:19:in `walk'
/var/www/miq/vmdb/tools/fix_auth/auth_config_model.rb:38:in `recrypt'
/var/www/miq/vmdb/tools/fix_auth/auth_model.rb:46:in `block in fix_passwords'
/var/www/miq/vmdb/tools/fix_auth/auth_model.rb:44:in `each'
/var/www/miq/vmdb/tools/fix_auth/auth_model.rb:44:in `fix_passwords'
/var/www/miq/vmdb/tools/fix_auth/auth_model.rb:85:in `block in run'
/opt/rh/cfme-gemset/gems/activerecord-5.0.7.1/lib/active_record/relation/delegation.rb:38:in `each'
/opt/rh/cfme-gemset/gems/activerecord-5.0.7.1/lib/active_record/relation/delegation.rb:38:in `each'
/var/www/miq/vmdb/tools/fix_auth/auth_model.rb:84:in `run'
/var/www/miq/vmdb/tools/fix_auth/fix_auth.rb:63:in `block in fix_database_passwords'
/var/www/miq/vmdb/tools/fix_auth/fix_auth.rb:62:in `each'
/var/www/miq/vmdb/tools/fix_auth/fix_auth.rb:62:in `fix_database_passwords'
/var/www/miq/vmdb/tools/fix_auth/fix_auth.rb:86:in `run'
/var/www/miq/vmdb/tools/fix_auth/cli.rb:37:in `run'
/var/www/miq/vmdb/tools/fix_auth/cli.rb:41:in `run'
tools/fix_auth.rb:26:in `<top (required)>'

Expected results:
Authentication is fixed as expected.

Additional info:
I used the sbr-cfme lab to create the appliance every time. rake db:migrate doesn't seem to think the DB needs any migrations. Trying again with 5.9.9 does not change the observed behaviour. Trying with a new private key doesn't change the behaviour either. Exact commands are in private notes.

The original customer issue is that they cannot delete a container provider; after the first attempt it cannot be edited either.
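To make the trace easier to read: the walk/each/"block in walk" frames repeat because the walker keeps re-entering the same hash. A minimal illustrative Ruby sketch (not the actual SettingsWalker code) of how a depth-first walk with no cycle detection blows the stack on a self-referencing hash:

def walk(settings, &block)
  settings.each do |key, value|
    yield key, value
    walk(value, &block) if value.kind_of?(Hash)  # descends forever once a cycle is reached
  end
end

options = {"credentials" => {"userid" => "admin"}}
options["credentials"]["parent"] = options       # the hash now (indirectly) contains itself

begin
  walk(options) { |key, _value| key }
rescue SystemStackError => err
  puts err.message                               # "stack level too deep"
end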
Created attachment 1551834 [details]
fix_auth output

Output of `bundle exec tools/fix_auth.rb --v2 --invalid bogus`
Note: the region is number 34. Usually that gets fixed after fix_auth, but running fix_auth with the correct region doesn't change anything.
https://github.com/ManageIQ/manageiq/pull/18631
This was caused by a bad miq_request_tasks record. The customer inserted Amazon credentials with a self reference (you can do this in YAML), which resulted in infinite recursion. I have added code to detect one or two cases of this recursion. I have also patched fix_auth on this appliance so Felix can continue with the problem he was originally investigating. Just waiting for a merge and backport.
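For illustration, a hedged sketch of the kind of self-referencing YAML involved (the field names and values are made up, not the customer's data): an anchor aliased back into itself loads as a hash that contains itself.

require "yaml"

doc = <<~YAML
  --- &creds
  userid: admin
  auth_key: "v2:{REDACTED}"
  nested: *creds
YAML

# Psych 4+ rejects aliases unless asked; the older Psych shipped on these
# appliances accepted them by default, so a plain YAML.load(doc) was enough.
creds = YAML.load(doc, aliases: true)
puts creds["nested"].equal?(creds)   # true: the structure references itself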
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/2eabc44f1a1874c3707964279db8aa7c3793bf1e

commit 2eabc44f1a1874c3707964279db8aa7c3793bf1e
Author:     Keenan Brock <keenan>
AuthorDate: Thu Apr 4 15:21:20 2019 -0400
Commit:     Keenan Brock <keenan>
CommitDate: Thu Apr 4 15:21:20 2019 -0400

    fix_auth now handles recursive settings

    Situation:
    1. For one customer, miq_request_tasks has an options hash with recursive values.
    2. fix_auth recurses all the options looking for passwords to convert.

    Before: it recurses forever.
    After: it now detects the recursion and does not go forever.

    NOTE: this only detects very simple recursive cases.

    https://bugzilla.redhat.com/show_bug.cgi?id=1696237

 lib/vmdb/settings_walker.rb    |  3 +-
 spec/lib/vmdb/settings_spec.rb | 77 +-
 2 files changed, 53 insertions(+), 27 deletions(-)
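Conceptually, the fix amounts to remembering which collections the walker has already entered and refusing to descend into them a second time. A simplified Ruby sketch of that idea (illustrative only, not the patched SettingsWalker, which also walks arrays and tracks key paths):

def walk(settings, seen = [], &block)
  return if seen.any? { |visited| visited.equal?(settings) }  # already walking this object: stop
  seen << settings

  settings.each do |key, value|
    yield key, value
    walk(value, seen, &block) if value.kind_of?(Hash)
  end
end

options = {"credentials" => {"userid" => "admin"}}
options["credentials"]["parent"] = options

walk(options) { |key, _value| puts key }   # terminates instead of raising SystemStackError

With a guard like this, the self-referencing options hash is walked once and the walk stops, rather than recursing until the stack is exhausted.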
Merged. Of note: to "fix" a region of 34, just delete the REGION file and everything will work great.
The DB dump I have seen in the Red Hat Customer Portal is 1G. It will certainly take a lot of resources for verification, as well as for automating a test for this. Can I get reproduction steps that do not involve a DB that big?
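One possible way to avoid the 1G dump would be to seed a single bad record by hand and run fix_auth against it. This is an untested, hypothetical sketch for a Rails console on the appliance (it assumes MiqRequestTask serializes its options column, as the backtrace suggests; all attribute values here are invented):

opts = {"credentials" => {"userid" => "admin", "password" => "v2:{REDACTED}"}}
opts["credentials"]["self"] = opts             # the self reference described above

task = MiqRequestTask.new(:state => "finished", :status => "Ok")  # invented attributes
task.options = opts
task.save!(:validate => false)                 # skip validations for the synthetic record

# Then run the same command as before:
#   bundle exec tools/fix_auth.rb --v2 --invalid bogus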
Ok, I tried to restore the DB on a larger VM I made on 5.11 and it failed like this:

[root@dhcp-8-198-123 vmdb]# pg_restore --no-password --dbname vmdb_production --verbose --exit-on-error /net/$DB_DUMPS_IP/srv/export/customer_db_dump_migrated
pg_restore: connecting to database for restore
pg_restore: creating SCHEMA "public"
pg_restore: creating COMMENT "SCHEMA public"
pg_restore: creating SCHEMA "repmgr_miq_region_34_cluster"
pg_restore: creating EXTENSION "plpgsql"
pg_restore: creating COMMENT "EXTENSION plpgsql"
pg_restore: creating FUNCTION "public.metric_rollups_inheritance_after()"
pg_restore: creating FUNCTION "public.metric_rollups_inheritance_before()"
pg_restore: creating FUNCTION "public.metrics_inheritance_after()"
pg_restore: creating FUNCTION "public.metrics_inheritance_before()"
pg_restore: creating FUNCTION "repmgr_miq_region_34_cluster.repmgr_get_last_standby_location()"
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 825; 1255 29576 FUNCTION repmgr_get_last_standby_location() root
pg_restore: [archiver (db)] could not execute query: ERROR: could not access file "$libdir/repmgr_funcs": No such file or directory
    Command was: CREATE FUNCTION repmgr_miq_region_34_cluster.repmgr_get_last_standby_location() RETURNS text
    LANGUAGE c STRICT
    AS '$libdir/repmgr_funcs', 'repmgr_get_last_standby_location';
I restored the DB without --exit-on-error; there were 10 errors. Then I tried fix_auth. It didn't reproduce on 5.11.0.15, but it also didn't reproduce on cfme-5.10.3.3. I don't know what I am doing wrong.

# pg_restore -U root -j 4 -d vmdb_production /net/$NFS_SHARE/srv/export/customer_db_dump
# fix_auth --databaseyml
# fix_auth --v2 --invalid bogus
I couldn't reproduce. The fix_auth tool does work, so hopefully it is OK to make this VERIFIED.
Setting the qe_test_coverage- flag as I was unable to reproduce.