Bug 1696237
| Summary: | unable to run fix_auth on database with "stack too deep" | | |
|---|---|---|---|
| Product: | Red Hat CloudForms Management Engine | Reporter: | Felix Dewaleyne <fdewaley> |
| Component: | Appliance | Assignee: | Keenan Brock <kbrock> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Jaroslav Henner <jhenner> |
| Severity: | high | Docs Contact: | Red Hat CloudForms Documentation <cloudforms-docs> |
| Priority: | high | | |
| Version: | 5.9.7 | CC: | abellott, dmetzger, fdewaley, kbrock, mshriver, ncarboni, obarenbo |
| Target Milestone: | GA | Keywords: | TestOnly, ZStream |
| Target Release: | 5.11.0 | | |
| Hardware: | All | | |
| OS: | All | | |
| Whiteboard: | | | |
| Fixed In Version: | 5.11.0.1 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1702072 | Environment: | |
| Last Closed: | 2019-12-13 14:54:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | Bug |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | CFME Core | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1702072 | | |
| Attachments: | | | |
Description
Felix Dewaleyne
2019-04-04 11:51:37 UTC
Created attachment 1551834
fix_auth output
output of `bundle exec tools/fix_auth.rb --v2 --invalid bogus`
Note: the region is number 34. Usually it gets fixed after fix_auth, but running fix_auth with the correct region doesn't change anything.

This was caused by a bad miq_request_tasks record. The customer inserted Amazon credentials with a self reference (you can do this in YAML) that resulted in an infinite recursion. I have added code to detect one or two cases of this recursion. I have also patched fix_auth on this appliance so Felix can continue on with the problem he was originally investigating. Just waiting for a merge and backport.

New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/2eabc44f1a1874c3707964279db8aa7c3793bf1e

commit 2eabc44f1a1874c3707964279db8aa7c3793bf1e
Author: Keenan Brock <keenan>
AuthorDate: Thu Apr 4 15:21:20 2019 -0400
Commit: Keenan Brock <keenan>
CommitDate: Thu Apr 4 15:21:20 2019 -0400

fix_auth now handles recursive settings

situation:
1. For one customer, miq_request_tasks has an options hash with recursive values.
2. fix_auth recurses all the options looking for passwords to convert

before: it recurses forever
after: it now detects the recursion and does not go forever

NOTE: this only detects very simple recursive cases.

https://bugzilla.redhat.com/show_bug.cgi?id=1696237

lib/vmdb/settings_walker.rb | 3 +-
spec/lib/vmdb/settings_spec.rb | 77 +-
2 files changed, 53 insertions(+), 27 deletions(-)

Merged.

Of note: to "fix" a region of 34, just delete the REGION file and everything will work great.

The DB dump I have seen in the Red Hat Customer Portal is 1G. Verifying against it, and automating a test for this, will certainly take a lot of resources. Can I get some reproduction steps that would not involve getting a DB as big as that?

OK, I tried to restore the DB on a larger VM I made on 5.11, and it failed like this:
[root@dhcp-8-198-123 vmdb]# pg_restore --no-password --dbname vmdb_production --verbose --exit-on-error /net/$DB_DUMPS_IP/srv/export/customer_db_dump_migrated
pg_restore: connecting to database for restore
pg_restore: creating SCHEMA "public"
pg_restore: creating COMMENT "SCHEMA public"
pg_restore: creating SCHEMA "repmgr_miq_region_34_cluster"
pg_restore: creating EXTENSION "plpgsql"
pg_restore: creating COMMENT "EXTENSION plpgsql"
pg_restore: creating FUNCTION "public.metric_rollups_inheritance_after()"
pg_restore: creating FUNCTION "public.metric_rollups_inheritance_before()"
pg_restore: creating FUNCTION "public.metrics_inheritance_after()"
pg_restore: creating FUNCTION "public.metrics_inheritance_before()"
pg_restore: creating FUNCTION "repmgr_miq_region_34_cluster.repmgr_get_last_standby_location()"
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 825; 1255 29576 FUNCTION repmgr_get_last_standby_location() root
pg_restore: [archiver (db)] could not execute query: ERROR: could not access file "$libdir/repmgr_funcs": No such file or directory
Command was: CREATE FUNCTION repmgr_miq_region_34_cluster.repmgr_get_last_standby_location() RETURNS text
LANGUAGE c STRICT
AS '$libdir/repmgr_funcs', 'repmgr_get_last_standby_location';
I restored the DB without --exit-on-error. There were 10 errors. Then I tried fix_auth. It didn't reproduce on 5.11.0.15, but it also didn't reproduce on cfme-5.10.3.3. I don't know what I am doing wrong.

# pg_restore -U root -j 4 -d vmdb_production /net/$NFS_SHARE/srv/export/customer_db_dump
# fix_auth --databaseyml
# fix_auth --v2 --invalid bogus

I couldn't reproduce. The fix_auth tool does work, so hopefully it is OK to make this VERIFIED.

Setting qe_test_coverage- as I was unable to reproduce.
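
For anyone looking for a small reproduction of the "recursive settings" situation described in the commit message above, without the 1G customer dump: the following is a minimal Ruby sketch, assuming an options hash that holds a reference to itself. The method name `walk_leaves` and the sample keys are hypothetical and chosen only for illustration; this is not the code in lib/vmdb/settings_walker.rb, and like the real fix it only guards against simple cycles (a container that appears again further down its own path).

```ruby
# Hypothetical illustration, NOT the ManageIQ implementation.
# Visits every leaf in a nested Hash/Array structure and yields
# (key, value, path). A container that is already being visited
# higher up the current path is skipped, so a self-referencing
# hash no longer causes unbounded recursion ("stack too deep").
def walk_leaves(node, path = [], ancestors = [], &block)
  # Identity check (equal?), not ==, so comparing never recurses itself.
  return if ancestors.any? { |a| a.equal?(node) }

  ancestors.push(node)
  entries = node.kind_of?(Hash) ? node.to_a : node.each_with_index.map { |v, i| [i, v] }
  entries.each do |key, value|
    child_path = path + [key]
    if value.kind_of?(Hash) || value.kind_of?(Array)
      walk_leaves(value, child_path, ancestors, &block)
    else
      block.call(key, value, child_path)
    end
  end
  ancestors.pop
  nil
end

# Sample data (made up): a nested hash that points back at its own root.
options = {:user => "admin", :password => "secret", :nested => {:password => "other"}}
options[:nested][:parent] = options

# Collect the paths of all password keys, as a stand-in for
# "looking for passwords to convert".
found = []
walk_leaves(options) { |key, _value, path| found << path if key == :password }
p found # prints [[:password], [:nested, :password]]
```

Removing the `return if ancestors.any?` line makes the same walk recurse until Ruby raises SystemStackError, which is consistent with the "stack too deep" message in the summary.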