Bug 1805451 - After upgrading from CloudForms 4.5 to 5.11 provisioning is no longer working correctly
Summary: After upgrading from CloudForms 4.5 to 5.11 provisioning is no longer working...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Automate
Version: 5.11.0
Hardware: All
OS: All
unspecified
high
Target Milestone: GA
: 5.11.5
Assignee: Tina Fitzgerald
QA Contact: Jaroslav Henner
Red Hat CloudForms Documentation
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-02-20 19:59 UTC by Ryan Spagnola
Modified: 2023-09-07 21:58 UTC (History)
11 users (show)

Fixed In Version: 5.11.5.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-05 13:43:11 UTC
Category: Bug
Cloudforms Team: CFME Core
Target Upstream Version:
Embargoed:
pm-rhel: cfme-5.11.z+
rspagnol: needinfo-


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:2020 0 None None None 2020-05-05 13:43:21 UTC

Description Ryan Spagnola 2020-02-20 19:59:04 UTC
Description of problem:
Automate code is not running successfully after upgrading to 5.11 from 4.5.
Automation.log does not present any errors only evm.log

Version-Release number of selected component (if applicable):
5.11.1.2

How reproducible:
Always

Steps to Reproduce:
1. In 5.11 run a provision that calls custom code that works in 4.5
2.
3.

Actual results:
fails with stacktrace in evm.log but no errors in automation.log

Expected results:
successful provision

Additional info:
Request - r30000000470082_miq_provision_30000000470058

Stacktrace:

[----] E, [2020-02-19T14:12:50.673141 #7480:2ab2c71985bc] ERROR -- : Q-task_id([r30000000470082_miq_provision_30000000470058]) MIQ(MiqQueue#deliver) Message id: [30000279822143], Error: [comparison of String with Symbol failed]
[----] E, [2020-02-19T14:12:50.673261 #7480:2ab2c71985bc] ERROR -- : Q-task_id([r30000000470082_miq_provision_30000000470058]) [ArgumentError]: comparison of String with Symbol failed  Method:[block (2 levels) in <class:LogProxy>]
[----] E, [2020-02-19T14:12:50.673464 #7480:2ab2c71985bc] ERROR -- : Q-task_id([r30000000470082_miq_provision_30000000470058]) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-4bcf183fc9c8/lib/miq_automation_engine/engine/miq_ae_engine/miq_ae_object.rb:227:in `sort'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-4bcf183fc9c8/lib/miq_automation_engine/engine/miq_ae_engine/miq_ae_object.rb:227:in `block in to_xml'
/opt/rh/cfme-gemset/gems/builder-3.2.3/lib/builder/xmlbase.rb:175:in `_nested_structures'
/opt/rh/cfme-gemset/gems/builder-3.2.3/lib/builder/xmlbase.rb:68:in `tag!'
/opt/rh/cfme-gemset/gems/builder-3.2.3/lib/builder/xmlbase.rb:93:in `method_missing'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-4bcf183fc9c8/lib/miq_automation_engine/engine/miq_ae_engine/miq_ae_object.rb:226:in `to_xml'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-4bcf183fc9c8/lib/miq_automation_engine/engine/miq_ae_engine/miq_ae_workspace_runtime.rb:197:in `block (2 levels) in to_expanded_xml'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-4bcf183fc9c8/lib/miq_automation_engine/engine/miq_ae_engine/miq_ae_workspace_runtime.rb:197:in `each'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-4bcf183fc9c8/lib/miq_automation_engine/engine/miq_ae_engine/miq_ae_workspace_runtime.rb:197:in `block in to_expanded_xml'
/opt/rh/cfme-gemset/gems/builder-3.2.3/lib/builder/xmlbase.rb:175:in `_nested_structures'
/opt/rh/cfme-gemset/gems/builder-3.2.3/lib/builder/xmlbase.rb:68:in `tag!'
/opt/rh/cfme-gemset/gems/builder-3.2.3/lib/builder/xmlbase.rb:93:in `method_missing'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-4bcf183fc9c8/lib/miq_automation_engine/engine/miq_ae_engine/miq_ae_workspace_runtime.rb:196:in `to_expanded_xml'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-4bcf183fc9c8/lib/miq_automation_engine/engine/miq_ae_engine.rb:299:in `block (2 levels) in resolve_automation_object'
/usr/share/ruby/logger.rb:465:in `add'
/opt/rh/cfme-gemset/gems/activerecord-session_store-1.1.3/lib/active_record/session_store/extension/logger_silencer.rb:38:in `add_with_threadsafety'
/usr/share/ruby/logger.rb:493:in `debug'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-4bcf183fc9c8/lib/miq_automation_engine/engine/miq_ae_engine.rb:299:in `block in resolve_automation_object'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-4bcf183fc9c8/lib/miq_automation_engine/engine/miq_ae_engine.rb:298:in `tap'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-4bcf183fc9c8/lib/miq_automation_engine/engine/miq_ae_engine.rb:298:in `resolve_automation_object'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-4bcf183fc9c8/lib/miq_automation_engine/engine/miq_ae_engine.rb:92:in `deliver'
/var/www/miq/vmdb/app/models/miq_queue.rb:479:in `block in dispatch_method'
/usr/share/ruby/timeout.rb:93:in `block in timeout'
/usr/share/ruby/timeout.rb:33:in `block in catch'
/usr/share/ruby/timeout.rb:33:in `catch'
/usr/share/ruby/timeout.rb:33:in `catch'
/usr/share/ruby/timeout.rb:108:in `timeout'
/var/www/miq/vmdb/app/models/miq_queue.rb:477:in `dispatch_method'
/var/www/miq/vmdb/app/models/miq_queue.rb:454:in `block in deliver'
/var/www/miq/vmdb/app/models/user.rb:290:in `with_user_group'
/var/www/miq/vmdb/app/models/miq_queue.rb:454:in `deliver'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:104:in `deliver_queue_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:137:in `deliver_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:155:in `block in do_work'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:149:in `loop'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:149:in `do_work'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:329:in `block in do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:326:in `loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:326:in `do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:153:in `run'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:127:in `start'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:22:in `start_worker'
/var/www/miq/vmdb/app/models/miq_worker.rb:398:in `block in start_runner_via_fork'
/opt/rh/cfme-gemset/gems/nakayoshi_fork-0.0.4/lib/nakayoshi_fork.rb:23:in `fork'
/opt/rh/cfme-gemset/gems/nakayoshi_fork-0.0.4/lib/nakayoshi_fork.rb:23:in `fork'
/var/www/miq/vmdb/app/models/miq_worker.rb:396:in `start_runner_via_fork'
/var/www/miq/vmdb/app/models/miq_worker.rb:386:in `start_runner'
/var/www/miq/vmdb/app/models/miq_worker.rb:437:in `start'
/var/www/miq/vmdb/app/models/miq_worker.rb:269:in `start_worker'
/var/www/miq/vmdb/app/models/miq_worker.rb:147:in `block in sync_workers'
/var/www/miq/vmdb/app/models/miq_worker.rb:147:in `times'
/var/www/miq/vmdb/app/models/miq_worker.rb:147:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:54:in `block in sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `each'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server.rb:156:in `start'
/var/www/miq/vmdb/app/models/miq_server.rb:247:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:27:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:48:in `start'
/var/www/miq/vmdb/lib/workers/bin/evm_server.rb:4:in `<main>'

Comment 20 Joe Rafaniello 2020-02-24 21:05:06 UTC
We noticed a delete_service being called shortly before the can't modify frozen hash in the task_finished part of the service provisioning.  It's unclear if this is a "safe" time to delete the service as anything that happens after the service is deleted could either have stale in-memory data or we attempt to update a frozen record.

Comment 31 CFME Bot 2020-03-20 19:11:02 UTC
New commit detected on ManageIQ/manageiq-automation_engine/ivanchuk:

https://github.com/ManageIQ/manageiq-automation_engine/commit/fb2fb61f4a9b1cdf6d3afc65abe9ac0276a0e272
commit fb2fb61f4a9b1cdf6d3afc65abe9ac0276a0e272
Author:     tina <tfitzger>
AuthorDate: Wed Mar  4 19:48:15 2020 +0000
Commit:     tina <tfitzger>
CommitDate: Wed Mar  4 19:48:15 2020 +0000

    Merge pull request #429 from d-m-u/removing_to_xml

    Remove the to_xml call from miq_ae_engine

    (cherry picked from commit 1a3ba3a3cde0706890050e405f56b4c808213c1e)

    https://bugzilla.redhat.com/show_bug.cgi?id=1805451

 lib/miq_automation_engine/engine/miq_ae_engine.rb | 4 +-
 1 file changed, 1 insertion(+), 3 deletions(-)

Comment 32 CFME Bot 2020-03-20 20:25:57 UTC
New commit detected on ManageIQ/manageiq-automation_engine/jansa:

https://github.com/ManageIQ/manageiq-automation_engine/commit/36885d034e765652b16216e7e143fe63e94279ed
commit 36885d034e765652b16216e7e143fe63e94279ed
Author:     tina <tfitzger>
AuthorDate: Wed Mar  4 19:48:15 2020 +0000
Commit:     tina <tfitzger>
CommitDate: Wed Mar  4 19:48:15 2020 +0000

    Merge pull request #429 from d-m-u/removing_to_xml

    Remove the to_xml call from miq_ae_engine

    (cherry picked from commit 1a3ba3a3cde0706890050e405f56b4c808213c1e)

    https://bugzilla.redhat.com/show_bug.cgi?id=1805451

 lib/miq_automation_engine/engine/miq_ae_engine.rb | 4 +-
 1 file changed, 1 insertion(+), 3 deletions(-)

Comment 34 Jaroslav Henner 2020-04-16 13:19:17 UTC
@Tina Is there some way I could reproduce this? If not, I think I will need the customer DB as well.

I understood that serialisation to xml in some debug calls failed, but I also am not much sure what was the root cause.

Comment 35 Jaroslav Henner 2020-04-21 09:30:01 UTC
@Ryan, could you help me to reproduce this?

Comment 36 Tina Fitzgerald 2020-04-23 15:15:32 UTC
Hi Jaroslav,

Since we're removing the to_xml code, we just need to validate that removing the to_xml call doesn't cause problems. 

Can you test provisioning with/without DEBUG logging on for Automate as follows:
1. Turn on DEBUG for Automate in Advanced settings.
2. Perform a VM lifecycle provision and check logs for xml dump.
3. Check for errors.

Let me know if you have any questions.

Thanks,
Tina

Comment 37 Jaroslav Henner 2020-04-27 12:33:11 UTC
I was not able to reproduce the issue with the info I got. I think to reproduce I need some funky string stored in database that makes problems when serializing. So only thing I did check is that some debug log lines were removed. Indeed, when I provision and retire a VM, on CFME version prior the fixed-in-version I got:
[root@dhcp-8-198-205 ~]# grep DEBUG /var/www/miq/vmdb/log/*
grep: /var/www/miq/vmdb/log/apache: Is a directory
/var/www/miq/vmdb/log/automation.log:[----] D, [2020-04-27T07:49:27.955806 #7324:2b0de6b85dd4] DEBUG -- : <MiqAeWorkspace>
/var/www/miq/vmdb/log/automation.log:[----] D, [2020-04-27T07:49:28.777145 #7324:2b0de6b85dd4] DEBUG -- : <MiqAeWorkspace>
...

while on the later version I got no debug statements in the automation log even though I've set this advanced config on both:

:log:
  :level_automation: debug

Comment 40 errata-xmlrpc 2020-05-05 13:43:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2020


Note You need to log in before you can comment on or make changes to this bug.