Bug 1409795 - katello:upgrade_check aborts on systems without an UUID
Alias: None
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Upgrades
Version: 6.1.11
Hardware: Unspecified
OS: Unspecified
medium vote
Target Milestone: Unspecified
Assignee: satellite6-bugs
QA Contact: Katello QA List
Depends On:
Blocks: Sat6_Upgrades
Reported: 2017-01-03 12:33 UTC by Evgeni Golov
Modified: 2018-11-03 05:45 UTC

Fixed In Version:
Doc Text:
Clone Of:
Last Closed: 2018-09-04 18:01:00 UTC


System ID Priority Status Summary Last Updated
Foreman Issue Tracker 17905 None None None 2017-01-03 13:04:14 UTC
Red Hat Knowledge Base (Solution) 3677041 None None None 2018-11-03 05:45:17 UTC

Description Evgeni Golov 2017-01-03 12:33:45 UTC
Satellite 6.1.11 (not .10 as in the BZ version, seems there is no .11?)

Description of problem:
While running "foreman-rake katello:upgrade_check" on a big customer database, rake would abort while scanning the systems:

$ foreman-rake katello:preupgrade_content_host_check --trace
** Invoke katello:preupgrade_content_host_check (first_time)
** Invoke environment (first_time)
** Execute environment
API controllers newer than Apipie cache! Run apipie:cache rake task to regenerate cache.
** Execute katello:preupgrade_content_host_check
Calculating Host changes on upgrade.  This may take a few minutes.
rake aborted!
Expect initializer to return hash if a group of attributes is defined by lazy_accessor
/opt/rh/ruby193/root/usr/share/gems/gems/katello- `run_initializer'
/opt/rh/ruby193/root/usr/share/gems/gems/katello- `lazy_attribute_get'
/opt/rh/ruby193/root/usr/share/gems/gems/katello- `block (2 levels) in lazy_accessor'
/opt/rh/ruby193/root/usr/share/gems/gems/katello- `block in get_systems_with_facts'
/opt/rh/ruby193/root/usr/share/gems/gems/katello- `each'
/opt/rh/ruby193/root/usr/share/gems/gems/katello- `get_systems_with_facts'
/opt/rh/ruby193/root/usr/share/gems/gems/katello- `ensure_one_system_per_hostname'
/opt/rh/ruby193/root/usr/share/gems/gems/katello- `block (2 levels) in <top (required)>'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:205:in `call'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:205:in `block in execute'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:200:in `each'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:200:in `execute'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:158:in `block in invoke_with_call_chain'
/opt/rh/ruby193/root/usr/share/ruby/monitor.rb:211:in `mon_synchronize'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:151:in `invoke_with_call_chain'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:144:in `invoke'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:116:in `invoke_task'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:94:in `block (2 levels) in top_level'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:94:in `each'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:94:in `block in top_level'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:133:in `standard_exception_handling'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:88:in `top_level'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:66:in `block in run'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:133:in `standard_exception_handling'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:63:in `run'
/opt/rh/ruby193/root/usr/bin/rake:32:in `<main>'
Tasks: TOP => katello:preupgrade_content_host_check

The code in question seems to be:
      systems.each do |system|
          facts = system.facts
          unless facts
        rescue RestClient::Exception

Line 48 is "facts = system.facts".

After adding a tactical "puts system.inspect" just before the system.facts line, we could identify the bad system:

#<Katello::System id: 6891, uuid: nil, name: "hostname", description: "Initial Registration Params", location: "None", environment_id: 4, created_at: "2016-11-15 12:58:01", updated_at: "2016-11-15 12:58:01", type: "Katello::System", content_view_id: 15, host_id: nil>
rake aborted!

Looking into PostgreSQL revealed that we actually had two systems with that symptom:

foreman=# select * from katello_systems where uuid is null;

  id  | uuid |         name          |         description         | location | environment_id |         created_at         |         updated_at         |      type       | content_view_id | host_id
------+------+-----------------------+-----------------------------+----------+----------------+----------------------------+----------------------------+-----------------+-----------------+---------
 6891 |      | hostname              | Initial Registration Params | None     |              4 | 2016-11-15 12:58:01.246012 | 2016-11-15 12:58:01.246012 | Katello::System |              15 |
 6262 |      | hostname2             | Initial Registration Params | None     |              4 | 2016-09-26 09:06:38.945969 | 2016-09-26 09:06:38.945969 | Katello::System |              16 |
(2 rows)

PostgreSQL would also tell us that there were another two systems with those hostnames, but now with proper UUIDs.
It seems the initial registration of those went badly and they were re-registered.
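
To illustrate the pattern described here (a UUID-less record sitting next to a properly registered record with the same hostname), here is a minimal Ruby sketch. It works on plain hashes rather than the real katello_systems table; the helper name `stale_registration_ids`, the placeholder UUIDs, and every id other than 6891 and 6262 are made up for the example.

```ruby
# Hypothetical rows as plain hashes. Only ids 6891 and 6262 come from the
# report; the other ids and all UUID strings are invented placeholders.
rows = [
  { id: 6891, uuid: nil, name: "hostname" },
  { id: 6262, uuid: nil, name: "hostname2" },
  { id: 7001, uuid: "hypothetical-uuid-1", name: "hostname" },
  { id: 7002, uuid: "hypothetical-uuid-2", name: "hostname2" },
  { id: 7003, uuid: "hypothetical-uuid-3", name: "hostname3" }
]

# Returns the ids of UUID-less rows whose name is also used by a row
# that does have a UUID, i.e. likely leftovers of a failed registration.
def stale_registration_ids(rows)
  rows.group_by { |row| row[:name] }.flat_map do |_name, group|
    broken, registered = group.partition { |row| row[:uuid].nil? }
    registered.empty? ? [] : broken.map { |row| row[:id] }
  end
end

puts stale_registration_ids(rows).sort.inspect
```

A UUID-less row without a registered twin is deliberately not returned, since it may need manual inspection rather than cleanup.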

After erasing the two broken systems from the DB the upgrade_check would run fine.

I think upgrade_check.rake needs a bit more error handling: I would expect it to catch these bad systems and tell me about them, not choke on them.
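
As an illustration of the kind of error handling asked for above, here is a minimal Ruby sketch. It is not the actual Katello code: `FakeSystem`, `partition_systems`, the id 7001, and the placeholder UUID are hypothetical stand-ins, and the real task iterates over Katello::System records whose facts lookup fails in a different way.

```ruby
# Hypothetical stand-in for Katello::System (the real model is not a Struct).
FakeSystem = Struct.new(:id, :uuid, :name) do
  # Mimics a facts lookup that cannot work without a UUID.
  def facts
    raise ArgumentError, "no uuid for system #{id}" if uuid.nil?
    { "network.hostname" => name }
  end
end

# Splits systems into those whose facts could be read and those that failed,
# so that one bad record is reported instead of aborting the whole check.
def partition_systems(systems)
  healthy = []
  faulty = []
  systems.each do |system|
    begin
      system.facts
      healthy << system
    rescue StandardError => e
      faulty << [system, e.message]
    end
  end
  [healthy, faulty]
end

systems = [
  FakeSystem.new(6891, nil, "hostname"),
  FakeSystem.new(7001, "hypothetical-uuid", "hostname") # made-up record
]
healthy, faulty = partition_systems(systems)
faulty.each do |system, reason|
  puts "Faulty system #{system.id} (#{system.name}): #{reason}"
end
```

Collecting the failures and printing them at the end keeps the check running over the whole database, which matters on large installations where a single abort hides every later problem.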

Version-Release number of selected component (if applicable):
Satellite 6.1.11

How reproducible:
Always, but no idea how the initial problematic host was created

Steps to Reproduce:
1. create a Katello::System without a UUID
2. run foreman-rake katello:upgrade_check

Actual results:
rake aborted

Expected results:
the faulty system is reported instead of rake aborting

Comment 1 Ivan Necas 2017-01-03 12:59:48 UTC
One possible cause I can imagine is a failed orchestration of the host that never finished (perhaps the task was force-unlocked).

I agree the upgrade check should handle such a situation.

Comment 2 Ivan Necas 2017-01-03 13:04:11 UTC
Created redmine issue http://projects.theforeman.org/issues/17905 from this bug

Comment 4 Ivan Necas 2017-01-03 13:10:25 UTC
It seems it's similar to https://bugzilla.redhat.com/show_bug.cgi?id=1329561

Comment 5 Evgeni Golov 2017-01-03 13:31:50 UTC
FWIW, those hosts are content-only, so there was no corresponding Foreman-Host at all (and that part is fine).

Comment 6 Bryan Kearney 2017-03-27 17:03:58 UTC
I do not foresee fixing this for 6.1. I am aligning this to 6.2.z only.

Comment 7 Evgeni Golov 2017-03-27 19:27:54 UTC
Hi Bryan,

any reason you flagged this to 6.3 and not to 6.2 then?

Comment 8 Bryan Kearney 2017-05-10 12:44:32 UTC
It is linked to both 6.3 and 6.2.z so that if we fix it in one, we fix it in both.

Comment 9 Bryan Kearney 2017-05-23 15:12:48 UTC
This did not make it in time for 6.3, moving it to backlog.

Comment 11 Bryan Kearney 2018-09-04 18:01:00 UTC
Thank you for your interest in Satellite 6. We have evaluated this request, and we do not expect this to be implemented in the product in the foreseeable future. We are therefore closing this out as WONTFIX. If you have any concerns about this, please feel free to contact Rich Jerrido or Bryan Kearney. Thank you.
