Bug 1409795

Summary: katello:upgrade_check aborts on systems without a UUID
Product: Red Hat Satellite Reporter: Evgeni Golov <egolov>
Component: Upgrades Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED WONTFIX QA Contact: Katello QA List <katello-qa-list>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.1.11 CC: bkearney, inecas, jcallaha, mbacovsk
Target Milestone: Unspecified Keywords: Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-09-04 18:01:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1410795    

Description Evgeni Golov 2017-01-03 12:33:45 UTC
Satellite 6.1.11 (not .10 as in the BZ version, seems there is no .11?)

Description of problem:
While running "foreman-rake katello:upgrade_check" on a big customer database, rake would abort while scanning the systems:

$ foreman-rake katello:preupgrade_content_host_check --trace
** Invoke katello:preupgrade_content_host_check (first_time)
** Invoke environment (first_time)
** Execute environment
API controllers newer than Apipie cache! Run apipie:cache rake task to regenerate cache.
** Execute katello:preupgrade_content_host_check
Calculating Host changes on upgrade.  This may take a few minutes.
rake aborted!
Expect initializer to return hash if a group of attributes is defined by lazy_accessor
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/app/lib/katello/lazy_accessor.rb:177:in `run_initializer'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/app/lib/katello/lazy_accessor.rb:154:in `lazy_attribute_get'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/app/lib/katello/lazy_accessor.rb:74:in `block (2 levels) in lazy_accessor'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/lib/katello/tasks/preupgrade_content_host_check.rake:48:in `block in get_systems_with_facts'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/lib/katello/tasks/preupgrade_content_host_check.rake:46:in `each'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/lib/katello/tasks/preupgrade_content_host_check.rake:46:in `get_systems_with_facts'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/lib/katello/tasks/preupgrade_content_host_check.rake:18:in `ensure_one_system_per_hostname'
/opt/rh/ruby193/root/usr/share/gems/gems/katello-2.2.0.93/lib/katello/tasks/preupgrade_content_host_check.rake:103:in `block (2 levels) in <top (required)>'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:205:in `call'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:205:in `block in execute'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:200:in `each'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:200:in `execute'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:158:in `block in invoke_with_call_chain'
/opt/rh/ruby193/root/usr/share/ruby/monitor.rb:211:in `mon_synchronize'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:151:in `invoke_with_call_chain'
/opt/rh/ruby193/root/usr/share/ruby/rake/task.rb:144:in `invoke'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:116:in `invoke_task'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:94:in `block (2 levels) in top_level'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:94:in `each'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:94:in `block in top_level'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:133:in `standard_exception_handling'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:88:in `top_level'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:66:in `block in run'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:133:in `standard_exception_handling'
/opt/rh/ruby193/root/usr/share/ruby/rake/application.rb:63:in `run'
/opt/rh/ruby193/root/usr/bin/rake:32:in `<main>'
Tasks: TOP => katello:preupgrade_content_host_check

The code in question seems to be:
      systems.each do |system|
        begin
          facts = system.facts
          unless facts
            systems_to_remove.push(system)
          end
        rescue RestClient::Exception
          systems_to_remove.push(system)
        end
      end

Line 48 is "facts = system.facts".

After adding a tactical "puts system.inspect" just before the system.facts line, we could identify the bad system:

#<Katello::System id: 6891, uuid: nil, name: "hostname", description: "Initial Registration Params", location: "None", environment_id: 4, created_at: "2016-11-15 12:58:01", updated_at: "2016-11-15 12:58:01", type: "Katello::System", content_view_id: 15, host_id: nil>
rake aborted!

Looking into PostgreSQL revealed that we actually had two systems with that symptom:

foreman=# select * from katello_systems where uuid is null;
  id  | uuid |   name    |         description         | location | environment_id |         created_at         |         updated_at         |      type       | content_view_id | host_id
------+------+-----------+-----------------------------+----------+----------------+----------------------------+----------------------------+-----------------+-----------------+---------
 6891 |      | hostname  | Initial Registration Params | None     |              4 | 2016-11-15 12:58:01.246012 | 2016-11-15 12:58:01.246012 | Katello::System |              15 |
 6262 |      | hostname2 | Initial Registration Params | None     |              4 | 2016-09-26 09:06:38.945969 | 2016-09-26 09:06:38.945969 | Katello::System |              16 |
(2 rows)


PostgreSQL would also tell us that there were another two systems with those hostnames, but now with proper UUIDs.
It seems the initial registration of those went badly and they were re-registered.
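
For illustration only, here is a minimal sketch of how such leftovers could be spotted: records with a nil UUID whose hostname also has a properly registered record. It runs on plain Ruby structs, not on the Katello models. The `Record` struct, the `stale_registrations` helper, and the ids and UUIDs of the re-registered hosts are placeholders I made up for the example.

```ruby
# Plain stand-in for a row of katello_systems (not the real Katello::System model).
Record = Struct.new(:id, :uuid, :name)

# Hypothetical helper: returns the UUID-less records whose hostname also has
# at least one record with a UUID, i.e. the likely leftovers of a failed
# first registration.
def stale_registrations(records)
  records.group_by(&:name).flat_map do |_name, group|
    broken, valid = group.partition { |r| r.uuid.nil? }
    valid.empty? ? [] : broken
  end
end

records = [
  Record.new(6891, nil, "hostname"),
  Record.new(6262, nil, "hostname2"),
  Record.new(9001, "placeholder-uuid-a", "hostname"),  # made-up re-registration
  Record.new(9002, "placeholder-uuid-b", "hostname2"), # made-up re-registration
  Record.new(9003, nil, "orphan")                      # made-up: no valid counterpart
]

stale = stale_registrations(records)
stale.each { |r| puts "stale registration: id=#{r.id} name=#{r.name}" }
```

A record like "orphan" above is deliberately not reported, since there is no second registration to fall back to and it probably needs a closer look before anything is deleted.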

After erasing the two broken systems from the DB the upgrade_check would run fine.

I think upgrade_check.rake needs a bit more error handling: I would expect it to catch these bad systems and tell me about them, not choke on them.
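
A rough sketch of the kind of handling I have in mind, not the actual Katello code or a proposed patch. Judging from the trace, the error raised in lazy_accessor.rb is apparently not a RestClient::Exception, so the existing rescue never sees it. The sketch checks for a missing UUID up front and rescues more broadly. `FakeSystem`, `LazyInitError`, `partition_systems` and the second system's id and UUID are stand-ins invented for the example so that it runs outside of Satellite.

```ruby
# Hypothetical error class standing in for whatever lazy_accessor raises.
class LazyInitError < StandardError; end

# Stand-in for Katello::System: fetching facts fails when there is no UUID.
FakeSystem = Struct.new(:id, :uuid, :name) do
  def facts
    raise LazyInitError, "initializer did not return a hash" if uuid.nil?
    { "network.hostname" => name }
  end
end

# Splits systems into those with usable facts and those that are faulty,
# instead of aborting on the first bad one.
def partition_systems(systems)
  usable = []
  faulty = []
  systems.each do |system|
    # A system without a UUID cannot be looked up in the backend at all.
    if system.uuid.nil?
      faulty << system
      next
    end
    begin
      if system.facts
        usable << system
      else
        faulty << system
      end
    rescue StandardError
      # Broader than the original rescue, which only covers RestClient::Exception.
      faulty << system
    end
  end
  [usable, faulty]
end

systems = [
  FakeSystem.new(6891, nil, "hostname"),
  FakeSystem.new(7001, "placeholder-uuid", "hostname") # made-up healthy system
]

usable, faulty = partition_systems(systems)
faulty.each do |s|
  puts "Faulty system id=#{s.id} name=#{s.name}: no UUID or facts unavailable"
end
```

Whether faulty systems should merely be reported or also queued for removal (as systems_to_remove does in the task) is a separate decision; the sketch only reports them.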

Version-Release number of selected component (if applicable):
Satellite 6.1.11

How reproducible:
Always, but no idea how the initial problematic host was created

Steps to Reproduce:
1. Create a Katello::System without a UUID
2. Run foreman-rake katello:upgrade_check

Actual results:
rake aborted

Expected results:
The system is reported as faulty

Comment 1 Ivan Necas 2017-01-03 12:59:48 UTC
One reason I can imagine is a failed orchestration of the host that never finished (perhaps the task was force-unlocked).

I agree the upgrade check should handle such a situation.

Comment 2 Ivan Necas 2017-01-03 13:04:11 UTC
Created redmine issue http://projects.theforeman.org/issues/17905 from this bug

Comment 4 Ivan Necas 2017-01-03 13:10:25 UTC
It seems it's similar to https://bugzilla.redhat.com/show_bug.cgi?id=1329561

Comment 5 Evgeni Golov 2017-01-03 13:31:50 UTC
FWIW, those hosts are content-only, so there was no corresponding Foreman-Host at all (and that part is fine).

Comment 6 Bryan Kearney 2017-03-27 17:03:58 UTC
I do not foresee fixing this for 6.1. I am aligning this to 6.2.z only.

Comment 7 Evgeni Golov 2017-03-27 19:27:54 UTC
Hi Bryan,

any reason you flagged this to 6.3 and not to 6.2 then?

Comment 8 Bryan Kearney 2017-05-10 12:44:32 UTC
It is linked to both 6.3 and 6.2.z so that if we fix it in one, we fix it in both.

Comment 9 Bryan Kearney 2017-05-23 15:12:48 UTC
This did not make it in time for 6.3, moving it to backlog.

Comment 11 Bryan Kearney 2018-09-04 18:01:00 UTC
Thank you for your interest in Satellite 6. We have evaluated this request, and we do not expect this to be implemented in the product in the foreseeable future. We are therefore closing this out as WONTFIX. If you have any concerns about this, please feel free to contact Rich Jerrido or Bryan Kearney. Thank you.