Description of problem:
Upgrading Satellite (e.g. from 6.3 to 6.4) can fail in a random rake script with a backtrace like:
RestClient::Exceptions::OpenTimeout: Timed out connecting to server
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.1/lib/restclient/request.rb:731:in `rescue in transmit'
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.1/lib/restclient/request.rb:642:in `transmit'
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.1/lib/restclient/request.rb:145:in `execute'
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.1/lib/restclient/request.rb:52:in `execute'
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.1/lib/restclient/resource.rb:67:in `post'
/opt/theforeman/tfm/root/usr/share/gems/gems/runcible-2.8.1/lib/runcible/base.rb:92:in `get_response'
/opt/theforeman/tfm/root/usr/share/gems/gems/runcible-2.8.1/lib/runcible/base.rb:72:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/runcible-2.8.1/lib/runcible/resources/repository.rb:147:in `unit_search'
/opt/theforeman/tfm/root/usr/share/gems/gems/runcible-2.8.1/lib/runcible/extensions/repository.rb:199:in `distributions'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/app/models/katello/repository.rb:620:in `import_distribution_data'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/app/models/katello/repository.rb:615:in `block in import_distributions'
/opt/theforeman/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.6/lib/active_record/relation/delegation.rb:39:in `each'
/opt/theforeman/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.6/lib/active_record/relation/delegation.rb:39:in `each'
(The lower frames vary with the particular use case / rake script that failed; for example, a few times:
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/app/models/katello/repository.rb:614:in `import_distributions'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/lib/katello/tasks/upgrades/3.7/make_all_ks_repos_bootable.rake:7:in `block (4 levels) in <top (required)>'
/opt/rh/rh-ruby24/root/usr/share/gems/gems/rake-12.0.0/exe/rake:27:in `<top (required)>'
)
It usually happens in a rake script that issues REST requests in a tight loop. Occurrences are very random, and more frequent on 6.5.
inecas++ found a fix:
diff --git a/app/models/setting.rb b/app/models/setting.rb
index 72a094024aa135fa41e0f6454d07c1d79fa3ac0a..e19ad55b24afb74c83f9e3033db2756541830fab 100644
--- a/app/models/setting.rb
+++ b/app/models/setting.rb
@@ -93,12 +93,8 @@ class Setting < ApplicationRecord
   def self.[](name)
     name = name.to_s
     cache_value = Setting.cache.read(name)
-    if cache_value.nil?
-      value = find_by(:name => name).try(:value)
-      Setting.cache.write(name, value)
-      return value
-    else
-      return cache_value
+    cache.fetch(name) do
+      find_by(:name => name).try(:value)
     end
   end
That fix really does work.
This BZ is mainly a tracker to make sure we fix it downstream and to decide everywhere it needs to be fixed.
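For reference, here is a minimal sketch of what Setting.[] looks like with the hunk applied. It is reconstructed from the diff above rather than copied from the shipped code, and it omits the now-unused cache_value read that the hunk keeps as a context line:

# Sketch only -- reconstructed from the diff above, not the shipped file.
class Setting < ApplicationRecord
  def self.[](name)
    name = name.to_s
    # cache.fetch returns the cached value when one is present; otherwise it
    # runs the block, stores the result under the key and returns it,
    # replacing the previous separate read / find_by / write sequence.
    cache.fetch(name) do
      find_by(:name => name).try(:value)
    end
  end
end

Reproducer a) below exercises exactly this code path via Setting['http_cache'].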
Version-Release number of selected component (if applicable):
Sat 6.4.2
How reproducible:
Very sporadically, during upgrade. A somewhat artificial reproducer found by Ivan is provided below.
Steps to Reproduce:
1. Decrease the Katello REST client timeout so that timeouts appear much sooner than after 1 hour, in /etc/foreman/plugins/katello.yaml:
:rest_client_timeout: 30
2. Run any (or all) of the following scripts that fire REST requests in a tight loop:
a) requests to 127.0.0.1/pub :
mkdir -p ~/repro
cat <<EOF | foreman-rake console | tee ~/repro/pub-$(date -Isec).log
conf.return_format = ""
log = Logger.new(STDOUT)
log.info "=== process pid #{\$\$}"
url = URI.parse('http://127.0.0.1/pub/')
threads = 1.times.map do |i|
  Thread.new do
    log.info "Start thread no. #{i+1}"
    iteration = 0
    cycle = 0
    while true
      begin
        cycle += 1
        if cycle == 1000
          iteration += 1
          cycle = 0
          log.info "=== Thread #{i+1}: iteration #{iteration}k"
        end
        Setting['http_cache']
        Net::HTTP.get_response(url)
      rescue Exception => e
        log.info "Thread #{i+1}: error at iteration #{iteration * 1000 + cycle}"
        log.info "#{e.class} #{e.message}"
        log.info e.backtrace.join("\n")
        sleep(1)
      end
    end
  end
end
threads.each(&:join)
EOF
b) requests to pulp repo details:
cat <<EOF | foreman-rake console | tee ~/repro/pulp-$(date -Isec).log
conf.return_format = ""
log = Logger.new(STDOUT)
pulp_id = Katello::Repository.first.pulp_id
Katello.pulp_server.extensions.repository
threads = 4.times.map do |i|
  Thread.new do
    log.info "Start thread no. #{i+1}"
    iteration = 0
    cycle = 0
    while true
      begin
        cycle += 1
        if cycle == 1000
          iteration += 1
          cycle = 0
          log.info "=== iteration #{iteration}k"
        end
        Katello.pulp_server.extensions.repository.distributions(pulp_id)
      rescue Exception => e
        log.info "Error at iteration #{iteration * 1000 + cycle}"
        log.info "#{e.class} #{e.message}"
        log.info e.backtrace.join("\n")
      end
    end
  end
end
threads.each(&:join)
EOF
c) optionally, run requests against Candlepin like the following (in a rake console):
4.times do
  Thread.new do
    begin
      org = Organization.first
      label = org.label
      iteration = 0
      cycle = 0
      while true
        cycle += 1
        if cycle == 1000
          iteration += 1
          cycle = 0
          puts "=== iteration #{iteration}k"
        end
        Katello::Resources::Candlepin::Owner.get_ueber_cert(label)
      end
    rescue Exception => e
      puts e.message
      puts e.backtrace.join("\n")
      raise e
    end
  end
end
3. Wait a few hours and count the number of timeouts.
Actual results:
3. Depending heavily on the system, anywhere from a few to many timeouts occur.
Expected results:
3. No timeouts occur.
Additional info:
VERIFIED
Version tested:
Satellite 6.5 snap 21
Able to reproduce with 6.4, but could not see any timeout on 6.5 snap 21 with the above reproducer steps.
I, [2019-03-29T04:18:54.060583 #5203] INFO -- : === Thread 1: iteration 163k
I, [2019-03-29T04:18:56.548624 #5203] INFO -- : === Thread 1: iteration 164k
I, [2019-03-29T04:18:59.073265 #5203] INFO -- : === Thread 1: iteration 165k
I, [2019-03-29T04:19:01.414366 #5203] INFO -- : === Thread 1: iteration 166k
I, [2019-03-29T04:19:03.754766 #5203] INFO -- : === Thread 1: iteration 167k
I, [2019-03-29T04:35:39.720092 #12961] INFO -- : === iteration 19k
I, [2019-03-29T04:35:40.208662 #12961] INFO -- : === iteration 19k
I, [2019-03-29T04:35:41.687506 #12961] INFO -- : === iteration 19k
I, [2019-03-29T04:36:22.321864 #12961] INFO -- : === iteration 20k
I, [2019-03-29T04:36:22.376873 #12961] INFO -- : === iteration 20k
I, [2019-03-29T04:36:23.768408 #12961] INFO -- : === iteration 20k
I, [2019-03-29T04:36:25.250421 #12961] INFO -- : === iteration 20k
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2019:1222