Note: This bug is displayed in read-only format because
the product is no longer active in Red Hat Bugzilla.
Red Hat Satellite engineering is moving the tracking of its product development work on Satellite to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "Satellite project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs will be migrated starting at the end of May. If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "Satellite project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/SAT-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Descriptionsthirugn@redhat.com
2019-03-13 14:28:21 UTC
+++ This bug was initially created as a clone of Bug #1687190 +++
Description of problem:
Upgrading Satellite (e.g. from 6.3 to 6.4) can fail at random rake script with backtrace like:
RestClient::Exceptions::OpenTimeout: Timed out connecting to server
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.1/lib/restclient/request.rb:731:in `rescue in transmit'
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.1/lib/restclient/request.rb:642:in `transmit'
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.1/lib/restclient/request.rb:145:in `execute'
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.1/lib/restclient/request.rb:52:in `execute'
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.1/lib/restclient/resource.rb:67:in `post'
/opt/theforeman/tfm/root/usr/share/gems/gems/runcible-2.8.1/lib/runcible/base.rb:92:in `get_response'
/opt/theforeman/tfm/root/usr/share/gems/gems/runcible-2.8.1/lib/runcible/base.rb:72:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/runcible-2.8.1/lib/runcible/resources/repository.rb:147:in `unit_search'
/opt/theforeman/tfm/root/usr/share/gems/gems/runcible-2.8.1/lib/runcible/extensions/repository.rb:199:in `distributions'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/app/models/katello/repository.rb:620:in `import_distribution_data'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/app/models/katello/repository.rb:615:in `block in import_distributions'
/opt/theforeman/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.6/lib/active_record/relation/delegation.rb:39:in `each'
/opt/theforeman/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.6/lib/active_record/relation/delegation.rb:39:in `each'
(now it depends per use case / rake script failed, e.g. few times:
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/app/models/katello/repository.rb:614:in `import_distributions'
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0.46/lib/katello/tasks/upgrades/3.7/make_all_ks_repos_bootable.rake:7:in `block (4 levels) in <top (required)>'
/opt/rh/rh-ruby24/root/usr/share/gems/gems/rake-12.0.0/exe/rake:27:in `<top (required)>'
)
Usually it happens in a rake script that calls REST requests in a tight loop. Very random occurrences, more often on 6.5.
inecas++ ++ found out a fix:
diff --git a/app/models/setting.rb b/app/models/setting.rb
index 72a094024aa135fa41e0f6454d07c1d79fa3ac0a..e19ad55b24afb74c83f9e3033db2756541830fab 100644
--- a/app/models/setting.rb
+++ b/app/models/setting.rb
@@ -93,12 +93,8 @@ class Setting < ApplicationRecord
def self.[](name)
name = name.to_s
cache_value = Setting.cache.read(name)
- if cache_value.nil?
- value = find_by(:name => name).try(:value)
- Setting.cache.write(name, value)
- return value
- else
- return cache_value
+ cache.fetch(name) do
+ find_by(:name => name).try(:value)
end
end
that really works.
This BZ is rather a tracker to ensure we fix it in downstream / to decide where all to fix it.
Version-Release number of selected component (if applicable):
Sat 6.4.2
How reproducible:
very sporadicaly during upgrade. Bit artificial reproducer found by Ivan is provided below.
Steps to Reproduce:
1. Decrease katello rest timeout to see the timeouts much sooner than after 1 hour:
/etc/foreman/plugins/katello.yaml :
:rest_client_timeout: 30
2. Run either (or all) scripts that fires REST requests in a tight loop:
a) requests to 127.0.0.1/pub :
mkdir -p ~/repro
cat <<EOF | foreman-rake console | tee ~/repro/pub-$(date -Isec).log
conf.return_format = ""
log = Logger.new(STDOUT)
log.info "=== process pid #{\$\$}"
url = URI.parse('http://127.0.0.1/pub/')
threads = 1.times.map do |i|
Thread.new do
log.info "Start thread no. #{i+1}"
iteration = 0
cycle = 0
while true
begin
cycle += 1
if cycle == 1000
iteration += 1
cycle = 0
log.info "=== Thread #{i+1}: iteration #{iteration}k"
end
Setting['http_cache']
Net::HTTP.get_response(url)
rescue Exception => e
log.info "Thread #{i+1}: error at iteration #{iteration * 1000 + cycle}"
log.info "#{e.class} #{e.message}"
log.info e.backtrace.join("\n")
sleep(1)
end
end
end
end
threads.each(&:join)
EOF
b) requests to pulp repo details:
cat <<EOF | foreman-rake console | tee ~/repro/pulp-$(date -Isec).log
conf.return_format = ""
log = Logger.new(STDOUT)
pulp_id = Katello::Repository.first.pulp_id
Katello.pulp_server.extensions.repository
threads = 4.times.map do |i|
Thread.new do
log.info "Start thread no. #{i+1}"
iteration = 0
cycle = 0
while true
begin
cycle += 1
if cycle == 1000
iteration += 1
cycle = 0
log.info "=== iteration #{iteration}k"
end
Katello.pulp_server.extensions.repository.distributions(pulp_id)
rescue Exception => e
log.info "Error at iteration #{iteration * 1000 + cycle}"
log.info "#{e.class} #{e.message}"
log.info e.backtrace.join("\n")
end
end
end
end
threads.each(&:join)
EOF
c) optionally, run requests against candlepin like (in rake console):
4.times do
Thread.new do
begin
org = Organization.first
label = org.label
iteration = 0
cycle = 0
while true
cycle += 1
if cycle == 1000
iteration += 1
cycle = 0
puts "=== iteration #{iteration}k"
end
Katello::Resources::Candlepin::Owner.get_ueber_cert(label)
end
rescue Exception => e
puts e.message
puts e.backtrace.join("\n")
raise e
end
end
end
3. Wait few hours and count number of timeouts
Actual results:
3. VERY depending on system, few to many timeouts happen
Expected results:
3. no timeouts
Additional info:
--- Additional comment from RHEL Product and Program Management on 2019-03-10 19:18:08 UTC ---
Since this bug report was entered in Red Hat Bugzilla, the 'sat-backlog' flag has been set to ? to ensure that it is properly evaluated for release.
--- Additional comment from RHEL Product and Program Management on 2019-03-10 19:18:08 UTC ---
Since this issue was entered in Red Hat Bugzilla, the pm_ack has been set to + automatically for the next planned release.
VERIFIED
Version tested:
# rpm -q satellite
satellite-6.4.3-1.el7sat.noarch
No timeout seen with provided reproducer
I, [2019-04-11T05:06:38.925736 #19214] INFO -- : === Thread 1: iteration 100k
I, [2019-04-11T05:06:41.913713 #19214] INFO -- : === Thread 1: iteration 101k
I, [2019-04-11T05:06:45.022606 #19214] INFO -- : === Thread 1: iteration 102k
I, [2019-04-11T05:06:48.103804 #19214] INFO -- : === Thread 1: iteration 103k
I, [2019-04-11T05:06:51.326653 #19214] INFO -- : === Thread 1: iteration 104k
I, [2019-04-11T05:19:39.623569 #1336] INFO -- : === iteration 16k
I, [2019-04-11T05:19:40.514961 #1336] INFO -- : === iteration 16k
I, [2019-04-11T05:20:22.603960 #1336] INFO -- : === iteration 17k
I, [2019-04-11T05:20:24.864712 #1336] INFO -- : === iteration 17k
I, [2019-04-11T05:20:25.063272 #1336] INFO -- : === iteration 17k
I, [2019-04-11T05:20:26.603654 #1336] INFO -- : === iteration 17k
I, [2019-04-11T05:21:08.424344 #1336] INFO -- : === iteration 18k
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:0903