Bug 1295725 - [RFE] foreman-task to have warning goferd failed due to disk full
Summary: [RFE] foreman-task to have warning goferd failed due to disk full
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: katello-agent
Version: 6.1.5
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: Unspecified
Assignee: Katello Bug Bin
QA Contact: Katello QA List
URL:
Whiteboard:
Depends On:
Blocks: 1353215
Reported: 2016-01-05 10:58 UTC by Pavel Moravec
Modified: 2024-06-13 20:39 UTC (History)
CC List: 8 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-30 14:51:14 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System                             ID       Private  Priority  Status  Summary  Last Updated
Foreman Issue Tracker              16703    0        None      None    None     2016-09-27 11:55:34 UTC
Red Hat Knowledge Base (Solution)  2112391  0        None      None    None     2016-01-05 11:14:37 UTC

Description Pavel Moravec 2016-01-05 10:58:43 UTC
Description of problem:
Assume a content host (running goferd without a problem, connected to Satellite/Capsule port 5647) runs out of free disk space. This is a user error, but one that can easily happen.

In this situation, any new foreman-task propagated to goferd on this machine stalls forever (or until the timeout for package install / Capsule sync, presumably a matter of hours; I haven't waited that long). The usual symptom is that goferd has no established TCP connection for a long time, even though heartbeats are enabled.

It would be nice to catch this situation sooner and provide a meaningful error message in the foreman-task.


Version-Release number of selected component (if applicable):
(content host)
python-gofer-2.6.8-1.el7sat.noarch
python-gofer-proton-2.6.8-1.el7sat.noarch
gofer-2.6.8-1.el7sat.noarch

(Satellite)
ruby193-rubygem-foreman-tasks-0.6.15.7-1.el7sat.noarch


How reproducible:
100%

Steps to Reproduce:
1. Have a content host registered to Satellite 6
2. Fill its disk (at least /var partition)
3. hammer -u admin -p password content-host package remove --content-host-id UUID --organization-id 1 --packages sos
4. Monitor TCP connections established from goferd to Satellite/Capsule port 5647
5. (After a few minutes, free the disk on the content host)
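
For step 4, a minimal sketch of one way to check for established connections to port 5647, assuming a Linux content host where /proc/net/tcp is readable. The function name `established_to_port` is made up for illustration. The check is not restricted to the goferd process (that would require matching socket inodes against the process's file descriptors), and only the IPv4 table is covered.

```python
ESTABLISHED = "01"  # state code for ESTABLISHED in /proc/net/tcp


def established_to_port(proc_net_tcp_text, port=5647):
    """Return the raw rem_address fields of ESTABLISHED sockets
    whose remote port equals `port`.

    `proc_net_tcp_text` is the full text of /proc/net/tcp.
    Hypothetical helper, not part of gofer or Satellite.
    """
    matches = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        remote, state = fields[2], fields[3]
        # The port is the hex number after the last colon.
        remote_port = int(remote.rsplit(":", 1)[1], 16)
        if state == ESTABLISHED and remote_port == port:
            matches.append(remote)
    return matches


# Possible usage on the content host:
#     with open("/proc/net/tcp") as f:
#         print(established_to_port(f.read()))
```

An empty result over a longer period, while goferd is running, would correspond to the "no connection" symptom described in this report.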



Actual results:
3. times out / never finishes
4. no connection for a long time


Expected results:
3. to finish (sooner) with self-explanatory error
4. an established TCP connection should be present almost all the time


Additional info:
I *think* the problem is that goferd fails to write a JSON file with pending work to /var/lib/gofer/messaging/pending/katelloplugin, so it has nothing to pick up later on. This failure to write should be reported as a failed task.
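
To illustrate the requested behaviour, here is a minimal sketch of persisting a pending request and turning a write failure into an explicit result that could be sent back as a task failure. This is not gofer's actual code: the function name `store_pending`, the `sn` key, and the file naming are assumptions made for the example.

```python
import errno
import json
import os


def store_pending(request, pending_dir):
    """Persist `request` as JSON under `pending_dir`.

    Returns (True, path) on success, or (False, reason) if the
    file could not be written, so the caller can report the
    failure instead of silently losing the request.
    Hypothetical helper; `request["sn"]` is an assumed unique id.
    """
    path = os.path.join(pending_dir, "%s.json" % request["sn"])
    try:
        with open(path, "w") as f:
            json.dump(request, f)
            f.flush()
            os.fsync(f.fileno())  # surface write errors now, not later
    except OSError as e:
        if e.errno == errno.ENOSPC:
            return False, "disk full while writing %s" % path
        return False, "cannot write %s: %s" % (path, e.strerror)
    return True, path
```

The point of the design is only that a full disk (ENOSPC) produces a distinct, self-explanatory reason that the caller can propagate to the foreman-task rather than leaving it to time out.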

In parallel, goferd loses its TCP connection to qdrouterd for some time (in my reproducer it was re-established after some (tens of?) minutes); I have no idea what triggers this.

Comment 2 Pavel Moravec 2016-01-05 11:13:14 UTC
Removing "Improvement" due to another symptom detected:

The goferd process consumes 100% CPU after a while. That sounds like a bug rather than an improvement.

Comment 5 Pavel Moravec 2016-02-20 08:56:11 UTC
(In reply to Bryan Kearney from comment #4)
> so this is not fixed by 1295957?

Yes, thanks for spotting it. The goferd high CPU usage is no longer reproducible since qpid-proton 0.9-12 is used.

Changing back to "[Improvement] foreman-task to have warning goferd failed due to disk full" since goferd should be robust enough to report a full disk / a failure to create the JSON file in the katelloagent dir back to Katello.

Comment 6 Bryan Kearney 2016-02-20 21:24:06 UTC
Due to change... moving this out.

Comment 8 Bryan Kearney 2018-09-04 18:57:50 UTC
Thank you for your interest in Satellite 6. We have evaluated this request, and we do not expect this to be implemented in the product in the foreseeable future. We are therefore closing this out as WONTFIX. If you have any concerns about this, please feel free to contact Rich Jerrido or Bryan Kearney. Thank you.

Comment 9 Bryan Kearney 2018-09-04 19:09:02 UTC
Thank you for your interest in Satellite 6. We have evaluated this request, and we do not expect this to be implemented in the product in the foreseeable future. We are therefore closing this out as WONTFIX. If you have any concerns about this, please feel free to contact Rich Jerrido or Bryan Kearney. Thank you.

Comment 11 Bryan Kearney 2018-11-01 14:45:06 UTC
The Satellite Team is attempting to provide an accurate backlog of bugzilla requests which we feel will be resolved in the next few releases. We do not believe this bugzilla will meet those criteria, and have plans to close it out in 1 month. This is not a reflection on the validity of the request, but a reflection of the many priorities for the product. If you have any concerns about this, feel free to contact Rich Jerrido or Bryan Kearney or your account team. If we do not hear from you, we will close this bug out. Thank you.

Comment 12 Bryan Kearney 2018-11-30 14:51:14 UTC
Thank you for your interest in Satellite 6. We have evaluated this request, and while we recognize that it is a valid request, we do not expect this to be implemented in the product in the foreseeable future. This is due to other priorities for the product, and not a reflection on the request itself. We are therefore closing this out as WONTFIX. If you have any concerns about this, please do not reopen. Instead, feel free to contact Rich Jerrido or Bryan Kearney. Thank you.

