Bug 2217397

Summary: REX job finished with exit code 0 but the script failed on client side due to no space.
Product: Red Hat Satellite
Reporter: Hao Chang Yu <hyu>
Component: Remote Execution
Assignee: Adam Ruzicka <aruzicka>
Status: CLOSED ERRATA
QA Contact: Peter Ondrejka <pondrejk>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 6.11.5
CC: ahumbe, aruzicka, balu.shanmugam, ccordoui, iballou, rlavi, vdeshpan
Target Milestone: 6.15.0
Keywords: Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: rubygem-smart_proxy_remote_execution_ssh-0.10.2
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2246551 2250342 (view as bug list)
Environment:
Last Closed: 2024-04-23 17:11:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                               Flags
RHEL 7 Hotfix RPM for Satellite 6.11.5    none
RHEL 8 Hotfix RPM for Satellite 6.11.5    none
RHEL 8 Hotfix RPM for Satellite 6.12.5    none

Description Hao Chang Yu 2023-06-26 07:33:29 UTC
Description of problem:
Satellite shows that the REX job finished successfully, but the script actually failed because no space was available on the client.

Based on the Dynflow task output below, the wrapper script failed to write the exit code to a file because the disk was full, which caused the terminal to exit with 0.
----------------------
proxy_output:
  result:
    <snip>
  - output_type: stdout
      <snip>
      Error Summary
      -------------
      Disk Requirements:
        At least XXXX more space needed on the / filesystem. <=========================== Yum failed due to insufficient space

      Uploading Enabled Repositories Report
      Loaded plugins: product-id, subscription-manager
    timestamp: xxxxxxx
  - output_type: stdout
    output: |
      Package action failed, exiting...
      sh: line 0: echo: write error: No space left on device  <============================ Wrapper script failed to write the exit code to the file
    timestamp: xxxxxxx
  runner_id: xxxxxx
  exit_status: 0   <==================== wrong exit code.
----------------------




Additional info:

Based on the "sh: line 0: echo: write error: No space left on device" error above, I think the shell script that wraps the command failed to redirect the Yum exit code to the "@exit_code_path" file because the "/" filesystem had completely run out of space. Since the "@exit_code_path" file is empty, the terminal exited with status code 0.

--------------------------------
      <<-SCRIPT.gsub(/^\s+\| /, '')
      | sh -c "(#{@user_method.cli_command_prefix}#{su_method ? "'#{@remote_script} < /dev/null '" : "#{@remote_script} < /dev/null"}; echo \\$?>#{@exit_code_path}) | /usr/bin/tee #{@output_path}   <====================== redirect the YUM exit code to a file
      | exit \\$(cat #{@exit_code_path})"   <=============== exit the script with the exit code in the file
      SCRIPT
--------------------------------
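
The effect of an empty exit-code file can be reproduced in isolation. The sketch below is a minimal illustration and not code from the plugin: the function name `wrapper_tail_status` and the temporary paths are hypothetical, and the failed write is simulated by leaving the file empty rather than by filling a disk. It runs the same `exit $(cat ...)` construct as the last line of the wrapper.

```shell
#!/bin/sh
# Illustrative reproduction only; the function name and paths are hypothetical.

# Runs the wrapper's final construct, `exit $(cat FILE)`, in a child shell
# and prints the exit status that the child shell reported.
wrapper_tail_status() {
  status=0
  sh -c 'exit $(cat "$1")' sh "$1" || status=$?
  echo "$status"
}

tmpdir=$(mktemp -d)
exit_code_path="$tmpdir/exit_code"

# Simulate the failed write: the file exists but no exit code reached it.
# The command substitution then expands to nothing, so `exit` runs without
# an argument and reports the status of the last command, which is 0 here.
: > "$exit_code_path"
echo "empty file   -> $(wrapper_tail_status "$exit_code_path")"

# For comparison: the exit code of a failed command was written correctly.
echo 1 > "$exit_code_path"
echo "code written -> $(wrapper_tail_status "$exit_code_path")"

rm -rf "$tmpdir"
```

With the empty file the reported status is 0, and with the written code it is 1, which matches the wrong `exit_status: 0` seen in the Dynflow output.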

I think we should be able to prevent this issue by checking the exit status of the wrapping script itself, to cover the case where the wrapping script fails to write the exit code of the Yum command.
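
One way such a check could look is sketched below. This is an assumption for illustration only and not the fix that shipped in the plugin: the helper name `read_exit_code` is hypothetical, and the fallback value 1 is an arbitrary non-zero choice. The idea is to treat a missing, empty, or non-numeric exit-code file as a failure instead of passing an empty argument to `exit`.

```shell
#!/bin/sh
# Hypothetical hardening sketch; not the fix that shipped in the plugin.

# Prints the status the wrapper should exit with for a given exit-code file.
# A missing, empty, or non-numeric file yields 1 (an arbitrary non-zero
# fallback chosen for this example).
read_exit_code() {
  code=$(cat "$1" 2>/dev/null) || code=""
  case "$code" in
    '' | *[!0-9]*) echo 1 ;;
    *) echo "$code" ;;
  esac
}

tmpdir=$(mktemp -d)
: > "$tmpdir/empty"          # simulated failed write
echo 0 > "$tmpdir/success"   # command succeeded and the code was recorded

echo "empty:   $(read_exit_code "$tmpdir/empty")"
echo "success: $(read_exit_code "$tmpdir/success")"
echo "missing: $(read_exit_code "$tmpdir/missing")"
rm -rf "$tmpdir"
```

Under this assumption, the last line of the wrapper would call `exit "$(read_exit_code FILE)"`, so a job whose exit code could not be recorded is reported as failed rather than successful.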

Comment 1 balu.shanmugam 2023-07-20 05:14:35 UTC
Red Hat confirms it is a bug fixed in version 6.13.
We are on version 6.11. We need a hotfix for this version as well as for 6.12.
This bug is heavily impacting our patching, and we cannot wait for a migration to solve this.

Comment 2 Vedashree Deshpande 2023-08-08 06:33:42 UTC
Hello Balu, 

Thank you for the information. I tried to look into it but could not find any update from our Engineering team saying that the fix is available or that the bug is resolved in Satellite 6.13. Can you share the source or the steps you received so that we can test too?

Regards, 
Vedashree Deshpande.

Comment 3 Adam Ruzicka 2023-08-08 15:05:50 UTC
This is definitely not fixed on 6.13.

Comment 4 Adam Ruzicka 2023-08-08 15:06:49 UTC
Created redmine issue https://projects.theforeman.org/issues/36655 from this bug

Comment 5 balu.shanmugam 2023-08-09 04:46:34 UTC
@Vedashree Deshpande, I am not sure which source and steps you are referring to.
Please refer to Case 03542621 for more details.

Comment 6 Bryan Kearney 2023-10-05 16:02:38 UTC
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/36655 has been resolved.

Comment 8 Vedashree Deshpande 2023-10-10 11:59:52 UTC
(In reply to balu.shanmugam from comment #5)
> @Vedashree Deshpande, I am not sure which source and steps you are
> referring to.
> Please refer to Case 03542621 for more details.

Sure, I found the answer to how you reproduced the issue. Thank you.

Comment 10 Ian Ballou 2023-10-20 16:07:25 UTC
Created attachment 1994864 [details]
RHEL 7 Hotfix RPM for Satellite 6.11.5

A Hotfix RPM is now available for Satellite 6.11.5 on RHEL 7.

Installation instructions:

1. Take a backup or snapshot of the Satellite server.

2. Download the RHEL 7 hotfix RPM from the attachment.

3. # yum localinstall ./tfm-rubygem-smart_proxy_remote_execution_ssh-0.5.3-2.HOTFIXRHBZ2217397.el7sat.noarch.rpm --disableplugin=foreman-protector

4. # satellite-maintain service restart

Comment 11 Ian Ballou 2023-10-20 16:09:29 UTC
Created attachment 1994865 [details]
RHEL 8 Hotfix RPM for Satellite 6.11.5

Installation instructions (RHEL 8):

1. Take a backup or snapshot of the Satellite server.

2. Download the RHEL 8 hotfix RPM from the attachment.

3. # dnf install ./rubygem-smart_proxy_remote_execution_ssh-0.5.3-2.HOTFIXRHBZ2217397.el8sat.noarch.rpm --disableplugin=foreman-protector

4. # satellite-maintain service restart

Comment 12 Ian Ballou 2023-10-20 16:13:28 UTC
Created attachment 1994866 [details]
RHEL 8 Hotfix RPM for Satellite 6.12.5

A Hotfix RPM is now available for Satellite 6.12.5 on RHEL 8.

Installation instructions:

1. Take a backup or snapshot of the Satellite server.

2. Download the hotfix RPM from the attachment.

3. # dnf install ./rubygem-smart_proxy_remote_execution_ssh-0.7.3-2.HOTFIXRHBZ2217397.el8sat.noarch.rpm

4. # satellite-maintain service restart

Comment 13 Brad Buckingham 2023-10-30 11:29:29 UTC
Bulk setting Target Milestone = 6.15.0 where sat-6.15.0+ is set.

Comment 14 Peter Ondrejka 2023-11-27 15:25:51 UTC
Verified in stream snap 36

Comment 17 errata-xmlrpc 2024-04-23 17:11:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.15.0 release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:2010