Description of problem:
Remote execution for RHEL7.x clients fails with following error on satellite webui.
1: Failed to refresh the connector
2: Net::SSH::Disconnect connection closed by remote host
3: Exit status: EXCEPTION
But the same task works for RHEL6 clients successfully.

Version-Release number of selected component (if applicable):
tfm-rubygem-hammer_cli_foreman_remote_execution-0.0.5.3-1.el7sat.noarch
rubygem-smart_proxy_remote_execution_ssh-0.1.2-2.el7sat.noarch
tfm-rubygem-smart_proxy_remote_execution_ssh_core-0.1.2-1.el7sat.noarch
tfm-rubygem-foreman_remote_execution-0.3.0.12-1.el7sat.noarch

How reproducible:
Always

Steps to Reproduce:
1. Login to satellite webui --> Hosts --> Templates --> Job Templates --> New Job Template --> Use following details to create template:
******************************
Name: Yum Update - SSH Default
Job category: Power
Description format: %{action} host
Provider type: SSHExecutionProvider

Under Template input:
Name: action
Required: check-mark
Input type: User input
Options:
yum-update & reboot
yum-update
Description: action to perform on the server
Overridable: check-mark
******************************
Then submit and template should show:
******************************
logger "<%= input('action') %>"
<%= case input('action')
when 'yum-update & reboot'
'yum -y -d 1 -e 0 update && reboot'
else
'yum -y -d 1 -e 0 update'
end %>
******************************
2. Using the above template 'Run Job' on RHEL6 and RHEL7 client system.

Actual results:
- RHEL6 Client gets updated, rebooted and the satellite shows task executed successfully.
- RHEL7 client gets updated, rebooted but the satellite shows error
******************************
Failed to refresh the connector
Net::SSH::Disconnect connection closed by remote host
Exit status: EXCEPTION
******************************

Expected results:
- Remote executing with reboot command should result successful status on satellite webui.

Additional info:
1. networkmanager running.
2. updated the pam module for sshd; rebooted.

# cat /etc/pam.d/sshd
#%PAM-1.0
auth required pam_sepermit.so
auth substack password-auth
auth include postlogin
# Used with polkit to reauthorize users in remote sessions
-auth optional pam_reauthorize.so prepare
account required pam_nologin.so
account include password-auth
password include password-auth
# pam_selinux.so close should be the first session rule
session required pam_selinux.so close
session required pam_loginuid.so
# pam_selinux.so open should only be followed by sessions to be executed in the user context
session required pam_selinux.so open env_params
session required pam_namespace.so
session optional pam_keyinit.so force revoke
session include password-auth
session include postlogin
# Used with polkit to reauthorize users in remote sessions
-session optional pam_reauthorize.so prepare
-session optional pam_systemd.so
The problem is that we can't detect whether the error caused by the restart was expected or not. The proper way to restart the host and have the action reported as a success is to use "Power Action - SSH Default" from within the template, like this:

```
<%= render_template("Power Action - SSH Default", :action => "restart") %>
```
Turning the bug into documentation to add the following info to the host config guide: When performing power actions as part of a job template, one should use the "Power Action - SSH Default" template from within their template, so that Satellite doesn't interpret the disconnect exception caused by the ongoing reboot as an error, and marks the job as a success.
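To see what such a template expands to before running it, the sketch below renders it locally with Ruby's standard `erb` library. The `PreviewContext` class, its `input` and `render_template` methods, and the `preview` helper are hypothetical stubs standing in for Satellite's rendering helpers. The stub does not reproduce the real contents of "Power Action - SSH Default"; it only emits a placeholder line so the composition is visible.

```ruby
require 'erb'

# Hypothetical local stand-in for the Satellite rendering context.
# Neither helper is the Foreman implementation.
class PreviewContext
  def initialize(inputs)
    @inputs = inputs
  end

  # Returns the value the user would have supplied for a template input.
  def input(name)
    @inputs.fetch(name)
  end

  # Stub: the real helper renders the named job template. Here we emit a
  # placeholder comment line instead of guessing at the template body.
  def render_template(name, options = {})
    "# <body of '#{name}', action=#{options[:action]}>"
  end

  # Evaluates the ERB source with this object's methods in scope.
  def render(source)
    ERB.new(source).result(binding)
  end
end

# The template from this report, with the literal `reboot` replaced by an
# include of the power action template.
JOB_TEMPLATE = <<~'ERB_SOURCE'
  logger "<%= input('action') %>"
  yum -y -d 1 -e 0 update
  <%= render_template("Power Action - SSH Default", :action => "restart") if input('action') == 'yum-update & reboot' %>
ERB_SOURCE

# Renders the job template for one value of the 'action' input.
def preview(action)
  PreviewContext.new('action' => action).render(JOB_TEMPLATE)
end

puts preview('yum-update & reboot')
```

Calling `preview('yum-update')` renders the same script without the included power action, which makes it easy to compare the two branches side by side.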
Assigning to Lucie for review. Lucie - looks like we need to make an addition to the Host Configuration Guide to provide the above clarification regarding remote execution.
Information on how to set up an advanced job template for remote power actions has been added to the Host Configuration Guide. The changes are now live, see the updated procedure here (step 3): https://access.redhat.com/documentation/en/red-hat-satellite/6.2/single/host-configuration-guide/#proc-Host_Configuration_Guide-Creating_a_Job_Template And a newly added example here: https://access.redhat.com/documentation/en/red-hat-satellite/6.2/single/host-configuration-guide/#exam-Host_Configuration_Guide-Including_Power_Actions_in_Templates
The solution listed here has nothing to do with the problem. Leveraging a template simply results in the template's text being included. From what I can tell, there's no special "magic" about the "Power actions - SSH Default" template, and it simply inserts shutdown -h or reboot command as appropriate.

This bug has occurred for me when running jobs on several servers at once, yet does not happen when only run on a single server. It is not dependent on RHEL6 vs RHEL7. I'm baffled as to how the suggested fix listed above does anything to mitigate the bug described here. All it really does, as far as I can tell, is make things more complicated for no good reason.

I think this is more likely to be related to load on the satellite server than templating issues. The end result of a compiled template is a set of commands, so whether I include a template named "Power Actions - SSH Default" in my template, or include a "reboot" command, I'm hard pressed to see how this could possibly result in different behavior on the client side. Client doesn't compile templates, server does.

Please explain how this bug was marked as resolved with the resolution specified.
Also, the KB article created from this bug report is very strange. I'm told to add "-session optional pam_systemd.so" to /etc/pam.d/sshd, yet it already contains "session include password-auth", and /etc/pam.d/password-auth contains, you guessed it, "-session optional pam_systemd.so". There is no explanation as to why this would be necessary (it's not). I'm surprised to see the KB article is "Verified". I'm not sure what exactly was verified, since clearly it wasn't the content of the article. The notes on this bug report and the KB article are really strange to me. I typically find Red Hat provides accurate and detailed information, yet that seems to be completely missing from both of these.
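The include argument above can be checked mechanically. The following is a minimal sketch, not a full PAM parser: it follows `include` and `substack` lines and lists the modules in effect for one management group. The `sshd` sample reuses the session lines from the file quoted in this report, minus the final pam_systemd.so line the KB article asks for. The `password-auth` sample contains only the one line mentioned in the comment above, and `postlogin` is left empty; both are assumptions about files not shown here. `PAM_FILES` and `effective_modules` are hypothetical names.

```ruby
# Sample file contents keyed by service name (see assumptions in the text).
PAM_FILES = {
  'sshd' => <<~PAM,
    session required pam_selinux.so close
    session required pam_loginuid.so
    session required pam_selinux.so open env_params
    session required pam_namespace.so
    session optional pam_keyinit.so force revoke
    session include password-auth
    session include postlogin
    -session optional pam_reauthorize.so prepare
  PAM
  'password-auth' => <<~PAM,
    -session optional pam_systemd.so
  PAM
  'postlogin' => ''
}.freeze

# Returns the modules in effect for one management group of a service,
# following `include` and `substack` lines into the referenced files.
# Simplification: bracketed controls such as [success=1 default=ignore]
# are not handled.
def effective_modules(service, type, files, seen = [])
  return [] if seen.include?(service) # guard against include loops

  files.fetch(service, '').each_line.flat_map do |line|
    fields = line.strip.split
    next [] if fields.empty? || fields[0].start_with?('#')

    # A leading '-' only suppresses logging when the module is missing.
    line_type = fields[0].delete_prefix('-')
    next [] unless line_type == type

    control = fields[1]
    target = fields[2]
    if %w[include substack].include?(control)
      effective_modules(target, type, files, seen + [service])
    else
      [target]
    end
  end
end

session_modules = effective_modules('sshd', 'session', PAM_FILES)
puts session_modules.include?('pam_systemd.so')
```

On a real host the same check would need the actual contents of /etc/pam.d/password-auth and /etc/pam.d/postlogin in place of the sample strings.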
I dug through the foreman code and can't find anything that indicates there's special treatment of templates based on their name. I can unlock, edit, even delete the "Power Action - SSH Default" template. Can't find any indication that there's something special about it that makes Satellite "know" that the reboot command is okay, vs when I use an identical template named, say, "Mouse poop" which contains the same commands. Not to mention, well, the behavior of using "reboot" in my own template, vs including it by way of including the "Power Action - SSH Default" template, does not change behavior. This still happens randomly on RHEL6/7 hosts when a larger number of hosts are all updating and rebooting at the same time, and does not happen when only a handful are.