Bug 470097

Summary: deploying config files to multiple machines generates ISE if any client cannot receive files
Product: Red Hat Satellite 5
Reporter: Michael George <mgeorge>
Component: Configuration Management
Assignee: Devan Goodwin <dgoodwin>
Status: CLOSED CURRENTRELEASE
QA Contact: Preethi Thomas <pthomas>
Severity: high
Docs Contact:
Priority: medium
Version: 511
CC: akarlsso, bperkins, cperry, jason.dobies, jplans, tao, xdmoon
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: sat530
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 490779 502954
Environment:
Last Closed: 2009-09-10 20:30:02 UTC
Type: ---
Bug Depends On:    
Bug Blocks: 456985, 478863, 490779, 502954    

Description Michael George 2008-11-05 18:32:34 UTC
User-Agent:       Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9b5) Gecko/2008042803 Red Hat/3.0b5-0.beta5.6.el5 Firefox/3.0b5

When trying to deploy a config file to multiple machines, an ISE (Internal Server Error) is generated if any of the clients cannot receive the files. This can happen when a profile in the Satellite refers to a machine that no longer exists, or when the client does not have the required packages installed (e.g. osad).

Reproducible: Always

Steps to Reproduce:
1. Select to manage multiple machines
2. Select to deploy a config file
3. ISE occurs every time (if any machine has a problem receiving the update)
Actual Results:  
An ISE occurs and no machine seems to receive the updated config file.

Expected Results:  
All machines receive the updated config file.

Comment 1 Michael George 2008-11-05 20:13:10 UTC
In an attempt to work around the problem, we instead issued a "remote command" to the same set of ~350 machines to execute the following command:

rhncfg-client get

This caused the load average of the Satellite to exceed 250 for around 45 minutes, so it was unusable during that time.

Of the 350 machines

~150 had successfully executed the command
~100 had failed due to timeout
~150 were marked as being in progress after 45 minutes.

The in-progress machines seem to be progressing slowly now, after several hours.

Attempts to reschedule the command on the failed systems generate an ISE as above.

Comment 10 Devan Goodwin 2009-01-22 14:48:28 UTC
Reproduced with:

- Register two systems.
- Enable deploy on one, leave the other alone.
- Subscribe both to a configuration channel with a single file in their individual SDC Configuration tabs.
- View system list, select both, press "Update List". (button has been renamed)
- View SSM, go to the Configuration tab, schedule a deploy for the single file.
- Confirmation screen shows only the system that has deploy enabled.
- Confirm -> ISE as per original report.

Working on a solution now.

Comment 12 Devan Goodwin 2009-01-22 18:50:15 UTC
Fixed in spacewalk commit: 8eef26ed0e7d0aa0f4b6b98926a822c169a16c39

To fix this, I corrected the discrepancy between the confirmation screen and the list of systems the application actually tried to deploy to (which was, as expected, everything in the set).

Triggering this exception should now be a highly unlikely (and exceptional) scenario, so I think it is safe and most logical to leave the ISE in place. I suggest we do this and, if the problem ever surfaces again, focus on handling it then.
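
For illustration only, the mismatch and the fix described in this comment can be sketched roughly as below. This is not the code from the spacewalk commit: the names System, deployable, schedule_deploy and DeployNotSupported are hypothetical stand-ins, and the sketch assumes the only thing that matters is whether deploy is enabled on each system in the set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    """Hypothetical stand-in for a registered system profile."""
    name: str
    deploy_enabled: bool


class DeployNotSupported(Exception):
    """Raised when a deploy is scheduled for a system that cannot accept it."""


def deployable(systems):
    """Return only the systems that can accept a config file deploy."""
    return [s for s in systems if s.deploy_enabled]


def schedule_deploy(systems):
    """Schedule a deploy for every system passed in.

    The exception is deliberately left in place: once callers pass the
    filtered list, it should only fire in a truly exceptional case.
    """
    for s in systems:
        if not s.deploy_enabled:
            raise DeployNotSupported(s.name)
    return [s.name for s in systems]


# Two systems in the set, deploy enabled on only one (as in comment 10).
ssm_set = [System("host-a", True), System("host-b", False)]

# Assumed shape of the fix: the confirmation screen and the scheduling
# step both work from the same filtered list, not from the whole set.
confirmed = deployable(ssm_set)
scheduled = schedule_deploy(confirmed)
```

Passing the unfiltered ssm_set to schedule_deploy raises DeployNotSupported for host-b, which corresponds to the behaviour in the original report, where the whole set was used even though the confirmation screen listed only the deploy-enabled system.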

Comment 17 Brandon Perkins 2009-09-10 20:30:02 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1434.html