Bug 373131

Summary: Auto Apply Errata not working correctly
Product: [Retired] Red Hat Network
Reporter: daryl herzmann <akrherz>
Component: RHN/Backend
Assignee: Mike Orazi <morazi>
Status: CLOSED CURRENTRELEASE
QA Contact: Amy Owens <aowens>
Severity: high
Docs Contact:
Priority: low
Version: rhn500
CC: akrherz, duffy, edsall, newbery, pcfe, pmutha, rhn-bugs, xdmoon
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
URL: https://www.redhat.com/archives/rhn-users/2007-March/msg00059.html
Whiteboard: US=16955
Fixed In Version: 5.0.5
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-04-07 20:00:19 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 327071, 428842    
Attachments (Description: Flags):
package list: none
Daily RHN Status emails: none
RHN daily status report 10 Jan 2008: none
RHN Status Email for 18 Jan 2008: none

Description daryl herzmann 2007-11-09 16:47:35 UTC
+++ This bug was initially created as a clone of Bug #232902 +++

Description of problem:
Taken from a rhn-list posting and my own review. It seems that Auto Apply Errata
is not working as expected within RHN 5.0.0 since its release into production on
March 11th. 

I do not have full details yet, but it sounds like it kicked in for this account
and scheduled the application of errata to *all* systems on the account, even
those where the errata was not applicable, while also ignoring the 'Auto
Apply Errata' flag.

Creating this *public* bug for tracking. 

Cliff. 

Version-Release number of selected component (if applicable):


How reproducible:
Not sure yet
Report came from:
https://www.redhat.com/archives/rhn-users/2007-March/msg00059.html

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:
I'm not sure whether this is a hosted or proxy issue and I'm not sure
if it is operator error or something else that caused it. We have a
hosted environment running a current release proxy server with custom
channels. A product enhancement errata was released for some of those
custom channels a couple of days ago and today it was scheduled for
every system subscribed to the pertinent channels (or so it seems at a
glance).

Question 1: Should this errata be scheduled for machines which don't
have any relevant package installed? It was but this seems silly. They
picked it up and failed to install it harmlessly.

Question 2: Should it be scheduled for systems with auto errata update
set to no? It was. I can't tell how the scheduling is done, it is
listed as (none).

Question 3: Should systems with this errata scheduled but which have
auto errata updates set to no pick up the package and install it when
doing the next rhn_check? They did. This caused major grief as it
broke openAFS in this case and it seems to me that this should not
have happened.

Can anyone shed any light on any of this for me? It has been an
immensely long week and I'm exhausted so maybe I'm not thinking too
clearly at the moment.

Thanks,
John

-- Additional comment from cperry on 2007-03-19 12:31 EST --
Having reviewed, it seems that there is a bug in the background daemon that
schedules the application of 'Auto Apply Errata'. I have placed this bug onto the
must-fix list for the next scheduled minor maintenance release of the RHN Hosted
web site to get fixed.

Cliff. 

-- Additional comment from ksmith on 2007-03-20 15:42 EST --
* Fixed overly "generous" logic in the ErrataQueueWorker which was causing
auto-errata updates to be applied to too many machines.
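
For illustration only, the selection rule implied by Questions 1 and 2 above can be sketched as follows. This is not the ErrataQueueWorker code; the `System` class, its fields, and `systems_to_schedule` are hypothetical names, and the sketch ignores version comparison (it only checks whether an affected package is installed at all).

```python
from dataclasses import dataclass, field


@dataclass
class System:
    """Hypothetical stand-in for a registered system; not an RHN class."""
    sid: int
    auto_errata_update: bool
    installed: dict = field(default_factory=dict)  # package name -> version


def systems_to_schedule(systems, errata_packages):
    """Return the ids of systems that should get the erratum automatically.

    errata_packages is the set of package names the erratum updates.
    A system qualifies only if auto errata update is enabled AND at
    least one of those packages is installed on it.
    """
    return [
        s.sid
        for s in systems
        if s.auto_errata_update and errata_packages.intersection(s.installed)
    ]


systems = [
    System(1, True, {"openafs": "1.4"}),   # auto-update on, package present
    System(2, False, {"openafs": "1.4"}),  # auto-update off: must be skipped
    System(3, True, {"bash": "3.0"}),      # package not installed: skipped
]
print(systems_to_schedule(systems, {"openafs"}))
```

The behaviour reported in this bug corresponds to dropping both conditions, so that every subscribed system is selected.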

-- Additional comment from ksmith on 2007-03-22 10:09 EST --
Suggested Test Plan:

1) Find an account which has errata auto-updates disabled.

2) Create and publish a new errata.

3) Wait for the errata to get applied.

You can monitor the progress of errata processing by:
 3a) Edit /usr/share/rhn/classes/log4j.properties on scripts.back-webqa.redhat.com
 3b) Add the following line:
     log4j.logger.com.redhat.rhn.task.ErrataCacheTask=DEBUG
 3c) Bounce taskomatic: /sbin/service taskomatic restart
 3d) Monitor the /var/log/rhn/rhn_taskomatic_daemon.log file

4) Once errata processing is complete, verify that the server used in step #1
does not have the errata scheduled.
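
Steps 3a and 3b could be scripted along these lines. This is a minimal sketch that works on the text of the properties file only; the function name `with_debug_logger` is made up for this example, and reading or writing the real file and restarting taskomatic are left to the tester.

```python
LOGGER_LINE = "log4j.logger.com.redhat.rhn.task.ErrataCacheTask=DEBUG"


def with_debug_logger(properties_text, line=LOGGER_LINE):
    """Return properties_text with `line` present exactly once.

    Any existing setting for the same logger key is dropped first, so
    running this twice does not duplicate the entry.
    """
    key = line.split("=", 1)[0]
    kept = [
        existing
        for existing in properties_text.splitlines()
        if existing.split("=", 1)[0].strip() != key
    ]
    kept.append(line)
    return "\n".join(kept) + "\n"


print(with_debug_logger("log4j.rootLogger=INFO\n"), end="")
```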

-- Additional comment from pthomas on 2007-03-22 14:18 EST --
Fails QA.
Looks like it's not fixed.
ksmith is already looking into it.


-- Additional comment from ksmith on 2007-03-22 16:40 EST --
Ok. Let's try this again with a different test path. We'll need to simulate the
effects of prod-ops' errata scripts so some manual database munging is involved.

1) Use the /svn/trunk/eng/backend/server/test/test_errata_import_webqa.py script
to create a new errata. Change the advisory name before running the script. I've
used the house number part of my address to try to ensure uniqueness.

2) After the script has run, execute the following SQL query against the webqa
database:

select id from rhnErrata where created > sysdate - 1 and created < sysdate and
advisory like '%ADVISORY_NAME%'

Note: Replace the term ADVISORY_NAME with the unique value you selected in step
1. Make sure to leave the percent signs in place otherwise the query will not
work correctly.

3) Jot down the id returned from the query.

4) Create a work entry for taskomatic so it can process the new errata. This is
normally done by prod-ops' scripts, but we'll have to simulate it here.

insert into rhnErrataQueue values (ERRATA_ID, sysdate, sysdate, sysdate);
commit;

Note: Replace the term ERRATA_ID with the id from step 3.

5) Wait for taskomatic to process this record. I'd suggest waiting 15-20 minutes.

6) Login to an account which has servers with auto errata updates disabled.

7) Verify that these servers _do not_ have a pending errata action for the newly
created errata.

-- Additional comment from pthomas on 2007-03-23 08:50 EST --
verified.

Thanks, Kevin, for the detailed test plan.

-- Additional comment from bretm on 2007-03-26 09:41 EST --
I've checked this in stage using my own account... 

System 1007229597 has a bunch of errata scheduled (correct)
System 1007229598 does *not* have a bunch of errata scheduled (correct)



-- Additional comment from bretm on 2007-03-26 09:49 EST --
Moving to RELEASE_PENDING

-- Additional comment from akrherz on 2007-11-09 11:39 EST --
Greetings,

This problem appears to be back as I have most of my auto-apply systems yet to
apply RHEL5.1 .. Some SSIDs for ya:

1007413619
1008254853

(I have lots more if necessary)

daryl

Comment 4 Máirín Duffy 2007-11-14 15:22:50 UTC
What we have seen in the one customer case we know of:

- ~40 errata out of the ~136 RHEL 5.1 GA errata were successfully scheduled on
auto-errata update enabled systems.

- 4 asynchronous errata (openldap 6255, tetex 6254, pcre 6256, kdegraphics 6257)
released post 5.1 GA were correctly scheduled and updated.

- If rhn_check is run on an affected system, the 5.1 GA errata are not applied.
(and they are not queued up to be applied as 'pending actions' in the webui for
that system) 

- It seems that if other actions are scheduled and rhn_check is run, those
actions are scheduled and take place without issue. 

Comment 5 Máirín Duffy 2007-11-14 15:23:19 UTC
Created attachment 258171 [details]
package list

attaching a customer-provided list of errata that did not get scheduled for the
affected systems.

Comment 7 daryl herzmann 2007-12-10 22:13:00 UTC
Hi,

Currently, for errata released today

samba     RHSA-2007:1114  113 systems in "None" status for me
python    RHSA-2007:1077  0 systems currently scheduled for it
sysreport RHBA-2007:1071  1 system in "Pending"

We are still waiting for RHEL5 and 4.6 to be fully scheduled for our account.

daryl

Comment 8 daryl herzmann 2007-12-12 03:48:45 UTC
Hi,

A day later and another data point.

samba     RHSA-2007:1114  112 systems in "None" status for me
python    RHSA-2007:1077  is OK now
sysreport RHBA-2007:1071  1 system in "Pending"

The one fewer for samba is because I manually applied it on one machine as a test.

daryl

Comment 9 daryl herzmann 2007-12-13 19:47:30 UTC
Good day,

No meaningful change with the samba or sysreport errata.  

The autofs and lvm2 errata on 12 Dec seemed to get scheduled just fine.

Any updates from your side?

thanks,
  daryl

Comment 11 daryl herzmann 2007-12-27 17:26:50 UTC
Greetings,

Still no meaningful change from last updates.  Still waiting for samba,
sysreport, RHEL 5.1 and RHEL 4.6 to be scheduled.

daryl

Comment 12 daryl herzmann 2008-01-08 14:41:05 UTC
Good day,

Some script/action was run against our account last night scheduling all sorts
of errata against our "Auto-Errata" machines.  Any comments on what was done?

thanks!
  daryl

Comment 13 Dave Edsall 2008-01-09 15:02:05 UTC
It would appear that a script or action was run against Iowa State machines
again last night. Two days straight now, RHN has attempted to apply updates that
have already been applied for machines set to Auto Errata update. 

Here's a line of the output of /var/log/up2date for the machine with system ID
1008565488:


[Tue Jan  8 19:57:31 2008] up2date RPM package conflict error.  The message was:
Test install failed because of package conflicts:
package libpng-1.2.7-3.el4_5.1 is already installed
package libpng-devel-1.2.7-3.el4_5.1 is already installed


More alarming is an attempt to install a kernel (which I assume would be
prevented by the skiplist):

[Tue Jan  8 19:57:24 2008] up2date RPM package conflict error.  The message was:
Test install failed because of package conflicts:
package kernel-smp-devel-2.6.9-67.0.1.EL (which is newer than kernel-smp-devel-2.6.9-55.0.9.EL) is already installed
package kernel-hugemem-devel-2.6.9-67.0.1.EL (which is newer than kernel-hugemem-devel-2.6.9-55.0.9.EL) is already installed
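
To collect these conflicts across many machines, the "already installed" lines can be pulled out of the log with a small parser. This is a sketch based only on the two excerpts above, so other message shapes in /var/log/up2date are not handled, and lines wrapped mid-package-name need to be rejoined first. The function name is made up for this example.

```python
import re

# Matches both forms seen above; the "(which is newer than ...)" part is optional.
ALREADY_INSTALLED = re.compile(
    r"^package (?P<installed>\S+)"
    r"(?: \(which is newer than (?P<attempted>\S+)\))?"
    r" is already installed$"
)


def already_installed(log_text):
    """Return (installed, attempted_older) pairs found in log_text.

    attempted_older is None when the log line names no older package.
    """
    results = []
    for line in log_text.splitlines():
        match = ALREADY_INSTALLED.match(line.strip())
        if match:
            results.append((match.group("installed"), match.group("attempted")))
    return results


sample = """\
Test install failed because of package conflicts:
package libpng-1.2.7-3.el4_5.1 is already installed
package kernel-smp-devel-2.6.9-67.0.1.EL (which is newer than kernel-smp-devel-2.6.9-55.0.9.EL) is already installed
"""
for installed, attempted in already_installed(sample):
    print(installed, attempted)
```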





Comment 14 daryl herzmann 2008-01-09 15:04:11 UTC
Created attachment 291159 [details]
Daily RHN Status emails

Daily status emails from 8 Jan and 9 Jan showing the many old errata getting
scheduled.  All of these actions were scheduled around 7 PM on the 7th and 8th.

Comment 15 daryl herzmann 2008-01-09 16:08:37 UTC
task-o-matic is running again this morning :)  Lots of errata from 2007 getting
scheduled.  

Comment 17 daryl herzmann 2008-01-10 15:05:24 UTC
Created attachment 291287 [details]
RHN daily status report 10 Jan 2008

Today's RHN status email showing all sorts of errata getting scheduled.
Including:

 RHSA-2007:0848 	   207

But I only have 114 systems :)
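
A quick way to spot rows like this one is to compare each advisory's count with the number of systems on the account. The sketch below assumes report rows of the form "advisory, whitespace, count", as in the line quoted above; the rest of the status email layout is not known here, and `over_scheduled` is a hypothetical helper.

```python
import re

# One report row: an advisory id followed by a count, e.g. "RHSA-2007:0848    207".
ROW = re.compile(r"^\s*(RH[SBE]A-\d{4}:\d+)\s+(\d+)\s*$")


def over_scheduled(report_text, system_count):
    """Return {advisory: count} for rows whose count exceeds system_count."""
    flagged = {}
    for line in report_text.splitlines():
        match = ROW.match(line)
        if match and int(match.group(2)) > system_count:
            flagged[match.group(1)] = int(match.group(2))
    return flagged


print(over_scheduled(" RHSA-2007:0848 \t   207\n", 114))
```

A count above the system total suggests the same erratum was scheduled more than once for at least some systems.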

Comment 18 daryl herzmann 2008-01-11 21:36:08 UTC
Here is probably the best example of this problem:

https://rhn.redhat.com/rhn/errata/details/SystemsAffected.do?eid=6085

Showing 9 of my systems in "None" status for redhat-release Errata included with
RHEL5.1

SSID: 1008254853  remains in its untouched state for you folks to diagnose if
you wish.

Comment 19 daryl herzmann 2008-01-18 15:00:31 UTC
Created attachment 292154 [details]
RHN Status Email for 18 Jan 2008

Got this status email this morning showing a couple hundred errata from 2007
getting queued.  The only problem is that I can find no evidence of this on RHN
web UI under scheduled or completed actions.

Comment 21 daryl herzmann 2008-01-31 03:54:53 UTC
Wow,

We are still waiting for RHEL 4.6, RHEL 5.1, the samba errata and any sort of
explanation for what happened on 7 January.  Is anybody at Red Hat looking at
this bugzilla?  The only update we get from GSS is that there is no update to
report.

Comment 27 daryl herzmann 2008-02-22 23:25:03 UTC
Red Hat,

I am very concerned that this bug isn't being addressed.  This bug is not the
RHEL4 client + rhn satellite bug that you are feverishly working to fix.  

If this bug is being worked on, what bugzilla entry contains the progress?

thanks,
  daryl

Comment 28 Máirín Duffy 2008-02-22 23:43:19 UTC
Daryl, this bug is clearly and correctly filed under RHN (hosted), not Satellite. 

Comment 29 daryl herzmann 2008-02-22 23:53:13 UTC
Thanks Máirín,

But where is the activity on this bug taking place?  GSS told me there are a
number of private bz entries with lots of activity regarding this issue.  I
highly doubt they are against RHN hosted.  Those are sat customers.

Does Red Hat believe this issue and the RHEL4 client issue on RHN Satellite to be related?

daryl

Comment 30 Máirín Duffy 2008-02-23 00:42:57 UTC
Hi Daryl, I'm not sure where you are seeing a connection to a satellite bug in
this bug, but I don't see it; the private comments are very much hosted
specific. I do not know if they are thought to be related issues.

Comment 31 daryl herzmann 2008-02-23 04:23:07 UTC
Hi Máirín,

Okay, thanks as always for the help.  I sure hope this is fixed before RHEL 5.2
release.

daryl

Comment 37 daryl herzmann 2008-04-07 19:03:37 UTC
Hi,

Was this actually fixed in rhn 5.0.5 ?  Or is it pending some other release?

thanks!
  daryl

Comment 38 Amy Owens 2008-04-07 20:15:08 UTC
Hi-
The issue about 4.6/5.1 errata was fixed.  I am going to open another BZ
regarding seamonkey not getting scheduled (BZ 441389).  Dev is currently looking
at the process to push errata as it appears that some are not getting scheduled
correctly.