Bug 248535

Summary:	sealert hogs cpu at login
Product:	[Fedora] Fedora	Reporter:	cje
Component:	setroubleshoot	Assignee:	John Dennis <jdennis>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	medium	Docs Contact:
Priority:	low
Version:	7
Target Milestone:	---
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:	setroubleshoot-2.0.1-1	Doc Type:	Bug Fix
Last Closed:	2008-01-09 17:00:50 UTC	Type:	---

Description cje 2007-07-17 10:29:51 UTC

Description of problem:
every time i log in sealert uses up all cpu for a few minutes at least.  i
suspect it's because i had an selinux problem with munin that put a few alerts
in the log files every five minutes for many days.  now there's thousands of
entries in the log files and i guess sealert is scanning through them all every
time.

Version-Release number of selected component (if applicable):
setroubleshoot-1.9.4-2.fc7

not sure what to do about it.  somehow setroubleshoot/sealert seems to know
what's 'new' and what isn't.  perhaps whatever method is used for that could be
extended so that it doesn't take so long to scan each time.

sorry, i realise this is a bit vague.  just not sure what details to add or how
to diagnose/debug further.  let me know if there's anything i can do to assist.

Comment 1 John Dennis 2007-07-17 12:45:21 UTC

How many alerts do you have? Look at the status bar on the bottom, there is a
field with two numbers separated by a slash (e.g. 123/123). That is the total
count and visible count.

If you have a large number and you think that's the culprit then you should try
deleting them. Select as many as you want, choose "Mark Delete" from the edit
menu and then "Remove Mark Deleted" to actually remove them. Does this solve the
problem?

Comment 2 cje 2007-07-17 12:59:20 UTC

right.  in the setroubleshoot browser it says 3433/3433.  the first 200 or so
are crossed out (strikethrough) in the list.  looks like i tried ctrl-A, Delete
before but didn't know there's a 'Really delete' option.

did ctrl-A, Delete again.  still waiting for the window to respond.  there's no
significant cpu or disk activity but i imagine it's busy applying strikethrough
to 3200 list entries.  :-)

when/if it comes back i'll do a ctrl-E and then logout and login and we'll see.

thanks for the info.

Comment 3 cje 2007-07-17 13:01:24 UTC

sorry - thought bugzilla was going to ask me if i'd provided the requested info.
must be the gnome bugzilla that does that.

Comment 4 cje 2007-07-17 15:27:36 UTC

all better now thanks!  it's taken all this time to clear out that list - if i
selected more than a few hundred at a time it froze after the ctrl-e.  killing
it and restarting took a long time and then i had to undelete some of the marked
items before i could successfully ctrl-e again!

so, thanks again for the tip.

Comment 5 John Dennis 2007-07-17 16:05:11 UTC

O.K., glad that worked. However, that is only a workaround; we really shouldn't
get into that situation to begin with. 3.5K alerts is way more than the system
was designed for.

In theory the system was designed to recognize that an AVC denial is just another
instance of a previously seen alert and it just increments the count on the
alert rather than adding a new alert. What I think should have happened is that
you would have had a handful of alerts whose count was 3.5K and not 3.5K
individual alerts.
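
The intended behavior described above can be sketched roughly as follows. This
is a minimal illustration in Python, not setroubleshoot's actual code: the
AlertStore class, the signature() function, and the denial field names are all
assumptions made up for the example.

```python
from collections import OrderedDict


def signature(denial):
    """Hypothetical signature: the fields taken to identify 'the same problem'."""
    return (
        denial["scontext"],
        denial["tcontext"],
        denial["tclass"],
        denial["perm"],
        denial["path"],
    )


class AlertStore:
    """Keeps one alert per signature and counts repeat occurrences."""

    def __init__(self):
        self.alerts = OrderedDict()

    def add(self, denial):
        key = signature(denial)
        if key in self.alerts:
            # Previously seen alert: only bump its count.
            self.alerts[key]["count"] += 1
        else:
            # First occurrence: store it as a new alert.
            self.alerts[key] = {"denial": denial, "count": 1}
        return self.alerts[key]["count"]


if __name__ == "__main__":
    store = AlertStore()
    base = {
        "scontext": "munin_t",
        "tcontext": "initrc_t",
        "tclass": "unix_stream_socket",
        "perm": "read",
        "path": "socket:[237623]",
    }
    store.add(base)
    store.add(dict(base))
    # Any field that differs between events yields a separate alert.
    store.add(dict(base, path="socket:[237624]"))
    print(len(store.alerts))
```

With a scheme like this, repeats of an identical denial collapse into one alert
with a rising count, but any field in the signature that changes from event to
event defeats the matching and produces a new alert each time.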

You've since deleted the alert database, but if you continue to get these alerts
I need to know what they are so I can track down why you were getting this
behavior. So if you see the setroubleshoot browser filling up with what seems
like the same problem alert then please select it, from the edit menu select
"Copy Alert" and then paste that into this bugzilla. Thanks!

Comment 6 cje 2007-07-17 23:38:34 UTC

fair enough.  i've recorded the denials in bug 248304.  easy enough to reproduce
if you need more details.

i'm guessing it doesn't realise they're the same because it's denying read/write
on a socket - perhaps the id of the socket is different with each event.

on the same note, the suggestions need to be changed for socket errors - it says
i can restorecon socket:[237623] (or words to that effect), which doesn't sound
right.  (certainly doesn't work anyway).

more weirdness, for completeness:

deleted all alerts from another host.  sealert then hung.  running it again just
hangs.  restarted setroubleshootd a few hours later (with a bunch more messages
in the system log) - a minute of 100%cpu (and this is on a faster host) ..
sealert still hangs with the spinner going back and forth over "Load audit" in
the status bar.

Comment 7 John Dennis 2008-01-09 17:00:50 UTC

These issues should be resolved in the 2.0 setroubleshoot series. We now cap the
maximum number of alerts at 30 and we strip instance information from socket
names so they no longer appear unique.
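
As a rough illustration of those two changes, here is a small Python sketch. It
is not the setroubleshoot 2.0 code: the strip_instance() helper, the
CappedAlerts class, the regular expression, and the choice to evict the least
recently seen alert when the cap is hit are all assumptions for the example.
Only the cap of 30 and the idea of stripping the socket instance come from the
comment above.

```python
import re
from collections import OrderedDict

MAX_ALERTS = 30  # cap stated in the comment above

# Matches names such as "socket:[237623]" (assumed format, taken from comment 6).
_INSTANCE = re.compile(r"^(\w+):\[\d+\]$")


def strip_instance(path):
    """Drop the per-instance number so 'socket:[237623]' becomes 'socket'."""
    match = _INSTANCE.match(path)
    return match.group(1) if match else path


class CappedAlerts:
    """Counts alerts by normalized key and never holds more than `limit`."""

    def __init__(self, limit=MAX_ALERTS):
        self.limit = limit
        self.alerts = OrderedDict()

    def add(self, tclass, perm, path):
        key = (tclass, perm, strip_instance(path))
        self.alerts[key] = self.alerts.get(key, 0) + 1
        # Mark this alert as the most recently seen.
        self.alerts.move_to_end(key)
        # Assumed policy: discard the least recently seen alerts over the cap.
        while len(self.alerts) > self.limit:
            self.alerts.popitem(last=False)


if __name__ == "__main__":
    alerts = CappedAlerts()
    alerts.add("unix_stream_socket", "read", "socket:[237623]")
    alerts.add("unix_stream_socket", "read", "socket:[237624]")
    print(dict(alerts.alerts))
```

Here the two socket denials land on the same key, so they show up as one alert
with a count of 2 rather than two separate alerts.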