Bug 436032

Summary: SEtroubleshoot browser hangs 'mark delete' or 'remove marked deleted'
Product: [Fedora] Fedora
Reporter: Andrew Farris <lordmorgul>
Component: setroubleshoot
Assignee: Thomas Liu <tliu>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Docs Contact:
Priority: low    
Version: 9   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-07-14 18:24:02 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                                                     Flags
setroubleshoot-hang-large-markdelete.txt                        none
audit_listener_database.xml (3Mb)                               none
audit_listener_database.xml with more than 20 audits, 99k file  none

Description Andrew Farris 2008-03-04 22:11:42 UTC
Description of problem:
SEtroubleshoot browser hangs when attempting to 'mark delete' or 'remove marked
deleted' for a large number of denials.  If I select more than 20 or so, it
hangs.  The process just goes to sleep; no CPU spin or errors are shown.

Version-Release number of selected component (if applicable):
setroubleshoot-2.0.6-1.fc9.noarch

How reproducible:
Every time.  I can delete entries if I do them about 10 at a time, but it takes
a long time to complete that.

Steps to Reproduce:
1. select 20 denials
2. mark delete
3. remove marked deleted

I'll work on more info.  It does not appear to just be taking a long time to
handle the job.

Comment 1 John Dennis 2008-03-04 22:22:08 UTC
Did you restart the sealert program after installing?

In 2.0 you can't have more than 30 alerts total unless you overrode the
defaults, so something sounds fishy.

Please attach /var/lib/setroubleshoot/audit_listener_database.xml.


Comment 2 John Dennis 2008-03-04 22:23:08 UTC
Also please attach /var/log/setroubleshoot/setroubleshootd.log

Comment 3 Andrew Farris 2008-03-04 22:37:31 UTC
Created attachment 296822 [details]
setroubleshoot-hang-large-markdelete.txt

Here is a backtrace of its state when already hung (attached after the hang).
First I ran 't a a bt' (gdb shorthand for 'thread apply all bt'), then continued
for 10 minutes (nothing happened, I just walked away), then ran 't a a bt'
again.  The app is fully stopped and doing nothing.

Comment 4 Andrew Farris 2008-03-04 22:42:10 UTC
Created attachment 296824 [details]
audit_listener_database.xml (3Mb)

I see this in the log.
2008-03-04 13:53:04,611 [rpc.ERROR] could not send data on socket ({unix}/var/run/setroubleshoot/setroubleshoot_server socket=0xb7af8aac): Connection reset by peer
2008-03-04 14:03:53,619 [rpc.ERROR] could not send data on socket ({unix}/var/run/setroubleshoot/setroubleshoot_server socket=0xb7af8aac): Connection reset by peer
2008-03-04 14:34:21,707 [rpc.ERROR] could not send data on socket ({unix}/var/run/setroubleshoot/setroubleshoot_server socket=0xbb34e2c): Connection reset by peer

That's all that is there.

Comment 5 Andrew Farris 2008-03-04 22:44:05 UTC
This is probably an audit database that's been sitting around a while, then. :)
Yes, I've rebooted since installing, but the database has entries I had left
there from weeks back.

I'll move it and see what happens when I get new audits showing up.

Comment 6 John Dennis 2008-03-04 22:47:06 UTC
Please restart both components:

as root:

% service setroubleshoot restart

in your session:

% sealert -q
% sealert -s

Does the problem persist?

Comment 7 John Dennis 2008-03-04 22:49:12 UTC
Re comment #5: the new 2.0 daemon is supposed to wipe a 1.x database clean, so
in theory the 2.0 daemon shouldn't be operating on an old format database.

Comment 8 Andrew Farris 2008-03-04 22:56:35 UTC
I'll get back to you on this later (class time), but I'll try with and without
that old database and see if it's properly wiping it.

Comment 9 Andrew Farris 2008-03-05 05:07:14 UTC
Ok John
Restarting service setroubleshoot does not remove the old 3Mb database. 
Starting the browser takes a long time to read it, and the hang issue is there
when trying to remove entries from that huge file.

If I move the database and start a new empty one by restarting setroubleshoot
again, then use sealert -s, things work smoothly, but there are too few entries
now to know what will happen later.  I have 3 alerts in my new file, and I can
mark and unmark them, but I can do that with the old file as well if I go a few
at a time.

Nothing in the larger old file seems significantly different (as far as database
type, xml hierarchy, etc).  Is the file the correct database type for the 2.0
daemon?  If so, why did it get so large if it's supposed to limit to 20 now?

I'll keep the new empty database and see what happens as I get more audits over
the next few days.

Comment 10 John Dennis 2008-03-05 14:32:11 UTC
Thanks for the information Andrew. Here is what is supposed to happen. The top
level node in the xml has a version attribute. When the daemon starts up the
first thing it does is read the database, but if the versions do not match it
ignores what it just read; thus the file is not deleted or truncated. The first
time the daemon needs to write the database (because a new alert arrived) it
writes it in the new format. If you have matching versions of the daemon and the
gui component this should work. If for some reason you have an old version of
setroubleshoot-server I would expect the behavior you're seeing.

The 1.0 version of the database is similar, but not identical, to the 2.0
version; cursory examination probably won't reveal the differences.

The version information should be the 2nd line in the xml file, e.g.

<sigs version="3.0">

Could you please tell me the database version of the old large file giving you
problems and the rpm version of setroubleshoot-server?
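
For illustration, the startup check described above could be sketched in Python
roughly as follows. This is a minimal sketch and not the setroubleshoot source:
the function names database_version and should_use_database are hypothetical,
and the expected version "3.0" is taken only from the example line above.

```python
import xml.etree.ElementTree as ET


def database_version(xml_text):
    """Return the 'version' attribute of the top-level node, or None if absent.

    Hypothetical helper: accepts the database contents as str or bytes.
    """
    root = ET.fromstring(xml_text)
    return root.get("version")


def should_use_database(xml_text, expected_version="3.0"):
    """Return True only when the file's version matches the expected one.

    On a mismatch the caller would ignore what was read and leave the file
    on disk untouched until the next write (an assumption based on the
    description in this comment).
    """
    return database_version(xml_text) == expected_version
```

Reading the file as bytes and passing it to should_use_database() gives a quick
yes/no answer on whether a matching daemon would keep the existing entries.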

As to how the old database got so large, the likely answer is that in the old
1.0 version we failed to recognize some classes of alerts as representing the
same issue and kept adding them to the database rather than updating a single
alert. In 2.0 we've hopefully fixed many of those. The reason we now limit the
number of alerts to 30 (by default) is to prevent explosive growth if the system
gets hammered by a lot of AVCs or we fail to merge a new alert into a previous
one.
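
The intended combination of merging and capping can be sketched in Python as
below. This is only an illustration under stated assumptions, not the daemon's
implementation: the AlertStore class is hypothetical, alerts are reduced to a
plain signature string, and dropping the oldest alert when the cap is exceeded
is an assumed policy that this report does not confirm.

```python
from collections import OrderedDict


class AlertStore:
    """Hypothetical bounded alert store keyed by alert signature."""

    def __init__(self, max_alerts=30):
        self.max_alerts = max_alerts
        # Maps signature -> number of times it was reported,
        # kept in order of first arrival.
        self.alerts = OrderedDict()

    def add(self, signature):
        """Merge a repeated alert or store a new one; return the alert count."""
        if signature in self.alerts:
            # Same issue seen again: update the existing alert.
            self.alerts[signature] += 1
        else:
            self.alerts[signature] = 1
            # Enforce the cap by discarding the oldest alerts (assumed policy).
            while len(self.alerts) > self.max_alerts:
                self.alerts.popitem(last=False)
        return len(self.alerts)
```

With a store like this the number of alerts never exceeds max_alerts, however
many distinct signatures arrive, and a failure to merge only costs one slot.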

Comment 11 Andrew Farris 2008-03-05 19:09:20 UTC
<?xml version="1.0" encoding="utf-8"?>
<sigs version="3.0">

I have setroubleshoot-server-2.0.6-1.fc9.noarch.  The 3Mb database was not being
truncated as new denials were written to it; the newest denials were from the
day I posted this and the oldest were kept from 2/17 I believe (hundreds of them
in the file).

I'll poke more later with some badly labeled files and force test what happens
some more now that I know what to look for.

Comment 12 Andrew Farris 2008-03-13 07:59:12 UTC
Created attachment 297902 [details]
audit_listener_database.xml with more than 20 audits, 99k file

This audit database is from another of my machines, and it also has older
audits, back into mid Feb, but it's currently being loaded with more than 20
audit entries showing.  I also watched it grow by several entries today.

setroubleshoot-2.0.6-1.fc9.noarch

I have not had a chance to really recreate this with a fresh audit database and
go from zero to above 20.

Comment 13 Andrew Farris 2008-03-13 08:01:06 UTC
Oops, sorry, I forgot it's supposed to cut off at 30.  That last attachment can
be ignored for the moment; it's just 24 audits.

Comment 14 Andrew Farris 2008-03-21 21:35:14 UTC
John, I haven't had a good way of testing this and been short of time so I
didn't try anything too clever yet.  I did do a quick edit of my current
database to insert a copy of a current <siginfo> block several extra times until
the database has 33 total events shown in the browser.  When the service is
restarted this does not get truncated to 30, and when the browser opens it shows 33.

When I forced a new audit to be generated it added the 34th entry to the
database, and did not truncate it to 30.  Do I misunderstand you, or is that
never supposed to exceed 30 in the database now?  Is it possible my quick hack
is an inadequate method of testing (are the copied entries not counted toward
the 30)?  The browser shows 34/34 as the number of them and all show as
separate entries.
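
A count taken directly from the file, independent of the browser, can be
obtained with a short Python sketch like the one below. It assumes each alert
is stored as a <siginfo> element somewhere under the top-level node; the
function names count_alerts and exceeds_limit are hypothetical, and the limit
is passed in by hand rather than read from the configuration.

```python
import xml.etree.ElementTree as ET


def count_alerts(xml_text):
    """Count <siginfo> elements at any depth in the database contents."""
    root = ET.fromstring(xml_text)
    return sum(1 for _ in root.iter("siginfo"))


def exceeds_limit(xml_text, max_alerts=30):
    """Return True when the database holds more alerts than max_alerts."""
    return count_alerts(xml_text) > max_alerts
```

Running count_alerts() on the file contents before and after a new denial
arrives shows whether the daemon itself is writing past the limit.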

Comment 15 Andrew Farris 2008-03-21 21:36:16 UTC
By 'current database' above I meant a new one originally written by the current
setroubleshoot, not the 3Mb old database.

Comment 16 Andrew Farris 2008-03-21 21:42:08 UTC
/etc/setroubleshoot/setroubleshoot.cfg had a setting max_alerts that was set to
50 on my system.  Changing this to 20 and then repeating my test of causing a
new alert type gave the same result: a new one is added, making the total
exceed the max.

Comment 17 Andrew Farris 2008-04-08 02:38:41 UTC
I now have a database of alerts that has grown above the limit of 50 set in the
config, and this database was not manually edited.  I have 51 alerts shown.  It
looks like the truncation that is supposed to occur is not happening.

Comment 18 Bug Zapper 2008-05-14 05:46:46 UTC
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 19 Bug Zapper 2009-06-09 23:40:58 UTC
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 20 Bug Zapper 2009-07-14 18:24:02 UTC
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.