Bug 896246 - sensitive info false positive on libsystemd-login.so.0
Summary: sensitive info false positive on libsystemd-login.so.0
Keywords:
Status: CLOSED DUPLICATE of bug 1009730
Alias: None
Product: Fedora
Classification: Fedora
Component: abrt
Version: 18
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Jiri Moskovcak
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
: 1008914 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2013-01-16 22:02 UTC by Karel Volný
Modified: 2015-02-01 22:55 UTC (History)
12 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-09-27 11:26:29 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
abrt-gui screenshot (105.58 KB, image/png)
2013-01-22 12:59 UTC, Karel Volný
no flags Details
abrt gui with highlighted search arrorws (102.83 KB, image/png)
2013-01-22 13:23 UTC, Jiri Moskovcak
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 998999 0 unspecified CLOSED It's very hard to find out how to see all your private data in the report 2021-02-22 00:41:40 UTC

Internal Links: 998999

Description Karel Volný 2013-01-16 22:02:44 UTC
Description of problem:
I F18, things crash every here and there. Trying to report via abrt, I'm getting false positives aout sensitive informations in backtraces and memory maps - the Artificial non-Inteligence matches the string "login" in the name of library "libsystemd-login.so.0"

Version-Release number of selected component (if applicable):
abrt-2.0.20-1.fc18.x86_64

How reproducible:
always (if libsystemd-login.so.0 is involved in the crash)

Steps to Reproduce:
1. install Fedora 18
2. try to use it for five minutes
3. run abrt-gui to report the crashes that happened meanwhile
4. choose some crash and try to report it

Actual results:
the wizard tells you there is sensitive info in backtrace
examining the textarea, you see that the string "login" in "libsystemd-login.so.0" is highlighted in red

Expected results:
"login" is not highlighted if it is a part of packaged file name

Additional info:
btw, it would be nice if the whole line with the offending keyword would be highlighted (with the keyword being in bold)so that you can spot the problems easier, even some buttons allowing to scroll to next/previous problem would be handy ... but that's a RFE material for a new bug, and now I'm lazy to search if somebody hasn't entered it already, bugzilla responses are in tenths of seconds now :-(

Comment 1 Jiri Moskovcak 2013-01-21 12:44:33 UTC
It has to be that "not-intelligent" because in some backtraces there can be variables like "mylogin", "mypassword", "tmppassword", .etc... So ignoring it just because it's part of the name of some library would be dangerous because the UI logic doesn't know what is the content it's highlighting...

As for the additional info part: There are arrows to jump thru all the highlighted words, so highlighting the whole line is not necessary.

Comment 2 Karel Volný 2013-01-22 12:58:29 UTC
(In reply to comment #1)
> It has to be that "not-intelligent" because in some backtraces there can be
> variables like "mylogin", "mypassword", "tmppassword", .etc... So ignoring
> it just because it's part of the name of some library would be dangerous
> because the UI logic doesn't know what is the content it's highlighting...

please read carefully

I'm NOT proposing to ignore the string "login"

> As for the additional info part: There are arrows to jump thru all the
> highlighted words, so highlighting the whole line is not necessary.

pardon my blindness, but I really do not see such arrows - could you highlight them on the attached screenshot please?

Comment 3 Karel Volný 2013-01-22 12:59:53 UTC
Created attachment 685121 [details]
abrt-gui screenshot

Comment 4 Jiri Moskovcak 2013-01-22 13:23:05 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > It has to be that "not-intelligent" because in some backtraces there can be
> > variables like "mylogin", "mypassword", "tmppassword", .etc... So ignoring
> > it just because it's part of the name of some library would be dangerous
> > because the UI logic doesn't know what is the content it's highlighting...
> 
> please read carefully
> 
> I'm NOT proposing to ignore the string "login"

- ok, so you're just proposing to not highlight it, but I'm saying it's not possible to determine if the word "login" is mentioned in some security sensitive context or not without teaching ABRT to understand the context which is not worth it

> 
> > As for the additional info part: There are arrows to jump thru all the
> > highlighted words, so highlighting the whole line is not necessary.
> 
> pardon my blindness, but I really do not see such arrows - could you
> highlight them on the attached screenshot please?

- of course, please see attached picture

Comment 5 Jiri Moskovcak 2013-01-22 13:23:52 UTC
Created attachment 685135 [details]
abrt gui with highlighted search arrorws

Comment 6 Michal Toman 2013-08-09 12:30:51 UTC
There is no generic way to say whether string is OK or is not. It is better to show all suspicious strings than skip one important.

Comment 7 Karel Volný 2013-08-19 15:50:09 UTC
(In reply to Jiri Moskovcak from comment #4)
> - ok, so you're just proposing to not highlight it, but I'm saying it's not
> possible to determine if the word "login" is mentioned in some security
> sensitive context or not without teaching ABRT to understand the context
> which is not worth it

would that be possible to elaborate a bit what does that mean "not worth it"?

like, for example:

we think of two possible solutions

a) to keep false positive list
- easy to implement, estimated developer time 2 hours
- hard to maintain, estimated half a day monthly

b) create AI that would check the context of the suspicious match if that's a real filename
- harder to implement, estimated developer time one full day
- no maintenance cost
- little slowdown in processing, unimportant to the user as generating the backtraces and other tasks urge the user to go take a nap anyways

this compares with the fact that
- we are getting half a million number of abrt reports monthly and the number is expected to grow in the future
- our estimate (based on what?) is that only 0,1% suffer from such false positives, that makes 500 affected reports monthly
- examining such a false positive takes 5 minutes to inexperienced user and half a minute to experienced users, which makes 1 minute average in the mix
- that sums to one workday wasted monthly by examining false positives

as we value user time 100 times less than developer time, we may discard a)

and b) would mean that it'd take 100 months at current reporting rate = over 8 years to outweight the precious developer time, and we do not plan to maintain this code that long


... of course the numbers are completely made up

> > > As for the additional info part: There are arrows to jump thru all the
> > > highlighted words, so highlighting the whole line is not necessary.
> > 
> > pardon my blindness, but I really do not see such arrows - could you
> > highlight them on the attached screenshot please?
> 
> - of course, please see attached picture

thankyou ... not quite obvious to me, but once learned, it makes my life easier

(In reply to Michal Toman from comment #6)
> There is no generic way to say whether string is OK or is not.

I'm not talking about "generic way", I'm talking about one concrete case of false positive => you completely change point of this report, reopening

> It is better to show all suspicious strings than skip one important.

do you have any proof to this bold statement?

- usually, the reality proves otherwise ... for example, recently I've read some analysis that too much traffic signs leads to drivers overlooking and ignoring them rather than thinking about each one carefully, and I believe this principle to be generic

Comment 8 Kamil Páral 2013-08-21 08:33:03 UTC
I'm also often affected by this. libsecret or similar string is highlighted in error.

You claim that it's hard to distinguish user-related and system-related info in a plaintext file. I agree. But some of the whitelists could be really simple. 

For example, can some user data even end up in "maps" tab? From what I see, those are just memory addresses and file names. Shouldn't that be excluded from private info search completely?

As for "backtrace" tab, can't we whitelist something like this?
(/usr)?/lib(64)?/\S*libsecret-\S*\.so\.\S*
(/usr)?/lib(64)?/\S*libpasswd-\S*\.so\.\S*

Or maybe anything in /lib?
(/usr)?/lib(64)?/\S*

Or, I agree with Karel, it would be really simple to just test whether the matched text sequence is an existing file or not.

I don't think it's that hard to implement and it would avoid so many false positives. Are there some other reasons why not to do this?

Comment 9 Jakub Filak 2013-09-17 13:19:30 UTC
*** Bug 1008914 has been marked as a duplicate of this bug. ***

Comment 10 Jiri Moskovcak 2013-09-19 21:33:39 UTC
(In reply to Kamil Páral from comment #8)
> I'm also often affected by this. libsecret or similar string is highlighted
> in error.
> 
> You claim that it's hard to distinguish user-related and system-related info
> in a plaintext file. I agree. But some of the whitelists could be really
> simple. 
> 
> For example, can some user data even end up in "maps" tab? From what I see,
> those are just memory addresses and file names. Shouldn't that be excluded
> from private info search completely?
> 
> As for "backtrace" tab, can't we whitelist something like this?
> (/usr)?/lib(64)?/\S*libsecret-\S*\.so\.\S*
> (/usr)?/lib(64)?/\S*libpasswd-\S*\.so\.\S*
> 
> Or maybe anything in /lib?
> (/usr)?/lib(64)?/\S*

- yes, except the current implementation doesn't support regexps

> 
> Or, I agree with Karel, it would be really simple to just test whether the
> matched text sequence is an existing file or not.

- not without hardcoding some info (checking all the files from the list of loaded libraries might not be a good idea)

> 
> I don't think it's that hard to implement and it would avoid so many false
> positives. Are there some other reasons why not to do this?

- yes. it's actually quite complicated, because libreport doesn't know what data it processes and it simply searches all the text files

- so without adding some additional metadata the only solution is a simple whitelist which currently contains only:

login.so
libsecret.so

- if you have any other candidates, please add them into commit to rhbz#1009730

Comment 11 Nicolas Mailhot 2013-09-20 08:16:06 UTC
Also:
SHELL=/sbin/nologin/

Comment 12 Jiri Moskovcak 2013-09-27 11:26:29 UTC

*** This bug has been marked as a duplicate of bug 1009730 ***


Note You need to log in before you can comment on or make changes to this bug.