447862 – upgrades of selinux policy cause kernel oops

Summary: upgrades of selinux policy cause kernel oops

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	9
Hardware:	x86_64
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Assignee:	Eric Paris
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2008-05-22 07:06 UTC by Marcel Kyas
Modified:	2008-06-17 15:45 UTC
CC List:	5 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2008-06-17 15:45:36 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
Kernel error message observed while updating policy (5.67 KB, text/plain) 2008-05-22 07:06 UTC, Marcel Kyas
Booting F9 with kernel 2.6.24.7-92.fc8 (127.18 KB, text/plain) 2008-05-26 07:25 UTC, Marcel Kyas
Log messages of upgrade to selinux-policy-3.3.1-55.fc9 (77.33 KB, text/plain) 2008-05-31 11:53 UTC, Marcel Kyas

Description Marcel Kyas 2008-05-22 07:06:46 UTC

Description of problem:

Whenever yum upgrades selinux-policy-targeted, semodule hangs and the attached
error appears in dmesg.  After the error has occurred, file system accesses
succeed only intermittently.

Version-Release number of selected component (if applicable):

2.6.25.3-18.fc9.x86_64


How reproducible:

Always

Steps to Reproduce:
1. yum update selinux-policy-targeted
  
Actual results:

Kernel crashes as shown above.

Expected results:

The update finishes without error.

Additional info:

I have observed the same error with previous kernels but suspected that it was
caused by nvidia's kernel module.

Comment 1 Marcel Kyas 2008-05-22 07:06:46 UTC

Created attachment 306341
Kernel error message observed while updating policy

Comment 2 Stephen Smalley 2008-05-22 17:43:26 UTC

I wasn't able to reproduce the oops; I did see the invalidating context message
that appeared in the attachment (along with others) but it proceeded.  Tried on
x86_64 and x86 with the same kernel and updating from the f9-shipped policy to
the latest update.

If I read the oops correctly, we're setting up for the initial hashtab_search
call in mls_convert_context and we end up with a negative array index, i.e.
c->range.level[l].sens was zero, which is illegal and suggests something went
wrong earlier.

Comment 3 Stephen Smalley 2008-05-22 17:46:37 UTC

What does /usr/sbin/semanage user -l show?
I noticed you have 8 users in the attached output; there are only 6 in the stock
policy.

Comment 4 Marcel Kyas 2008-05-22 18:57:35 UTC

/usr/sbin/semanage user -l shows

SELinux User    SELinux Roles

guest_u         guest_r
root            system_r staff_r unconfined_r sysadm_r
staff_u         system_r staff_r sysadm_r
sysadm_u        sysadm_r
system_u        system_r
unconfined_u    system_r unconfined_r
user_u          user_r
xguest_u        xguest_r

Comment 5 Stephen Smalley 2008-05-22 19:10:56 UTC

Oh, I didn't have xguest installed, so that at least accounts for the xguest_u.
 Not sure about the guest_u difference.
The more troubling aspect is that the output above indicates that semanage
thinks you have MLS/MCS disabled.
/usr/sbin/sestatus shows what?
cat /selinux/mls shows what?

Comment 6 Marcel Kyas 2008-05-22 20:07:04 UTC

Yes.  It is troublesome.  /selinux should be a virtual filesystem similar to
proc and sysfs, shouldn't it?  At least this is the information that might be
relevant to you.

# /usr/sbin/sestatus
SELinux status:                 disabled

# cat /selinux/mls
cat: /selinux/mls: No such file or directory

# ls /selinux/
#

# cat /etc/selinux/config

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#       enforcing - SELinux security policy is enforced.
#       permissive - SELinux prints warnings instead of enforcing.
#       disabled - SELinux is fully disabled.
SELINUX=enforcing
# SELINUXTYPE= type of policy in use. Possible values are:
#       targeted - Only targeted network daemons are protected.
#       strict - Full SELinux protection.
SELINUXTYPE=targeted

# SETLOCALDEFS= Check local definition changes
SETLOCALDEFS=0 

# mount
/dev/sdb1 on / type ext3 (rw,noatime,nodiratime)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/mapper/Priss-var on /var type ext3 (rw,noatime,nodiratime)
/dev/mapper/Priss-usr on /usr type ext3 (rw,noatime,nodiratime)
/dev/mapper/Priss-tmp on /tmp type ext3 (rw,noatime,nodiratime)
/dev/mapper/Priss-home on /home type ext3 (rw,noatime,nodiratime)
/dev/sda1 on /boot type ext3 (rw,noatime,nodiratime)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
fusectl on /sys/fs/fuse/connections type fusectl (rw)

Comment 7 Stephen Smalley 2008-05-22 20:12:35 UTC

Yes, selinuxfs is a pseudo filesystem that is only present if SELinux is enabled
in the kernel.  As it seems to be disabled on your box, it seems you've disabled
it?  By specifying selinux=0 at boot I'd guess since your /etc/selinux/config
has enforcing.  Did you perhaps disable it because of the earlier kernel oopses?
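
A quick way to test the selinux=0 guess is to look at the kernel command line of the running boot. The following is only a sketch: the function names are made up for illustration, and it assumes the command line has already been read as a string (for example from /proc/cmdline). A missing selinux= parameter does not prove that SELinux is enabled; it only rules out this one cause.

```python
from typing import Optional


def selinux_boot_param(cmdline: str) -> Optional[str]:
    """Return the value of the last selinux= token on a kernel command line.

    Returns None when no such token is present. Hypothetical helper, not part
    of any SELinux tool.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("selinux="):
            # Later tokens override earlier ones in this simplified model.
            value = token[len("selinux="):]
    return value


def disabled_at_boot(cmdline: str) -> bool:
    """True only if the command line explicitly carries selinux=0."""
    return selinux_boot_param(cmdline) == "0"
```

On a live system the input would come from something like open("/proc/cmdline").read(), and the result can be compared against whether /selinux/mls exists.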

Comment 8 Marcel Kyas 2008-05-22 20:44:52 UTC

No, I did not disable it on the command line.  I have rebooted the machine 10
minutes ago (I needed to use kernel-2.6.24.7-92.fc8.x86_64 because of another
bug).  This time, it ran a relabel.

With kernel-2.6.25.3-18.fc9.x86_64 it shows

SELinux status:                 enabled
SELinuxfs mount:                /selinux
Current mode:                   enforcing
Mode from config file:          enforcing
Policy version:                 22
Policy from config file:        targeted

[root@localhost ~]# cat /selinux/mls 
1
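
To compare the state under the two kernels, the sestatus output can be turned into key/value pairs. This is a minimal sketch that assumes the "Label: value" layout shown in this report; parse_sestatus is a hypothetical helper, not an existing API.

```python
def parse_sestatus(text: str) -> dict:
    """Split sestatus-style 'Label:   value' lines into a dictionary."""
    fields = {}
    for line in text.splitlines():
        # partition() splits on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields
```

With the output above, the "SELinux status" key would map to "enabled" under the F9 kernel and to "disabled" under the F8 kernel.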

Comment 9 James Morris 2008-05-22 22:51:29 UTC

(In reply to comment #8)
> No, I did not disable it on the command line.  I have rebooted the machine 10
> minutes ago (I needed to use kernel-2.6.24.7-92.fc8.x86_64 because of another
> bug).  This time, it ran a relabel.

May I ask what the other bug relates to?

Comment 10 James Morris 2008-05-22 22:54:02 UTC

FWIW, I have xguest installed and have been doing yum updates on x86_64 with
apparently no problems.

Comment 11 Chuck Ebbert 2008-05-23 05:20:55 UTC

Looking at the stack trace, we have sidtab_map_remove_on_error() in the trace,
so maybe some kind of error happened when loading the policy.

I can confirm oops did happen here in mls_convert_context():
                levdatum = hashtab_search(newp->p_levels.table,
                        oldp->p_sens_val_to_name[c->range.level[l].sens - 1]);

c->range.level[l].sens was 0
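
To illustrate why a zero is fatal here: sensitivity values are 1-based while the name table is 0-based, so sens == 0 produces index -1. The sketch below is a simplified Python model of that index arithmetic, not the kernel code; the function name and the list-based table are assumptions made for the example.

```python
def old_sensitivity_name(sens: int, sens_val_to_name: list) -> str:
    """Model of the key passed to the lookup: table entry sens - 1.

    In C, sens == 0 would read one element before the array. Python would
    instead silently wrap index -1 to the last element, so the range is
    checked explicitly to make the illegal value visible.
    """
    if not 1 <= sens <= len(sens_val_to_name):
        raise ValueError(f"illegal sensitivity value {sens}")
    return sens_val_to_name[sens - 1]
```

With a single sensitivity named s0, a value of 1 selects "s0", while a value of 0 is rejected instead of being dereferenced.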

Comment 12 Marcel Kyas 2008-05-23 05:58:11 UTC

(In reply to comment #9)
> (In reply to comment #8)
> > No, I did not disable it on the command line.  I have rebooted the machine 10
> > minutes ago (I needed to use kernel-2.6.24.7-92.fc8.x86_64 because of another
> > bug).  This time, it ran a relabel.
> 
> May I ask what the other bug relates to?

The other bug is https://bugzilla.redhat.com/show_bug.cgi?id=447872

Comment 13 Stephen Smalley 2008-05-23 11:53:15 UTC

(In reply to comment #11)
> Looking at the stack trace, we have sidtab_map_remove_on_error() in the trace,
> so maybe some kind of error happened when loading the policy.
> 
> I can confirm oops did happen here in mls_convert_context():
>                 levdatum = hashtab_search(newp->p_levels.table,
>                         oldp->p_sens_val_to_name[c->range.level[l].sens - 1]);
> 
> c->range.level[l].sens was 0

To explain:  The new policy invalidated a previously valid security context in
the old policy (as noted in the kernel output), thereby requiring its removal
from the SID table.  That isn't an error per se, and the system is supposed to
handle it gracefully.  I do see those warnings upon the upgrade, but don't end
up with an Oops.  Has anyone else replicated the actual Oops?

With regard to the trace, yes, that was my understanding as well, but sens == 0
is an illegal value and indicates some earlier error (e.g. memory corruption).

BTW, we'll see somewhat different behavior in F10, as the handling of
invalidated contexts has changed there for the deferred mapping of contexts
support (for setting down unknown file contexts from rpm for buildsys).  No more
removal of entries from the SID table there.  But the convert code remains, and
I don't know of an actual bug in that code.

Comment 14 Stephen Smalley 2008-05-23 11:57:37 UTC

(In reply to comment #8)
> No, I did not disable it on the command line.  I have rebooted the machine 10
> minutes ago (I needed to use kernel-2.6.24.7-92.fc8.x86_64 because of another
> bug).  This time, it ran a relabel.

You still must have disabled SELinux when you booted the old kernel, as your
system had SELinux disabled (based on the output you posted) at that time, and
the relabel upon boot happens when booting a kernel with SELinux enabled after
previously running with it disabled.  If you didn't do that intentionally via
selinux=0 on the kernel command line or in your grub.conf for that kernel, I'd
like to know why it happened.  /var/log/messages output from that kernel's boot
would tell the story.

> With kernel-2.6.25.3-18.fc9.x86_64 it shows
> 
> SELinux status:                 enabled
> SELinuxfs mount:                /selinux
> Current mode:                   enforcing
> Mode from config file:          enforcing
> Policy version:                 22
> Policy from config file:        targeted
> 
> [root@localhost ~]# cat /selinux/mls 
> 1
 
Much better.  And semanage user -l shows what now?

Comment 15 Marcel Kyas 2008-05-26 07:25:39 UTC

Created attachment 306653
Booting F9 with kernel 2.6.24.7-92.fc8

I have booted with that kernel twice in a row.  I have NVidia's binary driver
installed for this kernel.  This was the reason I did not file a bug when I
first saw the Oops.

Comment 16 Marcel Kyas 2008-05-26 07:27:32 UTC

(In reply to comment #14)
> You still must have disabled SELinux when you booted the old kernel, as your
> system had SELinux disabled (based on the output you posted) at that time, and
> the relabel upon boot happens when booting a kernel with SELinux enabled after
> previously running with it disabled.  If you didn't do that intentionally via
> selinux=0 on the kernel command line or in your grub.conf for that kernel, I'd
> like to know why it happened.  /var/log/messages output from that kernel's boot
> would tell the story.

I did not do it intentionally.  See the attachment in the comment above.

> > With kernel-2.6.25.3-18.fc9.x86_64 it shows
> > 
> > SELinux status:                 enabled
> > SELinuxfs mount:                /selinux
> > Current mode:                   enforcing
> > Mode from config file:          enforcing
> > Policy version:                 22
> > Policy from config file:        targeted
> > 
> > [root@localhost ~]# cat /selinux/mls 
> > 1
>  
> Much better.  And semanage user -l shows what now?
> 
                Labeling   MLS/       MLS/
SELinux User    Prefix     MCS Level  MCS Range                      SELinux Roles

guest_u         guest      s0         s0                             guest_r
root            user       s0         SystemLow-SystemHigh           system_r staff_r unconfined_r sysadm_r
staff_u         user       s0         SystemLow-SystemHigh           system_r staff_r sysadm_r
sysadm_u        user       s0         SystemLow-SystemHigh           sysadm_r
system_u        user       s0         SystemLow-SystemHigh           system_r
unconfined_u    unconfined s0         SystemLow-SystemHigh           system_r unconfined_r
user_u          user       s0         s0                             user_r
xguest_u        xguest     s0         s0                             xguest_r

Comment 17 Stephen Smalley 2008-05-27 15:05:58 UTC

Ok, that makes sense - booting your old F8 kernel+initrd won't ever load policy
because initial policy load moved from /sbin/init to the initrd in F9.
Still don't know why you are hitting that Oops though; it means we have an
invalid context structure in the SID table already when we reach the policy reload.
mls_level_isvalid() would have rejected such a sensitivity of zero.
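
As a rough model of that validity check (a simplified sketch, not the kernel implementation; the data layout and names are assumptions for illustration): a level is acceptable only if its sensitivity lies in 1..N for a policy with N sensitivities and its categories are all permitted for that sensitivity.

```python
def level_is_valid(sens: int, cats: set, allowed_cats: list) -> bool:
    """Simplified model of an MLS level check.

    allowed_cats[i] is the set of categories permitted for sensitivity i + 1,
    so valid sensitivity values run from 1 to len(allowed_cats).
    """
    if sens < 1 or sens > len(allowed_cats):
        # A sensitivity of zero (or one past the end) is rejected outright.
        return False
    return set(cats) <= set(allowed_cats[sens - 1])
```

Under this model a context with sensitivity 0 never passes validation, which is why finding one in the SID table points to corruption after it was inserted rather than to a bad context being accepted.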

Comment 18 Marcel Kyas 2008-05-31 11:47:54 UTC

(In reply to comment #17)
> Ok, that makes sense - booting your old F8 kernel+initrd won't ever load policy
> because initial policy load moved from /sbin/init to the initrd in F9.
> Still don't know why you are hitting that Oops though; it means we have an
> invalid context structure in the SID table already when we reach the policy
> reload.
> mls_level_isvalid() would have rejected such a sensitivity of zero.

What do I have to look for?

Maybe the following helps in narrowing the problem down: Yesterday I booted
the kernel with noirqdebug (as advised in bug #447872).  PackageKit managed to
update selinux-policy-targeted to version 3.3.1-42.fc9.

This time I did not get an Oops, but instead I am seeing lots of

May 30 23:01:58 localhost kernel: inode_doinit_with_dentry:  context_to_sid(unconfined_u:object_r:unconfined_mozilla_home_t:s0) returned 22 for dev=dm-3 ino=7438345

messages in my log file.
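
To see which contexts and inodes are affected, messages of this kind can be picked apart with a regular expression. This is a sketch only: it assumes each message sits on a single line, as it would in /var/log/messages, and that the format matches the one quoted here; the pattern and function name are made up for the example.

```python
import re

# Matches the kernel message format quoted in this report (an assumption;
# other kernel versions may word the message differently).
INVALID_CONTEXT = re.compile(
    r"inode_doinit_with_dentry:\s+context_to_sid\((?P<context>[^)]*)\)\s+"
    r"returned\s+(?P<rc>-?\d+)\s+for\s+dev=(?P<dev>\S+)\s+ino=(?P<ino>\d+)"
)


def parse_invalid_context(line: str):
    """Return context, return code, device and inode from one log line, or None."""
    match = INVALID_CONTEXT.search(line)
    if match is None:
        return None
    return {
        "context": match.group("context"),
        "rc": int(match.group("rc")),
        "dev": match.group("dev"),
        "ino": int(match.group("ino")),
    }
```

Collecting the distinct "context" values from a log shows which labels the new policy no longer accepts.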

Comment 19 Marcel Kyas 2008-05-31 11:53:52 UTC

Created attachment 307272
Log messages of upgrade to selinux-policy-3.3.1-55.fc9

The attachment shows yesterday's messages.  This time, no oops was observed
while updating the policy.  Instead, many messages concerning
inode_doinit_with_dentry occurred.

Comment 20 Stephen Smalley 2008-06-03 13:50:59 UTC

This means that a security context that is on the disk is no longer valid under
policy.  To fix, run restorecon -RF $HOME

Comment 21 Eric Paris 2008-06-17 15:45:36 UTC

I guess at this point I'm going to call the oops random memory corruption and
close this as 'WORKSFORME.' I think all of the other issues were
explained/corrected by sds.  If anyone can come up with a way to do this again
feel free to reopen...
