Bug 469079 - IBM Power5 systems require selinux=0 to boot after install
Product: Fedora
Classification: Fedora
Component: kernel
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Eric Paris
QA Contact: Fedora Extras Quality Assurance
Depends On:
Blocks: F10Blocker/F10FinalBlocker
Reported: 2008-10-29 15:20 EDT by James Laska
Modified: 2013-09-02 02:28 EDT (History)
Doc Type: Bug Fix
Last Closed: 2008-11-06 11:58:01 EST

Attachments
objdump -d of selinux-bprm-post-apply-creds (18.42 KB, text/plain)
2008-10-30 14:11 EDT, Eric Paris

Description James Laska 2008-10-29 15:20:18 EDT
Description of problem:

Unable to boot F10 on an IBM JS21 system.  The system stops during boot and displays no activity.  There is no VGA on this system, and sysrq-t output can be observed at: http://fpaste.org/paste/8257

Version-Release number of selected component (if applicable):

How reproducible:
Every time on ibm-js21-03.test.redhat.com

Steps to Reproduce:
1. Install F10 (no encrypted devices)
2. Prior to reboot, remove rhgb and quiet from yaboot.conf
3. Reboot into installed system
Actual results:

System starts to boot, but stops after probing disks

Expected results:

Boots into expected runlevel
Comment 1 James Laska 2008-10-29 15:57:27 EDT
Recreated on ibm-505-lp1, a virtualized Power5 guest.  This output includes booting with "plymouth:debug" enabled.

Comment 2 James Laska 2008-10-29 19:29:04 EDT
At the suggestion of Ray Strode, I booted with "plymouth:nolog plymouth:debug". This now shows the system in a kernel panic (full boot log available at http://fpaste.org/paste/8280):

Unable to handle kernel paging request for data at address 0xfffb70b7
Faulting instruction address: 0xc0000000001fee54
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=128 NUMA pSeries
Modules linked in: ipr
NIP: c0000000001fee54 LR: c0000000001fee30 CTR: c0000000001fed7c
REGS: c0000000f4053600 TRAP: 0300   Not tainted  (
MSR: 8000000000009032 <EE,ME,IR,DR>  CR: 24002488  XER: 20000001
DAR: 00000000fffb70b7, DSISR: 0000000040000000
TASK = c0000000f404c000[1] 'init' THREAD: c0000000f4050000 CPU: 0
GPR00: c0000000001fee30 c0000000f4053880 c0000000008f6b00 c0000000f404c000 
GPR04: 0000000000000058 0000000000000006 0000000000000000 c0000000f40538c0 
GPR08: 0000000000000000 00000000fffb70a7 0000000000d43000 0000000000000000 
GPR12: 0000000024002482 c00000000092f400 c0000000ed427b90 0000000000000001 
GPR16: c0000000edc383b4 c0000000ed467b00 c0000000ed467c00 00000000f7fd6cb4 
GPR20: 00000000f7fbe000 c0000000ed467800 0000000000000000 c0000000f4054000 
GPR24: c0000000f4803160 00000000f7fbe000 0000000000000000 00000000f7ffe648 
GPR28: 000000000003f4b0 0000000000000001 c000000000897918 c0000000ee760800 
NIP [c0000000001fee54] .selinux_bprm_post_apply_creds+0xd8/0x554
LR [c0000000001fee30] .selinux_bprm_post_apply_creds+0xb4/0x554
Call Trace:
[c0000000f4053880] [c0000000001fee30] .selinux_bprm_post_apply_creds+0xb4/0x554 (unreliable)
[c0000000f40539d0] [c0000000001f0948] .security_bprm_post_apply_creds+0x38/0x50
[c0000000f4053a50] [c000000000142e54] .compute_creds+0xf8/0x114
[c0000000f4053ae0] [c00000000018f74c] .load_elf_binary+0xf10/0x1690
[c0000000f4053c20] [c000000000142b28] .search_binary_handler+0x124/0x358
[c0000000f4053ce0] [c000000000181a0c] .compat_do_execve+0x180/0x24c
[c0000000f4053d90] [c000000000015668] .compat_sys_execve+0x74/0xb0
[c0000000f4053e30] [c000000000008770] syscall_exit+0x0/0x40
Instruction dump:
4182006c e87e8278 4836257d 60000000 e93f01e0 2fa90000 419e0028 e86d01b0 
e9290018 38a00006 38c00000 3ba00001 <e8890010> 4bffa689 2fa30000 409e0008 
---[ end trace f0a5452ca0e0233e ]---
Comment 3 James Laska 2008-10-29 19:38:47 EDT
As the previous kernel oops suggests, there must be something going wrong in SELinux land: booting with "selinux=0" resolves the issue.
Comment 4 Stephen Smalley 2008-10-30 12:56:49 EDT
Can you disassemble the instruction dump?
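For reference, the faulting word marked <e8890010> in the instruction dump can also be decoded by hand. The sketch below is a minimal userspace decoder for PowerPC DS-form loads (primary opcode 58) only; the struct and function names are made up for illustration and are not part of any kernel or binutils API. Under those assumptions the word decodes to ld r4,16(r9), and with GPR09 = 0x00000000fffb70a7 from the oops the effective address comes out as 0x00000000fffb70b7, which matches the DAR.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Fields of a PowerPC DS-form instruction (primary opcode 58: ld/ldu/lwa).
 * Hypothetical helper type, for illustration only. */
struct ds_insn {
    unsigned opcode; /* primary opcode, top 6 bits */
    unsigned rt;     /* target register number */
    unsigned ra;     /* base register number */
    int32_t disp;    /* sign-extended byte displacement (DS field << 2) */
    unsigned xo;     /* low 2 bits: 0 = ld, 1 = ldu, 2 = lwa */
};

/* Split a 32-bit instruction word into its DS-form fields. */
struct ds_insn decode_ds(uint32_t word)
{
    struct ds_insn d;
    int32_t disp = (int32_t)(word & 0xfffcu);

    if (disp & 0x8000)
        disp -= 0x10000; /* sign-extend the 16-bit displacement */

    d.opcode = word >> 26;
    d.rt = (word >> 21) & 0x1f;
    d.ra = (word >> 16) & 0x1f;
    d.disp = disp;
    d.xo = word & 0x3;
    return d;
}

/* Effective address of the load, given the value held in the base register.
 * (The RA == 0 special case of the ISA is not handled in this sketch.) */
uint64_t effective_address(const struct ds_insn *d, uint64_t ra_value)
{
    return ra_value + (uint64_t)(int64_t)d->disp;
}

/* Decode the faulting word from the oops and print the address it touches. */
void print_faulting_load(void)
{
    const struct ds_insn d = decode_ds(0xe8890010u);
    const uint64_t gpr09 = 0x00000000fffb70a7ULL; /* GPR09 from the oops */

    assert(d.opcode == 58 && d.xo == 0); /* plain ld */
    printf("ld r%u,%" PRId32 "(r%u) -> EA 0x%016" PRIx64 "\n",
           d.rt, d.disp, d.ra, effective_address(&d, gpr09));
}
```

Calling print_faulting_load() from a small main() prints the decoded instruction and the effective address, which can then be compared with the DAR line of the oops.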
Comment 5 Eric Paris 2008-10-30 14:03:30 EDT
Quick first pokes:

[root@ibm-505-lp1]# addr2line --exe=vmlinux --inline 0xc0000000001fee54

inside flush_unauthorized_files()

2125  if (tty) {
2126      file_list_lock();
2127      file = list_entry(tty->tty_files.next, typeof(*file), f_u.fu_list);
2128      if (file) {
2129              /* Revalidate access to controlling tty.
2130                 Use inode_has_perm on the tty inode directly rather
2131                 than using file_has_perm, as this particular open
2132                 file may belong to another process and we are only
2133                 interested in the inode-based check here. */
2134              struct inode *inode = file->f_path.dentry->d_inode;
2135              if (inode_has_perm(current, inode,
2136                                 FILE__READ | FILE__WRITE, NULL)) {
2137                      drop_tty = 1;
2138              }
2139      }
2140      file_list_unlock();
2141  }
Comment 6 Eric Paris 2008-10-30 14:11:18 EDT
Created attachment 321970 [details]
objdump -d of selinux-bprm-post-apply-creds
Comment 7 Eric Paris 2008-10-30 14:58:12 EDT
Obviously I really need to do some more looking, but can tty->tty_files be empty?  Can list_entry really return a NULL value?  I thought list_entry basically just pointed backwards in memory from .next by some offset...

It seems that what we really meant was

if (!list_empty(&tty->tty_files))
   file = list_first_entry(&tty->tty_files, struct file, f_u.fu_list);

Have we just always had a non-empty tty_files list, and on this platform we have an empty one?  Or has 'that place in tty where file points' after the whole list_entry computation just been 0 on every platform that has mattered so far?

I could be way off, but first look, this doesn't seem right...
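To illustrate the point about list_entry, here is a small userspace sketch. It re-implements just enough of the kernel's circular list (list_head, list_empty, list_entry) to show the behaviour; struct fake_tty, struct fake_file and the two first_file_* helpers are hypothetical stand-ins and not the real kernel structures. On an empty list, head.next points back at the head itself, so list_entry returns a non-NULL pointer into the containing object and a NULL check on the result never fires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Minimal userspace copy of the kernel's circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

int list_empty(const struct list_head *head)
{
    return head->next == head;
}

/* Insert entry right after head. */
void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Hypothetical stand-ins for struct file and struct tty_struct. */
struct fake_file {
    int id;
    struct list_head fu_list;
};

struct fake_tty {
    int index;
    struct list_head tty_files;
};

/* The pattern from the report: no emptiness check before list_entry.
 * On an empty list this returns a pointer computed from the head itself,
 * which is never NULL and does not point at a real fake_file. */
struct fake_file *first_file_unchecked(struct fake_tty *tty)
{
    return list_entry(tty->tty_files.next, struct fake_file, fu_list);
}

/* The pattern proposed above: test list_empty() first. */
struct fake_file *first_file_checked(struct fake_tty *tty)
{
    if (list_empty(&tty->tty_files))
        return NULL;
    return list_entry(tty->tty_files.next, struct fake_file, fu_list);
}

/* Print what each helper returns for a tty with no open files. */
void show_empty_list_behaviour(void)
{
    struct fake_tty tty = { 0 };

    init_list_head(&tty.tty_files);
    printf("unchecked: %p, checked: %p\n",
           (void *)first_file_unchecked(&tty),
           (void *)first_file_checked(&tty));
}
```

With an empty tty_files list, first_file_unchecked() hands back a bogus non-NULL pointer, while first_file_checked() returns NULL and lets the caller skip the revalidation.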
Comment 8 Stephen Smalley 2008-10-30 16:48:59 EDT
Looks like you're right.
It has been that way since the tty revalidation was merged in 2004.
Comment 9 Eric Paris 2008-10-30 17:24:24 EDT
I managed to boot a kernel with the patch I described above, but I didn't see the printk I expected in the list_empty() case.  I'll keep looking to make sure this was the real problem....
Comment 10 Eric Paris 2008-10-30 18:05:24 EDT

type=1404 audit(1225402059.983:2): enforcing=1 old_enforcing=0 auid=4294967295 ses=4294967295
type=1403 audit(1225402060.230:3): policy loaded auid=4294967295 ses=4294967295
[main.c]                                    on_newroot:new root mounted at "/sysroot", switching to it
[./plugin.c]                                on_boot_output:writing 'Switching to new root and running init.
' to all windows (41 bytes)
inside flush_unauthorized_files with tty->tty_files empty
		Welcome to Fedora 
		Press 'I' to enter interactive startup.

No idea why (maybe we should figure that out?), but tty->tty_files is empty and I was able to boot without a problem....
Comment 11 Stephen Smalley 2008-10-31 09:49:32 EDT
It seems like a legal case to me for tty_files to be empty, and a (longstanding) bug in SELinux that we didn't handle it correctly in the first place.
Can you trigger it by closing all references to a given tty and then exec'ing a domain-changing program?  Although I suppose the caller might hold a reference and thus it is difficult to force it to occur with an actual revoke-style operation.
Comment 12 Eric Paris 2008-11-03 09:53:17 EST
Checked a fix in to the devel branch.  Upstream: 37dd0bd04a3240d2922786d501e2f12cec858fbf
Comment 13 Tom "spot" Callaway 2008-11-06 11:58:01 EST
Fixed in
