Bug 219393 - LSPP: 'netlabelctl cipsov4 add' throws Kernel OOPs
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: s390x Linux
Priority: medium
Severity: high
Assigned To: Eric Paris
QA Contact: Brian Brock
Reported: 2006-12-12 16:54 EST by George C. Wilson
Modified: 2007-11-30 17:07 EST (History)
Fixed In Version: RC
Doc Type: Bug Fix
Last Closed: 2007-02-07 20:42:15 EST

Attachments
Kernel patch for 2.6.18-1.2840.2.1.el5.lspp.57 (3.69 KB, patch)
2006-12-13 13:47 EST, Paul Moore
Kernel patch #2 for 2.6.18-1.2840.2.1.el5.lspp.57 (2.02 KB, patch)
2006-12-14 16:02 EST, Paul Moore

Description George C. Wilson 2006-12-12 16:54:29 EST
Description of problem:

This bug corresponds to IBM LTC Bugzilla Bug 30037 - RIT108985 - 'netlabelctl
cipsov4 add' throws Kernel OOPs

The "netlabelctl cipsov4 add" command throws a kernel oops.
 
Contact Information = nasastry@in.ibm.com

Version-Release number of selected component (if applicable):

uname -a:
Linux india5.pdl.pok.ibm.com 2.6.18-1.2747.el5 #1 SMP Thu Nov 9 18:57:27 EST
2006 s390x s390x s390x GNU/Linux

Userspace rpm: netlabel_tools-0.17-5.el5

How reproducible:



Steps to Reproduce:
1. Install package netlabel_tools-0.17-5.el5.s390x.rpm
2. Issue the following command:
# netlabelctl cipsov4 add std doi:8 \
    tags:00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001 \
    levels:0=0000000000000000000000000000000000000000000000000000000000,1=100000000000000000000000000000000000000000000000000000000000000000 \
    categories:0=1,1=0; echo $?
3. Check the z/VM console for OOPs message.
 
---Kernel - Network Drivers Component Data--- 
Stack trace output:
  no
 
Oops output:
  Unable to handle kernel pointer dereference at virtual kernel address
0000000478dc0000                          
Oops: 003b [#1]                                                                 
CPU:    0    Not tainted                                                        
Process netlabelctl (pid: 1533, task: 000000007e3cd868, ksp: 0000000076ccfc40)  
Krnl PSW : 0704200180000000 00000000002256dc (netlbl_cipsov4_add+0x38c/0x7f0)   
Krnl GPRS: 0000000000000008 00000003fffffffc 0000000078dc0e60 000000007e128454  
           00000000ffffffff 0000000000000008 0000000000000001 0000000000000000  
           0000000076ccf940 0000000079c53700 0000000000000014 000000007e12844c  
           000000007e128448 0000000000250d60 000000000022568c 0000000076ccf830  
Krnl Code: 50 61 20 00 e3 10 c0 00 00 91 a7 1a 00 03 a5 17 ff fc b9 18          
Call Trace:                                                                     
([<000000000022564a>] netlbl_cipsov4_add+0x2fa/0x7f0)                           
 [<00000000001c7210>] genl_rcv_msg+0x1ec/0x22c                                  
 [<00000000001c4314>] netlink_run_queue+0x8c/0x190                              
 [<00000000001c6b5a>] genl_rcv+0x4e/0x88                                        
 [<00000000001c499c>] netlink_data_ready+0x34/0x9c                              
 [<00000000001c30a4>] netlink_sendskb+0x40/0x88                                 
 [<00000000001c4952>] netlink_sendmsg+0x382/0x398                               
 [<000000000019a1ba>] sock_sendmsg+0xf6/0x120                                   
 [<000000000019aed6>] sys_sendmsg+0x1f6/0x26c                                   
 [<000000000019cbf4>] sys_socketcall+0x24c/0x274                                
 [<000000000001f150>] sysc_tracego+0xe/0x14                                  
 [<000000410a3d8fb6>] 0x410a3d8fb6                                           
 <0>Kernel panic - not syncing: Fatal exception: panic_on_oops               
HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 00015E24
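
One way to read the dump above, offered as an interpretation and not something stated in the report: the first four bytes of `Krnl Code` (`50 61 20 00`) decode as the s390 RX-format store `st %r6,0(%r1,%r2)`, whose operand address is the sum of the index register, the base register, and the displacement. The sketch below redoes that arithmetic with the `Krnl GPRS` values. The function names are made up for this illustration.

```c
#include <stdint.h>

/* Operand address of an s390 RX-format instruction:
 * base register + index register + displacement. */
uint64_t rx_operand_address(uint64_t base, uint64_t index, uint32_t displacement)
{
    return base + index + displacement;
}

/* Round an address down to the start of its 4 KiB page. */
uint64_t page_base(uint64_t address)
{
    return address & ~UINT64_C(0xfff);
}

/* Byte offset of element `index` in an array of 4-byte entries. */
uint64_t u32_element_offset(uint32_t index)
{
    return (uint64_t)index * 4;
}
```

With r2 = 0x78dc0e60 as the base and r1 = 0x3fffffffc as the index, `rx_operand_address()` gives 0x478dc0e5c, and `page_base()` of that is 0x478dc0000, the address the oops reports. r1 also equals `u32_element_offset(0xffffffff)`, and r4 holds 0xffffffff, which would be consistent with a 32-bit value of 0xffffffff being used to index an array of 4-byte entries. That last step is a guess from the register contents only.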
  
Actual results:
Kernel oops.

Expected results:
No kernel oops.

Additional info:
Comment 1 George C. Wilson 2006-12-12 17:00:53 EST
This may need to be assigned to the kernel component, but I thought I'd start
with netlabel_tools.
Comment 4 Paul Moore 2006-12-12 17:19:19 EST
I'm not able to reproduce this using kernel 2.6.18-1.2840.2.1.el5.lspp.57 on
x86; I do not have access to a s390 platform.  Could you try a newer kernel and
see if it fixes the problem?  If it continues to fail please let me know.  Also,
it would be a great help if you could decode the offset into
netlbl_cipsov4_add() and provide a line number; s390 assembly code is Greek to
me, and I don't speak Greek.

Thanks.
Comment 5 George C. Wilson 2006-12-12 17:49:05 EST
Until the IT gets linked, this is from IBM LTC bug 30037:

------ Additional Comment #28 From George C. Wilson  2006-12-12 17:45 EDT

Nageswara, can you try a later kernel? The LSPP pre-RHEL5 kernel patches are
contained in the lspp.57 kernel Paul refers to. It is available here:
http://people.redhat.com/sgrubb/files/lspp/kernel-2.6.18-1.2840.2.1.el5.lspp.57.s390x.rpm
It would be a good kernel on which to attempt recreating this bug. If it cannot
be recreated, it should be fixed in a later RHEL5 kernel.
Comment 6 Nageswara R Sastry 2006-12-13 00:21:28 EST
I am able to reproduce the kernel oops on the new kernel
(2.6.18-1.2840.2.1.el5.lspp.57) too.

The following simplified command is enough to reproduce the kernel oops:

# netlabelctl cipsov4 add std doi:8 tags:1 levels:0=0,1=10000000000000000000
categories:0=1,1=0

Thanks!!
Comment 7 Paul Moore 2006-12-13 10:21:54 EST
Using kernel 2.6.18-1.2840.2.1.el5.lspp.57 I'm able to lock-up an ia64 box using
the command above.  However, it's a pretty hard lockup (I don't get an
oops/panic string) so if someone with a s390 box could decode the line number
in the panic string using gdb and let me know I would appreciate it.

I am going to continue debugging the problem on the ia64 platform and will post
updates when I know more.
Comment 8 George C. Wilson 2006-12-13 10:35:47 EST
Some comments haven't gotten mirrored yet. I asked the 390 folks for a decode.
Lou reports that on ppc64 the command did not work but returned rc 0 with no oops.
Comment 9 Paul Moore 2006-12-13 10:44:34 EST
Sorry, didn't realize there was such a delay with the mirror setup.

I have a hunch about what the problem might be, and the ppc64 behavior seems
consistent with my hunch.  I'm compiling a test kernel right now; if it fixes
the problem I'll post a quick patch here for the s390 folks to test.
Comment 10 Stephanie Glass 2006-12-13 11:08:27 EST
The mirroring problem is on the Red Hat side.  Comments made it into the IT
but didn't go from the IT to here.  Can someone check on why?
Comment 11 Paul Moore 2006-12-13 11:30:43 EST
My hunch didn't quite pan out as I would have liked; I'm going to have to dig
out the printk()'s on this one.  I'll update once I have a fix.
Comment 17 Paul Moore 2006-12-13 13:47:54 EST
Created attachment 143542
Kernel patch for 2.6.18-1.2840.2.1.el5.lspp.57

I have a patch which seems to fix the problem on ia64 (see attachment).  Could
someone with access to a s390 please test it to see if it corrects the
problem?

Once I have verification that this solves the problem I'll send this out for
inclusion in the upstream kernels.
Comment 18 IBM Bug Proxy 2006-12-13 14:43:46 EST
manually copying data over from IBM bugzilla so Paul can see it

------- Additional Comment #32 From Nageswara R. Sastry 2006-12-13 00:19 EDT -------    Internal Only

(In reply to comment #28)
> Nageswara, can you try a later kernel
> http://people.redhat.com/sgrubb/files/lspp/kernel-2.6.18-1.2840.2.1.el5.lspp.57.s390x.rpm

I am able to reproduce the kernel oops on the new kernel
(2.6.18-1.2840.2.1.el5.lspp.57) too.

Thanks!!
Comment 19 IBM Bug Proxy 2006-12-13 15:09:53 EST
adding Brock Organ to cc.  Brock can help with zseries problems on Redhat side
Comment 20 Issue Tracker 2006-12-14 10:16:57 EST
----- Additional Comments From skodati@in.ibm.com  2006-12-14 01:44 EDT -------
I have disassembled the code; it appears that the oops is generated in
netlbl_cipsov4_add_std(), at
linux-2.6.18.s390x/net/netlabel/netlabel_cipso_v4.c:233:

    226         nla_for_each_nested(nla_a,
    227                             info->attrs[NLBL_CIPSOV4_A_MLSLVLLST],
    228                             nla_a_rem)
    229                 if (nla_a->nla_type == NLBL_CIPSOV4_A_MLSLVL) {
    230                         struct nlattr *lvl_loc;
    231                         struct nlattr *lvl_rem;
    232
    233                         if (nla_validate_nested(nla_a,
    234                                               NLBL_CIPSOV4_A_MAX,
    235                                               netlbl_cipsov4_genl_policy) != 0)
    236                                 goto add_std_failure;


Part of the assembly code is:

     620:       e3 10 90 08 00 04       lg      %r1,8(%r9)
     626:       d5 07 d0 08 10 00       clc     8(8,%r13),0(%r1)
     62c:       a7 84 01 fc             je      a24 <netlbl_cipsov4_add+0x6b0>
     630:       e3 10 80 20 00 04       lg      %r1,32(%r8)
     636:       e3 10 10 40 00 04       lg      %r1,64(%r1)
     63c:       e3 a0 10 00 00 91       llgh    %r10,0(%r1)
     642:       41 c0 10 04             la      %r12,4(%r1)
     646:       a7 aa ff fc             ahi     %r10,-4
     64a:       a7 f4 00 67             j       718 <netlbl_cipsov4_add+0x3a4>
     64e:       a7 3a ff fc             ahi     %r3,-4
     652:       a7 49 00 0c             lghi    %r4,12
     656:       41 b0 c0 04             la      %r11,4(%r12)
     65a:       c0 50 00 00 00 00       larl    %r5,65a <netlbl_cipsov4_add+0x2e6>
     660:       b9 14 00 33             lgfr    %r3,%r3
     664:       b9 04 00 2b             lgr     %r2,%r11
     668:       c0 e5 00 00 00 00       brasl   %r14,668 <netlbl_cipsov4_add+0x2f4>
     66e:       12 22                   ltr     %r2,%r2  <-- oops here
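
The `<netlbl_cipsov4_add+0x...>` annotations in the listing above allow a cross-check between the object-file offsets in the left column and the function-relative offsets printed in the oops. The sketch below derives the function's starting offset from one annotated pair and converts a function-relative offset back to an object-file offset. It assumes the disassembly comes from the same build as the oops, and the helper names are illustrative only.

```c
#include <stdint.h>

/* Offset of the function's first instruction within the object file,
 * derived from one (object offset, function-relative offset) pair. */
uint64_t function_start(uint64_t object_offset, uint64_t function_offset)
{
    return object_offset - function_offset;
}

/* Convert a function-relative offset into an object-file offset. */
uint64_t to_object_offset(uint64_t start, uint64_t function_offset)
{
    return start + function_offset;
}
```

Taking the branch target `718 <netlbl_cipsov4_add+0x3a4>`, `function_start(0x718, 0x3a4)` is 0x374, and the other annotated pairs give the same value. The call-trace entry `netlbl_cipsov4_add+0x2fa` then maps to 0x66e, the marked instruction. By the same arithmetic the PSW offset `+0x38c` from the original oops maps to 0x700, which lies past the end of this excerpt; the thread does not settle whether that affects the source-line attribution.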


This event sent from IssueTracker by araghavan 
 issue 108985
Comment 21 Issue Tracker 2006-12-14 10:17:12 EST
----- Additional Comments From nasastry@in.ibm.com  2006-12-14 06:03 EDT -------
Compiling the kernel with the patch suggested by Paul from HP; will update
with the result tomorrow.

Thanks!! 


Comment 22 Eric Paris 2006-12-14 10:28:12 EST
I just got a zVM to work on here inside Red Hat.  Will be testing the patch in
the next 30 minutes.
Comment 23 Eric Paris 2006-12-14 11:01:37 EST
[root@jake s390x]# uname -a
Linux jake.z900.redhat.com 2.6.18-1.2876.2.1.el5.lspp.58 #1 SMP Wed Dec 13
15:11:50 EST 2006 s390x s390x s390x GNU/Linux
[root@jake s390x]# netlabelctl cipsov4 add std doi:8 tags:1
levels:0=0,1=10000000000000000000 categories:0=1,1=0
netlabelctl: error, invalid argument or parameter


Is this what you were hoping for, Paul?
Comment 24 Paul Moore 2006-12-14 11:09:40 EST
That looks good to me; however, just to clarify - there are no
oopses/panics/lockups/heebie-jeebies on the system?
Comment 25 Eric Paris 2006-12-14 11:45:17 EST
While admitting this is the first time I've ever even logged into an s390, I'm
not seeing any 'heebie-jeebies.'  dmesg looks clean, the system keeps running,
and it fails the same way over and over with both examples:

[root@jake s390x]# netlabelctl cipsov4 add std doi:8
tags:00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001
levels:0=0000000000000000000000000000000000000000000000000000000000,1=100000000000000000000000000000000000000000000000000000000000000000
categories:0=1,1=0; echo $?
netlabelctl: error, invalid argument or parameter
1
[root@jake s390x]# netlabelctl cipsov4 add std doi:8 tags:1
levels:0=0,1=10000000000000000000 categories:0=1,1=0; echo $?
netlabelctl: error, invalid argument or parameter
1


I guess we should wait for IBM's testing, but I'll go ahead and move toward
pushing this new kernel with this patch out to the LSPP people (but it will
still be a bit since there is a new audit issue I want to get into the next LSPP
kernel).
Comment 26 Paul Moore 2006-12-14 12:25:36 EST
That sounds good to me; as soon as we hear from the IBM folk that it is okay
I'll push this fix to netdev.
Comment 27 Paul Moore 2006-12-14 16:02:40 EST
Created attachment 143701
Kernel patch #2 for 2.6.18-1.2840.2.1.el5.lspp.57

During testing of the original patch I found an additional problem where the
level and category mappings are not being initialized correctly in the kernel. 
This patch fixes that problem and should be applied along with the first patch.
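
Neither patch is reproduced in this thread, so the following is only a generic userspace sketch of two defensive habits that fit the discussion: filling a mapping table with a known "unused" value before use, and rejecting an out-of-range index with an error instead of storing through it. That the patches work this way is an assumption. The names `map_init`, `map_set`, and `MAP_INVALID`, and the sentinel value, are hypothetical and are not taken from the kernel source.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for a slot that has no mapping. */
#define MAP_INVALID UINT32_C(0x80000000)

/* Mark every slot as unused so later lookups never see stale memory. */
void map_init(uint32_t *map, size_t size)
{
    for (size_t i = 0; i < size; i++)
        map[i] = MAP_INVALID;
}

/* Store a mapping only after checking that the index fits the table.
 * Returns 0 on success or -EINVAL for an out-of-range index. */
int map_set(uint32_t *map, size_t size, uint32_t index, uint32_t value)
{
    if (index >= size)
        return -EINVAL;
    map[index] = value;
    return 0;
}
```

In this sketch, a call such as `map_set(map, 4, 0xffffffff, 1)` returns `-EINVAL` and leaves the table untouched, which is the kind of outcome a caller can report as an invalid argument.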
Comment 28 Issue Tracker 2006-12-15 09:12:38 EST
----- Additional Comments From nasastry@in.ibm.com  2006-12-15 00:06 EDT -------
Test results for the patch (Red Hat Bugzilla bug #219393, attachment
id=143542):

[root@india3 ~]# netlabelctl cipsov4 add std doi:8 tags:1
levels:0=0,1=10000000000000000000 categories:0=1,1=0
netlabelctl: error, invalid argument or parameter
[root@india3 ~]# echo $?
1
[root@india3 ~]# netlabelctl cipsov4 add std doi:8
tags:00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001
levels:0=0000000000000000000000000000000000000000000000000000000000,1=100000000000000000000000000000000000000000000000000000000000000000
categories:0=1,1=0
netlabelctl: error, invalid argument or parameter
[root@india3 ~]# echo $?
1

It's working fine.

Thanks!! 


Comment 29 Issue Tracker 2006-12-15 09:12:53 EST
----- Additional Comments From nasastry@in.ibm.com  2006-12-15 04:58 EDT -------
Test results for both patches (Red Hat Bugzilla bug #219393, attachments
id=143542 and id=143701):

[root@india3 ~]# netlabelctl cipsov4 add std doi:8 tags:1
levels:0=0,1=10000000000000000000 categories:0=1,1=0
netlabelctl: error, invalid argument or parameter
[root@india3 ~]# echo $?
1
[root@india3 ~]# netlabelctl cipsov4 add std doi:8
tags:00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001
levels:0=0000000000000000000000000000000000000000000000000000000000,1=100000000000000000000000000000000000000000000000000000000000000000
categories:0=1,1=0
netlabelctl: error, invalid argument or parameter
[root@india3 ~]# echo $?
1

It's working fine.

Thanks!! 


Comment 30 Paul Moore 2006-12-15 11:14:22 EST
Can anyone at IBM confirm that the first patch (see comment #17) fixes the problem?
Comment 31 Eric Paris 2006-12-15 11:16:04 EST
Making 2 IBM comments public.  Looks like we are good here!
Comment 32 Paul Moore 2006-12-15 11:23:51 EST
Okay, thanks.  I'll push these patches upstream as soon as I can.
Comment 33 Paul Moore 2006-12-15 16:54:25 EST
The two patches attached to this BZ entry were just pushed upstream to the
SELinux and netdev mailing lists.
Comment 34 Jay Turner 2007-01-02 10:08:12 EST
QE ack for RHEL5.  Related to the LSPP release criteria.
Comment 35 Don Zickus 2007-01-03 18:26:38 EST
in 2.6.18-1.2961.el5
Comment 38 RHEL Product and Program Management 2007-02-07 20:42:15 EST
A package has been built which should help the problem described in 
this bug report. This report is therefore being closed with a resolution 
of CURRENTRELEASE. You may reopen this bug report if the solution does 
not work for you.
