Bug 864350

Summary: Fedora18:Alpha:runltp execution triggered SyS_get_robust_lis is unreliable & resulted call trace
Product: [Fedora] Fedora Reporter: IBM Bug Proxy <bugproxy>
Component: kernelAssignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED NOTABUG QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 18CC: gansalmon, itamar, jkachuck, jonathan, kernel-maint, madhu.chinakonda, wgomerin
Target Milestone: ---   
Target Release: ---   
Hardware: ppc64   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-10-09 16:09:29 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
var-log-messages.txt none

Description IBM Bug Proxy 2012-10-09 08:30:47 UTC
Problem Description
-------------------------------
I noticed following call trace in /var/log/messages ,While running ltp-runall tests  on Fedora 18 alpha ppc64 installed P7 LPAR system.


--- Call Trace noticed ---
.
Sep 28 22:28:38 localhost kernel: [32911.440287] ------------[ cut here ]------------
Sep 28 22:28:38 localhost kernel: [32911.440289] WARNING: at kernel/futex.c:2455
Sep 28 22:28:38 localhost kernel: [32911.440291] Modules linked in: nfsd auth_rpcgss nfs_acl binfmt_misc nls_utf8 lockd sunrpc bnep bluetooth rfkill ipt_MASQUERADE nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables iptable_nat nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack uinput ibmveth ibmvscsic scsi_transport_srp scsi_tgt
Sep 28 22:28:38 localhost kernel: [32911.440323] NIP: c000000000106724 LR: c000000000106720 CTR: c0000000003f14f0
Sep 28 22:28:38 localhost kernel: [32911.440326] REGS: c000000026d83b10 TRAP: 0700   Not tainted  (3.6.0-0.rc5.git0.1.fc18.ppc64p7)
Sep 28 22:28:38 localhost kernel: [32911.440328] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>  CR: 28000084  XER: 20000000
Sep 28 22:28:38 localhost kernel: [32911.440335] SOFTE: 1
Sep 28 22:28:38 localhost kernel: [32911.440337] CFAR: c0000000007d306c
Sep 28 22:28:38 localhost kernel: [32911.440339] TASK = c000000007040000[14618] 'get_robust_list' THREAD: c000000026d80000 CPU: 3
Sep 28 22:28:38 localhost kernel: [32911.440339] GPR00: c000000000106720 c000000026d83d90 c0000000012b38c0 0000000000000034
Sep 28 22:28:38 localhost kernel: [32911.440339] GPR04: 0000000000000000 0000000000000000 696e20323031332e 73745f6c69737420
Sep 28 22:28:38 localhost kernel: [32911.440339] GPR08: 0000000000000001 0000000000000000 0000000000000000 0000000000000000
Sep 28 22:28:38 localhost kernel: [32911.440339] GPR12: 0000000028000082 c00000000eda0a80 0000000000000000 0000000000000000
Sep 28 22:28:38 localhost kernel: [32911.440339] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Sep 28 22:28:38 localhost kernel: [32911.440339] GPR20: 0000000000000000 0000000010004528 00000000100044e0 0000000010004488
Sep 28 22:28:38 localhost kernel: [32911.440339] GPR24: 0000000010004570 0000000000000000 00000000100044b8 c0000000011b38ef
Sep 28 22:28:38 localhost kernel: [32911.440339] GPR28: 00000fffec7001c0 0000000000000000 c00000000122a710 0000000000000000
Sep 28 22:28:38 localhost kernel: [32911.440372] NIP [c000000000106724] .SyS_get_robust_list+0x194/0x1b0
Sep 28 22:28:38 localhost kernel: [32911.440375] LR [c000000000106720] .SyS_get_robust_list+0x190/0x1b0
Sep 28 22:28:38 localhost kernel: [32911.440376] Call Trace:
Sep 28 22:28:38 localhost kernel: [32911.440379] [c000000026d83d90] [c000000000106720] .SyS_get_robust_list+0x190/0x1b0 (unreliable)
Sep 28 22:28:38 localhost kernel: [32911.440384] [c000000026d83e30] [c000000000009854] syscall_exit+0x0/0x94
Sep 28 22:28:38 localhost kernel: [32911.440386] Instruction dump:
Sep 28 22:28:38 localhost kernel: [32911.440388] eb81ffe0 eba1ffe8 7c0803a6 ebc1fff0 ebe1fff8 4e800020 60420000 3920ffda
Sep 28 22:28:38 localhost kernel: [32911.440393] 4bffff00 e87e8038 486cc8bd 60000000 <0fe00000> 39200001 993b0001 4bfffebc
Sep 28 22:28:38 localhost kernel: [32911.440400] ---[ end trace cfd53d8dbdfdc75c ]---


--- Steps to reproduce ---
1) Download last LTP tarball from http://ltp.sourceforge.net/ & extract it.
2) cd /root/ltp-full-*
3) export CC='gcc'
4) ./configure
5) make
6) make install
7) export LTPROOT=/opt/ltp
8) export PATH=$PATH:$LTPROOT/testcases/bin
9) Now execute runltp 
   cd /opt/ltp/
   nohup ./runltp -c 2 -i 2 -m 2,4,10240,1 -D 2,10,10240,1 -p -q  -l ltprunall.log -o ltprunall.out -C ltprunall-log.fail -d /opt/ltp &

--- Other Details ---

[root@localhost ~]# uname -a
Linux localhost 3.6.0-0.rc5.git0.1.fc18.ppc64p7 #1 SMP Tue Sep 11 10:31:05 MST 2012 ppc64 ppc64 ppc64 GNU/Linux

[root@localhostl]# free -m
             total       used       free     shared    buffers     cached
Mem:          1990        965       1024          0         23         38
-/+ buffers/cache:        903       1086
Swap:         3999         98       3901

[root@localhostl]# lscpu 
Architecture:          ppc64
Byte Order:            Big Endian
CPU(s):                48
On-line CPU(s) list:   0-7
Off-line CPU(s) list:  8-47
Thread(s) per core:    4
Core(s) per socket:    1
Socket(s):             2
NUMA node(s):          2
Model:                 IBM,8246-L2C
L1d cache:             32K
L1i cache:             32K
NUMA node0 CPU(s):     0-7
NUMA node1 CPU(s):     

Attached var-log-messages.txt for reference.

== Comment: #1 - Vaishnavi Bhat <vaish123.com> -  ==
I see the following ERROR messages in the log, 

Sep 28 17:19:29 localhost tcsd[464]: Error: could not load module /lib/modules/3.6.0-0.rc5.git0.1.fc18.ppc64p7/kernel/drivers/char/tpm/tpm_*: No such file or directory
Sep 28 17:19:29 localhost tcsd[464]: Starting tcsd: [FAILED]
Sep 28 17:19:29 localhost TCSD TDDL[483]: TrouSerS ERROR: Could not find a device to open!

dbus-daemon[527]: ERROR: Unable to open policy /sys/fs/selinux/policy.

Sep 29 01:01:46 localhost systemd[1]: Reloading.
Sep 29 01:01:57 localhost chrony-helper[2076]: tr: write error: Broken pipe
Sep 29 01:01:57 localhost chrony-helper[2076]: tr: write error


Sep 28 22:51:23 localhost NetworkManager[20828]: <error> [1348887083.356258] [nm-device-ethernet.c:456] real_update_permanent_hw_address(): (eth0): unable to read permanent MAC address (error 0)


Can you paste the output of 
- systemctl status tcsd.service 
- rpm -qa | grep setools* 

Thanks

== Comment: #2 - MANAS K. NAYAK <maknayak.com> -  ==
(In reply to comment #1)
> I see the following ERROR messages in the log, 
> 
> Sep 28 17:19:29 localhost tcsd[464]: Error: could not load module
> /lib/modules/3.6.0-0.rc5.git0.1.fc18.ppc64p7/kernel/drivers/char/tpm/tpm_*:
> No such file or directory
> Sep 28 17:19:29 localhost tcsd[464]: Starting tcsd: [FAILED]
> Sep 28 17:19:29 localhost TCSD TDDL[483]: TrouSerS ERROR: Could not find a
> device to open!
> 
> dbus-daemon[527]: ERROR: Unable to open policy /sys/fs/selinux/policy.
> 
> Sep 29 01:01:46 localhost systemd[1]: Reloading.
> Sep 29 01:01:57 localhost chrony-helper[2076]: tr: write error: Broken pipe
> Sep 29 01:01:57 localhost chrony-helper[2076]: tr: write error
> 
> 
> Sep 28 22:51:23 localhost NetworkManager[20828]: <error> [1348887083.356258]
> [nm-device-ethernet.c:456] real_update_permanent_hw_address(): (eth0):
> unable to read permanent MAC address (error 0)
> 
> 
> Can you paste the output of 
> - systemctl status tcsd.service 
> - rpm -qa | grep setools* 
> 
> Thanks

Hi Vaishanavi,
Here are the details you asked for.

[root@miz04 ~]# systemctl status tcsd.service 
tcsd.service - LSB: Init script for TCSD
          Loaded: loaded (/etc/rc.d/init.d/tcsd)
          Active: failed (Result: exit-code) since Fri, 05 Oct 2012 11:57:33 -0400; 5min ago
         Process: 438 ExecStart=/etc/rc.d/init.d/tcsd start (code=exited, status=137)
          CGroup: name=systemd:/system/tcsd.service

Oct 05 11:57:31 miz04.austin.ibm.com tcsd[438]: Error: could not load module...y
Oct 05 11:57:31 miz04.austin.ibm.com tcsd[438]: Starting tcsd: [FAILED]

[root@miz04 ~]# rpm -qa | grep setools* 
setools-libs-3.3.7-26.fc18.ppc64
setools-libs-python-3.3.7-26.fc18.ppc64

Thanks...
Manas

Comment 1 IBM Bug Proxy 2012-10-09 08:31:08 UTC
Created attachment 623976 [details]
var-log-messages.txt

Comment 2 IBM Bug Proxy 2012-10-09 09:51:46 UTC
------- Comment From vaish123.com 2012-10-09 09:47 EDT-------
> [root@miz04 ~]# systemctl status tcsd.service
> tcsd.service - LSB: Init script for TCSD
>           Loaded: loaded (/etc/rc.d/init.d/tcsd)
>           Active: failed (Result: exit-code) since Fri, 05 Oct 2012 11:57:33
> -0400; 5min ago
>          Process: 438 ExecStart=/etc/rc.d/init.d/tcsd start (code=exited,
> status=137)
>           CGroup: name=systemd:/system/tcsd.service
> Oct 05 11:57:31 miz04.austin.ibm.com tcsd[438]: Error: could not load
> module...y
> Oct 05 11:57:31 miz04.austin.ibm.com tcsd[438]: Starting tcsd: [FAILED]
The tcsd.service issue has been addressed in RH bugzilla https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=834915
( Fedora 17, arch= X86-64)

Thanks.

Comment 3 Dave Jones 2012-10-09 16:09:29 UTC
if you read the warning you'll see this..

WARN_ONCE(1, "deprecated: get_robust_list will be deleted in 2013.\n");