Bug 98318

Summary: [x86_64] Reproducible oops caused by useradd.
Product: [Retired] Red Hat Raw Hide
Reporter: Aleksey Nogin <aleksey>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED NOTABUG
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: 1.0
CC: jyh
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Last Closed: 2004-03-11 07:59:05 UTC

Description Aleksey Nogin 2003-06-30 23:56:55 UTC
Reproducible: 100%

Steps to reproduce:

Attempt to install the Canna package from rawhide, or just run the

/usr/sbin/useradd -c "Canna Service User" -r -s /sbin/nologin -u 39 -d /var/lib/canna canna

command directly.

Expected: user is added.

Actual:
After several seconds (strace shows a lot of YP traffic), I get an Oops:
-------
Unable to handle kernel paging request at virtual address ffffffffffffffff
 printing rip:
ffffffff80163ac6
PML4 103027 PGD 2067 PMD 0
Oops: 0002
CPU 0
Pid: 890, comm: useradd Not tainted
RIP: 0010:[<ffffffff80163ac6>]{__pollwait+166}
RSP: 0018:00000103fabd7e38  EFLAGS: 00010206
RAX: 000000000000002f RBX: 00000103dffff000 RCX: ffffffffffffffff
RDX: 00000103dffff000 RSI: 00000000003defff RDI: 0000010000019168
RBP: 00000103fd7177b8 R08: 0000010000019198 R09: 00000100164ffff0
R10: 0000000000000001 R11: 0000000000000000 R12: 00000103fabd7f38
R13: 00000103e12ddc08 R14: 0000000000000001 R15: 00000103e0000000
FS:  000000000050e280(0000) GS:ffffffff804cf600(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffffffffffff CR3: 0000000000101000 CR4: 00000000000006e0

Call Trace: [<ffffffff80163a6a>]{__pollwait+74}
[<ffffffff802313c8>]{datagram_poll+24}
       [<ffffffff8022b36d>]{sock_poll+29} [<ffffffff80164457>]{do_pollfd+119}
       [<ffffffff8016452f>]{do_poll+127} [<ffffffff80164772>]{sys_poll+466}
       [<ffffffff8010f0d3>]{system_call+119}
Process useradd (pid: 890, stackpage=103fabd7000)
Stack: 00000103fabd7e38 0000000000000018 ffffffff80163a6a 00000103e12ddbd0
       00000103ff02f100 00000103fd7177b8 00000103e0000000 0000000000000000
       ffffffff802313c8 0000000000000145 ffffffff8022b36d 0000010000019440
       ffffffff80164457 00000000000001f0 00000103fabd7eec 00000103fabd7ee0
       0000000000000000 0000000000000000 00000000000001f5 00000103f21c6ed8
       0000000000000001 00000103fabd7f38 ffffffff8016452f 00000000000001f5
       00000103fabd7f38 0000000000525aec 0000000000000000 0000007fbfffc710
       0000000000000001 0000000000000000 00000103f21c6ed8 00000000000001f5
       ffffffff80164772 00000103fabd7f38 00000001fffffff2 0000000000000000
       00000103dffff000 0000000000000002 0000000000001388 0000000000000001
Call Trace: [<ffffffff80163a6a>]{__pollwait+74}
[<ffffffff802313c8>]{datagram_poll+24}
       [<ffffffff8022b36d>]{sock_poll+29} [<ffffffff80164457>]{do_pollfd+119}
       [<ffffffff8016452f>]{do_poll+127} [<ffffffff80164772>]{sys_poll+466}
       [<ffffffff8010f0d3>]{system_call+119}

Code: 48 89 29 4c 89 69 28 48 8d 71 08 65 48 8b 04 25 18 00 00 00
-------
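
For readers unfamiliar with the "Oops: 0002" line, here is a minimal sketch of how it can be read. It assumes the four hex digits are the x86 page-fault error code with the standard low-bit layout (bit 0: page present, bit 1: write access, bit 2: user mode). The function name is made up for illustration and is not part of any kernel tool.

```python
def decode_page_fault_error(code):
    """Interpret the low three bits of an x86 page-fault error code."""
    return {
        # bit 0: 0 = the page was not present, 1 = protection violation
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = fault raised in kernel mode, 1 = in user mode
        "mode": "user" if code & 0x4 else "kernel",
    }

# The value printed in the oops above.
print(decode_page_fault_error(int("0002", 16)))
```

Under that assumption, 0002 describes a kernel-mode write to a page that is not mapped, which is consistent with the reported fault address ffffffffffffffff.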

Another instance of the same thing:
-------
Jun 30 13:48:45 matrix41 kernel: Unable to handle kernel paging request at virtual address ffffffffffffffff
Jun 30 13:48:45 matrix41 kernel:  printing rip:
Jun 30 13:48:45 matrix41 kernel: ffffffff80163ac6
Jun 30 13:48:45 matrix41 kernel: PML4 103027 PGD 2067 PMD 0
Jun 30 13:48:45 matrix41 kernel: Oops: 0002
Jun 30 13:48:45 matrix41 kernel: CPU 1
Jun 30 13:48:45 matrix41 kernel: Pid: 766, comm: useradd Not tainted
Jun 30 13:48:45 matrix41 kernel: RIP: 0010:[<ffffffff80163ac6>]{__pollwait+166}
Jun 30 13:48:45 matrix41 kernel: RSP: 0018:00000103f9031e38  EFLAGS: 00010206
Jun 30 13:48:45 matrix41 kernel: RAX: 000000000000002f RBX: 00000103dffff000 RCX: ffffffffffffffff
Jun 30 13:48:45 matrix41 kernel: RDX: 00000103dffff000 RSI: 00000000003defff RDI: 0000010000019168
Jun 30 13:48:45 matrix41 kernel: RBP: 00000103f8fa7870 R08: 0000010000019198 R09: 00000100164ffff0
Jun 30 13:48:45 matrix41 kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 00000103f9031f38
Jun 30 13:48:45 matrix41 kernel: R13: 00000103fad93308 R14: 0000000000000001 R15: 00000103e0000000
Jun 30 13:48:45 matrix41 kernel: FS:  000000000050e280(0000) GS:ffffffff804cf680(0000) knlGS:0000000000000000
Jun 30 13:48:45 matrix41 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 30 13:48:45 matrix41 kernel: CR2: ffffffffffffffff CR3: 0000000037ffa000 CR4: 00000000000006e0
Jun 30 13:48:45 matrix41 kernel:
Jun 30 13:48:45 matrix41 kernel: Call Trace: [<ffffffff80163a6a>]{__pollwait+74} [<ffffffff802313c8>]{datagram_poll+24}
Jun 30 13:48:45 matrix41 kernel:        [<ffffffff8022b36d>]{sock_poll+29} [<ffffffff80164457>]{do_pollfd+119}
Jun 30 13:48:45 matrix41 kernel:        [<ffffffff8016452f>]{do_poll+127} [<ffffffff80164772>]{sys_poll+466}
Jun 30 13:48:45 matrix41 kernel:        [<ffffffff8010f0d3>]{system_call+119}
Jun 30 13:48:45 matrix41 kernel: Process useradd (pid: 766, stackpage=103f9031000)
Jun 30 13:48:45 matrix41 kernel: Stack: 00000103f9031e38 0000000000000018 ffffffff80163a6a 00000103fad932d0
Jun 30 13:48:45 matrix41 kernel:        00000103ff521140 00000103f8fa7870 00000103e0000000 0000000000000000
Jun 30 13:48:45 matrix41 kernel:        ffffffff802313c8 0000000000000145 ffffffff8022b36d 0000010000019440
Jun 30 13:48:45 matrix41 kernel:        ffffffff80164457 00000000000001f0 00000103f9031eec 00000103f9031ee0
Jun 30 13:48:45 matrix41 kernel:        0000000000000000 0000000000000000 00000000000001f5 00000103fae90c98
Jun 30 13:48:45 matrix41 kernel:        0000000000000001 00000103f9031f38 ffffffff8016452f 00000000000001f5
Jun 30 13:48:45 matrix41 kernel:        00000103f9031f38 0000000000525a4c 0000000000000000 0000007fbfffba60
Jun 30 13:48:45 matrix41 kernel:        0000000000000001 0000000000000000 00000103fae90c98 00000000000001f5
Jun 30 13:48:45 matrix41 kernel:        ffffffff80164772 00000103f9031f38 00000001fffffff2 0000000000000000
Jun 30 13:48:45 matrix41 kernel:        00000103dffff000 0000000000000002 0000000000001388 0000000000000001
Jun 30 13:48:45 matrix41 kernel: Call Trace: [<ffffffff80163a6a>]{__pollwait+74} [<ffffffff802313c8>]{datagram_poll+24}
Jun 30 13:48:45 matrix41 kernel:        [<ffffffff8022b36d>]{sock_poll+29} [<ffffffff80164457>]{do_pollfd+119}
Jun 30 13:48:45 matrix41 kernel:        [<ffffffff8016452f>]{do_poll+127} [<ffffffff80164772>]{sys_poll+466}
Jun 30 13:48:45 matrix41 kernel:        [<ffffffff8010f0d3>]{system_call+119}
Jun 30 13:48:45 matrix41 kernel:
Jun 30 13:48:45 matrix41 kernel: Code: 48 89 29 4c 89 69 28 48 8d 71 08 65 48 8b 04 25 18 00 00 00
-----

(If I turn ypbind off before running useradd, the oops does not occur.)

I am running a current version of Rawhide (packages installed manually into a
fresh partition).

"cat /proc/version" is

Linux version 2.4.20-9.2smp (bhcompile.redhat.com) (gcc version 3.2.2
20030222 (Red Hat Linux 3.2.2-5)) #1 SMP Thu Apr 10 11:54:33 EDT 2003

Comment 1 Aleksey Nogin 2003-07-01 00:24:18 UTC
Just saw syslogd trip over the same thing:

Unable to handle kernel paging request at virtual address ffffffffffffffff
 printing rip:
ffffffff80163ac6
PML4 103027 PGD 2067 PMD 0
Oops: 0002
CPU 0
Pid: 408, comm: syslogd Not tainted
RIP: 0010:[<ffffffff80163ac6>]{__pollwait+166}
RSP: 0018:00000103fe30de18  EFLAGS: 00010206
RAX: 000000000000002f RBX: 00000103dfff4000 RCX: ffffffffffffffff
RDX: 00000103dfff4000 RSI: 00000000003deff4 RDI: 0000010000019168
RBP: 00000100dff709e0 R08: 0000010000019198 R09: 00000100164ffc28
R10: 0000000000000001 R11: 0000000000000000 R12: 00000103fe30de98
R13: 00000103fdec0288 R14: 0000000000000001 R15: 0000000000000145
FS:  000000000050e280(0000) GS:ffffffff804cf600(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffffffffffff CR3: 0000000000101000 CR4: 00000000000006e0

Call Trace: [<ffffffff80163a6a>]{__pollwait+74}
[<ffffffff802313c8>]{datagram_poll+24}
       [<ffffffff8022b36d>]{sock_poll+29} [<ffffffff80163d3f>]{do_select+351}
       [<ffffffff8016426b>]{sys_select+971} [<ffffffff8010f0d3>]{system_call+119}

Process syslogd (pid: 408, stackpage=103fe30d000)
Stack: 00000103fe30de18 0000000000000018 ffffffff80163a6a 00000103fdec66a8
       00000103fdfbc880 00000100dff709e0 0000000000000000 00000103fe30df18
       ffffffff802313c8 0000000000000000 ffffffff8022b36d 0000000000000000
       ffffffff80163d3f 0000000000000000 7fffffffffffffff 0000000000000002
       00000103fe30de98 00000103fe30df10 0000000100000246 0000000000000000
       00000103dfff4000 00000000004055d3 0000000000000000 0000000000000008
       0000000000000000 0000007fbffff200 00000103fdec66a8 0000000000000001
       ffffffff8016426b 00000000004055e3 0000000000000001 00000008fe30de78
       0000000000000000 0000000000000000 7fffffffffffffff 00000103fdec66a8
       00000103fdec66b0 00000103fdec66b8 00000103fdec66c0 00000103fdec66c8
Call Trace: [<ffffffff80163a6a>]{__pollwait+74}
[<ffffffff802313c8>]{datagram_poll+24}
       [<ffffffff8022b36d>]{sock_poll+29} [<ffffffff80163d3f>]{do_select+351}
       [<ffffffff8016426b>]{sys_select+971} [<ffffffff8010f0d3>]{system_call+119}


Code: 48 89 29 4c 89 69 28 48 8d 71 08 65 48 8b 04 25 18 00 00 00
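
A note on the "Code:" bytes, which are identical in all three dumps. The sketch below decodes just the first two instructions by hand, assuming the bytes start at the faulting RIP. It handles only the one encoding that appears here (REX.W, opcode 89, register or 8-bit-displacement memory operand), and the helper name is hypothetical. It is not a general disassembler.

```python
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
          "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

def decode_mov_store(code):
    """Decode one `REX.W 89 /r` instruction (MOV r/m64, r64) whose
    destination is a plain [base] or [base + disp8] memory operand.
    Returns (AT&T-syntax text, instruction length in bytes)."""
    rex, opcode, modrm = code[0], code[1], code[2]
    if (rex & 0xF0) != 0x40 or not (rex & 0x08) or opcode != 0x89:
        raise ValueError("not a REX.W MOV r/m64, r64")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # SIB, RIP-relative, disp32 and register-direct forms are out of scope.
    if mod not in (0, 1) or rm == 4 or (mod == 0 and rm == 5):
        raise ValueError("addressing form not handled by this sketch")
    src = REGS64[reg | ((rex & 0x04) << 1)]   # REX.R extends the reg field
    base = REGS64[rm | ((rex & 0x01) << 3)]   # REX.B extends the r/m field
    if mod == 1:
        disp = int.from_bytes(code[3:4], "little", signed=True)
        return f"mov %{src},{disp:#x}(%{base})", 4
    return f"mov %{src},(%{base})", 3

code = bytes.fromhex("48 89 29 4c 89 69 28 48 8d 71 08 65 48 8b 04 25 18 00 00 00")
first, length = decode_mov_store(code)
second, _ = decode_mov_store(code[length:])
print(first)
print(second)
```

If that reading is right, the faulting instruction stores RBP through the pointer in RCX. RCX is ffffffffffffffff in every dump, matching CR2, so __pollwait appears to be writing through an invalid pointer rather than hitting a random address.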


Comment 2 Aleksey Nogin 2004-03-11 07:59:05 UTC
It turned out that this machine had a buggy BIOS; the problems
disappeared after a BIOS upgrade.