Bug 1623127 - "cannot allocate any more memory" on 4.19-rc1
Summary: "cannot allocate any more memory" on 4.19-rc1
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: crash
Version: 28
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Dave Anderson
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1628967
TreeView+ depends on / blocked
 
Reported: 2018-08-28 14:17 UTC by Dominique Martinet
Modified: 2018-09-24 19:07 UTC (History)
2 users (show)

Fixed In Version: crash-7.2.4-1.fc30
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1628967 (view as bug list)
Environment:
Last Closed: 2018-09-24 19:07:38 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
tentative ugly patch (3.15 KB, patch)
2018-08-28 14:17 UTC, Dominique Martinet
no flags Details | Diff
task_struct.pid_links patch (1.77 KB, text/plain)
2018-08-28 19:26 UTC, Dave Anderson
no flags Details

Description Dominique Martinet 2018-08-28 14:17:51 UTC
Created attachment 1479278 [details]
tentative ugly patch

Description of problem:
crash does not start with a 4.19-rc1 kernel

Version-Release number of selected component (if applicable):
today's master branch on github: e541c5ca276 ("Introduction of a new "kmem -r" option.")

How reproducible:
always.

Steps to Reproduce:
1. compile 4.19-rc1 kernel
2. reboot on it
3. run crash

Actual results:
crash fails to start with this error:

   buf_1K_used: 478                                                    
   buf_2K_used: 1
   buf_4K_used: 1
   buf_8K_used: 0
  buf_32K_used: 1
    buf_1K_ovf: 0
    buf_2K_ovf: 0
    buf_4K_ovf: 0
    buf_8K_ovf: 0
   buf_32K_ovf: 0
 buf_1K_maxuse:  2 of 10
 buf_2K_maxuse:  1 of 10
 buf_4K_maxuse:  1 of 5
 buf_8K_maxuse:  0 of 5
buf_32K_maxuse:  1 of 1
  buf_inuse[5]: [3][0][0][0][0]
      smallest: 8
       largest: 7069516368156
      embedded: 3
  max_embedded: 3
       mallocs: 3
         frees: 3
    reqs/total: 485/7069517334472
  average size: 14576324401

crash: cannot allocate any more memory!


Expected results:
crash starts.

Additional info:
I recompiled with -O0 and hooked up gdb to dump_shared_bufs, here's the backtrace at the time of the crash:
#0  dump_shared_bufs () at tools.c:5525
#1  0x00000000004795e7 in getbuf (reqsize=7069516368156) at tools.c:5716
#2  0x0000000000529a28 in store_module_symbols_v2 (total=17429, mods_installed=47) at symbols.c:1777
#3  0x00000000004e19c3 in module_init () at kernel.c:3639
#4  0x000000000046397e in main_loop () at main.c:772
#5  0x00000000006af463 in captured_command_loop (data=data@entry=0x0) at main.c:258
#6  0x00000000006ae0fa in catch_errors (func=func@entry=0x6af450 <captured_command_loop>, func_args=func_args@entry=0x0, 
    errstring=errstring@entry=0x902bb9 "", mask=mask@entry=6) at exceptions.c:557
#7  0x00000000006b0456 in captured_main (data=data@entry=0x7fffffffe4c0) at main.c:1064
#8  0x00000000006ae0fa in catch_errors (func=func@entry=0x6af730 <captured_main>, func_args=func_args@entry=0x7fffffffe4c0, 
    errstring=errstring@entry=0x902bb9 "", mask=mask@entry=6) at exceptions.c:557
#9  0x00000000006b0767 in gdb_main (args=0x7fffffffe4c0) at main.c:1079
#10 gdb_main_entry (argc=<optimized out>, argv=argv@entry=0x7fffffffe628) at main.c:1099
#11 0x00000000004f68a4 in gdb_main_loop (argc=<optimized out>, argc@entry=1, argv=argv@entry=0x7fffffffe628) at gdb_interface.c:76
#12 0x00000000004620b7 in main (argc=1, argv=0x7fffffffe628) at main.c:707


with (at store_module_symbols_v2 level):
(gdb) p/x 7069516368156 (strbuflen)
$11 = 0x66e0003091c
(gdb) p/x last
$8 = 0xfbcdfffdc1f8
(gdb) p/x first
$9 = 0xf55ffffabeb8
(gdb) p/x BUFSIZE
$10 = 0x5dc
(gdb) p/x first + strbuflen
$12 = 0xfbcdfffdc7d4
(gdb) p/x lm->mod_base + lm->mod_size
$13 = 0xffffffffc0d19000

looking on the linux side this seems to be due to 7290d58095712a89f845e1bca05334796dd49ed2 ("module: use relative references for __ksymtab entries") ; I've monkey-reverted this and a few other commits that depend on it and got crash to. . . fail on something else because 2c4704756cab7cfa031ada4dab361562f0e357c0 ("pids: Move the pgrp and session pid pointers from task_struct to signal_struct")

. . . Well, anyway, it probably took me much longer than it should have, but I did work out the first one, I've attached a patch that appears to work for this part unconditionally - I'm not sure how to do this dynamically though so I'll have to defer to you on that.

I'll have a look at the task_struct_pids eventually, that might be easier to make conditional as we can just check if the struct member exists, but it looks like a slightly bigger change and I don't think I'll have time to finish tonight so posting what I have for now.


Thanks for crash!
-- 
Dominique Martinet

Comment 1 Dave Anderson 2018-08-28 14:38:51 UTC
> . . . Well, anyway, it probably took me much longer than it should have, but I > did work out the first one, I've attached a patch that appears to work for this > part unconditionally - I'm not sure how to do this dynamically though so I'll
> have to defer to you on that.

You mean being able to maintain backwards-compatibility, correct?

Comment 2 Dominique Martinet 2018-08-28 14:41:50 UTC
> You mean being able to maintain backwards-compatibility, correct?

Yeah, I mean detect at runtime what the kernel crash is looking at uses.
Having let this rest for a few minutes I realize we could just check if the members of the kernel_symbol struct are name or name_offset, if that is available?

-- 
Dominique

Comment 3 Dave Anderson 2018-08-28 14:48:13 UTC
Ah, it appears to conditional based upon CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
declaring two different kernel_symbol structure types.  So maybe your new
modsym_name() and modsym_value() functions can be made conditional based
upon the how the kernel_symbol structure is declared?

Comment 4 Dominique Martinet 2018-08-28 15:28:23 UTC
(Thanks for the invitation to the crash-utility mailing list, I've subscribed on a "real" mail address for that -- sorry for being picky! :))

Yeah we could definitely check that, it's getting late here (JST) but I can do that tomorrow certainly; it looks straightforward enough. If you have time to look at the other incompatible change with task_struct pids field that would be appreciated!

I'll send the next version of the patch to the list as I see that's how it works in the archive.

-- 
Dominique

Comment 5 Dave Anderson 2018-08-28 15:52:00 UTC
OK thanks, I appreciate it.  I'm currently trying to provision a system
with 4.19-rc1. 

I'm not clear on the task_struct pids changes yet.  Since signal_structs
can be shared, I'm wondering how it will be possible to find each task
struct address from the shared signal struct?

Comment 6 Dave Anderson 2018-08-28 18:03:17 UTC
(In reply to Dave Anderson from comment #5)
> OK thanks, I appreciate it.  I'm currently trying to provision a system
> with 4.19-rc1. 
> 
> I'm not clear on the task_struct pids changes yet.  Since signal_structs
> can be shared, I'm wondering how it will be possible to find each task
> struct address from the shared signal struct?

Ah, the pid_links[] hlist_node array is still contained in the task_struct:

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -775,7 +775,8 @@ struct task_struct {
        struct list_head                ptrace_entry;

        /* PID/PID hash table linkage. */
-       struct pid_link                 pids[PIDTYPE_MAX];
+       struct pid                      *thread_pid;
+       struct hlist_node               pid_links[PIDTYPE_MAX];
        struct list_head                thread_group;
        struct list_head                thread_node;

Hopefully this can be conditionally handled in refresh_radix_tree_task_table().

Comment 7 Dave Anderson 2018-08-28 19:26:37 UTC
Created attachment 1479344 [details]
task_struct.pid_links patch

Comment 8 Dave Anderson 2018-08-28 19:29:50 UTC
(In reply to Dominique Martinet from comment #4)
> ...
> Yeah we could definitely check that, it's getting late here (JST) but I can
> do that tomorrow certainly; it looks straightforward enough. If you have
> time to look at the other incompatible change with task_struct pids field
> that would be appreciated!

That was an easy enough fix with this patch:

diff --git a/defs.h b/defs.h
index 8687ff1..1dc8cca 100644
--- a/defs.h
+++ b/defs.h
@@ -2036,6 +2036,7 @@ struct offset_table {                    /* stash of commonly-used offsets */
 	long memcg_cache_params___root_caches_node;
 	long memcg_cache_params_children;
 	long memcg_cache_params_children_node;
+	long task_struct_pid_links;
 };
 
 struct size_table {         /* stash of commonly-used sizes */
diff --git a/symbols.c b/symbols.c
index 2e6713a..48a0ee6 100644
--- a/symbols.c
+++ b/symbols.c
@@ -8606,6 +8606,8 @@ dump_offset_table(char *spec, ulong makestruct)
                 OFFSET(task_rss_stat_count));
         fprintf(fp, "              task_struct_pids: %ld\n",
                 OFFSET(task_struct_pids));
+        fprintf(fp, "         task_struct_pid_links: %ld\n",
+                OFFSET(task_struct_pid_links));
         fprintf(fp, "          task_struct_last_run: %ld\n",
                 OFFSET(task_struct_last_run));
         fprintf(fp, "         task_struct_timestamp: %ld\n",
diff --git a/task.c b/task.c
index 39fb0de..1eecfce 100644
--- a/task.c
+++ b/task.c
@@ -314,6 +314,7 @@ task_init(void)
 	        strcpy(buf, "alias last ps -l");
         	alias_init(buf);
 	}
+	MEMBER_OFFSET_INIT(task_struct_pid_links, "task_struct", "pid_links");
 	MEMBER_OFFSET_INIT(pid_link_pid, "pid_link", "pid");
 	MEMBER_OFFSET_INIT(pid_hash_chain, "pid", "hash_chain");
 
@@ -2382,7 +2383,10 @@ retry_radix_tree:
 		pid_tasks_0 = ULONG(pidbuf + OFFSET(pid_tasks));
 		if (!pid_tasks_0)
 			continue;
-		task = pid_tasks_0 - OFFSET(task_struct_pids);
+		if (VALID_MEMBER(task_struct_pids))
+			task = pid_tasks_0 - OFFSET(task_struct_pids);
+		else
+			task = pid_tasks_0 - OFFSET(task_struct_pid_links);
 
 		if (CRASHDEBUG(1))
 			console("pid: %lx  ns: %lx  tasks[0]: %lx task: %lx\n",
  

Here's a proof of concept with --no_modules:

  # crash --no_modules
  
  crash 7.2.3++
  Copyright (C) 2002-2017  Red Hat, Inc.
  Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
  Copyright (C) 1999-2006  Hewlett-Packard Co
  Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
  Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
  Copyright (C) 2005, 2011  NEC Corporation
  Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
  Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
  This program is free software, covered by the GNU General Public License,
  and you are welcome to change it and/or distribute copies of it under
  certain conditions.  Enter "help copying" to see the conditions.
  This program has absolutely no warranty.  Enter "help warranty" for details.
   
   GNU gdb (GDB) 7.6
  Copyright (C) 2013 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-unknown-linux-gnu"...
  
  WARNING: kernel relocated [32MB]: patching 99238 gdb minimal_symbol values
  
  WARNING: no kernel module access                                       
  
        KERNEL: /vmlinux
      DUMPFILE: /dev/crash
          CPUS: 2
          DATE: Tue Aug 28 15:12:26 2018
        UPTIME: 00:27:57
  LOAD AVERAGE: 0.36, 0.27, 0.20
         TASKS: 136
      NODENAME: hp-dl380pgen8-02-vm-7.lab.bos.redhat.com
       RELEASE: 4.19.0-0.rc1.git0.1.fc30.x86_64
       VERSION: #1 SMP Mon Aug 27 13:01:19 UTC 2018
       MACHINE: x86_64  (2892 Mhz)
        MEMORY: 4 GB
           PID: 2972
       COMMAND: "crash"
          TASK: ffff953ad74b8000  [THREAD_INFO: ffff953ad74b8000]
           CPU: 0
         STATE: TASK_RUNNING (ACTIVE)
  
  crash>
  
> 
> I'll send the next version of the patch to the list as I see that's how it
> works in the archive.
> 
> -- 
> Dominique

Comment 9 Dominique Martinet 2018-08-28 22:45:41 UTC
> That was an easy enough fix with this patch:
> [...]

Nice!
It looked like there were quite a few other places using task_struct_pids, but now I see these were for older kernels so it does look much better than I had expected.
I agree this one looks good as you did, they split 'pid_link' struct in two and node was the first field (the original crash code actually relied on that and was looking at task_struct.pids.node), so it's great it came out that simple.

I'll send a v2 of the module patch within an hour or so.

-- 
Dominique

Comment 10 Dave Anderson 2018-08-29 14:51:40 UTC
Dominique,

I appreciate your proactive approach to this bugzilla, i.e., when
patches come along with the bug report!

Both patches have been applied upstream:

https://github.com/crash-utility/crash/commit/d7eec45d4c2cdd836ce48a81b0ae688a7d2879ba

  Fix for Linux 4.19-rc1 and later kernels that contain kernel commit
  2c4704756cab7cfa031ada4dab361562f0e357c0, titled "pids: Move the pgrp
  and session pid pointers from task_struct to signal_struct".  Without
  the patch, the crash session fails during initialization with the
  message "crash: invalid structure member offset: task_struct_pids".
  (anderson)

https://github.com/crash-utility/crash/commit/001f77a05585c15ebd14bb72d5fde314a63c06fe

  Fix for Linux 4.19-rc1 and later kernels that contain kernel commit
  7290d58095712a89f845e1bca05334796dd49ed2, titled "module: use
  relative references for __ksymtab entries".  Without the patch,
  kernels configured with CONFIG_HAVE_ARCH_PREL32_RELOCATIONS fail
  during session initialization, with a dump of the internel buffer
  allocation stats followed by the message "crash: cannot allocate
  any more memory!"
  (asmadeus)


I will be releasing crash-7.2.4 in early September, and at that time
I will rebase the Fedora Rawhide version.

Comment 11 Dominique Martinet 2018-08-29 21:33:38 UTC
Great! I like crash so I might as well get my hands dirty a bit, it feels wrong to repeatedly ask for new linux version fixes without looking at what's wrong :)
I'm in no rush for the fedora version itself, so the plan looks good to me.

Out of curiosity, I checked the diff between your commit and what I sent; is there any reason you made 'kernel_symbol_value' a long long in defs.h?
(it's probably fine either way, but it's the only one)


Anyway, thanks for the help.
-- 
Dominique

Comment 12 Dave Anderson 2018-08-30 12:57:25 UTC
(In reply to Dominique Martinet from comment #11)
> Great! I like crash so I might as well get my hands dirty a bit, it feels
> wrong to repeatedly ask for new linux version fixes without looking at
> what's wrong :)
> I'm in no rush for the fedora version itself, so the plan looks good to me.
> 
> Out of curiosity, I checked the diff between your commit and what I sent; is
> there any reason you made 'kernel_symbol_value' a long long in defs.h?
> (it's probably fine either way, but it's the only one)

My bad -- it was a cut and paste error.  I'll fix it toute suite...


Note You need to log in before you can comment on or make changes to this bug.