Bug 1381357

Summary: rpm-ostree fails to allocate 2147483648 bytes of ram sometimes
Product: [Fedora] Fedora Reporter: Dennis Gilmore <dennis>
Component: rpm-ostreeAssignee: Colin Walters <walters>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 26CC: dennis, jlebon, pbrobinson, walters
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: rpm-ostree-2017.5-2.fc25 rpm-ostree-2017.5-2.fc26 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-05-02 02:02:16 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:    
Bug Blocks: 245418    

Description Dennis Gilmore 2016-10-03 19:57:33 UTC
running rpm-ostree in a armv7 chroot can fail with a failure to allocate memory 
2016-10-03 19:41:26,842 [INFO] {11601} koji.build.buildroot:428 Running mock as kojibuilder
INFO: mock.py version 1.2.21 starting (python version = 3.5.1)...
Start: init plugins
INFO: selinux disabled
Finish: init plugins
Start: run
Start: chroot init
INFO: calling preinit hooks
Mock Version: 1.2.21
INFO: Mock Version: 1.2.21
Finish: chroot init
INFO: Running in chroot: ['/bin/sh', '-c', '{ rm -f /var/lib/rpm/__db*; rm -rf /var/cache/yum/*; set -x; pungi-make-ostree --log-dir=/mnt/koji/compose/cd/log-latest/ --treefile=/mnt/koji/compose/cd/fedora-atomic/fedora-atomic-docker-host.json /mnt/koji/atomic-cd/; } < /dev/null 2>&1 | /usr/bin/tee /tmp/runroot.log; exit ${PIPESTATUS[0]}']
Start: chroot ['/bin/sh', '-c', '{ rm -f /var/lib/rpm/__db*; rm -rf /var/cache/yum/*; set -x; pungi-make-ostree --log-dir=/mnt/koji/compose/cd/log-latest/ --treefile=/mnt/koji/compose/cd/fedora-atomic/fedora-atomic-docker-host.json /mnt/koji/atomic-cd/; } < /dev/null 2>&1 | /usr/bin/tee /tmp/runroot.log; exit ${PIPESTATUS[0]}']
+ pungi-make-ostree --log-dir=/mnt/koji/compose/cd/log-latest/ --treefile=/mnt/koji/compose/cd/fedora-atomic/fedora-atomic-docker-host.json /mnt/koji/atomic-cd/
ERROR running command: rpm-ostree compose tree --repo=/mnt/koji/atomic-cd/ /mnt/koji/compose/cd/fedora-atomic/fedora-atomic-docker-host.json
COMMAND: rpm-ostree compose tree --repo=/mnt/koji/atomic-cd/ /mnt/koji/compose/cd/fedora-atomic/fedora-atomic-docker-host.json
------------------------------------------------------------------------------------------------------------------------------

(rpm-ostree compose tree:7): GLib-ERROR **: gmem.c:165: failed to allocate 2147483648 bytes
Traceback (most recent call last):
  File "/usr/bin/pungi-make-ostree", line 15, in <module>
    ostree.main()
  File "/usr/lib/python2.7/site-packages/pungi/ostree.py", line 63, in main
    run(opts)
  File "/usr/lib/python2.7/site-packages/pungi/ostree.py", line 48, in run
    make_ostree_repo(opts.ostree_repo, opts.treefile, log_dir=opts.log_dir)
  File "/usr/lib/python2.7/site-packages/pungi/ostree.py", line 43, in make_ostree_repo
    show_cmd=True, stdout=True, logfile=log_file)
  File "/usr/lib/python2.7/site-packages/kobo/shortcuts.py", line 335, in run
    raise RuntimeError(err_msg)
RuntimeError: ERROR running command: rpm-ostree compose tree --repo=/mnt/koji/atomic-cd/ /mnt/koji/compose/cd/fedora-atomic/fedora-atomic-docker-host.json
Finish: chroot ['/bin/sh', '-c', '{ rm -f /var/lib/rpm/__db*; rm -rf /var/cache/yum/*; set -x; pungi-make-ostree --log-dir=/mnt/koji/compose/cd/log-latest/ --treefile=/mnt/koji/compose/cd/fedora-atomic/fedora-atomic-docker-host.json /mnt/koji/atomic-cd/; } < /dev/null 2>&1 | /usr/bin/tee /tmp/runroot.log; exit ${PIPESTATUS[0]}']
ERROR: Command failed. See logs for output.
 # /usr/bin/systemd-nspawn -q -M 3b37981c30c14c29a199aec64f23aeee -D /var/lib/mock/f26-build-6576231-646603/root --setenv=PATH=/usr/bin:/bin:/usr/sbin:/sbin --setenv=HOME=/builddir --setenv=TERM=vt100 --setenv=LANG=en_US.UTF-8 --setenv=HOSTNAME=mock --setenv=SHELL=/bin/bash --setenv=PROMPT_COMMAND=printf "\033]0;<mock-chroot>\007" --setenv=PS1=<mock-chroot> \s-\v\$  /bin/sh -c { rm -f /var/lib/rpm/__db*; rm -rf /var/cache/yum/*; set -x; pungi-make-ostree --log-dir=/mnt/koji/compose/cd/log-latest/ --treefile=/mnt/koji/compose/cd/fedora-atomic/fedora-atomic-docker-host.json /mnt/koji/atomic-cd/; } < /dev/null 2>&1 | /usr/bin/tee /tmp/runroot.log; exit ${PIPESTATUS[0]}

is the mock output  it is trying to allocate 2GiB of ram, there is only about 80MiB of 4GiB ram used. my guess is that it can not get that much contiguous memory. but I question why that much is needed in the first place

http://koji.fedoraproject.org/koji/taskinfo?taskID=15927335 is an example task

Comment 1 Colin Walters 2016-10-03 20:01:33 UTC
Could use a stack trace; did a core dump make it out into e.g. coredumpctl?

Comment 2 Fedora End Of Life 2017-02-28 10:24:36 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 26 development cycle.
Changing version to '26'.

Comment 3 Dennis Gilmore 2017-04-14 16:46:30 UTC
I have been able to get a coredump

root@cubietruck01 ~]# rpm-ostree compose --repo=repo tree fedora-atomic/fedora-atomic-docker-host.json

(rpm-ostree compose tree:9050): GLib-ERROR **: gmem.c:165: failed to allocate 2147483648 bytes
`trap' para punto de parada/seguimiento (`core' generado)
[root@cubietruck01 ~]# coredumpctl gdb rpm-ostree
           PID: 9050 (rpm-ostree)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 5 (TRAP)
     Timestamp: Sat 2017-03-25 06:09:22 CDT (14s ago)
  Command Line: rpm-ostree compose --repo=repo tree fedora-atomic/fedora-atomic-docker-host.json
    Executable: /usr/bin/rpm-ostree
 Control Group: /user.slice/user-0.slice/session-4621.scope
          Unit: session-4621.scope
         Slice: user-0.slice
       Session: 4621
     Owner UID: 0 (root)
       Boot ID: 61e9f611bf2d4832a71e9c25b2872b69
    Machine ID: 97b27183f83940cc98dc6a38d1697270
      Hostname: cubietruck01.ausil.us
       Storage: /var/lib/systemd/coredump/core.rpm-ostree.0.61e9f611bf2d4832a71e9c25b2872b69.9050.1490440162000000.lz4
       Message: Process 9050 (rpm-ostree) of user 0 dumped core.

                Stack trace of thread 9050:
                #0  0x00000000b68dc7d4 raise (libpthread.so.0)
                #1  0x00000000b69b9790 g_log_default_handler (libglib-2.0.so.0)
                #2  0x00000000b69b9a0c g_logv (libglib-2.0.so.0)
                #3  0x00000000b69b9ba0 g_log (libglib-2.0.so.0)
                #4  0x00000000b69b8410 g_realloc (libglib-2.0.so.0)
                #5  0x00000000b69d58e0 g_string_maybe_expand (libglib-2.0.so.0)
                #6  0x00000000b69d5c48 g_string_insert_len (libglib-2.0.so.0)
                #7  0x000000007f669100 _rpmostree_varsubst_string (rpm-ostree)
                #8  0x000000007f6618c0 rpmostree_compose_builtin_tree (rpm-ostree)
                #9  0x000000007f663c7c rpmostree_builtin_compose (rpm-ostree)
                #10 0x000000007f64da88 main (rpm-ostree)
                #11 0x00000000b67728bc __libc_start_main (libc.so.6)
                #12 0x000000007f64dea0 $a (rpm-ostree)

GNU gdb (GDB) Fedora 7.12.50.20170309-5.fc27
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "armv7hl-redhat-linux-gnueabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/rpm-ostree...Reading symbols from /usr/lib/debug/usr/bin/rpm-ostree-2017.3-4.fc27.arm.debug...done.
done.

warning: exec file is newer than core file.
[New LWP 9050]
[New LWP 9054]
[New LWP 9057]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Core was generated by `rpm-ostree compose --repo=repo tree fedora-atomic/fedora-atomic-docker-host.jso'.
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
#0  raise (sig=5) at ../sysdeps/unix/sysv/linux/raise.c:51
51      }
[Current thread is 1 (Thread 0xb5371800 (LWP 9050))]
Missing separate debuginfos, use: dnf debuginfo-install gpgme-1.8.0-12.fc26.armv7hl gvfs-client-1.31.92-2.fc27.armv7hl gvfs-client-1.32.1-1.fc27.armv7hl libcurl-7.53.1-3.fc27.armv7hl libsolv-0.6.26-4.git.19.2262346.fc27.armv7hl nss-3.29.1-3.fc27.armv7hl nss-softokn-freebl-3.29.1-2.fc26.armv7hl openldap-2.4.44-8.fc26.armv7hl ostree-libs-2017.3-2.fc27.armv7hl rpm-libs-4.13.0.1-13.fc27.armv7hl
(gdb) bt
#0  raise (sig=5) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0xb69b9790 in g_log_default_handler (log_domain=log_domain@entry=0xb6a00408 "GLib", log_level=log_level@entry=6, message=0xb3d004d8 "gmem.c:165: failed to allocate 2147483648 bytes", 
    message@entry=0xb69b9a0c <g_logv+572> "\006\060\005\342\002", unused_data=unused_data@entry=0xb6f85c90 <__stack_chk_guard>) at gmessages.c:3036
#2  0xb69b9a0c in g_logv (log_domain=0xb6a00408 "GLib", log_level=log_level@entry=G_LOG_LEVEL_ERROR, format=format@entry=0xbef1e9e4 "؈\240\266", args=..., args@entry=...) at gmessages.c:1336
#3  0xb69b9ba0 in g_log (log_domain=<optimized out>, log_level=log_level@entry=G_LOG_LEVEL_ERROR, format=0xb6a088ac "%s: failed to allocate %u bytes") at gmessages.c:1398
#4  0xb69b8410 in g_realloc (mem=0x8145ffc0, n_bytes=2147483648) at gmem.c:164
#5  0xb69d58e0 in g_string_maybe_expand (string=0x814604b0, len=<optimized out>) at gstring.c:102
#6  0xb69d5c48 in g_string_insert_len (string=0x814604b0, pos=<optimized out>, val=0x8146009a "/atomic-host", len=<optimized out>) at gstring.c:476
#7  0x7f669100 in _rpmostree_varsubst_string (instr=0xb6f85c90 <__stack_chk_guard> "", instr@entry=0x81460080 "fedora/rawhide/${basearch}/atomic-host", substitutions=0x8141d568, 
    substitutions@entry=0x81406400, error=0xbef1ec30, error@entry=0x814061c8) at src/libpriv/rpmostree-util.c:151
#8  0x7f6618c0 in rpmostree_compose_builtin_tree (argc=<optimized out>, argv=<optimized out>, cancellable=0x81408db0, cancellable@entry=0x81404810, error=0x814061c8, error@entry=0xbef1ec30)
    at src/app/rpmostree-compose-builtin-tree.c:826
#9  0x7f663c7c in rpmostree_builtin_compose (argc=<optimized out>, argv=<optimized out>, cancellable=0x81404810, error=0xbef1ec30) at src/app/rpmostree-builtin-compose.c:139
#10 0x7f64da88 in main (argc=<optimized out>, argv=<optimized out>) at src/app/main.c:327
(gdb)

Comment 4 Dennis Gilmore 2017-04-14 16:49:39 UTC
it is with rpm-ostree-2017.3-4.fc27.armv7hl

Comment 5 Dennis Gilmore 2017-04-14 17:00:24 UTC
possibly related https://github.com/flatpak/flatpak/issues/483

Comment 6 Colin Walters 2017-04-18 14:05:54 UTC
The flatpak issue there is building static deltas, whereas this crash is...in a strange place in rpm-ostree.  BTW, static deltas are an arch-independent process, so it'd make sense to do it on a 64 bit arch like x86_64.

Anyways for this one...it's weird, offhand it looks to me like ${basearch} is somehow 2GB?  I wonder if we're somehow walking off an array in dnf_context_get_base_arch()?

This stack is good enough to do some code reading, but it'd be improved if we recompiled the stack without optimizations.

In particular, if you catch this again, try:

```
gdb --args rpm-ostree ...
start
b rpmostree_context_get_varsubsts
c
fini
p g_hash_table_lookup (varsubsts, "basearch")
```

Comment 7 Dennis Gilmore 2017-04-27 16:46:16 UTC
I build rpm-ostree with -O0

gdb --args rpm-ostree compose --repo=repo tree fedora-atomic/fedora-atomic-docker-host.json --force-nocache
GNU gdb (GDB) Fedora 7.12.50.20170309-5.fc27
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "armv7hl-redhat-linux-gnueabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from rpm-ostree...Reading symbols from /usr/lib/debug/usr/bin/rpm-ostree-2017.4-2.fc27.arm.debug...done.
done.
(gdb) start
Temporary breakpoint 1 at 0x11768: file src/app/main.c, line 280.
Starting program: /usr/bin/rpm-ostree compose --repo=repo tree fedora-atomic/fedora-atomic-docker-host.json --force-nocache
Cannot parse expression `.L1017 4@r4'.
warning: Probes-based dynamic linker interface failed.
Reverting to original interface.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".

Temporary breakpoint 1, main (argc=6, argv=0xbe9b3c64) at src/app/main.c:280
280	{
Missing separate debuginfos, use: dnf debuginfo-install libarchive-3.3.1-1.fc27.armv7hl libcurl-7.54.0-1.fc27.armv7hl ostree-libs-2017.5-2.fc27.armv7hl rpm-libs-4.13.0.1-19.fc27.armv7hl
(gdb) b rpmostree_context_get_varsubsts
Breakpoint 2 at 0x7f637d8c: rpmostree_context_get_varsubsts. (2 locations)
(gdb) c
Continuing.
Detaching after fork from child process 13547.
[New Thread 0xb4fc1310 (LWP 13550)]
[New Thread 0xb45ff310 (LWP 13551)]

Thread 1 "rpm-ostree" hit Breakpoint 2, rpmostree_context_get_varsubsts (context=0x7f6f0408 [RpmOstreeContext]) at src/libpriv/rpmostree-core.c:509
509	  GHashTable *r = g_hash_table_new_full (g_str_hash, g_str_equal, g_free, g_free);
Missing separate debuginfos, use: dnf debuginfo-install gvfs-client-1.31.92-2.fc27.armv7hl gvfs-client-1.32.1-1.fc27.armv7hl
(gdb) fini
Run till exit from #0  rpmostree_context_get_varsubsts (context=0x7f6f0408 [RpmOstreeContext]) at src/libpriv/rpmostree-core.c:509
0x7f625740 in rpmostree_compose_builtin_tree (argc=2, argv=0xbe9b3c64, invocation=0xbe9b3a6c, cancellable=0x7f6e8810 [GCancellable], error=0xbe9b3acc) at src/app/rpmostree-compose-builtin-tree.c:801
801	  varsubsts = rpmostree_context_get_varsubsts (corectx);
Value returned is $1 = 0x7f6fefa0 = {[0x7f723940 "basearch"] = 0x7f7256f0}
(gdb) p g_hash_table_lookup (varsubsts, "basearch")

(/usr/bin/rpm-ostree compose tree:13543): GLib-CRITICAL **: g_hash_table_lookup: assertion 'hash_table != NULL' failed
$2 = (void *) 0x0
(gdb)

Comment 8 Colin Walters 2017-04-27 17:42:09 UTC
We debugged this a bit more.

The value of basearch is fine, it's "armhfp".

(gdb) p instr
$1 = 0x80832320 "fedora/rawhide/${basearch}/atomic-host"

Also looks fine.  So...I'm not sure yet.  I wrote a PR to add some more tests for this:

https://github.com/projectatomic/rpm-ostree/pull/755

Comment 9 Jonathan Lebon 2017-04-27 20:43:27 UTC
We have a potential fix for this:
https://github.com/projectatomic/rpm-ostree/pull/756

We can backport it and see if that solves it.

Comment 10 Dennis Gilmore 2017-04-28 01:04:25 UTC
Thread 1 "rpm-ostree" hit Breakpoint 2, _rpmostree_varsubst_string (instr=0x81427778 "fedora/rawhide/${basearch}/atomic-host", substitutions=0x813e45a0 = {...}, error=0xbe81718c) at src/libpriv/rpmostree-util.c:105
105	{
(gdb) p instr
$1 = 0x81427778 "fedora/rawhide/${basearch}/atomic-host"
(gdb) n
109	  g_autoptr(GString) varnamebuf = g_string_new ("");
(gdb) n
110	  g_autoptr(GString) result = g_string_new ("");
(gdb) n
112	  s = instr;
(gdb) n
113	  while ((p = strstr (s, "${")) != NULL)
(gdb) n
115	      const char *varstart = p + 2;
(gdb) n
116	      const char *varend = strchr (varstart, '}');
(gdb) n
118	      if (!varend)
(gdb) n
127	      g_string_append_len (result, s, p - s);
(gdb) n
130	      g_string_truncate (varnamebuf, 0);
(gdb) n
131	      g_string_append_len (varnamebuf, varstart, varend - varstart);
(gdb) n
133	      value = g_hash_table_lookup (substitutions, varnamebuf->str);
(gdb) n
134	      if (!value)
(gdb) n
142	      g_string_append (result, value);
(gdb) n
145	      s = varend+1;
(gdb) n
113	  while ((p = strstr (s, "${")) != NULL)
(gdb) n
148	  if (s != instr)
(gdb) n
151	      g_string_append_len (result, s, p - s);
(gdb) n

(/usr/bin/rpm-ostree compose tree:21949): GLib-ERROR **: gmem.c:165: failed to allocate 2147483648 bytes

Thread 1 "rpm-ostree" received signal SIGTRAP, Trace/breakpoint trap.
raise (sig=5) at ../sysdeps/unix/sysv/linux/raise.c:51
51	}
(gdb) 


I will do a build with the potential fix and do some testing

Comment 11 Dennis Gilmore 2017-04-28 12:45:21 UTC
I have built a rpm-ostree with the patches from https://github.com/projectatomic/rpm-ostree/pull/756 and https://github.com/projectatomic/rpm-ostree/pull/755 applied and so far I have ran rpm-ostree three times and not hit this, previously if one ran happened to work the next was failing. So it looks good in a quick test. Can we get builds in rawhide for it please?

Comment 12 Fedora Update System 2017-04-28 15:32:35 UTC
rpm-ostree-2017.5-2.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2017-2fa9ff86ba

Comment 13 Fedora Update System 2017-04-28 15:32:44 UTC
rpm-ostree-2017.5-2.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-c5b64407b6

Comment 14 Fedora Update System 2017-04-30 02:26:22 UTC
rpm-ostree-2017.5-2.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-2fa9ff86ba

Comment 15 Fedora Update System 2017-04-30 03:52:41 UTC
rpm-ostree-2017.5-2.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-c5b64407b6

Comment 16 Fedora Update System 2017-05-02 02:02:16 UTC
rpm-ostree-2017.5-2.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

Comment 17 Fedora Update System 2017-05-04 13:30:13 UTC
rpm-ostree-2017.5-2.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.