Bug 1661443 - dockerd may leak memory resources if uncompressing a layer fails
Summary: dockerd may leak memory resources if uncompressing a layer fails
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: docker
Version: 7.6
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Daniel Walsh
QA Contact: atomic-bugs@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1186913 1647729
 
Reported: 2018-12-21 09:22 UTC by Sergio Lopez
Modified: 2023-09-14 04:44 UTC
CC: 9 users

Fixed In Version: docker-1.13.1-93.gitb2f74b2.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-03-13 01:52:23 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links:
  Red Hat Product Errata RHSA-2019:0487 (last updated 2019-03-13 01:52:49 UTC)

Description Sergio Lopez 2018-12-21 09:22:59 UTC
Description of problem:

This is the analysis of a core dump from a dockerd instance that was using more than 20 GB of RAM:

 - While the types using the largest amounts of memory aren't known to the tooling (they show up as "unk*" entries in the viewcore histogram), the object counts give us a hint about what they are:

# grep 855 viewcore-histogram.txt
    count     size      bytes type
      855 10027008 8573091840 unk10027008
      855   663552  567336960 unk663552
      855    73728   63037440 unk73728
      855    65536   56033280 unk65536
      855     4864    4158720 compress/flate.decompressor
      855     2688    2298240 [316]int
      855     1536    1313280 [1536]uint8
      855      704     601920 compress/gzip.Reader
      855      640     547200 github.com/docker/docker/vendor/github.com/vbatts/tar-split/archive/tar.Reader
      855      208     177840 github.com/docker/docker/vendor/github.com/vbatts/tar-split/archive/tar.Header
      855      160     136800 [19]int
      855      128     109440 crypto/sha256.digest
      855       96      82080 github.com/docker/docker/pkg/progress.Reader
      855       96      82080 hchan<github.com/docker/docker/pkg/progress.Progress>
      855       80      68400 github.com/docker/docker/vendor/golang.org/x/time/rate.Limiter
      855       32      27360 github.com/docker/docker/vendor/github.com/vbatts/tar-split/archive/tar.regFileReader
      855       32      27360 github.com/docker/docker/pkg/ioutils.cancelReadCloser
      855       32      27360 github.com/docker/docker/pkg/ioutils.readCloserWrapper
      855        8       6840 io.PipeReader
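
The largest entry alone accounts for most of the leaked memory: 855 objects × 10,027,008 bytes ≈ 8.6 GB. Together with the matching counts of compress/flate.decompressor, compress/gzip.Reader, and the tar-split readers, this suggests 855 layer-decompression pipelines that were set up but never torn down.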


 - We also have the same number of goroutines stuck in tar.regFileReader:

# grep regFileReader dockerd.txt |wc -l
855

 - These goroutines are blocked in a pipe write, waiting for someone to read from (or close) the other side of the pipe:

(dlv) goroutine 4778918
Switched from 0 to 4778918 (thread 2151)
(dlv) bt
 0  0x000000000043d3dc in runtime.gopark
    at /usr/lib/golang/src/runtime/proc.go:288
 1  0x000000000043d4ce in runtime.goparkunlock
    at /usr/lib/golang/src/runtime/proc.go:293
 2  0x00000000004501f0 in sync.runtime_notifyListWait
    at /usr/lib/golang/src/runtime/sema.go:507
 3  0x0000000000480df0 in sync.(*Cond).Wait
    at /usr/lib/golang/src/sync/cond.go:56
 4  0x0000000000485faa in io.(*pipe).write
    at /usr/lib/golang/src/io/pipe.go:90
 5  0x00000000004864ec in io.(*PipeWriter).Write
    at /usr/lib/golang/src/io/pipe.go:157
 6  0x00000000004851e1 in io.(*teeReader).Read
    at /usr/lib/golang/src/io/io.go:527
 7  0x0000000000892f34 in github.com/docker/docker/vendor/github.com/vbatts/tar-split/archive/tar.(*regFileReader).Read
    at /usr/src/debug/docker-8633870faeff2c03dd6a75fdef9c2976379339c6/_build/src/github.com/docker/docker/vendor/github.com/vbatts/tar-split/archive/tar/reader.go:960
 8  0x0000000000892d7c in github.com/docker/docker/vendor/github.com/vbatts/tar-split/archive/tar.(*Reader).Read
    at /usr/src/debug/docker-8633870faeff2c03dd6a75fdef9c2976379339c6/_build/src/github.com/docker/docker/vendor/github.com/vbatts/tar-split/archive/tar/reader.go:945
 9  0x0000000000484ab3 in io.copyBuffer
    at /usr/lib/golang/src/io/io.go:392
10  0x0000000000484858 in io.Copy
    at /usr/lib/golang/src/io/io.go:362
11  0x0000000000863dd5 in github.com/docker/docker/vendor/github.com/vbatts/tar-split/tar/storage.(*bitBucketFilePutter).Put
    at /usr/src/debug/docker-8633870faeff2c03dd6a75fdef9c2976379339c6/_build/src/github.com/docker/docker/vendor/github.com/vbatts/tar-split/tar/storage/getter.go:99
12  0x0000000000895975 in github.com/docker/docker/vendor/github.com/vbatts/tar-split/tar/asm.NewInputTarStream.func1
    at /usr/src/debug/docker-8633870faeff2c03dd6a75fdef9c2976379339c6/_build/src/github.com/docker/docker/vendor/github.com/vbatts/tar-split/tar/asm/disassemble.go:87
13  0x000000000046cc21 in runtime.goexit
    at /usr/lib/golang/src/runtime/asm_amd64.s:2337
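
To illustrate the mechanism: the backtrace shows the goroutine spawned by NewInputTarStream parked in io.(*pipe).write, and a write to an io.Pipe only returns once the read side consumes the data or the pipe is closed. A minimal, self-contained sketch of that behaviour (illustrative only, not taken from dockerd):

package main

import (
	"fmt"
	"io"
	"time"
)

func main() {
	pr, pw := io.Pipe()

	// Plays the role of the tar-split disassembler goroutine: it keeps
	// writing into the pipe and blocks until someone reads or closes it.
	go func() {
		buf := make([]byte, 32*1024)
		for {
			if _, err := pw.Write(buf); err != nil {
				fmt.Println("writer finished:", err)
				return
			}
		}
	}()

	// Nobody reads from pr (as on the error path in applyTar), so the
	// writer goroutine stays parked in io.(*pipe).write indefinitely.
	// Closing the read side is what unblocks it:
	time.Sleep(time.Second)
	pr.CloseWithError(io.ErrClosedPipe)
	time.Sleep(100 * time.Millisecond)
}

In dockerd nothing drains or closes rdr on the ApplyDiff error path, so every failed layer extraction leaves one such goroutine behind, pinning the gzip and tar-split buffers it references.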

 - This is where the bug is located:

layer/layer_store.go:
    206 func (ls *layerStore) applyTar(tx MetadataTransaction, ts io.Reader, parent string, layer *roLayer) error {
    207         digester := digest.Canonical.New()
    208         tr := io.TeeReader(ts, digester.Hash())
    209 
    210         tsw, err := tx.TarSplitWriter(true)
    211         if err != nil {
    212                 return err
    213         }
    214         metaPacker := storage.NewJSONPacker(tsw)
    215         defer tsw.Close()
    216 
    217         // we're passing nil here for the file putter, because the ApplyDiff will
    218         // handle the extraction of the archive
    219         rdr, err := asm.NewInputTarStream(tr, metaPacker, nil)
    220         if err != nil {
    221                 return err
    222         }
    223 
    224         applySize, err := ls.driver.ApplyDiff(layer.cacheID, parent, rdr)
    225         if err != nil {

--> If ApplyDiff returns an error, the io.Copy below is never reached, and the resources created by the call to NewInputTarStream are never released.

    226                 return err
    227         }
    228 
    229         // Discard trailing data but ensure metadata is picked up to reconstruct stream
    230         io.Copy(ioutil.Discard, rdr) // ignore error as reader may be closed
    231 
    232         layer.size = applySize
    233         layer.diffID = DiffID(digester.Digest())
    234 
    235         logrus.Debugf("Applied tar %s to %s, size: %d", layer.diffID, layer.cacheID, applySize)
    236 
    237         return nil
    238 }
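
One way to plug the leak (a sketch of the idea; the merged upstream PR may differ in detail) is to drain rdr unconditionally, before the ApplyDiff error is acted upon, so the goroutine started by NewInputTarStream always runs to completion:

	applySize, err := ls.driver.ApplyDiff(layer.cacheID, parent, rdr)

	// Discard trailing data but ensure metadata is picked up to reconstruct
	// the stream. Doing this before checking err guarantees the goroutine
	// spawned by NewInputTarStream sees the end of the stream and exits,
	// even when ApplyDiff fails.
	io.Copy(ioutil.Discard, rdr) // ignore error as reader may be closed

	if err != nil {
		return err
	}

An equivalent alternative would be a deferred drain of rdr right after NewInputTarStream succeeds; either way, every exit path unblocks the goroutine writing into the pipe.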


Version-Release number of selected component (if applicable):

All, including upstream (moby).


How reproducible:

Always.


Steps to Reproduce:

I'm using dlv to simulate the storage issue:

[root@localhost binary-daemon]# dlv exec ./dockerd-dev
Type 'help' for list of commands.
(dlv) break applyTar
Breakpoint 1 set at 0xee94da for github.com/docker/docker/layer.(*layerStore).applyTar() /go/src/github.com/docker/docker/layer/layer_store.go:234
(dlv) c

(another terminal)

[root@localhost ~]# docker run --rm -ti busybox /bin/sh
Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox
90e01955edcd: Extracting    728kB/728kB

(back to dlv)

> github.com/docker/docker/layer.(*layerStore).applyTar() /go/src/github.com/docker/docker/layer/layer_store.go:234 (hits goroutine(142):1 total:1) (PC: 0xee94da)
Warning: debugging optimized function
   229:		ls.mounts[ml.name] = ml
   230:	
   231:		return nil
   232:	}
   233:	
=> 234:	func (ls *layerStore) applyTar(tx *fileMetadataTransaction, ts io.Reader, parent string, layer *roLayer) error {
   235:		digester := digest.Canonical.Digester()
   236:		tr := io.TeeReader(ts, digester.Hash())
   237:	
   238:		rdr := tr
   239:		if ls.useTarSplit {
(dlv) n
(...)
> github.com/docker/docker/layer.(*layerStore).applyTar() /go/src/github.com/docker/docker/layer/layer_store.go:255 (PC: 0xee95d6)
Warning: debugging optimized function
   250:			if err != nil {
   251:				return err
   252:			}
   253:		}
   254:	
=> 255:		applySize, err := ls.driver.ApplyDiff(layer.cacheID, parent, rdr)
   256:		if err != nil {
   257:			return err
   258:		}
   259:	
   260:		// Discard trailing data but ensure metadata is picked up to reconstruct stream
(dlv) p layer.cacheID
"b4e6ddae7f5cace7bc165d0bdfe3efcc1650eb61d335806ea293e61391910b49"

(yet another terminal)

[root@localhost moby]# # Make sure extracting the tar will fail...
[root@localhost moby]# mount -t tmpfs -o ro tmpfs /var/lib/docker/overlay2/b4e6ddae7f5cace7bc165d0bdfe3efcc1650eb61d335806ea293e61391910b49

(back to dlv)

(dlv) c
INFO[2018-12-21T09:56:56.454866823+01:00] Attempting next endpoint for pull after error: failed to register layer: mkdir /var/lib/docker/overlay2/b4e6ddae7f5cace7bc165d0bdfe3efcc1650eb61d335806ea293e61391910b49/diff: read-only file system 
(dlv) goroutine 196
Switched from 0 to 196 (thread 15923)
(dlv) bt
 0  0x000000000042f821 in runtime.gopark
    at /usr/local/go/src/runtime/proc.go:303
 1  0x000000000043fb3c in runtime.selectgo
    at /usr/local/go/src/runtime/select.go:313
 2  0x000000000046a78e in io.(*pipe).Write
    at /usr/local/go/src/io/pipe.go:87
 3  0x000000000046ac3e in io.(*PipeWriter).Write
    at /usr/local/go/src/io/pipe.go:153
 4  0x00000000004695c3 in io.(*teeReader).Read
    at /usr/local/go/src/io/io.go:537
 5  0x00000000004688ba in io.ReadAtLeast
    at /usr/local/go/src/io/io.go:310
 6  0x0000000000468a5a in io.ReadFull
    at /usr/local/go/src/io/io.go:329
 7  0x0000000000edc99e in github.com/docker/docker/vendor/github.com/vbatts/tar-split/archive/tar.(*Reader).readHeader
    at /go/src/github.com/docker/docker/vendor/github.com/vbatts/tar-split/archive/tar/reader.go:389
 8  0x0000000000edab02 in github.com/docker/docker/vendor/github.com/vbatts/tar-split/archive/tar.(*Reader).next
    at /go/src/github.com/docker/docker/vendor/github.com/vbatts/tar-split/archive/tar/reader.go:111
 9  0x0000000000eda894 in github.com/docker/docker/vendor/github.com/vbatts/tar-split/archive/tar.(*Reader).Next
    at /go/src/github.com/docker/docker/vendor/github.com/vbatts/tar-split/archive/tar/reader.go:74
10  0x0000000000ee3c6e in github.com/docker/docker/vendor/github.com/vbatts/tar-split/tar/asm.NewInputTarStream.func1
    at /go/src/github.com/docker/docker/vendor/github.com/vbatts/tar-split/tar/asm/disassemble.go:49
11  0x000000000045e561 in runtime.goexit
    at /usr/local/go/src/runtime/asm_amd64.s:1333

Without the patch, we get a stuck goroutine as shown above. With it, we get the same error (as expected), but without the stuck goroutine.


Additional info:

I've created a PR for upstream (moby):

 - https://github.com/moby/moby/pull/38413

If it's accepted, I'll create another PR for downstream (projectatomic).

Comment 2 Sergio Lopez 2018-12-24 08:16:03 UTC
The PR for upstream (moby) has been merged.

I've just created a PR for the backport downstream (projectatomic):

 - https://github.com/projectatomic/docker/pull/335

Comment 4 Daniel Walsh 2019-01-07 13:09:14 UTC
Lokesh and Franticek, can we get out a new version of docker?

Comment 11 errata-xmlrpc 2019-03-13 01:52:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:0487

Comment 12 Red Hat Bugzilla 2023-09-14 04:44:10 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

