| Summary: | Performance optimization of docker devicemapper graph driver | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Jeremy Eder <jeder> |
| Component: | docker | Assignee: | Vivek Goyal <vgoyal> |
| Status: | CLOSED WONTFIX | QA Contact: | atomic-bugs <atomic-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 7.3 | CC: | acarter, agk, dwalsh, lsm5, mcsontos, ndordet, twaugh, vgoyal, zkabelac |
| Target Milestone: | rc | Keywords: | Extras |
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-04-13 21:30:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Comment 1
Vivek Goyal
2016-04-26 18:47:56 UTC
Or, is there any chance that libdevmapper can be made thread safe, so that multiple docker threads can call into the library in parallel instead of having to serialize all the operations? I am not sure why OpenShift would trust a brand new backend rather than OverlayFS, even if we could quickly build an LVM backend. But it would be good to know whether LVM has better performance.

Basically, you do not want to serialize all operations into a single thread processing stack; you want to keep processing operations as independent commands. And yes, it would be worth trying to use lvm2 commands as a direct replacement for docker's device manipulation code. Try the new dm_udev_wait_immediate, which lets you wait for udev outside the library without holding the library mutex.

https://www.redhat.com/archives/lvm-devel/2016-April/msg00145.html
https://git.fedorahosted.org/cgit/lvm2.git/patch/?id=16019b518e287da19c87eb64229f5c3ca057cb05

I analysed the out.csv data and found the following:
*lookupDevice*
2. took
<type 'float'>
Nulls: False
Min: 1.630804
Max: 11622.047727
Sum: 193923.888232
Mean: 712.955471441
Median: 289.74097
Standard Deviation: 1261.40217467
Unique values: 272
Row count: 272
================================================================================
*lookupDeviceWithLock*
================================================================================
2. took
<type 'float'>
Nulls: False
Min: 1.630804
Max: 11622.047727
Sum: 193914.978846
Mean: 718.203625356
Median: 291.2552555
Standard Deviation: 1264.58521366
Unique values: 270
Row count: 270
lookupDeviceWithLock calls lookupDevice, and waiting for the lock takes almost no time at all (ignoring those 2 extra calls to lookupDevice from elsewhere). So those 3 minutes are spent in lookupDevice, which does nothing but call loadMetadata:
func (devices *DeviceSet) loadMetadata(hash string) *devInfo {
	info := &devInfo{Hash: hash, devices: devices}

	jsonData, err := ioutil.ReadFile(devices.metadataFile(info))
	if err != nil {
		return nil
	}

	if err := json.Unmarshal(jsonData, &info); err != nil {
		return nil
	}

	if info.DeviceID > maxDeviceID {
		logrus.Errorf("Ignoring Invalid DeviceId=%d", info.DeviceID)
		return nil
	}

	return info
}
IIUC, looking at this function, lookupDevice is spending those 3 minutes reading and parsing JSON files,
and there is nothing LVM/DM could do to speed that up.
There is a cache (devices.Devices) in place, and I fail to understand why it does not help here.
Jeremy, Vivek, was the source file instrumented incorrectly, and is the csv just plain wrong? Or is there something I am missing?
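For illustration, the cache-first behaviour one would expect the devices.Devices map to provide looks roughly like this. This is a simplified sketch with stand-in types, not Docker's actual code; the real DeviceSet carries more state and its locking is more involved:

```go
package main

import (
	"fmt"
	"sync"
)

// devInfo and DeviceSet are simplified stand-ins for the Docker
// devicemapper types, used only to show the cache-first pattern.
type devInfo struct{ Hash string }

type DeviceSet struct {
	sync.Mutex
	Devices map[string]*devInfo // cache keyed by device hash
	loads   int                 // counts slow-path metadata loads
}

// loadMetadata stands in for the real function, which reads and
// json.Unmarshals the per-device metadata file.
func (d *DeviceSet) loadMetadata(hash string) *devInfo {
	d.loads++
	return &devInfo{Hash: hash}
}

// lookupDevice consults the cache first; loadMetadata (and its file
// I/O plus JSON parsing) should then run at most once per hash.
func (d *DeviceSet) lookupDevice(hash string) *devInfo {
	d.Lock()
	defer d.Unlock()
	if info, ok := d.Devices[hash]; ok {
		return info // cache hit: no file I/O, no JSON parsing
	}
	info := d.loadMetadata(hash)
	if info != nil {
		d.Devices[hash] = info
	}
	return info
}

func main() {
	ds := &DeviceSet{Devices: map[string]*devInfo{}}
	ds.lookupDevice("abc")
	ds.lookupDevice("abc")
	fmt.Println("loads:", ds.loads) // prints "loads: 1"
}
```

If the instrumented runs show loadMetadata being hit on every call despite such a cache, that would point to either the cache not being populated or the instrumentation measuring something else.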
Everything would be easier and faster to resolve if Go had a proper profiling tool. :-/
Maybe the tests were done on a disk which was slow. Jeremy, any chance that these tests were done on an AWS instance? I have often seen that the additional disk there can be very slow sometimes.

(In reply to Vivek Goyal from comment #10)
> Maybe the tests were done on a disk which was slow. Jeremy, any chance that
> these tests were done on an AWS instance? I have often seen that the
> additional disk there can be very slow *sometimes*.

I hope not. Does it make any sense to test performance on a platform where performance varies only sometimes?

Not EC2. Done in a KVM guest. Backing storage is 6x300GB SAS 10k RAID6.

I am not a performance expert, but using VMs to test performance just does not sound like a good idea to me; there are way too many layers to consider. And we still do not know where the bottleneck is: is it CPU, I/O, or memory bound?

If it is absolutely essential to use VMs, e.g. because the infrastructure is built on them, I have a few easy optimizations to try. IIUC both the worker VMs and the built images should be reproducible/throwaway, so we can optimize for speed and ignore data safety:

- First, of course, use "writeback" caching on the VM disks. This greatly improves file locking speed, as a file does not have to be written through all the layers down to the slow HDDs.
- Using the noop scheduler in the VMs and deadline on the host brought the best results for my test VMs.
- Provide enough^TM memory to the VMs, no swap inside them, and let the host do the swapping. (This may not be efficient in the "cloud". But does anyone expect consistent performance from a single VM in the cloud? Want more? Scale out! It rhymes with /.c.l..out/ after all.)

What's the backing storage for the VM images? Filesystem or LV? Raw, sparse, compressed? I suggest using linear LVs, as these have less overhead than anything on top of a FS.

Using RAID6 seems like a waste; I suggest no redundancy, maybe RAID0. RAID6 is also not good for thin-pool metadata.
We recommend linear or RAID1 volumes on the fastest available storage, ideally on a different disk than the data disk(s). I suggest using one of the disks for metadata volumes and the rest for data in RAID0.

Also, I would absolutely love to see performance on real data. Even though DM cannot share buffers, I have a hunch OverlayFS will have significant overhead too, as there will not be that much sharing in an image build service.

Any update on this, Vivek?