Bug 1570577
| Summary: | The runtime imageFs value is wrong in node stats/summary with devicemapper storage | ||||||
|---|---|---|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | DeShuai Ma <dma> | ||||
| Component: | Node | Assignee: | Robert Krawitz <rkrawitz> | ||||
| Status: | CLOSED ERRATA | QA Contact: | DeShuai Ma <dma> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | high | ||||||
| Version: | 3.10.0 | CC: | aos-bugs, dma, jokerman, mmccomas, sjenning, weinliu, wmeng | ||||
| Target Milestone: | --- | Flags: | dma:
needinfo-
|
||||
| Target Release: | 3.10.0 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | No Doc Update | |||||
| Doc Text: |
undefined
|
Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2018-07-30 19:13:42 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
|
||||||
|
Description
DeShuai Ma
2018-04-23 09:30:36 UTC
Robert, could you take a look at this? Noting that capacityBytes in the stats (19316867072) closely matches the size of the root volume -- that number of bytes is 17.99 GiB. Could you include output of "df" and "df -i" on ip-172-18-3-124.ec2.internal? Progress update: attempting to reproduce with 3.10, using oc cluster up with lvm storage. Unable to reproduce with master, as cluster will not come up. Cluster comes up with 3.10.0 alpha-0, but storage information is correct:
"fs": {
"time": "2018-04-30T19:34:42Z",
"availableBytes": 12419698688,
"capacityBytes": 21003583488,
"usedBytes": 7493365760,
"inodesFree": 1107054,
"inodes": 1310720,
"inodesUsed": 203666
},
"runtime": {
"imageFs": {
"time": "2018-04-30T19:34:42Z",
"availableBytes": 19358810112,
"capacityBytes": 24469569536,
"usedBytes": 10023672506
}
}
Currently bisecting to try to determine either where this fails or am no longer able to bring up the cluster.
(In reply to Seth Jennings from comment #1) > Robert, could you take a look at this? http://pastebin.test.redhat.com/584960 Retest on v3.10.0-0.31.0. oc v3.10.0-0.31.0 kubernetes v1.10.0+b81c8f8 features: Basic-Auth GSSAPI Kerberos SPNEGO Server https://ip-172-18-10-141.ec2.internal:8443 openshift v3.10.0-0.31.0 kubernetes v1.10.0+b81c8f8 Unable to bring up a cluster with the current oc cluster up. Could you provide the sequence and configuration that you used to bring your cluster up? I can also see this:
$ oc get --raw=/api/v1/nodes/node1.lab.variantweb.net/proxy/stats/summary | grep -A 20 fs
"fs": {
"time": "2018-05-02T19:32:24Z",
"availableBytes": 15577264128,
"capacityBytes": 21463281664,
"usedBytes": 5886017536,
"inodesFree": 10443544,
"inodes": 10485232,
"inodesUsed": 41688
},
"runtime": {
"imageFs": {
"time": "2018-05-02T19:32:24Z",
"availableBytes": 15577264128,
"capacityBytes": 21463281664,
"usedBytes": 3244701384,
"inodesFree": 10443544,
"inodes": 10485232,
"inodesUsed": 41688
}
},
on node1:
[root@node1 ~]# df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 20G 5.5G 15G 28% /
5.5G matches usedBytes in fs
# docker info | grep Data
WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Data file: /dev/loop0
Data Space Used: 3.262 GB
Data Space Total: 107.4 GB
Data Space Available: 15.58 GB
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
3.262 GB is somewhat close to imageFs usedBytes.
All the other fields between fs and runtime.imageFs are the same.
Please be sure to include the commands you used to set up the storage, /etc/docker/daemon.json, and /etc/sysconfig/docker. Thanks! I recreated locally with the following steps: - deploy 3.10 cluster with openshift-ansible - chose a compute node for testing - oc adm drain --ignore-daemonsets <node> - ssh into <node> - systemctl stop atomic-openshift-node - systemctl stop docker - rm -rf /var/lib/docker/* - edit /etc/sysconfig/docker-storage with s/overlay2/devicemapper - systemctl start docker - "docker info | grep Storage" makes sure devicemapper is in use - systemctl start atomic-openshift-node - exit node - oc adm uncordon <node> - oc get --raw=/api/v1/nodes/<node>/proxy/stats/summary | grep -A 20 fs In 3.9:
Before any image pull to DM storage:
"fs": {
"time": "2018-05-02T20:46:08Z",
"availableBytes": 19392348160,
"capacityBytes": 21463281664,
"usedBytes": 2070933504,
"inodesFree": 10451574,
"inodes": 10485232,
"inodesUsed": 33658
},
"runtime": {
"imageFs": {
"time": "2018-05-02T20:46:08Z",
"availableBytes": 17063477248,
"capacityBytes": 17083400192,
"usedBytes": 0
}
}
After image pull:
"fs": {
"time": "2018-05-02T20:48:10Z",
"availableBytes": 19382808576,
"capacityBytes": 21463281664,
"usedBytes": 2080473088,
"inodesFree": 10451530,
"inodes": 10485232,
"inodesUsed": 33702
},
"runtime": {
"imageFs": {
"time": "2018-05-02T20:48:10Z",
"availableBytes": 15486418944,
"capacityBytes": 17083400192,
"usedBytes": 1420732513
}
}
# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/openshift/origin latest ad6bb5bd6b73 2 hours ago 1.42 GB
# docker info 2>/dev/null | grep "Data Space"
Data Space Used: 1.597 GB
Data Space Total: 17.08 GB
Data Space Available: 15.49 GB
usedBytes is matching up with the image size. inodes fields are not included. availableBytes matches "Data Space Available" and capacityBytes matches "Data Space Total"
So there is a change in behavior (i.e. bug) here for 3.10
Already fixed in upstream cadvisor but not updated in kube 1.10 so we inherited the bug :-/ Origin PR: https://github.com/openshift/origin/pull/19601 Verify on v3.10.0-0.46.0
[root@ip-172-18-0-165 ~]# oc get --raw /api/v1/nodes/ip-172-18-14-141.ec2.internal/proxy/stats/summary |grep -w -A 16 fs
"fs": {
"time": "2018-05-16T07:43:18Z",
"availableBytes": 16132792320,
"capacityBytes": 19316867072,
"usedBytes": 3184074752,
"inodesFree": 9389563,
"inodes": 9437184,
"inodesUsed": 47621
},
"runtime": {
"imageFs": {
"time": "2018-05-16T07:43:18Z",
"availableBytes": 8842117120,
"capacityBytes": 13098811392,
"usedBytes": 6847236375
}
},
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:1816 |