Bug 1733235
Summary: | Installed worker nodes/machines have different amounts of memory | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Andrew McDermott <amcdermo> |
Component: | Cloud Compute | Assignee: | Andrew McDermott <amcdermo> |
Status: | CLOSED NOTABUG | QA Contact: | Jianwei Hou <jhou> |
Severity: | low | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.2.0 | CC: | agarcial, dhardie, gblomqui, jchaloup, mdhanve, rkrawitz |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | 4.3.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-11-06 13:05:33 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1731011 |
Description
Andrew McDermott
2019-07-25 13:40:58 UTC
Note: when a node/machine reports a differing amount of memory, the actual amount reported is derived from /proc/meminfo. If you ssh to each worker node you can see that machines in the same machineset can report different values in /proc/meminfo.

Andrew, if you do 'oc adm node-logs' (or dmesg on the node), take a look for the BIOS-provided memory map. Is the memory map from the BIOS identical across nodes reporting different amounts of memory?

(In reply to Robert Krawitz from comment #2)

Right now I have:

$ oc get nodes -o json | jq '.items[].status.capacity["memory"]' | sort | uniq -c
      3 "16420436Ki"
      1 "8162892Ki"
      2 "8162900Ki"

$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-139-28.us-east-2.compute.internal    Ready    master   6h26m   v1.14.0+bd34733a7
ip-10-0-143-189.us-east-2.compute.internal   Ready    worker   109m    v1.14.0+bd34733a7
ip-10-0-155-36.us-east-2.compute.internal    Ready    worker   109m    v1.14.0+bd34733a7
ip-10-0-157-15.us-east-2.compute.internal    Ready    master   6h26m   v1.14.0+bd34733a7
ip-10-0-167-148.us-east-2.compute.internal   Ready    worker   7m2s    v1.14.0+bd34733a7
ip-10-0-170-40.us-east-2.compute.internal    Ready    master   6h26m   v1.14.0+bd34733a7

$ ssh ip-10-0-143-189.us-east-2.compute.internal dmesg |grep Mem
[ 0.000000] Memory: 3937920K/8388212K available (12292K kernel code, 2101K rwdata, 3816K rodata, 2356K init, 3320K bss, 289296K reserved, 0K cma-reserved)
[ 0.201067] x86/mm: Memory block size: 128MB

$ ssh ip-10-0-143-189.us-east-2.compute.internal cat /proc/meminfo | egrep 'Mem[Total|Free|Available]'
MemTotal:     8162900 kB
MemFree:      4470312 kB
MemAvailable: 7003856 kB

$ ssh ip-10-0-155-36.us-east-2.compute.internal dmesg |grep Mem
[ 0.000000] Memory: 3937920K/8388212K available (12292K kernel code, 2101K rwdata, 3816K rodata, 2356K init, 3320K bss, 289296K reserved, 0K cma-reserved)
[ 0.237069] x86/mm: Memory block size: 128MB

$ ssh ip-10-0-155-36.us-east-2.compute.internal cat /proc/meminfo | egrep 'Mem[Total|Free|Available]'
MemTotal:     8162900 kB
MemFree:      3576548 kB
MemAvailable: 6200228 kB

$ ssh ip-10-0-167-148.us-east-2.compute.internal dmesg |grep Mem
[ 0.000000] Memory: 3908240K/8388212K available (12292K kernel code, 2101K rwdata, 3816K rodata, 2356K init, 3320K bss, 289304K reserved, 0K cma-reserved)
[ 0.206061] x86/mm: Memory block size: 128MB

$ ssh ip-10-0-167-148.us-east-2.compute.internal cat /proc/meminfo | egrep 'Mem[Total|Free|Available]'
MemTotal:     8162892 kB
MemFree:      4883292 kB
MemAvailable: 6526712 kB

Note the last node has 8162892 whereas the other two worker nodes both have 8162900. And the output from dmesg shows:

- Memory: 3937920K/8388212K
- Memory: 3937920K/8388212K
- Memory: 3908240K/8388212K

Also, if you look at https://bugzilla.redhat.com/show_bug.cgi?id=1731011#c3 you can get to a point where they are all equal, irrespective (it seems) of the availability zone.

I want more detailed data than that, the entire memory map.
Something like this:

[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009cfff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009d000-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000030f42fff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000030f43000-0x0000000043de3fff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000043de4000-0x0000000043de4fff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x0000000043de5000-0x000000004ff29fff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000004ff2a000-0x000000004ffbefff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x000000004ffbf000-0x000000004fffefff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x000000004ffff000-0x0000000057ffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000058600000-0x000000005e7fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fd000000-0x00000000fe7fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed00fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fed10000-0x00000000fed19fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fed84000-0x00000000fed84fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000ff800000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000109f7fffff] usable

If they're not identical, there's probably not much you can do.

Gathering via:

$ echo $workers
ip-10-0-143-189.us-east-2.compute.internal ip-10-0-155-36.us-east-2.compute.internal ip-10-0-167-148.us-east-2.compute.internal

$ type g
g is a function
g ()
{
    for i in $workers;
    do
        echo "Memory map for: $i";
        ssh -o StrictHostKeyChecking=no $i dmesg -t | egrep --color=auto 'BIOS' | tee /tmp/$i.meminfo;
        echo;
    done
}

$ diff3 ip-10-0-143-189.us-east-2.compute.internal ip-10-0-155-36.us-east-2.compute.internal ip-10-0-167-148.us-east-2.compute.internal
$ echo $?
0

And the output:

Memory map for: ip-10-0-143-189.us-east-2.compute.internal
BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
BIOS-e820: [mem 0x0000000000100000-0x00000000efffffff] usable
BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
BIOS-e820: [mem 0x0000000100000000-0x000000020fffffff] usable
SMBIOS 2.7 present.
DMI: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
intel_idle: Please enable MWAIT in BIOS SETUP
piix4_smbus 0000:00:01.3: SMBus base address uninitialized - upgrade BIOS or use force_addr=0xaddr

Memory map for: ip-10-0-155-36.us-east-2.compute.internal
BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
BIOS-e820: [mem 0x0000000000100000-0x00000000efffffff] usable
BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
BIOS-e820: [mem 0x0000000100000000-0x000000020fffffff] usable
SMBIOS 2.7 present.
DMI: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
intel_idle: Please enable MWAIT in BIOS SETUP
piix4_smbus 0000:00:01.3: SMBus base address uninitialized - upgrade BIOS or use force_addr=0xaddr

Memory map for: ip-10-0-167-148.us-east-2.compute.internal
BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
BIOS-e820: [mem 0x0000000000100000-0x00000000efffffff] usable
BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
BIOS-e820: [mem 0x0000000100000000-0x000000020fffffff] usable
SMBIOS 2.7 present.
DMI: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
intel_idle: Please enable MWAIT in BIOS SETUP
piix4_smbus 0000:00:01.3: SMBus base address uninitialized - upgrade BIOS or use force_addr=0xaddr

(In reply to Andrew McDermott from comment #5)

Just correcting/confirming: the function for gathering was actually:

[core@ssh-bastion-65fd55cb7f-bff4x tmp]$ type g
g is a function
g ()
{
    for i in $workers;
    do
        echo;
        echo "Memory map for: $i";
        ssh -o StrictHostKeyChecking=no $i dmesg -t | egrep --color=auto --color=auto 'BIOS' | tee /tmp/$i;
        echo;
    done
}

Trying to narrow this down a bit more, grepping for either 'BIOS' or 'mem' gives:

[core@ssh-bastion-65fd55cb7f-bff4x ~]$ type g
g is a function
g ()
{
    for i in $workers;
    do
        echo;
        echo "Memory map for: $i";
        ssh -o StrictHostKeyChecking=no $i dmesg -t | egrep --color=auto --color=auto 'BIOS|mem' | tee /tmp/$i;
        echo;
    done
}

[core@ssh-bastion-65fd55cb7f-bff4x tmp]$ diff3 $workers
====3
1:16c
2:16c
  NODE_DATA(0) allocated [mem 0x20ffd6000-0x20fffffff]
3:16c
  NODE_DATA(0) allocated [mem 0x20ffd5000-0x20fffefff]
====3
1:57c
2:57c
  [TTM] Zone kernel: Available graphics memory: 4081450 kiB
3:57c
  [TTM] Zone kernel: Available graphics memory: 4081446 kiB

So perhaps this ^^ is the difference. Hmm.
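Since the BIOS-e820 maps gathered from the three AWS workers are identical, the firmware is handing each node the same usable RAM. As a rough cross-check, the sketch below totals the "usable" ranges of such a map. The helper name e820_usable_kib is made up for illustration, and it assumes dmesg-style BIOS-e820 lines on stdin (with or without the timestamp prefix).

```shell
#!/usr/bin/env bash
# e820_usable_kib (hypothetical helper): read BIOS-e820 lines from stdin,
# add up the sizes of all ranges typed "usable", and print the total in KiB.
e820_usable_kib() {
  local total=0 line range start end
  while IFS= read -r line; do
    case "$line" in
      *BIOS-e820:*"] usable") ;;   # keep only usable ranges
      *) continue ;;
    esac
    range=${line#*"[mem "}         # drop everything up to "[mem "
    range=${range%%"]"*}           # drop "]" and the range type
    start=${range%-*}
    end=${range#*-}
    # Range bounds are inclusive hex addresses; bash evaluates 0x... literals.
    total=$(( total + end - start + 1 ))
  done
  echo $(( total / 1024 ))
}

# The map reported by each of the three AWS workers.
e820_usable_kib <<'EOF'
BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
BIOS-e820: [mem 0x0000000000100000-0x00000000efffffff] usable
BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
BIOS-e820: [mem 0x0000000100000000-0x000000020fffffff] usable
EOF
```

For this map the total comes to 8388216 KiB, within one 4 KiB page of the 8388212K shown on every node's dmesg "Memory:" line, which suggests the per-node differences arise later, in the kernel's own bookkeeping, rather than in the firmware map.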
Note that mcelog can offline individual pages based on soft error rates: http://www.mcelog.org/badpageofflining.html

I also see differences on GCP instances:

$ kubectl get nodes
NAME                                                    STATUS   ROLES    AGE     VERSION
amcder-9s2hb-m-0.c.openshift-gce-devel.internal         Ready    master   5h10m   v1.14.0+0261aa0df
amcder-9s2hb-m-1.c.openshift-gce-devel.internal         Ready    master   5h10m   v1.14.0+0261aa0df
amcder-9s2hb-m-2.c.openshift-gce-devel.internal         Ready    master   5h10m   v1.14.0+0261aa0df
amcder-9s2hb-w-a-g5cb7.c.openshift-gce-devel.internal   Ready    worker   4h56m   v1.14.0+0261aa0df
amcder-9s2hb-w-b-7sr2x.c.openshift-gce-devel.internal   Ready    worker   4h56m   v1.14.0+0261aa0df
amcder-9s2hb-w-c-s4xsq.c.openshift-gce-devel.internal   Ready    worker   4h56m   v1.14.0+0261aa0df

aim@spicy:~/go-projects/openshift-cluster-api/src/sigs.k8s.io/cluster-api $ oc get nodes -o json | jq '.items[].status.capacity["memory"]' | sort | uniq -c
     11 "15389264Ki"
     21 "15389280Ki"
      1 "15389988Ki"

aim@spicy:~/go-projects/openshift-cluster-api/src/sigs.k8s.io/cluster-api $ oc get machinesets --all-namespaces
NAMESPACE               NAME               DESIRED   CURRENT   READY   AVAILABLE   AGE
openshift-machine-api   amcder-9s2hb-w-a   10        10        10      10          5h32m
openshift-machine-api   amcder-9s2hb-w-b   10        10        10      10          5h32m
openshift-machine-api   amcder-9s2hb-w-c   10        10        10      10          5h32m

This memory discrepancy was affecting the autoscaler, which was fixed by https://github.com/kubernetes/autoscaler/commit/e8b3c2a111eb9b3b4fe15b4d4081175267ff6d76#diff-dfab69cae1cc7f7d024593ed57d7371a.

I think we can assume the deviation is inherent to the cloud provider and close this; please reopen if relevant.
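To put a number on how small these deviations are, here is a minimal sketch that reads capacity strings as printed by the jq command (before the sort | uniq -c step) and reports the spread between the smallest and largest value. It is not the code from the autoscaler commit linked above; the function name capacity_spread and the 1024 KiB tolerance are assumptions chosen purely for illustration.

```shell
#!/usr/bin/env bash
# capacity_spread TOLERANCE_KIB (hypothetical helper): read capacity strings
# such as "15389264Ki" (surrounding quotes optional) from stdin, print the
# min/max spread in KiB, and return 0 only if the spread is within tolerance.
capacity_spread() {
  local tol=$1 v min='' max=''
  while read -r v; do
    v=${v//\"/}                  # strip the JSON quotes
    v=${v%Ki}                    # strip the unit suffix
    [ -n "$v" ] || continue      # skip blank lines
    if [ -z "$min" ] || [ "$v" -lt "$min" ]; then min=$v; fi
    if [ -z "$max" ] || [ "$v" -gt "$max" ]; then max=$v; fi
  done
  [ -n "$min" ] || return 2      # no values on stdin
  echo "min=${min}Ki max=${max}Ki spread=$(( max - min ))Ki"
  [ $(( max - min )) -le "$tol" ]
}

# The three distinct GCP values from this report, with an assumed tolerance.
printf '%s\n' '"15389264Ki"' '"15389280Ki"' '"15389988Ki"' | capacity_spread 1024
```

On a live cluster the same function could be fed directly from oc get nodes -o json | jq '.items[].status.capacity["memory"]', ideally filtered to the nodes of one machineset so masters and workers are not mixed.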