Bug 1822875
| Summary: | less hugepage (1Gi size) is available than configured in VM kernel argument. | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | zenghui.shi <zshi> |
| Component: | Virtualization | Assignee: | sgott |
| Status: | CLOSED NOTABUG | QA Contact: | Israel Pinto <ipinto> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 2.3.0 | CC: | aburden, cnv-qe-bugs, danken, fdeutsch, kbidarka, phoracek, rmohr |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | 2.5.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-06-10 12:24:45 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Shouldn't the expected result be "4Gi of hugepages are available inside the VM"?

If you request 1G of hugepages on the VM/VMI, then you get a VM with 1G of RAM. If you then tell the VM kernel to treat 2G of RAM as hugepages, you still only get one hugepage, since the whole VM only has 1G of RAM. Just to clarify this: there is no such thing as hugepage passthrough. If you request 4 hugepages of 1G size on the VMI, then the VMI has 4G of RAM available, and you can then tell the guest OS to create up to 4 hugepages of 1G size. I think that can be closed.

Closing this BZ based on comment #2. Please feel free to re-open this issue if you feel this isn't fully addressed.

@Roman, thanks for looking at this! In the problem described above, the total hugepage/memory allocated to the VMI is 4Gi, and inside the VM it asks for 2Gi of hugepages but only got 1Gi. So it is not asking for 1Gi; it expects to get more than 1Gi.

Can you confirm that you have 4Gi of RAM inside the VM? I am still not convinced that the VM does not have enough RAM available.

I just realized that our hugepages setup in the upstream tests is broken. The tests are not providing meaningful results. Will have a closer look into this.

So I verified that the hugepages backing is working. You should actually see a guest with 4Gi of RAM. Could you check that and share the full /proc/meminfo?

Tried to create a VM with the same specs, but it looks like I need physical CPUs on the worker node.
[kbidarka@kbidarka-host cpu-pinning]$ cat vmi-fedora-28-cores-hugepages.yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  creationTimestamp: null
  labels:
    special: vmi-fedora28-cloud-cores2-hugepages
  name: vmi-fedora28-cloud-cores2-hugepages
spec:
  domain:
    cpu:
      sockets: 6
      cores: 1
      threads: 1
      dedicatedCpuPlacement: true
    memory:
      hugepages:
        pageSize: "1Gi"
    devices:
      disks:
      - disk:
          bus: virtio
        name: containerdisk
      - disk:
          bus: virtio
        name: cloudinitdisk
    machine:
      type: ""
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "4Gi"
  terminationGracePeriodSeconds: 0
  volumes:
  - name: containerdisk
    containerDisk:
      image: kubevirt/fedora-cloud-registry-disk-demo
  - cloudInitNoCloud:
      userData: |-
        #cloud-config
        password: fedora
        chpasswd: { expire: False }
    name: cloudinitdisk
status: {}
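(For reference, the VMI would then be created from this manifest and inspected with something like the following; the namespace is whatever the test environment uses:)
oc create -f vmi-fedora-28-cores-hugepages.yaml
oc get vmi vmi-fedora28-cloud-cores2-hugepages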
I see the below error for Pod Events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling <unknown> default-scheduler 0/6 nodes are available: 1 Insufficient hugepages-1Gi, 5 Insufficient cpu.
Warning FailedScheduling <unknown> default-scheduler 0/6 nodes are available: 1 Insufficient hugepages-1Gi, 5 Insufficient cpu.
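(For context, the scheduling failure can be cross-checked against what a node actually offers; a rough sketch, with <worker-node> as a placeholder name, comparing the request with the node's Capacity/Allocatable:)
oc describe node <worker-node> | grep -E 'cpu:|hugepages-1Gi:'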
OK, dropped the sockets from the VM spec.
[kbidarka@kbidarka-host cpu-pinning]$ cat vmi-fedora-28-cores-hugepages.yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  creationTimestamp: null
  labels:
    special: vmi-fedora28-cloud-cores2-hugepages
  name: vmi-fedora28-cloud-cores2-hugepages
spec:
  domain:
    cpu:
      cores: 1
      threads: 1
      dedicatedCpuPlacement: true
    memory:
      hugepages:
        pageSize: "1Gi"
    devices:
      disks:
      - disk:
          bus: virtio
        name: containerdisk
      - disk:
          bus: virtio
        name: cloudinitdisk
    machine:
      type: ""
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "4Gi"
  terminationGracePeriodSeconds: 0
  volumes:
  - name: containerdisk
    containerDisk:
      image: kubevirt/fedora-cloud-registry-disk-demo
  - cloudInitNoCloud:
      userData: |-
        #cloud-config
        password: fedora
        chpasswd: { expire: False }
    name: cloudinitdisk
status: {}
See the below output in the Pod events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling <unknown> default-scheduler 0/6 nodes are available: 1 Insufficient devices.kubevirt.io/kvm, 1 Insufficient devices.kubevirt.io/tun, 1 Insufficient devices.kubevirt.io/vhost-net, 3 Insufficient hugepages-1Gi.
I probably need to configure hugepages on the worker nodes first. Will check that shortly; just recording here what I currently see.
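(A quick way to confirm that a worker does not yet report any allocatable 1Gi hugepages before the MachineConfig below is applied; <worker-node> is a placeholder:)
oc describe node <worker-node> | grep hugepages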
Adding all the steps performed to configure hugePages.
Configure HugePages for the WORKER NODE:
----------------------------------------
MachineConfig
---------------
[kbidarka@kbidarka-host hugepages]$ ls
50-kargs-1g-hugepages.yaml
[kbidarka@kbidarka-host hugepages]$ cat 50-kargs-1g-hugepages.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 50-kargs-1g-hugepages
spec:
  kernelArguments:
  - default_hugepagesz=1Gi
  - hugepagesz=1Gi
  - hugepages=4
oc apply -f 50-kargs-1g-hugepages.yaml
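(While the MachineConfig rolls out and the workers reboot, progress can be watched with something like:)
oc get machineconfigpool worker -w
oc get nodes -w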
The nodes rebooted and came back in the Ready state.
[kbidarka@kbidarka-host osdc]$ oc get nodes
NAME STATUS ROLES AGE VERSION
kbidarka-b33-mlkgs-master-0 Ready master 15h v1.17.1
kbidarka-b33-mlkgs-master-1 Ready master 15h v1.17.1
kbidarka-b33-mlkgs-master-2 Ready master 15h v1.17.1
kbidarka-b33-mlkgs-worker-c6hgj Ready worker 15h v1.17.1
kbidarka-b33-mlkgs-worker-tr88n Ready worker 15h v1.17.1
kbidarka-b33-mlkgs-worker-z6z49 Ready worker 15h v1.17.1
[kbidarka@kbidarka-host osdc]$ oc debug node/kbidarka-b33-mlkgs-worker-c6hgj
Starting pod/kbidarka-b33-mlkgs-worker-c6hgj-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.0.14
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-ec6f097d71f19d2713779aeb6d2296122dec138ffa31e2fd0a15a0f39e39cada/vmlinuz-4.18.0-147.8.1.el8_1.x86_64 rhcos.root=crypt_rootfs console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.1/rhcos/ec6f097d71f19d2713779aeb6d2296122dec138ffa31e2fd0a15a0f39e39cada/0 ignition.platform.id=openstack default_hugepagesz=1Gi hugepagesz=1Gi hugepages=4
sh-4.4# free -h
total used free shared buff/cache available
Mem: 23Gi 5.7Gi 14Gi 6.0Mi 2.9Gi 17Gi
Swap: 0B 0B 0B
sh-4.4# cat /proc/meminfo | grep -i huge
AnonHugePages: 475136 kB
ShmemHugePages: 0 kB
HugePages_Total: 4
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
Hugetlb: 4194304 kB
[kbidarka@kbidarka-host osdc]$ oc describe node kbidarka-b33-mlkgs-worker-c6hgj
Capacity:
attachable-volumes-cinder: 256
cpu: 12
devices.kubevirt.io/kvm: 110
devices.kubevirt.io/tun: 110
devices.kubevirt.io/vhost-net: 110
ephemeral-storage: 83334124Ki
hugepages-1Gi: 4Gi
hugepages-2Mi: 0
memory: 24677332Ki
ovs-cni.network.kubevirt.io/br0: 1k
pods: 250
Allocatable:
attachable-volumes-cinder: 256
cpu: 11500m
devices.kubevirt.io/kvm: 110
devices.kubevirt.io/tun: 110
devices.kubevirt.io/vhost-net: 110
ephemeral-storage: 75726986728
hugepages-1Gi: 4Gi
hugepages-2Mi: 0
memory: 19332052Ki
ovs-cni.network.kubevirt.io/br0: 1k
pods: 250
As you can see, "hugepages-1Gi: 4Gi" is now reported for the worker nodes.
Created attachment 1683319 [details]
Added detailed info in an attachment file
Configuration of the KubeVirt VM
---------------------------
Created a VM with the below spec:
spec:
  nodeSelector:
    kubernetes.io/hostname: kbidarka-b33-mlkgs-worker-c6hgj
  domain:
    cpu:
      cores: 1
      threads: 1
      dedicatedCpuPlacement: true
    memory:
      hugepages:
        pageSize: "1Gi"
    devices:
      disks:
      - disk:
          bus: virtio
        name: datavolumedisk1
    machine:
      type: ""
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "4Gi"
Added the kernel params at the end of the GRUB_CMDLINE_LINUX line:
[cloud-user@vm-rhel7-hugepages ~]$ cat /etc/default/grub
GRUB_TIMEOUT=1
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto console=ttyS0,115200n8 no_timer_check net.ifnames=0 default_hugepagesz=1Gi hugepagesz=1Gi hugepages=2"
GRUB_DISABLE_RECOVERY="true"
Ran the command, "grub2-mkconfig -o /boot/grub2/grub.cfg"
Rebooted the system: "systemctl reboot"
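(As an aside, 1Gi pages can sometimes also be reserved at runtime inside the guest instead of rebooting, though this may fail if guest memory is already fragmented, which is why the boot-time kernel arguments above are the more reliable route; a sketch, run as root:)
echo 2 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages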
After reboot of the VM:
[cloud-user@vm-rhel7-hugepages ~]$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-3.10.0-1126.el7.x86_64 root=UUID=4156b89b-8af8-4449-9e8e-e5b92322b399 ro console=tty0 crashkernel=auto console=ttyS0,115200n8 no_timer_check net.ifnames=0 default_hugepagesz=1Gi hugepagesz=1Gi hugepages=2
[cloud-user@vm-rhel7-hugepages ~]$ free -h
total used free shared buff/cache available
Mem: 3.7G 1.1G 2.4G 8.5M 141M 2.4G
Swap: 0B 0B 0B
[cloud-user@vm-rhel7-hugepages ~]$ cat /proc/meminfo | grep -i huge
AnonHugePages: 10240 kB
HugePages_Total: 1
HugePages_Free: 1
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
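(When fewer pages end up reserved than requested, the guest kernel usually logs a warning during boot, so the kernel log is worth checking in addition to /proc/meminfo:)
dmesg | grep -i -E 'hugetlb|hugepage'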
-------------------------------------------------------
Created attachment 1683321 [details]
Added detailed info about VM in an attachment file
Was expecting, "HugePages_Total: 2", but we got "HugePages_Total: 1" Memory/RAM was specified as 4Gi in VM spec file and we see it as approx 4Gi (3.7G) as seen from above "free -h", in comment 12. Plan to have a documentation around huge-pages to set the expected behavior, https://bugzilla.redhat.com/show_bug.cgi?id=1845198 As per comment 16 , will close this bug and we plan to have the documentation around it. The docs PR was merged: https://github.com/openshift/openshift-docs/pull/23705 I will add a link to the docs when they are published. The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days |
Description of problem:

1) Create a VMI with dedicated CPUs and hugepages:

spec:
  domain:
    cpu:
      sockets: 6                 <== 6 exclusive cpus
      cores: 1
      threads: 1
      dedicatedCpuPlacement: true
    memory:
      hugepages:
        pageSize: "1Gi"          <== 1Gi hugepage size
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "4Gi"            <== 1Gi*4 hugepages

2) Inside the VM, configure 1Gi*2 hugepages in the kernel arguments for the workload and reboot the VM (assuming the VM is backed by 1Gi*4 hugepages, allocate 2 for the VM workload):

default_hugepagesz=1GB hugepagesz=1G hugepages=2

3) After rebooting the VM, check the hugepages available inside the VM:

# cat /proc/meminfo | grep -i huge
HugePages_Total: 1
HugePages_Free: 1
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB

Only 1Gi of hugepages is available, but 1Gi*2 were configured.

Version-Release number of selected component (if applicable):
CNV 2.3.0
Operator version: v0.26.4

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
Only 1Gi of hugepages is available inside the VM

Expected results:
2Gi of hugepages are available inside the VM

Additional info: