Description of problem:

1) Create a VMI with dedicated CPUs and hugepages:

spec:
  domain:
    cpu:
      sockets: 6                <== 6 exclusive CPUs
      cores: 1
      threads: 1
      dedicatedCpuPlacement: true
    memory:
      hugepages:
        pageSize: "1Gi"         <== 1Gi hugepage size
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "4Gi"           <== 4 x 1Gi hugepages

2) Inside the VM, configure 2 x 1Gi hugepages via kernel arguments for the workload and reboot the VM (the VM is backed by 4 x 1Gi hugepages; allocate 2 of them for the VM workload):

default_hugepagesz=1GB hugepagesz=1G hugepages=2

3) After rebooting the VM, check the hugepages available inside the VM:

# cat /proc/meminfo | grep -i huge
HugePages_Total:       1
HugePages_Free:        1
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    2048 kB
Hugetlb:            0 kB

Only one 1Gi hugepage is available, but 2 x 1Gi were configured.

Version-Release number of selected component (if applicable):
CNV 2.3.0
Operator version: v0.26.4

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:
Only 1Gi of hugepages is available inside the VM

Expected results:
2Gi of hugepages are available inside the VM

Additional info:
Shouldn't the expected result be "4Gi hugepages are available inside the VM"?
If you request 1Gi of hugepages on the VM/VMI, then you get a VM with 1Gi of RAM. If you then tell the VM kernel to treat 2Gi of RAM as hugepages, you still only get one hugepage, since the whole VM only has 1Gi of RAM.

Just to clarify: there is no such thing as hugepage passthrough. If you request 4 hugepages of 1Gi size on the VMI, then the VMI has 4Gi of RAM available, and you can then tell the guest OS to create up to 4 hugepages of 1Gi size.

I think this can be closed.
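The sizing rule above can be sketched numerically (the numbers mirror the spec in the description; this is only arithmetic illustrating the model, not a KubeVirt API):

```shell
# Hugepages requested on the VMI back the guest's RAM; they are not
# passed through into the guest. Numbers mirror the original spec.
VMI_HUGEPAGES=4          # pod-level 1Gi hugepages (memory limit 4Gi)
PAGE_SIZE_GI=1           # pageSize: "1Gi"

GUEST_RAM_GI=$((VMI_HUGEPAGES * PAGE_SIZE_GI))

# Inside the guest, hugepages are carved out of that RAM, so the guest
# can create at most GUEST_RAM_GI pages of 1Gi size (fewer in practice,
# since the guest kernel needs some RAM for itself).
MAX_GUEST_PAGES=$((GUEST_RAM_GI / PAGE_SIZE_GI))
echo "guest RAM: ${GUEST_RAM_GI}Gi, at most ${MAX_GUEST_PAGES} x 1Gi guest hugepages"
```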
Closing this BZ based on comment #2. Please feel free to re-open this issue if you feel this isn't fully addressed.
@Roman, thanks for looking at this! In the problem described above, the total hugepage-backed memory allocated to the VMI is 4Gi. Inside the VM, 2Gi of hugepages were requested, but only 1Gi was obtained. So the VM is not asking for 1Gi; it asks for 2Gi and expects to get more than 1Gi.
Can you confirm that you have 4Gi of RAM inside the VM? I am still not convinced that the VM does not have enough RAM available.
I just realized that our hugepages setup in the upstream tests is broken. The tests are not providing meaningful results. I will have a closer look into this.
So I verified that the hugepages backing is working. You should actually see a guest with 4Gi of RAM. Could you check that and share the full /proc/meminfo?
Tried to create a VM with the same specs, but it looks like I need enough physical CPUs on the worker node.

[kbidarka@kbidarka-host cpu-pinning]$ cat vmi-fedora-28-cores-hugepages.yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  creationTimestamp: null
  labels:
    special: vmi-fedora28-cloud-cores2-hugepages
  name: vmi-fedora28-cloud-cores2-hugepages
spec:
  domain:
    cpu:
      sockets: 6
      cores: 1
      threads: 1
      dedicatedCpuPlacement: true
    memory:
      hugepages:
        pageSize: "1Gi"
    devices:
      disks:
      - disk:
          bus: virtio
        name: containerdisk
      - disk:
          bus: virtio
        name: cloudinitdisk
    machine:
      type: ""
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "4Gi"
  terminationGracePeriodSeconds: 0
  volumes:
  - name: containerdisk
    containerDisk:
      image: kubevirt/fedora-cloud-registry-disk-demo
  - cloudInitNoCloud:
      userData: |-
        #cloud-config
        password: fedora
        chpasswd: { expire: False }
    name: cloudinitdisk
status: {}

I see the below error in the pod events:

Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 1 Insufficient hugepages-1Gi, 5 Insufficient cpu.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 1 Insufficient hugepages-1Gi, 5 Insufficient cpu.
OK, dropped the sockets from the VM spec.

[kbidarka@kbidarka-host cpu-pinning]$ cat vmi-fedora-28-cores-hugepages.yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  creationTimestamp: null
  labels:
    special: vmi-fedora28-cloud-cores2-hugepages
  name: vmi-fedora28-cloud-cores2-hugepages
spec:
  domain:
    cpu:
      cores: 1
      threads: 1
      dedicatedCpuPlacement: true
    memory:
      hugepages:
        pageSize: "1Gi"
    devices:
      disks:
      - disk:
          bus: virtio
        name: containerdisk
      - disk:
          bus: virtio
        name: cloudinitdisk
    machine:
      type: ""
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "4Gi"
  terminationGracePeriodSeconds: 0
  volumes:
  - name: containerdisk
    containerDisk:
      image: kubevirt/fedora-cloud-registry-disk-demo
  - cloudInitNoCloud:
      userData: |-
        #cloud-config
        password: fedora
        chpasswd: { expire: False }
    name: cloudinitdisk
status: {}

See the below output on the pods:

Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 1 Insufficient devices.kubevirt.io/kvm, 1 Insufficient devices.kubevirt.io/tun, 1 Insufficient devices.kubevirt.io/vhost-net, 3 Insufficient hugepages-1Gi.

I probably need to configure hugepages on the worker nodes. Will check shortly; just updating here for the record what I see currently.
Adding all the steps performed to configure hugepages.

Configure hugepages for the WORKER NODE:
----------------------------------------

MachineConfig
-------------

[kbidarka@kbidarka-host hugepages]$ ls
50-kargs-1g-hugepages.yaml
[kbidarka@kbidarka-host hugepages]$ cat 50-kargs-1g-hugepages.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 50-kargs-1g-hugepages
spec:
  kernelArguments:
  - default_hugepagesz=1Gi
  - hugepagesz=1Gi
  - hugepages=4

oc apply -f 50-kargs-1g-hugepages.yaml

The nodes rebooted and came back in Ready state.

[kbidarka@kbidarka-host osdc]$ oc get nodes
NAME                              STATUS   ROLES    AGE   VERSION
kbidarka-b33-mlkgs-master-0       Ready    master   15h   v1.17.1
kbidarka-b33-mlkgs-master-1       Ready    master   15h   v1.17.1
kbidarka-b33-mlkgs-master-2       Ready    master   15h   v1.17.1
kbidarka-b33-mlkgs-worker-c6hgj   Ready    worker   15h   v1.17.1
kbidarka-b33-mlkgs-worker-tr88n   Ready    worker   15h   v1.17.1
kbidarka-b33-mlkgs-worker-z6z49   Ready    worker   15h   v1.17.1

[kbidarka@kbidarka-host osdc]$ oc debug node/kbidarka-b33-mlkgs-worker-c6hgj
Starting pod/kbidarka-b33-mlkgs-worker-c6hgj-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.0.14
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-ec6f097d71f19d2713779aeb6d2296122dec138ffa31e2fd0a15a0f39e39cada/vmlinuz-4.18.0-147.8.1.el8_1.x86_64 rhcos.root=crypt_rootfs console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.1/rhcos/ec6f097d71f19d2713779aeb6d2296122dec138ffa31e2fd0a15a0f39e39cada/0 ignition.platform.id=openstack default_hugepagesz=1Gi hugepagesz=1Gi hugepages=4

sh-4.4# free -h
              total        used        free      shared  buff/cache   available
Mem:           23Gi       5.7Gi        14Gi       6.0Mi       2.9Gi        17Gi
Swap:            0B          0B          0B

sh-4.4# cat /proc/meminfo | grep -i huge
AnonHugePages:    475136 kB
ShmemHugePages:        0 kB
HugePages_Total:       4
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
Hugetlb:         4194304 kB

[kbidarka@kbidarka-host osdc]$ oc describe node kbidarka-b33-mlkgs-worker-c6hgj
Capacity:
  attachable-volumes-cinder:        256
  cpu:                              12
  devices.kubevirt.io/kvm:          110
  devices.kubevirt.io/tun:          110
  devices.kubevirt.io/vhost-net:    110
  ephemeral-storage:                83334124Ki
  hugepages-1Gi:                    4Gi
  hugepages-2Mi:                    0
  memory:                           24677332Ki
  ovs-cni.network.kubevirt.io/br0:  1k
  pods:                             250
Allocatable:
  attachable-volumes-cinder:        256
  cpu:                              11500m
  devices.kubevirt.io/kvm:          110
  devices.kubevirt.io/tun:          110
  devices.kubevirt.io/vhost-net:    110
  ephemeral-storage:                75726986728
  hugepages-1Gi:                    4Gi
  hugepages-2Mi:                    0
  memory:                           19332052Ki
  ovs-cni.network.kubevirt.io/br0:  1k
  pods:                             250

As you can see, "hugepages-1Gi: 4Gi" is now reported for the worker node.
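As a quick sanity check of the numbers above: HugePages_Total x Hugepagesize should equal the Hugetlb line. A small sketch (the sample data is copied from the node output; point it at /proc/meminfo to run against a live system):

```shell
# Verify that the hugetlb pool size is consistent:
# HugePages_Total * Hugepagesize == Hugetlb.
# Sample lines copied from the worker node's /proc/meminfo above.
MEMINFO=$(cat <<'EOF'
HugePages_Total:       4
Hugepages_Free:        0
Hugepagesize:    1048576 kB
Hugetlb:         4194304 kB
EOF
)
total=$(echo "$MEMINFO" | awk '/^HugePages_Total:/ {print $2}')
size_kb=$(echo "$MEMINFO" | awk '/^Hugepagesize:/ {print $2}')
hugetlb_kb=$(echo "$MEMINFO" | awk '/^Hugetlb:/ {print $2}')
echo "pool: $((total * size_kb)) kB (reported Hugetlb: ${hugetlb_kb} kB)"
```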
Created attachment 1683319 [details] Added detailed info in an attachment file
Configuration of the KubeVirt VM
--------------------------------

Created a VM with the below spec:

spec:
  nodeSelector:
    kubernetes.io/hostname: kbidarka-b33-mlkgs-worker-c6hgj
  domain:
    cpu:
      cores: 1
      threads: 1
      dedicatedCpuPlacement: true
    memory:
      hugepages:
        pageSize: "1Gi"
    devices:
      disks:
      - disk:
          bus: virtio
        name: datavolumedisk1
    machine:
      type: ""
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "4Gi"

Added kernel params at the end of the GRUB_CMDLINE_LINUX line:

[cloud-user@vm-rhel7-hugepages ~]$ cat /etc/default/grub
GRUB_TIMEOUT=1
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto console=ttyS0,115200n8 no_timer_check net.ifnames=0 default_hugepagesz=1Gi hugepagesz=1Gi hugepages=2"
GRUB_DISABLE_RECOVERY="true"

Ran the command: grub2-mkconfig -o /boot/grub2/grub.cfg
Rebooted the system: systemctl reboot

After the reboot of the VM:

[cloud-user@vm-rhel7-hugepages ~]$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-3.10.0-1126.el7.x86_64 root=UUID=4156b89b-8af8-4449-9e8e-e5b92322b399 ro console=tty0 crashkernel=auto console=ttyS0,115200n8 no_timer_check net.ifnames=0 default_hugepagesz=1Gi hugepagesz=1Gi hugepages=2

[cloud-user@vm-rhel7-hugepages ~]$ free -h
              total        used        free      shared  buff/cache   available
Mem:           3.7G        1.1G        2.4G        8.5M        141M        2.4G
Swap:            0B          0B          0B

[cloud-user@vm-rhel7-hugepages ~]$ cat /proc/meminfo | grep -i huge
AnonHugePages:     10240 kB
HugePages_Total:       1
HugePages_Free:        1
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
-------------------------------------------------------
Created attachment 1683321 [details] Added detailed info about VM in an attachment file
Was expecting "HugePages_Total: 2", but we got "HugePages_Total: 1". Memory/RAM was specified as 4Gi in the VM spec file, and we do see approximately 4Gi (3.7G) in the "free -h" output above, in comment 12.
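A read-only way to compare what the guest kernel was asked for against what it actually reserved (a diagnostic sketch using the standard /proc interfaces; run inside the guest):

```shell
# Compare "hugepages=N" on the kernel command line with what the kernel
# actually reserved. Boot-time reservation of 1Gi pages is best-effort
# and can come up short, in which case the two numbers differ, as seen
# above (requested 2, got 1).
requested=$(tr ' ' '\n' < /proc/cmdline | sed -n 's/^hugepages=//p' | tail -n1)
reserved=$(awk '/^HugePages_Total:/ {print $2}' /proc/meminfo)
echo "requested at boot: ${requested:-0}, actually reserved: ${reserved:-0}"
```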
We plan to add documentation around hugepages to set the expected behavior: https://bugzilla.redhat.com/show_bug.cgi?id=1845198
As per comment 16, I will close this bug; we plan to have the documentation around it.
The docs PR was merged: https://github.com/openshift/openshift-docs/pull/23705 I will add a link to the docs when they are published.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days