Bug 1822875

Summary: fewer hugepages (1Gi size) are available inside the VM than configured via the VM kernel arguments.
Product: Container Native Virtualization (CNV)
Reporter: zenghui.shi <zshi>
Component: Virtualization
Assignee: sgott
Status: CLOSED NOTABUG
QA Contact: Israel Pinto <ipinto>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 2.3.0
CC: aburden, cnv-qe-bugs, danken, fdeutsch, kbidarka, phoracek, rmohr
Target Milestone: ---
Keywords: Reopened
Target Release: 2.5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Last Closed: 2020-06-10 12:24:45 UTC
Type: Bug
Attachments:
  - Added detailed info in an attachment file (flags: none)
  - Added detailed info about VM in an attachment file (flags: none)

Description zenghui.shi 2020-04-10 08:52:39 UTC
Description of problem:

1) create VMI with dedicated cpus and hugepages:

spec:
  domain:
    cpu:
      sockets: 6                             <== 6 exclusive cpus
      cores: 1
      threads: 1
      dedicatedCpuPlacement: true
    memory:
      hugepages:
        pageSize: "1Gi"                      <=== 1Gi Hugepage size
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "4Gi"                        <===  1Gi*4 Hugepage

2) Inside the VM, configure 2 x 1Gi hugepages via the kernel arguments below and reboot the VM (the VM itself is backed by 4 x 1Gi host hugepages; 2 of them are to be used by the VM workload):

default_hugepagesz=1GB hugepagesz=1G hugepages=2
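
One way to make these arguments persistent inside a RHEL-family guest (a hedged sketch; it assumes grubby is available in the guest, as on RHEL and Fedora cloud images):

  # Append the hugepage arguments to every installed kernel, then reboot the guest
  grubby --update-kernel=ALL --args="default_hugepagesz=1GB hugepagesz=1G hugepages=2"
  systemctl reboot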


3) After rebooting the VM, check the hugepages available inside the VM:

# cat /proc/meminfo | grep -i huge
HugePages_Total:       1
HugePages_Free:        1
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB


Only one hugepage is available, although 2 x 1Gi were configured.



Version-Release number of selected component (if applicable):

CNV 2.3.0
Operator version: v0.26.4


Actual results:
Only 1Gi of hugepages is available inside the VM.

Expected results:
2Gi of hugepages (2 x 1Gi) are available inside the VM.

Additional info:

Comment 1 Petr Horáček 2020-04-10 09:35:53 UTC
Shouldn't the expected result be "4Gi of hugepages are available inside the VM"?

Comment 2 Roman Mohr 2020-04-22 12:15:29 UTC
If you request 1G of hugepages on the VM/VMI, then you get a VM with 1G of RAM. If you then tell the VM kernel to treat 2G of RAM as hugepages, you still only get one hugepage, since the whole VM only has 1G of RAM.

Just to clarify: there is no such thing as hugepage passthrough. If you request 4 hugepages of 1G size on the VMI, then the VMI has 4G of RAM available, and you can then tell the guest OS to create up to 4 hugepages of 1G size.
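
As a quick sanity check inside the guest (a minimal sketch, assuming the standard procfs/sysfs hugepage interface; the sysfs path below is the default for 1Gi pages), compare the total RAM with the size of the 1Gi hugepage pool:

  # Total guest RAM; should roughly match the VMI memory request (~4Gi here)
  grep MemTotal /proc/meminfo
  # Number of 1Gi hugepages currently reserved by the guest kernel
  cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages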

I think that can be closed.

Comment 3 sgott 2020-04-22 12:24:51 UTC
Closing this BZ based on comment #2. Please feel free to re-open this issue if you feel this isn't fully addressed.

Comment 4 zenghui.shi 2020-04-23 01:51:14 UTC
@Roman, thanks for looking at this!

In the problem described above, the total hugepage-backed memory allocated to the VMI is 4Gi. Inside the VM, 2Gi of hugepages were requested via the kernel arguments, but only 1Gi was obtained.
So the VM is not asking for just 1Gi; it asks for 2Gi and is expected to get more than 1Gi.

Comment 5 Roman Mohr 2020-04-23 07:58:57 UTC
Can you confirm that you have 4Gi of RAM inside the VM? I am still not convinced that the VM does not have enough RAM available.

Comment 6 Roman Mohr 2020-04-23 08:07:23 UTC
I just realized that our hugepages setup in the upstream tests is broken; the tests are not providing meaningful results. I will take a closer look at this.

Comment 7 Roman Mohr 2020-04-23 10:29:49 UTC
So I verified that the hugepage backing is working. You should actually see a guest with 4Gi of RAM. Could you check that and share the full /proc/meminfo?
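
For reference, one way to confirm the hugepage backing from the host side (a sketch only; it assumes the virt-launcher pod exposes virsh in its "compute" container and that the libvirt domain is named <namespace>_<vmi-name>, with the pod and VMI names as placeholders):

  # Locate the virt-launcher pod for the VMI, then inspect the libvirt domain XML
  oc get pods -l kubevirt.io=virt-launcher
  oc exec <virt-launcher-pod> -c compute -- virsh dumpxml <namespace>_<vmi-name> | grep -A 3 memoryBacking
  # A hugepage-backed guest should show a <memoryBacking><hugepages/> element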

Comment 8 Kedar Bidarkar 2020-04-29 11:41:06 UTC
Tried to create a VM with the same spec, but it looks like I need enough physical CPUs on the worker node for the dedicated CPU placement.

[kbidarka@kbidarka-host cpu-pinning]$ cat vmi-fedora-28-cores-hugepages.yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  creationTimestamp: null
  labels:
    special: vmi-fedora28-cloud-cores2-hugepages
  name: vmi-fedora28-cloud-cores2-hugepages
spec:
  domain:
    cpu:
      sockets: 6
      cores: 1
      threads: 1
      dedicatedCpuPlacement: true
    memory:
      hugepages:
        pageSize: "1Gi"
    devices:
      disks:
      - disk:
          bus: virtio
        name: containerdisk
      - disk:
          bus: virtio
        name: cloudinitdisk
    machine:
      type: ""
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "4Gi"
  terminationGracePeriodSeconds: 0
  volumes:
  - name: containerdisk
    containerDisk:
      image: kubevirt/fedora-cloud-registry-disk-demo
  - cloudInitNoCloud:
      userData: |-
        #cloud-config
        password: fedora
        chpasswd: { expire: False }
    name: cloudinitdisk
status: {}


I see the below error in the pod events:
Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 1 Insufficient hugepages-1Gi, 5 Insufficient cpu.
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 1 Insufficient hugepages-1Gi, 5 Insufficient cpu.

Comment 9 Kedar Bidarkar 2020-04-29 11:53:02 UTC
OK, I dropped the sockets field from the VM spec.

[kbidarka@kbidarka-host cpu-pinning]$ cat vmi-fedora-28-cores-hugepages.yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstance
metadata:
  creationTimestamp: null
  labels:
    special: vmi-fedora28-cloud-cores2-hugepages
  name: vmi-fedora28-cloud-cores2-hugepages
spec:
  domain:
    cpu:
      cores: 1
      threads: 1
      dedicatedCpuPlacement: true
    memory:
      hugepages:
        pageSize: "1Gi"
    devices:
      disks:
      - disk:
          bus: virtio
        name: containerdisk
      - disk:
          bus: virtio
        name: cloudinitdisk
    machine:
      type: ""
    resources:
      requests:
        memory: "4Gi"
      limits:
        memory: "4Gi"
  terminationGracePeriodSeconds: 0
  volumes:
  - name: containerdisk
    containerDisk:
      image: kubevirt/fedora-cloud-registry-disk-demo
  - cloudInitNoCloud:
      userData: |-
        #cloud-config
        password: fedora
        chpasswd: { expire: False }
    name: cloudinitdisk
status: {}



See the below events on the pod:

Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/6 nodes are available: 1 Insufficient devices.kubevirt.io/kvm, 1 Insufficient devices.kubevirt.io/tun, 1 Insufficient devices.kubevirt.io/vhost-net, 3 Insufficient hugepages-1Gi.


I probably need to ensure hugepages are configured on the worker nodes; I will check that shortly. Just recording here, for now, what I currently see (one way to check is sketched below).
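
A quick way to check this (a sketch; the node name is a placeholder) is to look at what each worker actually advertises as allocatable 1Gi hugepages and CPU:

  # Allocatable resources as seen by the scheduler, including hugepages-1Gi and cpu
  oc describe node <worker-node> | grep -A 15 Allocatable
  # Or query just the 1Gi hugepage pool
  oc get node <worker-node> -o jsonpath='{.status.allocatable.hugepages-1Gi}{"\n"}'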

Comment 10 Kedar Bidarkar 2020-04-30 14:12:35 UTC
Adding all the steps performed to configure hugepages.



Configure HugePages for the WORKER NODE:
----------------------------------------

MachineConfig 
---------------

[kbidarka@kbidarka-host hugepages]$ ls
50-kargs-1g-hugepages.yaml
[kbidarka@kbidarka-host hugepages]$ cat 50-kargs-1g-hugepages.yaml 
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
    labels:
        machineconfiguration.openshift.io/role: worker
    name: 50-kargs-1g-hugepages
spec:
    kernelArguments:
        - default_hugepagesz=1Gi
        - hugepagesz=1Gi
        - hugepages=4

oc apply -f 50-kargs-1g-hugepages.yaml
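
The kernel arguments only take effect after the Machine Config Operator rolls the new MachineConfig out and reboots the workers; a simple way to follow that (a sketch using standard oc commands) is to watch the worker pool until it reports UPDATED=True:

  # Watch the worker MachineConfigPool roll out the kernel-argument change
  oc get machineconfigpool worker -w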



The worker nodes rebooted and came back into Ready state.


[kbidarka@kbidarka-host osdc]$ oc get nodes 
NAME                              STATUS   ROLES    AGE   VERSION
kbidarka-b33-mlkgs-master-0       Ready    master   15h   v1.17.1
kbidarka-b33-mlkgs-master-1       Ready    master   15h   v1.17.1
kbidarka-b33-mlkgs-master-2       Ready    master   15h   v1.17.1
kbidarka-b33-mlkgs-worker-c6hgj   Ready    worker   15h   v1.17.1
kbidarka-b33-mlkgs-worker-tr88n   Ready    worker   15h   v1.17.1
kbidarka-b33-mlkgs-worker-z6z49   Ready    worker   15h   v1.17.1


[kbidarka@kbidarka-host osdc]$ oc debug node/kbidarka-b33-mlkgs-worker-c6hgj
Starting pod/kbidarka-b33-mlkgs-worker-c6hgj-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.0.14
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host


sh-4.4# cat /proc/cmdline 
BOOT_IMAGE=(hd0,gpt1)/ostree/rhcos-ec6f097d71f19d2713779aeb6d2296122dec138ffa31e2fd0a15a0f39e39cada/vmlinuz-4.18.0-147.8.1.el8_1.x86_64 rhcos.root=crypt_rootfs console=tty0 console=ttyS0,115200n8 rd.luks.options=discard ostree=/ostree/boot.1/rhcos/ec6f097d71f19d2713779aeb6d2296122dec138ffa31e2fd0a15a0f39e39cada/0 ignition.platform.id=openstack default_hugepagesz=1Gi hugepagesz=1Gi hugepages=4


sh-4.4# free -h 
              total        used        free      shared  buff/cache   available
Mem:           23Gi       5.7Gi        14Gi       6.0Mi       2.9Gi        17Gi
Swap:            0B          0B          0B



sh-4.4# cat /proc/meminfo | grep -i huge 
AnonHugePages:    475136 kB
ShmemHugePages:        0 kB
HugePages_Total:       4
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
Hugetlb:         4194304 kB


[kbidarka@kbidarka-host osdc]$ oc describe node kbidarka-b33-mlkgs-worker-c6hgj

Capacity:
  attachable-volumes-cinder:        256
  cpu:                              12
  devices.kubevirt.io/kvm:          110
  devices.kubevirt.io/tun:          110
  devices.kubevirt.io/vhost-net:    110
  ephemeral-storage:                83334124Ki

  hugepages-1Gi:                    4Gi
  hugepages-2Mi:                    0

  memory:                           24677332Ki
  ovs-cni.network.kubevirt.io/br0:  1k
  pods:                             250
Allocatable:
  attachable-volumes-cinder:        256
  cpu:                              11500m
  devices.kubevirt.io/kvm:          110
  devices.kubevirt.io/tun:          110
  devices.kubevirt.io/vhost-net:    110
  ephemeral-storage:                75726986728

  hugepages-1Gi:                    4Gi
  hugepages-2Mi:                    0

  memory:                           19332052Ki
  ovs-cni.network.kubevirt.io/br0:  1k
  pods:                             250


As shown above, the worker node now reports "hugepages-1Gi: 4Gi".

Comment 11 Kedar Bidarkar 2020-04-30 14:14:38 UTC
Created attachment 1683319 [details]
Added  detailed info in an attachment file

Comment 12 Kedar Bidarkar 2020-04-30 14:18:35 UTC
Configuration of the KubeVirt VM
---------------------------
Created a VM with the below spec:

    spec:
      nodeSelector:
        kubernetes.io/hostname: kbidarka-b33-mlkgs-worker-c6hgj
      domain:
        cpu:
          cores: 1
          threads: 1
          dedicatedCpuPlacement: true
        memory:
          hugepages:
            pageSize: "1Gi"
        devices:
          disks:
          - disk:
              bus: virtio
            name: datavolumedisk1
        machine:
          type: ""
        resources:
          requests:
            memory: "4Gi"
          limits:
            memory: "4Gi"


Added the kernel params towards the end of the GRUB_CMDLINE_LINUX line:

[cloud-user@vm-rhel7-hugepages ~]$ cat /etc/default/grub 
GRUB_TIMEOUT=1
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto console=ttyS0,115200n8 no_timer_check net.ifnames=0 default_hugepagesz=1Gi hugepagesz=1Gi hugepages=2"
GRUB_DISABLE_RECOVERY="true"



Ran the command, "grub2-mkconfig -o /boot/grub2/grub.cfg"
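
On a UEFI guest the generated config lives under /boot/efi rather than /boot/grub2; a hedged variant for the default RHEL 7 UEFI layout would be:

  grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg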

Rebooted the system: "systemctl reboot"

After reboot of the VM:

[cloud-user@vm-rhel7-hugepages ~]$ cat /proc/cmdline 
BOOT_IMAGE=/boot/vmlinuz-3.10.0-1126.el7.x86_64 root=UUID=4156b89b-8af8-4449-9e8e-e5b92322b399 ro console=tty0 crashkernel=auto console=ttyS0,115200n8 no_timer_check net.ifnames=0 default_hugepagesz=1Gi hugepagesz=1Gi hugepages=2

[cloud-user@vm-rhel7-hugepages ~]$ free -h 
              total        used        free      shared  buff/cache   available
Mem:           3.7G        1.1G        2.4G        8.5M        141M        2.4G
Swap:            0B          0B          0B

[cloud-user@vm-rhel7-hugepages ~]$ cat /proc/meminfo | grep -i huge 
AnonHugePages:     10240 kB
HugePages_Total:       1
HugePages_Free:        1
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB

-------------------------------------------------------

Comment 13 Kedar Bidarkar 2020-04-30 14:19:07 UTC
Created attachment 1683321 [details]
Added  detailed info about VM  in an attachment file

Comment 14 Kedar Bidarkar 2020-04-30 14:20:28 UTC
Was expecting "HugePages_Total:       2",
but we got     "HugePages_Total:       1".

Memory/RAM was specified as 4Gi in the VM spec, and
we see approximately 4Gi (3.7G) in the "free -h" output in comment 12.
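
To see whether the guest can reserve the second 1Gi page at all, one diagnostic (a sketch; runtime allocation of 1Gi pages needs physically contiguous memory and may simply fail on older or fragmented kernels, in which case only boot-time allocation works) is to try growing the pool at runtime and re-checking:

  # Ask the kernel for two 1Gi pages at runtime, then see how many it actually reserved
  echo 2 | sudo tee /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
  grep -i huge /proc/meminfo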

Comment 16 Kedar Bidarkar 2020-06-08 16:09:44 UTC
We plan to have documentation around hugepages to set the expected behavior:
https://bugzilla.redhat.com/show_bug.cgi?id=1845198

Comment 17 Kedar Bidarkar 2020-06-10 12:24:45 UTC
As per comment 16, closing this bug; the expected behavior will be covered by documentation.

Comment 18 Andrew Burden 2020-07-21 18:21:12 UTC
The docs PR was merged:
https://github.com/openshift/openshift-docs/pull/23705
I will add a link to the docs when they are published.

Comment 19 Red Hat Bugzilla 2023-09-14 05:55:23 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days