Bug 1220702 - wrong display of current memory after memory hot-plug
Summary: wrong display of current memory after memory hot-plug
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.2
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Assignee: Peter Krempa
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 1216046
Blocks: 1288337 1305606 1313485
 
Reported: 2015-05-12 08:58 UTC by Luyao Huang
Modified: 2016-11-03 18:16 UTC (History)

Fixed In Version: libvirt-2.0.0-6.el7
Doc Type: Bug Fix
Doc Text:
Clone Of: 1216046
Environment:
Last Closed: 2016-11-03 18:16:52 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2016:2577 0 normal SHIPPED_LIVE Moderate: libvirt security, bug fix, and enhancement update 2016-11-03 12:07:06 UTC

Description Luyao Huang 2015-05-12 08:58:37 UTC
+++ This bug was initially created as a clone of Bug #1216046 +++

--- Additional comment from Luyao Huang on 2015-04-30 04:51:34 EDT ---

Hi Peter,

...

I have another question, about how the initial memory and current memory are displayed.

I tested memory devices with qemu-2.3.0 and found that the current memory is not correct after hot-plugging a memory device:

1. prepare a running vm with maxmemory and numa settings: 

  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1024000</memory>
  <currentMemory unit='KiB'>1024000</currentMemory>
...
  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='512000' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='512000' unit='KiB'/>
    </numa>
  </cpu>

2. hot-plug a memory device
# virsh attach-device test3 memdevice.xml
Device attached successfully

3. check the current memory and memory balloon:

# virsh dumpxml test3
...
  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1536000</memory>
  <currentMemory unit='KiB'>1024000</currentMemory>


# virsh dommemstat test3
actual 1536000
swap_in 0
swap_out 0
major_fault 370
minor_fault 393570
unused 1375984
available 1506908
rss 260156


# virsh qemu-monitor-command test3 --hmp info balloon
balloon: actual=1500

In the guest, the actual memory is 1471M. According to libvirt.org, current memory is "the actual allocation of memory for the guest". So shouldn't this value (current memory) be 1536000 after hot-plugging a memory device?
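The mismatch above can be checked with a little arithmetic. This is only a sketch: it assumes the HMP "info balloon" value is in MiB (which matches the numbers in this report), and the helper name hmp_balloon_kib is made up for illustration.

```python
import re

MIB_IN_KIB = 1024


def hmp_balloon_kib(hmp_output: str) -> int:
    """Extract the balloon size from HMP 'info balloon' output, in KiB.

    Assumes the 'actual=' value is reported in MiB.
    """
    match = re.search(r"actual=(\d+)", hmp_output)
    if match is None:
        raise ValueError(f"unexpected output: {hmp_output!r}")
    return int(match.group(1)) * MIB_IN_KIB


balloon_kib = hmp_balloon_kib("balloon: actual=1500")
reported_current_kib = 1024000  # <currentMemory> from the dumpxml output above

print(balloon_kib)                         # 1536000, same as 'actual' in dommemstat
print(balloon_kib - reported_current_kib)  # 512000 KiB missing from <currentMemory>
```

So the balloon and dommemstat agree on 1536000 KiB, and only the currentMemory element in the XML lags behind.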



And upstream patch:

commit 2f37362e44400d91f51c9e147f71e98a4eca42c0
Author: Peter Krempa <pkrempa>
Date:   Thu Apr 30 18:03:41 2015 +0200

    qemu: Fix balloon size handling with memory hot(un)plug
    
    Since libvirt doesn't call to update the new balloon size in qemu add
    code that will handle tweaking of the size of the current balloon
    statistic until qemu reports the new size using the event.

Comment 1 Peter Krempa 2015-05-12 11:08:38 UTC
Upstream commits fixing this and similar problems:

commit 2f37362e44400d91f51c9e147f71e98a4eca42c0
Author: Peter Krempa <pkrempa>
Date:   Thu Apr 30 18:03:41 2015 +0200

    qemu: Fix balloon size handling with memory hot(un)plug
    
    Since libvirt doesn't call to update the new balloon size in qemu add
    code that will handle tweaking of the size of the current balloon
    statistic until qemu reports the new size using the event.

commit de03b1dddee50f3f9718f03102697e7066af7f64
Author: Peter Krempa <pkrempa>
Date:   Thu Apr 30 17:43:53 2015 +0200

    conf: Fix up balloon size after removing a memory device from def
    
    To avoid having the ballooned memory size larger than the actual
    physical memory size, truncate the ballooned size if it overflows.

commit fccc2c331311422d89a6f87b64f82672bc3f3f75
Author: Peter Krempa <pkrempa>
Date:   Thu Apr 30 17:33:41 2015 +0200

    conf: Always truncate balloon size to maximum memory size
    
    Specifying a balloon size more than the memory size of a guest isn't
    something that should be rejected when parsing the XML. Truncate the
    size to the maximum memory size.
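The truncation described in the last two commit messages amounts to a clamp. The snippet below is a rough illustration only, not the libvirt code; the function name and the sample values are hypothetical.

```python
def clamp_balloon(cur_balloon_kib: int, total_memory_kib: int) -> int:
    """Truncate the balloon size so it never exceeds the total memory size."""
    return min(cur_balloon_kib, total_memory_kib)


# Hypothetical values: a balloon size larger than memory gets truncated,
# a smaller one is left alone.
print(clamp_balloon(2048000, 1024000))  # 1024000
print(clamp_balloon(800000, 1024000))   # 800000
```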

Comment 3 Fangge Jin 2015-07-28 04:51:44 UTC
1. I have a running guest configured as below (currentMemory < memory):
# virsh dumpxml r71-2
...
  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1024000</memory>
  <currentMemory unit='KiB'>1000000</currentMemory>
...
    <numa>
      <cell id='0' cpus='0-1' memory='512000' unit='KiB'/>
      <cell id='1' cpus='2' memory='512000' unit='KiB'/>
    </numa>

2. Then attach a memory device:
# virsh attach-device r71-2 dimm1.xml 
Device attached successfully

3. Check the XML:
# virsh dumpxml r71-2
...
  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1286144</memory>
  <currentMemory unit='KiB'>1000000</currentMemory>

The current memory doesn't change. 

4. Check the memory in the guest; it does change:
# free -h
              total        used        free      shared  buff/cache   available
Mem:           1.2G        433M        365M        7.7M        402M        707M
Swap:          923M          0B        923M

So why doesn't the current memory change after hot-plugging a memory device when currentMemory < memory?

Comment 4 Luyao Huang 2015-09-01 09:42:34 UTC
Tested with libvirt-1.2.17-6.el7.x86_64; the issue in comment 3 can still be reproduced.

Comment 5 Luyao Huang 2015-09-07 08:01:13 UTC
Found a similar issue during memory device hot-unplug:

1. About the aligned memory size:
Prepare a guest with memory like this:

# virsh dumpxml rhel7.0-rhel --inactive
<domain type='kvm'>
  <name>rhel7.0-rhel</name>
  <uuid>67c7a123-5415-4136-af62-a2ee098ba6cd</uuid>
  <maxMemory slots='16' unit='KiB'>25600000</maxMemory>
  <memory unit='KiB'>4048000</memory>
  <currentMemory unit='KiB'>4048000</currentMemory>

After the guest starts, memory is aligned to 4048896 but currentMemory is still 4048000:

# virsh dumpxml rhel7.0-rhel
<domain type='kvm' id='20'>
  <name>rhel7.0-rhel</name>
  <uuid>67c7a123-5415-4136-af62-a2ee098ba6cd</uuid>
  <maxMemory slots='16' unit='KiB'>25600000</maxMemory>
  <memory unit='KiB'>4048896</memory>
  <currentMemory unit='KiB'>4048000</currentMemory>
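The aligned value is consistent with rounding the memory size up to a multiple of 1 MiB. That granularity is an assumption inferred from the numbers here; the real alignment may depend on the libvirt/QEMU version and the architecture. A minimal sketch:

```python
def align_up(size_kib: int, alignment_kib: int = 1024) -> int:
    """Round size_kib up to the next multiple of alignment_kib."""
    return -(-size_kib // alignment_kib) * alignment_kib


print(align_up(4048000))  # 4048896, the <memory> value of the running guest
print(align_up(1024000))  # 1024000, already a multiple of 1 MiB
```

currentMemory stays at the unaligned 4048000, which is how memory != currentMemory arises even though both were configured to the same value.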



2. If memory != currentMemory, a memory hot-plug followed by a hot-unplug makes libvirt show a wrong current memory size (similar to comment 3):

# cat memdevice.xml 
    <memory model='dimm'>
      <source>
        <pagesize unit='KiB'>4</pagesize>
        <nodemask>0</nodemask>
      </source>
      <target>
        <size unit='m'>128</size>
        <node>0</node>
      </target>
    </memory>
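For reference, the size of this device in KiB can be worked out as below. This is a sketch of the unit scaling, not libvirt's parser: the table follows the unit suffixes in the libvirt domain XML documentation, and treating the suffix case-insensitively is an assumption (it is consistent with unit='m' being accepted here).

```python
# Bytes per unit. 'KB'/'MB'/'GB'/'TB' are powers of 10, the rest powers of 2.
_UNIT_TO_BYTES = {
    "b": 1, "bytes": 1,
    "kb": 10**3, "k": 2**10, "kib": 2**10,
    "mb": 10**6, "m": 2**20, "mib": 2**20,
    "gb": 10**9, "g": 2**30, "gib": 2**30,
    "tb": 10**12, "t": 2**40, "tib": 2**40,
}


def to_kib(value: int, unit: str = "KiB") -> int:
    """Convert a size with a libvirt-style unit suffix to KiB, rounding up."""
    size_bytes = value * _UNIT_TO_BYTES[unit.lower()]
    return -(-size_bytes // 1024)


dimm_kib = to_kib(128, "m")
print(dimm_kib)            # 131072
print(4048896 + dimm_kib)  # 4179968, the aligned memory plus one 128 MiB dimm
```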

# virsh attach-device rhel7.0-rhel memdevice.xml;virsh dumpxml rhel7.0-rhel | grep -A1 "memory unit";virsh detach-device rhel7.0-rhel memdevice.xml;virsh dumpxml rhel7.0-rhel | grep -A1 "memory unit"
Device attached successfully

  <memory unit='KiB'>4179968</memory>
  <currentMemory unit='KiB'>3130496</currentMemory>
Device detached successfully

  <memory unit='KiB'>4048896</memory>
  <currentMemory unit='KiB'>2999424</currentMemory>

# virsh attach-device rhel7.0-rhel memdevice.xml;virsh dumpxml rhel7.0-rhel | grep -A1 "memory unit";virsh detach-device rhel7.0-rhel memdevice.xml;virsh dumpxml rhel7.0-rhel | grep -A1 "memory unit"
Device attached successfully

  <memory unit='KiB'>4179968</memory>
  <currentMemory unit='KiB'>2999424</currentMemory>
Device detached successfully

  <memory unit='KiB'>4048896</memory>
  <currentMemory unit='KiB'>2868352</currentMemory>
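The drift in the output above can be reproduced with a small model. It only mirrors the observed numbers (attach leaves currentMemory alone, detach subtracts the dimm size); it is not taken from the libvirt source, and the function names are made up.

```python
DIMM_KIB = 128 * 1024  # the 128 MiB device from memdevice.xml


def observed_attach(current_kib: int) -> int:
    """Observed behaviour: hot-plug does not raise currentMemory."""
    return current_kib


def observed_detach(current_kib: int) -> int:
    """Observed behaviour: hot-unplug subtracts the dimm size anyway."""
    return current_kib - DIMM_KIB


current = 3130496  # currentMemory after the first attach
history = [current]
for operation in (observed_detach, observed_attach, observed_detach):
    current = operation(current)
    history.append(current)

print(history)  # [3130496, 2999424, 2999424, 2868352]
```

Each attach/detach cycle therefore loses 131072 KiB from the reported current memory.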

Comment 9 Peter Krempa 2016-04-15 12:30:05 UTC
Fixed upstream:

commit 6306ee6249024322bda2ab0f6f53bec7e24ea897
Author: Peter Krempa <pkrempa>
Date:   Wed Apr 6 15:57:57 2016 +0200

    qemu: hotplug: Properly recalculate/reload balloon size after hot(un)plug
    
    Rather than trying some magic calculations on our side query the monitor
    for the current size of the memory balloon both on hotplug and
    hotunplug.

Comment 11 Luyao Huang 2016-06-01 08:03:06 UTC
Tried to verify this bug with libvirt-1.3.4-1.el7.x86_64:

1. prepare a guest with numa + maxmemory:

# virsh dumpxml rhel72-test
...
  <maxMemory slots='16' unit='KiB'>15243264</maxMemory>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>3145728</currentMemory>
  <vcpu placement='static' current='10'>24</vcpu>
...
  <cpu mode='host-passthrough'>
    <numa>
      <cell id='0' cpus='0-10' memory='1048576' unit='KiB'/>
      <cell id='1' cpus='11-20' memory='1048576' unit='KiB'/>
      <cell id='2' cpus='21-23' memory='1048576' unit='KiB'/>
    </numa>
  </cpu>
...

2. start guest and hot-plug memory device:

# virsh start rhel72-test
Domain rhel72-test started

# cat memdevice1G.xml 
    <memory model='dimm'>
      <target>
        <size unit='G'>1</size>
        <node>0</node>
      </target>
    </memory>

# virsh attach-device rhel72-test memdevice1G.xml 
Device attached successfully


3. recheck xml and guest memory:

# virsh dumpxml rhel72-test
...
  <maxMemory slots='16' unit='KiB'>15243264</maxMemory>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
...

# virsh dominfo rhel72-test
Id:             53
Name:           rhel72-test
UUID:           855670a9-34e6-4da2-a1ec-1993de100d79
OS Type:        hvm
State:          running
CPU(s):         10
CPU time:       60.3s
Max memory:     4194304 KiB
Used memory:    4194304 KiB
Persistent:     yes
Autostart:      disable
Managed save:   no
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c508,c895 (permissive)

# virsh dommemstat rhel72-test
actual 4194304
swap_in 0
swap_out 0
major_fault 1154
minor_fault 938030
unused 2532828
available 3175068
rss 957752


3. edit guest and make current memory < memory:

# virsh dumpxml rhel72-test --inactive
...
  <maxMemory slots='16' unit='KiB'>15242882</maxMemory>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>1145728</currentMemory>
  <vcpu placement='static' current='10'>24</vcpu>
...
  <cpu mode='host-passthrough'>
    <numa>
      <cell id='0' cpus='0-10' memory='1048576' unit='KiB'/>
      <cell id='1' cpus='11-20' memory='1048576' unit='KiB'/>
      <cell id='2' cpus='21-23' memory='1048576' unit='KiB'/>
    </numa>
  </cpu>
...

4. start guest and hot-plug memory device:


# virsh start rhel72-test
Domain rhel72-test started

# cat memdevice1G.xml 
    <memory model='dimm'>
      <target>
        <size unit='G'>1</size>
        <node>0</node>
      </target>
    </memory>

# virsh attach-device rhel72-test memdevice1G.xml 
Device attached successfully


5. recheck xml and guest memory:

# virsh dumpxml rhel72-test
...
  <maxMemory slots='16' unit='KiB'>15243264</maxMemory>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>2194304</currentMemory>
...

# virsh dominfo rhel72-test
Id:             54
Name:           rhel72-test
UUID:           855670a9-34e6-4da2-a1ec-1993de100d79
OS Type:        hvm
State:          running
CPU(s):         10
CPU time:       79.6s
Max memory:     4194304 KiB
Used memory:    2194304 KiB
Persistent:     yes
Autostart:      disable
Managed save:   no
Security model: selinux
Security DOI:   0
Security label: system_u:system_r:svirt_t:s0:c614,c971 (permissive)

# virsh dommemstat rhel72-test
actual 2194304
swap_in 0
swap_out 0
major_fault 1172
minor_fault 1038927
unused 511804
available 1175068
rss 880036

6. and test memory unplug:

# virsh detach-device rhel72-test memdevice1G.xml 
Device detached successfully

# virsh dumpxml rhel72-test
...
  <maxMemory slots='16' unit='KiB'>15243264</maxMemory>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>1145728</currentMemory>
...

# virsh dommemstat rhel72-test
actual 1145728
swap_in 0
swap_out 0
major_fault 1158
minor_fault 917555
unused 280604
available 912924
rss 1007208


7. test cold-plug/unplug when memory == currentmemory:
# virsh dumpxml rhel72-test --inactive
...

  <maxMemory slots='16' unit='KiB'>15242882</maxMemory>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>3145728</currentMemory>
...

# virsh attach-device rhel72-test memdevice1G.xml --config
Device attached successfully

# virsh dumpxml rhel72-test --inactive
...
  <maxMemory slots='16' unit='KiB'>15242882</maxMemory>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>     <---- updated
...

# virsh detach-device rhel72-test memdevice1G.xml --config
Device detached successfully

[root@hp-dl385g7-09 ~]# virsh dumpxml rhel72-test --inactive
...
  <maxMemory slots='16' unit='KiB'>15242882</maxMemory>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>3145728</currentMemory>
...


8. test cold-plug/unplug when memory != currentmemory:

# virsh dumpxml rhel72-test  --inactive
...
  <maxMemory slots='16' unit='KiB'>15242882</maxMemory>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>2145728</currentMemory>
...

# virsh attach-device rhel72-test memdevice1G.xml --config
Device attached successfully

# virsh dumpxml rhel72-test  --inactive
...
  <maxMemory slots='16' unit='KiB'>15242882</maxMemory>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>2145728</currentMemory>      <----not change
...

# virsh detach-device rhel72-test memdevice1G.xml --config
Device detached successfully

# virsh dumpxml rhel72-test  --inactive
<domain type='kvm'>
  <name>rhel72-test</name>
  <uuid>855670a9-34e6-4da2-a1ec-1993de100d79</uuid>
  <maxMemory slots='16' unit='KiB'>15242882</maxMemory>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>2145728</currentMemory>
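The comparison done by hand in the steps above can be scripted. This sketch parses two dumpxml fragments and checks whether currentMemory moved by the same amount as memory. It assumes both elements use unit='KiB'; the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def memory_pair(fragment: str) -> Tuple[int, int]:
    """Return (memory, currentMemory) in KiB from a dumpxml fragment."""
    # Wrap the fragment so that it parses as one well-formed document.
    root = ET.fromstring("<domain>" + fragment + "</domain>")
    return int(root.findtext("memory")), int(root.findtext("currentMemory"))


def balloon_follows_plug(before: str, after: str) -> bool:
    """True if currentMemory changed by the same amount as memory."""
    mem_before, cur_before = memory_pair(before)
    mem_after, cur_after = memory_pair(after)
    return (mem_after - mem_before) == (cur_after - cur_before)


def fragment(memory_kib: int, current_kib: int) -> str:
    return (f"<memory unit='KiB'>{memory_kib}</memory>"
            f"<currentMemory unit='KiB'>{current_kib}</currentMemory>")


# Hot-plug with currentMemory < memory (steps 4 and 5).
print(balloon_follows_plug(fragment(3145728, 1145728),
                           fragment(4194304, 2194304)))  # True
# Cold-plug with currentMemory != memory (step 8).
print(balloon_follows_plug(fragment(3145728, 2145728),
                           fragment(4194304, 2145728)))  # False
```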

Comment 12 Luyao Huang 2016-06-01 08:09:15 UTC
Hi Peter,

I noticed that libvirt won't change the current memory size after cold-plugging a memory device on a guest whose current memory != memory size, but will change it on a guest whose current memory == memory size (see steps 7 and 8 in comment 11).

Could you please help check whether this is expected?

Thanks,
Luyao

Comment 13 Peter Krempa 2016-06-15 07:54:09 UTC
It was deliberate but I might want to revisit that decision.

Comment 14 yalzhang@redhat.com 2016-06-27 09:31:42 UTC
Hi Peter,

I found the symptoms below on libvirt-1.3.5-1.el7.x86_64 with hot-plug/unplug and cold-plug/unplug. Please take them into consideration as well. Thank you.

current memory == memory
1) coldunplug will not change memory size.

current memory != memory
2) coldunplug will not change current memory and memory. 
3) hotunplug reports success but in fact fails; the XML does not change after hotunplug.

# virsh dumpxml rhel7.2
 <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1024000</memory>
  <currentMemory unit='KiB'>1024000</currentMemory>
  <vcpu placement='static'>4</vcpu>
....
<cpu mode='custom' match='exact'>
   <numa>
      <cell id='0' cpus='0-1' memory='512000' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='512000' unit='KiB'/>
    </numa>
</cpu>
...

1) coldunplug will not change memory size.
# virsh domstate rhel7.2
shut off

# virsh attach-device rhel7.2 memdevice.xml --config
Device attached successfully

# virsh dumpxml rhel7.2 | grep current -B2
  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1536000</memory>
  <currentMemory unit='KiB'>1536000</currentMemory>

# virsh detach-device rhel7.2 memdevice.xml --config
Device detached successfully

# virsh dumpxml rhel7.2 | grep current -B2
  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1536000</memory>     -------->memory not change
  <currentMemory unit='KiB'>1024000</currentMemory>  

# virsh start rhel7.2; virsh dumpxml rhel7.2 | grep current -B2
Domain rhel7.2 started

  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1024000</memory>  -------->memory will be 1024000 at running state, but change to 1536000 when the domain is down.
  <currentMemory unit='KiB'>1024000</currentMemory> 

current memory != memory
# virsh dumpxml rhel7.2
...
  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1024000</memory>
  <currentMemory unit='KiB'>800000</currentMemory>
  <vcpu placement='static'>4</vcpu>
...
<cpu mode='custom' match='exact'>
   <numa>
      <cell id='0' cpus='0-1' memory='512000' unit='KiB'/>
      <cell id='1' cpus='2-3' memory='512000' unit='KiB'/>
    </numa>
</cpu>
...

2) coldunplug will not change current memory and memory.

# virsh domstate rhel7.2
shut off

# virsh attach-device rhel7.2 memdevice.xml --config
Device attached successfully

# virsh dumpxml rhel7.2 | grep current -B2
  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1536000</memory>
  <currentMemory unit='KiB'>800000</currentMemory> ---just as comment 12

# virsh detach-device rhel7.2 memdevice.xml --config
Device detached successfully

# virsh dumpxml rhel7.2 | grep current -B2
  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1536000</memory>  -----> not change
  <currentMemory unit='KiB'>800000</currentMemory> ----> not change

After the domain boots up, the memory and current memory are reasonable. But after destroy, memory changes to 1536000:
# virsh start rhel7.2 ; virsh dumpxml rhel7.2 | grep current -B2
Domain rhel7.2 started

  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1024000</memory>  ---> it makes sense
  <currentMemory unit='KiB'>800000</currentMemory>  

# virsh destroy rhel7.2; virsh dumpxml rhel7.2 | grep current -B2
Domain rhel7.2 destroyed

  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1536000</memory>
  <currentMemory unit='KiB'>800000</currentMemory>

3) hotunplug reports success but in fact fails; the XML does not change after hotunplug.
# virsh start rhel7.2

After the domain boot up,
# virsh dumpxml rhel7.2 | grep current -B2
  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1024000</memory>
  <currentMemory unit='KiB'>800000</currentMemory>

# virsh attach-device rhel7.2 memdevice.xml 
Device attached successfully

# virsh attach-device rhel7.2 memdevice.xml 
Device attached successfully

# virsh dumpxml rhel7.2 
...
  <maxMemory slots='16' unit='KiB'>2560000</maxMemory>
  <memory unit='KiB'>1536000</memory>
  <currentMemory unit='KiB'>1312000</currentMemory>
...
    <memory model='dimm'>
      <target>
        <size unit='KiB'>512000</size>
        <node>0</node>
      </target>
      <alias name='dimm0'/>
      <address type='dimm' slot='0' base='0x100000000'/>
    </memory>
  <memory model='dimm'>
      <target>
        <size unit='KiB'>512000</size>
        <node>0</node>
      </target>
      <alias name='dimm1'/>
      <address type='dimm' slot='1' base='0x11f400000'/>
    </memory>
...
# virsh detach-device rhel7.2 memdevice.xml 
Device detached successfully
# virsh detach-device rhel7.2 memdevice.xml 
Device detached successfully
# virsh detach-device rhel7.2 memdevice.xml 
Device detached successfully
# virsh detach-device rhel7.2 memdevice.xml 
Device detached successfully
....
The command always succeeds, but in fact, after several detach attempts (>4), the 2 attached dimm devices are still in the guest XML. The guest also hits a kernel panic after the first detach.


Refer to below info for more details about 3)
on host:
# virsh qemu-monitor-event rhel7.2 --loop
....
event ACPI_DEVICE_OST at 1467016144.421065 for domain rhel7.2: {"info":{"device":"dimm0","source":1,"status":0,"slot":"0","slot-type":"DIMM"}}

event ACPI_DEVICE_OST at 1467016202.688859 for domain rhel7.2: {"info":{"device":"dimm1","source":1,"status":1,"slot":"1","slot-type":"DIMM"}} 

---> status is "1", but the attach appears to succeed on the host, and an ACPI error pops up in the guest

event ACPI_DEVICE_OST at 1467016737.520537 for domain rhel7.2: {"info":{"device":"dimm0","source":3,"status":132,"slot":"0","slot-type":"DIMM"}} 

----> status is "132", but the detach succeeds from the host's point of view; the guest hits a kernel panic at the same time
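The event lines above can be filtered mechanically. The sketch below parses the "virsh qemu-monitor-event" output format as it appears in this comment and lists the events whose status is not 0. Treating every non-zero status as suspicious is a simplification; the exact meaning of each code is defined by the ACPI _OST specification.

```python
import json
import re

EVENT_RE = re.compile(
    r"^event (\S+) at (\d+\.\d+) for domain (\S+): (\{.*\})\s*$")


def parse_event(line: str):
    """Split one event line into (name, timestamp, domain, payload dict)."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an event line: {line!r}")
    name, timestamp, domain, payload = match.groups()
    return name, float(timestamp), domain, json.loads(payload)


lines = [
    'event ACPI_DEVICE_OST at 1467016144.421065 for domain rhel7.2: {"info":{"device":"dimm0","source":1,"status":0,"slot":"0","slot-type":"DIMM"}}',
    'event ACPI_DEVICE_OST at 1467016202.688859 for domain rhel7.2: {"info":{"device":"dimm1","source":1,"status":1,"slot":"1","slot-type":"DIMM"}}',
    'event ACPI_DEVICE_OST at 1467016737.520537 for domain rhel7.2: {"info":{"device":"dimm0","source":3,"status":132,"slot":"0","slot-type":"DIMM"}}',
]

non_zero = []
for line in lines:
    _, _, _, payload = parse_event(line)
    info = payload["info"]
    if info["status"] != 0:
        non_zero.append((info["device"], info["source"], info["status"]))

print(non_zero)  # [('dimm1', 1, 1), ('dimm0', 3, 132)]
print(hex(132))  # 0x84
```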


# virsh console rhel7.2
Connected to domain rhel7.2
Escape character is ^]

Red Hat Enterprise Linux Server 7.2 (Maipo)
Kernel 3.10.0-327.el7.x86_64 on an x86_64

localhost login: root
Password: 
Last login: Mon Jun 27 04:49:28 on tty1
------> attach the second dimm device here
[root@localhost ~]# [   54.946879] acpi PNP0C80:01: acpi_memory_enable_device() error  
[root@localhost ~]# cd /sys/devices/system/memory
[root@localhost memory]# ls
block_size_bytes   memory2   memory34  memory38  memory6  soft_offline_page
hard_offline_page  memory3   memory35  memory39  memory7  uevent
memory0		   memory32  memory36  memory4	 power
memory1		   memory33  memory37  memory5	 probe
-----> detach the dimm device here; the guest hits a kernel panic

[root@localhost memory]# [  796.652495] ------------[ cut here ]------------
[  796.653273] kernel BUG at mm/memory_hotplug.c:1850!
[  796.653273] invalid opcode: 0000 [#1] SMP 
[  796.653273] Modules linked in: ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm ppdev snd_timer snd sg pcspkr soundcore virtio_balloon parport_pc i2c_piix4 parport ip_tables xfs libcrc32c sr_mod cdrom ata_generic pata_acpi virtio_net virtio_console virtio_blk qxl syscopyarea sysfillrect sysimgblt drm_kms_helper ata_piix ttm libata drm serio_raw virtio_pci virtio_ring virtio i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod
[  796.653273] CPU: 2 PID: 6 Comm: kworker/u8:0 Not tainted 3.10.0-327.el7.x86_64 #1
[  796.653273] Hardware name: Red Hat KVM, BIOS 1.9.1-4.el7 04/01/2014
[  796.653273] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[  796.653273] task: ffff88001e75b980 ti: ffff88001e7cc000 task.ti: ffff88001e7cc000
[  796.653273] RIP: 0010:[<ffffffff81627f83>]  [<ffffffff81627f83>] remove_memory+0x93/0xa0
[  796.653273] RSP: 0018:ffff88001e7cfd28  EFLAGS: 00010202
[  796.653273] RAX: 0000000000000001 RBX: 0000000100000000 RCX: 0000000000000000
[  796.653273] RDX: 0000000000000000 RSI: ffff88003e40d6c8 RDI: ffffffff819c6e80
[  796.653273] RBP: ffff88001e7cfd48 R08: 0000000000000096 R09: 000000000000029b
[  796.653273] R10: 0000000000000000 R11: ffff88001e7cf9c6 R12: 000000001f400000
[  796.653273] R13: 000000011f400000 R14: 0000000000000000 R15: ffff880036b546b0
[  796.653273] FS:  0000000000000000(0000) GS:ffff88003e400000(0000) knlGS:0000000000000000
[  796.653273] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  796.653273] CR2: 00007fdc21e54044 CR3: 000000001d3f7000 CR4: 00000000000006e0
[  796.653273] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  796.653273] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  796.653273] Stack:
[  796.653273]  ffff88003b502800 ffff880036b546a0 ffff880036b546b0 0000000000000000
[  796.653273]  ffff88001e7cfd80 ffffffff81391d9b ffff88003c815800 ffff88003c8157f0
[  796.653273]  ffffffff81a037c0 ffff88003c815818 ffff88003c8159f0 ffff88001e7cfdb0
[  796.653273] Call Trace:
[  796.653273]  [<ffffffff81391d9b>] acpi_memory_device_remove+0x79/0xa5
[  796.653273]  [<ffffffff81363608>] acpi_bus_trim+0x5a/0x8d
[  796.653273]  [<ffffffff81364e16>] acpi_device_hotplug+0x1b7/0x418
[  796.653273]  [<ffffffff8135ee25>] acpi_hotplug_work_fn+0x1e/0x29
[  796.653273]  [<ffffffff8109d5fb>] process_one_work+0x17b/0x470
[  796.653273]  [<ffffffff8109e3cb>] worker_thread+0x11b/0x400
[  796.653273]  [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[  796.653273]  [<ffffffff810a5aef>] kthread+0xcf/0xe0
[  796.653273]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[  796.653273]  [<ffffffff81645858>] ret_from_fork+0x58/0x90
[  796.653273]  [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[  796.653273] Code: ff ff 44 89 f7 e8 9e c9 b9 ff 48 c7 c7 80 6e 9c 81 e8 62 0b 01 00 5b 41 5c 41 5d 41 5e 5d c3 48 c7 c7 80 6e 9c 81 e8 4d 0b 01 00 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 66 66 66 66 90 55 48 89 
[  796.653273] RIP  [<ffffffff81627f83>] remove_memory+0x93/0xa0
[  796.653273]  RSP <ffff88001e7cfd28>
[  796.726551] ---[ end trace f94301065d2faab8 ]---
[  796.727664] Kernel panic - not syncing: Fatal exception
[  796.728651] drm_kms_helper: panic occurred, switching back to text console

Comment 15 yalzhang@redhat.com 2016-06-28 10:19:23 UTC
Hi Peter,

Could you please help check whether items 1~3 in comment 14 are expected?

Comment 16 yalzhang@redhat.com 2016-07-26 10:49:18 UTC
(In reply to yalzhang from comment #14)
Hi Peter, 

Sorry for such a messy comment.
Please ignore comment 14: I have re-tested all the scenarios with the packages below and cannot reproduce 1), 2) and 3); only the problem in comment 12 remains.

libvirt-2.0.0-3.el7.x86_64
qemu-kvm-rhev-2.6.0-15.el7.x86_64
guest kernel: kernel-3.10.0-478.el7.x86_64

Comment 17 Luyao Huang 2016-08-15 08:45:43 UTC
(In reply to Peter Krempa from comment #13)
> It was deliberate but I might want to revisit that decision.

This commit has been merged upstream:

commit 707063efa86a8cab3ee7d64ad203f2e517b82bce
Author: Shivaprasad G Bhat <sbhat.ibm.com>
Date:   Thu Jul 21 15:39:30 2016 +0530

    qemu: Adjust the cur_ballon on coldplug/unplug of dimms
    
    The cur_balloon also increases/decreases with dimm hotplug/unplug.
    To be consistent, adjust the value for coldplug too. This was inconsistently
    taken care when cur_ballon != memory to begin with. The patch fixes it
    irrespective of that.
    
    Signed-off-by: Shivaprasad G Bhat <sbhat.ibm.com>
    Signed-off-by: Peter Krempa <pkrempa>

The problem described in steps 7 and 8 of comment 11 has been fixed by this commit.

Peter, will you backport this patch for this bug, or fix this problem in RHEL 7.4 during the libvirt rebase?

Thanks

Comment 19 Luyao Huang 2016-09-01 02:13:52 UTC
Verified this bug with libvirt-2.0.0-6.el7.x86_64:

Retest of the issue fixed by the new patch:

1.

# virsh dumpxml r7
...
  <maxMemory slots='16' unit='KiB'>15242882</maxMemory>
  <memory unit='KiB'>2228224</memory>
  <currentMemory unit='KiB'>1228224</currentMemory>
  <vcpu placement='static' current='3'>12</vcpu>
...
    <memory model='dimm'>
      <target>
        <size unit='KiB'>131072</size>
        <node>1</node>
      </target>
    </memory>

2. cold-unplug memory device 

# virsh detach-device r7 memdevice.xml --config
Device detached successfully

3. check memory

# virsh dumpxml r7
<domain type='kvm'>
  <name>r7</name>
  <uuid>67c7a123-5415-4136-af62-a2ee098ba6cd</uuid>
  <maxMemory slots='16' unit='KiB'>15242882</maxMemory>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>1097152</currentMemory>

4. cold-plug memory device:

# virsh attach-device r7 memdevice1G.xml  --current
Device attached successfully

5. check guest xml:

# virsh dumpxml r7
<domain type='kvm'>
  <name>r7</name>
  <uuid>67c7a123-5415-4136-af62-a2ee098ba6cd</uuid>
  <maxMemory slots='16' unit='KiB'>15242882</maxMemory>
  <memory unit='KiB'>3145728</memory>
  <currentMemory unit='KiB'>2145728</currentMemory>
...
    <memory model='dimm'>
      <target>
        <size unit='KiB'>1048576</size>
        <node>1</node>
      </target>
    </memory>
...

6. detach memory device:

# virsh detach-device r7 memdevice1G.xml  --current
Device detached successfully

# virsh dumpxml r7
<domain type='kvm'>
  <name>r7</name>
  <uuid>67c7a123-5415-4136-af62-a2ee098ba6cd</uuid>
  <maxMemory slots='16' unit='KiB'>15242882</maxMemory>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>1097152</currentMemory>

I also retested the problem described in comment 0 with the steps in comment 11; no regression found.

Comment 21 errata-xmlrpc 2016-11-03 18:16:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2577.html

