Bug 1669053 - Guest call trace when booting with an nvdimm device backed by /dev/dax
Summary: Guest call trace when booting with an nvdimm device backed by /dev/dax
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 8.0
Assignee: Stefan Hajnoczi
QA Contact: Yumei Huang
URL:
Whiteboard:
Duplicates: 1581104
Depends On: 1581104
Blocks:
 
Reported: 2019-01-24 08:28 UTC by Yumei Huang
Modified: 2020-12-20 07:46 UTC
CC List: 12 users

Fixed In Version: qemu-kvm-3.1.0-21.module+el8.0.1+3009+b48fff88
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1581104
Environment:
Last Closed: 2020-03-11 22:39:29 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Comment 3 Stefan Hajnoczi 2019-02-04 07:35:57 UTC
Reducing priority to medium.  Although this prevents guests from booting, launching QEMU directly from the command-line is not supported.

Comment 4 Stefan Hajnoczi 2019-02-12 03:27:37 UTC
Patch posted upstream to reject invalid sizes:
https://patchwork.ozlabs.org/patch/1040333/
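
For context, a minimal sketch of how the rejection is expected to look once the fix lands; the device path and sizes are illustrative, the error wording is the one later shown in comment 13, and note that the check applies only when pmem=on is set (see comment 15):

  $ ndctl list      # note the namespace "size" in bytes for /dev/dax0.0
  $ qemu-system-x86_64 -m 1G,slots=4,maxmem=4G -M pc,nvdimm,accel=kvm \
      -object memory-backend-file,mem-path=/dev/dax0.0,size=1G,id=mem0,share=on,pmem=on \
      -device nvdimm,id=dimm0,memdev=mem0
  qemu-system-x86_64: ... size property 1073741824 is larger than pmem file "/dev/dax0.0" size 1054867456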

Comment 5 Stefan Hajnoczi 2019-03-20 15:31:52 UTC
Moving to 8.1.0.

Comment 6 Stefan Hajnoczi 2019-03-20 15:36:22 UTC
That should have been "moving to 8.0.1", sorry!

Comment 8 Danilo de Paula 2019-04-08 19:57:22 UTC
Martin: can you grant pm_ack+, please?

Comment 9 Danilo de Paula 2019-04-11 18:33:24 UTC
Fix included in qemu-kvm-3.1.0-21.module+el8.0.1+3009+b48fff88

Comment 11 Yumei Huang 2019-04-25 03:52:15 UTC
Tested with qemu-kvm-3.1.0-23.module+el8.0.1+3077+4bc4491c and still hit the issue; QEMU did not reject invalid pmem file sizes.

[   64.243066] watchdog: BUG: soft lockup - CPU#15 stuck for 22s! [systemd-udevd:920]
[   64.252174] Modules linked in: dax_pmem nd_pmem device_dax nd_btt sg i2c_piix4 joydev pcspkr nfit libnvdimm xfs libcrc32c sr_mod cdrom sd_mod ata_generic bochs_drm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ata_piix drm libata virtio_net net_failover serio_raw failover dm_mirror dm_region_hash dm_log dm_mod
[   64.270481] CPU: 15 PID: 920 Comm: systemd-udevd Tainted: G             L   --------- -  - 4.18.0-80.el8.x86_64 #1
[   64.276921] Hardware name: Red Hat KVM, BIOS 1.12.0-1.module+el8+2706+3c6581b6 04/01/2014
[   64.282493] RIP: 0010:pmem_do_bvec+0xf7/0x330 [nd_pmem]
[   64.286716] Code: 2b 0d f5 45 29 f0 48 c1 f9 06 48 c1 e1 0c 48 03 0d f6 45 29 f0 44 89 fa 4c 01 e1 0f 1f 44 00 00 41 83 ff 08 0f 82 70 ff ff ff <48> 8b 03 48 8d 79 08 48 89 de 48 83 e7 f8 48 89 01 48 8b 44 13 f8
[   64.299139] RSP: 0018:ffff999702a5b9a8 EFLAGS: 00010212 ORIG_RAX: ffffffffffffff13
[   64.304475] RAX: ffff8e17747c1680 RBX: ffff9997bfff0000 RCX: ffff8e17735fd000
[   64.309650] RDX: 0000000000001000 RSI: ffffdecf4acd7f40 RDI: ffff8e176d901c18
[   64.314821] RBP: 0000000000001000 R08: 0000000000000000 R09: 00000000003fff80
[   64.319965] R10: 000000007fff0000 R11: 0000000000000000 R12: 0000000000000000
[   64.325103] R13: ffffdecf4acd7f40 R14: 0000000000001000 R15: 0000000000001000
[   64.331990] FS:  00007f257bf2b940(0000) GS:ffff8e17757c0000(0000) knlGS:0000000000000000
[   64.338058] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   64.342627] CR2: 0000000067d8900f CR3: 00000002b4130000 CR4: 00000000000006e0
[   64.347801] Call Trace:
[   64.351033]  ? memcg_kmem_get_cache+0x50/0x150
[   64.355512]  ? kmem_cache_alloc+0x160/0x1d0
[   64.359454]  pmem_rw_page+0x46/0x80 [nd_pmem]
[   64.363466]  bdev_read_page+0x7b/0xb0
[   64.367159]  do_mpage_readpage+0x6c8/0x720
[   64.371679]  ? check_disk_change+0x60/0x60
[   64.376166]  ? __add_to_page_cache_locked+0x1df/0x240
[   64.381546]  mpage_readpages+0x115/0x1e0
[   64.385977]  ? check_disk_change+0x60/0x60
[   64.390500]  ? check_disk_change+0x60/0x60
[   64.394982]  ? __d_lookup_done+0x77/0xe0
[   64.399355]  read_pages+0x6b/0x190
[   64.403422]  __do_page_cache_readahead+0x1c1/0x1e0
[   64.410993]  force_page_cache_readahead+0x90/0xf0
[   64.416499]  generic_file_buffered_read+0x556/0xa00
[   64.419456]  ? __seccomp_filter+0x44/0x4a0
[   64.422136]  new_sync_read+0x121/0x170
[   64.424697]  vfs_read+0x8a/0x140
[   64.427068]  ksys_read+0x4f/0xb0
[   64.429435]  do_syscall_64+0x5b/0x1b0
[   64.431928]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[   64.434877] RIP: 0033:0x7f257ae05b12

Comment 12 Ademar Reis 2019-04-25 15:01:21 UTC
*** Bug 1581104 has been marked as a duplicate of this bug. ***

Comment 13 Stefan Hajnoczi 2019-04-29 17:26:14 UTC
(In reply to Yumei Huang from comment #11)
> Tested with qemu-kvm-3.1.0-23.module+el8.0.1+3077+4bc4491c, still hit the
> issue, QEMU didn't reject invalid pmem file sizes.

Did you add the pmem=on option to -object memory-backend-file?

Here is my command-line for qemu-kvm-3.1.0-23.module+el8.0.1+3077+4bc4491c:

  $ qemu-system-x86_64 -m 1G,slots=4,maxmem=4G -M pc,nvdimm,accel=kvm -object memory-backend-file,mem-path=/dev/dax0.0,size=1G,id=mem0,share=on,pmem=on -device nvdimm,id=dimm0,memdev=mem0 -drive if=virtio,file=test.img,format=raw
  qemu-system-x86_64: -object memory-backend-file,mem-path=/dev/dax0.0,size=1G,id=mem0,share=on,pmem=on: size property 1073741824 is larger than pmem file "/dev/dax0.0" size 1054867456

Comment 14 Yumei Huang 2019-04-30 02:14:46 UTC
(In reply to Stefan Hajnoczi from comment #13)
> (In reply to Yumei Huang from comment #11)
> > Tested with qemu-kvm-3.1.0-23.module+el8.0.1+3077+4bc4491c, still hit the
> > issue, QEMU didn't reject invalid pmem file sizes.
> 
> Did you add the pmem=on option to -object memory-backend-file?

No, I didn't, as the bug was reported without the option. 

I checked the qemu-kvm man page:

"
     The pmem option specifies whether the backing file specified by mem-
     path is in host persistent memory that can be accessed using the SNIA
     NVM programming model (e.g. Intel NVDIMM).  If pmem is set to 'on',
     QEMU will take necessary operations to guarantee the persistence of its
     own writes to mem-path (e.g. in vNVDIMM label emulation and live
     migration).
"

"pmem=on" means the /dev/dax is in host persistent memory. In my case, it's emulated by kernel line "memmap=4G!2G". I guess it shouldn't set pmem=on. 

Still, I tried with pmem=on, and QEMU rejected it with a different error.

#  /usr/libexec/qemu-kvm -m 10G,slots=20,maxmem=40G -smp 32  -M pc,nvdimm -object memory-backend-file,mem-path=/dev/dax7.0,size=2G,id=mem0,share=on,pmem=on -device nvdimm,id=dimm0,memdev=mem0 
qemu-kvm: -object memory-backend-file,mem-path=/dev/dax7.0,size=2G,id=mem0,share=on,pmem=on: Lack of libpmem support while setting the 'pmem=on' of memory-backend-file '(null)'. We can't ensure data persistence.


Stefan, would you please check whether my understanding is right, or whether this fix applies only to real NVDIMM devices? Thanks!

> 
> Here is my command-line for qemu-kvm-3.1.0-23.module+el8.0.1+3077+4bc4491c:
> 
>   $ qemu-system-x86_64 -m 1G,slots=4,maxmem=4G -M pc,nvdimm,accel=kvm
> -object
> memory-backend-file,mem-path=/dev/dax0.0,size=1G,id=mem0,share=on,pmem=on
> -device nvdimm,id=dimm0,memdev=mem0 -drive if=virtio,file=test.img,format=raw
>   qemu-system-x86_64: -object
> memory-backend-file,mem-path=/dev/dax0.0,size=1G,id=mem0,share=on,pmem=on:
> size property 1073741824 is larger than pmem file "/dev/dax0.0" size
> 1054867456

Comment 15 Stefan Hajnoczi 2019-05-01 15:28:34 UTC
(In reply to Yumei Huang from comment #14)
> (In reply to Stefan Hajnoczi from comment #13)
> > (In reply to Yumei Huang from comment #11)
> > > Tested with qemu-kvm-3.1.0-23.module+el8.0.1+3077+4bc4491c, still hit the
> > > issue, QEMU didn't reject invalid pmem file sizes.
> > 
> > Did you add the pmem=on option to -object memory-backend-file?
> 
> No, I didn't, as the bug was reported without the option. 
> 
> I checked qemu-kvm man page,
> 
> "
>      The pmem option specifies whether the backing file specified by mem-
>      path is in host persistent memory that can be accessed using the SNIA
>      NVM programming model (e.g. Intel NVDIMM).  If pmem is set to 'on',
>      QEMU will take necessary operations to guarantee the persistence of its
>      own writes to mem-path (e.g. in vNVDIMM label emulation and live
>      migration).
> "
> 
> "pmem=on" means the /dev/dax is in host persistent memory. In my case, it's
> emulated by kernel line "memmap=4G!2G". I guess it shouldn't set pmem=on. 

The size check is only performed when pmem=on.  This explains why you didn't encounter it.

I think the goal of testing is to exercise the NVDIMM code paths, so it makes sense to set pmem=on.  Otherwise you are exercising different code paths than what a customer with a physical NVDIMM will use.

> Still, I had a try with pmem=on, qemu rejected it with another error.
> 
> #  /usr/libexec/qemu-kvm -m 10G,slots=20,maxmem=40G -smp 32  -M pc,nvdimm
> -object
> memory-backend-file,mem-path=/dev/dax7.0,size=2G,id=mem0,share=on,pmem=on
> -device nvdimm,id=dimm0,memdev=mem0 
> qemu-kvm: -object
> memory-backend-file,mem-path=/dev/dax7.0,size=2G,id=mem0,share=on,pmem=on:
> Lack of libpmem support while setting the 'pmem=on' of memory-backend-file
> '(null)'. We can't ensure data persistence.
> 
> 
> Stefan, Would you please help check if my understanding is right? Or if this
> fix is only for real nvdimm device? Thanks!

Oops, RHEL8 doesn't link against nvml (libpmem-devel) so the libpmem support is not being built in!  I have CCed Paul Lai.

Paul: I remember we discussed RPM dependencies on libpmem for RHEL 7 NVDIMM backports.  What is the status in RHEL 7?  It seems like we should update qemu-kvm.spec.template in RHEL8 to add a libpmem dependency too?
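
For reference, a quick way to confirm whether a given qemu-kvm build was linked against libpmem (a sketch; the binary path is the standard RHEL 8 one):

  $ ldd /usr/libexec/qemu-kvm | grep libpmem                               # no output means libpmem support was not built in
  $ rpm -q --requires "$(rpm -qf /usr/libexec/qemu-kvm)" | grep -i pmem    # a libpmem-enabled build would carry this RPM dependency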

Comment 16 Stefan Hajnoczi 2019-05-01 15:35:50 UTC
(In reply to Stefan Hajnoczi from comment #15)
> Paul: I remember we discussed RPM dependencies on libpmem for RHEL 7 NVDIMM
> backports.  What is the status in RHEL 7?  It seems like we should update
> qemu-kvm.spec.template in RHEL8 to add a libpmem dependency too?

I took a quick look at RHEL7 and don't see the libpmem dependency there either, even though the libpmem code has been backported.  I am creating BZes to track this.

Comment 19 Yumei Huang 2019-06-10 08:38:54 UTC
Hi Stefan,
I tested with qemu-kvm-3.1.0-27.module+el8.0.1+3253+c5371cb3: with pmem=on I get the same result as you got in comment 13, but with pmem=off QEMU core dumped with a "Bus error" during guest boot. Would you please have a look? Thanks!

Version:
qemu-kvm-3.1.0-27.module+el8.0.1+3253+c5371cb3
Kernel: 4.18.0-80.1.2.el8_0.x86_64 (both guest and host)
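
(For reference, a sketch of how the backtrace below can be pulled out of the core dump on RHEL 8, assuming systemd-coredump captured it:)

  $ coredumpctl list /usr/libexec/qemu-kvm
  $ coredumpctl gdb /usr/libexec/qemu-kvm    # then run "bt" at the (gdb) prompt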

Comment 20 Yumei Huang 2019-06-10 08:45:31 UTC
Adding more info for the core dump in comment 19.

Backtrace:

(gdb) bt
#0  0x00007f51dd9e797a in pthread_sigmask () at /lib64/libpthread.so.0
#1  0x0000562ddecd8870 in sigbus_reraise ()
#2  0x0000562ddecd88d3 in  ()
#3  0x00007f51dd9eadc0 in <signal handler called> () at /lib64/libpthread.so.0
#4  0x00007f51dd773480 in __memmove_avx_unaligned_erms () at /lib64/libc.so.6
#5  0x0000562ddee0c5e8 in nvdimm_dsm_write ()
#6  0x0000562ddeced413 in memory_region_write_accessor ()
#7  0x0000562ddeceb5c6 in access_with_adjusted_size ()
#8  0x0000562ddecef390 in memory_region_dispatch_write ()
#9  0x0000562ddec99463 in flatview_write_continue ()
#10 0x0000562ddec99689 in flatview_write ()
#11 0x0000562ddec9d793 in address_space_write ()
#12 0x0000562dded01030 in kvm_cpu_exec ()
#13 0x0000562ddecda646 in qemu_kvm_cpu_thread_fn ()
#14 0x0000562ddefe80e4 in qemu_thread_start ()
#15 0x00007f51dd9e02de in start_thread () at /lib64/libpthread.so.0
#16 0x00007f51dd7112e3 in clone () at /lib64/libc.so.6


QEMU cli:

/usr/libexec/qemu-kvm -M pc,nvdimm  \
 -m 1G,slots=256,maxmem=40G  \
-object memory-backend-file,id=mem2,share,mem-path=/dev/dax0.0,size=2G,align=128M,pmem=off \
-device nvdimm,memdev=mem2,id=nv2,label-size=2M \
/home/kvm_autotest_root/images/rhel801-64-virtio-scsi.qcow2 \
-monitor stdio -vnc :0


Host dax info:

# ndctl list
[
  {
    "dev":"namespace0.0",
    "mode":"devdax",
    "map":"dev",
    "size":2111832064,
    "uuid":"ea6ef3e1-9480-460d-ac1e-db42100af146",
    "chardev":"dax0.0",
    "align":4096
  }
]
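
One hedged observation, not from the original report: the namespace above is 2111832064 bytes, which is smaller than the requested size=2G (2147483648 bytes), and per comment 15 no size check is done when pmem=off. A sketch of choosing a size that fits the device (rounding down to a 128M multiple only to mirror the align=128M used above):

  $ echo $(( (2111832064 / (128*1024*1024)) * 128 ))M    # -> 1920M
  $ /usr/libexec/qemu-kvm -M pc,nvdimm -m 1G,slots=256,maxmem=40G \
      -object memory-backend-file,id=mem2,share=on,mem-path=/dev/dax0.0,size=1920M,align=128M,pmem=off \
      -device nvdimm,memdev=mem2,id=nv2,label-size=2M ...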

Comment 24 WeiYang 2019-06-21 02:12:10 UTC
Hi, this is Richard from Intel, and I am trying to reproduce this in our environment.

Generally, I need the QEMU source to reproduce it, so I am looking for a src rpm or the QEMU source code itself.


BTW, I have tried to rpmbuild qemu-kvm-2.12.0-74.module+el8.1.0+3227+57d66ad3.src.rpm, which is the base version of qemu-kvm-av.
But the rpmbuild fails due to missing dependency packages: libiscsi-devel, libssh2-devel, and pkgconfig(gbm).

Following are the respective error messages when I tried to install them:

Error: 
 Problem: package libiscsi-devel-1.18.0-6.module+el8+2547+34fca794.i686 requires libiscsi.so.8, but none of the providers can be installed
  - package libiscsi-devel-1.18.0-6.module+el8+2547+34fca794.i686 requires libiscsi(x86-32) = 1.18.0-6.module+el8+2547+34fca794, but none of the providers can be installed
  - libiscsi-1.18.0-6.module+el8+2547+34fca794.i686 has inferior architecture
  - conflicting requests

Error: Unable to find a match: libssh2-devel

Package pkgconf-pkg-config-1.4.2-1.el8.x86_64 is already installed
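
(As an aside, a sketch of pulling in the build dependencies with dnf builddep instead of chasing individual -devel packages; this assumes the relevant RHEL 8 / AV module repositories are enabled:)

  $ dnf install dnf-plugins-core    # provides the builddep plugin
  $ dnf builddep qemu-kvm-2.12.0-74.module+el8.1.0+3227+57d66ad3.src.rpm
  $ rpmbuild --rebuild qemu-kvm-2.12.0-74.module+el8.1.0+3227+57d66ad3.src.rpm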

Comment 25 Stefan Hajnoczi 2019-07-23 11:52:49 UTC
(In reply to WeiYang from comment #24)
> Hi, This is Richard from Intel and I am trying to reproduce this in our
> environment.
> 
> Generally, I need the source of qemu to reproduce it. So I am looking for a
> src rpm or the qemu source code itself.
> 
> 
> BTW, I have tried to do rpmbuild
> qemu-kvm-2.12.0-74.module+el8.1.0+3227+57d66ad3.src.rpm, which is the base
> version of qemu-kvm-av.
> But the rpmbuild doesn't work since lack of other dependent rpm, ,
> libiscsi-devel, libssh2-devel, pkgconfig(gbm).
> 
> Following are the error message respectively when I tried to install them:
> 
> Error: 
>  Problem: package libiscsi-devel-1.18.0-6.module+el8+2547+34fca794.i686
> requires libiscsi.so.8, but none of the providers can be installed
>   - package libiscsi-devel-1.18.0-6.module+el8+2547+34fca794.i686 requires
> libiscsi(x86-32) = 1.18.0-6.module+el8+2547+34fca794, but none of the
> providers can be installed
>   - libiscsi-1.18.0-6.module+el8+2547+34fca794.i686 has inferior architecture
>   - conflicting requests
> 
> Error: Unable to find a match: libssh2-devel
> 
> Package pkgconf-pkg-config-1.4.2-1.el8.x86_64 is already installed

Please try the latest AV 8.0.1 qemu-kvm.  This bug was fixed in qemu-kvm-3.1.0-21.module+el8.0.1+3009+b48fff88.

Comment 26 Yumei Huang 2019-07-24 02:37:47 UTC
Hi Stefan,

Would you please take a look at comments 19 and 20? QEMU core dumped during guest boot. I also reproduced it with qemu-kvm-4.0.0-5.module+el8.1.0+3622+5812d9bf; do I need to file a new bug for AV 8.1, or just keep this one?


Thanks,
Yumei Huang

Comment 28 Stefan Hajnoczi 2019-08-01 13:13:02 UTC
(In reply to Yumei Huang from comment #26)
> Would you please take a look at comment 19 and 20, qemu core dumped during
> guest booting. I also reproduced with
> qemu-kvm-4.0.0-5.module+el8.1.0+3622+5812d9bf, do I need to file a new bug
> for AV 8.1, or just keep this one? 

The crash is a new bug.  Please track it with a separate BZ.  Thanks!

Comment 29 Yumei Huang 2019-08-02 02:53:46 UTC

(In reply to Stefan Hajnoczi from comment #28)
> (In reply to Yumei Huang from comment #26)
> > Would you please take a look at comment 19 and 20, qemu core dumped during
> > guest booting. I also reproduced with
> > qemu-kvm-4.0.0-5.module+el8.1.0+3622+5812d9bf, do I need to file a new bug
> > for AV 8.1, or just keep this one? 
> 
> The crash is a new bug.  Please track it with a separate BZ.  Thanks!

Thanks Stefan.

Filed Bug 1736789 to track the core dump issue.

Comment 30 Yumei Huang 2019-08-02 02:59:06 UTC
Moving to verified per comments 19 and 29.

Comment 31 Ademar Reis 2020-02-05 22:53:56 UTC
QEMU has recently been split into sub-components, and as a one-time operation to avoid breaking tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks.

Comment 32 Jeff Nelson 2020-03-11 22:39:29 UTC
Given that this bug is VERIFIED and RHEL AV 8.1.0 shipped (went GA) on 11 Nov 2019, I am closing this bug report.

