Bug 1240494

Summary: The system won't bypass the cache even with auto_dump_bypass_cache enabled in qemu.conf
Product: Red Hat Enterprise Linux 7 Reporter: Dan Zheng <dzheng>
Component: libvirt Assignee: Andrea Bolognani <abologna>
Status: CLOSED NOTABUG QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.2 CC: abologna, dyuan, gsun, lhuang, mzhan, ngu, rbalakri
Target Milestone: rc   
Target Release: ---   
Hardware: ppc64le   
OS: Linux   
Last Closed: 2017-04-07 10:42:14 UTC Type: Bug
Attachments:
Description Flags
Program showing the effect of O_DIRECT on /proc/*/fdinfo/* none

Description Dan Zheng 2015-07-07 06:14:46 UTC
Description of problem:
The system does not bypass the cache even when auto_dump_bypass_cache is enabled in qemu.conf. This problem does not exist on x86.

Version-Release number of selected component (if applicable):
libvirt-1.2.17-1.el7.ppc64le
qemu-kvm-rhev-2.3.0-5.el7.ppc64le
kernel-3.10.0-282.el7.ppc64le



How reproducible:
100%

Steps to Reproduce:
1. Set auto_dump_bypass_cache = 1 in qemu.conf
# cat /etc/libvirt/qemu.conf
...
auto_dump_bypass_cache = 1
...

2. Restart libvirtd.
# systemctl restart libvirtd
3. Edit the guest XML to include <on_crash>coredump-restart</on_crash>.
4. Start the guest.
5. In a second terminal (terminal-2), prepare the following command:
# cat /proc/$(lsof -w /var/lib/libvirt/qemu/dump/*|awk '/libvirt_i/{print $2}')/fdinfo/*1*
6. Within the guest, stop the kdump service and make the guest crash.
# systemctl stop kdump
# echo c > /proc/sysrq-trigger
7. Before the guest finishes crashing, run the command from step 5 repeatedly on terminal-2.

# cat /proc/$(lsof -w /var/lib/libvirt/qemu/dump/*|awk '/libvirt_i/{print $2}')/fdinfo/*1*
pos:        3822059520
flags:        0600001
...
pos:        3968860160
flags:        0600001
...
pos:        4107272192
flags:        0600001

Actual result:
In the flags value of the step 7 output, the third digit is '0', which indicates that 'auto_dump_bypass_cache' is disabled. The system did not bypass the cache for the automatic dump even though auto_dump_bypass_cache was enabled.


Expected result:
In the flags value of the step 7 output, the third digit is '4', which indicates that 'auto_dump_bypass_cache' is enabled. The system should bypass the cache for the automatic dump when auto_dump_bypass_cache is enabled.

Comment 3 Andrea Bolognani 2017-04-07 10:41:48 UTC
Created attachment 1269673 [details]
Program showing the effect of O_DIRECT on /proc/*/fdinfo/*

Comment 4 Andrea Bolognani 2017-04-07 10:42:14 UTC
(In reply to Dan Zheng from comment #0)
[...]
> # cat /proc/$(lsof -w /var/lib/libvirt/qemu/dump/*|awk '/libvirt_i/{print
> $2}')/fdinfo/*1*
> pos:        3822059520
> flags:        0600001
> ...
> pos:        3968860160
> flags:        0600001
> ...
> pos:        4107272192
> flags:        0600001
> 
> Actual result:
> In the flags in step 6 output, the third value is '0' which stands for
> 'auto_dump_bypass_cache' is disabled. The system did not bypass the cache
> for auto dump with auto_dump_bypass_cache enabled.
> 
> Expect result:
> In the flags in step 6 output, the third value is '4' which stands for
> 'auto_dump_bypass_cache' is enabled. The system should bypass the cache for
> auto dump with auto_dump_bypass_cache enabled.

The testing methodology is not entirely correct because it
doesn't account for the fact that the concrete value of, e.g.,
O_DIRECT can differ depending on the architecture.

I've attached a simple C program that demonstrates the issue:
when run on an x86_64 host, it will print

  $ ./direct whatever
  Without O_DIRECT:
  pos:	0
  flags:	0100001
  mnt_id:	62

  With O_DIRECT:
  pos:	0
  flags:	0140001
  mnt_id:	62

whereas on a ppc64le or aarch64 host it will print

  $ ./direct whatever
  Without O_DIRECT:
  pos:    0
  flags:  0400001
  mnt_id: 80

  With O_DIRECT:
  pos:    0
  flags:  0600001
  mnt_id: 80

So 0600001 is the expected output for ppc64le hosts, and
libvirt is behaving correctly. Closing as NOTABUG.