Bug 1533158 - QEMU support for libvirtd restarting qemu-pr-helper [NEEDINFO]
Summary: QEMU support for libvirtd restarting qemu-pr-helper
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.4
Hardware: Unspecified
OS: Linux
Priority: urgent
Severity: high
Target Milestone: rc
Target Release: 7.5
Assignee: Paolo Bonzini
QA Contact: Xueqiang Wei
URL:
Whiteboard:
Depends On:
Blocks: 1470007 1558125
Reported: 2018-01-10 15:45 UTC by Michal Privoznik
Modified: 2018-11-01 11:04 UTC (History)

Fixed In Version: qemu-kvm-rhev-2.12.0-8.el7
Doc Type: Enhancement
Doc Text:
Clone Of: 1470007
Environment:
Last Closed: 2018-11-01 11:04:08 UTC
Target Upstream Version:
jschluet: needinfo? (pbonzini)


Comment 9 Miroslav Rezanina 2018-07-24 14:11:01 UTC
Fix included in qemu-kvm-rhev-2.12.0-8.el7

Comment 11 Xueqiang Wei 2018-07-30 10:11:15 UTC
In Description:
"As discussed on the list libvirt needs an event and query status command to recover status of pr-helpers (and possibly start them) on libvirtd restart."

I tested with the following versions:
kernel-3.10.0-926.el7.x86_64
qemu-kvm-rhev-2.12.0-8.el7
libvirt-4.5.0-4.el7.x86_64

The "qemu-pr-helper" service is not started after libvirtd is restarted; I have to start it manually.


Details:

1. SCSI-3 PR test:
Following https://bugzilla.redhat.com/show_bug.cgi?id=1464908#c23, the "SCSI-3 PR support to qemu" function works well.

# sh test-persist.sh /dev/sdb
    Persistent Reservation Out cmd: 5f 06 00 00 00 00 00 00 18 00 
PR out: command (Register and ignore existing key) successful
  PR generation=0x1, 1 registered reservation key follows:
    0x123aaa
    Persistent Reservation Out cmd: 5f 01 05 00 00 00 00 00 18 00 
PR out: command (Reserve) successful
  PR generation=0x1, Reservation follows:
    Key=0x123aaa
    scope: LU_SCOPE,  type: Write Exclusive, registrants only
    Persistent Reservation Out cmd: 5f 02 05 00 00 00 00 00 18 00 
PR out: command (Release) successful
  PR generation=0x1, there is NO reservation held
    Persistent Reservation Out cmd: 5f 00 05 00 00 00 00 00 18 00 
PR out: command (Register) successful
  PR generation=0x2, there are NO registered reservation keys

# sh test-persist.sh /dev/sdb 
    Persistent Reservation Out cmd: 5f 06 00 00 00 00 00 00 18 00 
PR out: command (Register and ignore existing key) successful
  PR generation=0x4, 2 registered reservation keys follow:
    0x123aaa
    0x123aaa
    Persistent Reservation Out cmd: 5f 01 05 00 00 00 00 00 18 00 
PR out: command (Reserve) successful
  PR generation=0x4, Reservation follows:
    Key=0x123aaa
    scope: LU_SCOPE,  type: Write Exclusive, registrants only
    Persistent Reservation Out cmd: 5f 02 05 00 00 00 00 00 18 00 
PR out: command (Release) successful
  PR generation=0x4, there is NO reservation held
    Persistent Reservation Out cmd: 5f 00 05 00 00 00 00 00 18 00 
PR out: command (Register) successful
  PR generation=0x6, there are NO registered reservation keys


2. qemu-pr-helper

(1) # systemctl status qemu-pr-helper  -l

● qemu-pr-helper.service - Persistent Reservation Daemon for QEMU
   Loaded: loaded (/usr/lib/systemd/system/qemu-pr-helper.service; static; vendor preset: disabled)
   Active: inactive (dead)

Jul 30 05:11:18 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:9] Failed to parse protect system value, ignoring: strict
Jul 30 05:11:18 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:10] Unknown lvalue 'ReadWritePaths' in section 'Service'
Jul 30 05:11:22 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:9] Failed to parse protect system value, ignoring: strict
Jul 30 05:11:22 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:10] Unknown lvalue 'ReadWritePaths' in section 'Service'
Jul 30 05:11:40 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:9] Failed to parse protect system value, ignoring: strict
Jul 30 05:11:40 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:10] Unknown lvalue 'ReadWritePaths' in section 'Service'

(2) # systemctl status libvirtd
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2018-07-30 01:23:59 EDT; 3h 48min ago
     Docs: man:libvirtd(8)
           https://libvirt.org
 Main PID: 27460 (libvirtd)
    Tasks: 19 (limit: 32768)
   CGroup: /system.slice/libvirtd.service
           ├─26873 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libv...
           ├─26874 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libv...
           └─27460 /usr/sbin/libvirtd

Jul 30 01:23:59 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: Starting Virtualization daemon...
Jul 30 01:23:59 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: Started Virtualization daemon.
Jul 30 01:23:59 ibm-x3650m4-05.lab.eng.pek2.redhat.com dnsmasq[26873]: read /etc/hosts - 2 addresses
Jul 30 01:23:59 ibm-x3650m4-05.lab.eng.pek2.redhat.com dnsmasq[26873]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Jul 30 01:23:59 ibm-x3650m4-05.lab.eng.pek2.redhat.com dnsmasq-dhcp[26873]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Jul 30 01:25:05 ibm-x3650m4-05.lab.eng.pek2.redhat.com libvirtd[27460]: 2018-07-30 05:25:05.824+0000: 27463: info : libvirt versi...com)
Jul 30 01:25:05 ibm-x3650m4-05.lab.eng.pek2.redhat.com libvirtd[27460]: 2018-07-30 05:25:05.824+0000: 27463: info : hostname: ibm....com
Jul 30 01:25:05 ibm-x3650m4-05.lab.eng.pek2.redhat.com libvirtd[27460]: 2018-07-30 05:25:05.824+0000: 27463: error : virDirOpenIn...tory
Jul 30 01:33:15 ibm-x3650m4-05.lab.eng.pek2.redhat.com libvirtd[27460]: 2018-07-30 05:33:15.265+0000: 27463: error : storageVolCr...eady
Hint: Some lines were ellipsized, use -l to show in full.

(3) # systemctl restart libvirtd
    # systemctl status libvirtd
● libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2018-07-30 05:12:33 EDT; 2s ago
     Docs: man:libvirtd(8)
           https://libvirt.org
 Main PID: 31453 (libvirtd)
    Tasks: 19 (limit: 32768)
   CGroup: /system.slice/libvirtd.service
           ├─26873 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libv...
           ├─26874 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/libexec/libv...
           └─31453 /usr/sbin/libvirtd

Jul 30 05:12:33 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: Starting Virtualization daemon...
Jul 30 05:12:33 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: Started Virtualization daemon.
Jul 30 05:12:34 ibm-x3650m4-05.lab.eng.pek2.redhat.com dnsmasq[26873]: read /etc/hosts - 2 addresses
Jul 30 05:12:34 ibm-x3650m4-05.lab.eng.pek2.redhat.com dnsmasq[26873]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Jul 30 05:12:34 ibm-x3650m4-05.lab.eng.pek2.redhat.com dnsmasq-dhcp[26873]: read /var/lib/libvirt/dnsmasq/default.hostsfile


(4) # systemctl status qemu-pr-helper  -l
● qemu-pr-helper.service - Persistent Reservation Daemon for QEMU
   Loaded: loaded (/usr/lib/systemd/system/qemu-pr-helper.service; static; vendor preset: disabled)
   Active: inactive (dead)

Jul 30 05:11:18 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:9] Failed to parse protect system value, ignoring: strict
Jul 30 05:11:18 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:10] Unknown lvalue 'ReadWritePaths' in section 'Service'
Jul 30 05:11:22 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:9] Failed to parse protect system value, ignoring: strict
Jul 30 05:11:22 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:10] Unknown lvalue 'ReadWritePaths' in section 'Service'
Jul 30 05:11:40 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:9] Failed to parse protect system value, ignoring: strict
Jul 30 05:11:40 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:10] Unknown lvalue 'ReadWritePaths' in section 'Service'
Jul 30 05:12:39 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:9] Failed to parse protect system value, ignoring: strict
Jul 30 05:12:39 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: [/usr/lib/systemd/system/qemu-pr-helper.service:10] Unknown lvalue 'ReadWritePaths' in section 'Service'

(5) # systemctl start qemu-pr-helper  -l
    # systemctl status qemu-pr-helper  -l
● qemu-pr-helper.service - Persistent Reservation Daemon for QEMU
   Loaded: loaded (/usr/lib/systemd/system/qemu-pr-helper.service; static; vendor preset: disabled)
   Active: active (running) since Mon 2018-07-30 05:13:54 EDT; 7s ago
 Main PID: 31548 (qemu-pr-helper)
    Tasks: 1
   CGroup: /system.slice/qemu-pr-helper.service
           └─31548 /usr/bin/qemu-pr-helper

Jul 30 05:13:54 ibm-x3650m4-05.lab.eng.pek2.redhat.com systemd[1]: Started Persistent Reservation Daemon for QEMU.

Comment 13 Michal Privoznik 2018-08-02 10:27:59 UTC
(In reply to Xueqiang Wei from comment #11)
> In Description:
> "As discussed on the list libvirt needs an event and query status command to
> recover status of pr-helpers (and possibly start them) on libvirtd restart."
> 
> I tested on below version:
> kernel-3.10.0-926.el7.x86_64
> qemu-kvm-rhev-2.12.0-8.el7
> libvirt-4.5.0-4.el7.x86_64
> 
> The service "qemu-pr-helper" is not started after restart libvirtd, I have
> to start it manually.

The event is delivered only after the guest tries PR. So if you kill qemu-pr-helper and don't issue any PR command, the event is not delivered and thus libvirt does not start the helper. However, once you do issue a PR command, libvirt should start the helper process immediately, without the guest seeing any failure. Can you please check whether this is the case?

Comment 14 Xueqiang Wei 2018-08-02 11:52:03 UTC
(In reply to Michal Privoznik from comment #13)
> (In reply to Xueqiang Wei from comment #11)
> > In Description:
> > "As discussed on the list libvirt needs an event and query status command to
> > recover status of pr-helpers (and possibly start them) on libvirtd restart."
> > 
> > I tested on below version:
> > kernel-3.10.0-926.el7.x86_64
> > qemu-kvm-rhev-2.12.0-8.el7
> > libvirt-4.5.0-4.el7.x86_64
> > 
> > The service "qemu-pr-helper" is not started after restart libvirtd, I have
> > to start it manually.
> 
> The event is delivered only after guest tries PR. So if you kill
> qemu-pr-helper and don't issue any PR command the event is not delivered and
> thus libvirt does not start the helper. However, once you do issue PR
> libvirt should start the helper process immediately without guest seeing any
> failure. Can you check please if this is the case?


Case link:
https://polarion.engineering.redhat.com/polarion/#/project/RedHatEnterpriseLinux7/workitem?id=RHEL-113695

I did not kill qemu-pr-helper; the service is in its default state.


Precondition for booting the guest: the "qemu-pr-helper" service is started.

If the "qemu-pr-helper" service is not started, the guest cannot boot.
(It fails with: "qemu-kvm: -object pr-manager-helper,id=helper0,path=/var/run/qemu-pr-helper.socket: Failed to connect socket /var/run/qemu-pr-helper.socket: No such file or directory")


I expected that restarting libvirtd would also restart the "qemu-pr-helper" service, so that the guest could then boot. But currently the "qemu-pr-helper" service is not started after libvirtd is restarted.
I have to start the "qemu-pr-helper" service manually, boot the guest, and then run the SCSI-3 PR test.

If the "qemu-pr-helper" service has already been started manually, why does libvirtd need to recover the status of qemu-pr-helper?


If I have misunderstood, please correct me. Thanks.

Comment 15 Michal Privoznik 2018-08-02 14:17:45 UTC
(In reply to Xueqiang Wei from comment #14)
> (In reply to Michal Privoznik from comment #13)
> > (In reply to Xueqiang Wei from comment #11)
> > > In Description:
> > > "As discussed on the list libvirt needs an event and query status command to
> > > recover status of pr-helpers (and possibly start them) on libvirtd restart."
> > > 
> > > I tested on below version:
> > > kernel-3.10.0-926.el7.x86_64
> > > qemu-kvm-rhev-2.12.0-8.el7
> > > libvirt-4.5.0-4.el7.x86_64
> > > 
> > > The service "qemu-pr-helper" is not started after restart libvirtd, I have
> > > to start it manually.
> > 
> > The event is delivered only after guest tries PR. So if you kill
> > qemu-pr-helper and don't issue any PR command the event is not delivered and
> > thus libvirt does not start the helper. However, once you do issue PR
> > libvirt should start the helper process immediately without guest seeing any
> > failure. Can you check please if this is the case?
> 
> 
> Case link:
> https://polarion.engineering.redhat.com/polarion/#/project/
> RedHatEnterpriseLinux7/workitem?id=RHEL-113695
> 
> I do not kill qemu-pr-helper, the service is by default.
> 
> 
> The premise of guest boot up: service "qemu-pr-helper" is started.

If qemu-pr-helper is run as a service, i.e. <reservations managed='no'/>, then libvirt will not restart the helper process no matter what. What does your disk XML look like?

> 
> If service "qemu-pr-helper" is not started, guest can not boot up. 
> (hit issue: "qemu-kvm: -object
> pr-manager-helper,id=helper0,path=/var/run/qemu-pr-helper.socket: Failed to
> connect socket /var/run/qemu-pr-helper.socket: No such file or directory") 
> 
> 
> I think if I restart libvirtd, service "qemu-pr-helper" will be restarted.

Not at all. When libvirtd itself restarts, it never kills any of the processes it started, just as it doesn't kill qemu.

> And then guest can boot up. But currently, service "qemu-pr-helper" is not
> started after restart libvirtd. 

I'm failing to see why libvirt should start a service?

> I have to start service "qemu-pr-helper" manually, boot up guest and then do
> SCSI-3 PR test.
> 
> If service "qemu-pr-helper" is already started manually, why need libvirtd
> to recover status of qemu-pr-helper ?
> 
> 
> If I understand it wrongly, please correct me. Thanks.

I think you are trying to mix two approaches at the same time. Libvirt implemented PR in two ways: managed and unmanaged. If reservations are managed, then libvirt automatically starts the qemu-pr-helper process when needed, restarts it after it dies, and so on. No 'service start qemu-pr-helper' is needed. Note that in this case there is one helper per domain (no matter how many disks the domain has).
The other approach is unmanaged reservations. In this case libvirt does not touch the qemu-pr-helper process at all and merely generates the qemu command line so that qemu can connect to a qemu-pr-helper process that is managed by a third party (e.g. systemd in your case). This has the drawback that if the helper process dies, libvirt will not restart it, because it was told not to.
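
To make the two modes concrete, here is a minimal Python sketch that classifies a disk definition as managed or unmanaged. The XML layout follows the libvirt domain XML format as I understand it (a <reservations> element under the disk's <source>); the device path, target name, and helper function names are illustrative assumptions, and the socket path is simply the one from the error message quoted earlier in this bug.

```python
import xml.etree.ElementTree as ET

# Illustrative disk definitions; /dev/sdb and the 'sda' target are assumptions.
UNMANAGED_DISK = """
<disk type='block' device='lun'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sdb'>
    <reservations managed='no'>
      <source type='unix' path='/var/run/qemu-pr-helper.socket' mode='client'/>
    </reservations>
  </source>
  <target dev='sda' bus='scsi'/>
</disk>
"""

MANAGED_DISK = """
<disk type='block' device='lun'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sdb'>
    <reservations managed='yes'/>
  </source>
  <target dev='sda' bus='scsi'/>
</disk>
"""


def reservation_mode(disk_xml):
    """Return 'managed', 'unmanaged', or 'none' for one <disk> element."""
    disk = ET.fromstring(disk_xml.strip())
    reservations = disk.find("./source/reservations")
    if reservations is None:
        return "none"  # persistent reservations are not enabled for this disk
    return "managed" if reservations.get("managed") == "yes" else "unmanaged"


def helper_socket_path(disk_xml):
    """Return the externally managed helper socket path, or None if absent."""
    disk = ET.fromstring(disk_xml.strip())
    socket_source = disk.find("./source/reservations/source")
    return None if socket_source is None else socket_source.get("path")
```

With these definitions, reservation_mode(UNMANAGED_DISK) reports the unmanaged case, which is the one where something other than libvirt (systemd here) has to keep qemu-pr-helper running.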

Comment 16 Paolo Bonzini 2018-08-02 20:56:30 UTC
> This has a drawback that if the helper process dies, libvirt will not restart 
> it, because it was told not to.

... and the advantage that if libvirt and the helper process both disappear (they die or are stopped), hopefully someone else (systemd?) will restart the helper process.

However, the managed mode is better for security if I understand correctly, because qemu-pr-helper can use the same SELinux context as QEMU and is isolated by VM.

Comment 17 Paolo Bonzini 2018-08-02 21:19:08 UTC
To test this, ignore libvirt.  If you stop the qemu-pr-helper service and try to send a PR command from the guest, you should get an event in the QMP saying that QEMU has disconnected from qemu-pr-helper.  If you restart the qemu-pr-helper service, you should get another event saying that it has connected.

Also, the new QMP command query-pr-managers will tell you if QEMU is connected or not to qemu-pr-helper.

If you need more clarification, please ask.

Comment 18 Xueqiang Wei 2018-08-07 06:50:17 UTC
(In reply to Paolo Bonzini from comment #17)
> To test this, ignore libvirt.  If you stop the qemu-pr-helper service and
> try to send a PR command from the guest, you should get an event in the QMP
> saying that QEMU has disconnected from qemu-pr-helper.  If you restart the
> qemu-pr-helper service, you should get another event saying that it has
> connected.
> 
> Also, the new QMP command query-pr-managers will tell you if QEMU is
> connected or not to qemu-pr-helper.
> 
> If you need more clarification, please ask.


Hi Paolo and Michal,

Thank you for your explanation.

I retested with the following versions and it works as expected, so I am marking this bug as verified.

kernel-3.10.0-931.el7.x86_64
qemu-kvm-rhev-2.12.0-8.el7

1. Start the qemu-pr-helper service.
2. Query the status with the QMP command:

   {"execute":"query-pr-managers"}
   {"return": [{"connected": true, "id": "helper0"}]}

3. Stop the qemu-pr-helper service.
4. Try to send a PR command from the guest.

   An event arrives in the QMP:
   {"timestamp": {"seconds": 1533622264, "microseconds": 314410}, "event": "PR_MANAGER_STATUS_CHANGED", "data": {"connected": false, "id": "helper0"}}

5. Restart the qemu-pr-helper service.
6. Try to send a PR command from the guest again.

   Another event arrives in the QMP:
   {"timestamp": {"seconds": 1533622401, "microseconds": 446974}, "event": "PR_MANAGER_STATUS_CHANGED", "data": {"connected": true, "id": "helper0"}}
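
The verification steps above could be scripted roughly as follows. This is only a sketch, assuming QEMU was started with a QMP monitor on a UNIX socket; the socket path passed to query_pr_managers() and all function names are hypothetical. Only the query-pr-managers command and the PR_MANAGER_STATUS_CHANGED event come from this bug.

```python
import json
import socket


def pr_manager_status(reply):
    """Map each pr-manager id to its 'connected' flag from a query-pr-managers reply."""
    return {entry["id"]: entry["connected"] for entry in reply["return"]}


def pr_status_change(message):
    """Return (id, connected) for a PR_MANAGER_STATUS_CHANGED event, else None."""
    if message.get("event") != "PR_MANAGER_STATUS_CHANGED":
        return None
    data = message["data"]
    return data["id"], data["connected"]


def qmp_execute(stream, command):
    """Send one QMP command; return (reply, events received while waiting)."""
    stream.write(json.dumps({"execute": command}) + "\n")
    stream.flush()
    events = []
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        message = json.loads(line)
        if "event" in message:
            events.append(message)  # asynchronous events may precede the reply
            continue
        return message, events


def query_pr_managers(qmp_socket_path):
    """Connect to a QMP UNIX socket (path is an assumption) and query PR managers."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(qmp_socket_path)
        with sock.makefile("rw", encoding="utf-8") as stream:
            json.loads(stream.readline())            # QMP greeting banner
            qmp_execute(stream, "qmp_capabilities")  # leave negotiation mode
            reply, _ = qmp_execute(stream, "query-pr-managers")
            return pr_manager_status(reply)
```

For the guest in this test, query_pr_managers() would be expected to report helper0 as connected in step 2, and pr_status_change() can be applied to each event seen in steps 4 and 6 to confirm the disconnect and reconnect.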

Comment 20 errata-xmlrpc 2018-11-01 11:04:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3443

