Bug 1447300

Summary: enable libguestfs tools on ppc64le
Product: Red Hat Enterprise Virtualization Manager
Reporter: Michal Skrivanek <michal.skrivanek>
Component: vdsm
Assignee: Milan Zamazal <mzamazal>
Status: CLOSED ERRATA
QA Contact: Israel Pinto <ipinto>
Severity: medium
Priority: medium
Version: 4.1.0
CC: dfodor, dyuan, dzheng, gsun, ipinto, jiyan, lsurette, mzamazal, ratamir, rhv-bugzilla-bot, rjones, srevivo, trichard, tzheng, xchen, xuzhang, ycui, ykaul, ylavi
Target Milestone: ovirt-4.2.3
Target Release: ---
Hardware: ppc64le
OS: Unspecified
Doc Type: Enhancement
Doc Text: Sparsify and sysprep can now be run on POWER hosts.
Last Closed: 2018-05-15 17:51:25 UTC
Type: Bug
oVirt Team: Virt
Bug Depends On: 1526192

Description Michal Skrivanek 2017-05-02 11:10:20 UTC
Since libguestfs was missing until RHEL 7.4, the virt-* tools were not working on PPC64LE hosts. This can now be enabled so that virt-sparsify and virt-sysprep can be used. virt-v2v shall remain unsupported.
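
For reference, a minimal sketch of what this enables on a POWER host, matching the invocation Vdsm builds in virtsparsify.py later in this bug (the volume path is a placeholder):

  # sparsify a disk image in place, with machine-readable progress output
  virt-sparsify --machine-readable --in-place /path/to/volume.img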

Comment 2 RHV bug bot 2017-12-06 16:18:54 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{'rhevm-4.2-ga': '?'}', ]

For more info please contact: rhv-devops

Comment 3 RHV bug bot 2017-12-12 21:17:11 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{'rhevm-4.2-ga': '?'}', ]

For more info please contact: rhv-devops

Comment 4 Israel Pinto 2017-12-18 12:36:17 UTC
Blocked by: BZ 1526192

Comment 5 RHV bug bot 2017-12-18 17:06:41 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{'rhevm-4.2-ga': '?'}', ]

For more info please contact: rhv-devops

Comment 7 Israel Pinto 2018-01-28 11:53:45 UTC
Engine version: 4.2.1.3-0.1.el7

Host:
OS Version: RHEL - 7.5 - 3.el7
OS Description: Red Hat Enterprise Linux Server 7.5 Beta (Maipo)
Kernel Version: 3.10.0 - 830.el7.ppc64le
KVM Version: 2.10.0 - 18.el7
LIBVIRT Version: libvirt-3.9.0-7.el7
VDSM Version: vdsm-4.20.17-1.el7ev

I could not find any libguestfs packages on the host.
I also tested virt-sparsify, which is part of the libguestfs tools.
It failed; see the VDSM log:

2018-01-28 13:44:27,163+0200 ERROR (tasks/3) [root] Job u'8cbe99cd-111e-4372-9ad4-40838269827f' failed (jobs:221)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/jobs.py", line 157, in run
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdm/api/sparsify_volume.py", line 56, in _run
    virtsparsify.sparsify_inplace(self._vol_info.path)
  File "/usr/lib/python2.7/site-packages/vdsm/virtsparsify.py", line 66, in sparsify_inplace
    cmd = [_VIRTSPARSIFY.cmd, '--machine-readable', '--in-place', vol_path]
  File "/usr/lib/python2.7/site-packages/vdsm/common/cmdutils.py", line 70, in cmd
    self.name)
OSError: [Errno 2] No such file or directory: virt-sparsify
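
A quick way to confirm that observation on the host (package names taken from comment 9; a hedged manual check, not the code path Vdsm itself uses):

  rpm -q libguestfs libguestfs-tools-c
  command -v virt-sparsify || echo "virt-sparsify not installed"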

Comment 8 Michal Skrivanek 2018-02-27 17:58:01 UTC
Sorry, Israel, it shouldn't have been moved to ON_QA, as it does require us to pull in the package.

Comment 9 Israel Pinto 2018-03-18 09:56:30 UTC
Verifying with:
Engine Version: 4.2.2.4-0.1.el7
Host:
OS Version: RHEL - 7.5 - 8.el7
Kernel Version: 3.10.0 - 861.el7.ppc64le
KVM Version: 2.10.0 - 21.el7_5.1
LIBVIRT Version: libvirt-3.9.0-14.el7
VDSM Version: vdsm-4.20.22-1.el7ev

Libguestfs packages:
libguestfs-tools-c-1.36.10-6.el7.ppc64le
libguestfs-1.36.10-6.el7.ppc64le

Engine log:
2018-03-18 11:43:04,659+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-71) [1d8b6a1f-6b1b-496d-b910-78f63aea8a97] FINISH, GetHostJobsVDSCommand, return: {6ed3f4a3-ce3a-486e-891f-d011997b759e=HostJobInfo:{id='6ed3f4a3-ce3a-486e-891f-d011997b759e', type='storage', description='sparsify_volume', status='failed', progress='null', error='VDSError:{code='GeneralException', message='General Exception: ("Command ['/usr/bin/virt-sparsify', '--machine-readable', '--in-place', u'/rhev/data-center/mnt/10.16.29.93:_ppc__ge__1__nfs__1/e5e6a7e4-d0c0-4ffc-a408-e4409ffe1c42/images/dec15d27-352f-4736-9ab5-73407354ee6a/afb1d067-441f-40e3-b074-90340cc645f0'] failed with rc=1 out=['3/12'] err=['virt-sparsify: error: libguestfs error: could not create appliance through ', 'libvirt.', '', 'Try running qemu directly without libvirt using this environment variable:', 'export LIBGUESTFS_BACKEND=direct', '', 'Original error from libvirt: internal error: Process exited prior to exec: ', 'libvirt:  error : cannot limit locked memory to 18874368: Operation not ', 'permitted [code=1 int1=-1]', '', 'If reporting bugs, run virt-sparsify with debugging enabled and include the ', 'complete output:', '', '  virt-sparsify -v -x [...]']",)'}'}}, log id: 796ae26c
2018-03-18 11:43:04,662+02 INFO  [org.ovirt.engine.core.bll.StorageJobCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-71) [1d8b6a1f-6b1b-496d-b910-78f63aea8a97] Command SparsifyImage id: 'd061ff39-16cb-4707-8760-3c0dc27251a6': job '6ed3f4a3-ce3a-486e-891f-d011997b759e' execution was completed with VDSM job status 'failed'
2018-03-18 11:43:04,665+02 INFO  [org.ovirt.engine.core.bll.StorageJobCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-71) [1d8b6a1f-6b1b-496d-b910-78f63aea8a97] Command SparsifyImage id: 'd061ff39-16cb-4707-8760-3c0dc27251a6': execution was completed, the command status is 'FAILED'
2018-03-18 11:43:05,675+02 ERROR [org.ovirt.engine.core.bll.storage.disk.SparsifyImageCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-42) [1d8b6a1f-6b1b-496d-b910-78f63aea8a97] Ending command 'org.ovirt.engine.core.bll.storage.disk.SparsifyImageCommand' with failure.
2018-03-18 11:43:05,687+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-42) [1d8b6a1f-6b1b-496d-b910-78f63aea8a97] EVENT_ID: USER_SPARSIFY_IMAGE_FINISH_FAILURE(1,327), Failed to sparsify golden_mixed_virtio_template.
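
To capture the full debug output the error message asks for, a hedged reproduction sketch on the POWER host, following the hints printed in the log above (the volume path is a placeholder; LIBGUESTFS_BACKEND=direct bypasses libvirt, as the error suggests):

  export LIBGUESTFS_BACKEND=direct
  virt-sparsify -v -x --machine-readable --in-place /rhev/data-center/mnt/<storage>/<domain>/images/<image>/<volume>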

Adding logs

Comment 11 Yaniv Kaul 2018-03-18 10:07:14 UTC
Israel, the issue here seems quite different. It's in the log very clearly:
libvirt:  error : cannot limit locked memory to 18874368: Operation not permitted 

Googling this, the 1st and 2nd hits are BZs - long solved.
Can you involve the virt-QE team on this?

Comment 12 Israel Pinto 2018-03-18 13:50:03 UTC
Adding Dan Yuan from VirtQE.
I checked the BZs corresponding to the error messages:
1. "error: libguestfs error: could not create appliance through..."
   https://bugzilla.redhat.com/show_bug.cgi?id=1337869
   That BZ is on aarch64, not PPC, and the failure there occurs when running guestfish -N fs:ext3.
   Same message, but not related.
2. "libvirt: error : cannot limit locked memory ..."
   https://bugzilla.redhat.com/show_bug.cgi?id=1293024
    - We may have a ulimit problem on the host:
      [root@ibm-p8-rhevm-hv-01 vdsm]# ulimit 
      unlimited
      [root@ibm-p8-rhevm-hv-01 vdsm]# ulimit -a
      core file size          (blocks, -c) unlimited
      data seg size           (kbytes, -d) unlimited
      scheduling priority             (-e) 0
      file size               (blocks, -f) unlimited
      pending signals                 (-i) 480151
      max locked memory       (kbytes, -l) 65536
      max memory size         (kbytes, -m) unlimited
      open files                      (-n) 1024
      pipe size            (512 bytes, -p) 8
      POSIX message queues     (bytes, -q) 819200
      real-time priority              (-r) 0
      stack size              (kbytes, -s) 8192
      cpu time               (seconds, -t) unlimited
      max user processes              (-u) 480151
      virtual memory          (kbytes, -v) unlimited
      file locks                      (-x) unlimited
      [root@ibm-p8-rhevm-hv-01 vdsm]# ulimit -l
      65536
      I reset ulimit -l to unlimited, didn't solve the problem.

Dan, did you manage to sparsify a VM disk on PPC, either with virt-sparsify directly or via rhevm?
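
Note that a root shell's ulimit does not show what the vdsm process itself runs with; a hedged way to check the effective limit of the running service (assuming the process command line matches "vdsmd"):

  for pid in $(pgrep -f vdsmd); do
    echo "== pid $pid =="
    grep "locked memory" /proc/$pid/limits
  done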

Comment 13 Xuesong Zhang 2018-03-19 03:07:23 UTC
Updating the needinfo to Dan Zheng (dzheng@), who is working on PPC for libvirt.

Comment 14 Xianghua Chen 2018-03-19 09:25:42 UTC
Hi rjones,
Could you take a look at this bug? Is it related to libguestfs?
I think we have already enabled libguestfs on RHEL 7.5 for ppc64le, so I don't understand the bug summary.
Also, from comment 11 it looks like a libvirt problem, and I'm not familiar with the interaction between vdsm, libvirt, and libguestfs. Please help take a look, thanks.

Comment 15 Richard W.M. Jones 2018-03-19 09:38:47 UTC
As is always the case, please run ‘libguestfs-test-tool’ as the
same user that the virt tools normally run as (the vdsm user?) and post
the complete output.

Comment 17 Richard W.M. Jones 2018-03-19 10:10:03 UTC
ykaul: Can we modify vdsm to dump out the ulimits before it runs commands?

On both machines, using this command run as root:

  XDG_RUNTIME_DIR=/tmp su vdsm -c libguestfs-test-tool -s /bin/bash

everything is fine.  I also tried it without libvirt on both
machines and still no failure.

Could vdsm itself be changing the ulimits?

It's also possible that using 'su' does not set the ulimits the
same way that running vdsm does (presumably from systemd).

On the machine /etc/security/limits.d/95-kvm-memlock.conf (installed
by qemu) exists and looks correct.

Another thing I tried was adjusting the libguestfs appliance memory
size to find out where the new memory limit exists:

  LIBGUESTFS_MEMSIZE=6000 XDG_RUNTIME_DIR=/tmp su vdsm -c libguestfs-test-tool -s /bin/bash

It's somewhere between 6000MB and 7000MB.  The libguestfs appliance is
never this big so that's not the problem.
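
A minimal sketch of that kind of probing, assuming the same su invocation as above and purely illustrative memsize values:

  for m in 1000 2000 4000 6000 7000; do
    echo "== LIBGUESTFS_MEMSIZE=${m} =="
    LIBGUESTFS_MEMSIZE=$m XDG_RUNTIME_DIR=/tmp su vdsm -c libguestfs-test-tool -s /bin/bash \
      > /dev/null 2>&1 && echo ok || echo failed
  done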

> [root@ibm-p8-rhevm-hv-01 vdsm]# ulimit -l
> 65536
> I reset ulimit -l to unlimited, didn't solve the problem.

ulimits don't work this way.  You cannot set them for vdsm
without modifying vdsm.

Comment 18 Richard W.M. Jones 2018-03-19 10:12:01 UTC
Also if vdsm runs from systemd then this is relevant:
https://serverfault.com/questions/628610/increasing-nproc-for-processes-launched-by-systemd-on-centos-7/678861#678861

Comment 19 Yaniv Kaul 2018-03-19 10:55:57 UTC
We are running from systemd, and have LimitNOFILE=4096
We have /etc/security/limits.d/99-vdsm.conf, which contains:

vdsm - nproc 4096
vdsm - nofile 12288

Comment 20 Richard W.M. Jones 2018-03-19 12:30:46 UTC
Right, but the problem isn't nproc or nofile, it's 
LimitMEMLOCK (see systemd.exec(5)).

In other words, on POWER, you need to replicate the limits from
/etc/security/limits.d/95-kvm-memlock.conf

It's kind of annoying that systemd doesn't have a way
to use the /etc/security/limits files.

Comment 21 Richard W.M. Jones 2018-03-19 20:31:12 UTC
I didn't try it, but maybe qemu could drop a configuration file
into /etc/systemd/system.conf.d/*.conf containing LimitMEMLOCK=...
(on POWER only of course)?  It would be analogous to the
/etc/security/limits.d/95-kvm-memlock.conf that qemu already creates.

Comment 22 Milan Zamazal 2018-04-05 08:28:09 UTC
Richard, what do you think would be a good value for LimitMEMLOCK? I can see that libguestfs-test-tool reports "guestfs_get_memsize: 768" on our POWER machine. Is my guess of 1 GB OK, or should a different value be used?

Comment 23 Richard W.M. Jones 2018-04-05 08:37:33 UTC
It doesn't need to lock the whole of guest RAM, just a small
part used for page tables.

Please see comment 20 and comment 21.

Comment 24 Milan Zamazal 2018-04-05 09:12:36 UTC
I see, thank you, I misunderstood the values. While memlock is in kilobytes, libvirt reports the value in bytes. So the default limit 65536 KB should be enough and it seems to be applied on our Vdsm instance on POWER. The problem is apparently elsewhere.

Comment 25 Michal Skrivanek 2018-04-05 10:43:44 UTC
(In reply to Milan Zamazal from comment #24)
> I see, thank you, I misunderstood the values. While memlock is in kilobytes,
> libvirt reports the value in bytes. So the default limit 65536 KB should be
> enough and it seems to be applied on our Vdsm instance on POWER. The problem
> is apparently elsewhere.

What is the default value? IIUC it's 64 KB, which is not enough.

Comment 26 Milan Zamazal 2018-04-05 11:15:49 UTC
The limit is actually 64 MB on POWER, not 64 KB. Although I couldn't see any limit other than 64 MB anywhere on that host, it is indeed lost somewhere and doesn't apply to virt-sparsify or to something it runs. Adding it to vdsm.service resolves the problem, so let's use that fix.
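
A minimal sketch of an equivalent fix via a systemd drop-in (assumptions: the Vdsm unit is vdsmd.service, the intended value is 64 MB to mirror /etc/security/limits.d/95-kvm-memlock.conf, and a drop-in is used for illustration; the actual patch adds the limit to the service file itself):

  # /etc/systemd/system/vdsmd.service.d/99-memlock.conf  (hypothetical drop-in path)
  [Service]
  # assumed value: 64 MB, matching the memlock limit qemu installs on POWER
  LimitMEMLOCK=64M

  # then reload systemd and restart the service:
  systemctl daemon-reload
  systemctl restart vdsmd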

Comment 31 errata-xmlrpc 2018-05-15 17:51:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:1489

Comment 32 Franta Kust 2019-05-16 13:05:06 UTC
BZ-Jira Resync