Bug 1151838 - [vdsm] Scan alignment fails with a VirtAlignError
Summary: [vdsm] Scan alignment fails with a VirtAlignError
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.0
Hardware: x86_64
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-3.6.2
Target Release: 3.6.0
Assignee: Tal Nisan
QA Contact: Aharon Canan
URL:
Whiteboard:
Duplicates: 1269860
Depends On:
Blocks: TRACKER-bugs-affecting-libguestfs
 
Reported: 2014-10-12 13:44 UTC by Elad
Modified: 2019-10-10 09:26 UTC
CC List: 17 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-09 19:26:00 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments
vdsm, libvirt and engine logs (1.10 MB, application/x-gzip) - 2014-10-12 13:44 UTC, Elad
logs (rhel7) (1.21 MB, application/x-gzip) - 2015-03-22 09:01 UTC, Elad
libguestfs-test-tool err output (47.19 KB, text/plain) - 2015-05-13 15:28 UTC, Tal Nisan
libguestfs-test-tool output (1008 bytes, text/plain) - 2015-05-13 15:28 UTC, Tal Nisan


Links
Red Hat Product Errata RHBA-2016:0362 (SHIPPED_LIVE): vdsm 3.6.0 bug fix and enhancement update - 2016-03-09 23:49:32 UTC
oVirt gerrit 40402 (master, MERGED): Alignment scans will now run in direct mode - 2020-11-24 11:12:13 UTC
oVirt gerrit 40896 (master, ABANDONED): debug: Add virt-alignment-scan.debug tool - 2020-11-24 11:12:13 UTC
oVirt gerrit 51619 (ovirt-3.6, MERGED): Alignment scans will now run in direct mode - 2020-11-24 11:12:13 UTC

Description Elad 2014-10-12 13:44:33 UTC
Created attachment 946096 [details]
vdsm, libvirt and engine logs

Description of problem:
Tried to run an alignment scan on a virtual disk and on a direct LUN. Both failed with a VirtAlignError in vdsm.

Version-Release number of selected component (if applicable):
rhev3.5 vt5
vdsm-4.16.6-1.el7.x86_64
libvirt-daemon-1.2.8-4.el7.x86_64
qemu-kvm-rhev-1.5.3-60.el7_0.10.x86_64
sanlock-3.1.0-2.el7.x86_64


How reproducible:
Always

Steps to Reproduce:
1. Run an alignment scan on an image from a block storage domain or on a direct LUN


Actual results:
Scan alignment fails on vdsm:

Thread-2897::ERROR::2014-10-12 16:34:22,917::__init__::491::jsonrpc.JsonRpcServer::(_serveRequest) Internal server error
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 486, in _serveRequest
    res = method(**params)
  File "/usr/share/vdsm/rpc/Bridge.py", line 266, in _dynamicMethod
    result = fn(*methodArgs)
  File "/usr/share/vdsm/API.py", line 719, in getDiskAlignment
    return self._cif.getDiskAlignment(drive)
  File "/usr/share/vdsm/clientIF.py", line 379, in getDiskAlignment
    out = alignmentScan.scanImage(volPath)
  File "/usr/share/vdsm/alignmentScan.py", line 53, in scanImage
    "or guest:\n%s" % err)
VirtAlignError: An error scanning the disk image or guest:
['libguestfs: error: could not create appliance through libvirt.', '', 'Try running qemu directly without libvirt using this environment variable:', 'export LIBGUESTFS_BACKEND=direct', '', 'Original error from libvirt: internal error: process exited while connecting to monitor:  [code=1 domain=10]']


Expected results:
Scan alignment should succeed

Additional info: vdsm, libvirt and engine logs

Comment 3 Elad 2014-10-13 10:29:02 UTC
The same issue with XML-RPC

Comment 4 Elad 2014-10-13 10:32:40 UTC
Thread-9522::ERROR::2014-10-13 13:28:08,415::BindingXMLRPC::1151::vds::(wrapper) unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 1135, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/rpc/BindingXMLRPC.py", line 523, in diskGetAlignment
    return api.getDiskAlignment(driveSpecs)
  File "/usr/share/vdsm/API.py", line 719, in getDiskAlignment
    return self._cif.getDiskAlignment(drive)
  File "/usr/share/vdsm/clientIF.py", line 379, in getDiskAlignment
    out = alignmentScan.scanImage(volPath)
  File "/usr/share/vdsm/alignmentScan.py", line 53, in scanImage
    "or guest:\n%s" % err)
VirtAlignError: An error scanning the disk image or guest:
['libguestfs: error: qemu-img: /tmp/libguestfsoHtgku/overlay1: qemu-img exited with error status 1.', 'To see full error messages you may need to enable debugging.', 'See http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs']

Comment 5 Yaniv Lavi 2015-01-20 13:51:02 UTC
Is there any use case this works on?

Comment 6 Elad 2015-01-20 14:01:28 UTC
(In reply to Yaniv Dary from comment #5)
> Is there any use case this works on?

Scan alignment is a feature; it is supported by us and it is supposed to work.

Comment 7 Yaniv Lavi 2015-01-20 18:24:42 UTC
(In reply to Elad from comment #6)
> (In reply to Yaniv Dary from comment #5)
> > Is there any use case this works on?
> 
> Scan alignment is a feature, it is supported by us and it supposed to work.

Again, which flow is this feature working in and what flows have issues?

Comment 8 Elad 2015-01-21 08:41:06 UTC
The scenario is the basic one: just initiate an alignment scan for a disk located on block storage and attached to a VM.

From the wiki page:
Provide a way to determine if the partitions of a virtual disk are aligned to the underlying device blocks.

http://www.ovirt.org/Features/DiskAlignment

Comment 10 Liron Aravot 2015-02-01 13:11:20 UTC
Hi,
I've tested it using vdsm-4.16.8.1-5.el7ev.x86_64 (vt13.6) on RHEL 7 and couldn't reproduce.
Ran the alignment scan on a thin provisioned disk (iSCSI domain).
Reducing severity in the meantime.

The error in the logs is -
Thread-2897::DEBUG::2014-10-12 16:34:13,116::alignmentScan::43::Storage.Misc.excCmd::(runScanArgs) FAILED: <err> = 'libguestfs: error: could not create appliance through libvirt.\n\nTry running qemu directly without libvirt using this environment variable:\nexport LIBGUESTFS_BACKEND=direct\n\nOriginal error from libvirt: internal error: process exited while connecting to monitor:  [code=1 domain=10]\n'; <rc> = 1

So you could try using LIBGUESTFS_BACKEND=direct to work around that issue (I also consulted fsimonce about it). As I understood, Yeela handled that part, so adding needinfo on her.
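
For reference, here is a minimal sketch of what forcing the direct backend around the scan could look like. It uses plain subprocess rather than vdsm's internal command helpers (whose exact API is not shown in this bug), so treat it as an illustration only:

  import os
  import subprocess

  def scan_image_direct(vol_path):
      # Copy the current environment and force libguestfs to bypass libvirt,
      # as the error message above suggests.
      env = os.environ.copy()
      env["LIBGUESTFS_BACKEND"] = "direct"
      p = subprocess.Popen(["virt-alignment-scan", "--add", vol_path],
                           stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                           env=env)
      out, err = p.communicate()
      # A non-zero exit code means an error or unaligned partitions; see the
      # virt-alignment-scan man page for the exact meaning of each code.
      return p.returncode, out, err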




Furthermore, I can see an error when running teardownImage from the scan alignment flow. Nir, can you look into whether there's something that requires more attention here?

Thread-2897::DEBUG::2014-10-12 16:34:22,911::lvm::517::Storage.OperationMutex::(_invalidatelvs) Operation 'lvm invalidate operation' released the operation mu
tex
Thread-2897::ERROR::2014-10-12 16:34:22,911::task::866::Storage.TaskManager.Task::(_setError) Task=`3ebb3258-4181-468d-a1cf-0d94165df852`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3265, in teardownImage
    dom.deactivateImage(imgUUID)
  File "/usr/share/vdsm/storage/blockSD.py", line 1010, in deactivateImage
    lvm.deactivateLVs(self.sdUUID, volUUIDs)
  File "/usr/share/vdsm/storage/lvm.py", line 1163, in deactivateLVs
    _setLVAvailability(vgName, toDeactivate, "n")
  File "/usr/share/vdsm/storage/lvm.py", line 817, in _setLVAvailability
    raise error(str(e))
CannotDeactivateLogicalVolume: Cannot deactivate Logical Volume: ('General Storage Exception: ("5 [] [\'  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!\', \'  Logical volume b10fb981-1a81-460e-b4db-c8b458fc8233/719429a0-2c2b-4cf0-82c3-0cad11e7f118 in use.\']\\nb10fb981-1a81-460e-b4db-c8b458fc8233/[\'719429a0-2c2b-4cf0-82c3-0cad11e7f118\']",)',)

Comment 11 Nir Soffer 2015-02-10 08:53:14 UTC
(In reply to Liron Aravot from comment #10)
> furthermore, i can see an error when running teardownImage from the scan
> alignment flow, nir - can you look if there's something that required more
> attention here?
> 
> Thread-2897::DEBUG::2014-10-12
> 16:34:22,911::lvm::517::Storage.OperationMutex::(_invalidatelvs) Operation
> 'lvm invalidate operation' released the operation mu
> tex
> Thread-2897::ERROR::2014-10-12
> 16:34:22,911::task::866::Storage.TaskManager.Task::(_setError)
> Task=`3ebb3258-4181-468d-a1cf-0d94165df852`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 873, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 3265, in teardownImage
>     dom.deactivateImage(imgUUID)
>   File "/usr/share/vdsm/storage/blockSD.py", line 1010, in deactivateImage
>     lvm.deactivateLVs(self.sdUUID, volUUIDs)
>   File "/usr/share/vdsm/storage/lvm.py", line 1163, in deactivateLVs
>     _setLVAvailability(vgName, toDeactivate, "n")
>   File "/usr/share/vdsm/storage/lvm.py", line 817, in _setLVAvailability
>     raise error(str(e))
> CannotDeactivateLogicalVolume: Cannot deactivate Logical Volume: ('General
> Storage Exception: ("5 [] [\'  WARNING: lvmetad is running but disabled.
> Restart lvmetad before enabling it!\', \'  Logical volume
> b10fb981-1a81-460e-b4db-c8b458fc8233/719429a0-2c2b-4cf0-82c3-0cad11e7f118 in
> use.\']\\nb10fb981-1a81-460e-b4db-c8b458fc8233/[\'719429a0-2c2b-4cf0-82c3-
> 0cad11e7f118\']",)',)

Please check why deactivation failed. One issue may be that after scan error
the lv is deactivated, and then teardownImage is trying to deactivate
an inactive lv. In this case we have to fix the cleanup code.
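
If that hypothesis holds, a rough illustration (not the fix that was eventually merged) of a more tolerant cleanup path could look like the sketch below; CannotDeactivateLogicalVolume and deactivateLVs come from the traceback above, while the import paths and the lvs_still_active() helper are assumptions:

  # Illustration only: the import paths below are guesses based on the
  # traceback in comment 10.
  from storage import lvm
  from storage import storage_exception as se

  def safe_deactivate_lvs(sd_uuid, vol_uuids, log, lvs_still_active):
      # lvs_still_active is a hypothetical callable returning the subset of
      # vol_uuids that are still active in the given storage domain.
      try:
          lvm.deactivateLVs(sd_uuid, vol_uuids)
      except se.CannotDeactivateLogicalVolume as e:
          if lvs_still_active(sd_uuid, vol_uuids):
              # Something is genuinely holding the LVs open; keep failing.
              raise
          # The LVs were already deactivated by an earlier error path, so
          # the failure is harmless for teardown purposes.
          log.warning("LVs already inactive, ignoring: %s", e)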

Comment 12 Eyal Edri 2015-02-25 08:40:18 UTC
3.5.1 is already full of bugs (over 80), and since none of these bugs were marked as urgent for the 3.5.1 release in the tracker bug, moving to 3.5.2.

Comment 13 Allon Mureinik 2015-03-21 19:48:09 UTC
(In reply to Liron Aravot from comment #10)
> Hi,
> I've tested it using  vdsm-4.16.8.1-5.el7ev.x86_64 (vt13.6) on RHEL 7 and
> couldn't reproduce.
> Ran the alignment scan on a thin provisioned disk (ISCI domain) - 
> reducing severity in the meanwhile.
Elad, can you attempt to reproduce with RHEL 7?
If we can't reproduce this, let's close it.

Comment 14 Elad 2015-03-22 09:01:14 UTC
Created attachment 1005034 [details]
logs (rhel7)

The same with RHEL7:

vdsm-4.16.12-2.el7ev.x86_64
Red Hat Enterprise Linux Server release 7.1 Beta (Maipo)



Thread-19870::ERROR::2015-03-22 10:58:16,358::__init__::493::jsonrpc.JsonRpcServer::(_serveRequest) Internal server error
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 488, in _serveRequest
    res = method(**params)
  File "/usr/share/vdsm/rpc/Bridge.py", line 264, in _dynamicMethod
    result = fn(*methodArgs)
  File "/usr/share/vdsm/API.py", line 740, in getDiskAlignment
    return self._cif.getDiskAlignment(drive)
  File "/usr/share/vdsm/clientIF.py", line 383, in getDiskAlignment
    out = alignmentScan.scanImage(volPath)
  File "/usr/share/vdsm/alignmentScan.py", line 53, in scanImage
    "or guest:\n%s" % err)
VirtAlignError: An error scanning the disk image or guest:
['libguestfs: error: could not create appliance through libvirt.', '', 'Try running qemu directly without libvirt using this environment variable:', 'export LIBGUESTFS_BACKEND=direct', '', 'Original error from libvirt: internal error: process exited while connecting to monitor:  [code=1 domain=10]']

Comment 15 Liron Aravot 2015-03-22 09:20:47 UTC
Elad, I want to close this one and open a new bug to handle the deactivation failure.

In your reproduction, have you tried the solution suggested in https://bugzilla.redhat.com/show_bug.cgi?id=1151838#c10 ?

Comment 16 Liron Aravot 2015-03-22 09:43:06 UTC
The needinfo? on ykaplan was accidentally removed by someone else.
Yeela, can you please take a look at the question in https://bugzilla.redhat.com/show_bug.cgi?id=1151838#c10 ?

Comment 17 Yeela Kaplan 2015-03-22 13:26:26 UTC
Not sure what the question is.
As you said, it is pretty much straightforward from the logs:

FAILED: <err> = 'libguestfs: error: could not create appliance through libvirt.\n\nTry running qemu directly without libvirt using this environment variable:\nexport LIBGUESTFS_BACKEND=direct\n\nOriginal error from libvirt: internal error: process exited while connecting to monitor:  [code=1 domain=10]\n'; <rc> = 1

There's obviously a problem with libvirt.
So you should test what happens when running with and without the variable...

Comment 18 Liron Aravot 2015-03-25 09:22:32 UTC
Yeela, my question is whether there's any fix we can or need to apply so it'll work out of the box, or whether the situation should remain as is and we have no action item on that.

Comment 19 Allon Mureinik 2015-03-30 15:32:49 UTC
This bug has gone back and forth way too long.

Things we need to know:
1. Does this work in RHEL 6.6? (i.e., is this a regression in 7)?
2. Does using "export LIBGUESTFS_BACKEND=direct" break the VM's pristine status?

Comment 20 Yeela Kaplan 2015-04-08 10:46:53 UTC
Liron, my best answer is comment #17

Comment 21 Eyal Edri 2015-04-28 11:22:08 UTC
Moving to 3.5.4 due to capacity planning for 3.5.3.
If you believe this should remain in 3.5.3, please sync with PM/dev/QE and get a full triple ack for it. Also, ensure priority is set accordingly.

Comment 22 Tal Nisan 2015-04-29 14:12:48 UTC
As written in the libguestfs documentation, calls from libvirt should be done with export LIBGUESTFS_BACKEND=direct

See the documentation in:
http://libguestfs.org/libguestfs-test-tool.1.html

Comment 23 Richard W.M. Jones 2015-04-29 15:41:34 UTC
There are a few different bugs here, but the most important one is:

VirtAlignError: An error scanning the disk image or guest:
['libguestfs: error: could not create appliance through libvirt.', '', 'Try running qemu directly without libvirt using this environment variable:', 'export LIBGUESTFS_BACKEND=direct', '', 'Original error from libvirt: internal error: process exited while connecting to monitor:  [code=1 domain=10]']

Please run libguestfs-test-tool and let us know what the output is.

I'm going to guess this error is due to libvirtd not running
(as would happen, for example, if you install libvirt and immediately
run a libguestfs command without rebooting or starting libvirtd).
However let's not guess, let's see the libguestfs-test-tool output.

Setting LIBGUESTFS_BACKEND=direct is a workaround -- it causes libguestfs
to bypass libvirt completely and just run qemu directly.  This is
indeed less troublesome, but for enterprise deployments we really
ought to be using libvirt.

Comment 26 Tal Nisan 2015-05-13 15:28:33 UTC
Created attachment 1025104 [details]
libguestfs-test-tool err output

Comment 27 Tal Nisan 2015-05-13 15:28:57 UTC
Created attachment 1025105 [details]
libguestfs-test-tool output

Comment 28 Richard W.M. Jones 2015-05-13 16:46:03 UTC
Tal, unfortunately (or perhaps fortunately) that shows libguestfs-test-tool
working correctly.

I think what we need here is to keep running libguestfs-test-tool (as
vdsm user) until we see a failure.

It's also possible that it will only fail at certain times.  Particularly:
(1) Just after the host has been rebooted.
(2) Just after libvirt has been installed for the first time.
(3) Just after libvirt has been upgraded.
(4) After the libvirtd --timeout expires (so wait > 30 seconds before
  running libguestfs-test-tool).

BTW to get all the output into a single file do:
  libguestfs-test-tool |& tee /tmp/log
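
A rough sketch of the retry loop suggested above, run from the host as root; the use of sudo -u vdsm and the output path are assumptions:

  import subprocess

  # Run libguestfs-test-tool repeatedly as the vdsm user until it fails,
  # keeping the combined stdout/stderr of the failing run for analysis.
  for attempt in range(1, 101):
      p = subprocess.Popen(["sudo", "-u", "vdsm", "libguestfs-test-tool"],
                           stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
      out, _ = p.communicate()
      if p.returncode != 0:
          with open("/tmp/libguestfs-test-tool-failure.log", "wb") as f:
              f.write(out)
          print("failure on attempt %d, output saved" % attempt)
          break
  else:
      print("no failure seen in 100 attempts")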

Comment 29 Nir Soffer 2015-05-13 19:06:13 UTC
(In reply to Richard W.M. Jones from comment #28)
> Tal, unfortunately (or perhaps fortunately) that shows libguestfs-test-tool
> working correctly.
> 
> I think what we need here is to keep running libguestfs-test-tool (as
> vdsm user) until we see a failure.

Tal was running libguestfs-test-tool not *as* vdsm but *by* vdsm. We commented
out the code running the alignment scan and ran the test tool instead, writing
the output to a file.
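
A minimal sketch of what that kind of temporary instrumentation could look like; the function and module names are taken from the tracebacks above, and the exact change used here is not shown in this bug:

  # Inside clientIF.getDiskAlignment: temporarily run the test tool in the
  # same process context as vdsm instead of the real scan, saving its output.
  import subprocess

  def getDiskAlignment_debug(volPath):
      with open("/tmp/libguestfs-test-tool.out", "w") as logf:
          subprocess.call(["libguestfs-test-tool"], stdout=logf, stderr=logf)
      # out = alignmentScan.scanImage(volPath)   # original call, disabled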
 
> It's also possible that it will only fail at certain times.  Particularly:
> (1) Just after the host has been rebooted.
> (2) Just after libvirt has been installed for the first time.
> (3) Just after libvirt has been upgraded.

We are not testing this use case

> (4) After the libvirtd --timeout expires (so wait > 30 seconds before
>   running libguestfs-test-tool).

We did not run libguestfs at all - we ran the test tool instead.

> BTW to get all the output into a single file do:
>   libguestfs-test-tool |& tee /tmp/log

Richard: Do you prefer the output interleaved into one file?

Comment 30 Richard W.M. Jones 2015-05-13 20:20:56 UTC
It doesn't matter, as long as I get all the output.  It's just a bit
easier to read in a single file.

I believe this is a bug in libvirt.  Unfortunately we've still not been
able to get the debugging information necessary to try to diagnose
how libvirt is failing.  Not your fault, just a fact that these libvirt
bugs are hard to diagnose (check out the following list of libvirt bugs
that cause failures in libguestfs to see what I mean:
https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=POST&bug_status=MODIFIED&bug_status=ON_DEV&bug_status=ON_QA&bug_status=VERIFIED&bug_status=RELEASE_PENDING&bug_status=CLOSED&component=libvirt&f1=blocked&list_id=3473615&o1=equals&query_format=advanced&v1=910269 )

I suggested above from my experience a number of ways that libvirt
can fail, but they are just suggestions.

Comment 36 Tal Nisan 2015-09-02 16:56:36 UTC
Hi Rick, attached is another log from running libguestfs-test-tool in the VDSM context:

     ************************************************************
     *                    IMPORTANT NOTICE
     *
     * When reporting bugs, include the COMPLETE, UNEDITED
     * output below in your bug report.
     *
     ************************************************************
LIBVIRT_LOG_FILTERS=
LIBVIRT_LOG_OUTPUTS=
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
SELinux: Enforcing
guestfs_get_append: (null)
guestfs_get_autosync: 1
guestfs_get_backend: libvirt
guestfs_get_backend_settings: []
guestfs_get_cachedir: /var/tmp
guestfs_get_direct: 0
guestfs_get_hv: /usr/bin/qemu-kvm
guestfs_get_memsize: 500
guestfs_get_network: 0
guestfs_get_path: /usr/lib64/guestfs
guestfs_get_pgroup: 0
guestfs_get_program: libguestfs-test-tool
guestfs_get_recovery_proc: 1
guestfs_get_selinux: 0
guestfs_get_smp: 1
guestfs_get_tmpdir: /tmp
guestfs_get_trace: 0
guestfs_get_verbose: 1
host_cpu: x86_64
Launching appliance, timeout set to 600 seconds.
libguestfs: launch: program=libguestfs-test-tool
libguestfs: launch: version=1.30.0fedora=22,release=1.fc22,libvirt
libguestfs: launch: backend registered: unix
libguestfs: launch: backend registered: uml
libguestfs: launch: backend registered: libvirt
libguestfs: launch: backend registered: direct
libguestfs: launch: backend=libvirt
libguestfs: launch: tmpdir=/tmp/libguestfsIH0lxC
libguestfs: launch: umask=0022
libguestfs: launch: euid=36
libguestfs: libvirt version = 1002013 (1.2.13)
libguestfs: guest random name = guestfs-voy31bdpwam9wp3v
libguestfs: [00000ms] connect to libvirt
libguestfs: opening libvirt handle: URI = qemu:///session, auth = default+wrapper, flags = 0
libguestfs: successfully opened libvirt handle: conn = 0x7f3d1227dc00
libguestfs: qemu version (reported by libvirt) = 2003001 (2.3.1)
libguestfs: [00191ms] get libvirt capabilities
libguestfs: [00193ms] parsing capabilities XML
libguestfs: [00193ms] build appliance
libguestfs: [00193ms] begin building supermin appliance
libguestfs: [00193ms] run supermin
libguestfs: command: run: /usr/bin/supermin
libguestfs: command: run: \ --build
libguestfs: command: run: \ --verbose
libguestfs: command: run: \ --if-newer
libguestfs: command: run: \ --lock /var/tmp/.guestfs-36/lock
libguestfs: command: run: \ --copy-kernel
libguestfs: command: run: \ -f ext2
libguestfs: command: run: \ --host-cpu x86_64
libguestfs: command: run: \ /usr/lib64/guestfs/supermin.d
libguestfs: command: run: \ -o /var/tmp/.guestfs-36/appliance.d
supermin: version: 5.1.12
supermin: rpm: detected RPM version 4.12
supermin: package handler: fedora/rpm
supermin: acquiring lock on /var/tmp/.guestfs-36/lock
supermin: if-newer: output does not need rebuilding
libguestfs: [00202ms] finished building supermin appliance
libguestfs: command: run: qemu-img
libguestfs: command: run: \ create
libguestfs: command: run: \ -f qcow2
libguestfs: command: run: \ -o backing_file=/var/tmp/.guestfs-36/appliance.d/root,backing_fmt=raw
libguestfs: command: run: \ /tmp/libguestfsIH0lxC/overlay2
Formatting '/tmp/libguestfsIH0lxC/overlay2', fmt=qcow2 size=4294967296 backing_file='/var/tmp/.guestfs-36/appliance.d/root' backing_fmt='raw' encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
libguestfs: [00275ms] create libvirt XML
libguestfs: libvirt XML:\n<?xml version="1.0"?>\n<domain type="kvm" xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0">\n  <name>guestfs-voy31bdpwam9wp3v</name>\n  <memory unit="MiB">500</memory>\n  <currentMemory unit="MiB">500</currentMemory>\n  <cpu mode="host-passthrough">\n    <model fallback="allow"/>\n  </cpu>\n  <vcpu>1</vcpu>\n  <clock offset="utc">\n    <timer name="rtc" tickpolicy="catchup"/>\n    <timer name="pit" tickpolicy="delay"/>\n    <timer name="hpet" present="no"/>\n  </clock>\n  <os>\n    <type>hvm</type>\n    <kernel>/var/tmp/.guestfs-36/appliance.d/kernel</kernel>\n    <initrd>/var/tmp/.guestfs-36/appliance.d/initrd</initrd>\n    <cmdline>panic=1 console=ttyS0 udevtimeout=6000 udev.event-timeout=6000 no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=linux</cmdline>\n    <bios useserial="yes"/>\n  </os>\n  <on_reboot>destroy</on_reboot>\n  <devices>\n    <controller type="scsi" index="0" model="virtio-scsi"/>\n    <disk device="disk" type="file">\n      <source file="/tmp/libguestfsIH0lxC/scratch.1"/>\n      <target dev="sda" bus="scsi"/>\n      <driver name="qemu" type="raw" cache="unsafe"/>\n      <address type="drive" controller="0" bus="0" target="0" unit="0"/>\n    </disk>\n    <disk type="file" device="disk">\n      <source file="/tmp/libguestfsIH0lxC/overlay2"/>\n      <target dev="sdb" bus="scsi"/>\n      <driver name="qemu" type="qcow2" cache="unsafe"/>\n      <address type="drive" controller="0" bus="0" target="1" unit="0"/>\n      <shareable/>\n    </disk>\n    <serial type="unix">\n      <source mode="connect" path="/tmp/libguestfsIH0lxC/console.sock"/>\n      <target port="0"/>\n    </serial>\n    <channel type="unix">\n      <source mode="connect" path="/tmp/libguestfsIH0lxC/guestfsd.sock"/>\n      <target type="virtio" name="org.libguestfs.channel.0"/>\n    </channel>\n  </devices>\n  <qemu:commandline>\n    <qemu:env name="TMPDIR" value="/var/tmp"/>\n  </qemu:commandline>\n</domain>\n
libguestfs: command: run: ls
libguestfs: command: run: \ -a
libguestfs: command: run: \ -l
libguestfs: command: run: \ -R
libguestfs: command: run: \ -Z /var/tmp/.guestfs-36
libguestfs: /var/tmp/.guestfs-36:
libguestfs: total 12
libguestfs: drwxr-xr-x. 3 vdsm kvm  unconfined_u:object_r:user_tmp_t:s0 4096 Sep  2 19:48 .
libguestfs: drwxrwxrwt. 4 root root system_u:object_r:tmp_t:s0          4096 Sep  2 19:18 ..
libguestfs: drwxr-xr-x. 2 vdsm kvm  unconfined_u:object_r:user_tmp_t:s0 4096 Sep  2 18:43 appliance.d
libguestfs: -rw-r--r--. 1 vdsm kvm  unconfined_u:object_r:user_tmp_t:s0    0 Sep  2 18:43 lock
libguestfs: 
libguestfs: /var/tmp/.guestfs-36/appliance.d:
libguestfs: total 527356
libguestfs: drwxr-xr-x. 2 vdsm kvm unconfined_u:object_r:user_tmp_t:s0       4096 Sep  2 18:43 .
libguestfs: drwxr-xr-x. 3 vdsm kvm unconfined_u:object_r:user_tmp_t:s0       4096 Sep  2 19:48 ..
libguestfs: -rw-r--r--. 1 vdsm kvm system_u:object_r:virt_content_t:s0    1222144 Sep  2 19:48 initrd
libguestfs: -rwxr-xr-x. 1 vdsm kvm system_u:object_r:virt_content_t:s0    5967320 Sep  2 19:48 kernel
libguestfs: -rw-r--r--. 1 vdsm kvm system_u:object_r:virt_content_t:s0 4294967296 Sep  2 19:48 root
libguestfs: command: run: ls
libguestfs: command: run: \ -a
libguestfs: command: run: \ -l
libguestfs: command: run: \ -Z /tmp/libguestfsIH0lxC
libguestfs: total 196
libguestfs: drwxr-xr-x. 2 vdsm kvm  system_u:object_r:virt_tmp_t:s0       140 Sep  2 19:48 .
libguestfs: drwxrwxrwt. 9 root root system_u:object_r:tmp_t:s0            200 Sep  2 19:48 ..
libguestfs: srwxr-xr-x. 1 vdsm kvm  system_u:object_r:virt_tmp_t:s0         0 Sep  2 19:48 console.sock
libguestfs: srwxr-xr-x. 1 vdsm kvm  system_u:object_r:virt_tmp_t:s0         0 Sep  2 19:48 guestfsd.sock
libguestfs: -rw-r--r--. 1 vdsm kvm  system_u:object_r:virt_tmp_t:s0    197120 Sep  2 19:48 overlay2
libguestfs: -rw-r--r--. 1 vdsm kvm  system_u:object_r:virt_tmp_t:s0 104857600 Sep  2 19:48 scratch.1
libguestfs: -rwxr-xr-x. 1 vdsm kvm  system_u:object_r:virt_tmp_t:s0         0 Sep  2 19:48 umask-check
libguestfs: [00280ms] launch libvirt guest
libguestfs: error: could not create appliance through libvirt.

Try running qemu directly without libvirt using this environment variable:
export LIBGUESTFS_BACKEND=direct

Original error from libvirt: internal error: process exited while connecting to monitor:  [code=1 domain=10]
libguestfs-test-tool: failed to launch appliance
libguestfs: closing guestfs handle 0x7f3d1227d290 (state 0)
libguestfs: command: run: rm
libguestfs: command: run: \ -rf /tmp/libguestfsIH0lxC

Comment 37 Richard W.M. Jones 2015-09-02 19:12:25 UTC
Can you see if there is a file called 'guestfs-voy31bdpwam9wp3v.log'
located (somewhere?) on the local disk of that node?

It may be in vdsm's ~/.cache/libvirt/qemu/log/ directory, if vdsm has
a home directory; or if $HOME was set, it may be in $HOME/.cache; or
failing those options, try:

  find / -name '*voy31bdpwam9wp3v*'

This is clearly a libvirt bug.

Comment 38 Tal Nisan 2015-09-03 07:00:14 UTC
The file exists, although at first glance the timestamp does not seem to match the time of the scan:


2015-09-02 16:48:15.266+0000: starting up libvirt version: 1.2.13.1, package: 2.fc22 (Fedora Project, 2015-06-06-15:21:32, buildvm-13.phx2.fedoraproject.org), qemu version: 2.3.1 (qemu-2.3.1-1.fc22)
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin HOME=/var/lib/vdsm USER=vdsm LOGNAME=vdsm QEMU_AUDIO_DRV=none TMPDIR=/var/tmp /usr/bin/qemu-kvm -name guestfs-voy31bdpwam9wp3v -S -machine pc-i440fx-2.3,accel=kvm,usb=off -cpu host -m 500 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 13a8e417-cac5-45ab-8b79-c6c141ca53ae -nographic -no-user-config -nodefaults -device sga -chardev socket,id=charmonitor,path=/var/lib/vdsm/.config/libvirt/qemu/lib/guestfs-voy31bdpwam9wp3v.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-reboot -no-acpi -boot strict=on -kernel /var/tmp/.guestfs-36/appliance.d/kernel -initrd /var/tmp/.guestfs-36/appliance.d/initrd -append panic=1 console=ttyS0 udevtimeout=6000 udev.event-timeout=6000 no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=linux -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x3 -drive file=/tmp/libguestfsIH0lxC/scratch.1,if=none,id=drive-scsi0-0-0-0,format=raw,cache=unsafe -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -drive file=/tmp/libguestfsIH0lxC/overlay2,if=none,id=drive-scsi0-0-1-0,format=qcow2,cache=unsafe -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=1,lun=0,drive=drive-scsi0-0-1-0,id=scsi0-0-1-0 -chardev socket,id=charserial0,path=/tmp/libguestfsIH0lxC/console.sock -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/tmp/libguestfsIH0lxC/guestfsd.sock -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.libguestfs.channel.0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on
Domain id=2 is tainted: custom-argv
Domain id=2 is tainted: host-cpu
2015-09-02 16:48:15.497+0000: shutting down

Comment 39 Tal Nisan 2015-09-03 07:01:45 UTC
Checked the last-modified time of the file; it's correct. On second look, the time inside is correct as well - I didn't notice it includes the time zone.

Comment 40 Richard W.M. Jones 2015-09-03 08:14:47 UTC
That means that qemu started up and then exited about 250ms later -
silently and with no error messages.  Nice.

SELinux?  You could try this:

  ausearch -ts yesterday

and see if any event corresponds to the time that qemu ran.

Comment 41 Richard W.M. Jones 2015-09-03 08:42:23 UTC
Another thing to check:

Does /var/lib/vdsm/.config/libvirt/qemu/lib/ exist, and is it writable
by vdsm?  qemu-kvm is expected to place a monitor socket in there.

Comment 42 Richard W.M. Jones 2015-09-03 08:49:29 UTC
Do we have a libvirtd (daemon) log corresponding to comment 36?
It would be useful to compare timestamps to see if libvirtd told
qemu to shut down, or if qemu shut down spontaneously.

Comment 43 Richard W.M. Jones 2015-09-03 08:58:57 UTC
(In reply to Richard W.M. Jones from comment #42)
> Do we have a libvirtd (daemon) log corresponding to comment 36?
> It would be useful to compare timestamps to see if libvirtd told
> qemu to shut down, or if qemu shut down spontaneously.

And it'll have to be the session libvirtd log, not
/var/log/libvirt/libvirtd.log which is the system libvirtd
and is irrelevant.

Enabling per-session libvirtd logging is done like this.

In /var/lib/vdsm/.config/libvirt/ place a file called 'libvirtd.conf'
containing:

log_level=1
log_outputs="1:file:/tmp/libvirtd.log"

Kill the session libvirtd, i.e. the libvirtd process running under
the 'vdsm' user (if it is running).

Then try to reproduce the bug, and the libvirtd log information
should be in /tmp/libvirtd.log.
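
A small sketch automating those steps; the conf path and contents come from this comment, while using pkill to stop the session libvirtd is an assumption about the environment:

  import os
  import subprocess

  conf_dir = "/var/lib/vdsm/.config/libvirt"
  if not os.path.isdir(conf_dir):
      os.makedirs(conf_dir)
  with open(os.path.join(conf_dir, "libvirtd.conf"), "w") as f:
      f.write('log_level=1\n')
      f.write('log_outputs="1:file:/tmp/libvirtd.log"\n')

  # Kill the session libvirtd running as the vdsm user (if any); it should be
  # respawned with the new logging settings on the next libguestfs launch.
  subprocess.call(["pkill", "-u", "vdsm", "-x", "libvirtd"])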

Comment 44 Tal Nisan 2015-10-08 12:24:05 UTC
*** Bug 1269860 has been marked as a duplicate of this bug. ***

Comment 45 Tal Nisan 2015-10-20 13:56:49 UTC
Sorry for the late reply, as for your questions:

Seems that no SELinux event exists for the corresponding time

/var/lib/vdsm/.config/libvirt/qemu/lib/ does exist and is writable by vdsm

As for the specific libvirt log, I've followed your instructions but nothing seemed to be logged to the log file. Can you please review the steps? Perhaps something was missing.

Comment 46 Richard W.M. Jones 2015-10-20 15:24:44 UTC
My instructions in comment 43 look correct.  Does a log file
get created at all after you do that?  Is it empty?  Does it
contain any lines?  Did you kill the session libvirtd after
creating the conf file?

Comment 47 Tal Nisan 2015-10-20 21:39:48 UTC
Yes, I created the conf file, killed the libvirtd process after that and ran the alignment scan again; the log file is not created.

Comment 48 Richard W.M. Jones 2015-10-27 15:40:36 UTC
I don't know why libvirtd isn't generating logs, but that's
obviously what you've got to sort out.  See comment 43, comment 46.
If you can't work out how to get libvirtd to generate logs you
should ask someone from the libvirt team to help out with that.

Comment 49 Allon Mureinik 2015-11-15 09:06:53 UTC
Pushing out, based on severity.
In any event, if this is a libvirt regression, we can just document in a kbase article that customers need to upgrade to version XYZ once it's released.

Comment 53 Aharon Canan 2016-01-17 14:38:33 UTC
Verified
========
rhevm-3.6.2.5-0.1.el6.noarch
vdsm-4.17.17-0.el7ev.noarch


Steps to Reproduce:
===================
For a non-active VM with disks (one on a block domain and one a direct LUN), right-click each disk and select "Scan Alignment".


From engine.log (for the direct LUN)
====================================
2016-01-17 14:35:43,240 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-18) [22866400] Correlation ID: 22866400, Job ID: a93a02ac-5700-415e-b001-fc541f28807d, Call Stack: null, Custom Event ID: -1, Message: Alignment scan of disk 'VM1_Disk2' is complete.

From engine.log (for the disk on block) 
=======================================
2016-01-17 14:37:52,371 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-38) [dcacb30] Correlation ID: dcacb30, Job ID: 46958b9a-c8c4-4bd0-aa32-54e78afae66a, Call Stack: null, Custom Event ID: -1, Message: Alignment scan of disk 'VM1_Disk1' is complete.

Comment 55 errata-xmlrpc 2016-03-09 19:26:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0362.html

