Bug 1378970 - [abrt] anaconda: gi.overrides.BlockDev.LVMError: Failed to call the 'LvCreateLinear' method on the '/com/redhat/lvmdbus1/Vg/0' object: Timeout was reached
Summary: [abrt] anaconda: gi.overrides.BlockDev.LVMError: Failed to call the 'LvCreate...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: libblockdev
Version: 25
Hardware: x86_64
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Vratislav Podzimek
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: abrt_hash:8e57678f6d5ff56d41f45103002...
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-09-23 17:27 UTC by Adam Williamson
Modified: 2016-12-15 18:05 UTC (History)
8 users (show)

Fixed In Version: libblockdev-1.9-3 libblockdev-1.9-4.fc25
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-10-07 03:34:16 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
File: anaconda-tb (592.85 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: anaconda.log (43.84 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: backtrace (2.98 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: description (2.53 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: dnf.librepo.log (11.02 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: environ (482 bytes, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: hawkey.log (1.03 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: ifcfg.log (2.40 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: lsblk_output (2.50 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: lvm.log (217.94 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: nmcli_dev_list (1.33 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: packaging.log (5.10 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: program.log (27.15 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: storage.log (86.68 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details
File: syslog (64.61 KB, text/plain)
2016-09-23 17:28 UTC, Adam Williamson
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1405112 0 unspecified CLOSED gi.overrides.BlockDev.LVMError: Failed to call the 'PvCreate' method on the '/com/redhat/lvmdbus1/Manager' object: GDBus... 2021-02-22 00:41:40 UTC

Internal Links: 1405112

Description Adam Williamson 2016-09-23 17:27:59 UTC
Version-Release number of selected component:
anaconda-25.20.1-1

Additional info:
reporter:       libreport-2.8.0
addons:         com_redhat_docker, com_redhat_kdump
cmdline:        /usr/bin/python3  /sbin/anaconda
cmdline_file:   BOOT_IMAGE=vmlinuz initrd=initrd.img inst.stage2=hd:LABEL=Fedora-S-dvd-x86_64-25 quiet
executable:     /sbin/anaconda
hashmarkername: anaconda
kernel:         4.8.0-0.rc6.git0.1.fc25.x86_64
product:        Fedora
release:        Cannot get release name.
type:           anaconda
version:        25

Truncated backtrace:
Traceback (most recent call last):
  File "/usr/lib64/python3.5/site-packages/gi/overrides/BlockDev.py", line 420, in wrapped
    ret = orig_obj(*args, **kwargs)
  File "/usr/lib64/python3.5/site-packages/gi/overrides/BlockDev.py", line 228, in lvm_thpoolcreate
    return _lvm_thpoolcreate(vg_name, lv_name, size, md_size, chunk_size, profile)
GLib.GError: g-io-error-quark: Failed to call the 'LvCreateLinear' method on the '/com/redhat/lvmdbus1/Vg/0' object: Timeout was reached (24)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib64/python3.5/site-packages/pyanaconda/threads.py", line 251, in run
    threading.Thread.run(self, *args, **kwargs)
  File "/usr/lib64/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib64/python3.5/site-packages/pyanaconda/install.py", line 204, in doInstall
    turn_on_filesystems(storage, mount_only=flags.flags.dirInstall, callbacks=callbacks_reg)
  File "/usr/lib/python3.5/site-packages/blivet/osinstall.py", line 1112, in turn_on_filesystems
    storage.do_it(callbacks)
  File "/usr/lib/python3.5/site-packages/blivet/threads.py", line 45, in run_with_lock
    return m(*args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blivet/blivet.py", line 163, in do_it
    self.devicetree.actions.process(callbacks=callbacks, devices=self.devices)
  File "/usr/lib/python3.5/site-packages/blivet/actionlist.py", line 48, in wrapped_func
    return func(obj, *args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blivet/actionlist.py", line 322, in process
    action.execute(callbacks)
  File "/usr/lib/python3.5/site-packages/blivet/threads.py", line 45, in run_with_lock
    return m(*args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blivet/deviceaction.py", line 318, in execute
    self.device.create()
  File "/usr/lib/python3.5/site-packages/blivet/threads.py", line 45, in run_with_lock
    return m(*args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blivet/devices/storage.py", line 443, in create
    self._create()
  File "/usr/lib/python3.5/site-packages/blivet/threads.py", line 45, in run_with_lock
    return m(*args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blivet/devices/lvm.py", line 1792, in decorated
    found, ret = self._try_specific_call(meth.__name__, *args, **kwargs)  # pylint: disable=no-member
  File "/usr/lib/python3.5/site-packages/blivet/threads.py", line 45, in run_with_lock
    return m(*args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blivet/devices/lvm.py", line 1777, in _try_specific_call
    ret = getattr(cls, name)(self, *args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blivet/devices/lvm.py", line 1526, in _create
    profile=profile_name)
  File "/usr/lib64/python3.5/site-packages/gi/overrides/BlockDev.py", line 441, in wrapped
    raise transform[1](msg)
gi.overrides.BlockDev.LVMError: Failed to call the 'LvCreateLinear' method on the '/com/redhat/lvmdbus1/Vg/0' object: Timeout was reached

Comment 1 Adam Williamson 2016-09-23 17:28:04 UTC
Created attachment 1204229 [details]
File: anaconda-tb

Comment 2 Adam Williamson 2016-09-23 17:28:06 UTC
Created attachment 1204230 [details]
File: anaconda.log

Comment 3 Adam Williamson 2016-09-23 17:28:07 UTC
Created attachment 1204231 [details]
File: backtrace

Comment 4 Adam Williamson 2016-09-23 17:28:09 UTC
Created attachment 1204232 [details]
File: description

Comment 5 Adam Williamson 2016-09-23 17:28:10 UTC
Created attachment 1204233 [details]
File: dnf.librepo.log

Comment 6 Adam Williamson 2016-09-23 17:28:11 UTC
Created attachment 1204234 [details]
File: environ

Comment 7 Adam Williamson 2016-09-23 17:28:12 UTC
Created attachment 1204235 [details]
File: hawkey.log

Comment 8 Adam Williamson 2016-09-23 17:28:14 UTC
Created attachment 1204236 [details]
File: ifcfg.log

Comment 9 Adam Williamson 2016-09-23 17:28:16 UTC
Created attachment 1204237 [details]
File: lsblk_output

Comment 10 Adam Williamson 2016-09-23 17:28:18 UTC
Created attachment 1204238 [details]
File: lvm.log

Comment 11 Adam Williamson 2016-09-23 17:28:19 UTC
Created attachment 1204239 [details]
File: nmcli_dev_list

Comment 12 Adam Williamson 2016-09-23 17:28:20 UTC
Created attachment 1204240 [details]
File: packaging.log

Comment 13 Adam Williamson 2016-09-23 17:28:21 UTC
Created attachment 1204241 [details]
File: program.log

Comment 14 Adam Williamson 2016-09-23 17:28:23 UTC
Created attachment 1204242 [details]
File: storage.log

Comment 15 Adam Williamson 2016-09-23 17:28:25 UTC
Created attachment 1204243 [details]
File: syslog

Comment 16 David Lehman 2016-09-29 17:53:30 UTC
This looks like a bug in the lvmdbus1 service, but I'll let you be the judge of that. Also adding Tony since it may end up on his plate.

Comment 17 Tony Asleson 2016-09-29 18:31:56 UTC
In lvm dbus we never return an actual timeout.  You can specify a timeout length until a job object path is returned.  For Vg.LvCreateLinear parameter 4 is the timeout until a job is created.

timeout value has the following meaning:

-1 == Wait forever until we get a result or error
0 == Return a job immediately
>0 == Number of seconds we wait before we return a job

However, all client dbus libraries have a timeout value and it's very important that the client library timeout > what you pass as an argument to lvm dbus.  Default is usually ~25 seconds.  Otherwise the client will timeout waiting before the operation is complete or a job is returned.

Based on the above debug message it looks like a client side timeout occurred.

Comment 18 Adam Williamson 2016-09-29 21:13:52 UTC
So just for a bit of background info for the newcomers: we use an automated system called openQA for testing Fedora:

https://openqa.fedoraproject.org/

It runs a bunch of tests, many of which are installation tests, and of course almost all of those (except the ones which specifically test non-LVM installs) create LVM setups. It runs a little over 100 tests on each Fedora compose (we get two composes nightly - one for Branched (i.e. F25), one for Rawhide), probably 60-70 of those involve an installation.

Around once in every, oh, 1.5 composes or so - so about 1 in every 100 install tests, give or take - we run into this error. It doesn't happen reliably in any specific test. It's probably more just a case of the host where the test is being run getting really busy with I/O or something. Each test runs in a VM, and we have four systems hosting test VMs (two production, two staging); three of them run up to four test VMs simultaneously, the fourth (which has more CPU cores) runs up to ten test VMs simultaneously. Each of these 'worker hosts' uses spinning rust storage, I think. So I guess the ten worker host, especially, could be working its disks pretty hard if all/most of the tests it happens to be running at any given time happen to be at a point where they're generating a lot of I/O.

Comment 19 Vratislav Podzimek 2016-09-30 11:13:34 UTC
A PR with what I believe causes the issue: https://github.com/rhinstaller/libblockdev/pull/123

Comment 20 Vratislav Podzimek 2016-10-03 16:10:02 UTC
(In reply to Vratislav Podzimek from comment #19)
> A PR with what I believe causes the issue:
> https://github.com/rhinstaller/libblockdev/pull/123

Merged.

Comment 21 Fedora Update System 2016-10-05 01:55:10 UTC
libblockdev-1.9-3.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-9305da925f

Comment 22 Fedora Update System 2016-10-07 03:34:16 UTC
libblockdev-1.9-4.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

Comment 23 Adam Williamson 2016-12-08 20:09:36 UTC
Saw a similar traceback on Rawhide in openQA stg today:

https://openqa.stg.fedoraproject.org/tests/63334

gi.overrides.BlockDev.LVMError: Failed to call the 'LvCreate' method on the '/com/redhat/lvmdbus1/Vg/0' object: Timeout was reached

Comment 24 Vratislav Podzimek 2016-12-09 13:39:11 UTC
(In reply to Adam Williamson from comment #23)
> Saw a similar traceback on Rawhide in openQA stg today:
> 
> https://openqa.stg.fedoraproject.org/tests/63334
> 
> gi.overrides.BlockDev.LVMError: Failed to call the 'LvCreate' method on the
> '/com/redhat/lvmdbus1/Vg/0' object: Timeout was reached

Interesting. But unless something has changed in lvmdbusd, the libblockdev's side should be okay, because it tells lvmdbusd to return a job object path in a second which is definitely shorter than the DBus timeout. Looks like lvmdbusd failed to provide the answer. If this appears more than once we definitely need to debug the issue more thoroughly.

Comment 25 Tony Asleson 2016-12-09 21:54:42 UTC
(In reply to Vratislav Podzimek from comment #24)
> Interesting. But unless something has changed in lvmdbusd, the libblockdev's
> side should be okay, because it tells lvmdbusd to return a job object path
> in a second which is definitely shorter than the DBus timeout. Looks like
> lvmdbusd failed to provide the answer. If this appears more than once we
> definitely need to debug the issue more thoroughly.

It looks like the timeout is set to 5000 or 5 seconds for the client library and the API timeout == 1 second.  This should be plenty of time when accounting for the imprecise nature of the lvmdbusd timeout implementation.

Is blivet running multiple operations concurrently to configure the storage or is it sequentially making changes?  Trying to understand any contributing factors.

Comment 26 Vratislav Podzimek 2016-12-12 08:03:53 UTC
(In reply to Tony Asleson from comment #25)
> (In reply to Vratislav Podzimek from comment #24)
> > Interesting. But unless something has changed in lvmdbusd, the libblockdev's
> > side should be okay, because it tells lvmdbusd to return a job object path
> > in a second which is definitely shorter than the DBus timeout. Looks like
> > lvmdbusd failed to provide the answer. If this appears more than once we
> > definitely need to debug the issue more thoroughly.
> 
> It looks like the timeout is set to 5000 or 5 seconds for the client library
> and the API timeout == 1 second.  This should be plenty of time when
> accounting for the imprecise nature of the lvmdbusd timeout implementation.
> 
> Is blivet running multiple operations concurrently to configure the storage
> or is it sequentially making changes?  Trying to understand any contributing
> factors.
All blivet operations are running in a single thread. It's a "worker" thread of anaconda (the main thread runs the UI), but blivet itself runs everything sequentially.


Note You need to log in before you can comment on or make changes to this bug.