Description of problem: Anaconda is crashing when configuring storage in the middle of installation. It looks like there is a problem with calling LVM DBus API. How reproducible: Always Steps to reproduce: 1) Start Rawhide installation on existing LVM partitioning 2) Set automatic partitioning 3) Begin installation process Actual result: Anaconda is crashing in Configuring storage step. Expected result: Installation should work correctly. Version-Release number of selected component: anaconda-27.8-1 The following was filed automatically by anaconda: anaconda 27.8-1 exception report Traceback (most recent call first): File "/usr/lib64/python3.6/site-packages/gi/overrides/BlockDev.py", line 762, in wrapped raise transform[1](msg) File "/usr/lib/python3.6/site-packages/blivet/devices/lvm.py", line 266, in _destroy blockdev.lvm.vgreduce(self.name, None) File "/usr/lib/python3.6/site-packages/blivet/threads.py", line 45, in run_with_lock return m(*args, **kwargs) File "/usr/lib/python3.6/site-packages/blivet/devices/storage.py", line 484, in destroy self._destroy() File "/usr/lib/python3.6/site-packages/blivet/threads.py", line 45, in run_with_lock return m(*args, **kwargs) File "/usr/lib/python3.6/site-packages/blivet/deviceaction.py", line 374, in execute self.device.destroy() File "/usr/lib/python3.6/site-packages/blivet/threads.py", line 45, in run_with_lock return m(*args, **kwargs) File "/usr/lib/python3.6/site-packages/blivet/actionlist.py", line 325, in process action.execute(callbacks) File "/usr/lib/python3.6/site-packages/blivet/actionlist.py", line 48, in wrapped_func return func(obj, *args, **kwargs) File "/usr/lib/python3.6/site-packages/blivet/blivet.py", line 163, in do_it self.devicetree.actions.process(callbacks=callbacks, devices=self.devices) File "/usr/lib/python3.6/site-packages/blivet/threads.py", line 45, in run_with_lock return m(*args, **kwargs) File "/usr/lib/python3.6/site-packages/blivet/osinstall.py", line 1112, in turn_on_filesystems storage.do_it(callbacks) File "/usr/lib64/python3.6/site-packages/pyanaconda/install_tasks.py", line 437, in run_task self._task(*self._task_args, **self._task_kwargs) File "/usr/lib64/python3.6/site-packages/pyanaconda/install_tasks.py", line 471, in start self.run_task() File "/usr/lib64/python3.6/site-packages/pyanaconda/install_tasks.py", line 303, in start item.start() File "/usr/lib64/python3.6/site-packages/pyanaconda/install_tasks.py", line 303, in start item.start() File "/usr/lib64/python3.6/site-packages/pyanaconda/install.py", line 356, in doInstall installation_queue.start() File "/usr/lib64/python3.6/threading.py", line 864, in run self._target(*self._args, **self._kwargs) File "/usr/lib64/python3.6/site-packages/pyanaconda/threads.py", line 251, in run threading.Thread.run(self, *args, **kwargs) gi.overrides.BlockDev.LVMError: Failed to call the 'Reduce' method on the '/com/redhat/lvmdbus1/Vg/0' object: GDBus.Error:org.freedesktop.DBus.Python.dbus.exceptions.DBusException: ('com.redhat.lvmdbus1.Vg', 'Exit code 3, stderr = Command does not accept option: --removemissing.\n') Additional info: addons: com_redhat_kdump, com_redhat_docker cmdline: /usr/libexec/system-python /sbin/anaconda executable: /sbin/anaconda hashmarkername: anaconda kernel: 4.12.0-0.rc1.git4.1.fc27.x86_64 product: Fedora release: Cannot get release name. type: anaconda version: rawhide
Created attachment 1282172 [details] File: anaconda-tb
Created attachment 1282173 [details] File: environ
Created attachment 1282174 [details] File: lsblk_output
Created attachment 1282175 [details] File: nmcli_dev_list
Created attachment 1282176 [details] File: os_info
Looks like an error in lvmdbusd because when called from a shell 'vgreduce --removemissing' works just fine.
From lvmdbusd: May 26 09:51:00 HOSTNAME lvmdbusd[1739]: 1739:1747 - CMD= /usr/sbin/lvm vgreduce --force --config devices { preferred_names=["^/dev/mapper/", "^/dev/md/", "^/dev/sd"] } log {level=7 file=/tmp/lvm.log syslog=0} --all --removemissing VolGroup Looking from older systems, the same command line is passed in, and does work. I think the parser for the CLI tools got a lot stricter. From the current help output: ---- ... Remove all unused PVs from a VG. vgreduce -a|--all VG [ COMMON_OPTIONS ] Remove all missing PVs from a VG. vgreduce --removemissing VG [ --mirrorsonly ] [ COMMON_OPTIONS ] ... --- I think --all and --removemissing are mutually exclusive. Possible solution: https://sourceware.org/git/?p=lvm2.git;a=blob;f=daemons/lvmdbusd/cmdhandler.py;h=8ed38cb05054991999edf3885e8ec2ec335b9669;hb=HEAD#l623 Perhaps change the 'if' to 'elif'. I assume --all is wider than --removemissing.
The two options mean different operations, and vgreduce has never done both at once, so it never made sense to specify both together. --all means to remove any unused PVs from the VG --removemissing means to remove any missing PVs from the VG Previously if both options were given, the incidental behavior of the code was to ignore --all and do --removemissing. With the new strict definitions, the code doesn't allow the two incompatible options to be set at once. If we wanted to preserve that old behavior, even though it didn't make sense, we could modify the command definition for removemissing to accept but ignore the all option: diff --git a/tools/command-lines.in b/tools/command-lines.in index 7f9c31fb5729..76323aae98c5 100644 --- a/tools/command-lines.in +++ b/tools/command-lines.in @@ -1633,6 +1633,7 @@ DESC: Remove all unused PVs from a VG. vgreduce --removemissing VG OO: --mirrorsonly, OO_VGREDUCE +IO: --all ID: vgreduce_missing DESC: Remove all missing PVs from a VG.
We'd prefer if you could remove the --all option, and just use --removemissing since the combination of the two doesn't make sense given their different definitions. The fact that lvm didn't catch the invalid combination before is best considered an lvm bug.
I get this error in Anaconda when using a kickstart file that contains: zerombr clearpart --all --initlabel autopart The kickstart file is only failing on rawhide, but not on F25 or F26 Alpha 1.7 Based on http://www.winglemeyer.org/fedora_docs_proposal/sample/ja/rawhide/install-guide/Kickstart_Syntax_Reference.html the syntax is still correct. Partitioning works if I do not use a kickstart file and use the partitioning in Anaconda. Any ideas how to be able to use a kickstart file in rawhide that partitions the drive?
Pushed to a branch https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=b3306383ba7cf77399e88da3bf1ae5cde9b0eb43 Not quite happy about the DBus API, especially the fact `--all` is assumed when empty list of PVs is passed by accident. Shall we add a new method with DeprecationWarning for the irragular use cases? Tony, is there a better, backward compatible fix? Or just document it?
I would remove the "assert len(pv_devices) == 0" in the missing path for your proposed change. If we want to ensure the user is using it correctly we could return a dbus error instead. What are you suggesting for a new API call that would remove the ambiguity? We could do as David is suggesting in comment 9 and just remove the '-all' option all together. As always the reduce_options parameter could be used for specialty cases not covered by the generic case(s). Just to recap a user wants to: 1. Remove a specific PV: missing == False, pv_devices == list of one ore more PVs 2. Remove any missing: missing == True, pv_devices == Does not matter/ignored 3. Remove all unused PV devices, missing == False, pv_devices == empty list I think the important part is there is no ambiguity here that cases the user to lose data.
Is this how to return DBus error: https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=115748d4916e20e3829fff6f72659de3c6459b9c , assert can be left in place, as it would be programming error if that happened. Re: Removing -all altogether: I do not think David suggested to completely remove --all, just to not use --all with --missing. Re: API: I also think accepted parameters of method and its meaning changing depending on other parameters is wrong. And I think David's efforts to break down commands (like vgreduce or lvconvert) by use cases into (almost) separate subcommands, as is in case of vgreduce reflected in --help output: vgreduce - Remove physical volume(s) from a volume group Remove a PV from a VG. vgreduce VG PV ... [ COMMON_OPTIONS ] Remove all unused PVs from a VG. vgreduce -a|--all VG [ COMMON_OPTIONS ] Remove all missing PVs from a VG. vgreduce --removemissing VG [ --mirrorsonly ] [ COMMON_OPTIONS ] || missing | len(PVs) | meaning | API || False | 0 | --all | RemoveUnusedPVs(options) || False | >0 | PVs | RemovePVs(devices, options) || True | 0 | --missing | RemoveMissingPVs(options) || True | >0 | N/A | -
(In reply to Marian Csontos from comment #13) > Is this how to return DBus error: > https://sourceware.org/git/?p=lvm2.git;a=commitdiff; > h=115748d4916e20e3829fff6f72659de3c6459b9c , assert can be left in place, as > it would be programming error if that happened. Yes, just raise the dbus exception. However, we also need to make sure we aren't changing the external behavior. If existing API users are calling and getting success, we need to preserve that so we need to be careful if we start raising an error. In general I'm OK with asserts, but in this case it's caused by an end API user using an incompatible selection of options for the call. When you hit the assert the end user is going to think that the service has an issue, not their interaction with it. They are going to get a cryptic exception returned to them instead of an error with some text explaining what happened. End users should never be able to cause an assert IMHO. > I also think accepted parameters of method and its meaning changing depending > on other parameters is wrong. I agree it's not perfect, it's what I put together in a vacuum based on limited input and absolutely no peer review by the lvm team. Hindsight is 20/20. The API can be extended with new methods which are better as needed. We could probably redesign this API call as we don't have many API users, but in general I'm against that as it sets a bad precedence by not maintaining backwards compatibility. APIs cannot change, otherwise they lose value. IMHO it's madness for LVM to change it's command line interface, that's the only interface end users have to interact with it. LVM potentially just blew up it's entire user base that wraps the command line with these latest changes. The command line as the API doesn't work when it changes.
We have already discussed elsewhere we need to handle deprecation in our CLI. We are pushing a lot of historical baggage -- for example LVM1 metadata support -- forward with no clear path to get rid of them. Is there a way to pass a (Deprecation)Warning from server to client? Seems DBus has no mechanism to deprecate a method, or particular use case. What options we have if we want to change anything in a backward incompatible way[1]? 1. To create a com.redhat.lvmdbus2, and document lvmdbus1 is Deprecated. 2. To start a lvmdbusd in a strict mode returning errors (useful for development/testing), and keep compatibility when in relaxed mode (useful for production) - see commit: [1] I think we should not be changing behavior, only becoming more strict like in this case and returning error. For now I have implemented the strict checking here: https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=c5d990d6f604d199798318e8cd5884a68e7776f3 We could also raise DeprecationWarning, but it will show in dbusd logs, not clients', so I think we should drop the second change: https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=c8981f814fc841f8d4b39d629df08ae8a9d8cecf Re: assert - yes, it should happen only if service does something wrong. It was protected by DBusError, and if we apply the strict checking patch, it does send an empty list down to _vg_reduce.
I asked around and the way to deprecate is document it and then after a suitable amount of time has passed remove it. However, you cannot deprecate something until you have a suitable replacement for it. I pushed an upstream correction for this. https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=192d142e1cd38c1593ec76502d8424997ec01ec8 We don't need to worry about checking for invalid combinations in the service as lvm command line is doing this already in older versions and newer versions. This change preserves API behavior with old and new versions of lvm.