Bug 1287861 - RFE: `pvmove -b ...` add ability to retrieve % complete and error(s)
Status: NEW
Product: Fedora
Classification: Fedora
Component: lvm2
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Ondrej Kozina
QA Contact: Fedora Extras Quality Assurance
Keywords: FutureFeature
Depends On:
Blocks:
Reported: 2015-12-02 16:00 EST by Tony Asleson
Modified: 2016-02-12 09:27 EST

Doc Type: Enhancement
Type: Bug


Description Tony Asleson 2015-12-02 16:00:52 EST
Today, when you issue `pvmove -b ...`, there is no easy way to determine the status of the job that keeps running after the pvmove command exits.

Specifically:

1. How it completed: error/exit code and error message, if applicable
2. Percent complete, so the move can be monitored while it is in progress
3. Optionally, the ability to cancel the operation, if possible
4. The ability to retrieve the above information if the operation is interrupted and restarted because of a crash/reboot of the system

Thanks!
Comment 1 Ondrej Kozina 2015-12-03 07:31:05 EST
Let's start with classical polling without lvmpolld being enabled:

1) It's almost impossible. 'pvmove -b' initiates the operation (metadata update, mirror construction and so on), forks, detaches itself from the terminal and stays completely silent from then on: no output to a log file or syslog at all. You'd have to run pvmove in the foreground and parse the output and error messages yourself.

2) If the pvmove operation was originally started in the background, e.g. with 'pvmove -b /dev/sda', you can either:

- rerun 'pvmove /dev/sda' in the foreground and continue as in 1) (the command will look up the helper LV to monitor and report progress info)

- periodically call 'lvs -a vg1/pvmoveX' to poll for progress info manually

3) 'pvmove --abort /dev/sda' to cancel that move, or 'pvmove --abort' to cancel all pvmoves in progress.

4) Call 'pvmove' without any parameter to get/resume all pvmoves in progress.

Otherwise it's almost impossible to address an individual pvmove. A pvmove can be resumed as a background job while the VG is being reactivated. Such a background process may be a side effect of a vgchange or lvchange command, or the result of an intentional 'pvmove -b' (i.e. resume all in the background) or 'pvmove -b /dev/sda'. Either way, the situation is the same as in 1).
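
The manual polling described in 2) could be scripted roughly as follows. This is only a sketch, not something lvm2 ships: it assumes the `copy_percent` report field of `lvs` reflects the move progress of the pvmove LV, that the LV name is already known, and that `lvs` failing means the helper LV is gone. The helper names `parse_copy_percent` and `poll_pvmove` are made up for illustration.

```python
import subprocess
import time


def parse_copy_percent(lvs_output):
    """Return the first copy percentage found in `lvs` output, or None."""
    for line in lvs_output.splitlines():
        field = line.strip()
        if not field:
            continue
        try:
            return float(field)
        except ValueError:
            continue
    return None


def poll_pvmove(lv, interval=5.0):
    """Print progress for `lv` (e.g. 'vg1/pvmove0') until lvs stops reporting it."""
    while True:
        result = subprocess.run(
            ["lvs", "-a", "--noheadings", "-o", "copy_percent", lv],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            # Assumption: the pvmove LV no longer exists, so the move has
            # finished or was aborted. lvs alone cannot tell which, and it
            # cannot report why a move failed.
            break
        percent = parse_copy_percent(result.stdout)
        if percent is not None:
            print(f"{lv}: {percent:.2f}%")
        time.sleep(interval)
```

Note that this only covers the "percent complete" part of the request; the exit status and error message of the background job remain out of reach, as described in 1).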
Comment 2 Ondrej Kozina 2015-12-03 08:19:37 EST
With lvmpolld enabled (by default in F23 and RHEL7.2):

Provided you get the vg/pvmove0 LV name from the pvmove -b command:

1), 2) Ask lvmpolld directly. You'll get one of these:
  in-progress (we miss percentages atm, trivial to fix)
  finished
  not-found (implementation detail)
  failed (with reason. I can add parsed error message if it's necessary)

3) pvmove --abort

4) Either the same workaround, pvmove without parameters, or I can add a method to lvmpolld: something like "list of all operations", filtered on demand: "only active", "only finished", "only failed". Afterwards you can continue with 1) and 2). You can look at the output of lvmpolld --dump to get a coarse-grained picture.

But to be sure that all pvmove operations are resumed, you always have to call pvmove without parameters. With lvmpolld you can call 'pvmove -b' and then see the list.
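
To make the proposed "filtered on demand" listing concrete, here is a minimal sketch of client-side filtering over the four states named above. Everything in it is hypothetical: `PollState`, `filter_operations` and the sample records are invented for illustration, and the real lvmpolld dump format is not modelled.

```python
from enum import Enum


class PollState(Enum):
    """The four answers listed above, as lvmpolld might report them."""
    IN_PROGRESS = "in-progress"
    FINISHED = "finished"
    NOT_FOUND = "not-found"
    FAILED = "failed"


def filter_operations(operations, wanted):
    """Keep only (lv_name, state) pairs whose state is in `wanted`."""
    wanted = set(wanted)
    return [(lv, state) for lv, state in operations if state in wanted]


# Hypothetical records standing in for whatever lvmpolld would return.
operations = [
    ("vg00/pvmove0", PollState.IN_PROGRESS),
    ("vg00/pvmove1", PollState.FAILED),
    ("vg01/pvmove0", PollState.FINISHED),
]
only_active = filter_operations(operations, [PollState.IN_PROGRESS])
```

A client that fetches the full list and filters it itself would need nothing more than this, at the cost of transferring entries it does not care about.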
Comment 3 Tony Asleson 2016-02-09 17:31:00 EST
(In reply to Ondrej Kozina from comment #2)
> With lvmpolld enabled (by default in F23 and RHEL7.2):
> 
> Provided you'll get vg/pvmove0 LV name from pvmove -b command:

I just need something to correlate the pvmove that was just specified and the operation representing it in lvmpolld.  The lvname or the lvid would be fine.

We can only have one move per VG, right?  So does that imply that the name is always vg/pvmove0?  I'm just trying to work out whether I can make this assumption until the code that returns it is in place.

> 1), 2) Ask lvmpolld directly. You'll get one of these:
>   in-progress (we miss percentages atm, trivial to fix)

How does pvmove -i 1 dump the percentages if it's not getting it from the poll daemon?

>   finished
>   not-found (implementation detail)
>   failed (with reason. I can add parsed error message if it's necessary)

Yes, an error code and error message would be ideal!

> 3) pvmove --abort

To abort one move you specify the source PV.  I think it would be helpful to include this in the poll data output; otherwise I need to figure out which PV is the source, and I don't know whether that's easy to do.  This is especially true if we are coming up from a reset/reboot and I have no knowledge of what a user did or did not initiate.

Maybe it would be good to include the full lvm command that initiated this move, that would be very helpful to the user IMHO too.

> 4) Either same workaround, pvmove without parameters. Or I can add method to
> lvmpolld: something like "list of all operations" filtered on demand: "only
> active", "only finished", "only failed". Afterwards you can continue with 1)
> and 2). You can look at output of lvmpolld --dump to get a coarse grained
> picture.

I'm ok with getting the full output of 'dump' and filtering it myself.  However, an interrupted move does not resume on its own, correct?  A user needs to re-run pvmove, so until they do, does the poll daemon know about the pending/suspended move operations?

> But to resume all pvmove operations for sure you always have to call pvmove
> w/o parameters. With lvmpolld you can call: 'pvmove -b' and then see the
> list.

I have code ready today that connects to the poll daemon and requests a "dump". I'm ready for the other bits needed to make this work.
Comment 4 Ondrej Kozina 2016-02-12 09:27:34 EST
(In reply to Tony Asleson from comment #3)
> (In reply to Ondrej Kozina from comment #2)
> > With lvmpolld enabled (by default in F23 and RHEL7.2):
> > 
> > Provided you'll get vg/pvmove0 LV name from pvmove -b command:
> 
> I just need something to correlate the pvmove that was just specified and
> the operation representing it in lvmpolld.  The lvname or the lvid would be
> fine.
> 
> We only can have one move per VG right?  So does that imply

Nope. We can have more than one pvmove LV in a single VG.

> that the name is always vg/pvmove0?  Just trying to think if I can make this
> assumption until we get the code in place which returns it.

Please don't do that. Or rather, I think it shouldn't be hard to get JSON output from pvmove -b with the bits you need in a reasonable time. But I should not speak for Peter, of course :)

> > 1), 2) Ask lvmpolld directly. You'll get one of these:
> >   in-progress (we miss percentages atm, trivial to fix)
> 
> How does pvmove -i 1 dump the percentages if it's not getting it from the
> poll daemon?

With lvmpolld enabled it's a sort of hybrid solution, and I would have preferred not to take this approach, but anyway:

After the pvmove is initiated, the command notifies lvmpolld about the operation. The command then keeps asking lvmpolld for the status of the operation. As long as the answer is 'in-progress', the command queries the kernel target and prints the progress info. This repeats until lvmpolld answers with anything other than 'in-progress' (I simplified it but you get the picture, I hope :))
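
The loop described above could be sketched like this. It is a simplified illustration of the control flow only, not the actual pvmove code: the three callables are hypothetical stand-ins for "ask lvmpolld", "query the kernel target" and "print progress", and `follow_pvmove` is an invented name.

```python
import time


def follow_pvmove(query_lvmpolld, query_kernel_percent, report, interval=1.0):
    """Poll until lvmpolld answers anything other than 'in-progress'.

    All three callables are hypothetical stand-ins:
      query_lvmpolld()       -> state string such as 'in-progress' or 'finished'
      query_kernel_percent() -> float, progress read from the kernel target
      report(percent)        -> prints or records the progress
    Returns the final state reported by lvmpolld.
    """
    while True:
        state = query_lvmpolld()
        if state != "in-progress":
            return state
        # Only while lvmpolld says the move is running is the kernel queried.
        report(query_kernel_percent())
        time.sleep(interval)
```

The point of the hybrid is visible here: lvmpolld decides whether the operation is still alive, while the percentage itself comes from a separate kernel query.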

> >   finished
> >   not-found (implementation detail)
> >   failed (with reason. I can add parsed error message if it's necessary)
> 
> Yes error code and error message would be ideal!
>
> > 3) pvmove --abort
> 
> To abort one move you specify the source PV.  I think it would be helpful to
> include this in the poll data output, otherwise I need to try and figure out
> which PV is the source, which I don't know if that's easy to do or not.

I agree. Workaround suggestion until I find a better way:
lvs -a vg/pvmove0 -o +devices (the first PV listed is the source one)
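
A rough sketch of automating this workaround follows. It assumes the `devices` field looks like `/dev/sda(0),/dev/sdb(0)` (PV path plus starting extent) and relies on the statement above that the first PV listed is the source. It requests only the `devices` column rather than `-o +devices`, to keep parsing simple. The function names are invented for illustration.

```python
import re
import subprocess


def first_device(devices_output):
    """Return the first PV named in an lvs `devices` field, or None."""
    for line in devices_output.splitlines():
        line = line.strip()
        if not line:
            continue
        first = line.split(",")[0]
        # Drop the trailing "(extent)" suffix, e.g. "/dev/sda(0)" -> "/dev/sda".
        return re.sub(r"\(\d+\)$", "", first)
    return None


def pvmove_source_pv(lv):
    """Run lvs for `lv` (e.g. 'vg/pvmove0') and return the assumed source PV."""
    result = subprocess.run(
        ["lvs", "-a", "--noheadings", "-o", "devices", lv],
        capture_output=True, text=True, check=True,
    )
    return first_device(result.stdout)
```

The returned path could then be passed to 'pvmove --abort' to cancel that one move.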
 
> This is especially true if we are coming up from a reset/reboot and I have
> no knowledge of what a user initiated or not.
>
> Maybe it would be good to include the full lvm command that initiated this
> move, that would be very helpful to the user IMHO too.

If that information is not stored in the LVM metadata (mda), I won't be able to provide it. From lvmpolld's POV, I store the command that first _asked_ for polling. That may be a different command from the one that actually initiated the operation by writing the LVM metadata. For example:

1) pvmove -i 1 /dev/sda (initiated pvmove LV vg00/pvmove0)
2) system restart
3) new lvmpolld instance started, initiated by vgchange -ay vg00 (or by pvscan --cache, if auto-activation of vg00 took place at the same time)

For lvmpolld:
in 1) the command that initiated polling is pvmove
in 3) the command that initiated polling is vgchange or pvscan, whichever ran.

> > 4) Either same workaround, pvmove without parameters. Or I can add method to
> > lvmpolld: something like "list of all operations" filtered on demand: "only
> > active", "only finished", "only failed". Afterwards you can continue with 1)
> > and 2). You can look at output of lvmpolld --dump to get a coarse grained
> > picture.
> 
> I'm ok with getting the full output of 'dump' and filtering myself. 
> However, an interrupted move does not resume on itself correct?  A user
> needs to re-run pvmove, thus until they do that does the polldaemon know
> about the pending/suspended move operations?

Yes, you're right. lvmpolld does not start any polling unless asked to do so explicitly.

> 
> > But to resume all pvmove operations for sure you always have to call pvmove
> > w/o parameters. With lvmpolld you can call: 'pvmove -b' and then see the
> > list.
> 
> I have code ready today that connects to the poll daemon and requests a
> "dump", I'm ready for the other bits needed to make this work.

Hmm, would you be interested in list-all-active method in lvmpolld instead?
