Description of problem:
I took out the exclusive activation lock on link-01 and then attempted to grab that lock on link-02 as well. That command failed as it should, but the exit code was still 0.

[root@link-02 lib]# vgchange -aye
  Error locking on node link-02: Resource temporarily unavailable
  0 logical volume(s) in volume group "linear_1_5844" now active
[root@link-02 lib]# echo $?
0

Version-Release number of selected component (if applicable):
lvm2-cluster-2.02.06-7.0.RHEL4
Part of the problem here, I think, is that vgchange can affect multiple volume groups, so some could have been activated and others not. This isn't specific to clustered groups either. Perhaps we need a particular return code that indicates whether some groups failed to be activated.
Remember that vgchange -ay is shorthand for:

  for each VG on the command line (or all VGs if none given)
    for each LV in that VG
      lvchange -ay VG/LV

Conventionally only the most severe error from any of the constituent commands (lvchange here) gets reported.

In precisely what circumstances do you want an error?

In a previous case the argument that won was that a command that changes something should report an error if the *change* was not possible because the entity was already in the state requested. That would mean that if any referenced LV is *already* active, the command should return an error, because the attempt to change it into the active state failed.

[My preference was for error codes to reflect whether or not the final state was reached, regardless of whether or not anything had to change.]

In a cluster it's even more complicated because of the way tags control the activation and the need to query the lock status.
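As a concrete illustration of that expansion (a minimal sketch only, using a hypothetical VG name "vg0"; the real command iterates internally rather than invoking lvchange):

# Illustrative only: roughly what "vgchange -ay vg0" amounts to,
# activating each LV in the hypothetical VG "vg0" one at a time.
for lv in $(lvs --noheadings -o lv_name vg0); do
    lvchange -ay "vg0/$lv"
done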
First, since the command didn't fall under either of the cases listed in comment #2, an error should be reported. The command attempted to change the state of an entity that was not currently in that state (at least not on that node), and it failed to do so. The final state (being active) was never reached.

But to answer your question: if the volume were already in the exclusively active state and you issued that command again, I wouldn't expect an error, because the final state was reached.

However, the previous case you refer to (bz 179473), where the entity was already "in the state requested" (or removed) and I argued the opposite should happen, doesn't really apply here. In that case I was trying to manipulate an entity (PV) that no longer existed, so the equivalent would be attempting to deactivate a nonexistent VG. Should that fail? Well, technically a nonexistent VG isn't active, so the command did technically work. :)

[root@link-02 tmp]# vgchange -an foo
  Volume group "foo" not found
[root@link-02 tmp]# echo $?
5

It does the right thing and gives the error. :)
Here's what you see with the latest:

[root@link-08 lvm]# vgchange -ae
  Error locking on node link-08: Volume is busy on another node
  1 logical volume(s) in volume group "vg" now active
[root@link-08 lvm]# echo $?
0

I still argue that since the exclusive lock wasn't obtained, and since an error message was even printed, a non-zero exit code should be returned as well.

[root@link-08 lvm]# rpm -qa | grep lvm2
lvm2-cluster-2.02.13-1
lvm2-2.02.13-1
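A minimal sketch of why this matters for scripting (the VG name "vg" and the wrapper itself are hypothetical); with the current behaviour the failure branch is never taken even though the exclusive lock was not granted:

# Hypothetical wrapper: takes the exclusive activation and is supposed
# to refuse to start a service here if the lock is held elsewhere.
if vgchange -aey vg; then
    echo "vg activated exclusively on this node"
else
    echo "exclusive activation failed; not starting the service here" >&2
    exit 1
fi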
Yes, I know. Only the error message has changed, as per bug 162809.
Created attachment 144716 [details]
Idea for an improvement

Here's a patch I did a while ago that does this. The most controversial part of it (I suspect) will be the shifting up of the return codes; this is needed to keep INCOMPLETE in its rightful place in the severity stakes.
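To illustrate the intent rather than the attachment itself (the numeric values below are hypothetical; only 5 matches the failure code seen in the earlier transcript), a distinct code would let callers tell partial activation failure apart from complete success or complete failure:

# Hypothetical exit codes: 0 = all activated, 5 = command failed (as seen
# earlier with the nonexistent VG), 6 = a made-up "some LVs not activated".
vgchange -aey vg
status=$?
case $status in
    0) echo "all LVs in vg are now active exclusively" ;;
    6) echo "partial failure: some LVs were not activated" >&2 ;;
    *) echo "activation failed (exit $status)" >&2 ;;
esac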
Email response from Alasdair:

> I don't see that a new error code gains us anything there:
>
> If operating on multiple objects and you need to know which ones did or
> didn't succeed, then you simply perform the operations separately.
>
> Only use commands that operate on multiple objects when you aren't
> interested in knowing.
>
> The bugzilla referenced is discussing "What precisely should -ae mean?"
> Should it be different from "-ael"? Under precisely what sets of
> circumstances should it return an error, and based on the final state
> or whether a change happened?
>
> The man page says:
>     If clustered locking is enabled, -ae will activate exclusively
>     on one node and -aly will activate only on the local node.
>
> Note that it does *not* say -ae will activate exclusively on the
> *local* node.
>
> Alasdair

I'm assigning this to him because nothing I do is going to get past this argument.
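A minimal sketch of the "perform the operations separately" approach described above, using hypothetical VG names; each vgchange then reports its own exit status, so the caller knows exactly which groups failed:

# Illustrative only: operate on each VG separately so that every
# activation attempt has its own exit status to inspect.
for vg in vg0 vg1; do
    if ! vgchange -aey "$vg"; then
        echo "exclusive activation of $vg failed" >&2
    fi
done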
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Since RHEL 4.8 External Beta has begun, and this bugzilla remains unresolved, it has been rejected, as it is not proposed as an exception or blocker.
Still unresolved? Since this was originally raised, we have added code that can check the lock state, haven't we? So with -aey, can it now:

- check the lock state to see if the LV is already active exclusively on any node, and return success if so, with no error message?

- if it is already active but not exclusively, issue the exclusive activation request in such a way that if it is active on exactly one node that lock is upgraded to exclusive, but if it is active on multiple nodes there is an error?

- if it is not already active on any node, try exclusive activation on the local node first, but if that is filtered out so nothing happens, issue it to all nodes, ignoring errors provided that one node succeeds?
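A rough sketch of that decision logic for a single LV (the helper names active_node_count, is_exclusive_anywhere, upgrade_lock_to_exclusive and activate_exclusive_on_any_node are made up here to stand in for the lock-query code; they are not real LVM interfaces):

# Hypothetical sketch of the proposed "-aey" behaviour for one LV.
lv="vg0/lvol0"                                # hypothetical LV name

if is_exclusive_anywhere "$lv"; then
    exit 0                                    # already exclusively active: success, no message
fi

case $(active_node_count "$lv") in
    0)  # Not active anywhere: try the local node; if activation is
        # filtered out there, fall back to any node.
        lvchange -aey "$lv" || activate_exclusive_on_any_node "$lv" ;;
    1)  # Active on exactly one node: upgrade that node's lock to exclusive.
        upgrade_lock_to_exclusive "$lv" ;;
    *)  # Active on several nodes: exclusive activation cannot succeed.
        echo "  $lv is active on multiple nodes" >&2
        exit 5 ;;
esac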
This should be solved by this commit:

Version 2.02.46 - 21st May 2009
===============================
  Detect LVs active on remote nodes by querying locks if supported.

But this version is not planned for RHEL4; it should already be fixed in RHEL5 and above.