Bug 1321617

Summary: lvs exits with non-zero return on successful run
Product: Red Hat Enterprise Linux 7
Reporter: Jack Waterworth <jwaterwo>
Component: lvm2
Assignee: Peter Rajnoha <prajnoha>
lvm2 sub component: Displaying and Reporting
QA Contact: Peter Rajnoha <prajnoha>
Status: CLOSED NOTABUG
Docs Contact:
Severity: medium
Priority: unspecified
CC: agk, heinzm, jbrassow, lczerner, msnitzer, prajnoha, prockai, zkabelac
Version: 7.2
Flags: lczerner: needinfo? (agk)
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-06-02 14:48:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Jack Waterworth 2016-03-28 16:03:15 UTC
Description of problem:

lvs exits with a non-zero return code on a successful run due to volume group flags being set

------------------------------
[root@cs-rh6-3 ~]# lvs --config 'global{ locking_type = 1 }'; echo $?
  Skipping clustered volume group vg_virtualmachines
  Skipping volume group vg_virtualmachines
  Skipping clustered volume group gfs2meta
  Skipping volume group gfs2meta
  Skipping clustered volume group vg_vmconfigs
  Skipping volume group vg_vmconfigs
  LV      VG        Attr      LSize  Pool Origin Data%  Move Log Cpy%Sync Convert
  lv_home vg_csrh63 -wi-ao--- 73.50g                                             
  lv_root vg_csrh63 -wi-ao--- 50.00g                                             
  lv_swap vg_csrh63 -wi-ao--- 11.78g                                             
5
------------------------------

------------------------------
[root@localhost ~]# lvs
  Volume group testvg is exported
  LV   VG   Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root rhel -wi-ao----   6.67g                                                    
  swap rhel -wi-ao---- 820.00m                                                    
[root@localhost ~]# echo $?
5
------------------------------

Version-Release number of selected component (if applicable):
lvm2-libs-2.02.130-5.el7_2.1.x86_64
lvm2-2.02.130-5.el7_2.1.x86_64


How reproducible:
Every Time

Steps to Reproduce:
1. Have an exported or clustered volume group
2. Run lvs
3. Check the return code

Actual results:
lvs returns '5' instead of '0'

Expected results:
lvs should return '0'


Additional info:

lvs appears to have been created with scripting in mind. The problem this creates is that a successful run of `lvs` with these volume group options configured returns a non-zero exit code, which makes scripts fail. Non-zero returns should be reserved for actual failures, such as inconsistent metadata or some other problem within LVM itself, not for special configurations.

This is specifically causing an issue within SSM, which checks the return code of `lvs` during storage-related operations.
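
For scripts that consume `lvs` today, one stopgap is to classify the exit status instead of treating every non-zero value as fatal. A minimal sketch (`classify_lvs_rc` is a hypothetical helper, not part of lvm2; 5 corresponds to lvm2's ECMD_FAILED, the code seen above):

```shell
# classify_lvs_rc: map an lvs exit status onto a coarse category so a
# script can decide whether the output is still usable.
#   0 -> ok
#   5 -> partial (ECMD_FAILED, e.g. skipped clustered/exported VGs
#        while output for the remaining VGs was still produced)
#   * -> error
classify_lvs_rc() {
  case "$1" in
    0) echo ok ;;
    5) echo partial ;;
    *) echo error ;;
  esac
}
```

A caller could then run `lvs`, capture `$?`, and abort only on `error`; whether `partial` is acceptable depends on the script.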

Comment 2 Jack Waterworth 2016-03-28 16:06:09 UTC
related ssm bug: https://bugzilla.redhat.com/show_bug.cgi?id=1321236

Comment 3 Peter Rajnoha 2016-06-02 14:38:00 UTC
(In reply to Jack Waterworth from comment #0)
> Description of problem:
> 
> lvs exits with non-zero return on successful due to set volume group flags
> 
> [...]
> 
> Actual results:
> lvs returns '5' instead of '0'
> 
> Expected results:
> lvs should return '0'

If processing of one of the objects fails (in this case the "skipped because of inappropriate locking" counts as the failure), the operation as a whole fails, no matter whether the other objects are processed correctly.

Currently, we're working on a new feature that will make it possible to track per-object processing in more detail, so users can see a return code for each object.

The code is currently here: https://git.fedorahosted.org/cgit/lvm2.git/log/?h=dev-prajnoha-json. This patchset adds three output modes:

  - native (the classical output and error/warning reporting as it exists today)

  - extended (error/warning messages issued during processing are collected and then presented as a report, i.e. in a tabular view with columns and rows, similar to pvs/vgs/lvs output; return codes are also collected for each processed object)

  - json (the same as "extended" output, but in JSON format)

For example:

# vgs
  Skipping clustered volume group vg
  VG     #PV #LV #SN Attr   VSize  VFree
  fedora   1   2   0 wz--n- 19.49g    0 

# echo $?
5

# vgs --reportformat extended
  Report: vg
  VG     #PV #LV #SN Attr   VSize  VFree
  fedora   1   2   0 wz--n- 19.49g    0 
  
  Report: status
  Seq MsgType    Context    ObjectType ObjectID                               ObjectName Msg                                Code 
    1 status     processing vg         elQNcv-AmqS-nZb4-mro1-RBq1-9ojN-sIozB9 fedora     success                                1
    2 error      processing vg         QPc3nM-gNwY-LXtg-Kq9j-flPE-80f5-mMd5Bc vg         Skipping clustered volume group vg     0
    3 status     processing vg         QPc3nM-gNwY-LXtg-Kq9j-flPE-80f5-mMd5Bc vg         failure                                5


# vgs --reportformat json    
  {
      "report": [
          {
              "vg": [
                  {"vg_name":"fedora", "pv_count":"1", "lv_count":"2", "snap_count":"0", "vg_attr":"wz--n-", "vg_size":"19.49g", "vg_free":"0 "}
              ]
          }
      ]
      ,
      "status": [
          {"seq_num":"1", "type":"status", "context":"processing", "object_type":"vg", "object_id":"elQNcv-AmqS-nZb4-mro1-RBq1-9ojN-sIozB9", "object_name":"fedora", "message":"success", "code":"1"},
          {"seq_num":"2", "type":"error", "context":"processing", "object_type":"vg", "object_id":"QPc3nM-gNwY-LXtg-Kq9j-flPE-80f5-mMd5Bc", "object_name":"vg", "message":"Skipping clustered volume group vg", "code":"0"},
          {"seq_num":"3", "type":"status", "context":"processing", "object_type":"vg", "object_id":"QPc3nM-gNwY-LXtg-Kq9j-flPE-80f5-mMd5Bc", "object_name":"vg", "message":"failure", "code":"5"}
      ]
  }
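
Either report form lends itself to post-processing. For the tabular status report shown above, a sketch that prints the names of objects whose processing failed (hedged: `failed_objects` is a hypothetical helper, and the column layout is taken from the dev-branch example above, so it may change):

```shell
# failed_objects: read a "Report: status" table on stdin and print the
# ObjectName of each row whose Msg column is "failure".
# Column layout assumed (from the example above):
#   Seq MsgType Context ObjectType ObjectID ObjectName Msg Code
# Msg may contain spaces on "error" rows, so the Msg of the
# single-word "status" rows is addressed as the next-to-last field.
failed_objects() {
  awk '$2 == "status" && $(NF-1) == "failure" { print $6 }' "$@"
}
```

For the run above, piping the status table through `failed_objects` would print `vg`.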

I hope this feature will also land in 7.3.

Comment 4 Alasdair Kergon 2016-06-02 14:48:43 UTC
The command is performing correctly.

By not specifying which VGs you wish to see, you are asking the tool to show everything, and it gives an error because it cannot access every VG.

For finer granularity, run the command separately for each VG - then you can see which VGs are accessible and which are not.
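
The per-VG approach suggested above could be scripted along these lines (a sketch; `per_vg_lvs` and the `LVS_CMD` override are hypothetical names for illustration, not lvm2 features):

```shell
# per_vg_lvs: run lvs once per named VG, so a failure on one VG does
# not hide the status of the others. LVS_CMD is overridable (e.g. for
# testing); it defaults to the real lvs binary.
: "${LVS_CMD:=lvs}"

per_vg_lvs() {
  for vg in "$@"; do
    "$LVS_CMD" "$vg" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 0 ]; then
      echo "$vg: ok"
    else
      echo "$vg: failed (rc=$rc)"
    fi
  done
}
```

The VG names could come from `vgs --noheadings -o vg_name`, giving a per-VG accessibility report instead of one aggregated exit code.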

Comment 5 Alasdair Kergon 2016-06-02 15:00:10 UTC
Note we already have:
       --ignoreskippedcluster
              Use to avoid exiting with a non-zero status code if the command
              is run without clustered locking and some clustered Volume
              Groups have to be skipped over.
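
Using the option above, a thin wrapper could keep skipped clustered VGs from failing a script (a sketch; `safe_lvs` and the `RUN` dry-run hook are hypothetical, not part of lvm2):

```shell
# safe_lvs: run lvs with --ignoreskippedcluster so clustered VGs that
# must be skipped do not turn an otherwise successful listing into
# exit status 5. RUN may be set (e.g. RUN=echo) for a dry run.
: "${RUN:=}"

safe_lvs() {
  $RUN lvs --ignoreskippedcluster "$@"
}
```

Note this only covers the clustered-VG case; an exported VG (as in the second reproducer) would still produce exit status 5.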

Comment 6 Lukáš Czerner 2016-07-22 07:28:44 UTC
This seems very counterintuitive to me. It simply can't be understood without knowing the internal workings of LVM, which I think is a bit too much to ask of users.

Giving a warning, as you seem to do with "Volume group whatever is exported", is fine - though a bit more explanation that it can't show the logical volumes would be nice - but flatly returning an error code seems a bit weird to me.

If you really want to leave this in, then please at least change the misleading documentation on lvm:

"All tools return a status code of zero on success or non-zero on failure."


Now I have a question: how can I work around this? What does this error actually mean? Will it show up in other cases as well, where there really is a tangible failure? I do not want to simply ignore this return code if it can also be returned in more failure-like cases.

Thanks!
-Lukas