Bug 2233336 - [RFE] add an option to show technical column descriptions instead of human readable / translated ones
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: LVM and device-mapper
Classification: Community
Component: lvm2
Version: unspecified
Hardware: All
OS: All
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Peter Rajnoha
QA Contact: cluster-qe
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-08-21 23:44 UTC by Roland Kletzing
Modified: 2025-06-25 06:46 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2025-06-25 06:46:38 UTC
Embargoed:
pm-rhel: lvm-technical-solution?
pm-rhel: lvm-test-coverage?



Description Roland Kletzing 2023-08-21 23:44:08 UTC
as discussed on the dm-devel mailing list ( cannot link yet as the ml archive doesn't show it yet ), please add another output format option which skips the translation of technical/option field names into human readable form, for better readability.

in its current state, the human readable form has ambiguous column headers (see example below), which makes it hard to read and may cause confusion. 

furthermore, column headers may contain whitespace, which forces us to add an option for an extra separator. that wouldn't be needed if there was a mode which prints the column headers exactly like the given option names

default:

# pvs --units s -o+pv_ba_start,seg_start,seg_start_pe,pvseg_start
  BA Start Start    Start Start
  0S       2097152S 1     0
  0S       2097152S 1     1
  0S       0S       0     5309
  0S       0S       0     5587
  0S       0S       0     5588 
 
(mind we have 3 times "Start" for different columns)

besides --reportformat basic|json , we could call this "--reportmode=optionnames" or "--reportmode=tech"

that could perhaps look like this :

# pvs --reportmode=optionnames --units s -o+pv_ba_start,seg_start,seg_start_pe,pvseg_start
  pv_ba_start seg_start seg_start_pe pvseg_start
  0S          2097152S  1            0
  0S          2097152S  1            1
  0S          0S        0            5309
  0S          0S        0            5587
  0S          0S        0            5588 
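
a minimal python sketch of the ambiguity, assuming the header and first data row are copied verbatim from the two listings above. the `check` helper is hypothetical and only counts whitespace-separated tokens, which is roughly what a reader (or a spreadsheet import) does:

```python
# Header lines and one data row copied from the two listings above.
default_header = "BA Start Start    Start Start"
ids_header = "pv_ba_start seg_start seg_start_pe pvseg_start"
data_row = "0S       2097152S 1     0"


def check(header: str, row: str) -> dict:
    """Count whitespace-separated tokens and report whether the header
    can be matched one-to-one against the values of a data row."""
    names = header.split()
    values = row.split()
    return {
        "header_tokens": len(names),
        "value_tokens": len(values),
        "unique_names": len(set(names)),
        # Usable only if every value gets exactly one, distinct, name.
        "usable": len(names) == len(values) and len(set(names)) == len(names),
    }


default_result = check(default_header, data_row)
ids_result = check(ids_header, data_row)
print(default_result)
print(ids_result)
```

with the default header there are more header tokens than values and only two distinct names, so the columns cannot be told apart. with option names as headers the counts match and every name is distinct.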


Some comment from David Teigland on this:

>It does sound better than some of the strange header abbreviations.
>It could also use the key words that appear with --nameprefixes.
>
>Dave

Comment 1 Peter Rajnoha 2023-08-22 13:56:08 UTC
That sounds useful, I myself got confused sometimes with the headings. A patchset to provide this feature is here: https://gitlab.com/lvmteam/lvm2/-/merge_requests/1

Though, instead of '--reportmode', I've added '--idsinheadings' command line option and 'report/ids_in_headings' lvm.conf option so the logic stays the same as we already use for other similar reporting options (--noheadings, --nameprefixes...).

Comment 2 Zdenek Kabelac 2023-08-22 14:34:27 UTC
The more puzzling question is - why should this actually ever be needed ??

We do provide many flexible ways to shape the output so that automated parsing of the result is as easy as possible.

So it's certainly a bad plan to take the result that is destined for console output readable by a human and try to parse such output with a program when you can simply shape the output directly for the easiest machine parsing.

So as such - you can mark exact columns you want to get and print any header for them you may prefer.

Note - you can get also output in --rows.

Aka - we usually document the output in 'human readable' way - so how would then user know what all the shortcuts mean ??
And if the consumer is another tool - headers shouldn't be actually needed at all.

Comment 3 Roland Kletzing 2023-08-22 15:32:49 UTC
>The more puzzling question is - why should this actually ever be needed ??

because real world people are reading that output and/or forwarding it somewhere, where the formatting may possibly break ? 

the existing headers in their default version are a little bit messy imho. you cannot be sure where a column begins or ends, and as i already said, column names are not unique, which is the main reason why i feel discomfort.

please try to understand the perspective of beginners, for example.

is "BA Start BA Size" 4 or 2 columns ? BA-Start and BA-Size?  Or just BA-Start-BA-Size ?

is "SYS ID System ID" 4 or 2 columns ? SYS-ID and System-ID or is it Sys-ID-System-ID whereas "ID" and "ID" meaning something different (like Start-Start-Start) as shown above ?

imho, a heading / 1st line of column based output should always

- be clear / distinct
- have proper separation , where you always know where a column begins and where it ends. so i think it's a little bit unfortunate that per default,  whitespace is being used as a separator for columns AND for identifiers at the same time. 
- make it possible to search through some manpage and find the meaning of that column

in its current state, it also complicates things if you want to copy/paste such output into your favourite spreadsheet tool.

i remember i ran into problems with this a while ago when i tried to study the output of those fields and gave up, as i was totally frustrated trying to quickly match those column identifiers with those documented in the manpage. a translation layer between technical abbrevs and some real-world naming for it CAN be very helpful - but if there is no easy and reliable way to translate back and forth - it simply sucks.


pvs --units s -o pv_all,lv_all,vg_all|head -n1
  Fmt  PV UUID                                DevSize   PV         Maj Min PMdaFree  PMdaSize  PExtVsn 1st PE  PSize     PFree Used      Attr Allocatable Exported   Missing    PE   Alloc PV Tags #PMda #PMdaUse BA Start BA Size PInUse Duplicate LV UUID                                LV   LV          Path             DMPath                  Parent Layout     Role       InitImgSyn ImgSynced  Merging    Converting AllocPol   AllocLock  FixMin     SkipAct         WhenFull        Active ActLocal       ActRemote  ActExcl            Maj Min Rahead LSize     MSize #Seg Origin Origin UUID                            OSize Ancestors FAncestors Descendants FDescendants Mismatches SyncAction WBehind MinSync MaxSync Move Move UUID                              Convert Convert UUID                           Log Log UUID                               Data Data UUID                              Meta Meta UUID                              Pool Pool UUID                              LV Tags LProfile LLockArgs CTime                      RTime                      Host                  Modules Historical KMaj KMin KRahead LPerms    Suspended  LiveTable            InactiveTable        DevOpen    Data%  Snap%  Meta%  Cpy%Sync Cpy%Sync CacheTotalBlocks CacheUsedBlocks  CacheDirtyBlocks CacheReadHits    CacheReadMisses  CacheWriteHits   CacheWriteMisses KCacheSettings     KCachePolicy       KMFmt Health          KDiscards CheckNeeded     MergeFailed     SnapInvalid     Attr       Fmt  VG UUID                                VG     Attr   VPerms     Extendable Exported   Partial    AllocPol   Clustered  Shared  VSize     VFree SYS ID System ID LockType VLockArgs Ext   #Ext Free MaxLV MaxPV #PV #PV Missing #LV #SN Seq VG Tags VProfile #VMda #VMdaUse VMdaFree  VMdaSize  #VMdaCps



[root@backupvm1 ~]# pvs
  PV         VG     Fmt  Attr PSize   PFree
  /dev/sda2  centos lvm2 a--  <14,00g    0

[root@backupvm1 ~]# pvs -o help 2>&1 |grep -i size

    lv_size                - Size of LV in current units.
    lv_metadata_size       - For thin and cache pools, the size of the LV that holds the metadata.
    origin_size            - For snapshots, the size of the origin device of this LV.
    dev_size               - Size of underlying device in current units.
    pv_mda_size            - Size of smallest metadata area on this device in current units.
    pv_size                - Size of PV in current units.
    pv_ba_size             - Size of PV Bootloader Area in current units.
    vg_size                - Total size of VG in current units.
    vg_extent_size         - Size of Physical Extents in current units.
    vg_mda_size            - Size of smallest metadata area for this VG in current units.
    reshape_len            - Size of out-of-place reshape space in current units.
    reshape_len_le         - Size of out-of-place reshape space in logical extents.
    stripe_size            - For stripes, amount of data placed on one device before switching to the next.
    region_size            - For mirrors/raids, the unit of data per leg when synchronizing devices.
    chunk_size             - For snapshots, the unit of data used when tracking changes.
    seg_size               - Size of segment in current units.
    seg_size_pe            - Size of segment in physical extents.
    pvseg_size             - Number of extents in segment.


so, how do i know what exactly is "PSize" ?

btw, is there a reason why this is printed to stderr instead of stdout (when pvs -o help is a perfectly valid command) ?

Comment 4 Roland Kletzing 2023-08-22 15:40:56 UTC
>Note - you can get also output in --rows.

now THAT's indeed useful, thanks.

i hadn't come across it, and didn't know that together with --nameprefixes it adds the missing link for translating headings to identifiers and back

# pvs --rows --nameprefixes
  PV LVM2_PV_NAME='/dev/sda2'
  VG LVM2_VG_NAME='centos'
  Fmt LVM2_PV_FMT='lvm2'
  Attr LVM2_PV_ATTR='a--'
  PSize LVM2_PV_SIZE='<14,00g'
  PFree LVM2_PV_FREE='0 '
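
a minimal python sketch of that translation, assuming the output has exactly the shape shown above (a single PV, one heading per line) and assuming that the field identifier is simply the prefixed key with "LVM2_" removed and lowercased. the `heading_to_field` helper and the regular expression are hypothetical, not part of lvm2:

```python
import re

# Sample lines copied from the `pvs --rows --nameprefixes` output above.
sample = """\
  PV LVM2_PV_NAME='/dev/sda2'
  VG LVM2_VG_NAME='centos'
  Fmt LVM2_PV_FMT='lvm2'
  Attr LVM2_PV_ATTR='a--'
  PSize LVM2_PV_SIZE='<14,00g'
  PFree LVM2_PV_FREE='0 '
"""

# heading (may contain spaces), then LVM2_<KEY>='<value>'
LINE = re.compile(
    r"^\s*(?P<heading>.+?)\s+LVM2_(?P<key>[A-Z0-9_]+)='(?P<value>.*)'\s*$"
)


def heading_to_field(text: str) -> dict:
    """Map each abbreviated heading to the assumed field identifier."""
    mapping = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            mapping[match["heading"]] = match["key"].lower()
    return mapping


mapping = heading_to_field(sample)
print(mapping["PSize"])
```

under these assumptions "PSize" maps to pv_size, which can then be looked up in the "pvs -o help" listing.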

Comment 5 Zdenek Kabelac 2023-08-22 19:37:23 UTC
Let's be realistic here.

The main usage of e.g. 'pvs -o+xxxx,yyy' is to be able to display just a couple of interesting columns. There is no really good use for displaying tens of columns - as that is really hard for a human to read unless the user has some ultra-wide monitor with very very long lines in the terminal ;)

The reason why column names are somewhat 'shortened/abbreviated' compared to the long form is to make sure the column uses the minimal possible number of characters - every character counts - so if the content of a column may take just 4-5 characters - then wasting another 3-4 characters on the column title is seen as counterproductive.

If the user wants to parse the output with some tool, there are actually better ways than trying to 'reparse' this human readable form: use the already mentioned JSON reporting format or the more split-up field based output instead. And since the user will already know which field goes at which position - any field name is already kind of unnecessary information (--noheadings).

IMHO 'pvs   --reportformat json_std' is mostly the path for your tool...
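
A minimal Python sketch of that path for a tool. The embedded sample is hand-written to illustrate the general shape of the JSON report (a "report" list holding a "pv" list, keyed by field identifiers); it is an assumption, not captured output, and the values are borrowed from the pvs listing earlier in this report. The `rows` helper is hypothetical:

```python
import json

# Hand-written sample of the assumed report shape. In a real tool the text
# would come from running e.g. `pvs --reportformat json` via subprocess.
sample = """
{
  "report": [
    {
      "pv": [
        {"pv_name": "/dev/sda2", "vg_name": "centos", "pv_fmt": "lvm2",
         "pv_attr": "a--", "pv_size": "<14,00g", "pv_free": "0 "}
      ]
    }
  ]
}
"""


def rows(doc: dict, section: str):
    """Yield every row of the given section (e.g. "pv") from all reports."""
    for report in doc.get("report", []):
        yield from report.get(section, [])


pv_rows = list(rows(json.loads(sample), "pv"))
for row in pv_rows:
    # Keys are field identifiers, so no heading needs to be translated.
    print(row["pv_name"], row["pv_size"])
```

Because each value is addressed by its field identifier, neither column order nor heading text matters to the consumer.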

Comment 6 Peter Rajnoha 2023-08-22 19:48:24 UTC
Alternatively, we might as well just fix the existing headings so they're at least unique and do not contain spaces - the "Start Start" example, where the two headings mean two different things, shows a weak spot, and it's indeed true that it is misleading. This is just about readability for humans, I assume, not for machine parsing (for which we do have better formats/output possibilities). As for the difference in character count between column IDs and names - that is not much of a difference considering the small number of columns which is usually used - so I wouldn't bother counting characters in 'ID' headings and 'name' headings.

Comment 7 Peter Rajnoha 2023-08-22 19:50:37 UTC
(Spaces in the 'name' headings are not great either - that should be fixed too then.)

Comment 8 Zdenek Kabelac 2023-08-22 20:01:10 UTC
@peter -  the standard output is not meant to be parsed by any tools/scripts

So the name of column is just the 'shortest' abbreviation we could think of.
In most cases the user will just print only couple interesting columns - so the uniqueness of column name is really not needed - as the user already specified which columns and in which order he wants to see - so the column name is really there just to somehow resemble the name.

Usage of spaces is questionable - but as mentioned in previous posts - such an 'aggressive' change may break some 'oldish' tools that parse the old formatted output, which we still support from before we introduced all the shiny  --format*  support, and it might not be worth breaking them for such a change.

New users should simply use appropriate format outputs for machine parsing...

Comment 9 Roland Kletzing 2023-08-22 20:58:55 UTC
i'd really like to agree with peter here. 

please let's at least fix headings by removing spaces and make them unique.

now with pvs --rows --nameprefixes , i have what i missed and can live with it. 

but that does not mean that i expect others will find that option and be happy with the default, but i don't want to start too long debates on principles here.

>So the name of column is just the 'shortest' abbreviation we could think of.

to be honest - if we can have "PMdaFree", "PExtVsn", "#PMdaUse" or "KMFmt" - i don't understand why BAStart, PEStart, PVBAStart, SegStart, SegStartPE isn't progress when compared to "BA Start", Start, Start, Start.

if there is no willingness or some majority of votes to change the current behaviour/default or add another reportmode, then leave it as is. being conservative with long-time existing tools isn't too bad. 

but i don't have an idea, how some change could break things here or confuse anybody (as it's about avoiding confusion)

Comment 10 Roland Kletzing 2023-08-22 21:01:05 UTC
>such an 'aggressive' change may break some 'oldish' tools that parse the old formatted output, 
>which we still support from before we introduced all the shiny  --format*  support, and it might not be worth breaking them for such a change.

it never was my intention to change the default but just to add some mode where you could more easily translate the column names via manpage

Comment 11 Zdenek Kabelac 2023-08-22 22:42:47 UTC
As said - the reason is to use the 'shortest' abbrev for the column title - one that does not add unnecessary empty space.
So if the column content typically takes 5 chars -  using an abbrev. with 8 chars adds an unnecessary 3 chars.

Historically the pvs/vgs/lvs tools were terminal oriented - and 80 chars was the 'traditional' width - thus that's the target to fit the max amount of info. 

Clearly the situation changes over time - users now mostly use terminals of double that width or even more - so more info fits - but it would still be seen as impractical to have wide 'spacing' between columns if the info in them is short (as the column heading title defines the minimal width), and many columns even have just a single 1/0 output - thus would need only 2 chars in practice...

Also historically some 'vars' were added in a rush without thinking deeply - thus some column titles are simply wastefully long and you can see very long headings for some of them - however this is rather a historical omission...

We could surely use some 'PEStr' or some other shortenings - but that might further mislead other people as to why we use shortcut "A" instead of shortcut "B".

So our only option would be to introduce a new option i.e.:    --nameheadings  abbrev | name   

where the name  would be an auto generated i.e.    reshape_len_le  ->    ReshapeLenLe    

Whether this is worth spending time on I'm really not sure - I have never seen a complaint about this yet - and the use-case seems to be rather some form of unexpected output usage...  (trying to map a column title back to the original column parameter name - while we had a very different goal there)
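
A minimal Python sketch of what such an auto-generated name could look like, assuming the rule is simply "split the parameter name on underscores and capitalize each part". The function name is hypothetical and this is not how lvm2 derives its headings:

```python
def field_id_to_heading(field_id: str) -> str:
    """Turn e.g. 'reshape_len_le' into 'ReshapeLenLe' by capitalizing
    each underscore-separated part and joining the parts."""
    return "".join(part.capitalize() for part in field_id.split("_"))


generated = [
    field_id_to_heading(name)
    for name in ("reshape_len_le", "pv_ba_start", "seg_start", "seg_start_pe")
]
print(generated)
```

Names generated this way are unique and contain no spaces, and since only the underscores are dropped they are at most as wide as the parameter names themselves.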

Comment 12 Peter Rajnoha 2023-08-23 07:06:22 UTC
(In reply to Zdenek Kabelac from comment #11)
> So our only option would be to introduce a new option i.e.:   
> --nameheadings  abbrev | name   
> 
> where the name  would be an auto generated i.e.    reshape_len_le  ->   
> ReshapeLenLe    

For that, we'd need to store another alternative string - definitely not worth it - we already have the identifiers and names. I'd just go with the identifiers directly (that is, with this example, printing "reshape_len_le" in the header) - that is what the patch does ;) It's optional, the patch is easy. The change is straightforward. Anyone who wants that can enable that (I'd use that myself I think :)) It'll be off by default of course. So let's either take the patch or dump it and leave things as they are, probably no need to invent more for output headings.

Comment 13 Peter Rajnoha 2023-08-23 07:16:34 UTC
Really, this is just for making user's life a bit easier, nothing else (not about any parsing, machine processing of the output whatsoever) - I need to match the output with what I typed on console for the output (or I already have configured in the lvm.conf and maybe already forgot the order of columns I have configured there) and I need to quickly see the output and just be sure I don't assume things incorrectly. Minimizing mistakes. The identifiers are all around the "<reporting command> -o help", in man pages etc. It's just easier to match. Who wants to stay with the original conservative headers for 80-characters wide terminals, they are still there, used by default.

Comment 14 Peter Rajnoha 2023-08-23 11:20:28 UTC
Discussed this with Zdenek a bit more - we agreed on adding the alternative headings (the identifiers we also use for "pvs/vgs/lvs -o ..."). We will change the name though - instead of "--idsinheadings", we'll use "--headings <type>" where type would be "none" (equal to --noheadings behavior), "abbrev" (for the ones we normally use, default) and the new "id" one (we'll find a good name for that - the "id" is known internally, but we need to match what we use in man pages and around - so something from "field"/"options"/"columns"...). In lvm.conf, we'll reuse existing "report/headings" option which currently accepts 0/1, we'll add "2" for the new heading type.
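
A minimal Python sketch of the three heading types, assuming the placeholder type names "none", "abbrev" and "id" from this comment (the final name for "id" was still open at this point). The `render` function and the `ABBREV` table are hypothetical and only mimic the column layout of the listings in the description; they are not lvm2 code:

```python
# Field ID -> abbreviated heading, as seen in the listings in the description.
ABBREV = {
    "pv_ba_start": "BA Start",
    "seg_start": "Start",
    "seg_start_pe": "Start",
    "pvseg_start": "Start",
}


def render(fields, rows, headings="abbrev"):
    """Return the report as a list of lines for the given heading type."""
    if headings == "none":
        titles = None
    elif headings == "abbrev":
        titles = [ABBREV[field] for field in fields]
    elif headings == "id":
        titles = list(fields)
    else:
        raise ValueError(f"unknown heading type: {headings!r}")

    table = ([titles] if titles else []) + [[str(v) for v in row] for row in rows]
    # Each column is as wide as its widest cell, heading included.
    widths = [max(len(line[i]) for line in table) for i in range(len(fields))]
    return [
        "  " + " ".join(cell.ljust(w) for cell, w in zip(line, widths)).rstrip()
        for line in table
    ]


fields = ["pv_ba_start", "seg_start", "seg_start_pe", "pvseg_start"]
rows = [("0S", "2097152S", "1", "0"), ("0S", "0S", "0", "5309")]
for mode in ("none", "abbrev", "id"):
    print("\n".join(render(fields, rows, mode)))
```

The sketch also shows the width trade-off discussed earlier: the heading sets the minimal column width, so the "id" type produces wider columns than "abbrev" for the same data.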

Comment 15 Roland Kletzing 2023-08-23 12:22:21 UTC
fantastic! thanks for making it possible and providing such a quick solution! really glad to hear.

Comment 16 Peter Rajnoha 2023-08-30 10:12:55 UTC
Merged https://gitlab.com/lvmteam/lvm2/-/merge_requests/1

