Bug 1899990

Summary: gptfdisk naming of paritions does not work for big endian systems
Product: Red Hat Enterprise Linux 8 Reporter: Prashanth Sundararaman <psundara>
Component: gdiskAssignee: Nikola Forró <nforro>
Status: CLOSED ERRATA QA Contact: Martin Kyral <mkyral>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 8.3CC: bgilbert, jkejda, kzak, lmiksik, miabbott, mkyral, msuchy, nforro
Target Milestone: rcKeywords: Triaged
Target Release: 8.0   
Hardware: s390x   
OS: Linux   
Whiteboard:
Fixed In Version: gdisk-1.0.3-9.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2065205 (view as bug list) Environment:
Last Closed: 2022-05-10 15:20:04 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2065205    

Description Prashanth Sundararaman 2020-11-20 14:39:17 UTC
On s390x systems, when we list /dev/disk/by-partlabel, the name displayed is garbage.

/dev/disk/by-partlabel:
total 0
drwxr-xr-x. 2 root root 120 Aug 21 21:05  .
drwxr-xr-x. 7 root root 140 Aug 21 21:05  ..
lrwxrwxrwx. 1 root root  10 Aug 21 21:05 ''$'\344\210\200\344\244\200\344\274\200\345\214\200\342\264\200\344\210\200\344\274\200\344\274\200\345\220\200' -> ../../vda3
lrwxrwxrwx. 1 root root  10 Aug 21 21:05 ''$'\344\224\200\344\230\200\344\244\200\342\264\200\345\214\200\345\244\200\345\214\200\345\220\200\344\224\200\344\264\200' -> ../../vda2
lrwxrwxrwx. 1 root root  10 Aug 21 21:05 ''$'\346\210\200\346\274\200\346\274\200\347\220\200' -> ../../vda1
lrwxrwxrwx. 1 root root  10 Aug 21 21:05 ''$'\347\210\200\346\274\200\346\274\200\347\220\200' -> ../../vda4


Looks like this is not handled well for big endian . see discussion here: https://github.com/coreos/ignition-dracut/pull/100#issuecomment-529917905

Comment 1 Nikola Forró 2020-11-23 10:48:44 UTC
Do I assume correctly that this bug report is about gdisk storing the partition name as big endian, while libblkid (which is where /dev/disk/by-partlabel entries come from) decodes it as little endian?

Comment 2 Prashanth Sundararaman 2020-11-23 14:06:29 UTC
(In reply to Nikola Forró from comment #1)
> Do I assume correctly that this bug report is about gdisk storing the
> partition name as big endian, while libblkid (which is where
> /dev/disk/by-partlabel entries come from) decodes it as little endian?

Correct. so should this go to util-linux instead ?

Comment 3 Nikola Forró 2020-11-23 16:07:22 UTC
I'm not sure. Both Wikipedia and OSDev Wiki state that partition name should be stored in UTF-16LE, but UEFI specification doesn't say anything about it. But if every other tool/library in the OS expects it to be little endian, I think we should fix that in gdisk.

Comment 4 Prashanth Sundararaman 2020-11-23 20:04:06 UTC
(In reply to Nikola Forró from comment #3)
> I'm not sure. Both Wikipedia and OSDev Wiki state that partition name should
> be stored in UTF-16LE, but UEFI specification doesn't say anything about it.
> But if every other tool/library in the OS expects it to be little endian, I
> think we should fix that in gdisk.

yes i think by default it is little endian.this is the blkid output:

[core@bootstrap-0 ~]$ blkid
/dev/sda4: LABEL="root" UUID="910678ff-f77e-4a7d-8d53-86f2ac47a823" BLOCK_SIZE="512" TYPE="xfs" PARTLABEL="M-gM-^HM-^@M-fM-<M-^@M-fM-<M-^@M-gM-^PM-^@" PARTUUID="e27be4d8-f8e4-4b7b-9c03-460d2bc67e7d"

Comment 5 Nikola Forró 2020-11-24 18:01:15 UTC
Ok, it turns out it's actually the other way around, gdisk stores and reads partition names as little endian, while everything else (because everything else seems to be based on libblkid) does no conversion on big endian platforms.
gdisk however only does the conversion with its new implementation of UTF16 support, but it can still be built with the old implementation, relying on libicu, so the "fix" seems trivial.

Comment 6 Benjamin Gilbert 2020-11-24 18:41:20 UTC
UEFI spec 2.8 section 1.8 ("Conventions Used in this Document") contains section 1.8.1 ("Data Structure Descriptions"), which says "Supported processors are “little endian” machines." and then goes on to explain endianness.  The spec doesn't seem to address the endianness of character encodings, nor does it even say outright that data structures are stored little-endian, but I read that as implying little-endian characters.

So it sounds like the right fix is in libblkid?

Comment 7 Nikola Forró 2020-11-24 19:06:11 UTC
> So it sounds like the right fix is in libblkid?
That would make partition names platform independent, at least.

Karel, do you have an opinion on this?

Comment 8 Karel Zak 2021-09-22 14:12:33 UTC
OK, finally I had time to play with it. Sorry for the delay ;-)

* I've tried to create two GPT images on s390 and x86_64 by GNU Parted and fdisk. I get the same result everywhere and the correct result from blkid.

Device              Start   End Sectors Size Type             Name
gpt-s390-fdisk.img1  2048  4095    2048   1M Linux filesystem I ❤️  GPT.
gpt-s390-fdisk.img2  4096 20446   16351   8M Linux filesystem 日本は美しい

* I've tried to copy the images between s390 and x86_64 and in all combinations get the same correct result on both archs.

* I've tried to create images by gdisk, fdisk and blkid print incorrect result, parted aborted

* I have tried to copy gdisk image from s390 to x86_64 and it does not work, it seems gdisk itself is inconsistent:

s390:

# sha1sum gdisk.img
df50a0b326a6858413362e2a405b8cf3fd3e72bb  gdisk.img

# gdisk -l gdisk.img
..
Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048            4095   1024.0 KiB  8300  I ❤️  GPT.
   2            4096         1023966   498.0 MiB   8300  日本は美しい


x86_64:

# sha1sum gdisk.img
df50a0b326a6858413362e2a405b8cf3fd3e72bb  gdisk.img

# gdisk -l gdisk.img
..
Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048            4095   1024.0 KiB  8300  䤀 搧࿾  䜀倀吀⸀
   2            4096         1023966   498.0 MiB   8300  Ⱨ漰蹿地䐰


So, something is wrong within gdisk.


Note that GNU Parted (which uses libc iconv() with "UCS-2LE") is not able to read partition name too:

# gdisk -l /dev/sda
...
Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048         1023966   499.0 MiB   8300  Linux filesystem

# parted rted /dev/sda print
,,,
Number  Start   End    Size   File system  Name                             Flags
 1      1049kB  524MB  523MB               䰀椀渀甀砀 昀椀氀攀猀礀猀琀攀洀


It seems that on BE gdisk writes something wrong/different to the GPT, on LE gdisk vs. parted work as expected.

Comment 9 Karel Zak 2021-09-22 14:53:48 UTC
I found the problem in gdisk code. 

The function GPTPart::SetName() reverses by order on BE, but function GPTData::SavePartitionTable() calls partitions[i].ReversePartBytes(); which again (!) reverses the same bytes. It means it does (on BE) UTF8 -> UTF16BE -> UTF16LE ;-)

You need

diff --git a/gptpart.cc b/gptpart.cc
index 17d6f15..d6c1330 100644
--- a/gptpart.cc
+++ b/gptpart.cc
@@ -234,7 +234,7 @@ void GPTPart::SetName(const string & theName) {
       // then to utf16le
       if ( uni < 0x10000 ) {
          name[ pos ] = (uint16_t) uni ;
-         if ( ! IsLittleEndian() ) ReverseBytes( name + pos , 2 ) ;
+//         if ( ! IsLittleEndian() ) ReverseBytes( name + pos , 2 ) ;
          pos ++ ;
       } // if
       else {
@@ -244,10 +244,10 @@ void GPTPart::SetName(const string & theName) {
          } // if
          uni -= 0x10000 ;
          name[ pos ] = (uint16_t)( uni >> 10 ) | 0xd800 ;
-         if ( ! IsLittleEndian() ) ReverseBytes( name + pos , 2 ) ;
+//         if ( ! IsLittleEndian() ) ReverseBytes( name + pos , 2 ) ;
          pos ++ ;
          name[ pos ] = (uint16_t)( uni & 0x3ff ) | 0xdc00 ;
-         if ( ! IsLittleEndian() ) ReverseBytes( name + pos , 2 ) ;
+//         if ( ! IsLittleEndian() ) ReverseBytes( name + pos , 2 ) ;
          pos ++ ;
       }
    } // for

or do not revert the partition name on write.

So conclusion, libblkid is correct and it works as expected. gdisk need to fix :-)

Comment 10 Karel Zak 2021-09-22 14:57:39 UTC
Ah, typo in my previous comment; gdisk does (on BE) UTF8 -> UTF16LE -> UTF16BE

Comment 11 Nikola Forró 2021-09-22 17:46:08 UTC
Thank you Karel, you are 100% correct.

It turns out upstream has meanwhile fixed the issue:
https://sourceforge.net/p/gptfdisk/code/ci/fded770b55fdb3a201ad515d785c17ac35705652/

And added a command to fix the wrongly-stored partition names:
https://sourceforge.net/p/gptfdisk/code/ci/331ad9c795e4db7d09e8141979f809e4f5815319/

I'm going to backport these commits.

Comment 13 Martin Kyral 2021-10-15 10:23:50 UTC
Please, is there anythink which prevents this bug from being moved over to ON_QA? The ITM for this bug is set to 7, ending Monday but the bug is still not in the QE side of the playground. Please, push the ITM by at least one and  switch the bug to QE.

I have a working test for the bug so once the bug is in ON_QA, the verification should be easy.

Thanks.

Comment 24 Nikola Forró 2022-03-15 12:04:41 UTC
It turns out there is the same double-reverse issue in the reading part, and it makes the internal gdisk_test.sh fail.

The following upstream commit has to be backported:
https://sourceforge.net/p/gptfdisk/code/ci/86dd5fea351a5a55bea26b7622eb85ebd6075a60/

Comment 34 Martin Kyral 2022-03-18 10:10:39 UTC
*** Bug 2064176 has been marked as a duplicate of this bug. ***

Comment 36 errata-xmlrpc 2022-05-10 15:20:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (gdisk bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2019