Bug 1899990
Summary: | gptfdisk naming of paritions does not work for big endian systems | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Prashanth Sundararaman <psundara> | |
Component: | gdisk | Assignee: | Nikola Forró <nforro> | |
Status: | CLOSED ERRATA | QA Contact: | Martin Kyral <mkyral> | |
Severity: | medium | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 8.3 | CC: | bgilbert, jkejda, kzak, lmiksik, miabbott, mkyral, msuchy, nforro | |
Target Milestone: | rc | Keywords: | Triaged | |
Target Release: | 8.0 | |||
Hardware: | s390x | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | gdisk-1.0.3-9.el8 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 2065205 (view as bug list) | Environment: | ||
Last Closed: | 2022-05-10 15:20:04 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 2065205 |
Description
Prashanth Sundararaman
2020-11-20 14:39:17 UTC
Do I assume correctly that this bug report is about gdisk storing the partition name as big endian, while libblkid (which is where /dev/disk/by-partlabel entries come from) decodes it as little endian? (In reply to Nikola Forró from comment #1) > Do I assume correctly that this bug report is about gdisk storing the > partition name as big endian, while libblkid (which is where > /dev/disk/by-partlabel entries come from) decodes it as little endian? Correct. so should this go to util-linux instead ? I'm not sure. Both Wikipedia and OSDev Wiki state that partition name should be stored in UTF-16LE, but UEFI specification doesn't say anything about it. But if every other tool/library in the OS expects it to be little endian, I think we should fix that in gdisk. (In reply to Nikola Forró from comment #3) > I'm not sure. Both Wikipedia and OSDev Wiki state that partition name should > be stored in UTF-16LE, but UEFI specification doesn't say anything about it. > But if every other tool/library in the OS expects it to be little endian, I > think we should fix that in gdisk. yes i think by default it is little endian.this is the blkid output: [core@bootstrap-0 ~]$ blkid /dev/sda4: LABEL="root" UUID="910678ff-f77e-4a7d-8d53-86f2ac47a823" BLOCK_SIZE="512" TYPE="xfs" PARTLABEL="M-gM-^HM-^@M-fM-<M-^@M-fM-<M-^@M-gM-^PM-^@" PARTUUID="e27be4d8-f8e4-4b7b-9c03-460d2bc67e7d" Ok, it turns out it's actually the other way around, gdisk stores and reads partition names as little endian, while everything else (because everything else seems to be based on libblkid) does no conversion on big endian platforms. gdisk however only does the conversion with its new implementation of UTF16 support, but it can still be built with the old implementation, relying on libicu, so the "fix" seems trivial. UEFI spec 2.8 section 1.8 ("Conventions Used in this Document") contains section 1.8.1 ("Data Structure Descriptions"), which says "Supported processors are “little endian” machines." and then goes on to explain endianness. The spec doesn't seem to address the endianness of character encodings, nor does it even say outright that data structures are stored little-endian, but I read that as implying little-endian characters. So it sounds like the right fix is in libblkid? > So it sounds like the right fix is in libblkid?
That would make partition names platform independent, at least.
Karel, do you have an opinion on this?
OK, finally I had time to play with it. Sorry for the delay ;-) * I've tried to create two GPT images on s390 and x86_64 by GNU Parted and fdisk. I get the same result everywhere and the correct result from blkid. Device Start End Sectors Size Type Name gpt-s390-fdisk.img1 2048 4095 2048 1M Linux filesystem I ❤️ GPT. gpt-s390-fdisk.img2 4096 20446 16351 8M Linux filesystem 日本は美しい * I've tried to copy the images between s390 and x86_64 and in all combinations get the same correct result on both archs. * I've tried to create images by gdisk, fdisk and blkid print incorrect result, parted aborted * I have tried to copy gdisk image from s390 to x86_64 and it does not work, it seems gdisk itself is inconsistent: s390: # sha1sum gdisk.img df50a0b326a6858413362e2a405b8cf3fd3e72bb gdisk.img # gdisk -l gdisk.img .. Number Start (sector) End (sector) Size Code Name 1 2048 4095 1024.0 KiB 8300 I ❤️ GPT. 2 4096 1023966 498.0 MiB 8300 日本は美しい x86_64: # sha1sum gdisk.img df50a0b326a6858413362e2a405b8cf3fd3e72bb gdisk.img # gdisk -l gdisk.img .. Number Start (sector) End (sector) Size Code Name 1 2048 4095 1024.0 KiB 8300 䤀 搧 䜀倀吀⸀ 2 4096 1023966 498.0 MiB 8300 Ⱨ漰蹿地䐰 So, something is wrong within gdisk. Note that GNU Parted (which uses libc iconv() with "UCS-2LE") is not able to read partition name too: # gdisk -l /dev/sda ... Number Start (sector) End (sector) Size Code Name 1 2048 1023966 499.0 MiB 8300 Linux filesystem # parted rted /dev/sda print ,,, Number Start End Size File system Name Flags 1 1049kB 524MB 523MB 䰀椀渀甀砀 昀椀氀攀猀礀猀琀攀洀 It seems that on BE gdisk writes something wrong/different to the GPT, on LE gdisk vs. parted work as expected. I found the problem in gdisk code. The function GPTPart::SetName() reverses by order on BE, but function GPTData::SavePartitionTable() calls partitions[i].ReversePartBytes(); which again (!) reverses the same bytes. It means it does (on BE) UTF8 -> UTF16BE -> UTF16LE ;-) You need diff --git a/gptpart.cc b/gptpart.cc index 17d6f15..d6c1330 100644 --- a/gptpart.cc +++ b/gptpart.cc @@ -234,7 +234,7 @@ void GPTPart::SetName(const string & theName) { // then to utf16le if ( uni < 0x10000 ) { name[ pos ] = (uint16_t) uni ; - if ( ! IsLittleEndian() ) ReverseBytes( name + pos , 2 ) ; +// if ( ! IsLittleEndian() ) ReverseBytes( name + pos , 2 ) ; pos ++ ; } // if else { @@ -244,10 +244,10 @@ void GPTPart::SetName(const string & theName) { } // if uni -= 0x10000 ; name[ pos ] = (uint16_t)( uni >> 10 ) | 0xd800 ; - if ( ! IsLittleEndian() ) ReverseBytes( name + pos , 2 ) ; +// if ( ! IsLittleEndian() ) ReverseBytes( name + pos , 2 ) ; pos ++ ; name[ pos ] = (uint16_t)( uni & 0x3ff ) | 0xdc00 ; - if ( ! IsLittleEndian() ) ReverseBytes( name + pos , 2 ) ; +// if ( ! IsLittleEndian() ) ReverseBytes( name + pos , 2 ) ; pos ++ ; } } // for or do not revert the partition name on write. So conclusion, libblkid is correct and it works as expected. gdisk need to fix :-) Ah, typo in my previous comment; gdisk does (on BE) UTF8 -> UTF16LE -> UTF16BE Thank you Karel, you are 100% correct. It turns out upstream has meanwhile fixed the issue: https://sourceforge.net/p/gptfdisk/code/ci/fded770b55fdb3a201ad515d785c17ac35705652/ And added a command to fix the wrongly-stored partition names: https://sourceforge.net/p/gptfdisk/code/ci/331ad9c795e4db7d09e8141979f809e4f5815319/ I'm going to backport these commits. Please, is there anythink which prevents this bug from being moved over to ON_QA? The ITM for this bug is set to 7, ending Monday but the bug is still not in the QE side of the playground. Please, push the ITM by at least one and switch the bug to QE. I have a working test for the bug so once the bug is in ON_QA, the verification should be easy. Thanks. It turns out there is the same double-reverse issue in the reading part, and it makes the internal gdisk_test.sh fail. The following upstream commit has to be backported: https://sourceforge.net/p/gptfdisk/code/ci/86dd5fea351a5a55bea26b7622eb85ebd6075a60/ *** Bug 2064176 has been marked as a duplicate of this bug. *** Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (gdisk bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:2019 |