Bug 1977236 - XFS with bigtime=1 inobtcount=1 enabled prevents booting of some ppc64le p9 with older firmware
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: xfsprogs
Version: 9.0
Hardware: ppc64le
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: beta
Target Release: 9.0 Beta
Assignee: Eric Sandeen
QA Contact: Zorro Lang
Michal Stubna
URL:
Whiteboard:
Duplicates: 1977193 1985565
Depends On:
Blocks: 1898842 1971841
Reported: 2021-06-29 09:32 UTC by Petr Janda
Modified: 2023-09-15 01:34 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-03 15:20:50 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
console log (6.06 KB, application/octet-stream)
2021-07-09 16:50 UTC, IBM Bug Proxy
no flags Details
console logs with call traces (98.38 KB, application/octet-stream)
2021-07-20 06:00 UTC, IBM Bug Proxy
no flags Details
call traces (99.46 KB, application/octet-stream)
2021-08-11 13:21 UTC, IBM Bug Proxy
no flags Details


Links
System ID Private Priority Status Summary Last Updated
IBM Linux Technology Center 193479 0 None None None 2021-07-09 13:29:01 UTC
Red Hat Knowledge Base (Solution) 6299901 0 None None None 2021-09-01 14:01:38 UTC

Description Petr Janda 2021-06-29 09:32:01 UTC
Description of problem:
Bare metal Power-9 systems use Petitboot as the boot loader, and it has to be able to mount the /boot partition before the system is started.
With Petitboot v1.7.6, its kernel doesn't support the new XFS features enabled by default as the resolution of bug 1937973, and it refuses to mount the partition.

The system is unable to boot after installation.


Version-Release number of selected component (if applicable):
xfsprogs-5.12.0-3.el9
RHEL-9.0.0-20210626.0 ppc64le

How reproducible:
always

Steps to Reproduce:
1. install RHEL-9 containing xfsprogs-5.12.0-3.el9 (or newer with XFS_SB_FEAT_INCOMPAT_BIGTIME enabled) on OPAL system with Petitboot v 1.7.6
2. reboot after installation

Actual results:
System hangs in petitboot

Expected results:
System boots into installed OS correctly


Additional info:
When I enter a shell in Petitboot, I can try to mount the XFS partition manually:

/ # fdisk -l
Disk /dev/nvme0n1: 4058 MB, 4255121408 bytes, 8310784 sectors
16295 cylinders, 255 heads, 2 sectors/track
Units: sectors of 1 * 512 = 512 bytes

Device       Boot StartCHS    EndCHS        StartLBA     EndLBA    Sectors  Size Id Type
/dev/nvme0n1p1    4,4,1       1023,254,2        2048    8310783    8308736 4057M 8e Linux LVM
Disk /dev/sda: 1863 GB, 2000398934016 bytes, 3907029168 sectors
7660841 cylinders, 255 heads, 2 sectors/track
Units: sectors of 1 * 512 = 512 bytes

Device  Boot StartCHS    EndCHS        StartLBA     EndLBA    Sectors  Size Id Type
/dev/sda1 *  4,4,1       1023,254,2        2048    2099199    2097152 1024M 83 Linux
/dev/sda2    1023,254,2  1023,254,2     2099200 3907028991 3904929792 1862G 8e Linux LVM
Disk /dev/sdb: 1863 GB, 2000398934016 bytes, 3907029168 sectors
7660841 cylinders, 255 heads, 2 sectors/track
Units: sectors of 1 * 512 = 512 bytes

Device  Boot StartCHS    EndCHS        StartLBA     EndLBA    Sectors  Size Id Type
/dev/sdb1    4,4,1       1023,254,2        2048 3907028991 3907026944 1863G 8e Linux LVM
Disk /dev/dm-0: 4096 MB, 4294967296 bytes, 8388608 sectors
522 cylinders, 255 heads, 63 sectors/track
Units: sectors of 1 * 512 = 512 bytes

Disk /dev/dm-0 doesn't contain a valid partition table
fdisk: device has more than 2^32 sectors, can't use all of them
Disk /dev/dm-1: 2048 GB, 2199023255040 bytes, 4294967295 sectors
267349 cylinders, 255 heads, 63 sectors/track
Units: sectors of 1 * 512 = 512 bytes

Disk /dev/dm-1 doesn't contain a valid partition table
Disk /dev/dm-2: 70 GB, 75161927680 bytes, 146800640 sectors
9137 cylinders, 255 heads, 63 sectors/track
Units: sectors of 1 * 512 = 512 bytes

Disk /dev/dm-2 doesn't contain a valid partition table
/ # mount -o ro /dev/sda1 /tmp/mount
mount: mounting /dev/sda1 on /tmp/mount failed: Invalid argument

/ # dmesg | tail
[ 5836.405876] XFS (sda1): Filesystem cannot be safely mounted by this kernel.
[ 5842.504333] XFS (sda1): Superblock has unknown read-only compatible features (0x8) enabled.
[ 5842.504336] XFS (sda1): Attempted to mount read-only compatible filesystem read-write.
[ 5842.504338] XFS (sda1): Filesystem can only be safely mounted read only.
[ 5896.708035] XFS (sda1): Superblock has unknown read-only compatible features (0x8) enabled.
[ 5896.708038] XFS (sda1): Superblock has unknown incompatible features (0x8) enabled.
[ 5896.708041] XFS (sda1): Filesystem cannot be safely mounted by this kernel.
[ 9103.256236] XFS (sda1): Superblock has unknown read-only compatible features (0x8) enabled.
[ 9103.256240] XFS (sda1): Superblock has unknown incompatible features (0x8) enabled.
[ 9103.256242] XFS (sda1): Filesystem cannot be safely mounted by this kernel.
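
For reference, the 0x8 masks in the dmesg output above can be decoded against the superblock feature bits defined in the upstream kernel header fs/xfs/libxfs/xfs_format.h. The following is only an illustrative sketch: the two tables are hand-copied and cover bits 0-3 only, and `decode_features` is a hypothetical helper, not part of any XFS tool.

```python
# Feature bits 0-3, hand-copied from the upstream kernel header
# fs/xfs/libxfs/xfs_format.h (XFS_SB_FEAT_RO_COMPAT_* / XFS_SB_FEAT_INCOMPAT_*).
RO_COMPAT_FEATURES = {0x1: "FINOBT", 0x2: "RMAPBT", 0x4: "REFLINK", 0x8: "INOBTCNT"}
INCOMPAT_FEATURES = {0x1: "FTYPE", 0x2: "SPINODES", 0x4: "META_UUID", 0x8: "BIGTIME"}


def decode_features(mask, table):
    """Return the names of the feature bits set in mask (hypothetical helper)."""
    names = []
    for bit, name in sorted(table.items()):
        if mask & bit:
            names.append(name)
            mask &= ~bit
    if mask:
        # Bits that the tables above do not cover.
        names.append(f"unrecognized({mask:#x})")
    return names


print(decode_features(0x8, RO_COMPAT_FEATURES))  # ['INOBTCNT']
print(decode_features(0x8, INCOMPAT_FEATURES))   # ['BIGTIME']
```

Under these assumptions, the two masks correspond to the inobtcount=1 and bigtime=1 settings named in the bug title.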

Comment 3 Dan Horák 2021-06-29 12:14:39 UTC
copying my reply from a related email thread:

Petitboot is just an application running in a Linux-based mini-distro; for a new kernel you need a whole new firmware build. The kernel in the mini-distro (skiroot) must support the filesystem where the /boot content is stored, at least read-only. That might be doable for some OpenPOWER systems (if the firmware has an upstream on GitHub), but difficult for others.

My first workaround would be to limit the RTT testing to virtualized systems (KVM or LPAR/PowerVM) as they boot via grub2.

A second workaround is to let anaconda format /boot as eg. ext4 (or an older xfs or ...) that would be understood by skiroot's kernel.

The proper solution is to update the firmware ...

Comment 4 Eric Sandeen 2021-06-29 13:50:39 UTC
So, this isn't really an xfsprogs bug per se, it's just that Petitboot doesn't understand the latest xfs features.  Kernel v5.10 understands these new features, and it was released in

Comment 5 Eric Sandeen 2021-06-29 13:54:16 UTC
So, this isn't really an xfsprogs bug per se, it's just that Petitboot doesn't understand the latest xfs features.  Kernel v5.10 understands these new features, and it was released in Dec 2020.  Two paths forward: upgrade the kernel in Petitboot to 5.10 or newer, or teach Anaconda to make the /boot filesystem on this architecture with these features disabled, for now.

Unfortunately I don't think there is any way for mkfs.xfs to detect that it's formatting boot for these systems to auto-tune itself...

-Eric
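
As an illustration of the second path (creating /boot with the new features disabled), an installer could build the mkfs.xfs command line along these lines. This is a minimal sketch, not Anaconda's actual code: `boot_mkfs_xfs_args` and its `legacy_firmware` parameter are hypothetical names, and the only mkfs.xfs options used are the `-m bigtime=` and `-m inobtcount=` metadata settings.

```python
def boot_mkfs_xfs_args(device, legacy_firmware):
    """Build an argument list for formatting /boot with XFS.

    Hypothetical helper: when legacy_firmware is true, the two features
    that a pre-5.10 kernel cannot mount are switched off explicitly.
    """
    args = ["mkfs.xfs"]
    if legacy_firmware:
        args += ["-m", "bigtime=0,inobtcount=0"]
    args.append(device)
    return args


# Only builds and prints the command; nothing is formatted here.
print(" ".join(boot_mkfs_xfs_args("/dev/sda1", legacy_firmware=True)))
```

How `legacy_firmware` would be determined is a separate question; as noted above, mkfs.xfs itself has no way to detect it.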

Comment 6 Steve Best 2021-06-29 17:36:01 UTC
*** Bug 1977193 has been marked as a duplicate of this bug. ***

Comment 7 IBM Bug Proxy 2021-06-29 17:46:25 UTC
------- Comment From chavez.com 2021-06-29 10:56 EDT-------
Hi Steve,

Can you clarify what you mean by Petitboot hangs? Are you saying it neither attempts to boot the newly installed RHEL 9 OS or any of the options from the menu are selectable or something else?

RHBZ bug 1977236 mentions the issue with that hang is that Petitboot can't mount a /boot filesystem with this XFS feature (XFS_SB_FEAT_INCOMPAT_BIGTIME) enabled. It gives a couple of choices. One is to use something like ext4 for /boot or update the OPAL firmware.

I'll add someone from the OPAL team to comment on whether newer versions of Petitboot have an xfsprogs able to mount an XFS filesystem with the newer features.

Lastly, you may want to follow the instructions in 1977236 and see if attempting to manually mount the XFS filesystem does fail for you the same way with:

------- Comment From chavez.com 2021-06-29 11:00 EDT-------
BTW, can you provide the level of firmware that is installed on that P9 box? Thanks.

------- Comment From cdeadmin.com 2021-06-29 11:03 EDT-------
cde00 (cdeadmin.com) added native attachment /tmp/AIXOS13394811/console on 2021-06-29 10:02:57

Comment 8 Eric Sandeen 2021-06-29 18:27:03 UTC
(In reply to IBM Bug Proxy from comment #7)
> ------- Comment From chavez.com 2021-06-29 10:56 EDT-------
> Hi Steve,
> 
> Can you clarify what you mean by Petitboot hangs? Are you saying it neither
> attempts to boot the newly installed RHEL 9 OS or any of the options from
> the menu are selectable or something else?
> 
> RHBZ bug 1977236 mentions the issue with that hang is that Petitboot can't
> mount a /boot filesystem with this XFS feature
> (XFS_SB_FEAT_INCOMPAT_BIGTIME) enabled. It gives a couple of choices. One is
> to use something like ext4 for /boot or update the OPAL firmware.

Please don't switch to ext4 for this; if we have to special-case the /boot
mkfs anyway, it would be preferable to just disable these features in XFS at
mkfs time.

> I'll add someone from the OPAL team to comment on whether newer versions of
> Petitboot have an xfsprogs able to mount an XFS filesystem with the newer
> features.

To be clear - 
The issue is not xfsprogs, which is not involved in mounting the filesystem.

A kernel version 5.10 or newer is needed to mount an XFS filesystem with these
features.

Thanks,
-Eric

Comment 9 Dan Horák 2021-06-30 08:03:20 UTC
If I see right, then op-build is "stuck" at kernel 5.4.x :-(

Comment 10 Dan Horák 2021-06-30 08:20:13 UTC
op-build is the buildsystem for the OpenPOWER firmware = https://github.com/open-power/op-build

Comment 11 Steve Best 2021-07-09 13:48:05 UTC
(In reply to Eric Sandeen from comment #5)
> So, this isn't really an xfsprogs bug per se, it's just that Petitboot
> doesn't understand the latest xfs features.  Kernel v5.10 understands these
> new features, and it was released in Dec 2020.  Two paths forward: upgrade
> the kernel in Petitboot to 5.10 or newer, or teach Anaconda to make the
> /boot filesystem on this architecture with these features disabled, for now.
> 
> Unfortunately I don't think there is any way for mkfs.xfs to detect that
> it's formatting boot for these systems to auto-tune itself...
> 
> -Eric

IBM,

can you share your plans for Petitboot in this area? I assume that Petitboot has plans to get updated to support xfs for this?

Thanks,
-Steve

Comment 12 IBM Bug Proxy 2021-07-09 16:50:39 UTC
Created attachment 1800084 [details]
console log

Comment 13 Eric Sandeen 2021-07-13 16:40:42 UTC
Steve, is there any way we can tie minimum petitboot/firmware requirements to a RHEL9 install? That's really the only sane path forward, IMHO.

Comment 14 Steve Best 2021-07-13 16:56:25 UTC
(In reply to Eric Sandeen from comment #13)
> Steve, is there any way we can tie minimum petitboot/firmware requirements
> to a RHEL9 install? That's really the only sane path forward, IMHO.

Eric,
I'm still waiting on IBM to answer my question about whether and when Petitboot will be updated for this; without an answer I'm not sure we know what our path forward could be.

-Steve

Comment 15 Eric Sandeen 2021-07-13 18:01:42 UTC
I just wondered if there is any way for the OS (i.e. Anaconda) to detect the petitboot version prior to install...

Comment 16 Dan Horák 2021-07-13 18:31:14 UTC
there is "lsmcode", which reports the firmware details; below is the output from my Talos

[dan@talos libica]$ sudo lsmcode
Version of System Firmware : 
 Product Name          : OpenPOWER Firmware
 Product Version       : talos-v1.20-161-g76f78f4
 Product Extra         : 	skiboot-bc106a0
 Product Extra         : 	bmc-firmware-version-2.00
 Product Extra         : 	occ-a8d0767
 Product Extra         : 	hostboot-884b60b
 Product Extra         : 	buildroot-2017.11.2-8-g4b6188e0f2
 Product Extra         : 	machine-xml-221192a
 Product Extra         : 	sbe-a389a5d
 Product Extra         : 	petitboot-v1.7.1-p836d356
 Product Extra         : 	linux-v4.15.9-openpower1-p9e03417

and from our team's Boston

[root@ibm-p9b-generic-01 ~]# lsmcode
Version of System Firmware : 
 Product Name          : OpenPOWER Firmware
 Product Version       : SUPERMICRO-P9DSU-V2.14-20190807-prod
 Product Extra         : 	skiboot-v6.0.20
 Product Extra         : 	bmc-firmware-version-2.13
 Product Extra         : 	occ-8fa3854
 Product Extra         : 	hostboot-8591ded-p4f715ce
 Product Extra         : 	buildroot-2018.11.3-12-g222837a
 Product Extra         : 	capp-ucode-p9-dd2-v4
 Product Extra         : 	machine-xml-734a35e
 Product Extra         : 	hostboot-binaries-hw072719a.op920
 Product Extra         : 	sbe-b6ee17b
 Product Extra         : 	hcode-hw072719a.op920
 Product Extra         : 	petitboot-v1.7.5-p11ed908
 Product Extra         : 	linux-4.19.57-openpower1-p48ee860
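
A rough sketch of how the petitboot and kernel versions could be pulled out of `lsmcode` output like the above. It assumes the "Product Extra : name-version" layout shown in these two samples; `firmware_components` is a hypothetical helper, and the sample text is copied from the Boston output.

```python
import re

# Matches only the petitboot and linux "Product Extra" lines and drops an
# optional leading "v" from the version string.
_EXTRA = re.compile(r"^\s*Product Extra\s*:\s*(petitboot|linux)-v?(\S+)", re.MULTILINE)


def firmware_components(lsmcode_output):
    """Return {'petitboot': ..., 'linux': ...} for the lines that are present."""
    return {name: version for name, version in _EXTRA.findall(lsmcode_output)}


sample = (
    " Product Extra         : \tsbe-b6ee17b\n"
    " Product Extra         : \tpetitboot-v1.7.5-p11ed908\n"
    " Product Extra         : \tlinux-4.19.57-openpower1-p48ee860\n"
)
print(firmware_components(sample))
```

Other components (hostboot, skiboot, and so on) are deliberately ignored, since only the skiroot kernel version decides whether the new XFS features can be mounted.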

Comment 17 IBM Bug Proxy 2021-07-19 19:00:32 UTC
------- Comment From chavez.com 2021-07-19 14:54 EDT-------
*** Bug 193695 has been marked as a duplicate of this bug. ***

Comment 18 IBM Bug Proxy 2021-07-19 19:10:22 UTC
------- Comment From chavez.com 2021-07-19 15:00 EDT-------
Hi Ryan,

We are seeing additional reports of this problem now. Are there any plans to update OPAL's kernel version this year? If not, Red Hat may have to include special checks in RHEL 9 to determine OPAL's kernel version to avoid passing mkfs.xfs the bigtime=1 option that prevents OPAL from mounting the xfs filesystem if less than version 5.10.

Comment 19 IBM Bug Proxy 2021-07-20 06:00:53 UTC
Created attachment 1803503 [details]
console logs with call traces


------- Comment on attachment From preeti.thakur.com 2021-07-20 01:50 EDT-------




Hi,
here is an update from my side:
I was able to boot the system when it was installed with ext4, but call traces are seen during boot.
Attaching console logs.

Comment 20 IBM Bug Proxy 2021-07-20 06:00:54 UTC
------- Comment From cdeadmin.com 2021-07-20 01:52 EDT-------
cde00 (cdeadmin.com) added native attachment /tmp/AIXOS13394811/wcwsp3.txt on 2021-07-20 00:52:45

Comment 21 IBM Bug Proxy 2021-07-21 12:51:11 UTC
------- Comment From hegdevasant.com 2021-07-21 08:41 EDT-------
(In reply to comment #12)
> If I see right, then op-build is "stuck" at kernel 5.4.x :-(

We are rebasing the `op-build` kernel via https://github.com/open-power/op-build/pull/4214. It will move to 5.10 soon.

Regarding official firmware, I need to check with our release management team.

-Vasant

Comment 22 Steve Best 2021-07-28 08:54:25 UTC
(In reply to IBM Bug Proxy from comment #21)
> ------- Comment From hegdevasant.com 2021-07-21 08:41 EDT-------
> (In reply to comment #12)
> > If I see right, then op-build is "stuck" at kernel 5.4.x :-(
> 
> We are rebasing the `op-build` kernel via
> https://github.com/open-power/op-build/pull/4214. It will move to 5.10 soon.
> 
> Regarding official firmware, I need to check with our release management
> team.
> 
> -Vasant

when will the firmware be released? without a fix .. customers won't be able to install RHEL 9.0. we need a plan from IBM to fix this issue.

-Steve

Comment 23 IBM Bug Proxy 2021-07-28 18:20:44 UTC
------- Comment From chavez.com 2021-07-28 14:14 EDT-------
(In reply to comment #25)
> when will the firmware be released? without a fix .. customers won't be able
> to install RHEL 9.0. we need a plan from IBM to fix this issue.
> -Steve

Hi Steve,

Vasant has been actively discussing this issue almost daily with architects and hardware product owners. Hopefully, he'll have an update for y'all soon.

Comment 24 Jeff Bastian 2021-08-04 19:18:01 UTC
*** Bug 1985565 has been marked as a duplicate of this bug. ***

Comment 26 IBM Bug Proxy 2021-08-11 11:10:50 UTC
------- Comment From preeti.thakur.com 2021-08-11 07:03 EDT-------
updated with new fw

[root@ltc-wcwsp3 ~]# lsmcode
Product Name     : OpenPOWER Firmware
Product Version    : witherspoon-OP9-v2.6-9.93
Product Extra     : 	skiboot-v6.8-45-g8246de863
Product Extra     : 	bmc-firmware-version-0.00
Product Extra     : 	occ-16131c3
Product Extra     : 	hostboot-9e73780
Product Extra     : 	buildroot-2021.02.3-2-g2c7a998
Product Extra     : 	capp-ucode-p9-dd2-v4
Product Extra     : 	machine-xml-0f9b366
Product Extra     : 	hostboot-binaries-hw080421a.opmst10
Product Extra     : 	sbe-8b47418
Product Extra     : 	hcode-hw080421a.opmst
Product Extra     : 	petitboot-v1.12
Product Extra     : 	linux-5.10.50-openpower1-p59fd803

and we are able to detect /boot with xfs file system ..
though the call traces are still seen.

The above has been verified with the 0626 build, as I was not able to install the system with the 0725 or beta build, for which I will be raising a new defect.

Comment 27 IBM Bug Proxy 2021-08-11 13:21:08 UTC
Created attachment 1813129 [details]
call traces

Comment 28 IBM Bug Proxy 2021-08-11 13:30:50 UTC
------- Comment From cdeadmin.com 2021-08-11 09:23 EDT-------
cde00 (cdeadmin.com) added native attachment /tmp/AIXOS13394811/wcwsp3_Calltraces.txt on 2021-08-11 08:22:57

Comment 29 IBM Bug Proxy 2021-08-11 14:11:09 UTC
------- Comment From preeti.thakur.com 2021-08-11 10:08 EDT-------
raised a defect for the issue as mentioned in comment 30

https://bugzilla.linux.ibm.com/show_bug.cgi?id=193955

Comment 30 Eric Sandeen 2021-08-11 21:25:44 UTC
Your call traces are likely the same problem as reported in:

https://bugzilla.kernel.org/show_bug.cgi?id=210749

Comment 31 Eric Sandeen 2021-08-11 21:28:46 UTC
As for the original XFS issue, we can close this now, yes? The original issue was not a RHEL bug per se, but a firmware compatibility question... this issue should, however, be documented (or even programmatically tested at install time) I think.

Comment 32 IBM Bug Proxy 2021-08-12 13:50:56 UTC
------- Comment From kalshett.com 2021-08-12 09:46 EDT-------
(In reply to comment #28)
> Hi Preeti, Kalpana,
> Can you please flash below FW on witherspoon and then try to install RHEL 9
> and see if it works fine or not?
> https://github.com/open-power/op-build/releases/download/v2.7/witherspoon.pnor

@Preeti: The link above is for the v2.7 pnor, but from your testing I see the version shown as v2.6:
witherspoon-OP9-v2.6-9.93

Can you please confirm whether you tested the pnor posted by Vasanth?
https://github.com/open-power/op-build/releases/download/v2.7/witherspoon.pnor

Also, IMO, it is always better to first recreate the original issue reported in this defect, then apply the pnor suggested by Vasanth and check that the originally reported issue (this defect) is no longer seen.

Comment 33 IBM Bug Proxy 2021-08-12 14:30:43 UTC
------- Comment From kalshett.com 2021-08-12 10:25 EDT-------
(In reply to comment #28)
> Hi Preeti, Kalpana,
> Can you please flash below FW on witherspoon and then try to install RHEL 9
> and see if it works fine or not?
> https://github.com/open-power/op-build/releases/download/v2.7/witherspoon.pnor
> -Vasant

Vasanth: From IBM internal builds we can get v2.6 from below link:
https://rchweb.rchland.ibm.com/afs/rchland.ibm.com/projects/esw/op999/Builds/999.2132.20210810n/images/lab/witherspoon/

So how do we apply the witherspoon.pnor alone from the git link that you have posted?
I.e, https://github.com/open-power/op-build/releases/tag/v2.7

Comment 34 IBM Bug Proxy 2021-08-12 14:40:43 UTC
------- Comment From preeti.thakur.com 2021-08-12 10:36 EDT-------
In reply to comment 35:
I was in sync with Vasant and he confirmed the go-ahead.
I will retest in case of any discrepancy.

Comment 35 IBM Bug Proxy 2021-08-12 14:50:39 UTC
------- Comment From preeti.thakur.com 2021-08-12 10:43 EDT-------
below tar was provided for update which was used

https://rchweb.rchland.ibm.com/afs/rchland.ibm.com/projects/esw/op999/Builds/999.2132.20210810n/images/lab/witherspoon/witherspoon.pnor.squashfs.tar

Comment 36 IBM Bug Proxy 2021-08-16 05:50:34 UTC
------- Comment From hegdevasant.com 2021-08-16 01:41 EDT-------
Ok.. test was done using upstream op-build v2.7 ... where we have rebased petitboot kernel to 5.10.x .. This was to double check whether 5.10 kernel works fine or not.

This is *not* official released firmware. We are working with IBM program management to fix official firmware .. which will take some time.

In the meantime, if you want to test, you can use the above-mentioned upstream firmware (or just rebase the petitboot kernel to 5.10.x).

-Vasant

Comment 37 IBM Bug Proxy 2021-08-16 06:10:38 UTC
------- Comment From preeti.thakur.com 2021-08-16 02:08 EDT-------
Thanks a lot, Vasant, for the update.

Since the fix is already applied on the said machine, i.e. wcwsp3, and it's working, we can continue to test here.
For other systems we can apply the fix provided and continue with our testing.
We can keep this defect open until the firmware is officially released.

Thanks

Comment 38 IBM Bug Proxy 2021-08-17 08:11:09 UTC
------- Comment From hegdevasant.com 2021-08-17 04:04 EDT-------
(In reply to comment #34)
> Your call traces are likely the same problem as reported in:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=210749

I guess this is different from call traces/kernel PANIC hit during installation.
In above defect we hit call traces as we hit duplicate sysfs node name, but system continues to boot.
But in case of LTC bz 193955 (which is not-yet-mirrored to RedHat bugzilla) we hit PANIC during installation.

-Vasant

Comment 39 IBM Bug Proxy 2021-08-17 18:50:34 UTC
------- Comment From gusld.com 2021-08-17 14:40 EDT-------
(In reply to comment #39)
> Ok.. test was done using upstream op-build v2.7 ... where we have rebased
> petitboot kernel to 5.10.x .. This was to double check whether 5.10 kernel
> works fine or not.
>
> This is *not* official released firmware. We are working with IBM program
> management to fix official firmware .. which will take some time.

Thanks Vasant!

Red Hat,

Even though we are working to get a firmware fix released, we should consider that customers may have P9 systems with old firmware... maybe we should still have a check done during the RHEL9.0 installation to detect old firmware and, in this case, either enable the workaround or abort the installation with an instructive message so that customers know what to do.

Comment 40 Eric Sandeen 2021-08-17 19:02:13 UTC
(In reply to IBM Bug Proxy from comment #39)

> Red Hat,
> 
> Even though we are working to get a firmware fix released, we should
> consider that customers may have P9 systems with old firmware... maybe we
> should still have a check done during the RHEL9.0 installation to detect old
> firmware and, in this case, either enable the workaround or abort the
> installation with an instructive message so that customers know what to do.

I agree. Can you file a bug against Anaconda for this issue, and include detailed steps for how the installer can detect the installed firmware version, and which version it should look for?

Thanks,
-Eric

Comment 41 IBM Bug Proxy 2021-08-19 10:01:07 UTC
------- Comment From hegdevasant.com 2021-08-19 05:59 EDT-------
Hello RedHat,

By any chance do you know the xfs patches that caused this regression? Any idea how easy/difficult it would be to backport them to the 5.3/5.4 kernel?

If it's something fairly easy then maybe it's worth backporting those patches instead of rebasing. That's the other option I'm thinking of.

-Vasant

Comment 42 Eric Sandeen 2021-08-19 16:12:10 UTC
(In reply to IBM Bug Proxy from comment #41)
> ------- Comment From hegdevasant.com 2021-08-19 05:59 EDT-------
> Hello RedHat,
> 
> By any chance do you know the xfs patches that caused this regression? Any
> idea how easy/difficult to backport them to 5.3/5.4 kernel?
> 
> If it's something fairly easy then maybe it's worth backporting those patches
> instead of rebasing. That's the other option I'm thinking of.
> 
> -Vasant

The timestamp series was 25-30 patches, and the inode btree counter was about 10 patches.  There may also be some dependencies; I'd have to look more closely.  It's not something we would probably consider backporting out of sequence ourselves, just for what it's worth.

Does the firmware need to mount the host filesystem in read-write, or in readonly mode?  The inode btree counters are at least RO compatible.

Does the firmware care about timestamps on this filesystem at all?  If all it needs to do is find boot image blocks, it may not care, and we could do something more targeted to just ignore the feature if timestamps don't matter. If you need to write to it, or check timestamps, that won't be an option.

We still need to know how to check for firmware version, so that the installer can at least warn the user if their current firmware is known to be incompatible, can you please provide that info as well?

Thanks,
-Eric
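
The distinction drawn above between read-only compatible and incompatible features can be summarized in a small decision function. This is a simplified model of the behaviour seen in the dmesg output in the description, not the kernel's actual code; `can_mount` and its parameters are hypothetical names.

```python
def can_mount(unknown_ro_compat, unknown_incompat, read_only):
    """Simplified model of the XFS mount decision for unknown feature bits.

    unknown_ro_compat / unknown_incompat are masks of feature bits that the
    mounting kernel does not recognize.
    """
    if unknown_incompat:
        # Any unknown incompat feature blocks the mount, even read-only.
        return False
    if unknown_ro_compat and not read_only:
        # Unknown ro-compat features permit read-only mounts only.
        return False
    return True


# inobtcount alone (ro-compat 0x8): mountable read-only by an older kernel.
print(can_mount(0x8, 0x0, read_only=True))   # True
# inobtcount plus bigtime (incompat 0x8): not mountable at all.
print(can_mount(0x8, 0x8, read_only=True))   # False
```

In this model, disabling only bigtime would already let an older skiroot kernel mount /boot read-only; whether read-only access is sufficient for Petitboot is the question asked above.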

Comment 43 Dan Horák 2021-08-19 16:25:22 UTC
The firmware is a Linux (mini-)distro based on the buildroot project giving you access to petitboot (kexec bootloader as a userspace app), a shell and other tools. I'm not sure about the need of RO or RW access.

The kernel version is stored in the device-tree in /proc/device-tree/ibm,firmware-versions/linux

Comment 44 IBM Bug Proxy 2021-08-20 07:40:52 UTC
------- Comment From hegdevasant.com 2021-08-20 03:30 EDT-------
(In reply to comment #46)
> The firmware is a Linux (mini-)distro based on the buildroot project giving
> you access to petitboot (kexec bootloader as a userspace app), a shell and
> other tools. I'm not sure about the need of RO or RW access.
> The kernel version is stored in the device-tree in
> /proc/device-tree/ibm,firmware-versions/linux

AFAIK we mount the boot partition RO only and parse grub.cfg to detect all installed kernels.

-Vasant

Comment 45 IBM Bug Proxy 2021-08-24 15:01:21 UTC
------- Comment From gusld.com 2021-08-24 10:56 EDT-------
(In reply to comment #46)
> The firmware is a Linux (mini-)distro based on the buildroot project giving
> you access to petitboot (kexec bootloader as a userspace app), a shell and
> other tools. I'm not sure about the need of RO or RW access.
> The kernel version is stored in the device-tree in
> /proc/device-tree/ibm,firmware-versions/linux

Vasant,
Is this the standard/recommended way to query the firmware version? This doesn't seem to be present at least on my ZZ bare metal.

Comment 46 Dan Horák 2021-08-24 17:14:30 UTC
Another way could be running "lsmcode", but it's reading the device tree if I see right ...

IIRC the "ibm,firmware-versions" entry was only added to skiboot at some point, it wasn't always there. Looking at git history it came with skiboot 5.9 (https://github.com/open-power/skiboot/blob/master/doc/release-notes/skiboot-5.9.rst)

It's documented in https://github.com/open-power/skiboot/blob/master/doc/device-tree/ibm%2Cfirmware-versions.rst
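
A minimal sketch of an install-time check based on the device-tree property mentioned in this thread. It assumes the property holds a NUL-terminated version string such as "v4.15.9-openpower1-p9e03417" or "5.10.50-openpower1-p59fd803" (the forms seen in the lsmcode outputs here); the function names and the `None` convention for "could not determine" are hypothetical.

```python
import re
from pathlib import Path

# Path taken from this thread; the property may be missing on firmware with
# skiboot older than 5.9 or on non-OPAL systems.
FW_LINUX = Path("/proc/device-tree/ibm,firmware-versions/linux")
MIN_KERNEL = (5, 10)  # first kernel able to mount bigtime/inobtcount XFS


def parse_kernel_version(raw):
    """Return (major, minor) from a firmware kernel version string, or None."""
    match = re.match(r"v?(\d+)\.(\d+)", raw.strip("\x00 \n"))
    return (int(match.group(1)), int(match.group(2))) if match else None


def firmware_kernel_supports_new_xfs(path=FW_LINUX):
    """True/False if the version can be read and parsed, None otherwise."""
    try:
        raw = path.read_text(errors="replace")
    except OSError:
        return None
    version = parse_kernel_version(raw)
    return None if version is None else version >= MIN_KERNEL


print(firmware_kernel_supports_new_xfs())
```

A caller would have to decide how to treat `None`, for example by warning the user rather than blocking the installation.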

Comment 49 Eric Sandeen 2021-08-25 22:01:21 UTC
I've filed bug #1997832 against Anaconda - we will need to detect the petitboot version in any case. Whether we recommend an upgrade or implement some sort of workaround is yet to be determined.

As this is not really an XFS bug per se, we may eventually close this bug in favor of the Anaconda bug, which is the only place a remedy can really occur.

Thanks,
-Eric

Comment 54 Gustavo Luiz Duarte (IBM) 2021-09-29 19:11:49 UTC
We now have a firmware build for P9 Witherspoon systems with petitboot kernel 5.10... it is available for Red Hat at:

http://rhel8gduarteibm.usersys.redhat.com/fw/OP940_2138A-prod/

We should update a Witherspoon system and verify that this firmware update fixes the issue.

Comment 55 Gustavo Luiz Duarte (IBM) 2021-09-29 19:26:29 UTC
(In reply to Gustavo Luiz Duarte (IBM) from comment #54)
> We now have a firmware build for P9 Witherspoon systems with petitboot
> kernel 5.10... it is available for Red Hat at:
> 
> http://rhel8gduarteibm.usersys.redhat.com/fw/OP940_2138A-prod/
> 
> We should update a Witherspoon system and verify that this firmware update
> fixes the issue.

I should have mentioned that this is a production signed firmware image and should be installed on a system which currently has a production signed firmware... there are instructions on the above URL on how to check the currently installed firmware image using the CheckSecurityLevel.sh script (available at https://docs.engineering.redhat.com/display/KE/PowerPC#PowerPC-Signed/Unsignedfirmware) and how to proceed with the firmware upgrade.

If a development signed (aka. unsigned, aka. imprint) image is required, please let me know.

Comment 56 IBM Bug Proxy 2021-09-29 19:28:04 UTC
------- Comment From chavez.com 2021-09-29 15:23 EDT-------
*** Bug 193955 has been marked as a duplicate of this bug. ***

Comment 57 Arrash 2021-09-29 20:20:18 UTC
Thank you, Gustavo.  I am updating ibm-p9wr-03.ibm2 with this new firmware and will report back shortly

Comment 58 Arrash 2021-09-29 22:15:15 UTC
I can confirm that with the firmware Gustavo provided we are able to install RHEL9 on Witherspoon systems that have production firmware.

Comment 63 Eric Sandeen 2021-11-03 15:20:50 UTC
I'm going to close this as NOTABUG, for xfs/xfsprogs in any case.  The issue is on the petitboot firmware side.  The Anaconda bug should have implemented changes to handle this gracefully and notify the user:  https://bugzilla.redhat.com/show_bug.cgi?id=1997832

Comment 69 Red Hat Bugzilla 2023-09-15 01:34:48 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days

