Bug 1731069 - After reboot during system installation, the system cannot find the boot option written to the disk [Power8]
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: grub2
Version: 7.7
Hardware: ppc64le
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 7.7
Assignee: Bootloader engineering team
QA Contact: Release Test Team
URL:
Whiteboard:
Depends On:
Blocks: 1689150 1689420 1776446
 
Reported: 2019-07-18 09:33 UTC by Ping Zhang
Modified: 2020-01-13 19:01 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:


Attachments
the full failed console.log (234.60 KB, text/plain)
2019-07-18 09:33 UTC, Ping Zhang
successful console.log (411.90 KB, text/plain)
2019-07-18 09:34 UTC, Ping Zhang
Successful_Petitboot-v1.4.4-91eed07_grub.cfg (4.46 KB, text/plain)
2019-08-16 09:53 UTC, Ping Zhang
Successful-Petitboot-v1.4.4-91eed07.grub.cfg (4.46 KB, text/plain)
2019-08-16 10:06 UTC, Ping Zhang
Part1 of RH-Bug1731069.tar.gz (4.00 MB, application/gzip)
2019-08-30 09:19 UTC, Ping Zhang
Part2 of RH-Bug1731069.tar.gz (4.00 MB, application/octet-stream)
2019-08-30 09:20 UTC, Ping Zhang
Part3 of RH-Bug1731069.tar.gz (4.00 MB, application/octet-stream)
2019-08-30 09:22 UTC, Ping Zhang
Part4 of RH-Bug1731069.tar.gz (3.88 MB, application/octet-stream)
2019-08-30 09:23 UTC, Ping Zhang


Links
System                       ID      Priority  Status  Summary  Last Updated
IBM Linux Technology Center  179313  None      None    None     2019-08-01 20:22:18 UTC

Description Ping Zhang 2019-07-18 09:33:43 UTC
Created attachment 1591734 [details]
the full failed console.log

Description of problem:
When I ran my tests for RHEL-7.7 on P8 systems, the tests always failed because the system could not be installed successfully. After the reboot it always boots from the network interface instead of from the disk to which the boot option was written, as shown below:
 Petitboot (v1.4.4-e1658ec)
 ------------------------------------------------------------------------------

  System information
  System configuration
  System status log
  Language
  Rescan devices
  Retrieve config from URL
 *
  Exit to shell

 ------------------------------------------------------------------------------
  Enter=accept, e=edit, n=new, x=exit, l=language, g=log, h=help
 Welcome to Petitboot
 Info: Waiting for device discovery
 8286-42A 103519V
 [enP3p5s0f0] Configuring with DHCP
 Processing DHCP lease response (ip: 10.19.15.81)
 Requesting config tftp://10.19.42.13/bootloader/netqe-p8-02.knqe.
 [Network: enP3p5s0f0 / 40:f2:e9:5a:44:fc]
 netboot enP3p5s0f0 (pxelinux.0)
 1 downloads in progress...
 [enP3p5s0f0] Failed to download tftp://10.19.42.13/bootloader/netqe-p8-02.knqe.
[-- MARK -- Wed May 22 19:25:00 2019]
[-- MARK -- Wed May 22 19:30:00 2019]
What's more, it will cause the system to hang.

However, when I run the same tests for RHEL8.1.0 and RHEL7.6 on these systems,
installation works and the tests run smoothly on these P8 systems.

Version-Release number of selected component (if applicable):
RHEL7.7; even the latest RHEL7.7 encounters this problem.

How reproducible:
Run a multihost test on two P8 systems, or just provision two P8 systems via Beaker XML.
Here is my multihost job:
https://beaker.engineering.redhat.com/jobs/3599302
A single-host job sometimes reproduces the issue as well:
https://beaker.engineering.redhat.com/jobs/3614795

Steps to Reproduce:
1. Submit a job with a multihost task.
2. Check the system installation on both systems.

Actual results:
The system cannot find the boot option on disk after the reboot during system installation. As a result,
5 of 6 runs of the multihost test case for RHEL7.7 failed.

Expected results:
The system boots from disk after the reboot during system installation.

Additional info:
hosts:
netqe-p8-01.knqe.lab.eng.bos.redhat.com 
netqe-p8-02.knqe.lab.eng.bos.redhat.com

Comment 2 Ping Zhang 2019-07-18 09:34:32 UTC
Created attachment 1591735 [details]
successful console.log

Comment 3 Javier Martinez Canillas 2019-07-18 15:15:29 UTC
Can you please share the /boot/grub2/grub.cfg and /boot/grub2/grubenv files for the successful and failing cases?

Also, I noticed that you are using a different Petitboot version:

 - Petitboot (v1.4.4-91eed07) in the successful case.
 - Petitboot (v1.4.4-e1658ec) in the failing case.

Could you please test with the same machine and Petitboot version, just to make sure that the problem is not in the OPAL firmware? On ppc64le PowerNV (Non-Virtualized), the grub.cfg and grubenv files are parsed by Petitboot, not by grub2. It could still be, though, that the grub tools are not generating a correct grub config file in the failing case.

Comment 4 IBM Bug Proxy 2019-08-01 20:20:18 UTC
------- Comment From diegodo@br.ibm.com 2019-08-01 16:13 EDT-------
Hi,

Could you please provide the files requested in the previous comment?

Thanks

Comment 5 IBM Bug Proxy 2019-08-14 18:40:27 UTC
------- Comment From mbringm@us.ibm.com 2019-08-14 14:39 EDT-------
RedHat:
We need more information here.  We do not have access to the beaker environment
for replication or debugging.

* Can you generate a sosreport for the platform?
* Firmware versions
* PowerNV or PowerVM configuration
* Adapter configuration

Also, there was no response to Frank's question/request regarding the 2 different versions of petitboot
that were observed.

Comment 6 IBM Bug Proxy 2019-08-15 14:32:47 UTC
------- Comment From mbringm@us.ibm.com 2019-08-14 14:40 EDT-------
RedHat:
Since RHEL 7.7 has gone to GA, is this still an issue?

Comment 7 Ping Zhang 2019-08-16 09:53:55 UTC
Created attachment 1604322 [details]
Successful_Petitboot-v1.4.4-91eed07_grub.cfg

Comment 8 Ping Zhang 2019-08-16 10:06:54 UTC
Created attachment 1604324 [details]
Successful-Petitboot-v1.4.4-91eed07.grub.cfg

Comment 9 Ping Zhang 2019-08-16 10:15:40 UTC
(In reply to Javier Martinez Canillas from comment #3)
> Can you please share the /boot/grub2/grub.cfg and /boot/grub2/grubenv files
> for the successful and failing cases?
> 
> Also, I noticed that you are using a different Petitboot version:
> 
>  - Petitboot (v1.4.4-91eed07) in the successful case.
>  - Petitboot (v1.4.4-e1658ec) in the failing case.
> 
> Could you please test with the same machine and Petitboot version just to
> make sure that the problem is not in the OPAL firmware? Since the grub.cfg
> and grubenv files are parsed by Petitboot and not grub2 for ppc64le PowerNV
> (Non-Virtualized). It still could be though that the grub tools are not
> generating a correct grub config file in the failing case.

For the successful cases, I uploaded two grub.cfg files; the grubenv is shown below:
cat /boot/grub2/grubenv
# GRUB Environment Block
saved_entry=Red Hat Enterprise Linux Server (3.10.0-1062.el7.ppc64le) 7.7 (Maipo)
#####################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################################

For the failed cases, I cannot find the /boot directory, so there is no grub.cfg or grubenv file.
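
For anyone comparing the two files by hand, a minimal sketch of a consistency check follows. It reads saved_entry from a grubenv file and looks for a menuentry with the same title in grub.cfg. The function name check_saved_entry is hypothetical, and the check assumes that menuentry titles are single-quoted (as grub2-mkconfig typically writes them) and that saved_entry holds a title rather than an index or id.

```shell
# check_saved_entry GRUBENV GRUBCFG
# Hypothetical helper: reports whether the saved_entry title in GRUBENV
# appears as a single-quoted menuentry title in GRUBCFG.
check_saved_entry() {
    grubenv=$1
    grubcfg=$2

    # Take the text after "saved_entry=" on the first matching line.
    entry=$(sed -n 's/^saved_entry=//p' "$grubenv" | head -n 1)
    if [ -z "$entry" ]; then
        echo "no saved_entry in $grubenv"
        return 1
    fi

    # -F treats the title as a fixed string, so parentheses and dots
    # in kernel versions are not interpreted as a regular expression.
    if grep -qF "menuentry '$entry'" "$grubcfg"; then
        echo "found: $entry"
    else
        echo "missing: $entry"
        return 1
    fi
}
```

On an installed system this could be called as check_saved_entry /boot/grub2/grubenv /boot/grub2/grub.cfg. It only checks that the two files agree with each other; it says nothing about how Petitboot itself parses them.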

Comment 10 Ping Zhang 2019-08-16 11:28:01 UTC
(In reply to IBM Bug Proxy from comment #5)
> ------- Comment From mbringm@us.ibm.com 2019-08-14 14:39 EDT-------
> RedHat:
> We need more information here.  We do not have access to the beaker
> environment
> for replication or debugging.
> 
> * Can you generate a sosreport for the platform?
> * Firmware versions
> * PowerNV or PowerVM configuration
> * Adapter configuration
> 
> Also, there was no response to Frank's question/request regarding the 2
> different versions of petitboot
> that were observed.

I generated sosreports for these two systems, but cannot upload them to Bugzilla.
I have not found a good way to share these files with you.

Comment 11 IBM Bug Proxy 2019-08-16 14:50:41 UTC
------- Comment From chavez@us.ibm.com 2019-08-16 10:45 EDT-------
Hello,

Please try https://testcase.software.ibm.com and log in as anonymous with no password. Navigate to the /toibm/linux directory and upload the files there. Once done, please add a comment here with the name of the file(s) uploaded.

Comment 12 Ping Zhang 2019-08-20 11:03:31 UTC
(In reply to IBM Bug Proxy from comment #11)
> ------- Comment From chavez@us.ibm.com 2019-08-16 10:45 EDT-------
> Hello,
> 
> Please try https://testcase.software.ibm.com and login with anonymous and no
> password. Navigate to the /toibm/linux directory and upload the files there.
> Once done, please add a comment here with the name of the file(s) uploaded.

I created a tar file named RH-Bug1731069.tar which includes the sosreports of
these two systems. I think I uploaded it to the /toibm/linux directory,
but I am not sure, because I cannot read that directory.

Comment 13 IBM Bug Proxy 2019-08-20 15:42:51 UTC
------- Comment From mbringm@us.ibm.com 2019-08-20 11:35 EDT-------
(In reply to comment #16)
> (In reply to IBM Bug Proxy from comment #11)
> > Hello,
> >
> > Please try https://testcase.software.ibm.com and login with anonymous and no
> > password. Navigate to the /toibm/linux directory and upload the files there.
> > Once done, please add a comment here with the name of the file(s) uploaded.
> I create a tar file named RH-Bug1731069.tar which includes two sosreports of
> these two systems. And I think i uploaded it two the /toibm/linux directory,
> but i am not sure, because i can not read that directory.

I just checked that directory, but do not see any file with the name RH-Bug1731069.tar.
Did you select the file locally on your system with 'Browse' before using the 'Upload (binary)' button?

Comment 14 Ping Zhang 2019-08-21 02:09:32 UTC
(In reply to IBM Bug Proxy from comment #13)
> ------- Comment From mbringm@us.ibm.com 2019-08-20 11:35 EDT-------
> (In reply to comment #16)
> > (In reply to IBM Bug Proxy from comment #11)
> > > Hello,
> > >
> > > Please try https://testcase.software.ibm.com and login with anonymous and no
> > > password. Navigate to the /toibm/linux directory and upload the files there.
> > > Once done, please add a comment here with the name of the file(s) uploaded.
> > I create a tar file named RH-Bug1731069.tar which includes two sosreports of
> > these two systems. And I think i uploaded it two the /toibm/linux directory,
> > but i am not sure, because i can not read that directory.
> 
> I just checked that directory, but do not see any file with the name
> RH-Bug1731069.tar.
> Did you select the file locally on your system with 'Browse' before using
> the 'Upload (binary)' button?
Maybe the file was too large. I tried uploading a compressed version of it, and this time the
upload seemed to succeed, without the Error 403.
You may be able to find a file named RH-Bug1731069.zip or RH-Bug1731069.tar.gz in this
directory.

Comment 15 IBM Bug Proxy 2019-08-21 14:20:27 UTC
------- Comment From mbringm@us.ibm.com 2019-08-21 10:15 EDT-------
> Maybe the file is too large, I try to upload the compressed version of it,
> it seems successful at first, without the Error 403.
> Maybe you can found the file named RH-Bug1731069.zip or RH-Bug1731069.tar.gz
> in this directory.

Don't see either file.  Investigating problem.

Comment 16 IBM Bug Proxy 2019-08-21 16:31:09 UTC
------- Comment From chavez@us.ibm.com 2019-08-21 12:26 EDT-------
While we figure out what is going on with testcase, if it is just sosreports you want to provide and they exceed the attachment size limit, consider using the split command, e.g.

split -b 4M sosreport.tar.xz

and let us know the order of the pieces and we can re-assemble them with

cat part1 part2 part3 > sosreport.tar.xz
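
A minimal sketch of the same round trip with a built-in integrity check is shown here. The function name split_and_verify and the file names are placeholders, not anything used in this bug; the -d option (numeric suffixes) assumes GNU coreutils split.

```shell
# split_and_verify ARCHIVE PREFIX
# Hypothetical helper: splits ARCHIVE into 4 MiB pieces named PREFIX00,
# PREFIX01, ... and confirms that concatenating them reproduces ARCHIVE.
split_and_verify() {
    archive=$1
    prefix=$2

    # -b 4M: at most 4 MiB per piece; -d: numeric suffixes (00, 01, ...).
    split -b 4M -d "$archive" "$prefix"

    # The shell expands the glob in sorted order, which matches the
    # order in which split wrote the pieces.
    cat "$prefix"* > "$archive.rejoined"

    # cmp exits non-zero if the rejoined file differs from the original.
    cmp "$archive" "$archive.rejoined"
}
```

For example, split_and_verify sosreport.tar.xz part. would produce part.00, part.01, and so on. Numbered suffixes make the order of the pieces obvious to whoever reassembles them.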

Comment 17 Ping Zhang 2019-08-30 09:19:37 UTC
Created attachment 1609811 [details]
Part1 of RH-Bug1731069.tar.gz

Comment 18 Ping Zhang 2019-08-30 09:20:27 UTC
Created attachment 1609812 [details]
Part2 of RH-Bug1731069.tar.gz

Comment 19 Ping Zhang 2019-08-30 09:22:06 UTC
Created attachment 1609813 [details]
Part3 of RH-Bug1731069.tar.gz

Comment 20 Ping Zhang 2019-08-30 09:23:36 UTC
Created attachment 1609815 [details]
Part4 of RH-Bug1731069.tar.gz

Comment 21 IBM Bug Proxy 2019-09-09 20:00:44 UTC
------- Comment From diegodo@br.ibm.com 2019-09-09 15:50 EDT-------
Hi RedHat

I think we should CC a RHEL installer maintainer on this bug.

Per the previous messages, the /boot dir is missing in the failing cases, which suggests that this could be the result of a failure during installation, possibly due to some grub issue.

Is it possible to get the complete Anaconda installer log for the failing case? It might give us a hint about what is going wrong in this step.

Thanks

Comment 22 IBM Bug Proxy 2019-09-23 13:31:24 UTC
------- Comment From diegodo@br.ibm.com 2019-09-23 09:28 EDT-------
Hi,

What are the next steps here?

I think we should consider adding the installer maintainers here, so we can try to understand why we don't have the /boot dir after the installation.

Thanks

Comment 23 Javier Martinez Canillas 2019-10-14 13:42:49 UTC
(In reply to Javier Martinez Canillas from comment #3)

[snip]

> 
>  - Petitboot (v1.4.4-91eed07) in the successful case.
>  - Petitboot (v1.4.4-e1658ec) in the failing case.
> 
> Could you please test with the same machine and Petitboot version just to
> make sure that the problem is not in the OPAL firmware? Since the grub.cfg
> and grubenv files are parsed by Petitboot and not grub2 for ppc64le PowerNV
> (Non-Virtualized). It still could be though that the grub tools are not
> generating a correct grub config file in the failing case.

There was never an answer to this question as far as I can tell. Since the bootloader is not controlled by the OS for ppc64le OPAL, it would be good to test using the same Petitboot version to make sure that the problem is not in the bootloader.

By "the /boot directory not found", do you mean that the directory does not exist at all, or that the boot partition can't be mounted on that directory? (Do you have a boot partition, or only a root partition with a /boot directory?)
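
One way to tell these cases apart is sketched below. The function name classify_boot is hypothetical, and the sketch assumes the mountpoint utility from util-linux is available in the environment where the check is run.

```shell
# classify_boot [DIR]
# Hypothetical helper: reports whether DIR (default /boot) is missing,
# is its own mounted filesystem, or is a plain directory on the parent
# filesystem.
classify_boot() {
    dir=${1:-/boot}

    if [ ! -d "$dir" ]; then
        echo "missing"
    elif mountpoint -q "$dir"; then
        # mountpoint -q exits 0 only when DIR is a mount point.
        echo "separate partition"
    else
        echo "directory on parent filesystem"
    fi
}
```

Note that a boot partition that exists but failed to mount shows up as a plain (usually empty) directory here, so the result would need to be read together with the partition table.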

Comment 24 IBM Bug Proxy 2019-11-18 16:00:52 UTC
------- Comment From mbringm@us.ibm.com 2019-11-18 10:56 EDT-------
RedHat:
Any update on this one?

Comment 25 IBM Bug Proxy 2019-11-20 19:07:49 UTC
------- Comment From diegodo@br.ibm.com 2019-11-20 14:00 EDT-------
(In reply to comment #28)
> (In reply to Javier Martinez Canillas from comment #3)
> [snip]
> >
> >  - Petitboot (v1.4.4-91eed07) in the successful case.
> >  - Petitboot (v1.4.4-e1658ec) in the failing case.
> >
> > Could you please test with the same machine and Petitboot version just to
> > make sure that the problem is not in the OPAL firmware? Since the grub.cfg
> > and grubenv files are parsed by Petitboot and not grub2 for ppc64le PowerNV
> > (Non-Virtualized). It still could be though that the grub tools are not
> > generating a correct grub config file in the failing case.
> There was never an answer to this question as far as I can tell. Since the
> bootloader is not controlled by the OS for ppc64le OPAL, it would be good to
> test using the same Petitboot version to make sure that the problem is not
> in the bootloader.
> By the /boot directory not found, do you mean that the directory does not
> exist at all or that the boot partition can't be mounted on that directory
> (do you have a boot partition or only a root partition with a /boot
> directory)?

I'm assuming the partition can't be mounted, but it would be better to wait for the answer from Ping.

@Ping could you please confirm which scenario we have here?

Thanks!

Comment 26 Ping Zhang 2019-12-19 07:07:08 UTC
(In reply to IBM Bug Proxy from comment #25)
> ------- Comment From diegodo@br.ibm.com 2019-11-20 14:00 EDT-------
> (In reply to comment #28)
> > (In reply to Javier Martinez Canillas from comment #3)
> > [snip]
> > >
> > >  - Petitboot (v1.4.4-91eed07) in the successful case.
> > >  - Petitboot (v1.4.4-e1658ec) in the failing case.
> > >
> > > Could you please test with the same machine and Petitboot version just to
> > > make sure that the problem is not in the OPAL firmware? Since the grub.cfg
> > > and grubenv files are parsed by Petitboot and not grub2 for ppc64le PowerNV
> > > (Non-Virtualized). It still could be though that the grub tools are not
> > > generating a correct grub config file in the failing case.
> > There was never an answer to this question as far as I can tell. Since the
> > bootloader is not controlled by the OS for ppc64le OPAL, it would be good to
> > test using the same Petitboot version to make sure that the problem is not
> > in the bootloader.
> > By the /boot directory not found, do you mean that the directory does not
> > exist at all or that the boot partition can't be mounted on that directory
> > (do you have a boot partition or only a root partition with a /boot
> > directory)?
> 
> I'm assuming the partition can't be mounted, but it would be better to wait
> the answer from Ping.
> 
> @Ping could you please confirm which is the scenario we do have here?
> 
> Thanks!

I am sorry, but I did not have time to change the Petitboot version on these
systems when I hit this problem.
As for the scenario: when I encountered the problem, I only had a root partition
with a /boot directory.

Comment 27 IBM Bug Proxy 2020-01-13 19:01:54 UTC
------- Comment From diegodo@br.ibm.com 2020-01-13 13:52 EDT-------
Hi Ping, is the problem still occurring? Could you please check if it works on Petitboot (v1.4.4-91eed07)?

Thank you

