Bug 1388653

Summary: REAR utility fails in RHEL 7.2/7.3 with "ERROR: BUG BUG BUG! Could not determine size of disk sdp/sdp1, please file a bug"
Product: Red Hat Enterprise Linux 7 Reporter: Blake Powers <bpowers>
Component: rear Assignee: Pavel Cahyna <pcahyna>
Status: CLOSED ERRATA QA Contact: Tereza Cerna <tcerna>
Severity: high Docs Contact: Petr Bokoc <pbokoc>
Priority: unspecified    
Version: 7.2 CC: bpowers, bubrown, cww, gratien.dhaese, jaeshin, jmazanek, jmoon, lhh, mmezynsk, ovasik, pbokoc, pcahyna, rdutta, rjh405, rmetrich, shivgupt, snagar, tcerna
Target Milestone: rc   
Target Release: 7.2   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: rear-2.00-4.el7 Doc Type: Bug Fix
Doc Text:
*ReaR* no longer fails to determine disk size during a "mkrescue" operation

Previously, the *ReaR* (Relax-and-Recover) utility sometimes encountered a failure while querying partition sizes when saving the disk layout due to a race condition with `udev`. As a consequence, the "mkrescue" operation failed with the following message:

ERROR: BUG BUG BUG! Could not determine size of disk

Therefore it was not possible to create the rescue image. The bug has been fixed, and rescue image creation now works as expected.
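The race described above can be illustrated with a small retry loop. This is only a minimal sketch under stated assumptions, not the fix that was shipped: `read_size_with_retry` and `SYSFS_BLOCK_ROOT` are hypothetical names, and the idea is simply to wait for pending `udev` events and look again when the sysfs `size` entry of a partition is briefly missing.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not ReaR's get_disk_size: read a block device size
# (sysfs reports it in 512-byte sectors) and retry while the entry is missing.

# SYSFS_BLOCK_ROOT is an assumed override so the function can be pointed at a
# test directory; on a real system it stays at /sys/block.
SYSFS_BLOCK_ROOT="${SYSFS_BLOCK_ROOT:-/sys/block}"

read_size_with_retry() {
    local dev_path="$1"        # e.g. "sda/sda1", as printed in the error message
    local attempts="${2:-5}"   # how many times to look before giving up
    local size_file="$SYSFS_BLOCK_ROOT/$dev_path/size"
    local i
    for (( i = 1; i <= attempts; i++ )); do
        if [[ -r "$size_file" ]]; then
            cat "$size_file"
            return 0
        fi
        # Let queued udev events finish; fall back to a short sleep if
        # udevadm is unavailable or fails.
        udevadm settle --timeout=10 2>/dev/null || sleep 1
    done
    echo "could not determine size of $dev_path" >&2
    return 1
}
```

Called as `read_size_with_retry sda/sda1`, it prints the sector count or returns non-zero after the last attempt, which a caller could then turn into an error of its own.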
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-10 18:43:20 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1298243, 1393870, 1400961, 1420851, 1465925, 1472751    
Attachments:
Description         Flags
fireball.log        none
upstream PR 1418    none

Description Blake Powers 2016-10-25 20:48:18 UTC
Created attachment 1214062 [details]
fireball.log

Problem Description
-----------------------------
The rear command fails with a "Could not determine size of disk" error message on RHEL 7.2 and 7.3:

[root@fireball rear]# rear -vdD mkrescue
Relax-and-Recover 1.17.2 / Git
Using log file: /var/log/rear/rear-fireball.log
Creating disk layout
ERROR: BUG BUG BUG!  Could not determine size of disk sdp/sdp1, please file a bug. 
=== Issue report ===
Please report this unexpected issue at: https://github.com/rear/rear/issues
Also include the relevant bits from /var/log/rear/rear-fireball.log

HINT: If you can reproduce the issue, try using the -d or -D option !
====================
Aborting due to an error, check /var/log/rear/rear-fireball.log for details
You should also rm -Rf /tmp/rear.nl4OmeO6rVvIqBz
Terminated


Steps to reproduce 
-----------------------------------
1. Install the rear utility on a RHEL 7.2/7.3 system.
2. Make a backup using the rear command: '# rear -v mkbackup'
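To tell whether a given run hit this particular failure, the log can be scanned for the error text shown in the problem description. The following is a rough sketch; `failed_size_devices` is a made-up helper name, and the pattern assumes the message wording quoted in this report.

```shell
#!/usr/bin/env bash
# Sketch: list the devices for which a ReaR log reports
# "Could not determine size of disk". The function name is illustrative only.

failed_size_devices() {
    local log_file="$1"
    # Capture whatever sits between the fixed message text and
    # ", please file a bug", then drop repeated lines (the stack trace
    # section of the log repeats the message).
    sed -n 's/.*Could not determine size of disk \(.*\), please file a bug.*/\1/p' \
        "$log_file" | sort -u
}
```

For example, `failed_size_devices /var/log/rear/rear-fireball.log` would print one line per affected device, and print nothing if the run did not hit this error.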


--- Other Details ---
[root@fireball ~]# cat /etc/*release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.3 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="7.3"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.3 Beta (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.3:beta:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.3
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.3 Beta"
Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)
Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)


[root@fireball ~]# rpm -qi rear
Name        : rear
Version     : 1.17.2
Release     : 6.el7
Architecture: x86_64
Install Date: Wed 14 Sep 2016 12:57:39 PM CDT
Group       : Applications/File
Size        : 933655
License     : GPLv3
Signature   : RSA/SHA256, Wed 27 Jul 2016 11:07:36 AM CDT, Key ID 938a80caf21541eb
Source RPM  : rear-1.17.2-6.el7.src.rpm
Build Date  : Tue 19 Jul 2016 03:59:14 AM CDT
Build Host  : x86-034.build.eng.bos.redhat.com
Relocations : (not relocatable)
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
Vendor      : Red Hat, Inc.
URL         : http://relax-and-recover.org/
Summary     : Relax-and-Recover is a Linux disaster recovery and system migration tool

[root@ltcfbl6fb10 ~]# rear -v mkbackup
Relax and Recover 1.12.0 / 2011-11-22 10:21:35 +0100
Creating disk layout
ERROR: BUG BUG BUG!  Could not determine size of disk sda/sda11049kB5243kB4194kBprimaryboot,prep, please file a bug. 
	Please report this as a bug to the authors of Relax and Recover
ERROR: BUG BUG BUG!  Could not determine start of partition sda11049kB5243kB4194kBprimaryboot,prep, please file a bug. 
	Please report this as a bug to the authors of Relax and Recover
ERROR: BUG BUG BUG!  Could not determine size of disk sda/sda25243kB530MB524MBprimaryext4, please file a bug. 
	Please report this as a bug to the authors of Relax and Recover
ERROR: BUG BUG BUG!  Could not determine start of partition sda25243kB530MB524MBprimaryext4, please file a bug. 
	Please report this as a bug to the authors of Relax and Recover
ERROR: BUG BUG BUG!  Could not determine size of disk sda/sda3530MB26.8GB26.3GBprimarylvm, please file a bug. 
	Please report this as a bug to the authors of Relax and Recover
ERROR: BUG BUG BUG!  Could not determine start of partition sda3530MB26.8GB26.3GBprimarylvm, please file a bug. 
	Please report this as a bug to the authors of Relax and Recover
Aborting due to an error, check /tmp/rear-ltcfbl6fb10.log for details
Finished in 21 seconds
Terminated

Comment 3 Jakub Mazanek 2016-12-06 09:07:38 UTC
Blake,

can you please attach the rear configuration files from the system where you see the issue ?

Comment 4 Jakub Mazanek 2016-12-06 09:26:13 UTC
Please see the similar issue due to syntax error in configuration file as resolved in upstream https://github.com/rear/rear/issues/898

Comment 5 Blake Powers 2016-12-13 16:46:47 UTC
Good-morning, Jakub.


After reviewing the GitHub bug, I am not sure it matches the customer's exact situation. So far, we have not seen anything in the logs that would indicate an issue in /etc/rear/local.conf. Also, the 'BACKUP_PROG_EXCLUDE' parameter appears to be appropriately configured and set in the customer's instance.

> Were you able to find something that I might have accidentally overlooked?

Much appreciated!
R.B.P.

Comment 7 Gratien D'haese 2017-01-19 09:44:10 UTC
Please re-run the rear test with the debug option to see exactly where it fails.

Comment 8 Gratien D'haese 2017-03-01 11:38:29 UTC
I cannot do much without the proper input like:
- content of /etc/rear/{local|site}.conf file
- output of mount -v
- output of 'rear -D savelayout' (well the rear log I need)
- /var/lib/rear/layout/disklayout.conf file

thanks,
Gratien

PS: you may also open an issue at our GitHub place https://github.com/rear/rear/issues and post the files there (and reference to this BZ)
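The inputs requested in this comment could be gathered in one go with something like the script below. It is a sketch only: `collect_rear_diagnostics` and `REAR_SYSROOT` are hypothetical names that are not part of ReaR, and the log location is assumed to be /var/log/rear as in the output quoted earlier in this bug.

```shell
#!/usr/bin/env bash
# Sketch only: collect the requested files and command output into one directory.

# REAR_SYSROOT is an assumed path prefix for testing; leave it empty on a real system.
REAR_SYSROOT="${REAR_SYSROOT:-}"

collect_rear_diagnostics() {
    local out_dir="$1"
    local f
    mkdir -p "$out_dir" || return 1

    # Regenerate the layout with full debug output when rear is installed.
    if command -v rear >/dev/null 2>&1; then
        rear -D savelayout > "$out_dir/rear-savelayout.out" 2>&1 || true
    fi

    # Configuration and layout files; any of them may be absent.
    for f in /etc/rear/local.conf /etc/rear/site.conf \
             /var/lib/rear/layout/disklayout.conf; do
        if [[ -r "$REAR_SYSROOT$f" ]]; then
            cp "$REAR_SYSROOT$f" "$out_dir/"
        else
            echo "missing: $f" >> "$out_dir/MISSING.txt"
        fi
    done

    # ReaR logs, which hold the debug trace of the savelayout run.
    for f in "$REAR_SYSROOT"/var/log/rear/*.log; do
        [[ -r "$f" ]] && cp "$f" "$out_dir/"
    done

    mount -v > "$out_dir/mount-v.txt" 2>&1 || true
    return 0
}
```

Running `collect_rear_diagnostics /tmp/rear-bz1388653` as root would leave everything in a single directory that can be archived and attached; the directory name here is just an example.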

Comment 13 Jakub Mazanek 2017-03-21 14:36:54 UTC
Jmoon,

could you please provide the customer's info requested by Gratien in #c8 ?

Thanks

Comment 27 Gratien D'haese 2017-09-19 07:17:54 UTC
Perhaps the fix https://github.com/rear/rear/commit/daaf5f05dbd1c0319441a198e3bad4982dedbeb9 could be of any help?

Comment 28 Bob Hall 2017-10-03 21:56:27 UTC
I'm experiencing what appears to be a similar error:

----
# yum list rear
Loaded plugins: langpacks
Installed Packages
rear.x86_64                                          1.17.2-9.el7_3
----
# cat /etc/rear/local.conf
# Default is to create Relax-and-Recover rescue media as ISO image
# set OUTPUT to change that
# set BACKUP to activate an automated (backup and) restore of your data
# Possible configuration values can be found in /usr/share/rear/conf/default.conf
#
# This file (local.conf) is intended for manual configuration. For configuration
# through packages and other automated means we recommend creating a new
# file named site.conf next to this file and to leave the local.conf as it is. 
# Our packages will never ship with a site.conf.

BACKUP=NSR
OUTPUT=ISO
OUTPUT_URL=file:///mnt/rescue_system

# vg_linux  => /dev/vg_linux/lv_root => /, swap
# vmaxssdvg => /dev/vmaxssdvg/vmaxlv => /vmax
# xiossdvg  => /dev/xiossdvg/xioslv  => /xios
ONLY_INCLUDE_VG=( 'vg_linux' )
EXCLUDE_MOUNTPOINTS=( '/vmax' '/xios' )
EXCLUDE_RECREATE=( '/vmax' '/xios' )
EXCLUDE_RESTORE=( '/vmax' '/xios' )
----
Running: rear -d -v mkrescue
----
The following was found in the /var/log/rear/rear*.log file:

...
2017-10-03 14:36:46.286011260 Creating disk layout
2017-10-03 14:36:46.287573064 Preparing layout directory.
2017-10-03 14:36:46.291445914 Removing old layout file.
2017-10-03 14:36:46.293187510 Including layout/save/GNU/Linux/15_save_diskbyid_mappings.sh
2017-10-03 14:36:46.565368291 Saved diskbyid_mappings
2017-10-03 14:36:46.566807424 Including layout/save/GNU/Linux/20_partition_layout.sh
2017-10-03 14:36:46.574814259 Saving disk partitions.
2017-10-03 14:36:46.651178811 ERROR: BUG BUG BUG!  Could not determine size of disk sda/sda1, please file a bug.
=== Issue report ===
Please report this unexpected issue at: https://github.com/rear/rear/issues
Also include the relevant bits from /var/log/rear/rear-blv-labsdx-03.log

HINT: If you can reproduce the issue, try using the -d or -D option !
====================
=== Stack trace ===
Trace 0: /sbin/rear:252 main
Trace 1: /usr/share/rear/lib/mkrescue-workflow.sh:27 WORKFLOW_mkrescue
Trace 2: /usr/share/rear/lib/framework-functions.sh:70 SourceStage
Trace 3: /usr/share/rear/lib/framework-functions.sh:31 Source
Trace 4: /usr/share/rear/layout/save/GNU/Linux/20_partition_layout.sh:261 source
Trace 5: /usr/share/rear/layout/save/GNU/Linux/20_partition_layout.sh:64 extract_partitions
Trace 6: /usr/share/rear/lib/layout-functions.sh:506 get_disk_size
Trace 7: /usr/share/rear/lib/_input-output-functions.sh:156 BugIfError
Trace 8: /usr/share/rear/lib/_input-output-functions.sh:144 BugError
Message: BUG BUG BUG!  Could not determine size of disk sda/sda1, please file a bug.
=== Issue report ===
Please report this unexpected issue at: https://github.com/rear/rear/issues
Also include the relevant bits from /var/log/rear/rear-blv-labsdx-03.log

HINT: If you can reproduce the issue, try using the -d or -D option !
====================
===================
2017-10-03 14:36:46.801496379 No partitions found on /dev/sdaa.
2017-10-03 14:36:46.854786881 No partitions found on /dev/sdab.
2017-10-03 14:36:47.059742458 No partitions found on /dev/sdad.
...

----
# cat /var/lib/rear/layout/disklayout.conf
disk /dev/sda 328944844800 gpt
part /dev/sda 262144000 1048576 EFI0x20System0x20Partition boot /dev/sda1
part /dev/sda 524288000 263192576 rear-noname none /dev/sda2
part /dev/sda 4294967296 787480576 rear-noname lvm /dev/sda3
part /dev/sda 323862134784 5082447872 rear-noname lvm /dev/sda4
disk /dev/sdaa 2949120 gpt
disk /dev/sdab 2949120 gpt
disk /dev/sdac 328944844800 gpt
part /dev/sdac 262144000 1048576 EFI0x20System0x20Partition boot /dev/sdac1
part /dev/sdac 524288000 263192576 rear-noname none /dev/sdac2
part /dev/sdac 4294967296 787480576 rear-noname lvm /dev/sdac3
part /dev/sdac 323862134784 5082447872 rear-noname lvm /dev/sdac4
disk /dev/sdad 328944844800 gpt
disk /dev/sdae 438593126400 gpt
disk /dev/sdaf 2949120 gpt
disk /dev/sdag 2949120 gpt
disk /dev/sdah 2949120 gpt
disk /dev/sdai 2949120 gpt
disk /dev/sdaj 2949120 gpt
disk /dev/sdak 2949120 gpt
disk /dev/sdal 328944844800 gpt
disk /dev/sdam 328944844800 gpt
disk /dev/sdan 438593126400 gpt
disk /dev/sdao 5898240 unknown
disk /dev/sdap 68720394240 gpt
disk /dev/sdb 328944844800 gpt
disk /dev/sdc 438593126400 gpt
disk /dev/sdd 328944844800 gpt
disk /dev/sde 328944844800 gpt
disk /dev/sdf 438593126400 gpt
disk /dev/sdg 328944844800 gpt
part /dev/sdg 262144000 1048576 EFI0x20System0x20Partition boot /dev/sdg1
part /dev/sdg 524288000 263192576 rear-noname none /dev/sdg2
part /dev/sdg 4294967296 787480576 rear-noname lvm /dev/sdg3
part /dev/sdg 323862134784 5082447872 rear-noname lvm /dev/sdg4
disk /dev/sdh 328944844800 gpt
disk /dev/sdi 438593126400 gpt
disk /dev/sdj 328944844800 gpt
part /dev/sdj 262144000 1048576 EFI0x20System0x20Partition boot /dev/sdj1
part /dev/sdj 524288000 263192576 rear-noname none /dev/sdj2
part /dev/sdj 4294967296 787480576 rear-noname lvm /dev/sdj3
part /dev/sdj 323862134784 5082447872 rear-noname lvm /dev/sdj4
disk /dev/sdk 328944844800 gpt
disk /dev/sdl 438593126400 gpt
part /dev/sdl 438591029248 1048576 primary none /dev/sdl1
disk /dev/sdm 68719476736 gpt
disk /dev/sdn 328944844800 gpt
part /dev/sdn 262144000 1048576 EFI0x20System0x20Partition boot /dev/sdn1
part /dev/sdn 524288000 263192576 rear-noname none /dev/sdn2
part /dev/sdn 4294967296 787480576 rear-noname lvm /dev/sdn3
part /dev/sdn 323862134784 5082447872 rear-noname lvm /dev/sdn4
disk /dev/sdo 328944844800 gpt
disk /dev/sdp 438593126400 gpt
part /dev/sdp 438591029248 1048576 primary none /dev/sdp1
disk /dev/sdq 328944844800 gpt
part /dev/sdq 262144000 1048576 EFI0x20System0x20Partition boot /dev/sdq1
part /dev/sdq 524288000 263192576 rear-noname none /dev/sdq2
part /dev/sdq 4294967296 787480576 rear-noname lvm /dev/sdq3
part /dev/sdq 323862134784 5082447872 rear-noname lvm /dev/sdq4
disk /dev/sdr 328944844800 gpt
disk /dev/sds 438593126400 gpt
part /dev/sds 438591029248 1048576 primary none /dev/sds1
disk /dev/sdt 5898240 unknown
disk /dev/sdu 68720394240 gpt
disk /dev/sdv 68719476736 gpt
part /dev/sdv 68717379584 1048576 primary none /dev/sdv1
disk /dev/sdw 2949120 gpt
disk /dev/sdx 2949120 gpt
disk /dev/sdy 2949120 gpt
disk /dev/sdz 2949120 gpt
----

It does look a lot like the issue patched in the GitHub link posted by Gratien D'haese on 2017-09-19 03:17:54 EDT.
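With this many devices, a per-disk summary of the disklayout.conf above is easier to scan than the raw file. The sketch below assumes the field order visible in the listing (`disk <device> <size> <label>` and `part <disk> <size> <start> <name> <flags> <partition>`); `summarize_disklayout` is an illustrative name, not a ReaR function.

```shell
#!/usr/bin/env bash
# Sketch: summarize a disklayout.conf per disk. Prints the disk size, the
# number of "part" entries, and how many partitions extend past the disk end.
# Assumes each "disk" line precedes its "part" lines, as in the listing above.

summarize_disklayout() {
    awk '
        $1 == "disk" { size[$2] = $3; order[++n] = $2 }
        $1 == "part" {
            parts[$2]++
            # start ($4) plus size ($3) should not exceed the disk size
            if ($4 + $3 > size[$2]) overflow[$2]++
        }
        END {
            for (i = 1; i <= n; i++) {
                d = order[i]
                printf "%s size=%s parts=%d overflow=%d\n", d, size[d], parts[d], overflow[d]
            }
        }
    ' "$1"
}
```

Running `summarize_disklayout /var/lib/rear/layout/disklayout.conf` prints one line per disk in file order, so disks whose partitions were skipped show up with `parts=0`.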

Comment 29 Pavel Cahyna 2017-10-12 15:34:56 UTC
(In reply to Bob Hall from comment #28)
> I'm experiencing what appears to be a similar error:
> 
> ----
> # yum list rear
> Loaded plugins: langpacks
> Installed Packages
> rear.x86_64                                          1.17.2-9.el7_3

Can you please try to reproduce with rear-2.00 to see whether the issue is fixed now?

Comment 32 Bob Hall 2017-10-30 22:27:39 UTC
Sorry but I'll have to wait until the vendor updates their release before I can test (due to our security policy).

Comment 34 Gratien D'haese 2018-01-19 13:41:25 UTC
Was the fix from RH back-ported into ReaR upstream sources? I couldn't tell by just looking at this BZ

Comment 35 Pavel Cahyna 2018-01-19 13:54:00 UTC
The RH fix is a backport of upstream commit 3a2eefdd6ec0826946d49e9de14416eea2649f96.

Comment 36 Pavel Cahyna 2018-02-08 19:18:35 UTC
Created attachment 1393386 [details]
upstream PR 1418

Comment 37 Pavel Cahyna 2018-02-08 19:19:43 UTC
(In reply to Pavel Cahyna from comment #35)
> The RH fix is backported upstream commit
> 3a2eefdd6ec0826946d49e9de14416eea2649f96.

On closer examination this does not look entirely correct, as there were subsequent commits upstream (PR#1418). https://github.com/rear/rear/pull/1418
To make our fix closer to the upstream one, I propose to apply the attached patch instead, which contains all the commits in PR#1418.

Comment 38 Pavel Cahyna 2018-02-15 10:52:42 UTC
(In reply to Bob Hall from comment #32)
> Sorry but I'll have to wait until the vendor updates their release before I
> can test (due to our security policy).

We would like to reproduce your issue before releasing an update in order to test it. Do you please have some hint on reproducing it? I see from your disklayout.conf that you have quite a lot of disks - is there something special about them? E.g. being attached via iSCSI or some other "less typical" way?

Comment 39 Bob Hall 2018-02-19 16:30:57 UTC
(In reply to Pavel Cahyna from comment #38)
> (In reply to Bob Hall from comment #32)
> > Sorry but I'll have to wait until the vendor updates their release before I
> > can test (due to our security policy).
> 
> We would like to reproduce your issue before releasing an update in order to
> test it. Do you please have some hint on reproducing it? I see from your
> disklayout.conf that you have quite a lot of disks - is there something
> special about them? E.g. being attached via iSCSI or some other "less
> typical" way?

Hello Pavel,

I currently have a ticket open with RedHat to try and troubleshoot the issue. Hopefully that will lead to a bug fix. Thank you.

Comment 40 Pavel Cahyna 2018-02-19 16:37:09 UTC
(In reply to Bob Hall from comment #39)
> (In reply to Pavel Cahyna from comment #38)
> > (In reply to Bob Hall from comment #32)
> > > Sorry but I'll have to wait until the vendor updates their release before I
> > > can test (due to our security policy).
> > 
> > We would like to reproduce your issue before releasing an update in order to
> > test it. Do you please have some hint on reproducing it? I see from your
> > disklayout.conf that you have quite a lot of disks - is there something
> > special about them? E.g. being attached via iSCSI or some other "less
> > typical" way?
> 
> Hello Pavel,
> 
> I currently have a ticket open with RedHat to try and troubleshoot the
> issue. Hopefully that will lead to a bug fix. Thank you.

Hello Bob,

I am the RedHat engineer assigned to this issue. We are working on a fix and it would help us if you could give us some hints on reproducing it because it was hard to reproduce for us, or alternatively if you could test a build with the attempted fix that I would gladly provide.

Comment 41 Bob Hall 2018-02-20 15:47:44 UTC
Hello Pavel,

For this issue, the RedHat engineer has had me testing with a simplified local.conf file that looks like this:

OUTPUT=ISO
AUTOEXCLUDE_MULTIPATH=n
ONLY_INCLUDE_VG=( 'vg_linux' )

The problem I'm having now is that our SD-X server won't boot off the resulting ISO file. While "Loading initial ramdisk" the server console displays the message: "error: failure reading sector 0x15ba0 from 'cd0'", where the 'cd0' is actually referring to the ISO being passed via a URL.

The engineer now suspects it may be a problem with our UEFI on the SD-X, and hence there isn't actually an issue with ReaR. For the moment we're taking that tack with the testing. Thank you.

Comment 42 Pavel Cahyna 2018-02-20 15:53:18 UTC
(In reply to Bob Hall from comment #41)

> The problem I'm having now is that our SD-X server won't boot off the
> resulting ISO file.

Wait, if you have an ISO file, it means that ReaR did not encounter the bug anymore while creating it? Because the original problem was that ReaR would not actually be able to create an ISO file.

Comment 43 Pavel Cahyna 2018-02-20 17:23:34 UTC
(In reply to Bob Hall from comment #41)
> Hello Pavel,
> 
> For this issue, the RedHat engineer has had me testing with a simplified
> local.conf file

Can you please tell that RedHat engineer to contact me? Thanks.

Comment 44 Bob Hall 2018-02-20 20:54:24 UTC
Thinking back on that a bit, I believe one of the issues was caused by me not having NSR_POOL_NAME set to the required value for our backup infrastructure. (It was also unclear to me whether I should be setting NSR_POOL_NAME or NSR_DEFAULT_POOL_NAME.) I was fiddling around with various parameters and at some point was able to get an ISO built and boot up to the rescue shell, but then the recover failed.

I've sent a message to the RedHat Engineer asking him to contact you.

Comment 45 Renaud Métrich 2018-02-21 08:52:23 UTC
@Pavel

I'm the RH eng dealing with Bob's issue.
I'm currently suspecting a bug in genisoimage causing the ISO to not boot.
But for sure, I'm convinced this issue is different from the bug description.

Comment 46 Pavel Cahyna 2018-02-21 09:33:58 UTC
Thank you, Renaud, for your reply. I was interested in the problem described in https://bugzilla.redhat.com/show_bug.cgi?id=1388653#c28. It is certainly a ReaR bug, a real one (also reported upstream), but somewhat difficult to reproduce. Do you remember what was the environment that triggered it? I see there are lots of disk devices which seems suspicious - is there some sort of SAN being involved?

Comment 47 Tereza Cerna 2018-02-23 09:53:02 UTC
I was not able to reproduce this bug. So I tested:

  * fixed in upstream
  * patches were applied in rear-2.00-6.el7
  * our test suite works as expected

-> VERIFIED, SanityOnly

Comment 50 errata-xmlrpc 2018-04-10 18:43:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1000