Bug 207372

Summary: partprobe in RHEL 4 hardlocks the system
Product: Red Hat Enterprise Linux 4
Reporter: Tom "spot" Callaway <tcallawa>
Component: parted
Assignee: David Cantrell <dcantrell>
Status: CLOSED ERRATA
QA Contact: Brock Organ <borgan>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.0
CC: dkovalsk, jhrozek
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2007-0275
Doc Type: Bug Fix
Last Closed: 2007-05-01 17:35:33 UTC
Attachments:
Description                                                           Flags
Strace from a partprobe run where it locks up                         none
The state of /dev before they "write-enable" some of their devices.   none
The state of /dev after.                                              none
/var/log/messages from their system                                   none
Output from parted print for all attached storage devices             none
Sun disklabel reading code that won't lock partprobe(8)               none

Description Tom "spot" Callaway 2006-09-20 20:51:40 UTC
One of my customers is trying to use partprobe to rediscover "lost" LUNs. It
works correctly when they specify a disk device as an option to partprobe, but
when they run partprobe with no options, the box hardlocks and has to be rebooted.
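
Since the per-device invocation is the one reported to work, a possible interim workaround is to probe each disk explicitly rather than letting partprobe scan everything. The following is only a sketch under stated assumptions: the `probe_each` helper, the `PARTPROBE` override variable, and the device names in the example are hypothetical and not part of this report. It prints each device name before probing it, which may also help narrow down which device triggers the lockup.

```shell
#!/bin/sh
# Sketch: probe each disk explicitly instead of running partprobe with no
# arguments. PARTPROBE and probe_each are illustrative names (assumptions).
PARTPROBE=${PARTPROBE:-partprobe}

probe_each() {
    # Probe every device passed as an argument, one at a time.
    rc=0
    for dev in "$@"; do
        # Skip anything that is not a block device node.
        if [ ! -b "$dev" ]; then
            echo "skipping $dev: not a block device" >&2
            continue
        fi
        # Announce the device first so the last line printed identifies
        # the device being probed if the system stops responding.
        echo "probing $dev"
        "$PARTPROBE" "$dev" || { echo "partprobe failed on $dev" >&2; rc=1; }
    done
    return $rc
}

# Example (hypothetical device names):
#   probe_each /dev/sda /dev/sdb
```

The function returns non-zero if any individual probe fails, so it can be used in a script that needs to stop on the first problematic device.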

Comment 1 Tom "spot" Callaway 2006-09-20 20:52:55 UTC
Created attachment 136778
Strace from a partprobe run where it locks up

I asked the customer to run strace and send me the output from before it
hardlocks. This is what they sent me.

Comment 2 Tom "spot" Callaway 2006-09-20 20:56:41 UTC
The reason that they're using partprobe is this:
When they make some of their devices "write-disabled" ("established" in EMC
terminology) and then reboot, they can't see some of their device nodes. When
they "write-enable" those devices ("split" in EMC terminology), they want to be
able to see those devices again.

They were running fdisk on the devices to recover them without a reboot. I
suggested that they use partprobe instead.


Comment 3 Tom "spot" Callaway 2006-09-20 20:59:42 UTC
The only output from running partprobe before it locks up is this:

rr131149@rpc9900# partprobe
Warning: The disk CHS geometry (1024,1,45) does not match the geometry stored on
the disk label (48,15,64).
Warning: The disk CHS geometry (960,1,6) does not match the geometry stored on
the disk label (6,15,64).
Warning: The disk CHS geometry (1024,1,45) does not match the geometry stored on
the disk label (48,15,64).
Warning: The disk CHS geometry (960,1,6) does not match the geometry stored on
the disk label (6,15,64).

Comment 4 Tom "spot" Callaway 2006-09-20 21:01:13 UTC
Created attachment 136779
The state of /dev before they "write-enable" some of their devices.

Comment 5 Tom "spot" Callaway 2006-09-20 21:03:18 UTC
Created attachment 136780
The state of /dev after.

Comment 6 Tom "spot" Callaway 2006-09-20 21:04:14 UTC
Created attachment 136781
/var/log/messages from their system

Comment 7 David Cantrell 2006-09-20 21:05:49 UTC
I'm unfamiliar with EMC terminology, but the word 'split' to me indicates that
some sort of LUN resizing is happening.

But before that, I need to know the disk label type that these volumes are
using.  You can run:

parted /dev/whatever print

And look at the 'Partition Table:' line.  Actually, the whole output of that
print command would be handy.
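
To pull just the disk label type out of that output, a small filter along these lines might help. It is a sketch, not something taken from parted itself: the `label_type` helper and the device name in the example are hypothetical, and the pattern assumes the line is worded 'Partition Table:' as above (other parted versions may word it differently).

```shell
#!/bin/sh
# Sketch: extract the disk label type from `parted DEVICE print` output.
# label_type is an illustrative helper name (an assumption).
label_type() {
    # Read parted's print output on stdin and emit only the value that
    # follows 'Partition Table:'; print nothing if the line is absent.
    sed -n 's/^Partition Table:[[:space:]]*//p'
}

# Example (hypothetical device name):
#   parted /dev/sdb print | label_type
```

The full print output is still the more useful thing to attach to the bug; the filter only makes it easier to scan many devices for a particular label type.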

Comment 8 Tom "spot" Callaway 2006-09-21 14:44:37 UTC
Created attachment 136859
Output from parted print for all attached storage devices

Comment 9 David Cantrell 2006-10-17 18:00:46 UTC
I've built parted-1.8.0rc2 RPMs for RHEL-4.  Can you try partprobe from 1.8.0rc2
and see where that gets you?

http://people.redhat.com/dcantrel/rhel/parted/

Comment 10 David Cantrell 2006-10-31 22:36:03 UTC
Heard from customer that parted-1.8.0rc2 resolves the partprobe crash problem. 
Adding this to the RHEL 4.5 proposed list.

Comment 11 RHEL Program Management 2006-10-31 22:44:48 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 12 David Cantrell 2007-01-31 20:58:12 UTC
Based on the log files provided, libparted was crashing on the Sun disklabels in
the EMC array.  The customer verified that parted-1.8.0rc2 works fine, so I
backported the Sun disklabel code from that release to the RHEL-4 tree.

Comment 13 David Cantrell 2007-01-31 21:00:06 UTC
Created attachment 147049
Sun disklabel reading code that won't lock partprobe(8)

Backport of Sun disklabel code from parted-1.8.0rc2 that fixes the problem. 
Applied in parted-1.6.19-15.EL in RHEL-4.

Comment 16 Jakub Hrozek 2007-03-27 13:55:32 UTC
This fix is available in the latest RHEL-4.5 Beta.  If the customer is in
a position to test the beta, please ask for testing feedback. Thanks.

Comment 18 Red Hat Bugzilla 2007-05-01 17:35:33 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0275.html