Bug 207372 - partprobe in RHEL 4 hardlocks the system
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: parted
Version: 4.0
Hardware: i386 Linux
Priority: medium
Severity: medium
Assigned To: David Cantrell
QA Contact: Brock Organ
Reported: 2006-09-20 16:51 EDT by Tom "spot" Callaway
Modified: 2007-11-30 17:07 EST
Fixed In Version: RHBA-2007-0275
Doc Type: Bug Fix
Last Closed: 2007-05-01 13:35:33 EDT


Attachments
Strace from a partprobe run where it locks up (104.55 KB, text/plain)
2006-09-20 16:52 EDT, Tom "spot" Callaway
The state of /dev before they "write-enable" some of their devices. (2.31 KB, text/plain)
2006-09-20 17:01 EDT, Tom "spot" Callaway
The state of /dev after. (2.41 KB, text/plain)
2006-09-20 17:03 EDT, Tom "spot" Callaway
/var/log/messages from their system (211.72 KB, text/plain)
2006-09-20 17:04 EDT, Tom "spot" Callaway
Output from parted print for all attached storage devices (9.10 KB, text/plain)
2006-09-21 10:44 EDT, Tom "spot" Callaway
Sun disklabel reading code that won't lock partprobe(8) (6.36 KB, patch)
2007-01-31 16:00 EST, David Cantrell

Description Tom "spot" Callaway 2006-09-20 16:51:40 EDT
One of my customers is trying to use partprobe to rediscover "lost" LUNs. It
works correctly when they specify a disk device as an argument to partprobe, but
when they run partprobe with no arguments, the box hardlocks and has to be rebooted.
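
Since partprobe is reported to work when given a single device, one cautious way to narrow this down might be to probe devices one at a time, so that the last command printed identifies the device involved if the box locks up. The sketch below only prints the commands (a dry run). The function names are illustrative, and the `sd*` naming pattern is an assumption; other device naming schemes would need their own pattern.

```shell
#!/bin/sh
# List whole-disk names from /proc/partitions-style input on stdin.
# Assumption: whole disks are sd* names with no trailing partition number.
whole_disks() {
    awk '$4 ~ /^sd[a-z]+$/ { print $4 }'
}

# Dry run: print one partprobe command per disk instead of executing it.
probe_commands() {
    whole_disks | while read -r name; do
        echo "partprobe /dev/$name"
    done
}

# Usage on a real system:
#   probe_commands < /proc/partitions
```

Running the printed commands one by one, rather than in a loop, leaves a clear record of which device was being probed last.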
Comment 1 Tom "spot" Callaway 2006-09-20 16:52:55 EDT
Created attachment 136778 [details]
Strace from a partprobe run where it locks up

I asked the customer to run strace and send me the output from before it
hardlocks. This is what they sent me.
Comment 2 Tom "spot" Callaway 2006-09-20 16:56:41 EDT
The reason that they're using partprobe is this:
When they make some of their devices "write-disabled" ("established" in EMC
terminology) and then reboot, they can't see some of their device nodes. When they
"write-enable" those devices ("split" in EMC terminology), they want to be able
to see those devices again.

They were running fdisk on the devices to recover them without a reboot. I
suggested that they use partprobe instead.
Comment 3 Tom "spot" Callaway 2006-09-20 16:59:42 EDT
The only output from running partprobe before it locks up is this:

rr131149@rpc9900# partprobe
Warning: The disk CHS geometry (1024,1,45) does not match the geometry stored on
the disk label (48,15,64).
Warning: The disk CHS geometry (960,1,6) does not match the geometry stored on
the disk label (6,15,64).
Warning: The disk CHS geometry (1024,1,45) does not match the geometry stored on
the disk label (48,15,64).
Warning: The disk CHS geometry (960,1,6) does not match the geometry stored on
the disk label (6,15,64).
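
For reference, the two geometries in each warning multiply out to the same number of sectors, which suggests the kernel and the disk label agree on capacity and differ only in how it is divided into cylinders, heads, and sectors. A quick check with shell arithmetic (the function name `chs_sectors` is just illustrative):

```shell
#!/bin/sh
# Total sectors implied by a cylinders/heads/sectors triple.
chs_sectors() {
    echo $(( $1 * $2 * $3 ))
}

# Geometry pairs copied from the warnings above: reported CHS vs. disk label.
chs_sectors 1024 1 45   # 46080
chs_sectors 48 15 64    # 46080
chs_sectors 960 1 6     # 5760
chs_sectors 6 15 64     # 5760
```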
Comment 4 Tom "spot" Callaway 2006-09-20 17:01:13 EDT
Created attachment 136779 [details]
The state of /dev before they "write-enable" some of their devices.
Comment 5 Tom "spot" Callaway 2006-09-20 17:03:18 EDT
Created attachment 136780 [details]
The state of /dev after.
Comment 6 Tom "spot" Callaway 2006-09-20 17:04:14 EDT
Created attachment 136781 [details]
/var/log/messages from their system
Comment 7 David Cantrell 2006-09-20 17:05:49 EDT
I'm unfamiliar with EMC terminology, but the word 'split' to me indicates that
some sort of LUN resizing is happening.

But before that, I need to know the disk label type that these volumes are
using.  You can run:

parted /dev/whatever print

And look at the 'Partition Table:' line.  Actually, the whole output of that
print command would be handy.
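
When many devices are attached, the label line can be pulled out of each print output with a small filter. This is a minimal sketch: the function name `label_type` is illustrative, and it assumes the line is labelled either 'Partition Table:' as quoted above or 'Disk label type:' as in older parted releases.

```shell
#!/bin/sh
# Print the disk label type found in `parted <device> print` output on stdin.
# Assumption: the line starts with "Partition Table:" or "Disk label type:".
label_type() {
    sed -n -e 's/^Partition Table:[[:space:]]*//p' \
           -e 's/^Disk label type:[[:space:]]*//p'
}

# Example with canned input; on a real system, pipe in
#   parted -s /dev/whatever print
printf 'Partition Table: sun\n' | label_type
```

The `-s` (script) flag keeps parted from prompting interactively, which matters when looping over devices.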
Comment 8 Tom "spot" Callaway 2006-09-21 10:44:37 EDT
Created attachment 136859 [details]
Output from parted print for all attached storage devices
Comment 9 David Cantrell 2006-10-17 14:00:46 EDT
I've built parted-1.8.0rc2 RPMs for RHEL-4.  Can you try partprobe from 1.8.0rc2
and see where that gets you?

http://people.redhat.com/dcantrel/rhel/parted/
Comment 10 David Cantrell 2006-10-31 17:36:03 EST
Heard from customer that parted-1.8.0rc2 resolves the partprobe crash problem. 
Adding this to the RHEL 4.5 proposed list.
Comment 11 RHEL Product and Program Management 2006-10-31 17:44:48 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 12 David Cantrell 2007-01-31 15:58:12 EST
Based on the log files provided, libparted was crashing on the Sun disklabels in
the EMC array.  The customer verified that parted-1.8.0rc2 works fine, so I backported
the Sun disklabel code from that release to the RHEL-4 tree.
Comment 13 David Cantrell 2007-01-31 16:00:06 EST
Created attachment 147049 [details]
Sun disklabel reading code that won't lock partprobe(8)

Backport of Sun disklabel code from parted-1.8.0rc2 that fixes the problem. 
Applied in parted-1.6.19-15.EL in RHEL-4.
Comment 16 Jakub Hrozek 2007-03-27 09:55:32 EDT
This fix is available in the latest RHEL-4.5 Beta.  If the customer is in
a position to test the beta, please ask for testing feedback. Thanks.
Comment 18 Red Hat Bugzilla 2007-05-01 13:35:33 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0275.html
