Bug 246423

Summary: parted confused by spurious garbage reports "loop" table and wrong data
Product: [Fedora] Fedora
Reporter: Michal Jaegermann <michal>
Component: parted
Assignee: Joel Andres Granados <jgranado>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 7
CC: triage
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2008-05-29 17:51:52 UTC
Attachments:
  two initial disk sectors which throw parted into a loop (flags: none)

Description Michal Jaegermann 2007-07-02 06:20:13 UTC
Description of problem:

On a machine which runs Linux "for ages", and happens to
have eleven partitions on a disk, parted reported the following:

Model: ATA WDC WD1800JB-00D (scsi)
Disk /dev/sda: 180GB
Sector size (logical/physical): 512B/512B
Partition Table: loop

Number  Start  End    Size   File system  Flags
 1      0.00B  180GB  180GB  fat16

while fdisk, sfdisk and the Linux kernel did not have any doubts
about the real layout of this device.  For 'sfdisk' it happens
to look like this:

# partition table of /dev/sda
unit: sectors

/dev/sda1 : start=       63, size= 81915372, Id= 7, bootable
/dev/sda2 : start= 81915435, size=213825150, Id= 7
/dev/sda3 : start=295740585, size= 55906200, Id= 5
/dev/sda4 : start=        0, size=        0, Id= 0
/dev/sda5 : start=295740648, size=  1012032, Id=1b
/dev/sda6 : start=296768745, size=   208845, Id=83
/dev/sda7 : start=296977653, size= 14683347, Id=83
/dev/sda8 : start=311661063, size=  4096512, Id=83
/dev/sda9 : start=315757638, size=  4096512, Id=82
/dev/sda10: start=319854213, size=   819252, Id=83
/dev/sda11: start=320673528, size= 30973257, Id=83

That caught me in anaconda, which is now using parted,
during an attempted system upgrade.  Obviously anaconda
got a totally distorted picture of the situation and was unable
to find any Linux partitions, not to mention anything to
upgrade.

It turned out that parted got totally spooked by some
leftovers in sector 0.  After I zeroed the first 120 bytes, or so,
of that sector and installed grub on it once again,
parted suddenly found the required data and got the partitioning right.

Data from the original first two sectors of that disk are
attached.
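
For readers following along, here is a hedged sketch of why zeroing roughly the first 120 bytes can change what a tool detects without touching the partitioning itself. In the classic MS-DOS (MBR) layout the four primary entries start at byte offset 446 of sector 0 and the 0x55AA signature sits at offset 510, so the zeroed range overlaps only the boot-code area. The function names and the synthetic sector are illustrative assumptions; this is not parted's code and it was not run against the attached sectors.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446      # start of the four 16-byte primary entries
SIGNATURE_OFFSET = 510  # 0x55 0xAA marks a DOS-style boot sector
ENTRY_FORMAT = "<B3sB3sII"  # status, CHS start, type, CHS end, LBA start, sector count


def parse_mbr(sector):
    """Return (type_id, start_lba, size, bootable) for each primary slot."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need a full 512-byte sector")
    if sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature")
    entries = []
    for slot in range(4):
        offset = TABLE_OFFSET + 16 * slot
        status, _chs_start, type_id, _chs_end, start, size = struct.unpack_from(
            ENTRY_FORMAT, sector, offset)
        entries.append((type_id, start, size, status == 0x80))
    return entries


def zero_prefix(sector, count=120):
    """Return a copy of the sector with its first `count` bytes cleared."""
    return bytes(count) + sector[count:]


def build_sector(primaries):
    """Build a synthetic sector 0: filler in the boot-code area plus a table."""
    sector = bytearray(b"\xeb" * TABLE_OFFSET + bytes(64) + b"\x55\xaa")
    for slot, (type_id, start, size, bootable) in enumerate(primaries):
        struct.pack_into(ENTRY_FORMAT, sector, TABLE_OFFSET + 16 * slot,
                         0x80 if bootable else 0, bytes(3), type_id, bytes(3),
                         start, size)
    return bytes(sector)


# Primary entries as listed by sfdisk in this report.
PRIMARIES = [(0x07, 63, 81915372, True),
             (0x07, 81915435, 213825150, False),
             (0x05, 295740585, 55906200, False),
             (0x00, 0, 0, False)]

original = build_sector(PRIMARIES)
cleaned = zero_prefix(original)
# The table survives the wipe because bytes 0..119 lie entirely in boot code.
assert parse_mbr(cleaned) == parse_mbr(original) == PRIMARIES
```

The logical partitions (sda5 to sda11) are not stored in sector 0 at all; they are chained from the extended partition, which is why a two-sector dump can only ever show the four primary slots.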

Version-Release number of selected component (if applicable):
Any version of parted I had an opportunity to try.  That
includes some older ones and also whatever is used by F7
anaconda.

How reproducible:
always with the "right" boot sector

Comment 1 Michal Jaegermann 2007-07-02 06:20:13 UTC
Created attachment 158322 [details]
two initial disk sectors which throw parted into a loop

Comment 2 David Cantrell 2008-02-08 00:57:08 UTC
Obviously the label detection routines failed on this disk.  I've tried to
recreate it here using rawhide, but can't.  I think that because I'm using rawhide
and redoing the disk to get a layout similar to what you have, it still
doesn't contain all of the MBR you had.

Do you still have the MBR that I could examine and see if this problem is still
in parted?  Also, if the disk is still partitioned the same way, can you try
booting the rawhide installer and seeing if parted can read the disk?

We've been using parted in anaconda since before Fedora was even around.  So,
for ages.  I'm surprised we haven't seen this problem before.

Comment 3 Michal Jaegermann 2008-02-08 02:51:04 UTC
> Obviously the label detection routines failed on this disk.

This happened way before that.  File systems and labels on
those are not something parted cares about when trying to
get partitioning info.

> Do you still have the MBR that I could examine

Well, that attachment from comment #1 gives the first two original
sectors which made parted go haywire.  As I wrote, zeroing out
the first 120 bytes of the first sector made the problem vanish,
so everything that made parted unhappy should be there.

> Also, if the disk is still partitioned the same way

I am afraid that it is not; over all these months it was actually
changed a few times over (even if this is one of those disks which
I still have around; I do not remember after such a long time).
In any case I modified that boot sector trying to get around the
issue, but not before I saved the data in the attachment.

> We've been using parted in anaconda since before Fedora was even
> around.  So, for ages.  I'm surprised we haven't seen this problem
> before.

Oh, I have seen many disks so this surprised me as well.  OTOH
I assure you that the issue was real.  parted supports many types
of partitions and possibly some leftovers, without any meaning
for fdisk or sfdisk, confused it.  I did not study all possible
formats to be able to tell for sure.

It is clear that you will not see that on a "virgin" disk.

Comment 4 David Cantrell 2008-02-09 03:10:11 UTC
(In reply to comment #3)
> > Obviously the label detection routines failed on this disk.
> 
> This happened way before that.  File systems and labels on
> those are not something parted cares about when trying to
> get partitioning info.

Our terminology is probably not syncing up.  I'm talking about disk label detection.

BTW, I am the upstream maintainer of GNU parted as well as the Fedora package
maintainer for it.

> > Do you still have the MBR that I could examine
> 
> Well, that attachment from comment #1 gives the first two original
> sectors which made parted go haywire.  As I wrote, zeroing out
> the first 120 bytes of the first sector made the problem vanish,
> so everything that made parted unhappy should be there.

Can you give me more details of how the disk was provisioned?  What programs did
partitioning and filesystem creation?  Before you ran anaconda.

> > Also, if the disk is still partitioned the same way
> 
> I am afraid that it is not; over all these months it was actually
> changed a few times over (even if this is one of those disks which
> I still have around; I do not remember after such a long time).
> In any case I modified that boot sector trying to get around the
> issue, but not before I saved the data in the attachment.

That's ok, I'll go with what you gave me and any other descriptions of how the
system was set up.  The order of operating system installation.  Even if it was
a single OS computer, if it ever ran different operating systems, tell me the
order you installed them in.

> > We've been using parted in anaconda since before Fedora was even
> > around.  So, for ages.  I'm surprised we haven't seen this problem
> > before.
> 
> Oh, I have seen many disks so this surprised me as well.  OTOH
> I assure you that the issue was real.  parted supports many types
> of partitions and possibly some leftovers, without any meaning
> for fdisk or sfdisk, confused it.  I did not study all possible
> formats to be able to tell for sure.
> 
> It is clear that you will not see that on a "virgin" disk.

We maintain parted to be as cross-platform and cross-OS as possible.  It's true
that we support a variety of disk label types, architectures, and
filesystems--on every platform that parted can compile on.  When it comes to
disk label detection, it's an art form.  There are no defined standards, only
loose standards and "accepted" best practices that operating systems follow.  So
we try to cover as many of those as possible.  I want to recreate your scenario
as closely as possible and figure out what detection path libparted is following
so I can fix it.

Comment 5 Michal Jaegermann 2008-02-09 19:59:22 UTC
> I'm talking about disk label detection.

Sorry for misunderstanding.

> Can you give me more details of how the disk was provisioned?
> What programs did partitioning and filesystem creation?
> Before you ran anaconda.

Not with any degree of certainty.  As I wrote, that disk was
in use "from always" in Linux.  I _think_ that the original
partition table was written on it by anaconda; definitely not
one later than that from FC5, but it could be earlier.  I have some
indications that the original partition table was written on
that disk in December of 2004.  It is possible that sfdisk was
used for that, or maybe (but less likely) fdisk from that time.
Surely one of the three above.

How those "garbage bytes" happened I have no idea.  Clearing
the initial 120 bytes and making grub write to this sector again
surely changed it.  Of course this time it was a newer version of grub.

I believe that I am talking about the right disk.  At the time
of filing the report I did not think that I would be asked such
questions half a year later, and I did not record identifiers
like a serial number.

> if it ever ran different operating systems

This I happen to know.  That disk was always used only with Linux.
That applies to all 180 GB WD1800JB-00D disks I have around and on
which this incident could possibly have occurred, even if I cannot be
100% sure that I identified the right one above.

Comment 6 Bug Zapper 2008-05-14 13:22:59 UTC
This message is a reminder that Fedora 7 is nearing the end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 7. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '7'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 7's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 7 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. If possible, it is recommended that you try the newest available Fedora distribution to see if your bug still exists.

Please read the Release Notes for the newest Fedora distribution to make sure it will meet your needs:
http://docs.fedoraproject.org/release-notes/

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 7 Joel Andres Granados 2008-05-26 15:31:18 UTC
In my tests with the attachment from comment #1:
parted:
[root@dhcp-lab-115 Misc]# parted /dev/loop0 print
Error: Unable to open /dev/loop0 - unrecognised disk label.               
Information: Don't forget to update /etc/fstab, if necessary.

fdisk:
[root@dhcp-lab-115 Misc]# fdisk /dev/loop0 -l
[root@dhcp-lab-115 Misc]# 

sfdisk:
[root@dhcp-lab-115 Misc]# sfdisk /dev/loop0 -l
Disk /dev/loop0: cannot get geometry

Disk /dev/loop0: 0 cylinders, 255 heads, 63 sectors/track
lseek: Invalid argument

sfdisk: seek error on /dev/loop0 - cannot seek to 295740585
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start     End   #cyls    #blocks   Id  System
/dev/loop0p1   *      0+   5098    5099-  40957686    7  HPFS/NTFS
/dev/loop0p2       5099   18408   13310  106912575    7  HPFS/NTFS
/dev/loop0p3      18409   21888    3480   27953100    5  Extended
/dev/loop0p4          0       -       0          0    0  Empty
[root@dhcp-lab-115 Misc]# 

As you can see, none of these generates the behavior described in comment #0.  IMO
the partition table had some sort of issue (as expressed in comment #0), but it
cannot be reproduced with the file that is attached.

Can you reproduce the behavior with the file as expressed in comment #0?
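
The sfdisk listing above and the one in comment #0 use different units: `sfdisk -d` prints sectors, while `sfdisk -l` here prints cylinders and 1024-byte blocks for the geometry it assumed (255 heads, 63 sectors/track). The following conversion sketch, with hypothetical helper names, suggests that both listings describe the same primary entries, and that the number in the "cannot seek to" message matches the start sector of the extended partition.

```python
SECTOR_BYTES = 512
HEADS = 255
SECTORS_PER_TRACK = 63
SECTORS_PER_CYLINDER = HEADS * SECTORS_PER_TRACK


def cylinder_to_sector(cylinder):
    """First sector of a cylinder under the geometry sfdisk assumed."""
    return cylinder * SECTORS_PER_CYLINDER


def blocks_to_sectors(blocks, block_bytes=1024):
    """Convert sfdisk's 1024-byte blocks into 512-byte sectors."""
    return blocks * block_bytes // SECTOR_BYTES


# "Units = cylinders of 8225280 bytes" from the sfdisk -l output.
assert SECTORS_PER_CYLINDER * SECTOR_BYTES == 8225280

# (start cylinder, #blocks) from `sfdisk -l` against
# (start sector, size in sectors) from `sfdisk -d` in comment #0.
# p1 is left out of the start check: its "0+" means it does not begin
# on a cylinder boundary (it starts at sector 63).
assert blocks_to_sectors(40957686) == 81915372          # p1 size
assert cylinder_to_sector(5099) == 81915435             # p2 start
assert blocks_to_sectors(106912575) == 213825150        # p2 size
assert cylinder_to_sector(18409) == 295740585           # p3 start, also the seek target
assert blocks_to_sectors(27953100) == 55906200          # p3 size
```

Since the backing file holds only two sectors, a seek to sector 295740585 cannot succeed, which would explain why no logical partitions are listed.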

Comment 8 Michal Jaegermann 2008-05-26 18:30:56 UTC
I tried the same experiment with parted-1.8.8-5.fc9, and
sfdisk and fdisk from util-linux-ng-2.14-0.1.fc10 using a file
with a dump of two sectors of a disk I am using right now.
Results are as follows:
  - sfdisk dumps a table although it complains "cannot seek to ..."
    if an extended partition is present and further "logical"
    partitions are skipped.
  - fdisk remains silent like in comment #7
  - parted complains "Error: Can't have a partition outside the disk!"

The same thing with the file from the attachment (id=158322) produces
four primary partitions with sfdisk, but parted comes up with "unrecognised
disk label" for a change (even though sfdisk accepted it).  That would not help
much with anaconda, would it?

The difference is that I ran 'sfdisk -d ...' to get the
output in the original report and you ran 'sfdisk -l ...'.

So, if it is not feasible to find from available data what makes
parted unhappy then one would have to look at original data from the
whole disk from well over a year ago and I do not have that anymore.

Comment 9 Joel Andres Granados 2008-05-27 07:52:02 UTC
(In reply to comment #8)

> So, if it is not feasible to find from available data what makes
> parted unhappy then one would have to look at original data from the
> whole disk from well over a year ago and I do not have that anymore.

This is what I was afraid of :(.  To debug this issue I would need something to
test with, since I don't have an error message or a starting point.  At this
stage I would close this issue with insufficient data and ask you to reopen if
you see something funky.
What do you think?



Comment 10 Michal Jaegermann 2008-05-27 16:52:24 UTC
Even with the data available it should be possible to find out why parted
responds with "unrecognised disk label" when a "partition outside the disk"
error would really be expected.

The reported error is surely obscure and I think that the chances that I run
into it again are practically nonexistent, although anaconda saying that
it cannot find any partitions on a system that had been working for a long
time was definitely a surprise with no obvious way to fix it.

It sounds like you need the whole disk if somebody runs into the
issue again.  This is hardly feasible unless you manufacture
something of that sort yourself.  Writing the provided initial two sectors
onto a big enough blank disk, "fixing" the resulting mess with the help of
fdisk, and seeing what parted has to say appears to be at least something
to try.  It does not look like a loop device is sufficient for that.
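
Where no spare drive is at hand, one way to approximate "a big enough blank disk" might be a sparse image file with the saved sectors written at its start. Whether parted behaves the same on such an image as on a real drive is not established in this thread, so the following is an untested sketch; the function name and file paths are hypothetical.

```python
import os


def make_test_image(sectors_path, image_path, size_bytes):
    """Create a file of size_bytes whose start holds the saved sectors.

    Returns the resulting file size in bytes.
    """
    with open(sectors_path, "rb") as src:
        head = src.read()
    if len(head) > size_bytes:
        raise ValueError("image is smaller than the saved sectors")
    with open(image_path, "wb") as img:
        # Extending with truncate() leaves a hole on filesystems that
        # support sparse files, so little real disk space is consumed.
        img.truncate(size_bytes)
        img.seek(0)
        img.write(head)
    return os.path.getsize(image_path)
```

For the layout in comment #0 the last partition ends at sector 320673528 + 30973257 = 351646785, so a call such as `make_test_image("sectors.bin", "disk.img", 351646785 * 512)` would give an image just large enough to hold every listed partition.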

Comment 11 Joel Andres Granados 2008-05-28 11:39:01 UTC
That was a great idea!!! thx. (Using the harddrive one)

I get the same behavior expressed in comment #0, which is good, but fdisk
blows up somewhere.  sfdisk gave me the list of primary partitions.  Going to
investigate why parted is giving the wrong information.

Comment 12 Michal Jaegermann 2008-05-28 17:32:01 UTC
> Going to investigate why parted is giving the wrong information.

As parted works with more partitioning methods, a.k.a. "label types",
than (s)fdisk, and attempts to guess a partition type from what it
sees on a disk, it may turn out that the problem is "unfixable"
because of a clash with, say, a gpt handling code.

It is good to know that you can at least look at the problem. Thanks!



Comment 13 Joel Andres Granados 2008-05-29 09:47:47 UTC
(In reply to comment #12)
> > Going to investigate why parted is giving the wrong information.
> 
> As parted works with more partitioning methods, a.k.a. "label types",
> than (s)fdisk, and attempts to guess a partition type from what it
> sees on a disk, it may turn out that the problem is "unfixable"
> because of a clash with, say, a gpt handling code.
> 

This is something that I have considered, but even if that is the case, parted
is still misbehaving IMO.  It might be that the heuristic it is using has not
considered a corner case.  Will have to look at this closer.



Comment 14 Joel Andres Granados 2008-05-29 13:30:22 UTC
Upstream just committed a patch that changes the heuristic for fat recognition.
This effectively gets me past the point where parted misinformed the user about
the drive having one partition.  It's now at the point where it blows up on a
bunch of sanity checks (the fact that there is a partition past my drive's max
capacity is a certain point of failure in parted).  This is mainly due to the
fact that my disk is not described by the partition table given in comment #0.
I'm going to close this one with the certainty that the rest of parted will pick
up the actual partition table.
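
The thread does not show the upstream heuristic itself. As background, here is a hedged sketch of how sector 0 of a partitioned disk can pass for a FAT filesystem: a FAT boot sector and an MBR share the 0x55AA signature at offset 510, and FAT's parameter block occupies the first 62 bytes, inside the range the reporter zeroed. The probe below is deliberately loose and purely illustrative; it is not the check libparted performs, and it is an assumption that the leftover bytes resembled such a block.

```python
import struct


def looks_like_fat16(sector):
    """Loose, illustrative FAT16 probe; NOT the check libparted performs."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        return False
    # Offsets 11..13: bytes per sector (u16) and sectors per cluster (u8).
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", sector, 11)
    fat_count = sector[16]      # number of FAT copies
    fs_type = sector[54:62]     # informational type string, e.g. b"FAT16   "
    return (bytes_per_sector in (512, 1024, 2048, 4096)
            and sectors_per_cluster in (1, 2, 4, 8, 16, 32, 64, 128)
            and fat_count in (1, 2)
            and fs_type.startswith(b"FAT"))


def stale_fat_sector():
    """Synthetic sector 0 carrying leftover FAT-style fields."""
    sector = bytearray(512)
    struct.pack_into("<HB", sector, 11, 512, 4)
    sector[16] = 2
    sector[54:62] = b"FAT16   "
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


sector = stale_fat_sector()
assert looks_like_fat16(sector)
# Clearing the first 120 bytes removes every field the probe relies on.
assert not looks_like_fat16(bytes(120) + sector[120:])
```

A detector that tries such a filesystem probe on the whole device and accepts the first match would report a single "loop" partition spanning the disk, which is the shape of the output in comment #0.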

Comment 15 Joel Andres Granados 2008-05-29 13:31:16 UTC
relevant upstream commit 2fb0836622aafcdcb7da511c3890a28887a36754

Comment 16 Joel Andres Granados 2008-05-29 13:35:19 UTC
Comment 15 is the tests for the change.  This commit
2fb0836622aafcdcb7da511c3890a28887a36754 is the actual meat.


Comment 17 Michal Jaegermann 2008-05-29 18:35:14 UTC
> It's now at the point where it blows up on a bunch of sanity checks.

That is why I was talking about "fixing the resulting mess".  I expect
that if you used, say, sfdisk to replace the existing extended
partition with one within the confines of the test disk you have on hand
(assuming that none of the other partitions is too big) then the current
version of rawhide parted would be happy.

Yes, it indeed looks like an upstream change took care of the issue.