Bug 1147998 - Cloud image does not permit successful reboot
Summary: Cloud image does not permit successful reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: cloud-utils
Version: 21
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Assignee: Juerg Haefliger
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: https://fedoraproject.org/wiki/Common...
Depends On: 1156603
Blocks: F21BetaBlocker
 
Reported: 2014-09-30 13:42 UTC by Lars Kellogg-Stedman
Modified: 2015-04-13 23:02 UTC
CC: 10 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-10-30 17:46:42 UTC




Links
System ID Priority Status Summary Last Updated
Red Hat Bugzilla 1210428 None None None Never
Red Hat Bugzilla 1211405 None None None Never

Internal Links: 1210428 1211405

Description Lars Kellogg-Stedman 2014-09-30 13:42:33 UTC
The Fedora 21 cloud image (http://fedoraproject.org/get-prerelease#cloud) boots successfully in OpenStack once, but subsequent reboots fail (the VM gets stuck at the "Booting from Hard Disk..." message).

Applying the syslinux MBR (dd if=/usr/share/syslinux/mbr.bin of=/dev/vda) appears to correct the problem, whether applied before or after the first boot.

After first booting, the mbr looks like this:

# xxd -l512 /dev/vda
0000000: fab8 0010 8ed0 bc00 b0b8 0000 8ed8 8ec0  ................
0000010: fbbe 007c bf00 06b9 0002 f3a4 ea21 0600  ...|.........!..
0000020: 00be be07 3804 750b 83c6 1081 fefe 0775  ....8.u........u
0000030: f3eb 16b4 02b0 01bb 007c b280 8a74 018b  .........|...t..
0000040: 4c02 cd13 ea00 7c00 00eb fe00 0000 0000  L.....|.........
0000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000060: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000100: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000110: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000120: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000130: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000140: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000150: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000160: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000170: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000180: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000190: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001b0: 0000 0000 0000 0000 0023 c818 0000 8066  .........#.....f
00001c0: 0900 8392 d4ff 0008 0000 18f4 7f02 0000  ................
00001d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001f0: 0000 0000 0000 0000 0000 0000 0000 55aa  ..............U.

After applying syslinux mbr.bin, the mbr looks like this:

0000000: 33c0 fa8e d88e d0bc 007c 89e6 0657 8ec0  3........|...W..
0000010: fbfc bf00 06b9 0001 f3a5 ea1f 0600 0052  ...............R
0000020: 52b4 41bb aa55 31c9 30f6 f9cd 1372 1381  R.A..U1.0....r..
0000030: fb55 aa75 0dd1 e973 0966 c706 8d06 b442  .U.u...s.f.....B
0000040: eb15 5ab4 08cd 1383 e13f 510f b6c6 40f7  ..Z......?Q...@.
0000050: e152 5066 31c0 6699 e866 00e8 3501 4d69  .RPf1.f..f..5.Mi
0000060: 7373 696e 6720 6f70 6572 6174 696e 6720  ssing operating 
0000070: 7379 7374 656d 2e0d 0a66 6066 31d2 bb00  system...f`f1...
0000080: 7c66 5266 5006 536a 016a 1089 e666 f736  |fRfP.Sj.j...f.6
0000090: f47b c0e4 0688 e188 c592 f636 f87b 88c6  .{.........6.{..
00000a0: 08e1 41b8 0102 8a16 fa7b cd13 8d64 1066  ..A......{...d.f
00000b0: 61c3 e8c4 ffbe be7d bfbe 07b9 2000 f3a5  a......}.... ...
00000c0: c366 6089 e5bb be07 b904 0031 c053 51f6  .f`........1.SQ.
00000d0: 0780 7403 4089 de83 c310 e2f3 4874 5b79  ..t.@.......Ht[y
00000e0: 3959 5b8a 4704 3c0f 7406 247f 3c05 7522  9Y[.G.<.t.$.<.u"
00000f0: 668b 4708 668b 5614 6601 d066 21d2 7503  f.G.f.V.f..f!.u.
0000100: 6689 c2e8 acff 7203 e8b6 ff66 8b46 1ce8  f.....r....f.F..
0000110: a0ff 83c3 10e2 cc66 61c3 e876 004d 756c  .......fa..v.Mul
0000120: 7469 706c 6520 6163 7469 7665 2070 6172  tiple active par
0000130: 7469 7469 6f6e 732e 0d0a 668b 4408 6603  titions...f.D.f.
0000140: 461c 6689 4408 e830 ff72 2766 813e 007c  F.f.D..0.r'f.>.|
0000150: 5846 5342 7509 6683 c004 e81c ff72 1381  XFSBu.f......r..
0000160: 3efe 7d55 aa0f 85f2 febc fa7b 5a5f 07fa  >.}U.......{Z_..
0000170: ffe4 e81e 004f 7065 7261 7469 6e67 2073  .....Operating s
0000180: 7973 7465 6d20 6c6f 6164 2065 7272 6f72  ystem load error
0000190: 2e0d 0a5e acb4 0e8a 3e62 04b3 07cd 103c  ...^....>b.....<
00001a0: 0a75 f1cd 18f4 ebfd 0000 0000 0000 0000  .u..............
00001b0: 0000 0000 0000 0000 0023 c818 0000 8066  .........#.....f
00001c0: 0900 8392 d4ff 0008 0000 18f4 7f02 0000  ................
00001d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001f0: 0000 0000 0000 0000 0000 0000 0000 55aa  ..............U.

Comment 1 Matthew Miller 2014-09-30 13:45:01 UTC
FWIW I tested rebooting in EC2 and it _does_ work. However, that's not at all surprising, as EC2 is using PV rather than HVM (and therefore not using the boot loader).

Comment 2 Matthew Miller 2014-09-30 14:10:23 UTC
Tentatively reassigning to cloud-utils, on the theory that growpart is to blame.

Comment 3 Mike Ruckman 2014-09-30 15:14:30 UTC
There's been at least one other person that ran into this issue with libvirt as well. We don't have cloud specific criteria for this, but I think the Beta Shutdown, Reboot, Logout Criteria fits here: "Shutting down, logging out and rebooting must work using standard console commands and the mechanisms offered..."

Comment 4 Dusty Mabe 2014-09-30 15:30:15 UTC
(In reply to Matthew Miller from comment #2)
> Tentatively reassigning to cloud-utils, on the theory that growpart is to
> blame.

Confirmed.

If I instruct cloud-init not to grow my partition then I can reboot as much as I want. I am using virt-install --import with a local ISO image as the source for cloud-init data. Here is the user-data I am passing in:

[dustymabe@localhost guests]$ cat my-user-data 
#cloud-config
password: passw0rd
chpasswd: { expire: False }
ssh_pwauth: True
growpart:
  mode: off


After booting an instance in this manner I then run growpart manually:

[root@localhost ~]# growpart /dev/sda 1                                                                                                                                                              
CHANGED: partition=1 start=2048 old: size=6144000 end=6146048 new: size=6286612,end=6288660

And... I can no longer reboot.

NOTE: To make sure that it isn't just any change to the partition table that affects reboot, I also tried running fdisk manually and adding a 2nd partition instead of growing the first one. That change did not prevent me from rebooting.

Comment 5 Lars Kellogg-Stedman 2014-09-30 17:12:42 UTC
This problem was not present in the F20 cloud images because those images used the syslinux MBR by default, which does not appear to be affected by this issue.

For both the syslinux-mbr and the original mbr, there is a change to the mbr after the first reboot:

$ diff mbr-pre-reboot mbr-post-reboot
28,29c28,29
< 00001b0: 0000 0000 0000 0000 0023 c818 0000 8020  .........#..... 
< 00001c0: 2100 8392 547e 0008 0000 00c0 5d00 0000  !...T~......]...
---
> 00001b0: 0000 0000 0000 0000 0023 c818 0000 8066  .........#.....f
> 00001c0: 0900 8392 d4ff 0008 0000 18f4 7f02 0000  ................

I don't know whether this represents (a) something writing where it shouldn't or (b) something that is expected but to which the non-syslinux mbr is sensitive.
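The two differing lines in the diff above fall inside the first partition-table entry, which starts at offset 0x1be of the boot sector. As a sketch, the 16 entry bytes from the diff can be decoded with the standard MBR entry layout (the hex strings below are copied from the dumps above; the decoder itself is generic, not anything specific to this bug):

```python
import struct

def decode_mbr_entry(entry):
    """Decode one 16-byte MBR partition-table entry (the first entry
    starts at offset 0x1be in the boot sector)."""
    status, sh, ss, sc, ptype, eh, es, ec = struct.unpack_from("8B", entry)
    lba_start, sectors = struct.unpack_from("<II", entry, 8)

    def chs(head, sect_byte, cyl_low):
        # low 6 bits of the sector byte are the (1-based) sector;
        # its high 2 bits are cylinder bits 8-9
        cyl = ((sect_byte & 0xC0) << 2) | cyl_low
        return (cyl, head, sect_byte & 0x3F)

    return {"active": status == 0x80, "type": ptype,
            "start_chs": chs(sh, ss, sc), "end_chs": chs(eh, es, ec),
            "lba_start": lba_start, "sectors": sectors}

# First partition entry, bytes 0x1be-0x1cd copied out of the diff above
pre  = decode_mbr_entry(bytes.fromhex("802021008392547e0008000000c05d00"))
post = decode_mbr_entry(bytes.fromhex("806609008392d4ff0008000018f47f02"))

print(pre["start_chs"], pre["end_chs"])     # (0, 32, 33) (382, 146, 20)
print(post["start_chs"], post["end_chs"])   # (0, 102, 9) (1023, 146, 20)
print(pre["lba_start"], post["lba_start"])  # 2048 2048 -- LBA unchanged
```

So the reboot changes only the CHS fields (and, after the resize, the sector count); the LBA start stays at 2048. These tuples are exactly the "expected"/"found" pairs that sfdisk prints in comment 11 below.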

Comment 6 Lars Kellogg-Stedman 2014-09-30 18:58:23 UTC
Summarizing by request:

- The F21 cloud images have a boot loader that appears to be installed by parted.

- There is some change to the mbr as part of the initial boot/resize/reboot process. I do not have enough mbr-fu to know what's going on here (is it expected? or a bug?)

- The parted-derived mbr is unable to boot the image after the initial boot/resize/reboot.

- Using a syslinux-derived mbr -- applied either to the original image or before reboot -- seems to avoid the problem.  There is still a change to the mbr, but it does not prevent a successful reboot.

It is not clear whether this problem is due to a bug in growpart, a bug in the parted-derived mbr, or something else.  Using the syslinux mbr seems to avoid the problem.

Comment 7 Lars Kellogg-Stedman 2014-09-30 19:46:28 UTC
Workarounds:

From inside a vm, after booting a cloud image:

    # dd if=/usr/share/syslinux/mbr.bin of=/dev/vda

To fix a qcow2 image on disk:

    # qemu-nbd -c /dev/nbd0 fedora-cloud-base.qcow2
    # dd if=/usr/share/syslinux/mbr.bin of=/dev/nbd0
    # qemu-nbd -d /dev/nbd0

To fix a raw image on disk:

    # dd if=/usr/share/syslinux/mbr.bin of=fedora-cloud-base.img conv=notrunc

Comment 8 Matthew Miller 2014-10-01 13:50:57 UTC
One possible workaround is to put

dd if=/usr/share/syslinux/mbr.bin of=/dev/vda

in the kickstart post script used when the images are generated.

Comment 9 Adam Williamson 2014-10-03 16:15:09 UTC
Discussed at 2014-10-03 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2014-10-03/f21-blocker-review.2014-10-03-15.58.log.txt . Accepted as a blocker per criterion "Shutting down, logging out and rebooting must work using standard console commands and the mechanisms offered..."

Comment 10 Mike Ruckman 2014-10-10 22:36:11 UTC
Verified again on the TC3 base image on OpenStack.

Comment 11 Juerg Haefliger 2014-10-14 14:19:48 UTC
Growpart is not doing anything wrong AFAICT. In fact, disabling growroot and running the following already breaks the subsequent boot:
$ sfdisk -d /dev/vda > mbr.dump
$ sfdisk --force /dev/vda < mbr.dump

The above command updates the CHS addresses in the partition table, which seems to trip the bootloader code. sfdisk is somewhat confused about the CHS geometry and 'corrects' it when writing back the partition table. I'm not exactly sure why or how it decides to do this, but it seems to cause problems for the installed bootloader.

Before repartitioning:

# sfdisk -l /dev/vda
sfdisk: Disk /dev/vda: cannot get geometry

Disk /dev/vda: 652 cylinders, 255 heads, 63 sectors/track
sfdisk: Warning: The partition table looks like it was made
  for C/H/S=*/147/20 (instead of 652/255/63).
For this listing I'll assume that geometry.
Units: cylinders of 1505280 bytes, blocks of 1024 bytes, counting from 0

   Device Boot Start     End   #cyls    #blocks   Id  System
/dev/vda1   *      0+   2090-   2090-   3072000   83  Linux
		start: (c,h,s) expected (0,102,9) found (0,32,33)
		end: (c,h,s) expected (1023,146,20) found (382,146,20)
/dev/vda2          0       -       0          0    0  Empty
/dev/vda3          0       -       0          0    0  Empty
/dev/vda4          0       -       0          0    0  Empty

I'm not familiar with the bootloader code so can't say if it's a parted bootloader problem or if the extlinux bootloader is simply more forgiving and/or robust. As Lars mentioned, using the extlinux bootloader 'fixes' the issues so I second Matt's suggestion to dd it to the disk during the anaconda post-install stage.
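The "expected"/"found" pairs in the sfdisk listing above are the same LBA offsets converted under the two geometries sfdisk mentions (147 heads/20 sectors vs. 255 heads/63 sectors). A sketch of the arithmetic; the pegging convention for cylinders above 1023 is an assumption inferred from the bytes observed on this image (conventions vary between tools):

```python
def lba_to_chs(lba, heads, spt):
    """Convert a linear sector number to a (cylinder, head, sector)
    tuple for a given geometry. When the cylinder doesn't fit in the
    10 bits the MBR provides, peg to the largest representable address
    (assumed convention, inferred from the bytes seen here)."""
    cyl, rem = divmod(lba, heads * spt)
    if cyl > 1023:
        return (1023, heads - 1, spt)
    head, sect = divmod(rem, spt)
    return (cyl, head, sect + 1)  # sector numbers are 1-based

start, end = 2048, 2048 + 6144000 - 1  # /dev/vda1 from the listing above

print(lba_to_chs(start, 255, 63))  # (0, 32, 33)    -- what parted wrote ("found")
print(lba_to_chs(start, 147, 20))  # (0, 102, 9)    -- what sfdisk expects
print(lba_to_chs(end, 255, 63))    # (382, 146, 20)  ("found")
print(lba_to_chs(end, 147, 20))    # (1023, 146, 20) ("expected", pegged)
```

In other words, the image was partitioned assuming a 255/63 geometry, sfdisk infers 147/20 from the disk, and rewriting the table under the inferred geometry produces the "corrected" CHS values that the parted-installed boot code then fails on.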

Comment 12 Matthew Miller 2014-10-14 14:58:34 UTC
(In reply to Juerg Haefliger from comment #11)
> Growpart is not doing anything wrong AFAICT. In fact, disabling growroot and
> running the following already breaks the subsequent boot:
> $ sfdisk -d /dev/vda > mbr.dump
> $ sfdisk --force /dev/vda < mbr.dump
> 
> The above command updates the CHS addresses in the partition table which
> seems to trip the bootloader code. sfdisk is somewhat confused about the CHS
> geometry and 'corrects' it when writing back the partition table. I'm not
> exactly sure why and how it decides to do this but it seems to cause
> problems for the installed bootloader.


Hmmmm; maybe we should switch growpart to using parted? (Ugh, that kind of looks like a rewrite and not a small bugfix.)

Since we're in the freeze now, I'll put the post-install workaround into place if people agree. Alternately, we could update anaconda to always install the extlinux MBR if extlinux is used.

Comment 13 Dusty Mabe 2014-10-14 15:14:09 UTC
(In reply to Matthew Miller from comment #12)

> Since we're in the freeze now, I'll put the post-install workaround into
> place if people agree. 

Agreed. We should probably do this for now.


> Alternately, we could update anaconda to always
> install the extlinux MBR if extlinux is used.


I think this gets back to the subject of https://bugzilla.redhat.com/show_bug.cgi?id=1015931

Comment 14 Juerg Haefliger 2014-10-14 16:04:19 UTC
The question is why this breaks the parted bootloader. I tried both sfdisk and fdisk; same result.

Comment 15 Matthew Miller 2014-10-14 16:30:40 UTC
So, this is _probably_ really an sfdisk bug. But we've got a lot of different possible workarounds:

1. install extlinux mbr in anaconda (bug #1015931)
2. install extlinux mbr in kickstart post (easiest!)
3. use parted instead of sfdisk in growpart
4. have cloud-init use parted directly instead of growpart

I was excited to see that parted 3.2 (now in F21) includes resizepart support directly. And then those hopes for an easy fix there were kind of dashed by upstream cloud-init https://bugs.launchpad.net/cloud-init/+bug/1212492, where parted resizing has been completely removed because it is apparently broken. (Hooray! Everything is broken!)

That probably rules out options 3 and 4. At this stage, my proposal is to implement 2 for F21 and to plan for 1 for F22 and beyond.

Comment 16 Matthew Miller 2014-10-14 19:41:13 UTC
Oh yeah, I left "fix sfdisk" off the list. But I guess maybe that's a possibility?

Comment 17 Matthew Miller 2014-10-15 00:18:52 UTC
Anyway, I committed the `dd if=/usr/share/syslinux/mbr.bin of=/dev/vda` workaround to the kickstart files. This should resolve this bug for now.

Comment 18 Bruno Wolff III 2014-10-15 12:08:42 UTC
Note that the f21 branch of spin-kickstarts was fubar'd before Matt did his commit. After going back before the incorrect merge, Matt's commit no longer cleanly applied to that branch. So that needs to get looked at and committed again.

Comment 19 Matthew Miller 2014-10-15 13:26:36 UTC
(In reply to Bruno Wolff III from comment #18)
> Note that the f21 branch of spin-kickstarts was fubar'd before Matt did his
> commit. After going back before the incorrect merge, Matt's commit no longer
> cleanly applied to that branch. So that needs to get looked at and committed
> again.

Thanks Bruno. Done.

Comment 20 Adam Williamson 2014-10-15 17:46:15 UTC
MODIFIED per c#17. This change should be in TC4/RC1.

Comment 21 Matthew Miller 2014-10-22 14:01:04 UTC
We are still waiting for a TC4 cloud image to be created.

Comment 22 Adam Williamson 2014-10-22 21:05:27 UTC
We still need the bug to be in MODIFIED state to indicate that we can compose (the issue is 'addressed' in that we believe when an image actually gets composed, the bug will be fixed).

Comment 23 Matthew Miller 2014-10-27 19:52:36 UTC
We're still blocked in actual testing by bug #1156603 (compose failure) but I've tested the kickstart with the fix with anaconda _locally_ and it seems to work fine, including rebooting.

Comment 24 Matthew Miller 2014-10-27 23:55:30 UTC
I tested the RC1 builds Dennis just made and the fix works. Presumably will also work in RC2. :)

Comment 25 Adam Williamson 2014-10-30 17:46:42 UTC
This has been confirmed fixed in RC1 and RC4 by multiple testers.

Comment 26 Adam Williamson 2015-04-13 22:46:25 UTC
For the record, we have been looking into this area again recently. We have also confirmed that this bug still affects Fedora 22 if the 'dd' line is removed from cloud kickstart %post.

We basically reconstructed and reconfirmed Lars' work from #c6, with an additional wrinkle. Dusty found that simply doing this:

sfdisk --dump /dev/vda > dump
xxd -l512 /dev/vda > before
sfdisk /dev/vda < dump --force
xxd -l512 /dev/vda > after

is enough to cause a change to the disk label - 'before' and 'after' are not identical. That seems clearly a bug in sfdisk, as this command should only dump and reload the existing configuration. It seems this is the change that prevents the parted-deployed boot sector code from working. He also noted this difference in the output of 'file /dev/vda' before and after the dump/restore:

------------

# Before sfdisk dump/restore (sfdisk -d /dev/vda > mbr.dump; sfdisk --force /dev/vda < mbr.dump)
[root@f22sfdisk ~]# file -s /dev/vda
/dev/vda: DOS/MBR boot sector; partition 1 : ID=0x83, active, start-CHS (0x0,32,33), end-CHS (0x17e,146,20), startsector 2048, 6144000 sectors

# After sfdisk dump/restore
[root@f22sfdisk ~]# file -s /dev/vda
/dev/vda: DOS/MBR boot sector; partition 1 : ID=0x83, active, start-CHS (0x2,0,33), end-CHS (0x3d1,4,20), startsector 2048, 6144000 sectors

------------

I'm going to file a util-linux bug.
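For completeness, the CHS/LBA values from the two file(1) lines above can be packed back into partition-entry bytes to check what the dump/restore actually touched. The byte strings below are a reconstruction from the printed values, not read from a real image; the check shows that every changed byte falls inside the partition table, with the boot code and the LBA fields left alone -- consistent with the theory that the parted-installed boot code is sensitive to the CHS fields while the syslinux MBR is not:

```python
def diff_offsets(before, after):
    """Byte offsets at which two equal-length sectors differ."""
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]

# Partition entry 1 reconstructed from the file(1) output above:
# before: start-CHS (0x0,32,33),  end-CHS (0x17e,146,20), LBA 2048, 6144000 sectors
# after:  start-CHS (0x2,0,33),   end-CHS (0x3d1,4,20),   LBA 2048, 6144000 sectors
sec_before, sec_after = bytearray(512), bytearray(512)
sec_before[0x1be:0x1ce] = bytes.fromhex("802021008392547e0008000000c05d00")
sec_after[0x1be:0x1ce]  = bytes.fromhex("800021028304d4d10008000000c05d00")

changed = diff_offsets(sec_before, sec_after)
print([hex(i) for i in changed])  # ['0x1bf', '0x1c1', '0x1c3', '0x1c4', '0x1c5']
assert all(0x1be <= i < 0x1fe for i in changed)  # only CHS fields in the table
```

Only the five CHS bytes of the entry differ; the start sector (LBA 2048) and size are untouched, so a dump/restore that is supposed to be a no-op still rewrites the on-disk label.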

Comment 27 Adam Williamson 2015-04-13 23:02:35 UTC
util-linux bug: https://bugzilla.redhat.com/show_bug.cgi?id=1211405

