Created attachment 432468 [details]
init.log from the usb key in question

When attempting to boot one of the recent nightly composes [1] (as of Jul 16) from a USB key, the boot process fails with a dozen errors from mount. Debugging dracut resulted in the attached init.log. It seems to be an issue with the ext3fs.img which lives in the squashfs.img. We suspected that it might be something caused by the liveusb-creator, too.

dmsetup ls --tree returns (psyche being my own machine's name):
vg_psyche-lv_swap (253:1)
 `- (8:5)
vg_psyche-lv_root (253:9)
 `- (8:5)
live-osimg-min (253:3)
 |- (7:3)
 `- (7:1)
live-rw (253:2)
 |- (7:3)
 `- (7:4)

losetup -a returns:
/dev/loop0 [0001]:6931 (/osmin.img)
/dev/loop1 [0700]:2 (/squashfs.osmin/osmin)
/dev/loop2 [0811]:4 (/sysroot/LiveOS/squashfs.img)
/dev/loop3 [0702]:3 (/squashfs/LiveOS/ext3fs.img)
/dev/loop4 [0001]:6981 (/overlay)

[1] http://alt.fedoraproject.org/pub/alt/nightly-composes/
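For reading the dmsetup tree above, here is a small Python sketch (not part of dracut) that turns the (major:minor) pairs into likely device names. It assumes the conventional Linux numbering: major 7 for loop devices and major 8 for sd disks with 16 minors per disk. Major 253 is assumed to be device-mapper on this machine, since that number is assigned dynamically. The function name is made up for illustration.

```python
import string

def guess_device(major, minor):
    """Best-effort name for a (major, minor) pair as printed by dmsetup ls --tree."""
    if major == 7:                       # loop devices: minor is the loop index
        return "/dev/loop%d" % minor
    if major == 8:                       # sd disks: 16 minors per disk
        disk, part = divmod(minor, 16)
        name = "/dev/sd" + string.ascii_lowercase[disk]
        return name + (str(part) if part else "")
    if major == 253:                     # assumption: device-mapper got 253 here
        return "/dev/dm-%d" % minor
    return "unknown(%d:%d)" % (major, minor)

# Devices listed under live-rw and live-osimg-min in the tree above
for label, pairs in [("live-rw", [(7, 3), (7, 4)]),
                     ("live-osimg-min", [(7, 3), (7, 1)])]:
    print(label, [guess_device(ma, mi) for ma, mi in pairs])
```

Combined with the losetup -a output, this reads as live-rw sitting on loop3 (ext3fs.img) plus loop4 (/overlay), and live-osimg-min on loop3 plus loop1 (osmin).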
Also occurs when booting from a CD burned from the .iso and from a USB key written with the dd command. It would seem to be in the squashfs and not caused by the liveusb-creator.
Here are the relevant parts of dracut:
+ mount -n -t vfat -o ro /dev/sdc1 /sysroot
+ dd if=/sysroot/LiveOS/osmin.img of=/osmin.img
+ losetup -r /dev/loop0 /osmin.img
+ mkdir -p /squashfs.osmin
+ mount -n -t squashfs -o ro /dev/loop0 /squashfs.osmin
+ losetup -f
+ losetup -r /dev/loop1 /squashfs.osmin/osmin
+ umount -l /squashfs.osmin
+ losetup -r /dev/loop2 /sysroot/LiveOS/squashfs.img
+ mkdir -p /squashfs
+ mount -n -t squashfs -o ro /dev/loop2 /squashfs
+ losetup -f
+ losetup -r /dev/loop3 /squashfs/LiveOS/ext3fs.img
+ umount -l /squashfs
+ umount -l /sysroot
+ dd if=/dev/null of=/overlay bs=1024 count=1 seek=524288
+ losetup /dev/loop4 /overlay
+ dmsetup create live-rw
+ echo 0 4194304 snapshot /dev/loop3 /dev/loop4 p 8
+ echo 0 4194304 snapshot /dev/loop3 /dev/loop1 p 8
+ dmsetup create --readonly live-osimg-min
+ /bin/mount /dev/mapper/live-rw /sysroot
mount: wrong fs type, bad option, bad superblock on /dev/mapper/live-rw,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
reassigning to get a second opinion of what could be wrong... nothing changed in dracut, so there might be some changes in the .img creation process.
http://dracut.git.sourceforge.net/git/gitweb.cgi?p=dracut/dracut;a=blob;f=modules.d/90dmsquash-live/dmsquash-live-root;h=c98cdef5897e04a8b1ad3655d4450402b2ff804c;hb=7d86d90d1152a8d496bbd8c41b6c865ca0c3f03b Here is the dracut script.
Adding this as a F14Alpha blocker as it impacts the ability to test live images.
Please help us save time at this week's Fedora 14 Alpha Blocker Bug review meeting by adding your comments on the assessment of this bug. If this bug is still unresolved by Friday, July 23, 2010, we would appreciate your attendance at the Fedora 14 Alpha blocker meeting on freenode in the #fedora-bugzappers channel at 16:00 UTC. The following information would be very helpful to have prior to Friday:
a) whether you believe it is truly a blocker bug
b) what additional information you need to troubleshoot or fix this bug
c) when you estimate having a fix ready
Thank you, John
I tried looking at this a bit and was testing the kde-i386-20100720.15.iso image and found that ext3fs.img is not mountable.
e2fsprogs-1.41.12-5.fc14 was built on the 13th. It might be useful to try building ISOs with an earlier version of it to see if that makes a difference.
Re: Comment #8: soas-i386-20100702.15.iso was the last nightly-compose Soas .iso that worked using livecd-to disk. With Soas.ks for a remix on an f14 (rawhide) build system, Soas-v4-07142010-remix was the last build that worked before yum update --skip-broken (I am using rawhide, so it may have worked longer before the update moved to rawhide). See test results: http://wiki.sugarlabs.org/go/Talk:Features/Soas_V4/Install_Test_Table#Test_results
When I built an ISO on my rawhide system ext3fs.img was mountable. I haven't tested the ks yet, and since it's pretty late probably won't be able to until tomorrow.
I did get to test it. I did a dd to a USB drive and when trying to boot I got a quick flash of syslinux and then the screen stayed black and it didn't appear that anything was happening. This is similar to what I was seeing when trying to boot off of the images from the nightly compose page. This suggests that there might be two separate problems.
In response to Comment 11: apparently this is an issue with a missing splash.jpg file, or with fedora-logos. A standard F13 DVD has:
display boot.msg
background splash.jpg
In my testing, the black screen at Syslinux startup was fixed by copying boot.msg and splash.jpg to the syslinux/ folder and adding
display boot.msg
background splash.jpg
to syslinux.cfg. The same goes for the LiveCD: it is missing splash.jpg and boot.msg, plus the corresponding reference in isolinux.cfg. That is in fact a separate issue.
In a nutshell, the livecd-creator program:
...snip
# Relabels the entire installation root (for SELinux)
* ># Creates a live CD specific initramfs that matches the installed kernel
# Unmounts the kernel file systems mounted inside the installation root
# Unmounts the installation root
# Creates a squashfs file system containing only the default ext3/4 file (compression)
# Configures the boot loader
# Creates an iso9660 bootable CD/DVD
.......
* >I note that my f14 (rawhide) build HD quit making working .iso's after a kernel update was part of yum update. Could this be part of the problem?
(from http://fedoraproject.org/wiki/How_to_create_and_use_a_Live_CD)
+ dd if=/dev/null of=/overlay bs=1024 count=1 seek=524288 + losetup /dev/loop4 /overlay + dmsetup create live-rw The overlay file is never formatted, then we try to mount it. I don't see how this is supposed to work.
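To make the numbers in that trace easier to check, here is a rough Python sketch. It assumes the documented device-mapper snapshot table layout (start, length, the word snapshot, origin device, COW device, persistence flag, chunk size, with lengths and chunk size counted in 512-byte sectors). The helper names are invented for illustration and nothing here is taken from dracut itself. Under that reading, /dev/loop4 acts as a block-level copy-on-write store rather than a filesystem mounted on its own, and the filesystem that mount sees on live-rw would be the one inside /dev/loop3.

```python
SECTOR = 512  # device-mapper tables count in 512-byte sectors

def sparse_size(bs, seek):
    """Size of the file left by: dd if=/dev/null of=FILE bs=BS count=1 seek=SEEK.
    Nothing is read from /dev/null, so the file is only extended to the seek offset."""
    return bs * seek

def snapshot_table(length_sectors, origin, cow, persistent=True, chunk_sectors=8):
    """Build one snapshot table line of the kind echoed into 'dmsetup create'."""
    flag = "p" if persistent else "n"
    return "0 %d snapshot %s %s %s %d" % (
        length_sectors, origin, cow, flag, chunk_sectors)

overlay_bytes = sparse_size(1024, 524288)   # values from the dd line in the trace
origin_bytes = 4194304 * SECTOR             # length field from the table line
print(overlay_bytes // 2**20, "MiB overlay")
print(origin_bytes // 2**30, "GiB origin")
print(snapshot_table(4194304, "/dev/loop3", "/dev/loop4"))
```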
*** Bug 613213 has been marked as a duplicate of this bug. ***
An f14 (rawhide) remix build using livecd-creator works with soas.ks edited to use generic logos. yum updated to the new kernel 2.6.35-0.49.rc5....fc14.i686. The CD boots fine.
In response to comment 16: The fedora-logos changes, which present a black screen and no ISOLINUX bootloader on the nightly composes, are a different issue than the one in this bug report. This bug report involves the Live media booting but dying with an error message about being unable to mount /dev/mapper/live-rw. The bug report which Comment 16 is referring to is Bug 617115. Please post there about your fedora-logos black screen issues with ISOLINUX/SYSLINUX.
Discussed at the 2010/07/23 blocker review meeting, we accept this bug as a blocker. It would help if people can test the nightlies...well, nightly...for the next week or so and see how it goes. It seems like there may be multiple bugs here, it'd be helpful to have each isolated and separately filed as well. -- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers
For the black screen on boot issue, see: https://bugzilla.redhat.com/show_bug.cgi?id=617115
In reply to Comment 10: I have just tested a livecd spin using a RAWHIDE repository source on an F13 machine, and the ext3fs.img does now mount properly. It also boots properly, which leads me to believe the problem was that either the ext3fs.img or the live-rw overlay file format was broken. This issue should now be closed.
I am still seeing the mount problem with desktop-i386-20100723.01.iso. I think there is still something going on here. I'm going to do a local build on an F14 machine and see if that is different.
When I run livecd-creator on an F14 system I can mount the ext3 image. Due to the black screen issue I can't boot the image though.
In reply to Comment 22: Hold down Shift before the media boots to get a SYSLINUX/ISOLINUX prompt, then enter linux0 and press ENTER. Please remember that these are two separate issues, however. My testing shows that when building a LiveCD from the RAWHIDE repository on an F13 host, the unable-to-mount-root issue has been fixed. I am not sure how, but it has. Please use the steps above to bypass the black screen issue (the splash.jpg and menu background splash.jpg problem with SYSLINUX/ISOLINUX) for your testing.
I'll need to wait until I get to work to test this. The keyboard on the easiest machine to test this doesn't work until an OS is running, so doesn't work for working around the black screen issue. It isn't my machine though so I can't regularly be mucking with its hardware for testing. The machines that are mine to play with don't boot off of USB devices. One won't boot off of DVD RWs (I can't burn CDs due to another bug and using DVD Rs costs money.) and the last triggers a KMS kernel bug (though the shift trick did get it to start booting) that terminates the boot. I should be able to do a couple of tests tomorrow at work.
I was able to test a local build on a F14 machine that had a fix for the black screen issue that was blocking my testing. The system booted up, but I couldn't login. When I tried, I got a message about not being able to change the monitor settings. I don't know if that is a general problem or related to specific hardware. I'll test the image on other hardware tomorrow. That doesn't explain why the nightly composes have this issue and building on other F14 systems doesn't.
For daily tests of the soas nightly composes, see: http://wiki.sugarlabs.org/go/Talk:Features/Soas_V4/Install_Test_Table#Test_results This page also shows builds made with an external USB 500GB hard drive holding a daily updated f14 (rawhide) install of Gnome-sugar as a livecd-creator soas.ks build system.
In reply to Comment 25: Well, I just built a livecd with livecd-tools from a RAWHIDE repository that I rsync'ed manually, instead of using the URL provided in the kickstart file. It does have the splash problem, of course, until that is fixed. I also noticed that file reports the nightly spins' ext3fs.img as Ext4 format, while the image I built locally from rawhide with livecd-creator is reported as Ext3. Maybe they are using a modified version of livecd-creator to build the nightlies.
I get the message about monitor settings on every boot with Rawhide, currently.
Just some more info here: The compose box for the nightly composes is running rawhide. It's an x86_64 xen guest. I have: xfce-x86_64-20100714.16.iso -> boots normally. xfce-x86_64-20100717.15.iso -> fails as noted in this bug. So it seems like something changed between those two dates, but I am very puzzled as to what. I've tried downgrading to the livecd-tools from before the last update. I tried downgrading the e2fsprogs that was updated on the 14th. ;( Happy to provide more info about the compose machine. I wonder: for those of you who have made local spins that work, have those all been composed on 32-bit hosts?
using ACER Aspire ONE Intel Atom N450 1.66 GHZ with 500 GB USB external drive with f14(rawhide) installed for builds; livecd-tools spin-kickstarts. Here are tests: http://wiki.sugarlabs.org/go/Talk:Features/Soas_V4/Install_Test_Table#Test_results - looks like soas-i386-20100702.15.iso last .iso that worked from Nightly Composes - I can still do working remixes by using generic-logos in .ks. Just did one now.
I think someone forgot to mention the root=live:/dev/sr0 workaround here! I.e., scsi boot currently works but ide does not, if I understood correctly.
In reply to comment 31 This has nothing to do with that bug. This has to do with the ext3fs.img not being mountable and reading file magic says it's Ext4 and failure to mount it when the system boots. I again must say, this bug has nothing to do with the SPLASH.PNG for ISOLINUX NOR SYSLINUX and is still being worked on.
In reply to Comment 29: Can you take one of the nightly spins, mount the ISO9660 filesystem, then mount the squashfs.img, and also try mounting the ext3fs.img to see if it mounts correctly? I know the RAWHIDE squashfs-tools allows using lzma compression now, and that it is the default in livecd-creator if it is available. But if the squashfs.ko in the kernels doesn't support SquashFS with lzma, then we could see problems like this. I just want to be sure that on your compose host running a RAWHIDE kernel, when you mount the SquashFS image via loopback, the ext3fs.img inside it is readable also. Unless someone already knows whether you must pass specific options to mount for SquashFS images with lzma or something, we need this information, Kevin.
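The check requested above could be scripted roughly as follows. This is a Python sketch that only builds the command lines. The mount points under workdir are made up, and the LiveOS/squashfs.img and LiveOS/ext3fs.img paths are taken from the dracut trace earlier in this report. Actually running the commands needs root and loop device support.

```python
import os

def mount_chain(iso, workdir="/mnt/livetest"):
    """Return the mount commands for ISO -> squashfs.img -> ext3fs.img, in order."""
    iso_mp = os.path.join(workdir, "iso")
    sq_mp = os.path.join(workdir, "squashfs")
    ext_mp = os.path.join(workdir, "ext")
    return [
        ["mount", "-o", "loop,ro", "-t", "iso9660", iso, iso_mp],
        ["mount", "-o", "loop,ro", "-t", "squashfs",
         os.path.join(iso_mp, "LiveOS", "squashfs.img"), sq_mp],
        # No -t here: let mount probe the type, since ext3 vs ext4 is in question
        ["mount", "-o", "loop,ro",
         os.path.join(sq_mp, "LiveOS", "ext3fs.img"), ext_mp],
    ]

for cmd in mount_chain("desktop-i386-20100723.01.iso"):
    print(" ".join(cmd))
```

If the first two commands succeed and only the last one fails, that would point at the ext image rather than at the squashfs layer.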
Also Kevin, if you find that the nightlies' ext3fs.img is not mountable on the RAWHIDE host, would you mind building one with --compression-type=zlib passed to livecd-creator and then mounting those ISO9660, SquashFS and Ext3 filesystem images? I am curious whether the kernel module squashfs.ko does not support SquashFS with lzma yet, which might cause the issue and make the contained ext3fs.img unmountable and seemingly corrupt.
I had already tested this: the squashfs image mounts, the ext image doesn't. The default compressor for both squashfs-tools and livecd-creator is still zlib. If the kernel patches land for 2.6.36, I'll ask for an exception for the LZMA feature to change the default for livecd-creator to lzma, but the default for squashfs-tools will remain zlib. At this time there is no lzma compression support in the kernel, and I don't know if Lougher is planning to have some ready to submit for 2.6.36. There have been some xattr changes in squashfs support in the kernel. I am also in the process of getting a new squashfs-tools out that is synced up to upstream. The changes have been cleanup and better error handling for the xattr stuff, so probably not anything that will help with this bug.
In reply to Comment 35: According to fs.py around line 43, the mksquashfs default is LZMA, not ZLIB.

    def mksquashfs(in_img, out_img, compress_type):
        # Allow zlib to work for older versions of mksquashfs
        if compress_type == "zlib":
            args = ["/sbin/mksquashfs", in_img, out_img]
        else:
            args = ["/sbin/mksquashfs", in_img, out_img, "-comp", compress_type]
        if not sys.stdout.isatty():
            args.append("-no-progress")
        ret = subprocess.call(args)
        if ret != 0:
            raise SquashfsError("'%s' exited with error (%d)" %
                                (string.join(args, " "), ret))

Here you test if compress_type is "zlib", not "lzma". This is probably breaking the nightlies, making them not mountable and also not boot correctly. Looking at decompressor.c in the kernel tree for the latest kernel in koji, LZMA is still not supported:

    static const struct squashfs_decompressor squashfs_lzma_unsupported_comp_ops = {
        NULL, NULL, NULL, LZMA_COMPRESSION, "lzma", 0
    };

    static const struct squashfs_decompressor *decompressor[] = {
        &squashfs_zlib_comp_ops,
        &squashfs_lzma_unsupported_comp_ops,
        &squashfs_lzo_unsupported_comp_ops,
        &squashfs_unknown_comp_ops
    };
Seems like a bad way to do an if/else. Why not: if compress_type == "lzma" then mksquashfs -comp lzma, else mksquashfs (use its defaults)?
This should be like this instead:

    def mksquashfs(in_img, out_img, compress_type):
        # Allow zlib to work for older versions of mksquashfs
        if compress_type == "lzma":
            args = ["/sbin/mksquashfs", in_img, out_img, "-comp", compress_type]
        else:
            args = ["/sbin/mksquashfs", in_img, out_img]
        if not sys.stdout.isatty():
            args.append("-no-progress")
        ret = subprocess.call(args)
        if ret != 0:
            raise SquashfsError("'%s' exited with error (%d)" %
                                (string.join(args, " "), ret))

It makes much more sense not to change the default of zlib unless explicitly specified, especially when lzma support for squashfs is not available in the kernel. I suspect the ext3fs.img isn't mountable because the compose host may be defaulting to lzma, but I am not sure.
The default is zlib. If it weren't, things would have been broken long before July 15th. What you may be confused about is that the -comp option to mksquashfs is new in 4.1. Using it with older versions doesn't work and will cause mksquashfs to error out. So if we want zlib compression, we don't pass a -comp option to mksquashfs. This allows the current livecd-creator to work with older versions of squashfs-tools. The default compression type for mksquashfs will remain zlib in Fedora to help allow applications to be backwards compatible if desired. The default compression for livecd-creator will hopefully change to lzma at some point, but can't right now, as Lougher's lzma patches were not accepted in the past and neither he nor anyone else has provided acceptable ones. He plans to do so at some point, but there are no guarantees.
I don't see how this is working; it's -comp gzip, not -comp zlib:
[root@ip70-190-121-13 tmp]# mksquashfs /tmp /tmp/squashfs.img -comp zlib
FATAL_ERROR: Compressor "zlib" is not supported!
Compressors available:
        gzip (default)
        lzma
[root@ip70-190-121-13 tmp]# mksquashfs /tmp /tmp/squashfs.img -comp gzip
Parallel mksquashfs: Using 4 processors
Creating 4.0 filesystem on /tmp/squashfs.img, block size 131072.
[=================================================================================================================|] 1/1 100%
Exportable Squashfs 4.0 filesystem, gzip compressed, data block size 131072
        compressed data, compressed metadata, compressed fragments, compressed xattrs
Line 49 of live.py also seems to say zlib, not gzip:

    self.compress_type = "zlib"
    """mksquashfs compressor to use."""

Maybe we can fix these. You mentioned we would have seen something before the 14th if this were a problem, but I disagree. What you commit to git isn't packaged and put into the RAWHIDE repository, which the compose hosts use, until you rebuild the package, tag it, and submit it.
You are right about that being a bug. At some point I was inconsistent about using gzip and zlib (or mksquashfs changed what it called it and I didn't notice). It works because when default compression is used no -comp option is passed. However, this should be fixed.
I am going to fix the "zlib" to "gzip" issue right away. But that isn't what is causing this problem. squashfs-tools went into rawhide in early June. That is when we would have seen the problem.
In reply to Comment 42: This isn't the case. In /usr/bin/livecd-creator it seems we're using compress_type and setting a default:

    imgopt.add_option("", "--compression-type", type="string",
                      dest="compress_type",
                      help="Compression type recognized by mksquashfs (default zlib, lzma needs custom kernel)",
                      default="zlib")

The dest="compress_type" option is set to default "zlib". Maybe you meant default=None instead.
No. That is a style issue. I separated the compressor specified for livecd-creator from how we specify the one for mksquashfs in order to help support using older versions of mksquashfs. The default for livecd-creator is 'gzip' (now that I fixed the thinko you pointed out). The 'gzip' compressor is specified for mksquashfs by not using a -comp option. This works with both versions 4.0 and 4.1 of mksquashfs. There is a way to specify not using compression for livecd-creator, but in that case mksquashfs isn't used at all.
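As an illustration of the approach described in this comment (not the actual livecd-tools code), the argument handling might look like the Python sketch below. The function name, the show_progress parameter, and the treatment of "zlib" as an alias for gzip are assumptions. The -comp and -no-progress options and the /sbin/mksquashfs path come from the snippets quoted earlier in this report.

```python
def mksquashfs_args(in_img, out_img, compress_type="gzip", show_progress=True):
    """Build an mksquashfs command line that also works with squashfs-tools 4.0.

    gzip is mksquashfs's own default, so it is requested by passing no -comp
    option at all; per the discussion, 4.0 does not accept -comp and errors out.
    """
    args = ["/sbin/mksquashfs", in_img, out_img]
    if compress_type not in ("gzip", "zlib"):   # "zlib": older spelling, assumed alias
        args += ["-comp", compress_type]
    if not show_progress:
        args.append("-no-progress")
    return args

print(mksquashfs_args("src", "squashfs.img"))
print(mksquashfs_args("src", "squashfs.img", "lzma"))
```

The point of the design is that the compressor name used by livecd-creator never has to match a name mksquashfs recognizes in the default case, which is why the zlib/gzip naming mix-up did not break anything.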
Absolutely not the case. When you have an option and set a default, it is filled with that default whether or not you pass --compression-type= to livecd-creator. Both before the typo fixes were committed and still currently, the default will be whatever you set it to here:

    imgopt.add_option("", "--compression-type", type="string",
                      dest="compress_type",
                      help="Compression type recognized by mksquashfs (default zlib, lzma needs custom kernel)",
                      default="zlib")

Only if you use default=None does the value depend solely on whether someone passes --compression-type= to the program. Otherwise it currently defaults to gzip, which is fine.
These are examples of using default=None or default="some value".

    sysopt.add_option("-t", "--tmpdir", type="string",
                      dest="tmpdir", default="/var/tmp",
                      help="Temporary directory to use (default: /var/tmp)")
    sysopt.add_option("", "--cache", type="string",
                      dest="cachedir", default=None,
                      help="Cache directory to use (default: private cache")
    parser.add_option_group(sysopt)
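The behaviour being argued about can be checked in isolation with a short optparse script. This is a stand-alone sketch: the option names mirror the quoted livecd-creator snippets, but the parse helper and the choice of "gzip" as the default are only for illustration.

```python
from optparse import OptionParser

def parse(argv):
    """Parse argv with one option that has a string default and one with None."""
    parser = OptionParser()
    parser.add_option("--compression-type", type="string",
                      dest="compress_type", default="gzip")
    parser.add_option("--cache", type="string",
                      dest="cachedir", default=None)
    opts, _args = parser.parse_args(argv)
    return opts

# No options given: the string default is filled in, the None default stays None
print(parse([]).compress_type)
print(parse([]).cachedir)
# An explicit option overrides the default
print(parse(["--compression-type", "lzma"]).compress_type)
```

So a non-None default always reaches the code that builds the mksquashfs command line, which then has to decide whether that value means "pass no -comp option".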
In any case, on this bug it seems the nightly spins have an ext3fs.img which file magic reads as being of Ext4 filesystem type. I'm trying to find out if this is an Ext4 bug, or if it is not even supposed to be using mkfs.ext4, or what is going on. In imgcreate it uses mkfs. self.fstype, but directly below that it does some tune2fs stuff on the same image, which might be causing the problem.
The expected behavior is that gzip is the default for livecd-creator. If the lzma patches ever land in the kernel, and testing doesn't show problems with resource usage, the default will switch to lzma (for some future Fxx). But only for livecd-creator, mksquashfs will still have gzip as the default. And the test for gzip being handled by not including a -comp option will remain. I think the problem is more likely tied to e2fsprogs which was updated around the time the problem started showing up. I also think there is something arch related about the problem. nirik was trying to test if that was the case. Possibly there is some interaction with the gcc update as well.
I'm still puzzled on this issue and what's causing it.

/mnt2/LiveOS/ext3fs.img: Linux rev 1.0 ext4 filesystem data (extents) (large files) (huge files)

# mount -o loop /mnt2/LiveOS/ext3fs.img /mnt3
mount: wrong fs type, bad option, bad superblock on /dev/loop2,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so

dmesg:
JBD: no valid journal superblock found
EXT4-fs (loop2): error loading journal

I did try a while back a livecd-tools version that was before any changes this cycle, so I don't think it's a livecd-tools problem. I am wondering if it's a 32-bit vs 64-bit issue, i.e. 32 is working for folks, but the nightly compose machine is 64-bit. Happy to try anything else people can suggest.
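For anyone wanting to see why file calls the image ext4, the superblock flags can be read directly. The Python sketch below assumes the standard ext2/3/4 on-disk layout (superblock at byte offset 1024, magic 0xEF53 at offset 0x38, the three feature words starting at offset 0x5C). The classification rule is a simplification of what file does, and the function names are invented for illustration.

```python
import struct

EXT_MAGIC = 0xEF53
COMPAT_HAS_JOURNAL = 0x0004
INCOMPAT_EXTENTS = 0x0040
RO_COMPAT_LARGE_FILE = 0x0002
RO_COMPAT_HUGE_FILE = 0x0008

def read_superblock(path):
    """Read the 1024-byte superblock, which starts 1024 bytes into the image."""
    with open(path, "rb") as f:
        f.seek(1024)
        return f.read(1024)

def describe_superblock(sb):
    """Return a rough ext2/ext3/ext4 classification, or None if the magic is wrong."""
    (magic,) = struct.unpack_from("<H", sb, 0x38)
    if magic != EXT_MAGIC:
        return None
    compat, incompat, ro_compat = struct.unpack_from("<III", sb, 0x5C)
    if incompat & INCOMPAT_EXTENTS or ro_compat & RO_COMPAT_HUGE_FILE:
        kind = "ext4"          # simplified: ext4-only features present
    elif compat & COMPAT_HAS_JOURNAL:
        kind = "ext3"
    else:
        kind = "ext2"
    return {
        "kind": kind,
        "has_journal": bool(compat & COMPAT_HAS_JOURNAL),
        "extents": bool(incompat & INCOMPAT_EXTENTS),
        "large_file": bool(ro_compat & RO_COMPAT_LARGE_FILE),
        "huge_file": bool(ro_compat & RO_COMPAT_HUGE_FILE),
    }
```

Something like describe_superblock(read_superblock("/mnt2/LiveOS/ext3fs.img")) would then show which of the flags that file printed are actually set, and whether the journal flag is present at all.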
(In reply to comment #32)
> In reply to comment 31
> This has nothing to do with that bug.
>
> This has to do with the ext3fs.img not being mountable and reading file magic
> says it's Ext4 and failure to mount it when the system boots.

Nice - sorry I was confusing with bug 609049. TooManyLiveBootBugs... :-/
In reply to Comment 51: Please stop changing the bug around. This has not been confirmed to be caused by x86_64. Unless you want to provide some testing results and confirmation, leave the bug attributes alone.
i686 boots for me (need root=live:/dev/sr0 for ide, ie qemu).
Just to clarify my working i686 spins are on a i686 host - which is why I hadn't seen this issue yet.
desktop i686 doesn't boot for me and fails with "no root device found". Built on current (aka GA+updates) F13 i686, running in a 32-bit KVM guest. I tried "linux0 root=live:/dev/sr0" at the syslinux boot prompt.
This is due to the journal being broken in some way. Maybe it is an Ext4 bug only showing up after resize2fs. Bug 619020 has been filed to get some more eyes on it.
esandeen has concluded this is probably not an Ext4 filesystem bug. He has concluded that it appears to be a SquashFS bug instead. Bug 619020 covers this and has been reassigned to squashfs-tools maintainer.
This bug appears to have been reported against 'rawhide' during the Fedora 14 development cycle. Changing version to '14'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Discussed at today's blocker meeting. This is still a blocker. We realize it's a complex issue, but please be aware that this needs to be resolved in some way or another - we need to be able to generate working live images for x86-64 and i686 - by Tuesday 2010-08-03, or the Alpha will slip.
This is basically a duplicate now of bug 619020 for which a fixed squashfs-tools package is now available in Bodhi.
This bug was caused by 619020 which is now closed. *** This bug has been marked as a duplicate of bug 619020 ***