Bug 1462150
| Summary: | mkfs.msdos fails to make a filesystem | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Lubos Kocman <lkocman> |
| Component: | lorax | Assignee: | Brian Lane <bcl> |
| Status: | CLOSED ERRATA | QA Contact: | Release Test Team <release-test-team-automation> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.4 | CC: | bcl, jkonecny, jstodola, lkocman, pkotvan, sbueno |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | lorax-19.6.95-1 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-04-10 17:38:04 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Lubos Kocman
2017-06-16 10:36:01 UTC
This looks like a problem with the host.

    DEBUG util.py:417: mount: /dev/loop0: can't read superblock
    DEBUG util.py:417: losetup: /dev/loop0: detach failed: No such device or address

There is NO way that the changes from lorax-19.6.90-1 to lorax-19.6.91-1 could have caused a problem. They were this single commit:

https://github.com/rhinstaller/lorax/commit/31fec67150b0f5d92f3ac4be88f7eb7d387edbf3

which only touches the documentation and example kickstarts.

This nightly log has more details. I'm pretty sure it's the same thing, but with the important details included:

http://download-node-02.eng.bos.redhat.com/nightly/RHEL-7.4-20170620.n.0/logs/x86_64/buildinstall-Workstation-logs/program.log

    Running... mkefiboot --label=ANACONDA /mnt/redhat/nightly/RHEL-7.4-20170620.n.0/work/x86_64/buildinstall/Workstation/EFI/BOOT /mnt/redhat/nightly/RHEL-7.4-20170620.n.0/work/x86_64/buildinstall/Workstation/images/efiboot.img
    mkfs.fat 3.0.20 (12 Jun 2013)
    Loop device does not match a floppy size, using default hd params
    ERROR:program:mkfs.msdos: Attempting to create a too large filesystem
    mkfs.msdos: Attempting to create a too large filesystem
    ERROR:pylorax.imgutils:mkfs exited with a non-zero return code: 1
    ERROR:pylorax.imgutils:None
    ERROR:program:losetup: /dev/loop0: detach failed: No such device or address
    losetup: /dev/loop0: detach failed: No such device or address

So the mkfs.msdos failed for some reason. It doesn't happen with any of the other variations that I looked at (Server, Client). The directory it is pulling from has the same content as the others, only about 8.4M of data.

Conclusions that I've come to:

1. mkfs.msdos is failing because it thinks /dev/loop0 is too big for a FAT filesystem. The error returned happens when there are no suitable FAT formats for the size of the device.
2. The contents of EFI/BOOT are small, so that's not the problem.
3. If the loop0 device is too small for the contents, it will create the filesystem but the copy will fail, and you will get a different error than what we see.
4. If the size estimate for EFI/BOOT is wrong, it could create too big a /dev/loop0, but I don't see how that could happen.
5. If /dev/loop0 wasn't fully set up when mkfs.msdos runs on it, it may have trouble. I don't really see how that could happen either, but I ran 20k passes of a test script on 2 systems and saw no failures. Not proof, but no smoking gun either.
6. If you run mkfs.msdos /dev/loop0 without setting up the loop, you get the error we are seeing in the logs: mkfs.msdos: Attempting to create a too large filesystem

So my conclusion is that somehow the mkfs.msdos call runs before /dev/loop0 is completely set up. I'm not sure *how* it is happening, but it seems to fit the results.

Here's a patch that may fix it. I'm not 100% sure since I cannot reproduce it, but at the least it shouldn't make it any worse:

https://github.com/rhinstaller/lorax/pull/221

*** Bug 1464962 has been marked as a duplicate of this bug. ***

Note about most recent failures with 19.6.94 -- it doesn't appear (to me) to be any worse than .92; it worked in some places and not others. The reason for *this* failure (which is not related to the previous failures) is that the check I added does not take into account the inconsistent results from losetup.

Output from program.log on working runs:

    Running... losetup --find --show /var/tmp/lorax.rupsah/installroot/images/runtime-workdir/LiveOS/rootfs.img
    /dev/loop0
    Running... udevadm settle --timeout 300
    Running... losetup --list -O BACK-FILE /dev/loop0
    BACK-FILE
    /var/tmp/lorax.rupsah/installroot/images/runtime-workdir/LiveOS/rootfs.img

When the new test fails, it looks like this:

    Running... losetup --find --show /var/tmp/lorax.Y7tnwi/installroot/images/runtime-workdir/LiveOS/rootfs.img
    /dev/loop0
    Running... udevadm settle --timeout 300
    Running... losetup --list -O BACK-FILE /dev/loop0
    BACK-FILE
    /var/tmp/lorax.Y7tnwi/installroot/images/runtime-workdir/LiveO*

It turns out that the losetup code in lib/loopdev.c truncates the output to 64 chars, with a * at the end, if it is unable to get the backing store via sysfs. Why can't it get it via sysfs? Unknown at this time.

This is still a problem. I'm going to revert the build in the errata to 19.6.92-1.el7 so we don't accidentally ship what we didn't compose. I expect that we will not risk any more respins for RC.

(from http://download-node-02.eng.bos.redhat.com/brewroot/work/tasks/8731/13578731/root.log)

    Installing:
     lorax x86_64 19.6.94-1.el7 build 169 k
     strace x86_64 4.12-4.el7 build 458 k

(from http://download-node-02.eng.bos.redhat.com/brewroot/work/tasks/8731/13578731/runroot.log)

    + rm -rf /mnt/redhat/devel/candidate-trees/RHEL-7.4-20170630.0/work/x86_64/buildinstall/Client
    + lorax '--product=Red Hat Enterprise Linux' --version=7.4 --release=7.4 --source=file:///mnt/redhat/devel/candidate-trees/RHEL-7.4-20170630.0/work/x86_64/repo --variant=Client --nomacboot --isfinal --buildarch=x86_64 '--volid=RHEL-7.4 Client.x86_64' --logfile=/mnt/redhat/devel/candidate-trees/RHEL-7.4-20170630.0/logs/x86_64/buildinstall-Client-logs/lorax.log /mnt/redhat/devel/candidate-trees/RHEL-7.4-20170630.0/work/x86_64/buildinstall/Client
    ...
    2017-06-30 04:46:10,521: writing .buildstamp file
    writing .buildstamp file
    2017-06-30 04:46:11,074: doing post-install configuration
    doing post-install configuration
    2017-06-30 04:46:11,091: running runtime-postinstall.tmpl
    running runtime-postinstall.tmpl
    Operation failed: No such file or directory
    2017-06-30 04:46:11,316: writing .discinfo file
    writing .discinfo file
    2017-06-30 04:46:11,327: backing up installroot
    backing up installroot
    2017-06-30 04:46:11,920: generating kernel module metadata
    generating kernel module metadata
    2017-06-30 04:46:11,920: doing depmod and module-info for 3.10.0-691.el7.x86_64
    doing depmod and module-info for 3.10.0-691.el7.x86_64
    2017-06-30 04:46:14,365: cleaning unneeded files
    cleaning unneeded files
    2017-06-30 04:46:14,395: running runtime-cleanup.tmpl
    running runtime-cleanup.tmpl
    2017-06-30 04:46:17,535: creating the runtime image
    creating the runtime image
    Traceback (most recent call last):
      File "/usr/sbin/lorax", line 337, in <module>
        main(sys.argv)
      File "/usr/sbin/lorax", line 235, in main
        remove_temp=True)
      File "/usr/lib/python2.7/site-packages/pylorax/__init__.py", line 301, in run
        compression=compression, compressargs=compressargs)
      File "/usr/lib/python2.7/site-packages/pylorax/treebuilder.py", line 165, in create_runtime
        "Anaconda", size=size)
      File "/usr/lib/python2.7/site-packages/pylorax/imgutils.py", line 100, in mkrootfsimg
        mkext4img(rootdir, outfile, label=label, size=fssize)
      File "/usr/lib/python2.7/site-packages/pylorax/imgutils.py", line 411, in mkext4img
        mkfsargs=["-L", label, "-b", "1024", "-m", "0"], graft=graft)
      File "/usr/lib/python2.7/site-packages/pylorax/imgutils.py", line 388, in mkfsimage
        with LoopDev(outfile, size) as loopdev:
      File "/usr/lib/python2.7/site-packages/pylorax/imgutils.py", line 293, in __enter__
        self.loopdev = loop_attach(self.filename)
      File "/usr/lib/python2.7/site-packages/pylorax/imgutils.py", line 153, in loop_attach
        loop_waitfor(dev.strip(), outfile)
      File "/usr/lib/python2.7/site-packages/pylorax/imgutils.py", line 145, in loop_waitfor
        raise RuntimeError("Unable to setup %s on %s" % (loop_dev, outfile))
    RuntimeError: Unable to setup /dev/loop3 on /var/tmp/lorax.zM_jw5/installroot/images/runtime-workdir/LiveOS/rootfs.img

PR for the final patch - https://github.com/rhinstaller/lorax/pull/231

Hi Lubos, can you please confirm that the issue is fixed in the current version of lorax? Thanks.

*** Bug 1543325 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0947 |