Bug 1227537

Summary: dracut generates initramfs which fails to boot for kernels newer than 4.1.0-0.rc4.git1.1.fc23
Product: [Fedora] Fedora
Reporter: Michal Jaegermann <michal>
Component: systemd
Assignee: systemd-maint
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: rawhide
CC: dracut-maint-list, eblake, harald, johannbg, jonathan, jsynacek, juhani.jaakola, lnykryn, msekleta, redhat, s, systemd-maint, udovdh, zbyszek
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2015-06-12 22:36:21 UTC
Type: Bug
Attachments:
  a screen image from a failure to boot (flags: none)
  dracut generated rdsosreport.txt for 4.1.0-0.rc6.git0.1.fc23.x86_64 kernel (flags: none)
  git fa05e972 v2 for rawhide 202-5 (flags: none)

Description Michal Jaegermann 2015-06-03 00:20:57 UTC
Created attachment 1034048
a screen image from a failure to boot

Description of problem:

The last kernel which boots without any issues is 4.1.0-0.rc4.git1.1.fc23.x86_64.  For kernels 4.1.0-0.rc5.git0.1.fc23.x86_64 and 4.1.0-0.rc6.git0.1.fc23.x86_64 a boot ends up in an emergency shell because systemd-fsck-root.service failed on a root file system (specified by label).  It says "Exit the shell to continue" but if one tries that then it responds with "Failed to start default target. Transaction is destructive" and gets stuck.  A picture of a screen with a failed boot is attached.

Asking for a status of systemd-fsck-root.service results in the following:

● systemd-fsck-root.service - File System Check on /dev/disk/by-label/\x2f1
   Loaded: loaded (/run/systemd/generator/systemd-fsck-root.service)
   Active: failed (Result: exit-code) since Tue 2015-06-02 22:44:43 UTC; 7min ago
     Docs: man:systemd-fsck-root.service(8)
  Process: 291 ExecStart=/usr/lib/systemd/systemd-fsck /dev/disk/by-label//1 (code=exited, status=1/FAILURE)
 Main PID: 291 (code=exited, status=1/FAILURE)

Jun 02 22:44:43 YYY systemd[1]: Starting File System Check on /dev/disk/by-label/\x2f1...
Jun 02 22:44:43 YYY systemd-fsck[291]: Failed to stat /dev/disk/by-label//1: No such file or directory
Jun 02 22:44:43 YYY systemd[1]: systemd-fsck-root.service: Main process exited, code=exited, status=1/FAILURE
Jun 02 22:44:43 YYY systemd[1]: Failed to start File System Check on /dev/disk/by-label/\x2f1.
Jun 02 22:44:43 YYY systemd[1]: systemd-fsck-root.service: Unit entered failed state.
Jun 02 22:44:43 YYY systemd[1]: systemd-fsck-root.service: Failed with result 'exit-code'.

OTOH a direct fsck on the file system named in that "Failed to start File System Check on /dev/disk/by-label/\x2f1" message runs without any issues.  Moreover, the attached rdsosreport.txt was saved on this "non-existing" file system after it was mounted manually.

Version-Release number of selected component (if applicable):
dracut-041-10.git20150219.fc23.x86_64

How reproducible:
always

Steps to Reproduce:
1. try to boot with a newer kernel


Additional info: 
Recently systemd was updated to systemd-220-3.fc23.x86_64.  However, initramfs-4.1.0-0.rc5.git0.1.fc23.x86_64.img was created _before_ systemd was updated, and the initramfs for 4.1.0-0.rc4.git1.1.fc23.x86_64 luckily still boots.  Maybe kernel changes are to blame?  I do not really know.  The whole boot mechanism is becoming maze-like.

Comment 1 Michal Jaegermann 2015-06-03 00:22:30 UTC
Created attachment 1034050
dracut generated rdsosreport.txt for 4.1.0-0.rc6.git0.1.fc23.x86_64 kernel

Comment 2 Michal Jaegermann 2015-06-03 00:40:25 UTC
Replacing in a boot command "root=LABEL=/1" with "root=UUID=..." allows me to boot again.  No idea what is screwing up but this is a nasty bug.
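
The workaround above amounts to a one-argument change on the kernel command line. A minimal sketch of that substitution follows; the helper name replace_root_label, the sample command line and the placeholder UUID are assumptions made for illustration, and the real UUID would have to be read from the affected machine (for example with blkid).

```python
import re


def replace_root_label(cmdline: str, uuid: str) -> str:
    """Replace a root=LABEL=... argument with root=UUID=<uuid>.

    Hypothetical helper for illustration only; every other argument on
    the command line is left untouched.
    """
    return re.sub(
        r"(^|\s)root=LABEL=\S+",
        lambda m: m.group(1) + "root=UUID=" + uuid,
        cmdline,
    )


# Placeholder value, not a UUID taken from this report.
example_uuid = "00000000-0000-0000-0000-000000000000"
print(replace_root_label("ro root=LABEL=/1 rhgb quiet", example_uuid))
```

With root=UUID= the device path contains no escaped characters, which is presumably why this form is not affected.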

Comment 3 Harald Hoyer 2015-06-03 09:49:17 UTC
/dev/disk/by-label:
total 0
lrwxrwxrwx 1 root 0 10 Jun  2 22:44 \x2f1

So, it seems /dev/disk/by-label/\x2f1 exists, but systemd unescapes it to /dev/disk/by-label//1

Reassigning to systemd.
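
The mismatch described in comment 3 can be sketched as follows. This is an illustration only, not systemd or udev code: the function names are made up, and the encoder is simplified to escape just "/" and "\" (the real rules cover more characters).

```python
import re


def udev_encode_label(label: str) -> str:
    """Encode characters that cannot appear in a file name as \\xNN.

    Simplified assumption: only '/' and '\\' are escaped here.
    """
    return "".join(
        "\\x%02x" % ord(ch) if ch in "/\\" else ch
        for ch in label
    )


def unescape_hex(name: str) -> str:
    """Turn every \\xNN sequence back into the character it stands for."""
    return re.sub(
        r"\\x([0-9a-fA-F]{2})",
        lambda m: chr(int(m.group(1), 16)),
        name,
    )


label = "/1"                          # from root=LABEL=/1
link_name = udev_encode_label(label)  # name of the symlink under by-label
on_disk = "/dev/disk/by-label/" + link_name
looked_up = "/dev/disk/by-label/" + unescape_hex(link_name)

print(on_disk)    # /dev/disk/by-label/\x2f1
print(looked_up)  # /dev/disk/by-label//1
```

The first path is the symlink that exists; the second is the path that fails to stat in the systemd-fsck log from the description.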

Comment 4 Harald Hoyer 2015-06-03 09:52:40 UTC
https://github.com/systemd/systemd/issues/50

Comment 5 Michal Jaegermann 2015-06-03 15:46:44 UTC
(In reply to Harald Hoyer from comment #3)
> 
> So, it seems /dev/disk/by-label/\x2f1 exists, but systemd unescapes it to
> /dev/disk/by-label//1

It seemed to me originally that both initramfs-4.1.0-0.rc4.git1.1.fc23.x86_64.img (booting) and initramfs-4.1.0-0.rc5.git0.1.fc23.x86_64.img (failing to boot) were made with the same version of systemd.  It looks like I was mistaken: the older initramfs was created with systemd-219-12.fc23 and the next one, which failed to boot, with systemd-220-1.fc23.  The one currently installed is systemd-220-3.fc23.

Comment 6 Daniel Mack 2015-06-05 07:42:44 UTC
This should now be fixed upstream with commit fa05e97257. Could you give that a try?

Comment 7 Michal Jaegermann 2015-06-05 15:21:44 UTC
(In reply to Daniel Mack from comment #6)
> This should now be fixed upstream with commit fa05e97257. Could you give
> that a try?

Is there any way to see that particular commit without cloning the whole repository from github?  Nothing strikes me as "obvious".  I am not even sure whether github is the place I should be looking at.

Comment 9 Michal Jaegermann 2015-06-05 21:03:00 UTC
(In reply to Daniel Mack from comment #6)
> This should now be fixed upstream with commit fa05e97257. Could you give
> that a try?

Yes, indeed, with this change there are no more troubles with systemd-fsck-root.service and a root file system specified by label.  With a new initramfs my machine boots.

The only catch is that fa05e97257.diff is not in a form acceptable to systemd.spec.  Once I hacked around that, a recompilation time of over three hours is deep into ridiculous territory.  It is good that I had to go somewhere while this compilation was running, or I would have been convinced long ago that something went haywire.  The 1.4G of disk space taken by this compilation does not help very much either.  You should have warned me.

Comment 10 Michal Jaegermann 2015-06-08 18:51:29 UTC
A just updated systemd-220-5.fc23 sports this bug too.

Comment 11 poma 2015-06-08 21:23:30 UTC
Created attachment 1036469
git fa05e972 v2 for rawhide 202-5

Comment 12 Michal Jaegermann 2015-06-12 22:36:21 UTC
systemd-220-8.fc23 does not seem to suffer from that affliction anymore.

Comment 13 Juhani Jaakola 2015-07-29 13:37:17 UTC
I have the same problem in Fedora 22 with versions:

kernel-4.0.8-300.fc22.i686 and kernel-4.1.2-200.fc22.i686
systemd-219-19.fc22.i686

The workaround in comment 2 (changing the root parameter in boot command line) works for me.

Comment 14 Michal Schmidt 2017-10-10 14:49:29 UTC
*** Bug 1313463 has been marked as a duplicate of this bug. ***