Bug 1098499

Summary: add "--systemd-mark" to nbd-client arguments
Product: [Fedora] Fedora
Reporter: Harald Hoyer <harald>
Component: dracut
Assignee: dracut-maint-list
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: unspecified
Version: 20
CC: den.mail, dracut-maint-list, enslaver, gansalmon, gregory.lee.bartholomew, harald, i, itamar, johannbg, jonathan, kernel-maint, lnykryn, madhu.chinakonda, msekleta, plautrba, systemd-maint, vpavlin, wtogami, xjakub, zbyszek
Hardware: All
OS: Linux
Doc Type: Bug Fix
Clone Of: 1058111
Last Closed: 2015-06-30 01:19:27 UTC
Type: Bug
Bug Depends On: 1058111

Description Harald Hoyer 2014-05-16 11:24:05 UTC
+++ This bug was initially created as a clone of Bug #1058111 +++

Description of problem:
My computer has no hard drive, so it boots up via PXE on the network, then the root FS is mounted via nbd at boot (dracut network nbd in initramfs).
When I type "poweroff", the computer shuts down all services until it kills the nbd-client service. That takes the root FS away and leads to a kernel panic, after which the computer stays on indefinitely with a kernel panic trace on screen.
This did not happen on Fedora 18 i686 PAE.
It seems that nbd-client gets killed too early in the poweroff procedure.


Version-Release number of selected component (if applicable):
nbd-3.6-1.fc20.i686
dracut-network-034-64.git20131205.fc20.1.i686
kernel-PAE-3.12.8-300.fc20.i686


PXE parameters :
label Fedora20_i686
menu label Fedora20 i686 PAE
kernel vmlinuz-3.12.8-300.fc20.i686+PAE
initrd initramfs-3.12.8-300.fc20.i686+PAE.img
append rw netroot=nbd:192.168.56.100:2014:ext3 root=UUID=8ad5f702-4ecb-4e5f-ac8a-912641c7d8f9 ip=dhcp vga=865 quiet selinux=0

How reproducible:
Always

Steps to Reproduce:
1. Install Fedora on a mounted nbd target (ext3 formatted in my case).
2. chroot into the newly installed system and run: yum install dracut-network nbd
3. Put the rootfs (nbd) UUID in /etc/fstab.
4. dracut --kver 3.12.8-300.fc20.i686+PAE -f /boot/initramfs-3.12.8-300.fc20.i686+PAE.img
5. Prepare another machine to serve tftp and dhcp so the computer can PXE boot.
6. Prepare the PXE menu (as described above) on this other machine.
7. Boot the computer via PXE; it should find its NBD root device and mount it.
8. Log in as root, then type poweroff.

Actual results:
The computer shuts down all services until it kills nbd-client, which takes its root filesystem away.
The kernel then panics with the following output:

[ok] Reached target Unmount all filesystems
[ok] Stopped monitoring of LVM2 [...]
     Stopping LVM2 metadata daemon ...
[ok] Stopped LVM2 metadata daemon
[ok] starting restore /run/initramfs
[ok] reached target shutdown
[kernel.time] block nbd0: receive control failed (result -4)
[kernel.time] block nbd0: attempted send on closed socket
[kernel.time] end_request: I/O error, dev nbd0, sector 6680
[kernel.time] ---------[ cut here ] -------
[kernel.time] kernel bug at fs/buffer.c:3015!
[kernel.time] invalid opcode: 0000 [#1] SMP
[... kernel trace ...]
[kernel.time] ---------[ end trace ] --------
[kernel.time] kernel panic - not syncing: attempted to kill init! exitcode=0x0000000b


Expected results:
The system should shut down cleanly (disconnecting NBD at the very end of the procedure) and the computer should power off.


Additional info:

--- Additional comment from Josh Boyer on 2014-01-27 09:51:44 EST ---

The kernel panicked because init died.  The init process is responsible for shutdown order, so perhaps the systemd guys have a suggestion here.

--- Additional comment from Lennart Poettering on 2014-01-27 19:44:53 EST ---

Well, if nbd wants to survive the final killing spree, then it needs to be run from the initrd and mark itself appropriately.

http://www.freedesktop.org/wiki/Software/systemd/RootStorageDaemons/

Otherwise it will be killed, and thus the root fs goes away, and thus the init process might crash, which causes the kernel to panic.

Reassigning to nbd.
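
As a rough illustration of the marking convention described on that page: a process that should survive the final kill is recognised by the first character of its argv[0] being '@'. The sketch below checks this from a shell by reading a cmdline file such as /proc/PID/cmdline. The function names are made up for this example, and systemd's real check may apply further conditions.

```shell
# Return 0 if the cmdline file given as $1 (e.g. /proc/PID/cmdline)
# starts with '@', i.e. the process marked itself as a root storage daemon.
survives_shutdown() {
    first=$(dd if="$1" bs=1 count=1 2>/dev/null)
    [ "$first" = "@" ]
}

# Print the PID and command line of every marked process (hypothetical helper).
list_marked_processes() {
    for f in /proc/[0-9]*/cmdline; do
        if survives_shutdown "$f"; then
            pid=${f#/proc/}
            pid=${pid%/cmdline}
            printf '%s %s\n' "$pid" "$(tr '\0' ' ' < "$f" 2>/dev/null)"
        fi
    done
}
```

Running list_marked_processes on a system booted from nbd should list nbd-client only if it was started from the initramfs with the mark set.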

--- Additional comment from Christopher Meng on 2014-01-28 04:16:57 EST ---

I will backport a fix later, maybe on Thursday.

--- Additional comment from Fedora Update System on 2014-02-03 22:08:27 EST ---

nbd-3.7-2.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/nbd-3.7-2.fc20

--- Additional comment from Christopher Meng on 2014-02-03 22:11:33 EST ---

Sorry for the belated update; I have been celebrating Chinese New Year.

After this update, please see the -m option and its manpage entry, test whether it still crashes, and leave karma in Bodhi. Or tell me directly here if you don't have a Fedora account.

Thanks!

--- Additional comment from Fedora Update System on 2014-02-04 22:41:15 EST ---

Package nbd-3.7-2.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing nbd-3.7-2.fc20'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-2024/nbd-3.7-2.fc20
then log in and leave karma (feedback).

--- Additional comment from Dag on 2014-02-12 00:37:04 EST ---

Hi,


I updated to nbd-3.7-2.fc20 from the Fedora 20 testing repository.

Then I updated the kernel to 3.12.10-300-fc20.i686+PAE.
Then I rebooted on the new kernel/initramfs.

I then ran "dracut --force -vv" to be sure to have the updated nbd-client in the initramfs.

I then rebooted on the new initramfs: I still get the very same error on shutdown (poweroff/reboot).


So this update didn't fix the issue: nbd-client still seems to get killed too early when shutting down, resulting in a kernel panic.


Regards,
Daggett

--- Additional comment from Harald Hoyer on 2014-03-14 06:37:54 EDT ---

nbd-client is started in the dracut-initqueue.service from a shell script in the initramfs.

So a dracut patch is needed to add the "-m" option to the nbd-client call.

You can also mirror the behavior of mdmon.

        if (in_initrd()) {
                /*
                 * set first char of argv[0] to @. This is used by
                 * systemd to signal that the task was launched from
                 * initrd/initramfs and should be preserved during shutdown
                 */
                argv[0][0] = '@';
        }

...

int in_initrd(void)
{
        /* This is based on similar function in systemd. */
        struct statfs s;
        return  statfs("/", &s) >= 0 &&
                ((unsigned long)s.f_type == TMPFS_MAGIC ||
                 (unsigned long)s.f_type == RAMFS_MAGIC);
}
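
The in_initrd() test above can be approximated from a shell script, which may be handy when checking by hand whether a script is running inside the initramfs. This is only a sketch: is_initrd_fstype is a name invented here, and the in_initrd wrapper assumes a stat that supports "-f -c %T" (GNU coreutils does).

```shell
# Return 0 if the filesystem type name in $1 is one an initramfs root uses.
is_initrd_fstype() {
    case "$1" in
        tmpfs|ramfs) return 0 ;;
        *) return 1 ;;
    esac
}

# Shell approximation of in_initrd(): look at the filesystem type of "/".
in_initrd() {
    is_initrd_fstype "$(stat -f -c %T / 2>/dev/null)"
}
```

On a normally booted system with an ext3 root, in_initrd is expected to return non-zero.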

--- Additional comment from Fedora Update System on 2014-03-20 22:29:39 EDT ---

nbd-3.8-1.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/nbd-3.8-1.fc20

--- Additional comment from Fedora Update System on 2014-04-02 05:11:46 EDT ---

nbd-3.8-1.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

--- Additional comment from Dag on 2014-04-06 08:23:18 EDT ---

Hi,

I updated the system with the new nbd package (nbd-3.8-1.fc20).

Then I updated kernel to 3.13.8.200.fc20.i686+PAE.

Then I rebooted on new kernel/initramfs.

Then I ran "dracut --force -vv" to be sure to have the updated nbd-client in the initramfs.

I then rebooted on the new initramfs: I still get the very same error on shutdown (poweroff/reboot).


So this update didn't fix the issue: nbd-client still seems to get killed too early when shutting down, resulting in a kernel panic.


As Harald Hoyer said, dracut may need an update.


I filed a new bug report, as I believe this is NOT an nbd issue anymore:

https://bugzilla.redhat.com/show_bug.cgi?id=1084763


Maybe this bug report could be closed as "fixed" for nbd?
 or reassigned to systemd?
 or reassigned to dracut?


regards,
Dag

Comment 2 Dag 2014-05-16 15:07:53 UTC
Hi,

Thanks for submitting this dracut modification.


I replaced my /usr/lib/dracut/modules.d/95nbd/nbdroot.sh with your modified version (from the link in your message to git.kernel.org) :

# cp /usr/lib/dracut/modules.d/95nbd/nbdroot.sh /usr/lib/dracut/modules.d/95nbd/nbdroot.sh.orig
# > /usr/lib/dracut/modules.d/95nbd/nbdroot.sh
# vi /usr/lib/dracut/modules.d/95nbd/nbdroot.sh

I then rebuilt the initramfs :
# dracut --force -vv

I then copied the new initramfs to the TFTP server, then restarted the computer.

It booted flawlessly.

Then I logged in as root and typed "poweroff": the computer powered off successfully, with no kernel panic or other failure.

I also verified that the NBD image filesystem was unmounted cleanly : it was.

So the issue is solved for me.


Thanks again,
Daggett
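
Before rebuilding the initramfs as in the steps above, one could check that the replaced script really passes the marking option to nbd-client. The sketch below is a crude text match only: has_systemd_mark is a made-up helper name, and an unrelated "-m" elsewhere in the file would also match.

```shell
# Return 0 if the script file in $1 mentions the systemd-mark option,
# written as -m, -systemd-mark or --systemd-mark.
has_systemd_mark() {
    grep -Eq -e '-{1,2}systemd-mark' \
             -e '(^|[[:space:]])-m([[:space:]]|$)' "$1"
}
```

For example: has_systemd_mark /usr/lib/dracut/modules.d/95nbd/nbdroot.sh && dracut --force -vv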

Comment 3 Dag 2014-05-16 15:16:15 UTC
*** Bug 1084763 has been marked as a duplicate of this bug. ***

Comment 4 Fedora End Of Life 2015-05-29 11:52:00 UTC
This message is a reminder that Fedora 20 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 20. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '20'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 20 reached end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 5 Fedora End Of Life 2015-06-30 01:19:27 UTC
Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.