Bug 815827 - Connection to root iSCSI disk is disrupted during boot
Summary: Connection to root iSCSI disk is disrupted during boot
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: iscsi-initiator-utils
Version: 17
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Mike Christie
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker
Keywords:
Duplicates: 818656
Depends On:
Blocks: F17Blocker, F17FinalBlocker
Reported: 2012-04-24 15:39 UTC by Tim Flink
Modified: 2012-06-26 16:08 UTC
CC List: 7 users
Clone Of:
Last Closed: 2012-05-07 18:14:47 UTC


Attachments
boot log from updated system with iscsi target as root partition (111.82 KB, text/x-log)
2012-04-24 15:39 UTC, Tim Flink

Description Tim Flink 2012-04-24 15:39:12 UTC
Created attachment 579895
boot log from updated system with iscsi target as root partition

I have an F17 install with the following disk setup:
 /dev/sda (local sata disk)
  sda1 - 1MB BIOS boot
  sda2 - 500MB /boot
  sda3 - 4096MB swap

 /dev/sdb (iSCSI target)
  sdb1 - 20GB /

This started as a DVD install of F17 beta, but I used anaconda rescue mode to update the system, so it is current as of the filing of this bug.

When I boot the system, everything starts off fine until the iscsi disk is suddenly lost. At this point, the boot process pauses, and then a never-ending stream of IO errors starts showing up on the console.

I have attached the boot log from that system, captured with rd.debug and systemd.log_level=debug.

Comment 1 Tim Flink 2012-04-24 15:40:30 UTC
Proposing as a blocker for F17 final due to violation of the following F17 final release criterion [1]:

The installer must be able to complete an installation using any network-attached storage devices (e.g. iSCSI, FCoE, Fibre Channel)

[1] http://fedoraproject.org/wiki/Fedora_17_Final_Release_Criteria

Comment 2 Mike Christie 2012-04-24 19:50:05 UTC
Is this the same issue we are seeing in this bz:
https://bugzilla.redhat.com/show_bug.cgi?id=815355

Comment 3 Mike Christie 2012-04-24 19:56:40 UTC
Tim,

When you boot and you see those error messages, can you tell if the network is up on the initiator/host box? From some other box, can you ping it?

The iscsi initiator can deal with disruptions. With the default settings, the iscsi layer waits 120 secs before it fails IO. This is what this msg:

[  176.864039]  session1: session recovery timed out after 120 secs

indicates.

From the logs, it looks like we detected the connection issue:
[   56.336068]  connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294713618, last ping 4294718620, now 4294723632
We then tried to reconnect for 120 secs, but could not. We then failed IO.
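
As a rough cross-check of that timeline, the two kernel messages quoted above can be parsed and the intervals computed. This is only a sketch: the `timeline` helper and the `HZ` value are assumptions made for illustration (the bug does not state the kernel tick rate), and the "last rx", "last ping" and "now" fields are treated as jiffies counters.

```python
import re

# The two kernel messages quoted in this comment, copied verbatim.
LOG = """\
[   56.336068]  connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294713618, last ping 4294718620, now 4294723632
[  176.864039]  session1: session recovery timed out after 120 secs
"""

HZ = 1000  # ASSUMPTION: kernel tick rate in jiffies per second; not stated in the bug


def timeline(log, hz=HZ):
    """Hypothetical helper: derive the intervals implied by the two messages."""
    ping = re.search(
        r"\[\s*([\d.]+)\].*ping timeout.*"
        r"last rx (\d+), last ping (\d+), now (\d+)",
        log,
    )
    recovery = re.search(r"\[\s*([\d.]+)\].*session recovery timed out", log)
    if ping is None or recovery is None:
        raise ValueError("expected both a ping timeout and a recovery timeout line")

    detected_at = float(ping.group(1))
    failed_at = float(recovery.group(1))
    last_rx, last_ping, now = (int(ping.group(i)) for i in (2, 3, 4))

    return {
        # Time between detecting the dead connection and failing IO.
        "recovery_window_s": failed_at - detected_at,
        # Idle time after the last received data before a ping was sent.
        "rx_to_ping_s": (last_ping - last_rx) / hz,
        # Time the ping went unanswered before the timeout fired.
        "ping_to_timeout_s": (now - last_ping) / hz,
    }


if __name__ == "__main__":
    for name, seconds in timeline(LOG).items():
        print(f"{name}: {seconds:.3f}")
```

Under the HZ assumption, the gap between the two messages is a little over 120 secs, and both jiffies deltas are a little over 5 secs. That would be consistent with the iscsid.conf defaults node.session.timeo.replacement_timeout = 120, node.conn[0].timeo.noop_out_interval = 5 and node.conn[0].timeo.noop_out_timeout = 5, though the settings actually in use on this system are not shown in the bug.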


I will try to replicate here and debug some more.

Comment 4 Tim Flink 2012-04-24 20:23:27 UTC
(In reply to comment #3)
> When you boot and you see those error messages, can you tell if the network is
> up on the initiator/host box? From some other box, can you ping it?

I tried pinging the box repeatedly while it was booting. It started responding to the pings when em1 came up with dracut, but the pings started failing shortly before this error:

> [   56.336068]  connection1:0: ping timeout of 5 secs expired, recv timeout 5,
> last rx 4294713618, last ping 4294718620, now 4294723632

Comment 5 Tim Flink 2012-04-26 19:19:11 UTC
I'm +1 blocker on this due to the F17 final criterion listed in c#1

Comment 6 Tim Flink 2012-05-01 21:42:08 UTC
Accepted as a Fedora 17 final blocker as it violates the following final release criterion [1]:

The installer must be able to complete an installation using any network-attached storage devices (e.g. iSCSI, FCoE, Fibre Channel)

[1] https://fedoraproject.org/wiki/Fedora_17_Final_Release_Criteria

Comment 7 Jiri Popelka 2012-05-04 08:22:59 UTC
Yes, this seems to be the same problem as in bug #815355.
dhcp-4.2.4-0.4.rc1.fc17 (heading to stable atm) should fix it.

Comment 8 Tim Flink 2012-05-04 16:32:59 UTC
I did another install using TC2 + updates-testing (which pulls in dhcp-4.2.4-0.4.rc1.fc17) and I am no longer seeing the same issue.

Comment 9 Tim Flink 2012-05-04 16:41:14 UTC
(In reply to comment #8)
> I did another install using TC2 + updates-testing (which pulls in
> dhcp-4.2.4-0.4.rc1.fc17) and I am no longer seeing the same issue.

I suppose that I could have been a little more specific - the system successfully boots post-install and this looks to be fixed.

Will test again after the dhcp build has been pushed to stable but moving to ON_QA for now.

Comment 10 Tim Flink 2012-05-07 18:14:47 UTC
Verified fix with Fedora 17 Final TC3 - closing bug

Comment 11 Vivek Goyal 2012-05-08 17:51:03 UTC
*** Bug 818656 has been marked as a duplicate of this bug. ***

