Bug 815827 - Connection to root iSCSI disk is disrupted during boot
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: iscsi-initiator-utils
Version: 17
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Mike Christie
QA Contact: Fedora Extras Quality Assurance
Whiteboard: AcceptedBlocker
Duplicates: 818656
Depends On:
Blocks: F17Blocker/F17FinalBlocker
Reported: 2012-04-24 11:39 EDT by Tim Flink
Modified: 2012-06-26 12:08 EDT

Doc Type: Bug Fix
Last Closed: 2012-05-07 14:14:47 EDT
Type: Bug


Attachments
boot log from updated system with iscsi target as root partition (111.82 KB, text/x-log)
2012-04-24 11:39 EDT, Tim Flink

Description Tim Flink 2012-04-24 11:39:12 EDT
Created attachment 579895
boot log from updated system with iscsi target as root partition

I have an F17 install with the following disk setup:
 /dev/sda (local sata disk)
  sda1 - 1MB BIOS boot
  sda2 - 500MB /boot
  sda3 - 4096MB swap

 /dev/sdb (iSCSI target)
  sdb1 - 20GB /

This started as a DVD install of F17 Beta, but I used anaconda rescue mode to update the system, so it is current as of the filing of this bug.

When I boot the system, everything starts off fine until the iscsi disk is suddenly lost. At this point, the boot process pauses before a never-ending stream of IO errors starts showing up on the console.

I have attached the boot log from that system, booted with rd.debug and systemd.log_level=debug.
Comment 1 Tim Flink 2012-04-24 11:40:30 EDT
Proposing as a blocker for F17 final due to violation of the following F17 final release criterion [1]:

The installer must be able to complete an installation using any network-attached storage devices (e.g. iSCSI, FCoE, Fibre Channel)

[1] http://fedoraproject.org/wiki/Fedora_17_Final_Release_Criteria
Comment 2 Mike Christie 2012-04-24 15:50:05 EDT
Is this the same issue we are seeing in this bz?
https://bugzilla.redhat.com/show_bug.cgi?id=815355
Comment 3 Mike Christie 2012-04-24 15:56:40 EDT
Tim,

When you boot and you see those error messages, can you tell if the network is up on the initiator/host box? From some other box, can you ping it?

The iSCSI initiator can deal with disruptions. With the default settings, the iSCSI layer waits 120 secs before it fails IO. That is what this msg indicates:

[  176.864039]  session1: session recovery timed out after 120 secs

From the logs, it looks like we detected the connection issue:
[   56.336068]  connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294713618, last ping 4294718620, now 4294723632
We then tried to reconnect for 120 secs, but could not. We then failed IO.
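
As a rough cross-check of the timings described above, the two kernel messages quoted in this comment can be compared directly. The sketch below is only an illustration: it assumes the jiffies counters tick at HZ=1000 (the log does not state the tick rate), and the helper names `timestamp` and `jiffies` are made up for this example.

```python
import re

# Kernel log lines quoted in this comment (timestamps are seconds since boot).
LOG = """\
[   56.336068]  connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294713618, last ping 4294718620, now 4294723632
[  176.864039]  session1: session recovery timed out after 120 secs
"""

HZ = 1000  # ASSUMPTION: jiffies tick rate; not stated anywhere in the log


def timestamp(line):
    """Return the seconds-since-boot timestamp at the start of a kernel log line."""
    return float(re.match(r"\[\s*(\d+\.\d+)\]", line).group(1))


def jiffies(line):
    """Return the (last rx, last ping, now) counters from the ping-timeout line."""
    m = re.search(r"last rx (\d+), last ping (\d+), now (\d+)", line)
    return tuple(int(g) for g in m.groups())


ping_line, recovery_line = LOG.splitlines()
last_rx, last_ping, now = jiffies(ping_line)

rx_to_ping = (last_ping - last_rx) / HZ        # quiet time before the ping was sent
ping_to_now = (now - last_ping) / HZ           # time the ping went unanswered
recovery_window = timestamp(recovery_line) - timestamp(ping_line)

print(f"last rx -> last ping: {rx_to_ping:.3f} s")       # 5.002 s
print(f"last ping -> timeout: {ping_to_now:.3f} s")      # 5.012 s
print(f"timeout -> IO failed: {recovery_window:.3f} s")  # 120.528 s
```

Under that HZ assumption, the two roughly 5 second gaps line up with the "recv timeout 5" and "ping timeout of 5 secs" values in the message, and the gap of about 120.5 seconds between the two messages matches the 120 secs recovery window.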


I will try to replicate here and debug some more.
Comment 4 Tim Flink 2012-04-24 16:23:27 EDT
(In reply to comment #3)
> When you boot and you see those error messages, can you tell if the network is
> up on the initiator/host box? From some other box, can you ping it?

I tried pinging the box repeatedly while it was booting. It started responding to the pings when em1 came up with dracut but the pings started failing shortly before this error:

> [   56.336068]  connection1:0: ping timeout of 5 secs expired, recv timeout 5,
> last rx 4294713618, last ping 4294718620, now 4294723632
Comment 5 Tim Flink 2012-04-26 15:19:11 EDT
I'm +1 blocker on this due to the F17 final criterion listed in c#1
Comment 6 Tim Flink 2012-05-01 17:42:08 EDT
Accepted as a Fedora 17 final blocker as it violates the following final release criterion [1]:

The installer must be able to complete an installation using any network-attached storage devices (e.g. iSCSI, FCoE, Fibre Channel)

[1] https://fedoraproject.org/wiki/Fedora_17_Final_Release_Criteria
Comment 7 Jiri Popelka 2012-05-04 04:22:59 EDT
Yes, this seems to be the same problem as in bug #815355.
dhcp-4.2.4-0.4.rc1.fc17 (heading to stable atm) should fix it.
Comment 8 Tim Flink 2012-05-04 12:32:59 EDT
I did another install using TC2 + updates-testing (which pulls in dhcp-4.2.4-0.4.rc1.fc17) and I am no longer seeing the same issue.
Comment 9 Tim Flink 2012-05-04 12:41:14 EDT
(In reply to comment #8)
> I did another install using TC2 + updates-testing (which pulls in
> dhcp-4.2.4-0.4.rc1.fc17) and I am no longer seeing the same issue.

I suppose that I could have been a little more specific - the system successfully boots post-install and this looks to be fixed.

Will test again after the dhcp build has been pushed to stable but moving to ON_QA for now.
Comment 10 Tim Flink 2012-05-07 14:14:47 EDT
Verified fix with Fedora 17 Final TC3 - closing bug
Comment 11 Vivek Goyal 2012-05-08 13:51:03 EDT
*** Bug 818656 has been marked as a duplicate of this bug. ***
