Bug 463624 - [LTC 6.0 FEAT] 201230:single image boot (read-only) root file system (for iSCSI customers)
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: anaconda
Version: 6.0
Hardware: All All
Priority: high Severity: high
Target Milestone: beta
Target Release: 6.0
Assigned To: Anaconda Maintenance Team
QA Contact: Release Test Team
Keywords: FutureFeature, TestOnly
Depends On:
Blocks: 356741 554559
Reported: 2008-09-24 00:00 EDT by IBM Bug Proxy
Modified: 2010-11-11 10:25 EST

Doc Type: Enhancement
Last Closed: 2010-11-11 10:25:11 EST

Description IBM Bug Proxy 2008-09-24 00:00:47 EDT
=Comment: #1=================================================
Emily J. Ratliff <emilyr@us.ibm.com> - 2008-09-18 00:12 EDT
1. Feature Overview:
Feature Id:	[201230]
a. Name of Feature:	single image boot (read-only) root file system (for iSCSI
customers)
b. Feature Description
This requirement is coming from a few iSCSI customers wanting to boot many
clients from one central SAN OS 'tree'; software-based iSCSI boot of many
clients from a single (read only root) image.


2. Feature Details:
Sponsor:	xSeries
Architectures:
x86
x86_64

Arch Specificity: Both
Affects Installer: Yes
Delivery Mechanism: Request Red Hat development assistance
Category:	Installation
Request Type:	Installer - Enhancement from Distributor
d. Upstream Acceptance:	No Code Required
Sponsor Priority	2
f. Severity: Medium
IBM Confidential:	no
Code Contribution:	no
g. Component Version Target:	Vendor Requirement

3. Business Case
We have customers requesting an NFS-boot-like feature for iSCSI.

4. Primary contact at Red Hat: 
John Jarvis
jjarvis@redhat.com

5. Primary contacts at Partner:
Project Management Contact:
Monte Knutson, mknutson@us.ibm.com, 877-894-1495

Technical contact(s):
Kevin Stansell, kstansel@us.ibm.com
Chris McDermott, mcdermoc@us.ibm.com

IBM Manager:
Julio Alvarez, julioa@us.ibm.com
Comment 1 Tom Coughlan 2008-09-26 18:18:08 EDT
There is nothing here that is specific to iSCSI, as far as I know. It will be the same as any shared-access block device.

Bill, do we (will we?) offer full support for read-only shared root on a block device?
Comment 2 Bill Nottingham 2008-09-29 16:21:35 EDT
We have a mechanism for retrieving local state, so (more or less) yes.
Comment 3 Tom Coughlan 2009-10-01 15:12:03 EDT
Hey, Anaconda team, is this feasible for RHEL 6.0?
Comment 4 Hans de Goede 2009-10-02 04:09:46 EDT
(In reply to comment #3)
> Hey, Anaconda team, is this feasible for RHEL 6.0?  

That depends on what changes (if any) are needed on the installer side. Bill, I think that the initscripts nowadays know how to handle a permanent read-only /; there is some kernel cmdline option for this, right?

If this works properly they can just do a normal install, then add the readonly option to their kernel cmdline, and let multiple clients use the same disk read-only.

This also assumes that the iscsi initiator / target can handle this, but that is something to ask the storage team :)
Comment 5 Bill Nottingham 2009-10-02 12:29:23 EDT
'readonlyroot' in the commandline, or edit /etc/sysconfig/readonly-root.
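
For illustration, the two switches above could be checked with something like the following. This is only a sketch: the function name is made up, the READONLY=yes key is the one IBM later reports setting in /etc/sysconfig/readonly-root, and the real decision is made by the initscripts, not by code like this.

```python
def readonly_root_requested(cmdline: str, sysconfig_text: str) -> bool:
    """Return True if either source asks for a read-only root (hypothetical helper)."""
    # Kernel command line: a bare 'readonlyroot' word requests the mode.
    if "readonlyroot" in cmdline.split():
        return True
    # /etc/sysconfig/readonly-root is a shell-style KEY=value file;
    # as in a shell script, the last assignment wins.
    readonly = False
    for line in sysconfig_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "READONLY":
            readonly = value.strip().strip("\"'").lower() == "yes"
    return readonly
```

On a running system the inputs would come from /proc/cmdline and /etc/sysconfig/readonly-root.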
Comment 6 Hans de Goede 2009-10-02 13:33:12 EDT
(In reply to comment #5)
> 'readonlyroot' in the commandline, or edit /etc/sysconfig/readonly-root.  

IOW no anaconda changes needed, right ?
Comment 7 Bill Nottingham 2009-10-02 13:39:46 EDT
Not directly... of course, the image may need customizations to run readonly, depending on what you're deploying with it. But that's the case for any image.
Comment 10 Mike Christie 2009-12-04 10:25:00 EST
This will not work normally, because if every client/box/initiator has the same initiator name then we will end up with duplicate SCSI initiator port names. The SCSI initiator port for iSCSI is the initiator name plus the ISID, and the ISID is just some integer.

What will happen is box1 will log in with initiatorname=name1, isid=1 and that will succeed. Then box2 will also log in with initiatorname=name1, isid=1 and that will succeed, but it will cause box1 to get logged out. To the target it will look like really basic error recovery handling, where the initiator can re-login with the same SCSI initiator port values and cause the target to do an implicit logout of the old session and log in the new session.

box1 will then just see the target die (it will not know why; it will probably just look like a cable pull or some other disruption in the network, because the TCP/IP connection just goes dead) and try to re-login with initiatorname=name1, isid=1. That will succeed and will cause box2 to get logged out. And you see where I am going :)
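
The ping-pong described above can be sketched with a toy model. This is not how any real target is implemented; ToyTarget and simulate are made-up names, and the only rule modeled is "one session per (initiator name, ISID) pair, and a new login with the same pair replaces the old session".

```python
class ToyTarget:
    """Toy iSCSI target: one session per (initiator name, ISID) pair."""

    def __init__(self):
        self.sessions = {}  # (initiator_name, isid) -> box currently logged in
        self.log = []

    def login(self, box, initiator_name, isid):
        port = (initiator_name, isid)
        old = self.sessions.get(port)
        if old is not None and old != box:
            # Same SCSI initiator port: the old session is implicitly logged out.
            self.log.append(f"{old} implicitly logged out")
        self.sessions[port] = box
        self.log.append(f"{box} logged in")
        return old  # the box that lost its session, if any


def simulate(rounds):
    """Two boxes share name1/isid=1; each re-login evicts the other."""
    target = ToyTarget()
    target.login("box1", "name1", 1)
    victim = target.login("box2", "name1", 1)
    for _ in range(rounds):
        # The box that was kicked out retries and evicts the other one.
        victim = target.login(victim, "name1", 1)
    return target.log
```

With distinct initiator names the two logins land on different keys and neither box is evicted.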

I can make it so iscsi-initiator-utils does not use /etc/iscsi/initiatorname.iscsi and instead uses some dynamically created initiator name. I guess if there was no /etc/iscsi/initiatorname.iscsi I could make it so iscsid just creates a name and uses it for login. It could be a totally random name like the one created now, or if each client has a unique hostname we could base the initiator name off of that.
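
A rough sketch of the two naming options, for discussion only. The IQN prefix below is a placeholder (example.com), not what iscsid or iscsi-iname actually emits, and generate_initiator_name is a hypothetical helper.

```python
import re
import secrets

# Assumed prefix for illustration; a real deployment would use its own naming authority.
IQN_PREFIX = "iqn.2010-01.com.example"


def generate_initiator_name(hostname=None):
    """Build an initiator name from the hostname if given, else a random suffix."""
    if hostname:
        # Lower-case the hostname and replace anything outside [a-z0-9.-] with '-'.
        suffix = re.sub(r"[^a-z0-9.-]", "-", hostname.lower())
    else:
        # 6 random bytes rendered as 12 hex characters.
        suffix = secrets.token_hex(6)
    return f"{IQN_PREFIX}:{suffix}"
```

The hostname-based form gives the same name on every boot, while the random form changes each time it is generated.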

This would be really simple on the iscsi initiator side. I could do it for RHEL 6. It would take an hour or so to code up.


Would any anaconda changes be needed to avoid creating that initiatorname.iscsi file, or would we just have the user set this up?


Question for IBM: The problem would be that most targets have some ACLs based on initiatorname. If we create it dynamically, is this going to work for what you are requesting?
Comment 11 Hans de Goede 2009-12-11 03:47:51 EST
Hi,

About the initiator name problem, I would expect these systems to use
some kind of network booting.

So either they use ibft, in which case at least the initrd will use
iscsistart -b

and thus the initiator name from ibft (I don't know how this will interact with
the iscsid started later).

Or we will use some sort of pxe boot, in which case dracut allows (and needs) specifying the initiatorname on the kernel cmdline as iscsi_initiator=foo

When netbooting over iscsi without ibft and without an iscsi_initiator=foo argument dracut will dynamically generate an initiator name. When not
doing ibft netbooting the used initiatorname will get stored by
dracut as /dev/.initiatorname.iscsi

IOW when adding the iscsi boot code to dracut this problem was taken into account, and with some small changes to iscsid and / or the initscripts things should work.

Either the initscripts or iscsid itself need to be modified so that it will
look for the initiator name in the following order

1) If the machine has ibft, use the initiator name from that
   (or just use the initiator name already used by active sessions)

2) If there is a /dev/.initiatorname.iscsi file, use that

3) /etc/iscsi/initiatorname.iscsi
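
The lookup order above could look roughly like this. It is a sketch, not a patch: the sysfs path used for the ibft case and the assumption that the files contain either a bare name or an "InitiatorName=<name>" line are mine, and both function names are hypothetical.

```python
from pathlib import Path

# Entries 2 and 3 are the paths from the list above; the sysfs path
# used for the ibft case is an assumption made for this sketch.
SOURCES = [
    Path("/sys/firmware/ibft/initiator/initiator-name"),
    Path("/dev/.initiatorname.iscsi"),
    Path("/etc/iscsi/initiatorname.iscsi"),
]


def parse_initiator_name(text):
    """Accept either a bare name or an 'InitiatorName=<name>' line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("InitiatorName="):
            line = line.split("=", 1)[1].strip()
        return line or None
    return None


def find_initiator_name(sources=SOURCES):
    """Return the name from the first source that exists and parses, else None."""
    for path in sources:
        try:
            name = parse_initiator_name(path.read_text())
        except OSError:
            continue  # missing or unreadable: fall through to the next source
        if name:
            return name
    return None
```

The alternative in 1), reusing the name of already active sessions, is not modeled here.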

Regards,

Hans
Comment 12 Mike Christie 2010-01-15 11:38:45 EST
(In reply to comment #11)
> Either the initscripts or iscsid itself need to be modified so that it will
> look for the initiator name in the following order
> 
> 1) If the machine has ibft, use the initiator name from that
>    (or just use the initiator name already used by active sessions)

This part is sort of done. For sessions using ibft, iscsid will use the initiator name from ibft for those sessions. If there are sessions started by iscsistart where you passed iscsistart the values to use, then iscsid will continue to use the initiator name that was passed into iscsistart for those sessions.

If you discover new targets or if you had other sessions that were set up to get logged into after boot (so where the init script start() will look for node.startup = automatic for example), then those sessions will use what is in "/etc/iscsi/initiatorname.iscsi" or an initiator name that is passed into the tools when you discover those targets.

It should be a simple change to make all sessions use the initiatorname that was passed into iscsistart or was used for ibft sessions. There is an if/else switch in iscsid for this that could be easily changed. I just was not sure if that would be what we wanted to always do. Some other user might be using different initiatornames for different storage/targets/uses/acls and may have wanted the different sessions to have different initiatornames. I will just add a config setting for that. This can be done in a day or so.


If the targets/portals are static and everything is set up at install time, then we do not have to worry about any of the above.

If the target/portals are going to change over time (which is reasonable and probable) then I think we want a mode where, when the iscsi service starts up, it discovers storage and logs into whatever it finds. This would assume a nicer, more FC-like model where you do the storage admin from a central place or even on the target. For this, we could have sendtargets support done in a little bit (in a day or so - I was working on it for 6.0 but got busy with other stuff).

We were also working on trying to support this with isns, but that looks like it might slip to 6.1. If I can get an exception, that would take about 10 days.
Comment 13 Hans de Goede 2010-01-15 13:52:56 EST
(In reply to comment #12)
> (In reply to comment #11)
> > Either the initscripts or iscsid itself need to be modified so that it will
> > look for the initiator name in the following order
> > 
> > 1) If the machine has ibft, use the initiator name from that
> >    (or just use the initiator name already used by active sessions)
> 
> This part is sort of done. For sessions using ibft, iscsid will use the
> initiator name from ibft for those sessions. If there are sessions started by
> iscsistart where you passed iscsistart the values to use, then iscsid will
> continue to use the initiator name that was passed into iscsistart for those
> sessions.
> 
> If you discover new targets or if you had other sessions that were set up to
> get logged into after boot (so where the init script start() will look for
> node.startup = automatic for example), then those sessions will use what is in
> "/etc/iscsi/initiatorname.iscsi" or an initiator name that is passed into the
> tools when you discover those targets.
> 
> It should be a simple change to make all sessions use the initiatorname passed
> into iscsistart or was used for ibft sessions. There is an if/else switch in
> iscsid for this that could be easily changed. I just was not sure if that would
> be what we wanted to always do. Some other user might be using different
> initiatornames for different storage/targets/uses/acls and may have wanted the
> different sessions to have different initiatornames. I will just add a config
> setting for that. This can be done in a day or so.
> 

Yes a config setting for that would be good to have.

> If the targets/portals are static and everything is set up at install time,
> then we do not have to worry about any of the above.
> 
> If the target/portals are going to change over time (which is reasonable and
> probable) then I think we want a mode where when the iscsi service starts up it
> discovers storage and logs into whatever it finds. This would assume a nicer
> more FC like model where you do the storage admin from a central place or even
> on the target. For this, we could have sendtargets support done in a little bit
> (in a day or so - I was working on it for 6.0 but got busy with other stuff).
> 

I don't see a direct need for this, but if you want to add support for this, I won't stop you.

> We were also working on trying to support this with isns, but that looks like
> it might slip to 6.1. If I can get an exception that would take about 10
> days.    

Idem.
Comment 14 David Cantrell 2010-01-15 21:48:13 EST
IBM, please test a current RHEL-6 tree (one dated post 15-Jan-2010) to see that this functionality is present.
Comment 15 IBM Bug Proxy 2010-01-20 13:50:42 EST
------- Comment From ennui@us.ibm.com 2010-01-20 13:40 EDT-------
Let's take a NetApp / NSeries Target as an example.

We'll have a number of systems, a shorthand example is:

iqn...sys1
iqn...sysn

With the desire to boot from a single read-only copy of the OS but have their own read/write space, presumably separate.

There would be the read-only OS LUN (how to enforce read-only on the Target is another issue) with an initiator group populated with iqn...sys1 through iqn...sysn. Call this LUN 0, the boot LUN. Then it would seem iqn...sys1 would also have a separate LUN (call this LUN 1) for its r/w space. Likewise for all systems up to iqn...sysn. Thus each system LUN is mapped to a unique initiator group and that group contains the iqn...sys? initiator name.
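
To make the mapping concrete, here is a small sketch of the layout just described. The initiator names passed in are placeholders, the "ro"/"rw" flags only record intent (enforcing read-only on the Target is, as noted, a separate issue), and build_lun_map / visible_luns are made-up helpers rather than any NetApp interface.

```python
def build_lun_map(initiators):
    """Return {lun_name: {"lun": id, "access": mode, "igroup": [names]}}."""
    lun_map = {
        # One shared boot LUN whose initiator group holds every system.
        "boot": {"lun": 0, "access": "ro", "igroup": list(initiators)},
    }
    for name in initiators:
        # One private r/w LUN per system, mapped to a single-member group.
        lun_map[f"data-{name}"] = {"lun": 1, "access": "rw", "igroup": [name]}
    return lun_map


def visible_luns(lun_map, initiator):
    """List (lun id, access) pairs that a given initiator name would see."""
    return sorted(
        (entry["lun"], entry["access"])
        for entry in lun_map.values()
        if initiator in entry["igroup"]
    )
```

Each system then sees exactly two LUNs: the shared LUN 0 and its own LUN 1.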

Is this the use case we're thinking of, or something else?
Comment 16 Mike Christie 2010-01-20 14:22:13 EST
(In reply to comment #15)
> ------- Comment From ennui@us.ibm.com 2010-01-20 13:40 EDT-------
> Let's take a NetApp / NSeries Target as an example.
> 
> We'll have a number of systems, a shorthand example is:
> 
> iqn...sys1
> iqn...sysn
> 
> With the desire to boot from a single read-only copy of the OS but have their
> own read/write space, presumably separate.
> 
> There would be the read-only OS LUN (how to enforce read-only on the Target is
> another issue) with an initiator group populated with iqn...sys1 through
> iqn...sysn. Call this LUN 0, the boot LUN. Then it would seem iqn...sys1 would
> also have a separate LUN (call this LUN 1) for its r/w space. Likewise for all
> systems up to iqn...sysn. Thus each system LUN is mapped to a unique initiator
> group and that group contains the iqn...sys? initiator name.
> 

This should work in RHEL 6 today with some warnings. If all the luns were found through the targets listed in the ibft/OF info, then it should all work pretty nicely. You should also be able to change targets or initiator names and it will work. However, if you were to change targets, then you will get some errors on startup about iscsid not being able to find a node db record for it. iscsid will work fine without a node db record, but you will just have to use the default timeouts/settings (the node db stores per portal config settings).

If the r/w LUN is not found through a target in the ibft/OF target info, then somehow you would have to get that info into your image. You could do it by going through the manual iscsi target setup in the installer, or once the system is booted you could set up the target and then snapshot it for your image. For this, if the initiatorname never changes then it should all work ok. If the initiatorname is going to change (you set it in ibft and want it to be used for the r/w target), I will need the initiatorname sync-up stuff discussed in the beginning of comment #12. Adding devel ack for this since it is simple and doable for 6.0.


If the r/w LUN is not found through a target in the ibft/OF target info, and it is going to change over time, then either the user needs to somehow figure out a way to modify the image with the new target info, or we need the stuff discussed in the end of comment #12 where we do discovery at startup. For that we might want to create a new bugzilla and schedule it for 6.1.
Comment 17 John Jarvis 2010-02-02 11:32:57 EST
Has IBM had a chance to test out Mike's feedback in comment 16?
Comment 18 IBM Bug Proxy 2010-02-02 12:22:52 EST
------- Comment From bigmach@us.ibm.com 2010-02-02 11:59 EDT-------
(In reply to comment #19)
> Has IBM had a chance to test out Mike's feedback in comment 16?

We have been unable to test this yet as we have no post-1/15/10 bits from which to install.  The latest RHEL6 bits we have are alpha-3 dated 12/15/09.
Comment 20 John Jarvis 2010-02-18 12:43:51 EST
IBM is signed up to test and provide feedback.
Comment 22 IBM Bug Proxy 2010-05-26 16:31:47 EDT
------- Comment From bigmach@us.ibm.com 2010-05-26 16:25 EDT-------
I have verified that this feature works in RHEL6 snap4.  Specifically, I performed an install to an iSCSI target using an HS12 blade, edited /etc/sysconfig/readonly-root and set READONLY=yes and then made two filesystems on a local disk with labels "stateless-rw" and "stateless-state".  I also edited /etc/iscsi/initiatorname.iscsi and changed the InitiatorName to a different value than the default, which had come from the iBFT.

Upon rebooting, I observed that the root filesystem was mounted read-only (according to /proc/mounts).  Attempts to create files in / failed with EROFS.  I verified that the iSCSI initiator being used was the one from the iBFT, not the one I'd changed in /etc/iscsi/initiatorname.iscsi.  I verified that the two filesystems I'd created on the local disk were mounted on /var/lib/stateless/writable and /var/lib/stateless/state.
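
For anyone repeating the mount checks, a small sketch along these lines could automate them. It only parses /proc/mounts-style text; parse_mounts and check_stateless are hypothetical helpers, and the initiator-name check is not covered.

```python
def parse_mounts(text):
    """Map mount point -> set of mount options; the last entry for a path wins."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        # Fields: device, mount point, fs type, comma-separated options, ...
        mounts[fields[1]] = set(fields[3].split(","))
    return mounts


def check_stateless(text):
    """Report the three mount conditions verified in this comment."""
    mounts = parse_mounts(text)
    return {
        "root_read_only": "ro" in mounts.get("/", set()),
        "writable_mounted": "/var/lib/stateless/writable" in mounts,
        "state_mounted": "/var/lib/stateless/state" in mounts,
    }
```

On a booted system this would be run as check_stateless(open("/proc/mounts").read()).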
Comment 24 Alexander Todorov 2010-07-13 12:00:51 EDT
Moving to VERIFIED as per comment #22.
Comment 25 releng-rhel@redhat.com 2010-11-11 10:25:11 EST
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.
