Bug 103530

Summary: /etc/rc.d/rc.sysinit is ineffectual/incorrect for headless machines - PROMPT setting in /etc/sysconfig/autofsck not honored correctly
Product: [Retired] Red Hat Linux
Reporter: Shamim Islam <shamim>
Component: initscripts
Assignee: Bill Nottingham <notting>
Status: CLOSED CURRENTRELEASE
QA Contact: Brock Organ <borgan>
Severity: high
Priority: high
Version: 9
CC: rvokal
Hardware: All   
OS: Linux   
Fixed In Version: FC4
Doc Type: Bug Fix
Last Closed: 2005-09-30 19:23:02 UTC
Attachments:
  94129: Corrected fsck behavior - streamlines fsck handling code (flags: none)
  94303: Corrects fsck behavior for headless machines, and streamlines code (flags: none)
  94304: Corrects fsck behavior for headless machines, and streamlines code (flags: none)

Description Shamim Islam 2003-09-01 23:25:39 UTC
From Bugzilla Helper: 
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.1; Linux 2.4.21-0.13mdk; X11; i586; 
en_US, en) 
 
Description of problem: 
When checking filesystem integrity, rc.sysinit fails miserably on headless machines
even when the PROMPT value is set to "no".
 
Any error or unexpected return value from checking the root filesystem demands user input.
 
Also, the return code from fsck is not interpreted correctly - fsck returns a bitfield - 
rc.sysinit instead compares the error level as if it were a regular integer. 
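
To illustrate the bitfield point: fsck(8) documents its exit status as a sum of flags (1 = errors corrected, 2 = system should be rebooted, 4 = errors left uncorrected, 8 = operational error, and so on), so each bit has to be tested on its own. The following is only a minimal sketch; `classify_fsck_rc` is a hypothetical helper made up for illustration, not code from rc.sysinit or from the patch discussed here.

```shell
# Hypothetical helper: map an fsck exit status to a boot action.
# Flag values are the ones documented in fsck(8).
classify_fsck_rc() {
    rc=$1
    if [ $(( $rc & ~3 )) -ne 0 ]; then
        # Some bit other than 1 (corrected) or 2 (reboot) is set:
        # uncorrected errors, operational error, usage error, ...
        echo "fail"
    elif [ $(( $rc & 2 )) -ne 0 ]; then
        # Errors were fixed, but the system should be rebooted.
        echo "reboot"
    else
        # 0 (clean) or 1 (errors corrected): safe to continue booting.
        echo "continue"
    fi
}

classify_fsck_rc 3   # errors corrected + reboot recommended
classify_fsck_rc 6   # reboot recommended + errors left uncorrected
```

A plain `-gt 1` comparison cannot express this: status 3 and status 4 are both greater than 1, yet under the classification above they call for different handling.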
 
There is much duplicated script code for handling fsck issues. 
 
rc.sysinit does not handle an interrupted powerfail shutdown.
 
Version-Release number of selected component (if applicable): 
All versions in all linux distributions 
 
How reproducible: 
Always 
 
Steps to Reproduce: 
1. Power off running machine without shutdown to simulate sudden loss of power 
2. Restart machine 
3. Headless servers will hang on any difficulties even if the root filesystem was 
successfully checked (PASSED/REBOOT return codes) 
     
 
Actual Results:  Headless machine hung until user input supplied. 
 
Expected Results:  Headless machine should have restarted automatically. 
 
Additional info: 
 
This can cause an inability to restart headless machines after a power failure or severe 
problem without attaching a monitor and keyboard even if the disk was successfully 
repaired. 
 
I have created a patch that successfully recovers a headless machine EVERY TIME
the errors are trivial, and does not require user input if
/etc/sysconfig/autofsck contains PROMPT="no"
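
As a rough sketch of how a boot script might honor that setting, assuming the file is a shell fragment that assigns PROMPT: the helper name `should_prompt`, its optional path argument, and the default of "yes" when the file is absent are all assumptions made for illustration. This is not the code from the patch.

```shell
# Hypothetical helper: succeed (status 0) if interactive prompting is allowed.
# The path defaults to /etc/sysconfig/autofsck; the argument exists only so
# the sketch can be tried against a scratch file.
should_prompt() {
    conf=${1:-/etc/sysconfig/autofsck}
    PROMPT=yes                 # assumed default when the file is absent
    if [ -f "$conf" ]; then
        . "$conf"              # assumed to contain a line such as PROMPT="no"
    fi
    [ "$PROMPT" != "no" ]
}

# Try it against a scratch file instead of the real configuration.
tmp=$(mktemp)
echo 'PROMPT="no"' > "$tmp"
if should_prompt "$tmp"; then
    echo "prompting allowed"
else
    echo "unattended: never wait for a key press"
fi
rm -f "$tmp"
```

Every place that would otherwise wait for input could consult the same helper, so the setting is honored consistently rather than in only some code paths.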
 
Also, if the power failure causes a loss of power before proper shutdown occurs, the
headless machine defaults to a power failure status and does not allow logins until a
monitor is attached.
 
My patch also deals with this correctly. 
 
Please let me know how to submit this patch. 
 
I feel this patch is instrumental in achieving a lower TCO for server farms as well as
user desktop machines, since it should be possible to autocorrect most problems
without user intervention.

Comment 1 Bill Nottingham 2003-09-02 00:52:53 UTC
Feel free to attach the patch. Note that you can set options for fsck in
/fsckoptions to have it never prompt.

Comment 2 Shamim Islam 2003-09-02 01:42:42 UTC
Actually, on trivial errors on the root filesystem the /fsckoptions (just like the 
/etc/sysconfig/autofsck options) are ignored. 

Comment 3 Shamim Islam 2003-09-02 01:46:50 UTC
Created attachment 94129 [details]
Corrected fsck behavior - streamlines fsck handling code

I did not run diff -bnuR on the old version of rc.sysinit, since I suspect that
there will be other minor variations between distros.

What you will notice is that there are packaged functions for handling the fsck
behavior and that if the prompting is declined, it correctly NEVER EVER stops
unless something incredibly serious happens.

Also, if you do a diff, you will see references to -gt for return codes from
fsck - this is where part of the problem originates. Fsck returns a bitfield.

Enjoy. I am attempting to disseminate this fix as far as it will go.

P.S. The /etc/nologin handling is an additional fix that corrects the case where a
powerfail shutdown is terminated before completion.

You have no idea how many times I have had to hook up a monitor and keyboard to
my headless firewall even after setting the PROMPT value.

Hope this is helpful. :)

Can I be listed as a contributor???? :) :) :)

Comment 4 Shamim Islam 2003-09-02 01:50:10 UTC
Comment in AskForKey is reversed - returns 0 on no keypress, returns 1 on keypress. :) 

Comment 5 Shamim Islam 2003-09-02 01:59:42 UTC
Correction - AskForKey comment is correct. 
 
Returns 0 if key is pressed within timeout 
Returns 1 if key is not pressed within timeout 
 
Sorry - I got confused for a second - even though I commented it right the first time. 
 
When used in an if statement, a zero return code runs the then portion, and a non-zero
return code runs the else portion. :)
 
The return code processing vs the [] evaluation had me mixed up for a moment. 
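
The convention described here can be seen in a small sketch. `ask_for_key` below is a hypothetical stand-in, not the AskForKey function from the attachment: it succeeds if a line arrives on standard input and fails at end of input, and it has no timeout (bash's `read -t` could add one).

```shell
# Hypothetical stand-in for a key-wait helper.
# Returns 0 if a line of input arrives, non-zero at end of input.
ask_for_key() {
    IFS= read -r _reply
}

report_key() {
    # `if` tests the exit status directly: 0 runs `then`, non-zero runs `else`.
    if ask_for_key; then
        echo "key pressed"
    else
        echo "no key"
    fi
}

printf 'y\n' | report_key    # input available
report_key </dev/null        # no input
```

Inside `[ ]` the numeric comparison has to be written out, for example `[ "$?" -eq 0 ]`, which is where the two styles are easy to mix up.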

Comment 6 Shamim Islam 2003-09-08 15:03:49 UTC
Created attachment 94303 [details]
Corrects fsck behavior for headless machines, and streamlines code

Modified the Reboot() function slightly so the logic is more obvious.

Comment 7 Shamim Islam 2003-09-08 15:27:36 UTC
Created attachment 94304 [details]
Corrects fsck behavior for headless machines, and streamlines code

Extraneous CTRL-M injection detected in last upload - removed.

Comment 8 Bill Nottingham 2005-09-30 19:23:02 UTC
Closing bugs on older, no longer supported, releases. Apologies for any lack of
response.

This code was reworked before FC4. Notably, fsck is now only run once, and the
error code for 'reboot now' is properly handled.

Please open new issues for further problems with the fsck code - thanks!