Bug 35980 - in rc.sysinit, fsck can't find LABEL=/somefs only if run from initlog (see also 28768)
in rc.sysinit, fsck can't find LABEL=/somefs only if run from initlog (see al...
Status: CLOSED RAWHIDE
Product: Red Hat Linux
Classification: Retired
Component: e2fsprogs (Show other bugs)
7.1
i386 Linux
medium Severity medium
: ---
: ---
Assigned To: Florian La Roche
David Lawrence
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2001-04-15 16:14 EDT by j. alan eldridge
Modified: 2007-04-18 12:32 EDT (History)
1 user (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2001-04-15 20:57:51 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments (Terms of Use)
strace output from fscked up run of initlog -c "fsck ....." (15.68 KB, text/plain)
2001-04-15 18:14 EDT, j. alan eldridge
no flags Details
trace files, contents of /proc/partitions (8.03 KB, application/octet-stream)
2001-04-15 19:40 EDT, j. alan eldridge
no flags Details
new spec file + patch file (3.21 KB, application/octet-stream)
2001-04-15 20:53 EDT, j. alan eldridge
no flags Details

  None (edit)
Description j. alan eldridge 2001-04-15 16:14:37 EDT
Using e2fsprogs-1.19-3 and initscripts-5.64.2-1.

Conditions:

ONLY at system startup (in rc.sysinit)
ONLY when initlog -c "fsck -T -A -R -a $fsckoptions" is used

fsck will fail to find a filesystem to check, saying "Couldn't find
matching filesystem: LABEL=/mswin". (No, that's not a winbloze fs, it's
just another ext2 filesystem).

If fsck -T -A -R -a $fsckoptions is run directly from rc.sysinit, without
involving initlog, the error does not happen.

This is 100% reproducible, 110% maddening, and 120% pissing me off because
it's just damn near impossible to diagnose further. The conditions are just
too bizarre. 

We're two or three levels of fork/exec down the chain, we're in system
startup, we've got nothing mounted yet except /, and that's the only
conditions that make it occur. Fsck.
Comment 1 j. alan eldridge 2001-04-15 17:12:28 EDT
OK, I can add another datapoint.

If I run the fsck from initlog immediately after running it directly from
rc.sysinit, the problem does not present itself.

so it looks like there's some weird 1st time effect. maybe.
Comment 2 j. alan eldridge 2001-04-15 18:14:28 EDT
Created attachment 15404 [details]
strace output from fscked up run of initlog -c "fsck ....."
Comment 3 j. alan eldridge 2001-04-15 18:21:47 EDT
I've got an strace of fsck being run by initlog from rc.sysinit (yes! really!)
that shows the fsck-up in action. File is attached, but here's the salient bits:
218   open("/proc/partitions", O_RDONLY) = 7
218   fstat64(7, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
218   mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x40017000
218   _llseek(7, 0, [0], SEEK_SET)      = 0
218   read(7, "major minor  #blocks  name     r"..., 1024) = 1024
218   read(7, "1    8032468 hdb7 3 0 6 40 0 0 0"..., 1024) = 147
218   read(7, "", 1024)                 = 0
218   _llseek(7, 0, [0], SEEK_SET)      = 0
218   read(7, "major minor  #blocks  name     r"..., 1024) = 1024
218   open("/dev/hda5", O_RDONLY)       = 8

[ ok, we're just getting superblocks now ... then, we go back and read a bit
more from /proc/partitions, still on fd 7. nothing fscks with fd 7 in the
meantime. ]
218   open("/dev/hdb6", O_RDONLY)       = 8
218   lseek(8, 1024, SEEK_SET)          = 1024
218   read(8, "\240\260\36\0rH=\0\342\234\0\0=\300&\0\207\252\36\0\0\0"...,
136)= 136
218   close(8)                          = 0
218   read(7, "   3    71    8032468 hdb7 3 0 6"..., 1024) = 156
218   open("/dev/71", O_RDONLY)         = -1 ENOENT (No such file or directory)

[ EXCUSE ME? DID YOU SAY "/dev/71"??? WTF IS THAT? ]

Now just have to see where that's coming from. Is something fscking with fd 7,
is /proc/partitions getting fscked up, or is the whole damn thing on drugz? 

Comment 4 j. alan eldridge 2001-04-15 19:06:43 EDT
The solution is at hand, and it has grave implications. The problem is that
/proc/partitions is changing in between the read calls.  Fsck reads the first
1K, acts on it (checking a bunch of partitions and *changing /proc/partitions in
the process*), and then stdio gets the "rest" of the file, only things aren't
"lined up" with where they used to be.

This explains the first time effect. I don't have an explanation for why running
fsck directly from rc.sysinit didn't provoke the bug, but I'm going to put that
at a lower priority. I'll have an answer for that one later on.

So what's the solution? Well, you shouldn't use stdio to read
/proc/something-that-might-change. In fact, I think in general, it may not be
safe to read anything out of /proc except as 1 (atomic) call to read. So, that's
fine, I'll just stat the fscker to see how big a buffer I need and ... can't do
that, as stat on /proc/anything returns a size of 0. 

Houston, we have a problem.

So what's the fix? Errrr.... suggest, for now, using a generous helping of
setvbuf() magic on the code in fsck that reads /proc/partitions. That'll take
care of LABEL problems. For now. In fsck. Only.

In general, this is a real problem that demands a general solution. E.g., mount
reads /proc/partitions. Doing a general audit of all code for calls that
reference /proc/* is going to suck real hard, but it is probably a good idea.
Also, a libprocfs.a probably needs to be born, with code to read the files using
atomic ops. It's kinda gross, but this bug is grosser.

/*
	read_proc_file(sz, ppch)

	reads the file "/proc/" + sz into a buffer using a single read call;
	the ptr to the buffer is placed in *ppch, and the size of the file (as
	determined by reading to end) is returned.
*/

size_t read_proc_file(char *sz, char **ppch)
{
	const size_t cbIncr = 1024;
	size_t cbBuf = cbIncr;
	int fd = -1;
	char *pch = 0;
	char *szFn = malloc(strlen(sz) + 7);
	
	strcat(strcpy(szFn, "/proc/"), sz);
	fd = open(szFn, O_RDONLY);
	free(szFn);
	if (fd < 0) {
		*ppch = 0; return -1;
	}
	pch = (char *)malloc(cbBuf );
	for (;;) {
		int n = read(fd, pch, cbBuf);
		if (n < cbBuf) break;
		lseek(fd, 0, SEEK_SET);
		pch = (char *)realloc(pch, cbBuf += cbIncr);
	}
	if (n <= 0) {
		free(pch); pch = 0;
	}
	close(fd);
	*ppch = pch; return n;
}
		
Comment 5 j. alan eldridge 2001-04-15 19:39:19 EDT
Captains Log. Final entry.

Found why running fsck without initlog as parent didn't bomb out. Basically,
running initlog to exec fsck results in different activity on the filesystems,
in particular / (since that's all that's mounted at first). It's entirely luck
that the contents of /proc/partitions was lined up differently with respect to
the 1K boundary.

In particular, the second read started a bit "earlier" in the file, early enough
that it brought in a newline after a couple of characters, making a malformed
line for the loop in get_device_by_label.c.
The big loop in uuidcache_init() short circuits on this bad line of text. The
code is:

	if (sscanf (line, " %d %d %d %[^\n ]",
		    &ma, &mi, &sz, ptname) != 4)
		continue;

After the newline, the line for hdb7 starts over again.  It's all in the
attachment after this note. A patch will be attached later tonight or tomorrow.


Comment 6 j. alan eldridge 2001-04-15 19:40:17 EDT
Created attachment 15405 [details]
trace files, contents of /proc/partitions
Comment 7 j. alan eldridge 2001-04-15 20:53:34 EDT
Created attachment 15406 [details]
new spec file + patch file
Comment 8 Bill Nottingham 2001-04-15 20:57:23 EDT
assigning to e2fsprogs
Comment 9 Florian La Roche 2001-05-15 12:05:41 EDT
fix in 1.19-5 or newer

Thanks.

Note You need to log in before you can comment on or make changes to this bug.