Bug 453218 - getdents() returns an entry with d_name=""
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: i686 Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Josef Bacik
QA Contact: Martin Jenner
Reported: 2008-06-27 16:45 EDT by Henry Hartley
Modified: 2008-08-07 10:51 EDT (History)
Doc Type: Bug Fix
Last Closed: 2008-08-07 10:51:54 EDT

Attachments:
1000 lines of strace log (125.67 KB, text/plain)
2008-07-01 12:54 EDT, Henry Hartley

Description Henry Hartley 2008-06-27 16:45:45 EDT
Description of problem: updatedb aborts on some file systems.

How reproducible: always (for me, anyway)

Steps to Reproduce:
1. mount root partition of secondary drive to /mnt/oldslash
2. don't include /mnt in PRUNEPATHS
3. run updatedb -v
  
Actual results:

directories and files are listed until eventually, updatedb aborts with the
following:

updatedb: src/updatedb.c:601: scan_cwd: Assertion `name_size > 1' failed.
Aborted

Expected results:

updatedb should not abort
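
The assertion message suggests that scan_cwd received a directory entry with an empty name. This assumes name_size counts the name plus its terminating NUL, which nothing in this report confirms. A minimal Python sketch of the alternative behaviour, setting such entries aside instead of aborting; split_entry_names is a hypothetical helper and not part of mlocate:

```python
def split_entry_names(names):
    """Partition raw directory entry names into usable and suspicious ones."""
    usable, suspicious = [], []
    for name in names:
        # An empty name, or one containing '/' or NUL, cannot be a real entry.
        if name == "" or "/" in name or "\0" in name:
            suspicious.append(name)
        elif name in (".", ".."):
            continue  # self and parent links are not scanned
        else:
            usable.append(name)
    return usable, suspicious


usable, suspicious = split_entry_names([".", "..", "module.h", ""])
print(usable)      # ['module.h']
print(suspicious)  # ['']
```

A scanner built this way could log the suspicious entries and keep going, which would still point at the damaged directory without losing the rest of the run.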

Additional info:

My actual experience is on a CentOS 5.2 machine, but I assume it can be
reproduced on RHEL.  Here is the response from Mirek to an e-mail about this.

Hello,
thanks for your report.

Henry Hartley wrote on Thu, 26 Jun 2008 at 07:55 -0400:
> > For each day since installation there was a file 
> > called /var/lib/mlocate/mlocate.db.xxxxxx where xxxxxx is a six 
> > character string of digits and upper and lowercase letters.  Each of 
> > these files was 14MB in size and it doesn't take a genius to see that 
> > this will be a problem fairly quickly (good thing, too, since I'm 
> > clearly not one).
> > 
> > [root@blackforest mlocate]# updatedb
> > updatedb: src/updatedb.c:601: scan_cwd: Assertion `name_size > 1' failed.
> > Aborted
> > 
> > Furthermore, that created one more mlocate.db.xxxxxx file.  Adding -v 
> > showed me that the problem was a mounted file system in /mnt so I added 
> > that to PRUNEPATHS in /etc/updatedb.conf and it now runs without any 
> > problems.
> > 
> > Clearly I should have been paying more attention and seen this coming 
> > long before it became a crisis.
If the cause of this behavior was the above-described abort, cron should
have been sending e-mails to root.  Was this not the case?

> > I didn't file a bug report because I'm not 
> > really sure it's a bug but I can if you like.
There are two separate problems.

The first problem is that updatedb doesn't remove its temporary files
when it aborts.  I'll try to fix this for the next mlocate release right
away, no need to file it.
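
The first problem (temporary files left behind after a failed run) is commonly avoided by writing to a temporary file, renaming it into place on success, and deleting it on failure. The following Python sketch shows that pattern only. It is not the mlocate fix, and write_database and its build callback are hypothetical names:

```python
import os
import tempfile


def write_database(final_path, build):
    """Write via a temporary file; remove the temporary file if building fails."""
    directory = os.path.dirname(final_path) or "."
    prefix = os.path.basename(final_path) + "."
    fd, tmp_path = tempfile.mkstemp(prefix=prefix, dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            build(tmp)                      # may raise part-way through
        os.replace(tmp_path, final_path)    # atomic rename within one file system
    except BaseException:
        # Remove the partial file so failed runs do not accumulate.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Note that a failed assert in a C program terminates the process with SIGABRT, so an exception-style cleanup like this one would not run there. The equivalent in C needs a signal handler or an error path that does not abort.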

The second problem is that updatedb aborts at all.  This is very likely
a bug in the file system driver, not updatedb - nevertheless, please
file a bug and we'll hopefully figure out the cause.

That's it for mlocate "upstream" releases.  If you'd like the bugs fixed
in RHEL, please file them in Red Hat bugzilla against RHEL as well.  Due
to resource constraints I can't promise they will be fixed (you are not
using RHEL, after all) - but if nobody files them, they surely won't be
fixed.

Thanks again,
	Mirek
Comment 1 Miloslav Trmač 2008-06-29 17:00:36 EDT
What exact path is reported last by (updatedb -v), and what type of file system
is mounted at that path?

Please attach the last 1000 lines of the "log" file generated by (strace -v -o
log updatedb) as well.
Comment 2 Henry Hartley 2008-07-01 12:54:56 EDT
Created attachment 310693
1000 lines of strace log
Comment 3 Henry Hartley 2008-07-01 12:57:41 EDT
The file last mentioned in the updatedb -v log was 

/mnt/oldslash/lib/modules/2.6.11-1.14_FC3/build/include/config/dvb/ves1820/module.h

When I updated the system from a Fedora Core 3 install back in February (I know,
I know) I kept the old disk and mounted it under /mnt/oldslash.  It is either
ext2 or ext3 but I don't remember for sure.  Is there a way to tell?  The file
itself doesn't look like a problem, just a single line of text.

Comment 4 Miloslav Trmač 2008-07-01 19:19:45 EDT
Thanks.

The file system type should be visible in (mount), or in (cat /proc/mounts).
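
For reference, the same lookup can be scripted. This is a small Python sketch that reads text in /proc/mounts format (device, mount point, type, options, dump, pass). fs_type_for is a hypothetical helper, the sample lines are illustrative only, and mount points containing spaces (escaped as \040 in /proc/mounts) are not handled:

```python
def fs_type_for(mount_point, mounts_text):
    """Return the file system type for mount_point, or None if it is not listed."""
    fs_type = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, type, options, dump, pass
        if len(fields) >= 3 and fields[1] == mount_point:
            fs_type = fields[2]  # a later line means a more recent mount
    return fs_type


# Illustrative sample, not taken from the reporter's machine.
sample = "/dev/hda1 / ext3 rw 0 0\n/dev/hdb2 /mnt/oldslash ext3 rw 0 0\n"
print(fs_type_for("/mnt/oldslash", sample))  # ext3
```

On a live system the second argument would be the contents of /proc/mounts, for example open("/proc/mounts").read().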

It seems invalid data is returned for the directory
/mnt/oldslash/lib/modules/2.6.11-1.14_FC3/build/include/config/dvb/ves1x93 :

getdents64(15, {{d_ino=16877, d_off=2147483647, d_type=DT_UNKNOWN, d_reclen=24,
d_name=""}}, 4096) = 24
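
Entries like this can be found mechanically in a long strace log. The Python sketch below assumes each system call occupies one line of the log (the call above is wrapped only for display here) and that the log uses the strace -v format shown above. empty_name_calls is a hypothetical helper:

```python
import re

# Matches getdents or getdents64 and captures the file descriptor argument.
CALL = re.compile(r"\bgetdents(?:64)?\((\d+),")
EMPTY_NAME = re.compile(r'd_name=""')


def empty_name_calls(log_lines):
    """Yield (line_number, fd) for getdents calls that returned an empty d_name."""
    for number, line in enumerate(log_lines, start=1):
        match = CALL.search(line)
        if match and EMPTY_NAME.search(line):
            yield number, int(match.group(1))


log = [
    'getdents64(15, {{d_ino=16877, d_off=2147483647, d_type=DT_UNKNOWN, '
    'd_reclen=24, d_name=""}}, 4096) = 24',
    "close(15) = 0",
]
print(list(empty_name_calls(log)))  # [(1, 15)]
```

Mapping the reported file descriptor back to a directory still requires finding the earlier open call that returned that descriptor in the same log.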
Comment 5 Henry Hartley 2008-07-02 09:41:15 EDT
Okay, it is ext3.

/dev/hdb2 on /mnt/oldslash type ext3 (rw)

When I go to that directory, ls returns nonsense:

[root@blackforest ves1x93]# ls -al
total 0
?--------- ? ? ? ?            ?

Does this in any way actually help?  I mean, I can fix it here by deleting this
directory since it's not really needed but that doesn't help you, does it?
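
The row of question marks in the ls -al output above is what ls typically prints when it cannot stat an entry. A Python sketch that reports such entries for a directory; unstattable_entries is a hypothetical helper, and the empty name is handled explicitly because joining it to the directory path would otherwise just name the directory itself:

```python
import os


def unstattable_entries(directory, names=None):
    """Return (name, reason) pairs for entries that lstat() cannot describe."""
    if names is None:
        names = os.listdir(directory)
    failures = []
    for name in names:
        if not name:
            # os.path.join(directory, "") refers to the directory itself,
            # so an empty name has to be flagged before calling lstat().
            failures.append((name, "empty name"))
            continue
        try:
            os.lstat(os.path.join(directory, name))
        except OSError as exc:
            failures.append((name, exc.strerror))
    return failures
```

Passing the names explicitly makes it possible to check a list captured elsewhere, for example names extracted from an strace log.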
Comment 6 Miloslav Trmač 2008-07-02 19:55:20 EDT
Thanks.  I think the above could be enough information to prepare a fix, but I'm
not an expert and the ext3 developers may need to know more.
Comment 7 Eric Sandeen 2008-07-03 13:45:09 EDT
Is the fs corrupt?  Does e2fsck find errors?  (maybe run with -n to preserve the
state for now).

-Eric
Comment 8 Henry Hartley 2008-07-03 13:59:51 EDT
[root@blackforest ~]# e2fsck -n /dev/hdb2
e2fsck 1.39 (29-May-2006)
Warning!  /dev/hdb2 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
/: clean, 246736/1537088 files, 2439193/3072431 blocks

Same results (without the warnings) if I unmount it first.
Comment 9 Josef Bacik 2008-07-03 14:03:15 EDT
Unmount the fs and run e2fsck -f on it, then let us know if it reports anything.
Comment 10 Henry Hartley 2008-07-03 14:46:10 EDT
Should I say NO to fix, for now?

[root@blackforest ~]# e2fsck -f /dev/hdb2
e2fsck 1.39 (29-May-2006)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Error reading block 2011563 (Attempt to read block from filesystem resulted in
short read).  Ignore error<y>? yes

Force rewrite<y>? yes

Directory inode 770755, block 0, offset 0: directory corrupted
Salvage<y>? yes

Missing '.' in directory inode 770755.
Fix<y>? yes

Setting filetype for entry '.' in ??? (770755) to 2.
Missing '..' in directory inode 770755.
Fix<y>? yes

Setting filetype for entry '..' in ??? (770755) to 2.
Error reading block 2011564 (Attempt to read block from filesystem resulted in
short read).  Ignore error<y>? yes

Force rewrite<y>? yes

Directory inode 770756, block 0, offset 0: directory corrupted
Salvage<y>? yes

Missing '.' in directory inode 770756.
Fix<y>? yes

Setting filetype for entry '.' in ??? (770756) to 2.
Missing '..' in directory inode 770756.
Fix<y>? yes

Setting filetype for entry '..' in ??? (770756) to 2.
Comment 11 Henry Hartley 2008-07-03 15:09:52 EDT
Sorry, that was only part of the output.  Also, I meant to say no to the Fix
question but didn't.  The directory still seems to be broken, however, I assume
because the run was canceled.  Anyway, here's the rest of the e2fsck output:


Error reading block 2011565 (Attempt to read block from filesystem resulted in
short read).  Ignore error<y>? yes

Force rewrite<y>? yes

Directory inode 770757, block 0, offset 0: directory corrupted
Salvage<y>? yes

Missing '.' in directory inode 770757.
Fix<y>? yes

Setting filetype for entry '.' in ??? (770757) to 2.
Missing '..' in directory inode 770757.
Fix<y>?

/: e2fsck canceled.

/: ***** FILE SYSTEM WAS MODIFIED *****
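
The e2fsck output above can be condensed into a list of affected directory inodes and unreadable blocks. The Python sketch below matches only the two message forms that appear in this report. summarize_e2fsck is a hypothetical helper, and other e2fsck messages are ignored:

```python
import re

CORRUPT = re.compile(
    r"^Directory inode (\d+), block (\d+), offset (\d+): directory corrupted"
)
READ_ERROR = re.compile(r"^Error reading block (\d+)")


def summarize_e2fsck(output):
    """Collect corrupted directory inodes and unreadable block numbers."""
    inodes, blocks = [], []
    for line in output.splitlines():
        match = CORRUPT.match(line)
        if match:
            inodes.append(int(match.group(1)))
            continue
        match = READ_ERROR.match(line)
        if match:
            blocks.append(int(match.group(1)))
    return {"corrupt_dir_inodes": inodes, "unreadable_blocks": blocks}


sample = """\
Error reading block 2011563 (Attempt to read block from filesystem resulted in
short read).  Ignore error<y>? yes
Directory inode 770755, block 0, offset 0: directory corrupted
Salvage<y>? yes
"""
print(summarize_e2fsck(sample))
# {'corrupt_dir_inodes': [770755], 'unreadable_blocks': [2011563]}
```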
Comment 12 Josef Bacik 2008-07-14 09:39:06 EDT
Let fsck do the fixing and see if the directory is still corrupted.
Comment 13 Henry Hartley 2008-07-14 11:20:31 EDT
After running e2fsck -f /dev/hdb2 and fixing all problems (answering yes to all
questions), the directory no longer seems corrupted.  Furthermore, I removed
/mnt from PRUNEPATHS in /etc/updatedb.conf and updatedb now runs without any
problems.
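
To confirm whether a path is currently excluded, the PRUNEPATHS setting can be read from /etc/updatedb.conf. The Python sketch below assumes the usual form PRUNEPATHS = "path1 path2 ...", treats # as a comment start without regard to quoting, and lets the last assignment win. These are simplifications, not a full parser of the file format, and prune_paths and is_pruned are hypothetical helpers:

```python
import re


def prune_paths(conf_text):
    """Return the paths listed in PRUNEPATHS (simplified parsing)."""
    paths = []
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # simplification: ignores quoting
        match = re.match(r'PRUNEPATHS\s*=\s*"([^"]*)"$', line)
        if match:
            paths = match.group(1).split()
    return paths


def is_pruned(path, paths):
    """True if path equals a pruned path or lies beneath one."""
    path = path.rstrip("/") or "/"
    return any(
        path == p or path.startswith(p.rstrip("/") + "/") for p in paths
    )


# Illustrative configuration, not the reporter's actual file.
conf = 'PRUNEPATHS = "/tmp /var/tmp /mnt"  # hypothetical\n'
print(is_pruned("/mnt/oldslash/lib", prune_paths(conf)))  # True
```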
Comment 14 Josef Bacik 2008-08-07 10:51:54 EDT
Sounds like everything is good to go; I will close this bug.  Feel free to reopen it if you experience the same problem again.