Bug 221756 - updatedb and cluster filesystems
Summary: updatedb and cluster filesystems
Alias: None
Product: Fedora
Classification: Fedora
Component: mlocate
Version: rawhide
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Michal Sekletar
QA Contact:
Depends On:
Reported: 2007-01-07 14:59 UTC by Axel Thimm
Modified: 2014-01-17 15:54 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Last Closed:
Type: ---


Description Axel Thimm 2007-01-07 14:59:26 UTC
When using GFS or another cluster fs one is faced with the problem that no single
node really owns the filesystem. For NAS setups like NFS one can exclude the client
mounts and have only the server run updatedb; for GFS and friends there is no
designated server that manages the mlocate database.

The current model is to have all cluster members run updatedb at the same
time, which for large clusters kills the IO. The current "workarounds" are to
add GFS to the prunefs argument, or to designate one cluster member to do the
updatedb work. The former is bad if one is interested in using locate on the
SAN contents; the latter is bad because it introduces a manual asymmetry among
the cluster members.

How about the following model: The updatedb database is split across filesystems
(e.g. under /.mlocatedb/) and updatedb uses lock files to indicate that someone
is already doing the updatedb work; in that case updatedb skips this mount
point. locate then uses these databases automatically.
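
A minimal sketch of the skip-if-locked step, not taken from mlocate itself. The
lock file name, the helper names and the scan callback are all hypothetical,
and the sketch assumes that exclusive file creation (O_CREAT | O_EXCL) is atomic
on the cluster filesystem in question, which would need to be verified per
filesystem. Stale locks are not handled here.

```python
import os
import socket

LOCK_NAME = "updatedb.lock"  # hypothetical name, not an mlocate convention


def try_lock(db_dir):
    """Return True if this node created the lock, False if one already exists."""
    os.makedirs(db_dir, exist_ok=True)
    lock_path = os.path.join(db_dir, LOCK_NAME)
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        # Record the owner so an admin can see which node is scanning.
        f.write(socket.gethostname() + "\n")
    return True


def release_lock(db_dir):
    os.unlink(os.path.join(db_dir, LOCK_NAME))


def scan_mounts(mount_points, scan):
    """Scan every mount point that is not locked by another node.

    `scan(mount_point, db_dir)` is a caller-supplied function standing in for
    the real updatedb work. Returns the list of mount points actually scanned.
    """
    scanned = []
    for mp in mount_points:
        db_dir = os.path.join(mp, ".mlocatedb")
        if not try_lock(db_dir):
            continue  # someone else is already doing the work for this mount
        try:
            scan(mp, db_dir)
            scanned.append(mp)
        finally:
            release_lock(db_dir)
    return scanned
```

Every cluster member could run the same loop from cron; whichever node creates
the lock first does the scan, and the others skip that mount point.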

The benefits are:
a)  Cluster filesystems can be scanned using only one member, but all
    members can later use locate on the contents.
b1) New cluster members have immediate access to a fresh updatedb.
b2) The same holds when moving disks between systems.
b3) Even NFS attached nodes could start using locate on NFS contents
c)  no manual config tuning for cluster fs.
d)  Independent of cluster fs in use, works with every current and upcoming

There are security implications to consider, especially for old-fashioned NFS
mounted systems, where the NFS client could spoof any userid including locate's,
but OTOH these setups are insecure on a much worse level than giving away
visibility of paths, e.g. the rogue NFS client can simply become the user whose
paths it wants to query and even read the contents.

Would that model make sense? It looks easy to implement and perhaps it could
default to the current single db setup, but have easy switches to make cluster
fs behave as described.


Comment 1 Miloslav Trmač 2007-01-07 21:28:47 UTC
Sounds interesting, but I wonder whether this dictates too much local policy.

e.g.: What if / and /usr are separate GFS mounts, with /usr mounted read-only?
There is no way to store data to /usr/.mlocatedb in that case.

Then there is the technical problem of detecting stale locks on a cluster
filesystem (without a shared PID space).

Comment 2 Axel Thimm 2007-01-07 22:46:43 UTC
Read-only mounts could check for (a read-only) .mlocatedb and fall back to /var
if it doesn't exist. Or the policy could be chosen in updatedb.conf.
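
A rough sketch of that fallback rule, under stated assumptions: the function
names are hypothetical, /var/lib/mlocate is used as the fallback root because
that is where mlocate keeps its single database today, and the per-mount
subdirectory naming is made up for illustration.

```python
import os


def is_read_only(mount_point):
    """True if the filesystem holding mount_point is mounted read-only."""
    return bool(os.statvfs(mount_point).f_flag & os.ST_RDONLY)


def pick_db_dir(mount_point, read_only=None, fallback_root="/var/lib/mlocate"):
    """Choose where the database for one mount point should live.

    Writable mount: use <mount>/.mlocatedb.
    Read-only mount with an existing .mlocatedb: use it as-is.
    Read-only mount without one: fall back to a directory under /var.
    """
    if read_only is None:
        read_only = is_read_only(mount_point)
    shared = os.path.join(mount_point, ".mlocatedb")
    if not read_only or os.path.isdir(shared):
        return shared
    # Hypothetical naming scheme: /usr -> <fallback_root>/usr
    name = mount_point.strip("/").replace("/", "_") or "root"
    return os.path.join(fallback_root, name)
```

With this rule a read-only /usr without its own .mlocatedb would be indexed
into a local database, which is the same as today's behaviour for that mount.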

Stale locks are nasty. One way to work around them would be to introduce
lock-stamping, e.g. have updatedb refresh the locks at fixed time
intervals and declare a lock stale if it's older than some higher value (e.g.
refresh every 15 minutes, declare stale if older than 30 minutes).
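
The lock-stamping idea could look roughly like the sketch below. It uses the
lock file's mtime as the stamp, which is an assumption (the stamp could equally
be written into the file), and the function names are hypothetical. It also
assumes the cluster nodes have reasonably synchronized clocks, and it does not
address the race where two nodes break the same stale lock at once.

```python
import os
import time

REFRESH_SECONDS = 15 * 60  # how often the scanning node re-stamps its lock
STALE_SECONDS = 30 * 60    # age after which other nodes consider it abandoned


def refresh_lock(lock_path):
    """Re-stamp the lock by setting its access and modification time to now."""
    os.utime(lock_path)


def is_stale(lock_path, now=None, stale_after=STALE_SECONDS):
    """True if the lock exists and was last stamped more than stale_after ago."""
    if now is None:
        now = time.time()
    try:
        stamped = os.stat(lock_path).st_mtime
    except FileNotFoundError:
        return False  # no lock at all, so nothing to declare stale
    return now - stamped > stale_after
```

The scanning node would call refresh_lock() every REFRESH_SECONDS while it
works; a node that finds a lock checks is_stale() before deciding to skip the
mount point.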

Comment 3 Miloslav Trmač 2007-03-16 04:18:40 UTC
Based on the above, I have just sent an RFC to fedora-devel-list.  Could you take
a look, please?

Comment 4 Fedora Admin XMLRPC Client 2012-10-10 14:19:39 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 5 Fedora Admin XMLRPC Client 2014-01-17 15:54:54 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
