Description of problem:
/var/log/lastlog is too large. See bug 156809.

Version-Release number of selected component (if applicable):
FC4

How reproducible:
See bug 156809.

Additional info:
Bug 156809 reports that "rsync" is unable to copy "/var/log/lastlog". The underlying problem is the database for "lastlog". Having a 1.2TB file (sparse or not) is a problem for most copy utilities. According to the documentation, even large gaps between UIDs cause delays in the "lastlog" program (clearly, it must be stepping through every UID in between). The program and database should be redesigned so that (1) the files aren't gratuitously large, and (2) the program ("lastlog") isn't so poorly designed that it appears to "hang" (see the "lastlog" documentation). The only workaround seems to be to remove the "lastlog" logging, which is a terrible workaround.
'struct utmp' comes from glibc.
Well, lastlog uses struct lastlog. In any case, the database design is poor, old BSDish. Unfortunately, many apps access this file directly rather than through a library's accessor functions, so the internal /var/log/lastlog layout is sadly part of the interface. The situation is similar for /var/run/utmp and /var/log/wtmp, though for those there are at least accessor functions defined in <utmp.h> and <utmpx.h>, some of them standardized by POSIX. To fix this, we'd need to change all the apps that touch these 3 files to never touch the files directly and use accessor functions instead (and in the case of lastlog, where no accessor functions exist at all, write them and decide which library they should go in: glibc, -llastlog, or something else). Only once that step is done can we work on moving the content of these files to different paths and changing their internal format.
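To make the accessor idea concrete, here is a minimal sketch, in Python only for brevity (a real version would be a C library). The names `LastlogEntry`, `get_lastlog` and `put_lastlog` are hypothetical; no such API exists today. The record layout is an assumption modelled on glibc's struct lastlog on Linux (4-byte time, 32-byte line, 256-byte host), and the sketch keeps the current uid-indexed file as its backend.

```python
import os
import struct
from typing import NamedTuple, Optional

# Assumed on-disk record, modelled on glibc's struct lastlog:
# int32 ll_time, char ll_line[32], char ll_host[256] (292 bytes, no padding).
_RECORD = struct.Struct("=i32s256s")


class LastlogEntry(NamedTuple):
    time: int  # seconds since the epoch
    line: str  # terminal name
    host: str  # remote host


def _cstr(raw: bytes) -> str:
    """Decode a NUL-padded C char array."""
    return raw.split(b"\0", 1)[0].decode("utf-8", errors="replace")


def get_lastlog(path: str, uid: int) -> Optional[LastlogEntry]:
    """Hypothetical accessor: return the entry for uid, or None if there is none."""
    with open(path, "rb") as f:
        f.seek(uid * _RECORD.size)
        data = f.read(_RECORD.size)
    if len(data) < _RECORD.size:
        return None  # the record lies past the end of the file
    if data == bytes(_RECORD.size):
        return None  # all-zero record, e.g. read from a hole
    when, line, host = _RECORD.unpack(data)
    return LastlogEntry(when, _cstr(line), _cstr(host))


def put_lastlog(path: str, uid: int, entry: LastlogEntry) -> None:
    """Hypothetical accessor: store the entry for uid."""
    data = _RECORD.pack(entry.time, entry.line.encode(), entry.host.encode())
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.pwrite(fd, data, uid * _RECORD.size)
    finally:
        os.close(fd)
```

Callers would only ever see `get_lastlog` and `put_lastlog`, so the seek arithmetic (and with it the file layout) could later be replaced without touching them.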
The file isn't gratuitously large; it's a sparse file, in any case. I'm not sure changing the conventions for these in a non-standard way is worth it.
The file is large in that: every tool that copies it needs to have an understanding of copying sparse files. For me, the main operational issue is this: I have these directories accessible through a variety of mechanisms (e.g., NFS mount, SMB mount, rsync source), and the files are accessed via a variety of tools, some of which aren't UNIX tools. I have to back up, copy, synchronize, restore, etc. these files via scripts, webmin, cpio, rsync, scp, and so on -- and there are some Windows-based tools that I need to access the files via SMB share or CD-ROM. Why? Because I support two separate facilities, each as a backup for the other. Why Windows tools? For disaster recovery using readily available media, tools, and hardware. Now which makes more sense? Fixing every tool from now until kingdom come to specially handle these everyday-UNIX files? Or, is it better to fix the 3 existing programs to use a common API (i.e., an implementation-hiding technique that has two decades of good software engineering experience) and localize the knowledge of the UTMP file interface to just its APIs? Finally, what requires the lastlog to have such a large size? It seems that there is no requirement that UTMP entries be placed 1.2TB into the file, right? Why can't they just be appended to the end of the file (with an atomic write-append)? Presumably, appending to the end of the file would work just as well, right? Regardless, the main point is the trade-off: either most scripts/apps mishandle everyday-UNIX operation/admin files and all need changes/hacks to back up /var/log properly (note: an Oracle database is not an everyday UNIX admin file), or we fix a couple of programs that were poorly designed to begin with because they made their implementation visible (said differently: they didn't hide their implementation by abstracting the service interface). Not exactly true: the API is there, they just don't use it.
And finally, why do x86_64 systems (and their administrators) have to worry about this problem, but x86 systems don't? Clearly, this becomes an odd portability problem when the x86_64 systems require different backup scripts than the x86 systems (again, I reiterate: for *everyday* UNIX operation/admin files).
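On the append question, a rough sketch of what an append-only layout could look like, in Python for brevity. Everything here is hypothetical: the record format (the uid stored inside each record) and the function names `append_login` and `last_login` are made up for illustration and are not how lastlog works today.

```python
import os
import struct
from typing import Optional, Tuple

# Hypothetical append-only record:
# uint32 uid, int32 time, char line[32], char host[256] (296 bytes).
_REC = struct.Struct("=Ii32s256s")


def _cstr(raw: bytes) -> str:
    """Decode a NUL-padded C char array."""
    return raw.split(b"\0", 1)[0].decode("utf-8", errors="replace")


def append_login(path: str, uid: int, when: int, line: str, host: str) -> None:
    """Append one record; O_APPEND places each write at the current end of file."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, _REC.pack(uid, when, line.encode(), host.encode()))
    finally:
        os.close(fd)


def last_login(path: str, uid: int) -> Optional[Tuple[int, str, str]]:
    """Scan every record and keep the last match for uid."""
    found = None
    with open(path, "rb") as f:
        while True:
            data = f.read(_REC.size)
            if len(data) < _REC.size:
                break
            rec_uid, when, line, host = _REC.unpack(data)
            if rec_uid == uid:
                found = (when, _cstr(line), _cstr(host))
    return found
```

With this layout the file size depends on the number of logins recorded, not on the largest UID, so a UID of 4294967294 costs one record like any other. The price is that a lookup has to scan the file, and the file keeps growing until something compacts it.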
(In reply to comment #4)
> Now which makes more sense? Fixing every tool from now until kingdom come to
> specially handle these everyday-UNIX files? Or, is it better to fix the 3
> existing programs to use a common API (i.e., an implementation-hiding technique
> that has two decades of good software engineering experience) and localize the
> knowledge of the UTMP file interface to just its APIs?

As stated above, the tools that access lastlog *don't* use an API; they access the file directly.

> Finally, what requires the lastlog to have such a large size? It seems that
> there is no requirement that UTMP entries be placed 1.2TB into the file, right?
> Why can't they just be appended to the end of the file (with an atomic
> write-append)?

It's a sparse file indexed by login id. The nfsnobody user has a userid of (-2). When an x86-64 system uses 32-bit UIDs, that (-2) is a very large number. It's done as a sparse file so that any user of the file who wants to look at the lastlog record for a particular uid can just seek to that userid's record, as opposed to parsing the whole file.
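For the record, the arithmetic behind the size can be sketched as follows (Python, for brevity). The 292-byte record is an assumption based on glibc's struct lastlog on Linux (4-byte time, 32-byte line, 256-byte host). The 16-bit case is included only for comparison; it is my guess at the x86 situation, which nobody here has spelled out.

```python
import struct

# Assumed record layout, modelled on glibc's struct lastlog:
# int32 ll_time, char ll_line[32], char ll_host[256].
LASTLOG_RECORD = struct.Struct("=i32s256s")


def as_unsigned(uid: int, bits: int) -> int:
    """Interpret a possibly negative UID as an unsigned value of the given width."""
    return uid & ((1 << bits) - 1)


def record_offset(uid: int) -> int:
    """Byte offset of a UID's record in a file indexed by UID."""
    return uid * LASTLOG_RECORD.size


def apparent_size(highest_uid: int) -> int:
    """Apparent file size once the highest UID has a record."""
    return record_offset(highest_uid) + LASTLOG_RECORD.size


if __name__ == "__main__":
    for bits in (16, 32):
        uid = as_unsigned(-2, bits)
        size = apparent_size(uid)
        print(f"{bits}-bit -2 -> uid {uid}, apparent size {size} bytes")
```

Under these assumptions a 32-bit (-2) is uid 4294967294 and the apparent size comes to roughly 1.25 TB, which is consistent with the 1.2TB in the report; a 16-bit (-2) is uid 65534 and gives about 19 MB.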
> It's a sparse file indexed by login id. The nfsnobody user has a userid of (-2).
> When an x86-64 system uses 32-bit UIDs, that (-2) is a very large number.

...which is the real problem, not lastlog's structure, which would probably be very difficult to change without breaking stuff.

Frank said:
> The file is large in that: every tool that copies it needs to have an
> understanding of copying sparse files.

Understanding or not, processing a 1.2TB file still takes forever. See tar --sparse: it works OK (i.e., produces a small tar file) but takes about an hour on my 64-bit system.
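A copier does not have to read through the holes if the OS and filesystem can report where the data is. Below is a minimal sketch of a hole-skipping copy using lseek() with SEEK_DATA and SEEK_HOLE, in Python for brevity. It assumes a platform where `os.SEEK_DATA` and `os.SEEK_HOLE` are available (e.g. a reasonably recent Linux); the function name `sparse_copy` is made up for illustration, and this is not a claim about how tar or rsync work internally.

```python
import errno
import os


def sparse_copy(src_path: str, dst_path: str, chunk: int = 1 << 20) -> None:
    """Copy src to dst, reading only data extents and leaving holes as holes."""
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            size = os.fstat(src).st_size
            offset = 0
            while offset < size:
                try:
                    # Jump to the next byte at or after offset that holds data.
                    data_start = os.lseek(src, offset, os.SEEK_DATA)
                except OSError as exc:
                    if exc.errno == errno.ENXIO:
                        break  # nothing but a hole up to the end of the file
                    raise
                # Find where this run of data ends.
                data_end = os.lseek(src, data_start, os.SEEK_HOLE)
                pos = data_start
                while pos < data_end:
                    buf = os.pread(src, min(chunk, data_end - pos), pos)
                    if not buf:
                        break  # the source shrank while we were copying
                    view = memoryview(buf)
                    while view:
                        written = os.pwrite(dst, view, pos)
                        pos += written
                        view = view[written:]
                offset = data_end
            # Extend to the full apparent size so a trailing hole is kept.
            os.ftruncate(dst, size)
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

On a filesystem that does not track holes, the lseek() calls should simply report the whole file as data, so the copy still comes out correct but is no faster than a plain read.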
Based on the date this bug was created, it appears to have been reported against rawhide during the development of a Fedora release that is no longer maintained. In order to refocus our efforts as a project we are flagging all of the open bugs for releases which are no longer maintained. If this bug remains in NEEDINFO thirty (30) days from now, we will automatically close it. If you can reproduce this bug in a maintained Fedora version (7, 8, or rawhide), please change this bug to the respective version and change the status to ASSIGNED. (If you're unable to change the bug's version or status, add a comment to the bug and someone will change it for you.) Thanks for your help, and we apologize again that we haven't handled these issues to this point. The process we're following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp We will be following the process here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this doesn't happen again.
This bug has been in NEEDINFO for more than 30 days since feedback was first requested. As a result we are closing it. If you can reproduce this bug in the future against a maintained Fedora version please feel free to reopen it against that version. The process we're following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp