Bug 488663 - hald fails with stale or empty /var/cache/hald/fdi-cache
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: hal
Version: 10
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: Richard Hughes
QA Contact: Fedora Extras Quality Assurance
Keywords: Triaged
Reported: 2009-03-04 22:46 EST by Chuck Ebbert
Modified: 2009-12-18 03:57 EST
Last Closed: 2009-12-18 03:57:17 EST
Attachments
patch to fix the problem (813 bytes, patch)
2009-03-05 04:43 EST, Richard Hughes
Description Chuck Ebbert 2009-03-04 22:46:14 EST
Description of problem:
With 0-byte /var/cache/hald/fdi-cache, hald tries to mmap 0 bytes and fails.
hald then fails to start and no input devices work.

With a stale file (left over from the previous shutdown), hald appears to start, but both connected displays are blank when X starts (only the backlights are on). This happens on only one machine.

Version-Release number of selected component (if applicable):
hal-0.5.12-14.20081027git.fc10.x86_64

How reproducible:
Stale file causes failure every time. Workaround was to edit the hald startup script to erase fdi-cache before starting the service. This fixes both the empty-file and stale-file failure cases.

A zero-byte fdi-cache file sometimes gets left behind when restarting the system after X starts with blank screens. This causes the other failure mode on the next boot.


Steps to Reproduce:
1. start system
2. use xrandr to turn off internal display and change resolution of external
3. reboot and get blank displays
4. turning off power after X fails to start can cause the 0-byte fdi-cache
  
Actual results:

Additional info:
strace of startup with 0-byte fdi-cache... looks like it also prints the wrong error message because the ioctl following the mmap also fails: the mmap returns EINVAL, but the message reports the later ioctl error (errno=25) instead:

open("/var/cache/hald/fdi-cache", O_RDONLY) = 16
fstat(16, {st_dev=makedev(8, 8), st_ino=107580, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0,
mmap(NULL, 0, PROT_READ, MAP_SHARED, 16, 0) = -1 EINVAL (Invalid argument)
fstat(1, {st_dev=makedev(0, 15), st_ino=381, st_mode=S_IFCHR|0666, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff5ae02580) = -1 ENOTTY (Inappropriate ioctl for device)
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f1f52d00000
unlink("/var/run/haldaemon.pid")        = -1 EACCES (Permission denied)
write(1, "*** [DIE] mmap_cache.c:di_rules_init():79 : Couldn't mmap file '/var/cache/hald/fdi-cache', errno=25: Inappropriate ioctl
exit_group(1)                           = ?
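
The errno mix-up visible in the trace can be reproduced in isolation. What follows is a minimal C sketch, not hald's actual code: `map_cache` is a hypothetical helper that maps a file read-only and returns the errno of the call that failed, captured before any later call can overwrite it. On Linux, mmap with a length of 0 fails with EINVAL, so an empty file should yield errno 22 here rather than the 25 reported in the message above.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (an assumption for illustration, not hald code).
 * Maps the file at `path` read-only.
 * Returns 0 on success, or the errno of the call that failed. */
static int map_cache(const char *path, void **addr, size_t *len)
{
    struct stat st;
    int saved;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return errno;

    if (fstat(fd, &st) < 0) {
        saved = errno;          /* save before close() can change it */
        close(fd);
        return saved;
    }

    *len = (size_t)st.st_size;
    *addr = mmap(NULL, *len, PROT_READ, MAP_SHARED, fd, 0);
    if (*addr == MAP_FAILED) {
        saved = errno;          /* save before any stdio or ioctl call */
        close(fd);
        return saved;
    }

    close(fd);                  /* the mapping stays valid after close */
    return 0;
}
```

A caller would pass the returned value to strerror() when building its error message, instead of reading errno after other library calls have run.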
Comment 1 Richard Hughes 2009-03-05 04:39:21 EST
(In reply to comment #0)
> Stale file causes failure every time. Workaround was to edit the hald startup
> script to erase fdi-cache before starting the service. This fixes both the
> empty-file and stale-file failure cases.

We don't want to do that -- it would significantly slow down the startup.

> A zero-byte fdi-cache file sometimes gets left behind when restarting the
> system after X starts with blank screens. This causes the other failure
> mode on the next boot.

I've never experienced this. The cache file should only be re-written when hal starts and finds new mtimes on the fdi directories, rather than at X startup. Can you nuke the 0 byte file, recreate it and then find out exactly when the file is written to zero bytes?

> Steps to Reproduce:
> 1. start system
> 2. use xrandr to turn off internal display and change resolution of external
> 3. reboot and get blank displays
> 4. turning off power after X fails to start can cause the 0-byte fdi-cache

I can't reproduce this -- what hardware are you using?

> Additional info:
> strace of startup with 0-byte fdi-cache... looks like it also prints the wrong
> error message because the ioctl following the mmap also fails:

I'll attach a patch that should fix things up in this case.
Comment 2 Richard Hughes 2009-03-05 04:43:49 EST
Created attachment 334115
patch to fix the problem

This is what I've applied upstream. I'll add this to the rawhide rpm if you want me to.
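
The contents of the attachment are not reproduced in this report. Purely as an illustration of one way to guard against the empty-file case, and not as a description of the applied patch, a daemon could test the cache before trying to map it. `cache_is_usable` is a hypothetical name.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical pre-check (an assumption; NOT attachment 334115).
 * Returns true only for a regular file with a non-zero size,
 * which is the minimum needed for mmap() to accept it. */
static bool cache_is_usable(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return false;           /* missing or unreadable */

    return S_ISREG(st.st_mode) && st.st_size > 0;
}
```

When the check fails, the caller could regenerate the cache rather than exit. Deletion would then happen only in the broken case, not on every start.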
Comment 3 Chuck Ebbert 2009-03-05 14:39:04 EST
(In reply to comment #1)
> > Steps to Reproduce:
> > 1. start system
> > 2. use xrandr to turn off internal display and change resolution of external
> > 3. reboot and get blank displays
> > 4. turning off power after X fails to start can cause the 0-byte fdi-cache
> 
> I can't reproduce this -- what hardware are you using?

Acer Aspire 5735 notebook:
http://www.smolts.org/client/show/pub_b177e97d-6b2e-4604-b4ec-1d4c360beeaf

I can't really try to reproduce the 0-byte file ... it means booting up and killing the system with the power switch -- I really don't want to do that.
Comment 4 Josua Dietze 2009-06-20 14:38:38 EDT
This happened to me and several other users during installation of FC11.

See
http://www.fedoraforum.org/forum/showthread.php?t=223372
Comment 5 Mario Lopez 2009-07-06 03:31:49 EDT
In my case this exact behaviour was due to an unrecognised wireless card after a LiveCD install.

HAL was not starting: during boot it showed as FAILED, and when started by hand, the message was related to "fdi cache".

gdb hald 
run --daemon=no --verbose=yes

RESULT
*** [DIE] mmap_cache.c:di_rules_init():79 : Couldn't mmap file '/var/cache/hald/fdi-cache', errno=22: Invalid argument

*****************

After reading this report, I deleted fdi-cache and HAL started, but when Xorg starts, the system freezes.

Here you find more information:

http://mariotpc.blogspot.com/2009/07/fedora-11-keyboard-and-mouse-not.html
Comment 6 Bug Zapper 2009-11-18 06:16:59 EST
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 10 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 7 Bug Zapper 2009-12-18 03:57:17 EST
Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
