Red Hat Bugzilla – Bug 223487
The removal of dmeventd files in /var/run causes mirror monitoring problems
Last modified: 2014-03-16 23:05:05 EDT
Note: The subject may not be accurate
When LVM volumes are activated in rc.sysinit and LVM mirrors are present, a
daemon (dmeventd) is launched to monitor the mirrors for device failures. That
daemon creates files in /var/run (a lock file and two socket files). If those
files are removed during the cleanup of /var/run, a host of problems follows, including:
1) another dmeventd can start up, creating conflicting daemons
2) LVM can misbehave when deleting mirror volumes
3) device failure events may not be handled
Fixing this problem doesn't necessarily mean leaving the dmeventd files in
/var/run when cleaning up that directory. When activating LVM volumes, we could
do 'vgchange -ay --monitor n ...'. This tells LVM not to do monitoring, and it
will not start the aforementioned daemon. We still need monitoring, however.
So, we would have to find an appropriate place afterwards (hopefully after all
file systems are mounted rw) to do a 'vgchange --monitor y'. This will start
monitoring on all active mirror volumes.
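The two-phase approach described above can be sketched roughly as follows. This is only an illustration: the function names and the DRY_RUN switch are hypothetical, and where each phase would actually be called from (rc.sysinit or elsewhere) is an assumption. By default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: two-phase activation, assuming monitoring is deferred until
# after /var/run has been cleaned. DRY_RUN and the function names are
# hypothetical helpers, not part of LVM.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands instead of executing them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Phase 1 (early boot): activate volumes without starting the monitoring daemon.
activate_without_monitoring() {
    run vgchange -ay --monitor n
}

# Phase 2 (later, ideally after all file systems are mounted rw):
# start monitoring on all active mirror volumes.
enable_monitoring() {
    run vgchange --monitor y
}

activate_without_monitoring
enable_monitoring
```

Setting DRY_RUN=0 would execute the commands for real, which requires root and the LVM tools.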
See bug 195476. Comments repeated here:
Comment #1 From Jeff Layton (firstname.lastname@example.org) on 2006-06-15 09:27 EST
Whether Jonathan's patch goes in or not, we'll end up with the event monitoring
daemon not getting started at boot time. So we need to activate it later
(sometime after /usr is mounted). This BZ is opened to make that happen.
Since we're doing that, I'll also change the lvm.static vgchange to prevent it
from trying to start the monitoring too and circumvent the error messages.
Comment #2 From Bill Nottingham (email@example.com) on 2006-06-15 10:16 EST
Wait, why is the command *now* starting a daemon on LVM activation?
That should be done either a) via udev rules or b) via a separate initscript. I
don't see how --monitor yes should ever be the default.
That bug is CLOSED->ERRATA. Did this issue come back?
My opinion would be:
- make "--monitor n" the default (having a device activation framework fork off a
daemon seems like the wrong interface to me)
- have a separate initscript that starts the monitor daemon (see mdmpd)
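For illustration, a separate initscript along these lines might look something like the sketch below. The function name monitor_ctl and the DRY_RUN switch are hypothetical, and mapping start/stop to 'vgchange --monitor y' and 'vgchange --monitor n' is an assumption, not something decided in this bug. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical initscript sketch: the name, layout, and DRY_RUN switch are
# assumptions for illustration only.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands instead of executing them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

monitor_ctl() {
    case "$1" in
        start)
            # Begin monitoring all active mirror volumes.
            run vgchange --monitor y
            ;;
        stop)
            # Stop monitoring before shutdown.
            run vgchange --monitor n
            ;;
        *)
            echo "Usage: monitor_ctl {start|stop}" >&2
            return 2
            ;;
    esac
}

monitor_ctl start
```

Keeping the daemon start in its own script means the activation command in rc.sysinit never has to fork anything itself.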
So, if you think it is the responsibility of device-mapper to start dmeventd,
then you still need to add '--monitor n' to the 'vgchange -ay ...' in rc.sysinit.
The default of activation will always be to monitor the mirror devices. We can
change when/where the daemon is started, but the default monitoring action
should not change.
(In reply to comment #2)
> So, if you think it is the responsibility of device-mapper to start dmeventd,
> then you still need to add '--monitor n' to the 'vgchange -ay ...' in rc.sysinit.
> The default of activation will always be to monitor the mirror devices. We can
> change when/where the daemon is started, but the default monitoring action
> should not change.
But it apparently *did* change in the update release - it certainly never
started monitoring before.
We did not have mirroring before, hence there were no devices that needed
monitoring. We now have LVM mirrors, so we need monitoring.
The default action for mirrors has not changed, just our support for mirrors.
Now that we support and test mirrors, we see bugs... like this one.
I still think this is definitely the wrong way to go about adding
features like this. Here's why:
You're essentially saying that, for any new dm-device type X:
1) it's perfectly reasonable to introduce a new random command
line argument for vgchange that applies only to that device type
2) it's perfectly reasonable to default that option to on,
introducing new behavior unexpectedly
This means that, in this case, any previously-existing invocation
of vgchange could now fork off a daemon in the background, and
*every* such invocation needs to be audited to see if forking the daemon
makes sense in that particular situation, whether the invocation is
in the initrd, in rc.sysinit, in /etc/init.d/netfs, or wherever.
Sure, we can audit and edit the stuff that we ship. But now this
new feature is going to happen in third-party scripts. Or sysadmin
And you're introducing this change in an update release to our
stable enterprise product - someone has to *add a command line
option* just to get the behavior they had before. Honestly, this
seems to me to be the sort of behavior that gets our customers up
in a row.
Maybe I'm misunderstanding how this works, but that's how I'm reading
Not to throw another monkey wrench in, but what about RHEL 5? The same code's in
agk had done some work to detect whether or not the binary being run was static.
Since the init scripts use the static binary, it should also be possible to
_not_ monitor in that case - allowing you to do nothing.
In any case, it seems we just need the right people to discuss and "do the right
OK. Just to make sure - is this also an issue for RHEL 5 or not?
I don't think changing the behavior vis-a-vis static/dynamic is the right
answer, for the same reasons of consistency/predictability. If it came to that,
it's probably best to shove in '--monitor n'. In any case, I don't see how to get
around having a separate init script for dmeventd, a la mdadm.
Closing this - this a) isn't the right way to do this and b) I'm not sure it's even relevant for RHEL 4 at this point.