Description of problem:
This is related to bz 177716. Those errors are caused by missing cmirror modules. An error stating which modules are needed should be given instead.

[root@taft-04 lvm]# lvcreate -l 34875 -n mirror_20 -m 2 mirror_2
  Error locking on node taft-03: Internal lvm error, check syslog
  Error locking on node taft-01: Internal lvm error, check syslog
  Error locking on node taft-02: Internal lvm error, check syslog
  Error locking on node taft-04: Internal lvm error, check syslog
  Failed to activate new LV.

Feb 21 09:17:00 link-08 kernel: device-mapper: dm-mirror: Invalid number of mirrors
Feb 21 09:17:00 link-08 kernel: device-mapper: error adding target to table
I doubt that this is the error you will receive with the latest RPMs... Also, use '-m1'. We are not interested in more than 2-sided mirrors at this point.
By the "latest rpms" you mean the latest cmirror rpms, correct? If you don't have any cmirror rpms installed but still have the latest device-mapper and lvm2, you will see these errors. What you should see, though, is a missing-module or missing-rpm warning.

[root@link-01 ~]# rpm -q device-mapper
device-mapper-1.02.03-1.0.RHEL4
[root@link-01 ~]# rpm -q lvm2
lvm2-2.02.02-1.0.RHEL4
[root@link-01 ~]# rpm -q lvm2-cluster
lvm2-cluster-2.02.01-1.2.RHEL4

[root@link-02 ~]# rpm -q device-mapper
device-mapper-1.02.03-1.0.RHEL4
[root@link-02 ~]# rpm -q lvm2
lvm2-2.02.02-1.0.RHEL4
[root@link-02 ~]# rpm -q lvm2-cluster
lvm2-cluster-2.02.01-1.2.RHEL4

[root@link-08 ~]# rpm -q device-mapper
device-mapper-1.02.03-1.0.RHEL4
[root@link-08 ~]# rpm -q lvm2
lvm2-2.02.02-1.0.RHEL4
[root@link-08 ~]# rpm -q lvm2-cluster
lvm2-cluster-2.02.01-1.2.RHEL4

[root@link-02 ~]# lvcreate -m 1 -L 100M VG -n cmirror1
  Error locking on node link-01: Internal lvm error, check syslog
  Error locking on node link-08: Internal lvm error, check syslog
  Error locking on node link-02: Internal lvm error, check syslog

device-mapper: dm-mirror: Error creating mirror dirty log
device-mapper: error adding target to table

[root@link-02 ~]# lvs
  LV       VG   Attr   LSize   Origin Snap%  Move Log            Copy%
  cmirror  VG   mwi-d- 100.00M                    cmirror_mlog    0.00
  cmirror1 VG   mwi-d- 100.00M                    cmirror1_mlog   0.00

Without the cmirror rpm/module it shouldn't attempt to create any clustered mirror.
I mean the latest device-mapper and lvm2[-cluster] RPMs, which you now have; and, as you can see, the error messages are different from the original post:

Feb 21 09:17:00 link-08 kernel: device-mapper: dm-mirror: Invalid number of mirrors
Feb 21 09:17:00 link-08 kernel: device-mapper: error adding target to table

vs.

device-mapper: dm-mirror: Error creating mirror dirty log
device-mapper: error adding target to table

As far as not attempting to create a cluster mirror if the module is not loaded, I think that's stretching it... I think it should fail, but perhaps give a better indication of the problem, like "Error creating mirror dirty log of type 'clustered_disk'". What do you think?
I still gotta go with what I posted: it shouldn't try it if you don't have the code. Take snapshots, for instance. I don't have the code, and it doesn't try:

[root@link-02 ~]# lvcreate -s /dev/VG/cmirror -L 100M -n snappy
  Clustered snapshots are not yet supported.

In this mirror case, I end up stuck with a "real" volume in some unknown state that, depending on the user, will either get deleted or get used anyway. There are many instances in CS where, if you don't have the right rpm/module, the code will call you an idiot. As 177716 shows, we've already got one possible customer and one tester trying this and not knowing what to do after it failed, or even why it failed.

To me, the following does not say, "look moron, why don't you load the proper rpm if you want clustered mirrors"; to me it says, "for some reason your clustered mirror attempt failed", which leaves me thinking: why did it fail?

device-mapper: dm-mirror: Error creating mirror dirty log
device-mapper: error adding target to table
An init script has been introduced to fix this problem.
What if the init script doesn't get run or fails for some reason? If we are just going to have the attempt fail instead of not attempting it at all, then a better message is needed. Something about not being able to contact cmirror or the module not being loaded. To a newbie, "dm-mirror: Error creating mirror dirty log" doesn't really mean anything other than something is wrong.
More ammo in defense of mirror creation not being attempted when the module is not loaded is the whole Gulm issue (193597 and 193907). If we don't support mirroring on gulm, we shouldn't try it. Instead, it blindly attempts it and then half works/half fails. What's the user supposed to think at that point?

[root@taft-04 ~]# gulm_tool getstats $(hostname)
I_am = Client
Master = taft-01.lab.msp.redhat.com
rank = -1
quorate = true
GenerationID = 1149257635110865
run time = 24
pid = 4284
verbosity = Default
failover = enabled

[root@taft-04 ~]# lvcreate -L 10G -m 1 -n deanmirror mirror_1
  Error locking on node taft-03: Internal lvm error, check syslog
  Error locking on node taft-01: Internal lvm error, check syslog
  Error locking on node taft-02: Internal lvm error, check syslog
  Error locking on node taft-04: Internal lvm error, check syslog
  Failed to activate new LV.

[root@taft-04 ~]# lvscan
  ACTIVE   '/dev/mirror_1/deanmirror' [10.00 GB] inherit

[root@taft-04 ~]# lvs
  LV         VG       Attr   LSize  Origin Snap%  Move Log              Copy%
  deanmirror mirror_1 mwi-d- 10.00G                    deanmirror_mlog   0.00
Hit this scenario again. I forgot to load the module (it's still early in the morning, give me a break :) and it took me a while to realize why my creations were suddenly failing. Nothing in the syslog or in the errors from the command says, "the module isn't loaded, dummy".
We don't actually have a proper mechanism yet for userspace to ask the kernel what mirror logs are registered.
(for now, the best you could do is have mirrored.c's local target_present check for CLUSTERED and log_lv and extend target_present() to check /proc/modules & issue modprobe for unsupported cases like this - and there'll be more in future if modules get shared with multipath)
Something would have to be added to the activation code. Otherwise, the node issuing the command will load the module, but all the other nodes won't, which means that the create still fails (on most of the nodes).
Moving this to the 4.6 release consideration due to the impact of the code changes required.
*** Bug 236345 has been marked as a duplicate of this bug. ***
Now that this is causing confusion for customers, maybe we should do something with this bug.
Adding cc ecs-dev-list for tracking.
If comments 4, 6, and 7 (especially 7) aren't good enough, check out how some of our other components deal with not having the init script started:

[root@taft-03 ~]# ccs_tool update 2
Unable to connect to the CCS daemon: Connection refused
Failed to update config file.

[root@taft-03 ~]# cman_tool nodes
cman_tool: can't open /proc/cluster/nodes, cman not running

[root@taft-03 ~]# gulm_tool getstats taft-02
Failed to connect to taft-02 (::ffff:10.15.89.68 40040) Connection refused
In src/gulm_tool.c:607 (1.0.10) death by: Failed to connect to server

[root@taft-03 ~]# clustat
Could not connect to cluster service

They all either don't attempt the operation or give a meaningful error.
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release you requested us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/errata/ If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.