I found sh_count 4294967295 in the lvmlockd log. This causes vgs to take the lock but never release it. Is there a thread safety problem in the code around sh_count++?
I don't see any obvious problem related to sh_count. If you can provide more information it might help us find or reproduce the problem, e.g. the full lvmlockd log, a description of commands using the shared VG, the frequency and concurrency of lvm commands, an lvm command with full debugging like 'pvs -vvvv'.
(In reply to David Teigland from comment #2)
> I don't see any obvious problem related to sh_count. If you can provide
> more information it might help us find or reproduce the problem, e.g. the
> full lvmlockd log, a description of commands using the shared VG, the
> frequency and concurrency of lvm commands, an lvm command with full
> debugging like 'pvs -vvvv'.

I ran vgs -vvvv and saw no error information, but lvmlockd did not release the shared lock on the VG. Then I saw sh_count 4294967295 in the lvmlockd log. From what I have observed, this variable holds the total number of shared locks that the threads of a client have taken on a resource. For example, if a client has five threads that have each taken a shared lock on the same VG, sh_count should be 5. If a race between the increments leaves the value at 4, and the five threads then release their shared locks on the VG (decrementing the counter five times), the value wraps to 4294967295. The next time vgs is executed, the bug appears.
So I want to confirm: is there a thread safety problem with sh_count++ in the code?
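The arithmetic behind the reported value can be sketched as follows. This is a minimal, single-threaded illustration and not lvmlockd code: it assumes a 32-bit unsigned counter (which the logged value 4294967295 suggests), and the names `struct resource`, `simulate_sh_count`, and `demo` are hypothetical. It only shows that one more decrement than increment wraps the counter. It does not show why the counts became unbalanced.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-resource shared-lock counter. */
struct resource {
    uint32_t sh_count;
};

/*
 * Apply `locks` increments followed by `unlocks` decrements and return
 * the final counter value. Unsigned arithmetic wraps modulo 2^32, so
 * decrementing past zero is well defined in C but yields a huge value.
 */
uint32_t simulate_sh_count(uint32_t locks, uint32_t unlocks)
{
    struct resource r = { .sh_count = 0 };

    for (uint32_t i = 0; i < locks; i++)
        r.sh_count++;
    for (uint32_t i = 0; i < unlocks; i++)
        r.sh_count--;

    return r.sh_count;
}

/* Usage example: balanced counts end at 0, one missing increment wraps. */
void demo(void)
{
    assert(simulate_sh_count(5, 5) == 0);
    assert(simulate_sh_count(4, 5) == UINT32_MAX); /* 4294967295 */
}
```

The same final value results whether an increment was lost or an extra unlock was processed, so the counter value alone cannot tell these two cases apart.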
The lvmlockd log while executing vgs is as follows:

lvmlockd: 1666859132 recv vgs[19817] cl 834 lock vg "926af2f1acac46a4b298ebe1b4122f08" mode sh flags 0
lvmlockd: 1666859132 S lvm_926af2f1acac46a4b298ebe1b4122f08 R VGLK action lock sh
lvmlockd: 1666859132 S lvm_926af2f1acac46a4b298ebe1b4122f08 R VGLK res_lock cl 834 mode sh
lvmlockd: 1666859132 S lvm_926af2f1acac46a4b298ebe1b4122f08 R VGLK lock_san sh at /dev/mapper/926af2f1acac46a4b298ebe1b4122f08-lvmlock:69206016
lvmlockd: 1666859132 S lvm_926af2f1acac46a4b298ebe1b4122f08 R VGLK res_lock rv 8 read vb 1010 7847
lvmlockd: 1666859132 send vgs[19817] cl 834 lock vg rv 0
lvmlockd: 1666859132 recv vgs[19817] cl 834 lock vg "926af2f1acac46a4b298ebe1b4122f08" mode un flags 0
lvmlockd: 1666859132 S lvm_926af2f1acac46a4b298ebe1b4122f08 R VGLK action lock un
lvmlockd: 1666859132 S lvm_926af2f1acac46a4b298ebe1b4122f08 R VGLK res_unlock cl 834
lvmlockd: 1666859132 S lvm_926af2f1acac46a4b298ebe1b4122f08 R VGLK res_unlock sh_count 4294967295
lvmlockd: 1666859132 send vgs[19817] cl 834 lock vg rv 0
lvmlockd: 1666859132 close vgs[19817] cl 834 fd 9
I don't see a thread safety issue. r->sh_count is only used by res_lock(), res_convert(), res_unlock(). These are only called by res_process() which is only called from the lockspace_thread. There is one lockspace_thread for each lockspace/VG. The sh_count value is increased when an lvm command requests shared lock, e.g. to read the VG, and then decreases when the command unlocks it or exits. I suspect the problem is due to a missing unlock from an lvm command. One area where we have problems like this is dmeventd where lvm commands are not run normally. What is the lvm version here? If it's not recent please try a recent version since it's possible this has been fixed.
(In reply to David Teigland from comment #6)
> I don't see a thread safety issue. r->sh_count is only used by res_lock(),
> res_convert(), res_unlock(). These are only called by res_process() which
> is only called from the lockspace_thread. There is one lockspace_thread for
> each lockspace/VG. The sh_count value is increased when an lvm command
> requests shared lock, e.g. to read the VG, and then decreases when the
> command unlocks it or exits. I suspect the problem is due to a missing
> unlock from an lvm command. One area where we have problems like this is
> dmeventd where lvm commands are not run normally. What is the lvm version
> here? If it's not recent please try a recent version since it's possible
> this has been fixed.

The lvm version is 2.02.180. Could it be that an lvm command takes the lock without r->sh_count being increased by 1? I have seen this on multiple physical machines for the same VG, so it may not be a thread safety problem. The end result is that the VG stays locked with a shared lock, lvcreate and lvextend fail, and I must restart lvmlockd and sanlock. When the problem occurs I run vgs -vvvv, but I see no error information. How can I find the root cause of this problem?
2.02.180 is very old, so it's very likely this has been fixed already. Please try a recent lvm release.
(In reply to David Teigland from comment #8)
> 2.02.180 is very old, so it's very likely this has been fixed already.
> Please try a recent lvm release.

Without upgrading the kernel, I tried to update lvm to 2.03 by installing it from the scpm, but the thin-provisioning-tools and libcorosync-devel dependencies could not be resolved. How can I compile lvm 2.03 on my current CentOS 7 system?
> Without upgrading the kernel, I tried to update the lvm to lvm2.03, and then
> installed it from the scpm. But I found that the thin-provisioning-tools and
> libcorosync-devel dependencies could not be resolved. So I want to ask how I
> can compile lvm2.03 in the current Centos7?

You will need to compile lvm2 from source, and enable/disable some configure options. I've never done this so I don't know which configure options to use, but I think it should be possible.
(In reply to David Teigland from comment #10)
> > Without upgrading the kernel, I tried to update the lvm to lvm2.03, and then
> > installed it from the scpm. But I found that the thin-provisioning-tools and
> > libcorosync-devel dependencies could not be resolved. So I want to ask how I
> > can compile lvm2.03 in the current Centos7?
>
> You will need to compile lvm2 from source, and enable/disable some configure
> options. I've never done this so I don't know which configure options to
> use, but I think it should be possible.

Hi David, I have recently been hitting VGLK format or delta lease errors constantly, such as "invalid val_blk version 1f83 flags 0 r_version 0". Is there any way to check for such errors? I am looking for a condition under which the VGLK can be initialized. Thank you.

For example:

[root@172-26-51-127 ~]# sanlock direct read_leader -r lvm_8e97627ab5ea4b0e8cb9f42c8345d728:26:/dev/mapper/8e97627ab5ea4b0e8cb9f42c8345d728-lvmlock:0
read_leader done -223
magic 0x12212010
version 0x30004
flags 0x10
sector_size 512
num_hosts 0
max_hosts 1
owner_id 0
owner_generation 0
lver 0
space_name lvm_8e97627ab5ea4b0e8cb9f42c8345d728
resource_name
timestamp 0
checksum 0x95ef093b
io_timeout 10
write_id 0
write_generation 0
write_timestamp 0
> Hi,David,I have been constantly encountering VGLK format or delta lease
> errors recently, such as "invalid val_blk version 1f83 flags 0 r_version 0".

Hi, sorry to hear this. The previous comment is that you needed to update to a new version of lvm, did that help?

These problems sound like data corruption on disk or in memory. The val_blk version should be constant (0x0101). That val_blk data is located within the lvb sector (sector number 2002 in a resource lease area). sanlock itself won't detect corruption within the lvb sector since that data is controlled by lvmlockd (corruption in that data is reported by lvmlockd). sanlock will detect corruption in other parts of the lease area, which you've also reported, e.g. VGLK or delta lease fields corrupted.

> I would like to ask if there is any way to check for such errors because I
> am looking for a condition that can initialize VGLK. Thank you.

If data corruption occurs on disk, the first sign will usually be sanlock error messages in the logs. The exception is corruption within the lvb sector, which will be reported by lvmlockd. I can't think of any options for handling this apart from watching logs for error messages.

> for example:
> [root@172-26-51-127 ~]# sanlock direct read_leader -r
> lvm_8e97627ab5ea4b0e8cb9f42c8345d728:26:/dev/mapper/
> 8e97627ab5ea4b0e8cb9f42c8345d728-lvmlock:0

That command is incorrect. You need to use -s to read lockspace leases (delta leases) from offset 0. The -r attempts to read a resource lease (paxos lease). The lvmlockd resource leases on /dev/mapper/vgname-lvmlock can be found at offsets 65MB (global lock), 66MB (VG lock), 67MB... (LV locks).

e.g.

sanlock direct read_leader -s <lockspace_name>:<host_id>:/dev/mapper/vgname-lvmlock:0

and

sanlock direct read_leader -r <lockspace_name>:<resource_name>:/dev/mapper/vgname-lvmlock:65M (66M, 67M, ...)

(Because the same leader_record structure is used by both lockspace and resource leases, some of the structure fields will look sensible, even if you mix up -s/-r.)
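As a worked example of the offsets above, the following sketch converts the megabyte positions into byte offsets. It is illustrative only: the helper names are hypothetical, and the 1MB spacing between consecutive LV leases is an assumption read from "67MB..." that may not hold for other sector or alignment sizes. The VG lock result, 69206016, matches the offset in the lock_san line of the lvmlockd log earlier in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)

/* Positions in MB as described in this thread (assumed, not read from disk). */
enum {
    GL_LOCK_MB = 65,       /* global lock */
    VG_LOCK_MB = 66,       /* VG lock (VGLK) */
    FIRST_LV_LOCK_MB = 67  /* first LV lock */
};

/* Byte offset of the global lock lease on the lvmlock LV. */
uint64_t gl_lease_offset(void)
{
    return GL_LOCK_MB * MIB;
}

/* Byte offset of the VG lock lease on the lvmlock LV. */
uint64_t vg_lease_offset(void)
{
    return VG_LOCK_MB * MIB;
}

/*
 * Byte offset of the LV lease with the given zero-based index,
 * assuming (hypothetically) one lease per 1MB starting at 67MB.
 */
uint64_t lv_lease_offset(uint32_t lv_index)
{
    return (FIRST_LV_LOCK_MB + (uint64_t)lv_index) * MIB;
}

/* Usage example: the VG lock offset equals the one logged by lvmlockd. */
void demo_offsets(void)
{
    assert(vg_lease_offset() == 69206016ULL);
}
```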
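(In reply to David Teigland from comment #13)
> > Hi,David,I have been constantly encountering VGLK format or delta lease
> > errors recently, such as "invalid val_blk version 1f83 flags 0 r_version 0".
>
> Hi, sorry to hear this. The previous comment is that you needed to update to
> a new version of lvm, did that help?
>
> These problems sound like data corruption on disk or in memory. The val_blk
> version should be constant (0x0101). That val_blk data is located within the
> lvb sector (sector number 2002 in a resource lease area). sanlock itself
> won't detect corruption within the lvb sector since that data is controlled
> by lvmlockd (corruption in that data is reported by lvmlockd). sanlock will
> detect corruption in other parts of the lease area, which you've also
> reported, e.g. VGLK or delta lease fields corrupted.
>
> > I would like to ask if there is any way to check for such errors because I
> > am looking for a condition that can initialize VGLK. Thank you.
>
> If data corruption occurs on disk, the first sign will usually be sanlock
> error messages in the logs. The exception is corruption within the lvb
> sector, which will be reported by lvmlockd. I can't think of any options for
> handling this apart from watching logs for error messages.
>
> > for example:
> > [root@172-26-51-127 ~]# sanlock direct read_leader -r
> > lvm_8e97627ab5ea4b0e8cb9f42c8345d728:26:/dev/mapper/
> > 8e97627ab5ea4b0e8cb9f42c8345d728-lvmlock:0
>
> That command is incorrect. You need to use -s to read lockspace leases
> (delta leases) from offset 0. The -r attempts to read a resource lease
> (paxos lease). The lvmlockd resource leases on /dev/mapper/vgname-lvmlock
> can be found at offsets 65MB (global lock), 66MB (VG lock), 67MB... (LV
> locks).
>
> e.g.
>
> sanlock direct read_leader -s
> <lockspace_name>:<host_id>:/dev/mapper/vgname-lvmlock:0
>
> and
>
> sanlock direct read_leader -r
> <lockspace_name>:<resource_name>:/dev/mapper/vgname-lvmlock:65M (66M, 67M, ...)
>
> (Because the same leader_record structure is used by both lockspace and
> resource leases, some of the structure fields will look sensible, even if
> you mix up -s/-r.)

Yes, I used the wrong command, but is a broken lease really undetectable? For example, after I zeroed the delta lease, the VG could not join the lockspace. Could lvmlockd or sanlock fix such problems automatically in the future?

In addition, I have also run into cases where executing 'sanlock direct dump /dev/mapper/$vg-lvmlock:0' returns 'cannot get align/sector size'. I am not sure where the align/sector size is stored for a lockspace, but it seems that every delta lease has these two fields.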
(In reply to shan.wu from comment #14)
> Yes, I used the wrong command, but is it really undetectable if the lease is
> broken? For example, after I set the delta lease to zero, VG cannot join the
> lockspace.

sanlock certainly detects bad leases, and logs the error messages you've seen.

> Can lvmlockd or sanlock automatically fix such problems in the future?

In theory, lvmlockd could check the sanlock errors and respond by rewriting the lease. It might avoid the problem you see now, but then other problems may appear. It would be treating the symptoms of the problem rather than the root cause. So, I doubt we'd want to do this.

The best approach is to track down and stop the data corruption. It seems like entire blocks are being damaged or written to wrong locations. When sanlock sees bad lease data, I wonder if we could have it save that entire block of raw data to a local file for debugging. Perhaps if we could inspect that raw data block, we might see what data it contains, and maybe who wrote it, e.g. does the data contain certain strings or magic numbers? e.g. is it a block of data from sanlock, or from a file system, or from lvm, or from an application?

We don't see other reports of this, although there are not many users of lvmlockd+sanlock, and other users may not have such stressful workloads.

> In addition, I have also encountered issues where executing 'sanlock
> direct dump /dev/mapper/$vg-lvmlock:0' returns 'cannot get align/sector
> size'. I am not sure where the align/sector size is stored for a lockspace,
> but it seems that every delta lease has these two fields.

I'm not sure what error this is. In the direct_dump() function I see this, which sounds similar:

    if (!sector_size || !align_size) {
        printf("Cannot find sector_size and align_size, set with -A and -Z.\n");

I think old sanlock versions did not include these fields on disk, so you can use -A/-Z to specify what the values are.
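The idea of inspecting a saved raw block for magic numbers or strings could look roughly like the sketch below. It is not part of sanlock or lvmlockd: the helper names are hypothetical, and it assumes the magic is stored as a 32-bit little-endian value in the first four bytes of the block. The value 0x12212010 is simply the magic printed by read_leader earlier in this thread. Any other magic or string to search for would have to be supplied by whoever is debugging.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode a 32-bit little-endian value from four bytes. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return 1 if the block starts with `magic` (assumed little-endian), else 0. */
int block_has_magic(const unsigned char *buf, size_t len, uint32_t magic)
{
    assert(buf != NULL);
    if (len < 4)
        return 0;
    return read_le32(buf) == magic;
}

/*
 * Return the byte offset of the first occurrence of `needle` inside the
 * block, or -1 if it does not occur. Plain linear scan over the block.
 */
long block_find(const unsigned char *buf, size_t len, const char *needle)
{
    assert(buf != NULL && needle != NULL);
    size_t n = strlen(needle);

    if (n == 0 || n > len)
        return -1;
    for (size_t i = 0; i + n <= len; i++) {
        if (memcmp(buf + i, needle, n) == 0)
            return (long)i;
    }
    return -1;
}
```

A caller would read one saved block into a buffer, test it against each magic value of interest, and then search it for strings that identify other writers, such as a file system or application signature.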
I have a question. I use lvmlockd and sanlock as the lock managers for lvm. I hit an error while executing 'sanlock client release -r', and the sanlock log showed 'no resource VGLK'. My environment has three VGs, and I found that when this command is executed against different VGs, sanlock uses the same client_id and fd each time. The lock is only released correctly when both of these match the client that holds it. This command does not support specifying the client_id, so what should I do?
You're right, I've wanted to address this for a long time. "sanlock client release -p <pid>" is not useful when a single process (like lvmlockd) is linked with multiple client connections. I've written an initial draft of a patch for this (adds -C <client_id> as an alternative to -p <pid>), but I don't have the time right now to test and release it.
(In reply to David Teigland from comment #17)
> You're right, I've wanted to address this for a long time. "sanlock client
> release -p <pid>" is not useful when a single process (like lvmlockd) is
> linked with multiple client connections. I've written an initial draft of a
> patch for this (adds -C <client_id> as an alternative to -p <pid>), but I
> don't have the time right now to test and release it.

If that's the case, how does lvmlockd release the lease, and can we use lvmlockd to release the lease in sanlock?
There are two ways to release a lease:

1. The client process (lvmlockd) releases its own lease. This uses the registered connection (fd) between lvmlockd and sanlock. The registered connection is directly associated with the client_id. This is how lvmlockd releases leases.

2. Another client process ("sanlock client ..." command) releases a lease that is held by a separate process (lvmlockd). In this method, the client process needs to somehow identify the other process (lvmlockd). Currently it uses the pid (with -p), but the patch I attached allows it to use the client_id (with -C). This method is not usually how leases should be released, but it can be useful in some special cases.