(In reply to comment #0)
> Currently there is no reproducer, but I will try to create one (script).
If you manage to reproduce that, please try to grab the -vvvv debug log as well. Thanks.
I think I've managed to reproduce it using the following script:
#!/usr/bin/python
import commands
import os

def attachPid():
    pids = commands.getoutput('pgrep lvm')
    if not pids:
        print "No LVM process found"
        attachPid()
    else:
        pids = pids.split('\n')
        for pid in pids:
            pid = int(pid)
            os.kill(pid, 19)
            print "LVM process found (%s) - SIGSTOP sent to it" % pid
        attachPid()
Please run this script alongside several heavy lvm commands and leave it running for some time.
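For reference, here is a rough Python 3 equivalent of the reproducer above (a sketch of mine, not part of the original report): the commands module it relies on exists only in Python 2, and subprocess.run is the modern replacement. The single-shot behavior (no endless recursion) is my simplification.

```python
# Hypothetical Python 3 rewrite of the reproducer; sends SIGSTOP to every
# process whose name matches the given pattern, once per invocation.
import os
import signal
import subprocess

def attach_pid(pattern="lvm"):
    # pgrep exits with status 1 when nothing matches, so check returncode
    # instead of parsing empty output.
    result = subprocess.run(["pgrep", pattern],
                            capture_output=True, text=True)
    if result.returncode != 0:
        print("No LVM process found")
        return
    for line in result.stdout.split():
        pid = int(line)
        os.kill(pid, signal.SIGSTOP)  # named constant instead of raw 19
        print("LVM process found (%s) - SIGSTOP sent to it" % pid)

if __name__ == "__main__":
    attach_pid()
```

Unlike the original, this version returns after one pass; to mimic the original's behavior you would call it in a loop.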
[root@rhev-a8c-02 tmp]# ps -ww `pgrep lvm`
PID TTY STAT TIME COMMAND
20728 ? Z< 0:00 [lvm] <defunct>
This reproducer (comment #3) makes no sense to me.
You are using os.kill(pid, 19) - this sends SIGSTOP (signal 19 on x86), which of course causes the program to stop. This is not a bug.
What are you trying to do here? What do you expect from a SIGSTOP sent to lvm?
man 7 signal:
SIGCONT 19,18,25 Cont Continue if stopped
SIGSTOP 17,19,23 Stop Stop process
The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.
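As the man page excerpt shows, the raw number 19 means SIGSTOP on x86 but SIGCONT on some other architectures, which is exactly why named constants are safer. A standalone Python 3 sketch of mine (not part of the reproducer) demonstrating the stop/continue cycle on a harmless child process:

```python
# Standalone demo (hypothetical, not from the bug report): stop and resume
# a child process using portable named signal constants.
import os
import signal
import subprocess

child = subprocess.Popen(["sleep", "60"])   # harmless stand-in for an lvm process

os.kill(child.pid, signal.SIGSTOP)          # portable equivalent of os.kill(pid, 19) on x86
# With WUNTRACED, waitpid reports the child as stopped rather than exited.
_, status = os.waitpid(child.pid, os.WUNTRACED)
print("child stopped:", os.WIFSTOPPED(status))   # child stopped: True

os.kill(child.pid, signal.SIGCONT)          # the process simply continues
child.kill()                                # clean up the demo child
child.wait()
```

After the SIGCONT the child carries on as if nothing happened, which matches the observation later in this bug that a stopped lvm resumes once it receives SIGCONT.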
If this is the problem we tried to debug some time ago, there were some active lvm semaphores. But I think that was a relic of some previous testing or an upgrade.
(For the reproducer above - it was stopped, so obviously some semaphores were still present, but after "pkill -SIGCONT lvm" it simply continues, so this was not the previously observed problem.)
Description of problem: We have a machine with iSCSI storage connected to several LUNs, and some lvm commands have been stuck in S< state for the last 4-5 hours now. This state stuck our hypervisor management system, 'VDSM'. From a short debugging cycle held with mbroz, it appears that one process was blocked on a semaphore (and the others were waiting for it in flock()), and udevcomplete_all unblocked it. It also appears that we had lots of cookies/semaphores under dmsetup udevcookies.

[root@nott-vds3 ~]# dmsetup udevcookies
cookie     semid      value last_semop_time
0xd4d1360  114720771  1     Wed Jul  6 18:13:44 2011
0xd4d91b8  114753540  1     Wed Jul  6 18:13:44 2011
0xd4de72c  321388549  2     Thu Jul 14 16:48:08 2011
0xd4d79ba  114917382  1     Wed Jul  6 18:14:00 2011
0xd4d421d  272072711  1     Mon Jul 11 14:43:46 2011
0xd4d952e  280821768  1     Mon Jul 11 15:47:15 2011
0xd4d8465  288587785  1     Mon Jul 11 16:36:44 2011
0xd4d8fa5  321421322  1     Thu Jul 14 18:21:06 2011
0xd4df60f  321454091  2     Sun Jul 17 11:42:52 2011
0xd4d8b03  321486860  2     Sun Jul 17 12:10:32 2011
0xd4d7c6d  321519629  26    Sun Jul 17 17:58:12 2011

[root@nott-vds3 ~]# ps -w `pgrep lvm`
  PID TTY STAT TIME COMMAND
 1619 ?   S<   0:00 /sbin/lvm lvchange --config devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 filter = [ "a%1Daffi1309719|1Dafna-Export1308598|1EXPORT1310355|1EXP_Dom_Bckup|1Exp-Daf
 1931 ?   S<   0:00 /sbin/lvm vgck --config devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 filter = [ "a%1Daffi1309719|1Dafna-Export1308598|1EXPORT1310355|1EXP_Dom_Bckup|1Exp-Daf1310
 8895 ?   S<   0:00 /sbin/lvm lvchange --config devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 filter = [ "a%/dev/mapper/1Daffi1309719|/dev/mapper/1Dafna-Export1308598|/dev/mapper/1E
10774 ?   S<   0:00 /sbin/lvm pvs --config devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 filter = [ "a%1Daffi1309719|1Dafna-Export1308598|1EXPORT1310355|1EXP_Dom_Bckup|1Exp-Daf13100
11433 ?   S<   0:00 /sbin/lvm pvs --config devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 filter = [ "a%1Daffi1309719|1Dafna-Export1308598|1EXPORT1310355|1EXP_Dom_Bckup|1Exp-Daf13100
17550 ?   S<   0:00 /sbin/lvm vgck --config devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 filter = [ "a%1Daffi1309719|1Dafna-Export1308598|1EXPORT1310355|1EXP_Dom_Bckup|1Exp-Daf1310
19435 ?   S<   0:00 /sbin/lvm vgck --config devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 filter = [ "a%1Daffi1309719|1Dafna-Export1308598|1EXPORT1310355|1EXP_Dom_Bckup|1Exp-Daf1310

Kernel stacks of the stuck processes:

1619
[<ffffffff811bbd8d>] flock_lock_file_wait+0x18d/0x350
[<ffffffff811bc10b>] sys_flock+0x1bb/0x1d0
[<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

1931
[<ffffffff811bbd8d>] flock_lock_file_wait+0x18d/0x350
[<ffffffff811bc10b>] sys_flock+0x1bb/0x1d0
[<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

8895
[<ffffffff811f7c15>] sys_semtimedop+0x725/0x8b0
[<ffffffff811f7db0>] sys_semop+0x10/0x20
[<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

10774
[<ffffffff811bbd8d>] flock_lock_file_wait+0x18d/0x350
[<ffffffff811bc10b>] sys_flock+0x1bb/0x1d0
[<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

11433
[<ffffffff811bbd8d>] flock_lock_file_wait+0x18d/0x350
[<ffffffff811bc10b>] sys_flock+0x1bb/0x1d0
[<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

17550
[<ffffffff811bbd8d>] flock_lock_file_wait+0x18d/0x350
[<ffffffff811bc10b>] sys_flock+0x1bb/0x1d0
[<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

19435
[<ffffffff811bbd8d>] flock_lock_file_wait+0x18d/0x350
[<ffffffff811bc10b>] sys_flock+0x1bb/0x1d0
[<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

Currently there is no reproducer, but I will try to create one (script).
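The cookie semaphores listed by dmsetup udevcookies above are ordinary System V semaphore sets, so on Linux they can also be enumerated straight from /proc/sysvipc/sem. A hypothetical helper sketch (mine, not from this report; field names come from that file's own header line):

```python
# Hypothetical Linux-only helper: list System V semaphore sets from
# /proc/sysvipc/sem, roughly the same data that "ipcs -s" and
# "dmsetup udevcookies" draw on.
def list_semaphores(path="/proc/sysvipc/sem"):
    with open(path) as f:
        header = f.readline().split()   # key, semid, perms, nsems, ...
        return [dict(zip(header, line.split())) for line in f]

if __name__ == "__main__":
    for sem in list_semaphores():
        print(sem.get("semid"), sem.get("nsems"))
```

Cross-checking the semids here against the semid column of the udevcookies output is one way to spot cookie semaphores left behind by stopped lvm processes.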