Bug 175158
| Summary: | scsi devices busy in cluster | | |
|---|---|---|---|
| Product: | [Retired] Red Hat Cluster Suite | Reporter: | Nate Straz <nstraz> |
| Component: | lvm2-cluster | Assignee: | Christine Caulfield <ccaulfie> |
| Status: | CLOSED NOTABUG | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4 | CC: | agk |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | i386 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2005-12-07 21:42:30 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Nate Straz
2005-12-06 23:52:02 UTC
Tried it with DLM on the same cluster and ran into the same problem. I have another cluster running the same tests without problems, with the same storage hardware and HBAs. I did some digging with systemtap and found why EBUSY is being returned: in rescan_partitions(), bdev->bd_part_count == 1 on the failing nodes.

Here is the systemtap script I used:

```
/* rescan.stp
   This is a simple script to figure out what is happening
   with block device rescans. */

global count_rescan

probe kernel.function("blkdev_ioctl") {
    log("blkdev_ioctl cmd " . string($cmd));
}

probe kernel.function("blkdev_reread_part") {
    log("entered blkdev_reread_part");
}

probe kernel.function("rescan_partitions") {
    log("entered rescan_partitions");
    log("part_count = " . string($bdev->bd_part_count));
}

probe kernel.function("invalidate_partition") {
    log("entered invalidate_partition");
}

probe begin { log("starting probe"); }
probe end { log("ending probe"); }
```

Passing output:

```
[root@morph-01 ~]# stap /tmp/rescan.stp
starting probe
blkdev_ioctl cmd 21286
blkdev_ioctl cmd 21286
blkdev_ioctl cmd 4703
entered blkdev_reread_part
entered rescan_partitions
part_count = 0
entered invalidate_partition
blkdev_ioctl cmd 4712
blkdev_ioctl cmd -2147216782
blkdev_ioctl cmd 21286
ending probe
```

Failing output:

```
[root@morph-02 ~]# stap /tmp/rescan.stp
starting probe
blkdev_ioctl cmd 21286
blkdev_ioctl cmd 21286
blkdev_ioctl cmd 4703
entered blkdev_reread_part
entered rescan_partitions
part_count = 1
blkdev_ioctl cmd 21286
ending probe
```

Ah ha! I found the holder! On the nodes that weren't working there was still an entry in the DM table. After using dmsetup to remove it, I was back in business.

ARG! Found that lvm.conf was wrong on one node: it didn't have cluster locking enabled.
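For anyone reading the trace: cmd 4703 is 0x125f, the BLKRRPART ioctl, which is what triggers the partition rescan. The EBUSY itself comes from the guard at the top of rescan_partitions(); the failing node never reaches invalidate_partition() because something still holds a partition open. A minimal sketch of that guard, paraphrased from the 2.6-era fs/partitions/check.c (not verbatim kernel code; only the parts visible in the trace are shown):

```c
/* Paraphrase of the 2.6-era rescan path (fs/partitions/check.c); not
 * verbatim kernel code.  It shows why morph-02 returns EBUSY without
 * ever entering invalidate_partition() in the trace above. */
int rescan_partitions(struct gendisk *disk, struct block_device *bdev)
{
	int res;

	/* Any open partition on this disk (here, the stale device-mapper
	 * mapping) keeps bd_part_count above zero, so the rescan is
	 * refused outright. */
	if (bdev->bd_part_count)
		return -EBUSY;		/* morph-02: part_count = 1 */

	res = invalidate_partition(disk, 0);	/* morph-01 gets here */
	if (res)
		return res;

	/* ... re-read the partition table and register partitions ... */
	return 0;
}
```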
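For the record, the cleanup amounted to removing the stale device-mapper entry. A hypothetical session (the mapping name VolGroup00-LogVol00 and the device /dev/sda are made up; dmsetup ls shows whatever mappings are live on your node):

```
[root@morph-02 ~]# dmsetup ls
VolGroup00-LogVol00     (253, 0)
[root@morph-02 ~]# dmsetup remove VolGroup00-LogVol00
[root@morph-02 ~]# blockdev --rereadpt /dev/sda    # BLKRRPART now succeeds
```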
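As for the root cause: cluster locking for LVM2 is selected by locking_type in /etc/lvm/lvm.conf, and the misconfigured node was presumably left at the local default. The cluster-aware setting for clvmd of that era is:

```
# /etc/lvm/lvm.conf
global {
    # 1 = local, file-based locking (the bad node's setting)
    # 3 = built-in clustered locking, coordinated through clvmd
    locking_type = 3
}
```

If I remember the tooling correctly, the lvm2-cluster package also ships an lvmconf --enable-cluster helper that makes this edit for you.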