Bug 1463843
| Summary: | storaged flooding messages with "Failed to update LVM volume group" | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Federico Sun <fsun> |
| Component: | storaged | Assignee: | Vratislav Podzimek <vpodzime> |
| Status: | CLOSED ERRATA | QA Contact: | xhe <xhe> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.2 | CC: | bdonahue, vpodzime, xhe |
| Target Milestone: | rc | Keywords: | Extras |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | storaged-2.5.2-4 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-09-05 09:56:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Federico Sun
2017-06-21 21:26:06 UTC
(In reply to Federico Sun from comment #0)
> This issue has been detected and fixed in upstream:
>
> * https://github.com/storaged-project/udisks/issues/45
> * https://github.com/storaged-project/udisks/pull/90/commits/583cf3031f3c22cbb95ba507b21859772f8cdf05
>
> But the patch doesn't seem to be in the latest version of RHEL 7.3.
I'll try to backport and ship it next week.
I have the patch backported.
For the manual test, I tried to reproduce the issue on RHEL 7.3 with storaged-2.5.2-2. I started the storaged service and created 1000 logical volumes (100M-500M per LV) in a single volume group, then created a filesystem on each LV and mounted it. I then checked the "/proc/425/fd" directory and still did not hit the issue. Please let me know which steps I missed.
*********** manual details *****************
# rpm -qa|egrep "storaged|kernel"
libstoraged-2.5.2-2.el7.x86_64
storaged-lvm2-2.5.2-2.el7.x86_64
storaged-2.5.2-2.el7.x86_64
kernel-3.10.0-514.el7.x86_64
# storagectl status
MODEL REVISION SERIAL DEVICE
--------------------------------------------------------------------------
IBM ServeRAID M5110e 3.19 0028f23c21ca1c6819f0ae9640600705 sda
NETAPP LUN 811a BVfC5+GmgIm0 dm-3 sdb sdf sdj sdn
NETAPP LUN 811a BVfC5+GmgInR dm-4 sdc sdg sdk sdo
NETAPP LUN 811a BVfC5+GmgInT dm-5 sdd sdh sdl sdp
NETAPP LUN 811a BVfC5+GmgInV dm-6 sde sdi sdm sdq
for i in {1..1000}
do
    lvcreate -L 100M -n "lvm_test$i" rhel_storageqe-53
    mkfs.ext3 "/dev/rhel_storageqe-53/lvm_test$i"
    mkdir "/lvm_test$i"
    mount "/dev/rhel_storageqe-53/lvm_test$i" "/lvm_test$i"
done
# ps -aux|grep storaged
root 425 0.1 0.1 465436 33684 ? Ssl 03:06 0:05 /usr/libexec/storaged/storaged --no-debug
root 19634 0.0 0.0 112652 960 pts/1 S+ 04:31 0:00 grep --color=auto storaged
# ls /proc/425/fd
0 1 10 11 12 2 3 4 5 6 7 8 9
# ll /proc/425/fd
total 0
lrwx------. 1 root root 64 Aug 11 03:06 0 -> /dev/null
lrwx------. 1 root root 64 Aug 11 03:06 1 -> /dev/null
lr-x------. 1 root root 64 Aug 11 03:06 10 -> anon_inode:inotify
lrwx------. 1 root root 64 Aug 11 03:06 11 -> socket:[76410]
lrwx------. 1 root root 64 Aug 11 03:06 12 -> anon_inode:[eventfd]
lrwx------. 1 root root 64 Aug 11 03:06 2 -> /dev/null
lrwx------. 1 root root 64 Aug 11 03:06 3 -> socket:[76405]
lrwx------. 1 root root 64 Aug 11 03:06 4 -> anon_inode:[eventfd]
lrwx------. 1 root root 64 Aug 11 03:06 5 -> anon_inode:[eventfd]
lrwx------. 1 root root 64 Aug 11 03:06 6 -> socket:[40385]
lrwx------. 1 root root 64 Aug 11 03:06 7 -> anon_inode:[eventfd]
lr-x------. 1 root root 64 Aug 11 03:06 8 -> /proc/425/mountinfo
lr-x------. 1 root root 64 Aug 11 03:06 9 -> /proc/swaps
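The check above lists the daemon's open file descriptors by hand. A minimal sketch of the same check as a reusable helper, assuming a Linux /proc filesystem; the function name count_fds is hypothetical and is not part of storaged or any tool mentioned in this report:

```shell
#!/bin/sh
# count_fds PID: print the number of entries under /proc/PID/fd,
# i.e. how many file descriptors the process currently has open.
# Reading another user's process usually requires root.
count_fds() {
    pid=$1
    if [ ! -d "/proc/$pid/fd" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    ls "/proc/$pid/fd" | wc -l | tr -d ' '
}

# Demo on the current shell; for the daemon one could instead run
# (as root):  count_fds "$(pgrep -x storaged)"
count_fds "$$"
```

Sampling this count before and after a stress run makes it easier to compare two listings than reading the full `ls -l` output each time.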
I still could not reproduce it. For each of the 1000 LVs I renamed the LV, mounted and unmounted it, and slept 1 second.
*************** snip **************
[1/1000]
Renamed "lvm_1" to "lvm_test1" in volume group "rhel_storageqe-53"
Renamed "lvm_test1" to "lvm_1" in volume group "rhel_storageqe-53"
mount /dev/rhel_storageqe-53/lvm_1 /lvm_test1
umount /dev/rhel_storageqe-53/lvm_1
sleep 1
[.../1000]
...
[899/1000]
Renamed "lvm_899" to "lvm_test899" in volume group "rhel_storageqe-53"
Renamed "lvm_test899" to "lvm_899" in volume group "rhel_storageqe-53"
mount /dev/rhel_storageqe-53/lvm_899 /lvm_test899
umount /dev/rhel_storageqe-53/lvm_899
[900/1000]
Renamed "lvm_900" to "lvm_test900" in volume group "rhel_storageqe-53"
Renamed "lvm_test900" to "lvm_900" in volume group "rhel_storageqe-53"
mount /dev/rhel_storageqe-53/lvm_900 /lvm_test900
umount /dev/rhel_storageqe-53/lvm_900
...
[1000/1000]
# ps -aux|grep storaged
root 4735 0.4 0.0 455116 25636 ? Ssl 05:39 0:26 /usr/libexec/storaged/storaged --no-debug
root 14207 0.0 0.0 112652 960 pts/4 S+ 07:10 0:00 grep --color=auto storaged
root 28236 0.0 0.0 143864 1348 ? S 03:07 0:00 journalctl -q --output=json --lines=10 --follow -- _SYSTEMD_UNIT=storaged.service + _SYSTEMD_UNIT=udisks2.service + _SYSTEMD_UNIT=dm-event.service + _SYSTEMD_UNIT=smartd.service + _SYSTEMD_UNIT=multipathd.service
# ls -l /proc/4735/fd
total 0
lrwx------. 1 root root 64 Aug 28 05:39 0 -> /dev/null
lrwx------. 1 root root 64 Aug 28 05:39 1 -> /dev/null
lr-x------. 1 root root 64 Aug 28 05:39 10 -> anon_inode:inotify
lrwx------. 1 root root 64 Aug 28 05:39 11 -> socket:[10198018]
lrwx------. 1 root root 64 Aug 28 05:39 12 -> anon_inode:[eventfd]
lrwx------. 1 root root 64 Aug 28 05:39 2 -> /dev/null
lrwx------. 1 root root 64 Aug 28 05:39 3 -> socket:[10131453]
lrwx------. 1 root root 64 Aug 28 05:39 4 -> anon_inode:[eventfd]
lrwx------. 1 root root 64 Aug 28 05:39 5 -> anon_inode:[eventfd]
lrwx------. 1 root root 64 Aug 28 05:39 6 -> socket:[10191481]
lrwx------. 1 root root 64 Aug 28 05:39 7 -> anon_inode:[eventfd]
lr-x------. 1 root root 64 Aug 28 05:39 8 -> /proc/4735/mountinfo
lr-x------. 1 root root 64 Aug 28 05:39 9 -> /proc/swaps
I tested storaged 2.5.2-2, 2.5.2-3 and 2.5.2-4 and have not reproduced the issue so far. I used 1000 LVs from a single VG and, in a 1000-iteration loop, renamed each LV twice, mounted it and unmounted it, sleeping 1 second per cycle. I also ran three processes concurrently, each looping 1000+ times, both with and without the 1-second sleep:
1st process - umount lvm
2nd process - rename lvm twice
3rd process - mount lvm
My test that removes and recreates the 1000 LVs is still running; I will update the result once the whole run has finished.
According to the upstream test result in #c2, the fix resolves the original issue, so I did a sanity check only (the issue is hard for me to reproduce without the customer environment).
I verified it as SanityOnly.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2017:2597
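The rename/mount/umount cycle from the verification comment above is shown only as output, not as a script. A rough sketch of what such a loop might look like, reconstructed from that output; the helper names (run, stress_cycle, stress_all) and the DRY_RUN switch are hypothetical, and the script actually used for verification was not posted:

```shell
#!/bin/sh
# Sketch of the rename/mount/umount stress cycle.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 and
# run as root on a disposable test machine to execute them.
VG=${VG:-rhel_storageqe-53}
COUNT=${COUNT:-1000}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One cycle for LV number $1: rename away and back, then mount and umount.
stress_cycle() {
    n=$1
    run lvrename "$VG" "lvm_$n" "lvm_test$n"
    run lvrename "$VG" "lvm_test$n" "lvm_$n"
    run mount "/dev/$VG/lvm_$n" "/lvm_test$n"
    run umount "/dev/$VG/lvm_$n"
}

# Run the cycle for LVs 1..COUNT, pausing 1 second between real cycles.
stress_all() {
    i=1
    while [ "$i" -le "$COUNT" ]; do
        echo "[$i/$COUNT]"
        stress_cycle "$i"
        [ "$DRY_RUN" = 1 ] || sleep 1
        i=$((i + 1))
    done
}
```

Calling `stress_all` with the defaults prints the commands for all 1000 LVs without touching the system. It assumes the LVs lvm_1 through lvm_1000 and the mount points /lvm_test1 through /lvm_test1000 already exist.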