Bug 1035044
| Summary: | pvs -a segfault during cluster mirror device failure testing | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | lvm2 | Assignee: | Jonathan Earl Brassow <jbrassow> |
| lvm2 sub component: | Default / Unclassified | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED WORKSFORME | Docs Contact: | |
| Severity: | high | ||
| Priority: | unspecified | CC: | agk, cmarthal, heinzm, jbrassow, msnitzer, prajnoha, prockai, thornber, zkabelac |
| Version: | 7.0 | ||
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-04-10 21:59:44 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1075262 | ||
This issue doesn't seem reproducible. If it is, we can fix in one of two ways: 1) check for out-of-bounds parameters in '_parse_mirror_status' to prevent the segfault 2) find out the reason why this code path is being traversed at all in a way that leads to an error condition and fix that. If it isn't reproducible, the bug can be closed. I'm not closing the bug now because I think the bug needs more time in testing. However, it shouldn't be considered a blocker unless it can be reproduced. |
Description of problem: ================================================================================ Iteration 0.4 started at Tue Nov 26 16:04:26 CST 2013 ================================================================================ Scenario kill_primary_synced_core_log_4_legs: Kill primary leg of synced core log 4 leg mirror(s) ********* Mirror hash info for this scenario ********* * names: syncd_primary_core_4legs_1 * sync: 1 * striped: 0 * leg devices: /dev/sda1 /dev/sde1 /dev/sdd1 /dev/sdb1 * log devices: * no MDA devices: * failpv(s): /dev/sda1 * failnode(s): virt-123.cluster-qe.lab.eng.brq.redhat.com virt-124.cluster-qe.lab.eng.brq.redhat.com virt-125.cluster-qe.lab.eng.brq.redhat.com * lvmetad: 0 * leg fault policy: remove * log fault policy: allocate ****************************************************** Creating mirror(s) on virt-123.cluster-qe.lab.eng.brq.redhat.com... virt-123.cluster-qe.lab.eng.brq.redhat.com: lvcreate --mirrorlog core -m 3 -n syncd_primary_core_4legs_1 -L 500M helter_skelter /dev/sda1:0-1000 /dev/sde1:0-1000 /dev/sdd1:0-1000 /dev/sdb1:0-1000 Current mirror/raid device structure(s): LV Attr LSize Cpy%Sync Devices syncd_primary_core_4legs_1 mwi-a-m--- 500.00m 6.40 syncd_primary_core_4legs_1_mimage_0(0),syncd_primary_core_4legs_1_mimage_1(0),syncd_primary_core_4legs_1_mimage_2(0),syncd_primary_core_4legs_1_mimage_3(0) [syncd_primary_core_4legs_1_mimage_0] Iwi-aom--- 500.00m /dev/sda1(0) [syncd_primary_core_4legs_1_mimage_1] Iwi-aom--- 500.00m /dev/sde1(0) [syncd_primary_core_4legs_1_mimage_2] Iwi-aom--- 500.00m /dev/sdd1(0) [syncd_primary_core_4legs_1_mimage_3] Iwi-aom--- 500.00m /dev/sdb1(0) Waiting until all mirror|raid volumes become fully syncd... 0/1 mirror(s) are fully synced: ( 92.60% ) 1/1 mirror(s) are fully synced: ( 100.00% ) Creating gfs2 on top of mirror(s) on virt-123.cluster-qe.lab.eng.brq.redhat.com... mkfs.gfs2 -J 32M -j 3 -p lock_dlm -t STSRHTS22168:gfs1 /dev/helter_skelter/syncd_primary_core_4legs_1 -O Mounting mirrored gfs2 filesystems on virt-123.cluster-qe.lab.eng.brq.redhat.com... Mounting mirrored gfs2 filesystems on virt-124.cluster-qe.lab.eng.brq.redhat.com... Mounting mirrored gfs2 filesystems on virt-125.cluster-qe.lab.eng.brq.redhat.com... PV=/dev/sda1 syncd_primary_core_4legs_1_mimage_0: 6 PV=/dev/sda1 syncd_primary_core_4legs_1_mimage_0: 6 Writing verification files (checkit) to mirror(s) on... ---- virt-123.cluster-qe.lab.eng.brq.redhat.com ---- ---- virt-124.cluster-qe.lab.eng.brq.redhat.com ---- ---- virt-125.cluster-qe.lab.eng.brq.redhat.com ---- Sleeping 15 seconds to get some outsanding EXT I/O locks before the failure Verifying files (checkit) on mirror(s) on... ---- virt-123.cluster-qe.lab.eng.brq.redhat.com ---- ---- virt-124.cluster-qe.lab.eng.brq.redhat.com ---- ---- virt-125.cluster-qe.lab.eng.brq.redhat.com ---- Disabling device sda on virt-123.cluster-qe.lab.eng.brq.redhat.com Disabling device sde on virt-124.cluster-qe.lab.eng.brq.redhat.com Disabling device sdc on virt-125.cluster-qe.lab.eng.brq.redhat.com sh: line 1: 32516 Segmentation fault (core dumped) qarsh root.lab.eng.brq.redhat.com pvs -a > /dev/null 2>&1 simple pvs failed/segfaulted possible regression of BZ 571963 FI_engine: inject() method failed Nov 26 23:06:53 virt-123 qarshd[24087]: Running cmdline: echo offline > /sys/block/sda/device/state & Nov 26 23:06:54 virt-123 kernel: [82926.110209] sd 3:0:0:1: rejecting I/O to offline device Nov 26 23:06:54 virt-123 kernel: [82926.112492] sd 3:0:0:1: rejecting I/O to offline device Nov 26 23:06:54 virt-123 kernel: [82926.114695] sd 3:0:0:1: rejecting I/O to offline device Nov 26 23:06:54 virt-123 kernel: [82926.116896] sd 3:0:0:1: rejecting I/O to offline device Nov 26 23:06:54 virt-123 kernel: [82926.119071] sd 3:0:0:1: rejecting I/O to offline device Nov 26 23:06:55 virt-123 lvm[21428]: Primary mirror device 253:2 has failed (D). Nov 26 23:06:55 virt-123 lvm[21428]: Device failure in helter_skelter-syncd_primary_core_4legs_1. [...] Nov 26 23:06:56 virt-123 cmirrord[13471]: cpg_dispatch failed: CS_ERR_BAD_HANDLE Nov 26 23:06:56 virt-123 lvm[21428]: Monitoring mirror device helter_skelter-syncd_primary_core_4legs_1 for events. Nov 26 23:06:58 virt-123 cmirrord[13471]: cpg_dispatch failed: CS_ERR_BAD_HANDLE Nov 26 23:07:01 virt-123 lvm[21428]: Monitoring mirror device helter_skelter-syncd_primary_core_4legs_1 for events. Nov 26 23:07:01 virt-123 systemd: Starting qarsh Per-Connection Server... Nov 26 23:07:01 virt-123 systemd: Started qarsh Per-Connection Server. Nov 26 23:07:01 virt-123 qarshd[24141]: Talking to peer ::ffff:10.15.80.47:43496 (IPv6) Nov 26 23:07:01 virt-123 qarshd[24141]: Running cmdline: pvs -a Nov 26 23:07:02 virt-123 lvm[21428]: Mirror status: 1 of 4 images failed. Nov 26 23:07:02 virt-123 lvm[21428]: Trying to up-convert to 3 images, 0 logs. Nov 26 23:07:13 virt-123 kernel: [82944.896173] device-mapper: dm-log-userspace: [dyiCP3yC] Request timed out: [9/296085] - retrying Nov 26 23:07:15 virt-123 cmirrord[13471]: cpg_dispatch failed: CS_ERR_BAD_HANDLE Nov 26 23:07:15 virt-123 kernel: [82946.642100] pvs[24142]: segfault at 3 ip 00000000004a897c sp 00007fff31ef0ed0 error 4 in lvm[400000+102000] Nov 26 23:07:15 virt-123 abrt[24154]: Saved core dump of pid 24142 (/usr/sbin/lvm) to /var/tmp/abrt/ccpp-2013-11-26-23:07:15-24142 (1441792 bytes) Nov 26 23:07:15 virt-123 abrtd: New client connected Nov 26 23:07:18 virt-123 lvm[21428]: Monitoring mirror device helter_skelter-syncd_primary_core_4legs_1 for events. Nov 26 23:07:19 virt-123 lvm[21428]: Repair of mirrored device helter_skelter-syncd_primary_core_4legs_1 finished successfully. Nov 26 23:07:20 virt-123 lvm[21428]: No longer monitoring mirror device helter_skelter-syncd_primary_core_4legs_1 for events. Nov 26 23:07:22 virt-123 lvm[21428]: No longer monitoring mirror device helter_skelter-syncd_primary_core_4legs_1 for events. Nov 26 23:07:24 virt-123 lvm[21428]: No longer monitoring mirror device helter_skelter-syncd_primary_core_4legs_1 for events. Nov 26 23:07:54 virt-123 lvm[21428]: helter_skelter-syncd_primary_core_4legs_1 is now in-sync. Core was generated by `pvs -a'. Program terminated with signal 11, Segmentation fault. #0 0x00000000004a897c in _parse_mirror_status (mirror_status_str=<optimized out>, images_health=images_health@entry=0x7fff31ef1090, log_dev=log_dev@entry=0x7fff31ef1088, log_health=log_health@entry=0x7fff31ef1098) at activate/dev_manager.c:184 184 if (!strcmp(log_args[0], "disk")) { Missing separate debuginfos, use: debuginfo-install glibc-2.17-36.el7.x86_64 libgcc-4.8.2-3.el7.x86_64 libselinux-2.1.13-21.el7.x86_64 libsepol-2.1.9-1.el7.x86_64 ncurses-libs-5.9-11.20130511.el7.x86_64 pcre-8.32-8.el7.x86_64 readline-6.2-6.el7.x86_64 systemd-libs-207-8.el7.x86_64 (gdb) bt #0 0x00000000004a897c in _parse_mirror_status (mirror_status_str=<optimized out>, images_health=images_health@entry=0x7fff31ef1090, log_dev=log_dev@entry=0x7fff31ef1088, log_health=log_health@entry=0x7fff31ef1098) at activate/dev_manager.c:184 #1 0x00000000004aaadd in _ignore_blocked_mirror_devices (mirror_status_str=<optimized out>, length=1024000, start=0, dev=0xfe3518) at activate/dev_manager.c:250 #2 _device_is_usable (dev=dev@entry=0xfe3518, check_lv_names=check_lv_names@entry=1) at activate/dev_manager.c:401 #3 0x00000000004ab34a in device_is_usable (dev=dev@entry=0xfe3518) at activate/dev_manager.c:460 #4 0x000000000045632d in _lookup_p (f=<optimized out>, dev=0xfe3518) at filters/filter-persistent.c:291 #5 0x000000000045036a in dev_iter_get (iter=iter@entry=0xfdc100) at device/dev-cache.c:1011 #6 0x00000000004344fe in _process_all_devs (process_single_pv=0x431480 <_pvs_single>, handle=0xfcddf0, cmd=0xf78060) at toollib.c:763 #7 process_each_pv (cmd=cmd@entry=0xf78060, argc=argc@entry=0, argv=argv@entry=0x7fff31ef16e8, vg=vg@entry=0x0, flags=0, scan_label_only=<optimized out>, handle=0xfcddf0, process_single_pv=0x431480 <_pvs_single>) at toollib.c:934 #8 0x0000000000431e95 in _report (cmd=0xf78060, argc=0, argv=0x7fff31ef16e8, report_type=PVS) at reporter.c:408 #9 0x0000000000429b08 in lvm_run_command (cmd=cmd@entry=0xf78060, argc=0, argc@entry=2, argv=0x7fff31ef16e8, argv@entry=0x7fff31ef16d8) at lvmcmdline.c:1179 #10 0x000000000042a049 in lvm2_main (argc=2, argv=0x7fff31ef16d8) at lvmcmdline.c:1614 #11 0x00000036aba21af5 in __libc_start_main () from /lib64/libc.so.6 #12 0x0000000000416771 in _start () Version-Release number of selected component (if applicable): 3.10.0-54.el7.x86_64 lvm2-2.02.103-5.el7 BUILT: Tue Nov 12 13:36:17 CET 2013 lvm2-libs-2.02.103-5.el7 BUILT: Tue Nov 12 13:36:17 CET 2013 lvm2-cluster-2.02.103-5.el7 BUILT: Tue Nov 12 13:36:17 CET 2013 device-mapper-1.02.82-5.el7 BUILT: Tue Nov 12 13:36:17 CET 2013 device-mapper-libs-1.02.82-5.el7 BUILT: Tue Nov 12 13:36:17 CET 2013 device-mapper-event-1.02.82-5.el7 BUILT: Tue Nov 12 13:36:17 CET 2013 device-mapper-event-libs-1.02.82-5.el7 BUILT: Tue Nov 12 13:36:17 CET 2013 device-mapper-persistent-data-0.2.8-2.el7 BUILT: Wed Oct 30 16:20:48 CET 2013 cmirror-2.02.103-5.el7 BUILT: Tue Nov 12 13:36:17 CET 2013