Bug 1589741

Summary: multipathd[308]: segfault at 698 ip 0090d504 sp b770a1dc error 4 in libmultipath.so[904000+45000]
Product: Red Hat Enterprise Linux 6 Reporter: Lin Li <lilin>
Component: device-mapper-multipath Assignee: Ben Marzinski <bmarzins>
Status: CLOSED ERRATA QA Contact: Lin Li <lilin>
Severity: high Docs Contact: Jaroslav Klech <jklech>
Priority: high    
Version: 6.10 CC: agk, bmarzins, heinzm, jbrassow, jklech, lilin, msnitzer, prajnoha, rhandlin, toneata, zkabelac
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: device-mapper-multipath-0.4.9-106.el6_10.1 Doc Type: Bug Fix
Doc Text:
Previously, invalid memory was in some cases accessed during thread shutdown. Consequently, the multipathd daemon sometimes terminated unexpectedly during shutdown. This update fixes multipathd's pthreads cleanup code, and multipathd no longer crashes during shutdown.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-10-09 15:51:15 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Lin Li 2018-06-11 10:03:59 UTC
Description of problem:
multipathd[308]: segfault at 698 ip 0090d504 sp b770a1dc error 4 in libmultipath.so[904000+45000]
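
For readers triaging the message above, here is a minimal sketch of how its fields can be decoded. The helper names (segv_info, parse_segv_line, ip_offset, and the fault_* predicates) are hypothetical and exist only for this illustration. The error-code bit meanings are the standard x86 page-fault bits (bit 0: protection violation rather than a missing page, bit 1: write access, bit 2: user mode).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the fields of a kernel segfault message. */
struct segv_info {
    unsigned long fault_addr;  /* "segfault at 698" */
    unsigned long ip;          /* "ip 0090d504" */
    unsigned long sp;          /* "sp b770a1dc" */
    unsigned int  error_code;  /* "error 4" (printed in hex) */
    char          object[64];  /* "libmultipath.so" */
    unsigned long map_base;    /* "[904000+45000]": mapping start */
    unsigned long map_size;    /* "[904000+45000]": mapping length */
};

/* Parse one message of the form shown in this report. Returns 0 on success. */
int parse_segv_line(const char *line, struct segv_info *out)
{
    int n = sscanf(line,
                   "%*[^:]: segfault at %lx ip %lx sp %lx error %x in %63[^[][%lx+%lx]",
                   &out->fault_addr, &out->ip, &out->sp, &out->error_code,
                   out->object, &out->map_base, &out->map_size);
    return n == 7 ? 0 : -1;
}

/* Offset of the faulting instruction from the start of the mapping. */
unsigned long ip_offset(const struct segv_info *s)
{
    assert(s->ip >= s->map_base && s->ip < s->map_base + s->map_size);
    return s->ip - s->map_base;
}

/* x86 page-fault error code bits. */
int fault_is_protection(unsigned int ec) { return (ec & 0x1) != 0; } /* 0 means page not present */
int fault_is_write(unsigned int ec)      { return (ec & 0x2) != 0; } /* 0 means read access */
int fault_is_user(unsigned int ec)       { return (ec & 0x4) != 0; } /* 0 means kernel mode */
```

For the message in this report, the instruction offset is 0x90d504 - 0x904000 = 0x9504, and error 4 decodes to a user-mode read of a page that was not present. A fault address as low as 0x698 often indicates a field access through a NULL structure pointer, although the message alone does not prove that. With matching debuginfo installed, the offset can usually be resolved to a source line, for example with addr2line -e on the library file.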

Version-Release number of selected component (if applicable):
RHEL-6.10-20180525.0
2.6.32-754.el6.i686

How reproducible:
once

Steps to Reproduce:
1. Install RHEL-6.10-20180525.0
2. Run task /kernel/storage/misc/env_setup
3. Run task /kernel/storage/misc/log_checker

Actual results:
dracut: Scanning devices dm-3 dm-4 dm-5 sdd2  for LVM logical volumes vg_hpdl385g705/lv_root vg_hpdl385g705/lv_swap 
dracut: inactive '/dev/vg_hpdl385g705/lv_root' [50.00 GiB] inherit
dracut: inactive '/dev/vg_hpdl385g705/lv_home' [227.44 GiB] inherit
dracut: inactive '/dev/vg_hpdl385g705/lv_swap' [7.42 GiB] inherit
EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: 
dracut: Mounted root filesystem /dev/mapper/vg_hpdl385g705-lv_root
multipathd[308]: segfault at 698 ip 0090d504 sp b770a1dc error 4 in libmultipath.so[904000+45000]
dracut: Loading SELinux policy
type=1404 audit(1527508085.075:2): enforcing=1 old_enforcing=0 auid=4294967295 ses=4294967295
SELinux: 2048 avtab hash slots, 309443 rules.
SELinux: 2048 avtab hash slots, 309443 rules.
SELinux:  9 users, 12 roles, 4218 types, 237 bools, 1 sens, 1024 cats
SELinux:  81 classes, 309443 rules
SELinux:  Completing initialization.
SELinux:  Setting up existing superblocks.
SELinux: initialized (dev dm-6, type ext4), uses xattr
SELinux: initialized (dev drm, type drm), not configured for labeling
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev usbfs, type usbfs), uses genfs_contexts
SELinux: initialized (dev selinuxfs, type selinuxfs), uses genfs_contexts
SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
SELinux: initialized (dev hugetlbfs, type hugetlbfs), uses transition SIDs
SELinux: initialized (dev devpts, type devpts), uses transition SIDs
SELinux: initialized (dev inotifyfs, type inotifyfs), uses genfs_contexts
SELinux: initialized (dev anon_inodefs, type anon_inodefs), uses genfs_contexts
SELinux: initialized (dev pipefs, type pipefs), uses task SIDs
SELinux: initialized (dev debugfs, type debugfs), uses genfs_contexts
SELinux: initialized (dev sockfs, type sockfs), uses task SIDs
SELinux: initialized (dev devtmpfs, type devtmpfs), uses transition SIDs
SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
SELinux: initialized (dev proc, type proc), uses genfs_contexts
SELinux: initialized (dev bdev, type bdev), uses genfs_contexts
SELinux: initialized (dev rootfs, type rootfs), uses genfs_contexts
SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
type=1403 audit(1527508085.771:3): policy loaded auid=4294967295 ses=4294967295
dracut: 
dracut: Switching root
udev: starting version 147
ACPI: PCI Interrupt Link [I070] enabled at IRQ 47
  alloc irq_desc for 47 on node -1
  alloc kstat_irqs on node -1
hpwdt 0000:02:00.0: PCI INT A -> Link[I070] -> GSI 47 (level, high) -> IRQ 47
hpwdt: New timer passed in is 30 seconds.
hpwdt 0000:02:00.0: HPE Watchdog Timer Driver: NMI decoding initialized, allow kernel dump: ON (default = 1/ON)
, priority: LAST (default = 0/LAST).
hpwdt 0000:02:00.0: HPE Watchdog Timer Driver: 1.4.0-rh, timer margin: 30 seconds (nowayout=0).
hpilo 0000:02:00.2: PCI INT B -> Link[I071] -> GSI 44 (level, high) -> IRQ 44
hpilo 0000:02:00.2: setting latency timer to 64
ipmi message handler version 39.2
IPMI System Interface driver.
ipmi_si: probing via ACPI
ipmi_si 00:02: [io  0x0ca2-0x0ca3] regsize 1 spacing 1 irq 0
ipmi_si: Adding ACPI-specified kcs state machine
ipmi_si: probing via SMBIOS
ipmi_si: SMBIOS: io 0xca2 regsize 1 spacing 1 irq 0
ipmi_si: SMBIOS-specified kcs state machine: duplicate
ipmi_si: probing via SPMI
ipmi_si: SPMI: io 0xca2 regsize 2 spacing 2 irq 0
ipmi_si: SPMI-specified kcs state machine: duplicate
ipmi_si: Trying ACPI-specified kcs state machine at i/o address 0xca2, slave address 0x20, irq 0
ipmi_si 00:02: Found new BMC (man_id: 0x00000b, prod_id: 0x2000, dev_id: 0x13)
ipmi_si 00:02: IPMI kcs interface initialized
power_meter ACPI000D:00: Found ACPI power meter.
sr 2:0:0:0: Attached scsi generic sg0 type 5
sd 7:0:0:0: Attached scsi generic sg1 type 0
sd 7:0:0:2: Attached scsi generic sg2 type 0
sd 7:0:0:3: Attached scsi generic sg3 type 0
scsi 9:3:0:0: Attached scsi generic sg4 type 12
sd 9:0:0:0: Attached scsi generic sg5 type 0
sd 7:0:1:0: Attached scsi generic sg6 type 0
sd 7:0:1:2: Attached scsi generic sg7 type 0
sd 7:0:1:3: Attached scsi generic sg8 type 0
sd 7:0:2:0: Attached scsi generic sg9 type 0
sd 7:0:2:2: Attached scsi generic sg10 type 0
sd 7:0:2:3: Attached scsi generic sg11 type 0
sd 7:0:3:0: Attached scsi generic sg12 type 0
sd 7:0:3:2: Attached scsi generic sg13 type 0
sd 7:0:3:3: Attached scsi generic sg14 type 0
sd 8:0:0:0: Attached scsi generic sg15 type 0
sd 8:0:0:2: Attached scsi generic sg16 type 0
sd 8:0:0:3: Attached scsi generic sg17 type 0
sd 8:0:1:0: Attached scsi generic sg18 type 0
sd 8:0:1:2: Attached scsi generic sg19 type 0
sd 8:0:1:3: Attached scsi generic sg20 type 0
sd 8:0:2:0: Attached scsi generic sg21 type 0
sd 8:0:2:2: Attached scsi generic sg22 type 0
sd 8:0:2:3: Attached scsi generic sg23 type 0
sd 8:0:3:0: Attached scsi generic sg24 type 0
sd 8:0:3:2: Attached scsi generic sg25 type 0
sd 8:0:3:3: Attached scsi generic sg26 type 0

Expected results:
No segfault

Additional info:
beaker job: https://beaker.engineering.redhat.com/recipes/5202532#task72692206
console log: http://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2018/05/25084/2508450/5202532/console.log
dmesg log: http://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2018/05/25084/2508450/5202532/72692206/dmesg.log

Comment 20 Bob Handlin 2018-08-29 11:42:21 UTC
Setting as a blocker. Re: Regression.

Comment 22 Ben Marzinski 2018-09-11 18:10:11 UTC
Removed the crashing code.
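
The Doc Text attributes the crash to invalid memory being accessed during thread shutdown. The actual patch is not shown in this report, so the following is only a generic sketch of that class of problem and one common way to avoid it. It is not the multipathd fix. All names (shared_state, worker, unlock_state, run_and_shutdown) are hypothetical. The idea is that a cancelled thread releases its lock through a pthreads cleanup handler, and the owner joins the thread before freeing anything the thread might still touch.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical shared state; not a libmultipath structure. */
struct shared_state {
    pthread_mutex_t lock;
    unsigned long ticks;
};

/* Cleanup handler: releases the lock, also when the worker is cancelled. */
static void unlock_state(void *arg)
{
    struct shared_state *st = arg;
    pthread_mutex_unlock(&st->lock);
}

static void *worker(void *arg)
{
    struct shared_state *st = arg;
    const struct timespec delay = { 0, 1000000 };  /* 1 ms */

    for (;;) {
        pthread_mutex_lock(&st->lock);
        pthread_cleanup_push(unlock_state, st);
        st->ticks++;
        pthread_testcancel();     /* cancellation point while the lock is held */
        pthread_cleanup_pop(1);   /* not cancelled: run the handler to unlock */
        nanosleep(&delay, NULL);  /* cancellation point outside the lock */
    }
    return NULL;
}

/* Start the worker, then shut it down safely. Returns 0 on success. */
int run_and_shutdown(void)
{
    struct shared_state *st = malloc(sizeof(*st));
    pthread_t tid;
    void *res;

    if (st == NULL)
        return -1;
    pthread_mutex_init(&st->lock, NULL);
    st->ticks = 0;

    if (pthread_create(&tid, NULL, worker, st) != 0) {
        pthread_mutex_destroy(&st->lock);
        free(st);
        return -1;
    }

    pthread_cancel(tid);
    /* Join BEFORE freeing: the worker may still be using st until it exits. */
    if (pthread_join(tid, &res) != 0 || res != PTHREAD_CANCELED)
        return -1;  /* st is deliberately not freed on this error path */

    pthread_mutex_destroy(&st->lock);
    free(st);
    return 0;
}
```

Freeing the shared state before the join completes, or letting a handler run against memory that is already gone, would produce the kind of invalid access the Doc Text describes.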

Comment 32 Marie Hornickova 2018-09-26 15:42:22 UTC
Hi Ben,

Thanks a lot for providing the initial info to document this bug.

Could you please check whether my draft description is accurate content-wise?

Thank you!

Marie

Comment 33 Ben Marzinski 2018-09-26 20:39:02 UTC
(In reply to Marie Dolezelova from comment #32)
> Hi Ben,
> 
> Thanks a lot for providing the initial info to document this bug.
> 
> Please could you check my draft description whether it is accurate
> content-wise?
> 
> Thank you!
> 
> Marie

Looks good.

Comment 37 Lin Li 2018-09-28 23:21:37 UTC
Please note: for multipathd to no longer crash during shutdown, it is not enough to update device-mapper-multipath; the initramfs must also be rebuilt.

Comment 39 errata-xmlrpc 2018-10-09 15:51:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2901