Bug 1202449 - Kernel is crashing with BUG: unable to handle kernel NULL pointer dereference at sysfs_do_create_link_sd
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Mike Snitzer
QA Contact: Fedora Extras Quality Assurance
Reported: 2015-03-16 15:55 UTC by Zdenek Kabelac
Modified: 2018-04-06 18:05 UTC (History)

Last Closed: 2018-04-06 18:05:40 UTC
Type: Bug

Description Zdenek Kabelac 2015-03-16 15:55:13 UTC
Description of problem:

Kernel crash during lvm2 test suite execution of test:
make check_lvmetad T=shell/lvconvert-repair-transient-dmeventd.sh

Opening this as an lvm2 bug, but it could be moved to the kernel component later,
once the cause is known.

(The bug is reproducible on real hardware as well.)

Kernel stack trace:

[  368.284367] device-mapper: ioctl: unable to remove open device LVMTEST3572pv4
[  376.548070] ------------[ cut here ]------------
[  376.549236] WARNING: CPU: 1 PID: 4485 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x68/0x80()
[  376.550592] sysfs: cannot create duplicate filename '/devices/virtual/bdi/253:9'
[  376.552404] Modules linked in: loop ppdev parport_pc virtio_net parport i2c_piix4 serio_raw virtio_balloon acpi_cpufreq virtio_blk cirrus drm_kms_helper ttm drm virtio_pci ata_generic virtio_ring virtio pata_acpi
[  376.555934] CPU: 1 PID: 4485 Comm: dmsetup Not tainted 4.0.0-0.rc3.git2.1.fc23.x86_64 #1
[  376.557190] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[  376.558552]  0000000000000000 00000000a4c4a0bc ffff880075043898 ffffffff8187a720
[  376.559857]  0000000000000000 ffff8800750438f0 ffff8800750438d8 ffffffff810ab44a
[  376.561348]  ffff880075d9b4a0 ffff880074b7a000 ffff880073f7c748 ffff8800364b85e8
[  376.562799] Call Trace:
[  376.563222]  [<ffffffff8187a720>] dump_stack+0x4c/0x65
[  376.564462]  [<ffffffff810ab44a>] warn_slowpath_common+0x8a/0xc0
[  376.565633]  [<ffffffff810ab4d5>] warn_slowpath_fmt+0x55/0x70
[  376.566756]  [<ffffffff81303198>] ? kernfs_path+0x48/0x60
[  376.567793]  [<ffffffff81306fe8>] sysfs_warn_dup+0x68/0x80
[  376.568841]  [<ffffffff8130708d>] sysfs_create_dir_ns+0x8d/0xa0
[  376.569947]  [<ffffffff81427f99>] kobject_add_internal+0xc9/0x460
[  376.571104]  [<ffffffff8142839f>] kobject_add+0x6f/0xd0
[  376.572107]  [<ffffffff81583abe>] device_add+0x31e/0x6d0
[  376.573124]  [<ffffffff8125090d>] ? kfree+0x1cd/0x2c0
[  376.574138]  [<ffffffff81584098>] device_create_groups_vargs+0xe8/0x100
[  376.575311]  [<ffffffff815840cc>] device_create_vargs+0x1c/0x20
[  376.576394]  [<ffffffff81209f9a>] bdi_register+0x8a/0x2d0
[  376.577441]  [<ffffffff8120a207>] bdi_register_dev+0x27/0x30
[  376.578519]  [<ffffffff814053ae>] add_disk+0x1ce/0x510
[  376.579499]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.580528]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.581556]  [<ffffffff816b8ca2>] dm_create+0x332/0x530
[  376.583899]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.586277]  [<ffffffff816c04da>] dev_create+0x6a/0x2f0
[  376.588939]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.591400]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.593695]  [<ffffffff816bfc22>] ctl_ioctl+0x252/0x540
[  376.595926]  [<ffffffff816bff23>] dm_ctl_ioctl+0x13/0x20
[  376.598199]  [<ffffffff8128d398>] do_vfs_ioctl+0x2e8/0x530
[  376.600418]  [<ffffffff81127405>] ? rcu_read_lock_held+0x65/0x70
[  376.602765]  [<ffffffff8129a06e>] ? __fget_light+0xbe/0xe0
[  376.604947]  [<ffffffff8128d661>] SyS_ioctl+0x81/0xa0
[  376.607158]  [<ffffffff81883cc9>] system_call_fastpath+0x12/0x17
[  376.609419] ---[ end trace 08a2efe3a060fa0f ]---
[  376.611650] ------------[ cut here ]------------
[  376.613738] WARNING: CPU: 1 PID: 4485 at lib/kobject.c:240 kobject_add_internal+0x394/0x460()
[  376.616356] kobject_add_internal failed for 253:9 with -EEXIST, don't try to register things with the same name in the same directory.
[  376.620540] Modules linked in: loop ppdev parport_pc virtio_net parport i2c_piix4 serio_raw virtio_balloon acpi_cpufreq virtio_blk cirrus drm_kms_helper ttm drm virtio_pci ata_generic virtio_ring virtio pata_acpi
[  376.626457] CPU: 1 PID: 4485 Comm: dmsetup Tainted: G        W       4.0.0-0.rc3.git2.1.fc23.x86_64 #1
[  376.629182] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[  376.631818]  0000000000000000 00000000a4c4a0bc ffff8800750438f8 ffffffff8187a720
[  376.634389]  0000000000000000 ffff880075043950 ffff880075043938 ffffffff810ab44a
[  376.637029]  ffff880075043928 ffff880073054810 00000000ffffffef ffff880076e698a0
[  376.639786] Call Trace:
[  376.641389]  [<ffffffff8187a720>] dump_stack+0x4c/0x65
[  376.643744]  [<ffffffff810ab44a>] warn_slowpath_common+0x8a/0xc0
[  376.646039]  [<ffffffff810ab4d5>] warn_slowpath_fmt+0x55/0x70
[  376.648272]  [<ffffffff81428264>] kobject_add_internal+0x394/0x460
[  376.650586]  [<ffffffff8142839f>] kobject_add+0x6f/0xd0
[  376.652762]  [<ffffffff81583abe>] device_add+0x31e/0x6d0
[  376.654925]  [<ffffffff8125090d>] ? kfree+0x1cd/0x2c0
[  376.657034]  [<ffffffff81584098>] device_create_groups_vargs+0xe8/0x100
[  376.659383]  [<ffffffff815840cc>] device_create_vargs+0x1c/0x20
[  376.661720]  [<ffffffff81209f9a>] bdi_register+0x8a/0x2d0
[  376.663847]  [<ffffffff8120a207>] bdi_register_dev+0x27/0x30
[  376.666081]  [<ffffffff814053ae>] add_disk+0x1ce/0x510
[  376.668338]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.670499]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.672695]  [<ffffffff816b8ca2>] dm_create+0x332/0x530
[  376.674822]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.676984]  [<ffffffff816c04da>] dev_create+0x6a/0x2f0
[  376.679184]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.681347]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.683509]  [<ffffffff816bfc22>] ctl_ioctl+0x252/0x540
[  376.685723]  [<ffffffff816bff23>] dm_ctl_ioctl+0x13/0x20
[  376.687920]  [<ffffffff8128d398>] do_vfs_ioctl+0x2e8/0x530
[  376.690128]  [<ffffffff81127405>] ? rcu_read_lock_held+0x65/0x70
[  376.692356]  [<ffffffff8129a06e>] ? __fget_light+0xbe/0xe0
[  376.694530]  [<ffffffff8128d661>] SyS_ioctl+0x81/0xa0
[  376.696657]  [<ffffffff81883cc9>] system_call_fastpath+0x12/0x17
[  376.698874] ---[ end trace 08a2efe3a060fa10 ]---
[  376.701564] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[  376.702577] IP: [<ffffffff813072f0>] sysfs_do_create_link_sd.isra.2+0x40/0xe0
[  376.702577] PGD 0
[  376.702577] Oops: 0000 [#1] SMP
[  376.702577] Modules linked in: loop ppdev parport_pc virtio_net parport i2c_piix4 serio_raw virtio_balloon acpi_cpufreq virtio_blk cirrus drm_kms_helper ttm drm virtio_pci ata_generic virtio_ring virtio pata_acpi
[  376.702577] CPU: 0 PID: 4485 Comm: dmsetup Tainted: G        W       4.0.0-0.rc3.git2.1.fc23.x86_64 #1
[  376.702577] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[  376.702577] task: ffff880075d9b4a0 ti: ffff880075040000 task.ti: ffff880075040000
[  376.702577] RIP: 0010:[<ffffffff813072f0>]  [<ffffffff813072f0>] sysfs_do_create_link_sd.isra.2+0x40/0xe0
[  376.702577] RSP: 0018:ffff880075043b98  EFLAGS: 00010286
[  376.702577] RAX: ffff880075d9b4a0 RBX: 0000000000000040 RCX: ef7bdef7bdef7bdf
[  376.702577] RDX: ffff88007780eaa0 RSI: ffffffff813072f0 RDI: 0000000000000246
[  376.702577] RBP: ffff880075043bc8 R08: 0000000000000000 R09: 0000000000000000
[  376.702577] R10: 0000000000000001 R11: 00000000000000ba R12: 0000000000000001
[  376.702577] R13: ffffffff81c9a60a R14: ffff88007454e348 R15: ffff8800730558a0
[  376.702577] FS:  00007f5c66fc2800(0000) GS:ffff880077800000(0000) knlGS:0000000000000000
[  376.702577] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  376.702577] CR2: 0000000000000040 CR3: 0000000073cce000 CR4: 00000000000007f0
[  376.702577] Stack:
[  376.702577]  ffff880075043bd8 ffff880073055800 ffff8800730558b0 ffff88007305580c
[  376.702577]  ffff880075438000 ffff8800730558a0 ffff880075043bd8 ffffffff813073b5
[  376.702577]  ffff880075043c58 ffffffff81405447 ffff880073055800 0000000000000000
[  376.702577] Call Trace:
[  376.702577]  [<ffffffff813073b5>] sysfs_create_link+0x25/0x50
[  376.702577]  [<ffffffff81405447>] add_disk+0x267/0x510
[  376.702577]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.702577]  [<ffffffff816b8ca2>] dm_create+0x332/0x530
[  376.702577]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.702577]  [<ffffffff816c04da>] dev_create+0x6a/0x2f0
[  376.702577]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.702577]  [<ffffffff816c0470>] ? table_clear+0xf0/0xf0
[  376.702577]  [<ffffffff816bfc22>] ctl_ioctl+0x252/0x540
[  376.702577]  [<ffffffff816bff23>] dm_ctl_ioctl+0x13/0x20
[  376.702577]  [<ffffffff8128d398>] do_vfs_ioctl+0x2e8/0x530
[  376.702577]  [<ffffffff81127405>] ? rcu_read_lock_held+0x65/0x70
[  376.702577]  [<ffffffff8129a06e>] ? __fget_light+0xbe/0xe0
[  376.702577]  [<ffffffff8128d661>] SyS_ioctl+0x81/0xa0
[  376.702577]  [<ffffffff81883cc9>] system_call_fastpath+0x12/0x17
[  376.702577] Code: 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fe 53 48 c7 c7 60 8e f0 81 48 89 f3 41 89 cc 49 89 d5 48 83 ec 08 e8 40 b9 57 00 <48> 8b 1b 48 85 db 74 48 48 89 df e8 b0 be ff ff 48 c7 c7 60 8e
[  376.702577] RIP  [<ffffffff813072f0>] sysfs_do_create_link_sd.isra.2+0x40/0xe0
[  376.702577]  RSP <ffff880075043b98>
[  376.702577] CR2: 0000000000000040
[  376.702577] ---[ end trace 08a2efe3a060fa11 ]---
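A note on reading the trace above: the stack dump of the second warning contains the word 00000000ffffffef, which, read as a signed 32-bit value, is the -EEXIST named in the kobject_add_internal message. The faulting address in CR2 (0x40) also equals RBX, which suggests a read through a pointer that is a small offset from NULL rather than a wild pointer. The sketch below only checks the first point; `to_signed32` is a hypothetical helper name, and the only outside fact it relies on is that EEXIST is 17 on Linux.

```shell
#!/bin/sh
# Interpret a hex word as a signed 32-bit integer (two's complement).
# Hypothetical helper, for illustration only.
to_signed32() {
    value=$(( 0x$1 ))
    # Values with bit 31 set are negative in 32-bit two's complement.
    if [ "$value" -ge 2147483648 ]; then
        value=$(( value - 4294967296 ))
    fi
    printf '%d\n' "$value"
}

# Word taken from the stack dump of the second warning above.
to_signed32 00000000ffffffef   # prints -17, i.e. -EEXIST
```

The same helper can be applied to any other 32-bit error value that shows up in a register or stack dump.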



Version-Release number of selected component (if applicable):
4.0.0-0.rc3.git2.1.fc23.x86_64

How reproducible:


Steps to Reproduce:
1. Run the lvm2 test case above repeatedly until the crash occurs.

Actual results:


Expected results:


Additional info:

Comment 1 Zdenek Kabelac 2015-03-17 14:33:37 UTC
First crashing rawhide kernel: vmlinuz-3.20.0-0.rc0.git5.1.fc22.x86_64
Last usable kernel: vmlinuz-3.20.0-0.rc0.git4.1.fc22.x86_64


This translates to the following range of vanilla commits:

git diff 8cc748aa76c9 c7d7b9867155

This range contains a number of changes, including some in dm core itself.

Passing to Marian for bisect.


Steps needed:

Enter the test directory of the lvm2 test suite and mark a kernel as OK if it passes this:

for i in $(seq 1 30) ; do make check_lvmetad T=shell/lvconvert-repair-transient-dmeventd.sh ; done
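
For bisecting, a variant of the loop above that stops at the first failing run may be more convenient. This is only a sketch: `repeat_until_fail` is a hypothetical helper name (not part of the lvm2 test suite), and it assumes a failure shows up as a non-zero exit status of the test command. If the kernel oops hangs the machine first, the verdict still has to be recorded by hand.

```shell
#!/bin/sh
# Run a command up to N times and stop at the first failure.
# Returns 0 if every run passed, 1 otherwise.
# Hypothetical helper, for illustration only.
repeat_until_fail() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        if ! "$@"; then
            echo "run $i of $runs failed" >&2
            return 1
        fi
        i=$((i + 1))
    done
    return 0
}

# Intended use from the lvm2 test directory (not executed here):
#   repeat_until_fail 30 make check_lvmetad T=shell/lvconvert-repair-transient-dmeventd.sh
```

A zero return after all 30 runs corresponds to "mark kernel as OK" above; anything else marks the kernel as bad.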

Comment 2 Marian Csontos 2015-03-20 18:54:58 UTC
Tracked down to this commit:

commit c4db59d31e39ea067c32163ac961e9c80198fd37
Author: Christoph Hellwig <hch>
Date:   Tue Jan 20 14:05:00 2015 -0700

    fs: don't reassign dirty inodes to default_backing_dev_info

Comment 3 Jeff Moyer 2015-03-20 21:00:12 UTC
I also ran into this problem on one of my systems.  Here is the relevant (I hope) dm config.  It's a 2 device mirror setup with a mirrored log device.  Note that the boot device, root device and swap are all on partitions.  I run into the kernel BUG on boot.  If I prevent the dm-mirror target from loading, I can boot the system successfully.

[root@slayer ~]# pvdisplay
  --- Physical volume ---
  PV Name               /dev/sdc
  VG Name               mirror-data-vol
  PV Size               558.38 GiB / not usable 4.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              142943
  Free PE               66143
  Allocated PE          76800
  PV UUID               ukwk3L-kgCP-RbwM-Ele6-r4oR-6KRy-9ZPHRA
   
  --- Physical volume ---
  PV Name               /dev/sdd
  VG Name               mirror-data-vol
  PV Size               558.38 GiB / not usable 4.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              142943
  Free PE               66143
  Allocated PE          76800
  PV UUID               sPGS2e-Um4u-f5fP-AkUG-21Vi-Braz-xWjQWF
   
  --- Physical volume ---
  PV Name               /dev/sdf
  VG Name               mirror-data-vol
  PV Size               298.09 GiB / not usable 1.34 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              76311
  Free PE               76310
  Allocated PE          1
  PV UUID               Bjf955-6eoB-93K7-z4eU-x6Tx-JezI-2aalJm
   
  --- Physical volume ---
  PV Name               /dev/sdg
  VG Name               mirror-data-vol
  PV Size               298.09 GiB / not usable 1.34 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              76311
  Free PE               76310
  Allocated PE          1
  PV UUID               1SkTvK-malu-2je4-Rm82-b9VR-GeLs-aAFRSn
   
[root@slayer ~]# vgdisplay
  --- Volume group ---
  VG Name               mirror-data-vol
  System ID             
  Format                lvm2
  Metadata Areas        4
  Metadata Sequence No  10
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                4
  Act PV                4
  VG Size               1.67 TiB
  PE Size               4.00 MiB
  Total PE              438508
  Alloc PE / Size       153602 / 600.01 GiB
  Free  PE / Size       284906 / 1.09 TiB
  VG UUID               CHA4l9-LZ2z-VvT6-aSbS-FYHX-VO1B-3NALgS
   
[root@slayer ~]# lvdisplay
  --- Logical volume ---
  LV Path                /dev/mirror-data-vol/mirror-data-lv
  LV Name                mirror-data-lv
  VG Name                mirror-data-vol
  LV UUID                ckZZLS-4Cxv-4Rrh-TBGJ-ZzqT-YnGd-hfSOAU
  LV Write Access        read/write
  LV Creation host, time slayer.lab.bos.redhat.com, 2014-11-07 15:13:14 -0500
  LV Status              available
  # open                 0
  LV Size                300.00 GiB
  Current LE             76800
  Mirrored volumes       2
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           252:5
   
[root@slayer ~]# dmsetup ls --tree
mirror--data--vol-mirror--data--lv (252:5)
 |-mirror--data--vol-mirror--data--lv_mimage_1 (252:4)
 |  `- (8:48)
 |-mirror--data--vol-mirror--data--lv_mimage_0 (252:3)
 |  `- (8:32)
 `-mirror--data--vol-mirror--data--lv_mlog (252:2)
    |-mirror--data--vol-mirror--data--lv_mlog_mimage_1 (252:0)
    |  `- (8:96)
    `-mirror--data--vol-mirror--data--lv_mlog_mimage_0 (252:1)
       `- (8:80)

The stack trace looks pretty much identical to the one posted above, but I'll include it here.

[   25.413746] ------------[ cut here ]------------
[   25.419777] WARNING: CPU: 1 PID: 578 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x6)
[   25.428837] sysfs: cannot create duplicate filename '/devices/virtual/bdi/25'
[   25.437501] Modules linked in: dm_mirror dm_region_hash dm_log dm_mod corete2
[   25.491330] CPU: 1 PID: 578 Comm: pvscan Tainted: G          I     4.0.0-rc45
[   25.500114] Hardware name: Dell Inc. PowerEdge R710/00NH4P, BIOS 2.1.9 05/210
[   25.508915]  0000000000000000 000000009508a89a ffff8800bb1138a8 ffffffff8166d
[   25.517807]  0000000000000000[   25.518726] qla2xxx [0000:04:00.1]-8038:8: C.

[   25.528016]  ffff8800bb113900 ffff8800bb1138e8 ffffffff81079a1a
[   25.536920]  00005071bb1138e8 ffff880035f17000 ffff8800365338c0 ffff880035e70
[   25.545717] Call Trace:
[   25.549511]  [<ffffffff8166f66d>] dump_stack+0x45/0x57
[   25.555998]  [<ffffffff81079a1a>] warn_slowpath_common+0x8a/0xc0
[   25.563349]  [<ffffffff81079aa5>] warn_slowpath_fmt+0x55/0x70
[   25.570438]  [<ffffffff8126cb98>] ? kernfs_path+0x48/0x60
[   25.577193]  [<ffffffff81270508>] sysfs_warn_dup+0x68/0x80
[   25.583987]  [<ffffffff812705ae>] sysfs_create_dir_ns+0x8e/0xa0
[   25.591190]  [<ffffffff81311194>] kobject_add_internal+0xc4/0x3f0
[   25.598559]  [<ffffffff813116df>] kobject_add+0x6f/0xd0
[   25.605062]  [<ffffffff81674996>] ? mutex_lock+0x16/0x37
[   25.611692]  [<ffffffff814357e5>] device_add+0x125/0x640
[   25.618276]  [<ffffffff81435f18>] device_create_groups_vargs+0xd8/0x100
[   25.626168]  [<ffffffff81435f5c>] device_create_vargs+0x1c/0x20
[   25.633362]  [<ffffffff8119b547>] bdi_register+0x87/0x160
[   25.640033]  [<ffffffff8119b647>] bdi_register_dev+0x27/0x30
[   25.646939]  [<ffffffff812f3885>] add_disk+0x175/0x4b0
[   25.653318]  [<ffffffffa050e078>] dm_create+0x2d8/0x4e0 [dm_mod]
[   25.660549]  [<ffffffffa051546b>] dev_create+0x6b/0x2f0 [dm_mod]
[   25.667795]  [<ffffffffa0515400>] ? table_clear+0xf0/0xf0 [dm_mod]
[   25.675170]  [<ffffffffa0514c05>] ctl_ioctl+0x255/0x500 [dm_mod]
[   25.682370]  [<ffffffff811a9db7>] ? do_wp_page+0x3c7/0x7a0
[   25.689039]  [<ffffffff81285f18>] ? SYSC_semtimedop+0x298/0xeb0
[   25.696145]  [<ffffffffa0514ec3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
[   25.703412]  [<ffffffff81208268>] do_vfs_ioctl+0x2f8/0x4f0
[   25.710056]  [<ffffffff812084e1>] SyS_ioctl+0x81/0xa0
[   25.716244]  [<ffffffff81676b09>] system_call_fastpath+0x12/0x17
[   25.723402] ---[ end trace ad8730190225e8d2 ]---
[   25.729166] ------------[ cut here ]------------
[   25.734929] WARNING: CPU: 1 PID: 578 at lib/kobject.c:240 kobject_add_intern)
[   25.744531] kobject_add_internal failed for 251:0 with -EEXIST, don't try to.
[   25.758973] Modules linked in: dm_mirror dm_region_hash dm_log dm_mod corete2
[   25.812698] CPU: 1 PID: 578 Comm: pvscan Tainted: G        W I     4.0.0-rc45
[   25.821443] Hardware name: Dell Inc. PowerEdge R710/00NH4P, BIOS 2.1.9 05/210
[   25.830167]  0000000000000000 000000009508a89a ffff8800bb113908 ffffffff8166d
[   25.838941]  0000000000000000 ffff8800bb113960 ffff8800bb113948 ffffffff8107a
[   25.847681]  ffff8800bb113928 ffff880036693010 00000000ffffffef ffff8800364e0
[   25.856425] Call Trace:
[   25.860042]  [<ffffffff8166f66d>] dump_stack+0x45/0x57
[   25.866351]  [<ffffffff81079a1a>] warn_slowpath_common+0x8a/0xc0
[   25.873515]  [<ffffffff81079aa5>] warn_slowpath_fmt+0x55/0x70
[   25.880394]  [<ffffffff81311354>] kobject_add_internal+0x284/0x3f0
[   25.887696]  [<ffffffff813116df>] kobject_add+0x6f/0xd0
[   25.894032]  [<ffffffff81674996>] ? mutex_lock+0x16/0x37
[   25.900438]  [<ffffffff814357e5>] device_add+0x125/0x640
[   25.906841]  [<ffffffff81435f18>] device_create_groups_vargs+0xd8/0x100
[   25.914546]  [<ffffffff81435f5c>] device_create_vargs+0x1c/0x20
[   25.921561]  [<ffffffff8119b547>] bdi_register+0x87/0x160
[   25.928053]  [<ffffffff8119b647>] bdi_register_dev+0x27/0x30
[   25.934801]  [<ffffffff812f3885>] add_disk+0x175/0x4b0
[   25.941037]  [<ffffffffa050e078>] dm_create+0x2d8/0x4e0 [dm_mod]
[   25.948144]  [<ffffffffa051546b>] dev_create+0x6b/0x2f0 [dm_mod]
[   25.955252]  [<ffffffffa0515400>] ? table_clear+0xf0/0xf0 [dm_mod]
[   25.962544]  [<ffffffffa0514c05>] ctl_ioctl+0x255/0x500 [dm_mod]
[   25.969658]  [<ffffffff811a9db7>] ? do_wp_page+0x3c7/0x7a0
[   25.976248]  [<ffffffff81285f18>] ? SYSC_semtimedop+0x298/0xeb0
[   25.983279]  [<ffffffffa0514ec3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
[   25.990481]  [<ffffffff81208268>] do_vfs_ioctl+0x2f8/0x4f0
[   25.997079]  [<ffffffff812084e1>] SyS_ioctl+0x81/0xa0
[   26.003239]  [<ffffffff81676b09>] system_call_fastpath+0x12/0x17
[   26.010353] ---[ end trace ad8730190225e8d3 ]---
[   26.016437] BUG: unable to handle kernel NULL pointer dereference at 00000000
[   26.025480] IP: [<ffffffff81270800>] sysfs_do_create_link_sd.isra.2+0x40/0xd0
[   26.033778] PGD 0 
[   26.036944] Oops: 0000 [#1] SMP 
[   26.041328] Modules linked in: dm_mirror dm_region_hash dm_log dm_mod corete2

Comment 4 Mike Snitzer 2015-03-21 15:01:53 UTC
Marian, if you just revert commit c4db59d31e39 from the 4.0-rc4 kernel does it eliminate the crash?

I can try to reproduce using the test documented in comment#1 but figured you may already know the answer.

Comment 5 Marian Csontos 2015-03-23 13:27:49 UTC
Pure revert does not compile. At least df0ce26cb4ee8bc233d50213b97213532aff0a3c and 15d0f5ea348b9c4e6d41df294dde38a56a39c7bf must be reverted too.

Will need someone who understands it much better than me, as blindly reverting patches does not look like a good strategy to get a working kernel.

Comment 6 Mike Snitzer 2015-03-23 13:29:50 UTC
(In reply to Marian Csontos from comment #5)
> Pure revert does not compile. At least
> df0ce26cb4ee8bc233d50213b97213532aff0a3c and
> 15d0f5ea348b9c4e6d41df294dde38a56a39c7bf must be reverted too.
> 
> Will need someone who understands it much better than me, as blindly
> reverting patches does not look like a good strategy to get a working kernel.

Right, needs analysis to arrive at a proper fix.  Was just looking to confirm a revert eliminated the crash.

Comment 7 Marian Csontos 2015-03-23 13:57:20 UTC
The problem is gone after reverting c4db59d31e39 together with the two related patches (df0ce26cb4ee and 15d0f5ea348b) but I have not run the full test suite, which I will do as soon as the "reproducer" cycle is over.

I think we may need a better reproducer, but without understanding what's going on I will not come up with anything better than what we have.

Mike, will you take care of forwarding the bug appropriately?

Comment 8 Mike Snitzer 2015-03-23 14:12:54 UTC
(In reply to Marian Csontos from comment #7)
> The problem is gone after reverting c4db59d31e39 together with the two
> related patches (df0ce26cb4ee and 15d0f5ea348b) but I have not run the full
> test suite, which I will do as soon as the "reproducer" cycle is over.
> 
> I think we may need a better reproducer, but without understanding what's
> going on I will not come up with anything better than what we have.
> 
> Mike, will you take care of forwarding the bug appropriately?

I already sent mail to Christoph (and others) but no reply yet, see:
https://www.redhat.com/archives/dm-devel/2015-March/msg00207.html

But earlier in that thread, from http://marc.info/?l=linux-fsdevel&m=142100124713254&w=2 , Tejun said:
"It's also shifting writeback shutdown to destroy time from
unregistration time.  This is part of fixing the bdi lifetime issue,
right?"

So, given that the point of Christoph's exercise in this 4.0 patchset was to _prepare_ to fix bdi lifetime issues, it would seem that this commit jumped the gun and started changing functionality (which a follow-on patchset was supposed to provide).

But I'll dig in and figure out what is _really_ going on here.

Comment 9 Mike Snitzer 2015-03-23 22:31:35 UTC
This fixes the problem for me:
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-for-4.0&id=63a4f065ece613b6d575b538234375b0e9c23bbc

I'll very likely send this to Linus, for 4.0-rc6 inclusion, at the end of the week.

Comment 10 Josh Boyer 2015-03-25 00:49:24 UTC
(In reply to Mike Snitzer from comment #9)
> This fixes the problem for me:
> https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/
> commit/?h=dm-for-4.0&id=63a4f065ece613b6d575b538234375b0e9c23bbc
> 
> I'll very likely send this to Linus, for 4.0-rc6 inclusion, at the end of
> the week.

Mike, would this potentially cause the issue seen in bug 1195899?  IIRC, the live iso does all kinds of stuff with loopdevs and the kernel in that bug contains Christoph's commit.

I had suspected a race was causing that bug but Mikulas thought it was something else.

Comment 11 Alasdair Kergon 2015-03-25 01:29:45 UTC
It's possible that it is related - try out the patch and see if it helps.

Comment 12 Marian Csontos 2015-03-25 07:47:39 UTC
Rebased Mike's patch on top of rc5 and LVM test suite is working again. Thanks.

Comment 14 Mike Snitzer 2015-03-27 00:57:05 UTC
Sent pull request to Linus for 4.0-rc6, see:
https://www.redhat.com/archives/dm-devel/2015-March/msg00219.html

Comment 15 Azat 2015-04-14 13:53:26 UTC
Hi Mike,

I still hit this issue with your patch applied. While setting up partitions with mdadm, mdadm hung. After attaching to it with strace I got the following:

# pgrep mdadm | xargs strace -fp
Process 27389 attached - interrupt to quit
unlink("/dev/.tmp.md.27389:9:127")      = 0
mknod("/tmp/.tmp.md.27389:9:127", S_IFBLK|0600, makedev(9, 127)) = 0
open("/tmp/.tmp.md.27389:9:127", O_RDWR|O_EXCL|O_DIRECT) <-- *hung*

I then looked into dmesg and found this:

[ 9627.630018] ------------[ cut here ]------------
[ 9627.630029] WARNING: CPU: 18 PID: 3330 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x5a/0x70()
[ 9627.630032] sysfs: cannot create duplicate filename '/devices/virtual/bdi/9:127'
[ 9627.630033] Modules linked in: xt_tcpudp iptable_filter ip_tables x_tables nfsd nfs lockd grace sunrpc ipmi_devintf netconsole configfs loop hid_generic usbhid hid x86_pkg_temp_thermal coretemp ghash_clmulni_intel aesni_intel ioatdma ehci_pci aes_x86_64 iTCO_wdt iTCO_ve
[ 9627.630074] CPU: 18 PID: 3330 Comm: mdadm Not tainted 4.0.0bl-azat-v6+ #1
[ 9627.630076] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
[ 9627.630077]  0000000000000000 ffffffff814e3fcc ffffffff813e590f ffff885f9bcd3808
[ 9627.630079]  ffffffff8104575c ffff885f96acb000 ffff885fa4b3e3c0 ffff885fa2fec780
[ 9627.630081]  ffff885fa4bc4000 0000000000000000 ffffffff810457d5 ffffffff814e5d78
[ 9627.630083] Call Trace:
[ 9627.630091]  [<ffffffff813e590f>] ? dump_stack+0x40/0x50
[ 9627.630096]  [<ffffffff8104575c>] ? warn_slowpath_common+0x7c/0xb0
[ 9627.630098]  [<ffffffff810457d5>] ? warn_slowpath_fmt+0x45/0x50
[ 9627.630100]  [<ffffffff81185092>] ? kernfs_path+0x42/0x50
[ 9627.630102]  [<ffffffff811883da>] ? sysfs_warn_dup+0x5a/0x70
[ 9627.630104]  [<ffffffff8118846e>] ? sysfs_create_dir_ns+0x7e/0x90
[ 9627.630108]  [<ffffffff811d94ab>] ? kobject_add_internal+0x9b/0x2f0
[ 9627.630109]  [<ffffffff811d9af6>] ? kobject_add+0x66/0xb0
[ 9627.630114]  [<ffffffff812bb2e3>] ? device_add+0x263/0x620
[ 9627.630116]  [<ffffffff812bb8a8>] ? device_create_groups_vargs+0xe8/0x100
[ 9627.630118]  [<ffffffff812bb8d3>] ? device_create_vargs+0x13/0x20
[ 9627.630124]  [<ffffffff810ed128>] ? bdi_register+0x68/0x150
[ 9627.630129]  [<ffffffff811c535d>] ? add_disk+0x14d/0x4a0
[ 9627.630132]  [<ffffffff811c585f>] ? alloc_disk_node+0xaf/0x100
[ 9627.630137]  [<ffffffffa0252269>] ? md_alloc+0x1e9/0x350 [md_mod]
[ 9627.630141]  [<ffffffffa02523db>] ? md_probe+0xb/0x20 [md_mod]
[ 9627.630143]  [<ffffffff812c0654>] ? kobj_lookup+0x104/0x170
[ 9627.630147]  [<ffffffffa02523d0>] ? md_alloc+0x350/0x350 [md_mod]
[ 9627.630149]  [<ffffffff811c4da8>] ? get_gendisk+0x28/0xf0
[ 9627.630153]  [<ffffffff8115fb74>] ? __blkdev_get+0x114/0x3c0
[ 9627.630156]  [<ffffffff8115e590>] ? bdev_direct_access+0xa0/0xa0
[ 9627.630158]  [<ffffffff8115e5a0>] ? bdev_test+0x10/0x10
[ 9627.630160]  [<ffffffff8115fe58>] ? blkdev_get+0x38/0x310
[ 9627.630162]  [<ffffffff81160170>] ? blkdev_get_by_dev+0x40/0x40
[ 9627.630167]  [<ffffffff8112b3d3>] ? do_dentry_open.isra.16+0x153/0x320
[ 9627.630170]  [<ffffffff811380f3>] ? do_last.isra.51+0x323/0xd50
[ 9627.630172]  [<ffffffff8111f5b3>] ? kmem_cache_alloc+0x123/0x130
[ 9627.630174]  [<ffffffff8113a97f>] ? path_openat+0x7f/0x610
[ 9627.630177]  [<ffffffff810f7480>] ? tlb_flush_mmu_free+0x30/0x50
[ 9627.630180]  [<ffffffff810fe800>] ? unmap_region+0xb0/0xf0
[ 9627.630182]  [<ffffffff8113bb3b>] ? do_filp_open+0x2b/0x90
[ 9627.630187]  [<ffffffff811472ec>] ? __alloc_fd+0x7c/0x120
[ 9627.630189]  [<ffffffff8112c531>] ? do_sys_open+0x121/0x210
[ 9627.630193]  [<ffffffff813ea097>] ? system_call_fastpath+0x12/0x6a
[ 9627.630195] ---[ end trace b7a3e9c6f05c2666 ]---
[ 9627.630196] ------------[ cut here ]------------
[ 9627.630198] WARNING: CPU: 18 PID: 3330 at lib/kobject.c:240 kobject_add_internal+0x274/0x2f0()
[ 9627.630200] kobject_add_internal failed for 9:127 with -EEXIST, don't try to register things with the same name in the same directory.
[ 9627.630201] Modules linked in: xt_tcpudp iptable_filter ip_tables x_tables nfsd nfs lockd grace sunrpc ipmi_devintf netconsole configfs loop hid_generic usbhid hid x86_pkg_temp_thermal coretemp ghash_clmulni_intel aesni_intel ioatdma ehci_pci aes_x86_64 iTCO_wdt iTCO_ve
[ 9627.630223] CPU: 18 PID: 3330 Comm: mdadm Tainted: G        W       4.0.0bl-azat-v6+ #1
[ 9627.630224] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
[ 9627.630225]  0000000000000000 ffffffff814f2c88 ffffffff813e590f ffff885f9bcd3858
[ 9627.630227]  ffffffff8104575c ffff885fa4bc4010 00000000ffffffef ffff885fa473f420
[ 9627.630229]  ffff885fa4bc4000 0000000000000000 ffffffff810457d5 ffffffff814f2e48
[ 9627.630230] Call Trace:
[ 9627.630233]  [<ffffffff813e590f>] ? dump_stack+0x40/0x50
[ 9627.630235]  [<ffffffff8104575c>] ? warn_slowpath_common+0x7c/0xb0
[ 9627.630236]  [<ffffffff810457d5>] ? warn_slowpath_fmt+0x45/0x50
[ 9627.630238]  [<ffffffff8118846e>] ? sysfs_create_dir_ns+0x7e/0x90
[ 9627.630240]  [<ffffffff811d9684>] ? kobject_add_internal+0x274/0x2f0
[ 9627.630242]  [<ffffffff811d9af6>] ? kobject_add+0x66/0xb0
[ 9627.630244]  [<ffffffff812bb2e3>] ? device_add+0x263/0x620
[ 9627.630245]  [<ffffffff812bb8a8>] ? device_create_groups_vargs+0xe8/0x100
[ 9627.630247]  [<ffffffff812bb8d3>] ? device_create_vargs+0x13/0x20
[ 9627.630250]  [<ffffffff810ed128>] ? bdi_register+0x68/0x150
[ 9627.630252]  [<ffffffff811c535d>] ? add_disk+0x14d/0x4a0
[ 9627.630255]  [<ffffffff811c585f>] ? alloc_disk_node+0xaf/0x100
[ 9627.630258]  [<ffffffffa0252269>] ? md_alloc+0x1e9/0x350 [md_mod]
[ 9627.630261]  [<ffffffffa02523db>] ? md_probe+0xb/0x20 [md_mod]
[ 9627.630262]  [<ffffffff812c0654>] ? kobj_lookup+0x104/0x170
[ 9627.630266]  [<ffffffffa02523d0>] ? md_alloc+0x350/0x350 [md_mod]
[ 9627.630268]  [<ffffffff811c4da8>] ? get_gendisk+0x28/0xf0
[ 9627.630270]  [<ffffffff8115fb74>] ? __blkdev_get+0x114/0x3c0
[ 9627.630272]  [<ffffffff8115e590>] ? bdev_direct_access+0xa0/0xa0
[ 9627.630274]  [<ffffffff8115e5a0>] ? bdev_test+0x10/0x10
[ 9627.630276]  [<ffffffff8115fe58>] ? blkdev_get+0x38/0x310
[ 9627.630278]  [<ffffffff81160170>] ? blkdev_get_by_dev+0x40/0x40
[ 9627.630280]  [<ffffffff8112b3d3>] ? do_dentry_open.isra.16+0x153/0x320
[ 9627.630282]  [<ffffffff811380f3>] ? do_last.isra.51+0x323/0xd50
[ 9627.630283]  [<ffffffff8111f5b3>] ? kmem_cache_alloc+0x123/0x130
[ 9627.630285]  [<ffffffff8113a97f>] ? path_openat+0x7f/0x610
[ 9627.630287]  [<ffffffff810f7480>] ? tlb_flush_mmu_free+0x30/0x50
[ 9627.630289]  [<ffffffff810fe800>] ? unmap_region+0xb0/0xf0
[ 9627.630291]  [<ffffffff8113bb3b>] ? do_filp_open+0x2b/0x90
[ 9627.630293]  [<ffffffff811472ec>] ? __alloc_fd+0x7c/0x120
[ 9627.630295]  [<ffffffff8112c531>] ? do_sys_open+0x121/0x210
[ 9627.630297]  [<ffffffff813ea097>] ? system_call_fastpath+0x12/0x6a
[ 9627.630298] ---[ end trace b7a3e9c6f05c2667 ]---
[ 9627.630395] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[ 9627.630430] IP: [<ffffffff8118869a>] sysfs_do_create_link_sd.isra.2+0x2a/0xb0
[ 9627.630524] PGD 5fa2d03067 PUD 5f9d679067 PMD 0 
[ 9627.630550] Oops: 0000 [#1] SMP 
[ 9627.630624] Modules linked in: xt_tcpudp iptable_filter ip_tables x_tables nfsd nfs lockd grace sunrpc ipmi_devintf netconsole configfs loop hid_generic usbhid hid x86_pkg_temp_thermal coretemp ghash_clmulni_intel aesni_intel ioatdma ehci_pci aes_x86_64 iTCO_wdt iTCO_ve
[ 9627.631073] CPU: 18 PID: 3330 Comm: mdadm Tainted: G        W       4.0.0bl-azat-v6+ #1
[ 9627.631090] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
[ 9627.631109] task: ffff885fa4659f00 ti: ffff885f9bcd0000 task.ti: ffff885f9bcd0000
[ 9627.631124] RIP: 0010:[<ffffffff8118869a>]  [<ffffffff8118869a>] sysfs_do_create_link_sd.isra.2+0x2a/0xb0
[ 9627.631162] RSP: 0018:ffff885f9bcd3a78  EFLAGS: 00010246
[ 9627.631189] RAX: 000000000000e6e6 RBX: 0000000000000040 RCX: 00000000000000e6
[ 9627.631219] RDX: ffffffff814e19d0 RSI: 0000000000000040 RDI: ffffffff81740d88
[ 9627.631249] RBP: ffffffff814e19d0 R08: 0000000000017d60 R09: ffff88607fcd7d60
[ 9627.631278] R10: ffff882fbf802400 R11: ffffea017e830e00 R12: 0000000000000001
[ 9627.631308] R13: ffff885fa5a61ca8 R14: ffff885fa4bc3c70 R15: ffff885fa4bc3c00
[ 9627.631338] FS:  00007f439e991700(0000) GS:ffff88607fcc0000(0000) knlGS:0000000000000000
[ 9627.631383] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9627.631411] CR2: 0000000000000040 CR3: 0000005f9fe22000 CR4: 00000000001407e0
[ 9627.631440] Stack:
[ 9627.631460]  ffff885fa4bc3c00 ffff885f9bf41428 ffff885fa4bc3c0c ffff885fa4bc3c80
[ 9627.631515]  ffff885fa4bc3c70 ffffffff811c53f3 ffff885fa4bc3c00 ffffffff00000018
[ 9627.631569]  0090007f9bcd3b08 ffff885fa4bc3c00 0000000000000000 0000000000000001
[ 9627.631680] Call Trace:
[ 9627.631703]  [<ffffffff811c53f3>] ? add_disk+0x1e3/0x4a0
[ 9627.631733]  [<ffffffffa0252269>] ? md_alloc+0x1e9/0x350 [md_mod]
[ 9627.631763]  [<ffffffffa02523db>] ? md_probe+0xb/0x20 [md_mod]
[ 9627.631791]  [<ffffffff812c0654>] ? kobj_lookup+0x104/0x170
[ 9627.631820]  [<ffffffffa02523d0>] ? md_alloc+0x350/0x350 [md_mod]
[ 9627.631849]  [<ffffffff811c4da8>] ? get_gendisk+0x28/0xf0
[ 9627.631877]  [<ffffffff8115fb74>] ? __blkdev_get+0x114/0x3c0
[ 9627.631905]  [<ffffffff8115e590>] ? bdev_direct_access+0xa0/0xa0
[ 9627.631933]  [<ffffffff8115e5a0>] ? bdev_test+0x10/0x10
[ 9627.631961]  [<ffffffff8115fe58>] ? blkdev_get+0x38/0x310
[ 9627.631988]  [<ffffffff81160170>] ? blkdev_get_by_dev+0x40/0x40
[ 9627.632017]  [<ffffffff8112b3d3>] ? do_dentry_open.isra.16+0x153/0x320
[ 9627.632046]  [<ffffffff811380f3>] ? do_last.isra.51+0x323/0xd50
[ 9627.632075]  [<ffffffff8111f5b3>] ? kmem_cache_alloc+0x123/0x130
[ 9627.632103]  [<ffffffff8113a97f>] ? path_openat+0x7f/0x610
[ 9627.632131]  [<ffffffff810f7480>] ? tlb_flush_mmu_free+0x30/0x50
[ 9627.632159]  [<ffffffff810fe800>] ? unmap_region+0xb0/0xf0
[ 9627.632186]  [<ffffffff8113bb3b>] ? do_filp_open+0x2b/0x90
[ 9627.632215]  [<ffffffff811472ec>] ? __alloc_fd+0x7c/0x120
[ 9627.632242]  [<ffffffff8112c531>] ? do_sys_open+0x121/0x210
[ 9627.632270]  [<ffffffff813ea097>] ? system_call_fastpath+0x12/0x6a
[ 9627.632298] Code: 00 48 85 d2 74 73 48 85 ff 74 6e 41 56 41 55 49 89 fd 41 54 55 48 c7 c7 88 0d 74 81 53 48 89 f3 41 89 cc 48 89 d5 e8 76 15 26 00 <48> 8b 1b 48 85 db 74 08 48 89 df e8 f6 c9 ff ff 80 05 d7 86 5b 
[ 9627.632557] RIP  [<ffffffff8118869a>] sysfs_do_create_link_sd.isra.2+0x2a/0xb0
[ 9627.632604]  RSP <ffff885f9bcd3a78>
[ 9627.632627] CR2: 0000000000000040
[ 9627.633014] ---[ end trace b7a3e9c6f05c2668 ]---


$ git describe
v4.0-2620-gb79013b

Comment 16 Mike Snitzer 2015-04-14 16:30:49 UTC
(In reply to Azat from comment #15)
> Hi Mike,
> 
> I still hit this issue with your patch applied. While setting up partitions
> with mdadm, mdadm hung. After attaching to it with strace I got the following:

My DM fix has _nothing_ to do with MD.  You need to report this to Neil Brown and the linux-raid mailing list.
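
The logs in this bug show the distinction directly: the duplicate bdi name in comment 15 is 9:127, created by mdadm through md_alloc, and block major 9 is the md major, whereas the original report's 253:9 was created by dmsetup through dm_create. A minimal sketch for pulling those names out of a log follows; `dup_bdi_names` is a hypothetical helper name, and it assumes the warning text has exactly the form seen in the traces here.

```shell
#!/bin/sh
# Print the major:minor of every "cannot create duplicate filename"
# warning for a bdi found on standard input.
# Hypothetical helper, for illustration only.
dup_bdi_names() {
    sed -n "s|.*cannot create duplicate filename '/devices/virtual/bdi/\([0-9]*:[0-9]*\)'.*|\1|p"
}

# The two warnings quoted in this bug: prints 253:9, then 9:127.
dup_bdi_names <<'EOF'
[  376.550592] sysfs: cannot create duplicate filename '/devices/virtual/bdi/253:9'
[ 9627.630032] sysfs: cannot create duplicate filename '/devices/virtual/bdi/9:127'
EOF
```

On a live system the input could come from `dmesg`, and the major number can then be compared against the "Block devices:" section of /proc/devices to see which driver owns it.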

Comment 17 Mike McCune 2016-03-28 23:10:48 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation. Please see mmccune with any questions.

Comment 18 Laura Abbott 2018-04-06 18:05:40 UTC
This looks like there was a patch given for the original issue. Closing, feel free to report new issues.

