Bug 1567215

Summary: sysfs: cannot create duplicate filename '/kvdo/vdo'
Product: Red Hat Enterprise Linux 7 Reporter: Jakub Krysl <jkrysl>
Component: kmod-kvdo Assignee: bjohnsto
Status: CLOSED DUPLICATE QA Contact: Jakub Krysl <jkrysl>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.5 CC: awalsh, bgurney, jkrysl, tjaskiew
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-13 18:24:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jakub Krysl 2018-04-13 15:03:39 UTC
Description of problem:
While reproducing BZ 1559692 for the first time, I hit this new bug. After that I managed to reproduce only BZ 1559692, twice, but this bug can be reproduced by killing the process that hangs. When that process is killed, the stalled start/stop cycle resumes, and each cycle produces the same 2 call traces:

[  370.398094] kvdo24:dmsetup: Using mode sync automatically. 
[  370.422707] ------------[ cut here ]------------ 
[  370.443471] WARNING: CPU: 0 PID: 7419 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x64/0x80 
[  370.477975] sysfs: cannot create duplicate filename '/kvdo/vdo' 
[  370.504583] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache dm_service_time iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi kvdo(O) uds(O) sunrpc sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support hpwdt hpilo pcspkr ipmi_ssif sg i2c_i801 lpc_ich ipmi_si ipmi_devintf wmi ipmi_msghandler acpi_power_meter ioatdma shpchp pcc_cpufreq dm_multipath ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper ahci libahci syscopyarea sysfillrect sysimgblt fb_sys_fops ttm libata crct10dif_pclmul crct10dif_common crc32c_intel ixgbe serio_raw drm tg3 i2c_core mdio dca ptp pps_core dm_mirror dm_region_hash dm_log dm_mod 
[  370.840569] CPU: 0 PID: 7419 Comm: dmsetup Tainted: G           O   ------------   3.10.0-862.el7.x86_64 #1 
[  370.888532] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 10/25/2017 
[  370.925728] Call Trace: 
[  370.936660]  [<ffffffff9a50d768>] dump_stack+0x19/0x1b 
[  370.959655]  [<ffffffff99e916d8>] __warn+0xd8/0x100 
[  370.981676]  [<ffffffff99e9175f>] warn_slowpath_fmt+0x5f/0x80 
[  371.007174]  [<ffffffff9a0a1928>] ? kernfs_path+0x48/0x60 
[  371.031455]  [<ffffffff9a0a4564>] sysfs_warn_dup+0x64/0x80 
[  371.056072]  [<ffffffff9a0a465e>] sysfs_create_dir_ns+0x8e/0xa0 
[  371.082628]  [<ffffffff9a14d65a>] kobject_add_internal+0xaa/0x330 
[  371.109860]  [<ffffffff9a14db15>] kobject_add+0x75/0xd0 
[  371.133494]  [<ffffffffc06de7b1>] ? allocateVDO+0xb1/0xf0 [kvdo] 
[  371.160431]  [<ffffffffc06f981b>] makeKernelLayer+0xeb/0xae0 [kvdo] 
[  371.188583]  [<ffffffffc06eac6f>] vdoInitialize+0x21f/0x3b0 [kvdo] 
[  371.216309]  [<ffffffffc06eb067>] vdoCtr+0x267/0x350 [kvdo] 
[  371.241464]  [<ffffffffc00d7afd>] dm_table_add_target+0x17d/0x440 [dm_mod] 
[  371.272370]  [<ffffffffc00db6a7>] table_load+0x157/0x390 [dm_mod] 
[  371.299746]  [<ffffffffc00dcb02>] ctl_ioctl+0x212/0x4e0 [dm_mod] 
[  371.326724]  [<ffffffffc00db550>] ? retrieve_status+0x1c0/0x1c0 [dm_mod] 
[  371.359275]  [<ffffffffc00dcdde>] dm_ctl_ioctl+0xe/0x20 [dm_mod] 
[  371.389348]  [<ffffffff9a02fb90>] do_vfs_ioctl+0x350/0x560 
[  371.413951]  [<ffffffff9a0d82bf>] ? file_has_perm+0x9f/0xb0 
[  371.438841]  [<ffffffff9a02fe41>] SyS_ioctl+0xa1/0xc0 
[  371.461419]  [<ffffffff9a51f7d5>] system_call_fastpath+0x1c/0x21 
[  371.488355] ---[ end trace 680573f4575bddfd ]--- 
[  371.509179] ------------[ cut here ]------------ 
[  371.529876] WARNING: CPU: 0 PID: 7419 at lib/kobject.c:239 kobject_add_internal+0x274/0x330 
[  371.567536] kobject_add_internal failed for vdo with -EEXIST, don't try to register things with the same name in the same directory. 
[  371.622129] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache dm_service_time iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi kvdo(O) uds(O) sunrpc sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support hpwdt hpilo pcspkr ipmi_ssif sg i2c_i801 lpc_ich ipmi_si ipmi_devintf wmi ipmi_msghandler acpi_power_meter ioatdma shpchp pcc_cpufreq dm_multipath ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper ahci libahci syscopyarea sysfillrect sysimgblt fb_sys_fops ttm libata crct10dif_pclmul crct10dif_common crc32c_intel ixgbe serio_raw drm tg3 i2c_core mdio dca ptp pps_core dm_mirror dm_region_hash dm_log dm_mod 
[  371.959338] CPU: 0 PID: 7419 Comm: dmsetup Tainted: G        W  O   ------------   3.10.0-862.el7.x86_64 #1 
[  372.002895] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 10/25/2017 
[  372.039970] Call Trace: 
[  372.051160]  [<ffffffff9a50d768>] dump_stack+0x19/0x1b 
[  372.074167]  [<ffffffff99e916d8>] __warn+0xd8/0x100 
[  372.095973]  [<ffffffff99e9175f>] warn_slowpath_fmt+0x5f/0x80 
[  372.121808]  [<ffffffff9a0a465e>] ? sysfs_create_dir_ns+0x8e/0xa0 
[  372.148853]  [<ffffffff9a0a456c>] ? sysfs_warn_dup+0x6c/0x80 
[  372.174199]  [<ffffffff9a14d824>] kobject_add_internal+0x274/0x330 
[  372.201900]  [<ffffffff9a14db15>] kobject_add+0x75/0xd0 
[  372.225013]  [<ffffffffc06de7b1>] ? allocateVDO+0xb1/0xf0 [kvdo] 
[  372.252075]  [<ffffffffc06f981b>] makeKernelLayer+0xeb/0xae0 [kvdo] 
[  372.280261]  [<ffffffffc06eac6f>] vdoInitialize+0x21f/0x3b0 [kvdo] 
[  372.307940]  [<ffffffffc06eb067>] vdoCtr+0x267/0x350 [kvdo] 
[  372.332983]  [<ffffffffc00d7afd>] dm_table_add_target+0x17d/0x440 [dm_mod] 
[  372.364670]  [<ffffffffc00db6a7>] table_load+0x157/0x390 [dm_mod] 
[  372.396106]  [<ffffffffc00dcb02>] ctl_ioctl+0x212/0x4e0 [dm_mod] 
[  372.423388]  [<ffffffffc00db550>] ? retrieve_status+0x1c0/0x1c0 [dm_mod] 
[  372.453484]  [<ffffffffc00dcdde>] dm_ctl_ioctl+0xe/0x20 [dm_mod] 
[  372.480498]  [<ffffffff9a02fb90>] do_vfs_ioctl+0x350/0x560 
[  372.505255]  [<ffffffff9a0d82bf>] ? file_has_perm+0x9f/0xb0 
[  372.530264]  [<ffffffff9a02fe41>] SyS_ioctl+0xa1/0xc0 
[  372.552977]  [<ffffffff9a51f7d5>] system_call_fastpath+0x1c/0x21 
[  372.580016] ---[ end trace 680573f4575bddfe ]--- 
[  372.600698] kvdo24:dmsetup: Could not create kernel physical layer. (VDO error -17, message Cannot add sysfs node) 
[  372.647248] device-mapper: table: 253:4: vdo: Cannot add sysfs node 
[  372.675862] device-mapper: ioctl: error adding target to table 
[  373.041717] kvdo25:dmsetup: starting device 'vdo' device instantiation 25 write policy auto 
[  373.085692] kvdo25:dmsetup: underlying device, REQ_FLUSH: not supported, REQ_FUA: not supported 
[  373.125884] kvdo25:dmsetup: Using mode sync automatically. 
[  373.150515] ------------[ cut here ]------------ 

Note: If I restart the server at this point using 'sudo shutdown -r now', I get the hang from BZ 1559692 again.

Version-Release number of selected component (if applicable):
kmod-kvdo-6.1.0.153-15.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1) vdo create --name vdo --device /dev/sdb --activate disabled
2) vdo activate --name vdo
3) while true; do vdo stop --name vdo --verbose || true; vdo start --name vdo --verbose || true; done;
4) (in a separate terminal, after a few cycles of step 3) while true; do cat /sys/kvdo/vdo/statistics/data_blocks_used || true; done;
5) (terminals get stuck)
6) (in new terminal) killall vdo
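
For scripted runs, step 3 can be bounded instead of looping forever. The sketch below is only an illustration: the function name cycle_vdo and its arguments are made up here, and it uses nothing beyond the vdo stop/start invocations from step 3.

```shell
# Hypothetical helper (not part of the vdo tooling): run COUNT stop/start
# cycles against the volume NAME, ignoring failures the same way step 3 does.
cycle_vdo() {
    name=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        vdo stop --name "$name" --verbose || true
        vdo start --name "$name" --verbose || true
        i=$((i + 1))
    done
}
```

For example, 'cycle_vdo vdo 20' runs twenty cycles. Steps 4 to 6 still have to be done by hand from other terminals, and the helper blocks like the original loop if a vdo command gets stuck.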

Actual results:
The start/stop cycle resumes and keeps going forever, producing the call traces shown above.
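
To check that the warning keeps recurring, the kernel log can be filtered for the message from the first trace. A minimal sketch, assuming the log text arrives on standard input; the function name count_dup_warnings is invented for this example.

```shell
# Hypothetical helper: count kernel log lines that carry the duplicate
# sysfs name warning for /kvdo/vdo. Reads log text from standard input.
count_dup_warnings() {
    # -F matches the text literally, -c prints the number of matching lines.
    # grep exits non-zero when nothing matches, so keep the status clean.
    grep -cF -- "sysfs: cannot create duplicate filename '/kvdo/vdo'" || true
}
```

Usage: 'dmesg | count_dup_warnings', run before and after a few cycles to see whether the count grows.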

Expected results:
Probably BZ 1559692 gets reproduced again.

Additional info:

Comment 2 Thomas Jaskiewicz 2018-04-13 18:23:46 UTC
The fix to BZ 1559692 will also fix this.  We release /sys/kvdo/vdo/statistics and wait for it to go away early in the stopping procedure, and create it last in the starting procedure.

Therefore /sys/kvdo/vdo/statistics/data_blocks_used will not exist during the time window exploited by this report.
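
The ordering described above can be pictured with a toy model. This is not kvdo code: plain directories under a temporary path stand in for the sysfs nodes, and the names toy_root, toy_start and toy_stop are made up for the illustration.

```shell
# Toy model only: directories under a temp root stand in for sysfs nodes.
toy_root=$(mktemp -d)
mkdir "$toy_root/kvdo"

toy_start() {
    # Register the device node first. mkdir without -p refuses an existing
    # name, which mirrors the "File exists" (-EEXIST) failure in the trace.
    mkdir "$toy_root/kvdo/$1" || return 1
    # ... the rest of device setup would happen here ...
    # Publish the statistics node last.
    mkdir "$toy_root/kvdo/$1/statistics"
}

toy_stop() {
    # Withdraw the statistics node first and wait until it is gone.
    rmdir "$toy_root/kvdo/$1/statistics"
    while [ -e "$toy_root/kvdo/$1/statistics" ]; do sleep 1; done
    # ... the rest of device teardown would happen here ...
    rmdir "$toy_root/kvdo/$1"
}
```

In this model a second 'toy_start vdo' without a 'toy_stop vdo' in between fails on the existing name, while a start after a completed stop succeeds. The wait loop never spins here because rmdir is synchronous; it only marks where the real stop path waits for the node to go away.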

Comment 3 Thomas Jaskiewicz 2018-04-13 18:24:12 UTC

*** This bug has been marked as a duplicate of bug 1559692 ***