1559692 – Hang or NULL pointer dereference if reading sysfs during VDO start/stop.


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	kmod-kvdo
Sub Component:
Version:	7.5
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	Thomas Jaskiewicz
QA Contact:	Jakub Krysl
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1567215
Depends On:
Blocks:	1567744

Reported:	2018-03-23 03:21 UTC by Sweet Tea Dorminy
Modified:	2021-09-03 12:03 UTC
CC List:	5 users
Fixed In Version:	6.1.1.60
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1567744
Environment:
Last Closed:	2018-10-30 09:39:22 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2018:3094	0	None	None	None	2018-10-30 09:39:50 UTC

Description Sweet Tea Dorminy 2018-03-23 03:21:31 UTC

Description of problem:
If a sysfs entry is read while a VDO is starting up or shutting down, a hang or a NULL pointer dereference may occur. The parts of the VDO that the sysfs handler needs may already have been freed, or may not exist yet.

Version-Release number of selected component (if applicable):
6.1.0.155

How reproducible:
1 in 10

Steps to Reproduce:
1. In a shell, run 'while true; do cat /sys/kvdo/vdo0/statistics/data_blocks_used || true; done;'
2. Make a VDO but don't start it.
3. Start and stop the VDO in a loop.
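
The steps above could be combined into one script, roughly as sketched below. This is only a sketch under assumptions: it expects a VDO named vdo0 to exist already and to be activated but not started, and it uses the 'vdo start' / 'vdo stop' commands for step 3. The function names, the CYCLES bound and the RUN dry-run switch are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# Sketch of the reproducer. Assumes the VDO volume already exists and is
# activated but not started. All variable and function names are illustrative.
VDO_NAME="${VDO_NAME:-vdo0}"
STAT="/sys/kvdo/${VDO_NAME}/statistics/data_blocks_used"
CYCLES="${CYCLES:-50}"   # assumed bound; the steps above loop indefinitely
RUN="${RUN:-}"           # set RUN=echo to print the vdo commands instead of running them

# Step 1: poll the sysfs statistic forever. Read errors are expected
# whenever the volume is down, so they are ignored.
read_stats_loop() {
    while true; do
        cat "$STAT" 2>/dev/null || true
    done
}

# Step 3: start and stop the volume CYCLES times.
cycle_vdo() {
    i=0
    while [ "$i" -lt "$CYCLES" ]; do
        $RUN vdo start --name "$VDO_NAME"
        $RUN vdo stop --name "$VDO_NAME"
        i=$((i + 1))
    done
}

# Run the reader in the background while cycling, then stop the reader.
# On an affected build, cycle_vdo may never return because the stop hangs.
reproduce() {
    read_stats_loop &
    reader_pid=$!
    cycle_vdo
    kill "$reader_pid" 2>/dev/null
    wait "$reader_pid" 2>/dev/null || true
}
```

Calling reproduce as root runs both loops at once. Setting RUN=echo before calling cycle_vdo prints the start/stop commands without touching the volume.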

Actual results:
Eventually, the kernel reports 120-second hung task warnings for both the dmsetup command and the cat. Alternatively, a NULL pointer dereference may occur.

Expected results:
No hung tasks or crashes.

Additional info:

Comment 2 Jakub Krysl 2018-04-13 14:24:40 UTC

Reproduced, acking...

1) # vdo create --name vdo0 --device /dev/sdb --activate disabled
2) # vdo activate --name vdo0
3) # while true; do vdo start --name vdo0 --verbose; vdo stop --name vdo0 --verbose; done;
4) (in a separate terminal, after a few cycles of step 3) # while true; do cat /sys/kvdo/vdo0/statistics/data_blocks_used || true; done;
5) both terminals get stuck
6) sudo shutdown -r now
7) this appears on the console:

[  OK  ] Stopped Availability of block devices.
[     *] (2 of 2) A stop job is running for ... user root (1min 25s / 1min 30s)
[  963.792399] INFO: task dmsetup:7961 blocked for more than 120 seconds.
[  963.827624] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  963.865243] Call Trace:
[  963.876890]  [<ffffffff89b12f49>] schedule+0x29/0x70
[  963.900072]  [<ffffffff896a124d>] __kernfs_remove+0x17d/0x260
[  963.926933]  [<ffffffff894bbe20>] ? wake_up_atomic_t+0x30/0x30
[  963.954539]  [<ffffffff896a21b1>] kernfs_remove+0x21/0x30
[  963.980037]  [<ffffffff896a46c0>] sysfs_remove_dir+0x50/0x80
[  964.006754]  [<ffffffff8974ccb8>] kobject_del+0x18/0x50
[  964.031334]  [<ffffffff8974cd4e>] kobject_release+0x5e/0x1b0
[  964.057751]  [<ffffffff8974cc08>] kobject_put+0x28/0x60
[  964.082184]  [<ffffffffc087d663>] freeKernelLayer+0x223/0x2f0 [kvdo]
[  964.112320]  [<ffffffffc086e8ad>] vdoDtr+0xfd/0x1b0 [kvdo]
[  964.138357]  [<ffffffff89548250>] ? dyntick_save_progress_counter+0x30/0x30
[  964.171241]  [<ffffffffc0140763>] dm_table_destroy+0x73/0x120 [dm_mod]
[  964.171252]  [<ffffffffc013c726>] __dm_destroy+0x136/0x230 [dm_mod]
[  964.171269]  [<ffffffffc013ec23>] dm_destroy+0x13/0x20 [dm_mod]
[  964.171281]  [<ffffffffc0144c5e>] dev_remove+0x11e/0x1a0 [dm_mod]
[  964.171292]  [<ffffffffc0145b02>] ctl_ioctl+0x212/0x4e0 [dm_mod]
[  964.171308]  [<ffffffffc0144b40>] ? dev_suspend+0x260/0x260 [dm_mod]
[  964.171319]  [<ffffffffc0145dde>] dm_ctl_ioctl+0xe/0x20 [dm_mod]
[  964.171325]  [<ffffffff8962fb90>] do_vfs_ioctl+0x350/0x560
[  964.171329]  [<ffffffff896d82bf>] ? file_has_perm+0x9f/0xb0
[  964.171333]  [<ffffffff8962fe41>] SyS_ioctl+0xa1/0xc0
[  964.171340]  [<ffffffff89b1f7d5>] system_call_fastpath+0x1c/0x21
[  964.171343] INFO: task cat:7963 blocked for more than 120 seconds.
[  964.171343] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  964.171510] Call Trace:
[  964.171515]  [<ffffffff89751384>] ? __radix_tree_lookup+0x84/0xf0
[  964.171521]  [<ffffffff89b12f49>] schedule+0x29/0x70
[  964.171524]  [<ffffffff89b108b9>] schedule_timeout+0x239/0x2c0
[  964.171530]  [<ffffffff895962de>] ? filemap_fault+0x17e/0x490
[  964.171535]  [<ffffffff894f8c5f>] ? __getnstimeofday64+0x3f/0xd0
[  964.171538]  [<ffffffff894f8cfe>] ? getnstimeofday64+0xe/0x30
[  964.171542]  [<ffffffff89b132fd>] wait_for_completion+0xfd/0x140
[  964.171548]  [<ffffffff894cee80>] ? wake_up_state+0x20/0x20
[  964.171570]  [<ffffffffc0873040>] ? finishVDOAction+0x20/0x20 [kvdo]
[  964.171587]  [<ffffffffc0872fb2>] performKVDOOperation+0xb2/0xe0 [kvdo]
[  964.171603]  [<ffffffffc0873040>] ? finishVDOAction+0x20/0x20 [kvdo]
[  964.171616]  [<ffffffffc0873040>] ? finishVDOAction+0x20/0x20 [kvdo]
[  964.171632]  [<ffffffffc08737a7>] getKVDOStatistics+0x57/0x80 [kvdo]
[  964.171648]  [<ffffffffc0877c06>] poolStatsDataBlocksUsedShow+0x36/0x70 [kvdo]
[  964.171663]  [<ffffffffc0873c41>] poolStatsAttrShow+0x21/0x30 [kvdo]
[  964.171667]  [<ffffffff896a3e8f>] sysfs_kf_seq_show+0xcf/0x1f0
[  964.171671]  [<ffffffff896a25d6>] kernfs_seq_show+0x26/0x30
[  964.171675]  [<ffffffff89641410>] seq_read+0x110/0x3f0
[  964.171679]  [<ffffffff896a2e35>] kernfs_fop_read+0xf5/0x160
[  964.171683]  [<ffffffff8961ab3f>] vfs_read+0x9f/0x170
[  964.171686]  [<ffffffff8961ba0f>] SyS_read+0x7f/0xf0
[  964.171692]  [<ffffffff89b1f7d5>] system_call_fastpath+0x1c/0x21
[   ***] (1 of 2) A stop job is running for ...1 of user root (2min 28s / 3min)
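
To check a longer log for the same symptom, the blocked tasks can be pulled out of the kernel messages with a small filter. A minimal sketch, assuming the log uses the "INFO: task NAME:PID blocked for more than N seconds." wording shown above; hung_tasks is a hypothetical helper name, not part of any tool.

```shell
# Print "name:pid" for every task reported as hung in the log text on stdin.
# The match is unanchored, so it also works when the message is appended
# to a systemd status line, as in the first warning above.
hung_tasks() {
    sed -n 's/.*INFO: task \([^ ]*\) blocked for more than [0-9]* seconds\..*/\1/p'
}
```

For example, 'dmesg | hung_tasks' lists the tasks flagged by the hung task detector on the running system.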

Comment 3 Thomas Jaskiewicz 2018-04-13 18:24:12 UTC

*** Bug 1567215 has been marked as a duplicate of this bug. ***

Comment 6 Jakub Krysl 2018-07-03 08:01:22 UTC

Tested on:
RHEL-7.6-20180626.0
kernel-3.10.0-915.el7
kmod-vdo-6.1.1.99-1.el7
vdo-6.1.1.99-2.el7

I was not able to reproduce this anymore; the stop/start cycle keeps going.
Regression testing did not find any issues.

Comment 8 errata-xmlrpc 2018-10-30 09:39:22 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3094
