Bug 2033457 - "BUG: scheduling while atomic" sometimes when suspending VDO while writes are active
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: kmod-kvdo
Version: 9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: sclafani
QA Contact: Filip Suba
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-12-16 21:38 UTC by sclafani
Modified: 2022-05-17 16:10 UTC (History)
3 users

Fixed In Version: kmod-kvdo-8.1.1.360-12.el9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-17 15:49:27 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
test case (2.69 KB, text/plain)
2021-12-16 21:38 UTC, sclafani


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-106049 0 None None None 2021-12-16 21:39:56 UTC
Red Hat Product Errata RHBA-2022:3919 0 None None None 2022-05-17 15:49:38 UTC

Description sclafani 2021-12-16 21:38:15 UTC
Created attachment 1846645 [details]
test case

Description of problem:
Suspending VDO performs a synchronous flush by calling submit_bio_wait(), and this call is made while a spin lock in the batch processor is held. Under some circumstances (so far, only when growing the logical volume while writes are in progress), the flush actually has to wait, causing the thread to sleep in atomic context.
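The problematic pattern can be illustrated with a minimal kernel-style C sketch (the lock and variable names below are illustrative, not the actual kvdo symbols): submit_bio_wait() may sleep while waiting for the bio to complete, so calling it under a spinlock, where preemption is disabled, trips the kernel's "scheduling while atomic" check.

```c
/* Illustrative sketch only -- not the actual kvdo code. The names
 * batch, flush_bio, and local are hypothetical. */

/* Buggy shape: */
spin_lock(&batch->lock);        /* preemption disabled from here */
submit_bio_wait(flush_bio);     /* may sleep -> "scheduling while atomic" */
spin_unlock(&batch->lock);

/* Safe shape: take what is needed under the lock, then drop the
 * lock before any operation that can sleep. */
spin_lock(&batch->lock);
list_splice_init(&batch->bios, &local);  /* grab pending work */
spin_unlock(&batch->lock);
submit_bio_wait(flush_bio);              /* sleeping is fine here */
```

Per the stack trace below, the flush is reached from batch_processor_work() via the completion callback chain, which matches this shape: the sleep happens on a path that still holds the batch processor's lock.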

Version-Release number of selected component (if applicable):


How reproducible:
Intermittently, running our basic test of growing a logical volume, which creates a 5GB VDO volume and then grows the logical size by 40GB while concurrently writing 5000 blocks to VDO.

Steps to Reproduce:
See attached test case outline.

Actual results:
"BUG: scheduling while atomic" and a backtrace (see below) are logged.

Expected results:
The message and backtrace do not appear.

Additional info:

Dec 15 02:31:33 pfarm-069 kernel: kvdo1:lvresize: preparing to modify device '253:4' 
Dec 15 02:31:33 pfarm-069 kernel: kvdo1:lvresize: Preparing to resize logical to 11796736 
Dec 15 02:31:33 pfarm-069 kernel: kvdo1:lvresize: Done preparing to resize logical 
Dec 15 02:31:33 pfarm-069 kernel: kvdo1:lvresize: suspending device '253:4' 
Dec 15 02:31:33 pfarm-069 kernel: BUG: scheduling while atomic: kvdo1:cpuQ1/1725297/0x00000002 
Dec 15 02:31:33 pfarm-069 kernel: Modules linked in: kvdo(OE) uds(OE) lz4_compress ext4 mbcache jbd2 dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio iscsi_target_mod target_core_mod nfsv3 nfs_acl nfs lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common isst_if_common nfit libnvdimm kvm_intel kvm cirrus drm_kms_helper irqbypass syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_piix4 virtio_balloon cec joydev pcspkr vfat fat drm fuse permatest(POE) xfs libcrc32c ata_generic crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ata_piix virtio_net net_failover libata failover serio_raw virtio_blk sunrpc dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi [last unloaded: uds]
Dec 15 02:31:33 pfarm-069 kernel: Preemption disabled at: 
Dec 15 02:31:33 pfarm-069 kernel: [<0000000000000000>] 0x0 
Dec 15 02:31:33 pfarm-069 kernel: CPU: 1 PID: 1725297 Comm: kvdo1:cpuQ1 Kdump: loaded Tainted: P           OE    --------- ---  5.14.0-29.el9.x86_64 #1 
Dec 15 02:31:33 pfarm-069 kernel: Hardware name: Red Hat OpenStack Compute, BIOS 1.13.0-2.module+el8.2.1+7284+aa32a2c4 04/01/2014 
Dec 15 02:31:33 pfarm-069 kernel: Call Trace: 
Dec 15 02:31:33 pfarm-069 kernel: dump_stack_lvl+0x34/0x44 
Dec 15 02:31:33 pfarm-069 kernel: __schedule_bug.cold+0x7d/0x8b 
Dec 15 02:31:33 pfarm-069 kernel: __schedule+0x400/0x560 
Dec 15 02:31:33 pfarm-069 kernel: schedule+0x43/0xd0 
Dec 15 02:31:33 pfarm-069 kernel: schedule_timeout+0x88/0x150 
Dec 15 02:31:33 pfarm-069 kernel: ? __bpf_trace_tick_stop+0x10/0x10 
Dec 15 02:31:33 pfarm-069 kernel: io_schedule_timeout+0x4c/0x70 
Dec 15 02:31:33 pfarm-069 kernel: wait_for_completion_io_timeout+0x84/0x100 
Dec 15 02:31:33 pfarm-069 kernel: submit_bio_wait+0x72/0xb0 
Dec 15 02:31:33 pfarm-069 kernel: vdo_synchronous_flush+0x76/0xd0 [kvdo] 
Dec 15 02:31:33 pfarm-069 kernel: ? submit_bio_wait+0xb0/0xb0 
Dec 15 02:31:33 pfarm-069 kernel: suspend_callback+0x265/0x2e0 [kvdo] 
Dec 15 02:31:33 pfarm-069 kernel: invoke_vdo_completion_callback+0x60/0x70 [kvdo] 
Dec 15 02:31:33 pfarm-069 kernel: complete_vdo_completion+0x2a/0x60 [kvdo] 
Dec 15 02:31:33 pfarm-069 kernel: limiter_release_many+0x9c/0xb0 [kvdo] 
Dec 15 02:31:33 pfarm-069 kernel: complete_many_requests+0x90/0xd0 [kvdo] 
Dec 15 02:31:33 pfarm-069 kernel: return_data_vio_batch_to_pool+0xc3/0x170 [kvdo] 
Dec 15 02:31:33 pfarm-069 kernel: batch_processor_work+0x31/0xa0 [kvdo] 
Dec 15 02:31:33 pfarm-069 kernel: service_work_queue+0xf0/0x410 [kvdo] 
Dec 15 02:31:33 pfarm-069 kernel: ? do_wait_intr_irq+0xa0/0xa0 
Dec 15 02:31:33 pfarm-069 kernel: work_queue_runner+0x78/0x90 [kvdo] 
Dec 15 02:31:33 pfarm-069 kernel: ? service_work_queue+0x410/0x410 [kvdo] 
Dec 15 02:31:33 pfarm-069 kernel: kthread+0x135/0x160 
Dec 15 02:31:33 pfarm-069 kernel: ? set_kthread_struct+0x40/0x40 
Dec 15 02:31:33 pfarm-069 kernel: ret_from_fork+0x22/0x30 
Dec 15 02:31:33 pfarm-069 kernel: uds: kvdo1:journalQ: beginning save (vcn 4294967295) 
Dec 15 02:31:33 pfarm-069 kernel: uds: kvdo1:journalQ: finished save (vcn 4294967295) 
Dec 15 02:31:33 pfarm-069 kernel: kvdo1:lvresize: device '253:4' suspended 
Dec 15 02:31:33 pfarm-069 kernel: dm-4: detected capacity change from 10487808 to 94373888 
Dec 15 02:31:33 pfarm-069 kernel: kvdo1:lvresize: resuming device '253:4'

Comment 2 Andy Walsh 2022-02-15 14:04:01 UTC
Available in the most recent build.

Comment 5 Filip Suba 2022-02-28 11:34:23 UTC
Verified with kmod-kvdo-8.1.1.360-14.el9.

Comment 7 errata-xmlrpc 2022-05-17 15:49:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (new packages: kmod-kvdo), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:3919

