Bug 2033377
Summary: | block io scenario now fails when run on writecache origin volume in rhel8.6 | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 8 | Reporter: | Corey Marthaler <cmarthal> |
Component: | lvm2 | Assignee: | David Teigland <teigland> |
lvm2 sub component: | Cache Logical Volumes | QA Contact: | cluster-qe <cluster-qe> |
Status: | CLOSED NOTABUG | Docs Contact: | |
Severity: | high | ||
Priority: | high | CC: | agk, heinzm, jbrassow, msnitzer, prajnoha, teigland, zkabelac |
Version: | 8.6 | Keywords: | Regression, Triaged |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2022-02-09 15:11:28 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 2015587 |
Description
Corey Marthaler
2021-12-16 16:07:11 UTC
A full direct write to the same sized wc origin works on both 8.5 and 8.6

8.5
[root@hayes-02 ~]# dd if=/dev/zero of=/dev/snapper_thinp/origin of=/dev/writecache_sanity/block_io_origin bs=1M oflag=direct
dd: error writing '/dev/writecache_sanity/block_io_origin': No space left on device
15361+0 records in
15360+0 records out
16106127360 bytes (16 GB, 15 GiB) copied, 299.501 s, 53.8 MB/s

8.6
[root@hayes-03 bin]# dd if=/dev/zero of=/dev/snapper_thinp/origin of=/dev/writecache_sanity/block_io_origin bs=1M oflag=direct
dd: error writing '/dev/writecache_sanity/block_io_origin': No space left on device
15361+0 records in
15360+0 records out
16106127360 bytes (16 GB, 15 GiB) copied, 410.76 s, 39.2 MB/s

I tried this with the following kernels (starting from 348.4,3,2,1 and up from 348) and the only thing that seemed to matter was the lvm build that was installed.

This problem does NOT happen in 12-10 regardless of what kernel is running.
lvm2-2.03.12-10.el8

This problem DOES happen in 14-1 regardless of what kernel is running.
lvm2-2.03.14-1.el8 BUILT: Wed Oct 20 10:18:17 CDT 2021

[root@hayes-03 bin]# strace /usr/tests/sts-rhel8.3/bin/b_iogen -o -m random -f direct -s write,writev -t1000b -T10000b -d /dev/writecache_sanity/block_io_origin | /usr/tests/sts-rhel8.3/bin/b_doio -i 500 -v
execve("/usr/tests/sts-rhel8.3/bin/b_iogen", ["/usr/tests/sts-rhel8.3/bin/b_iog"..., "-o", "-m", "random", "-f", "direct", "-s", "write,writev", "-t1000b", "-T10000b", "-d", "/dev/writecache_sanity/block_io_"...], 0x7ffc478a01a8 /* 39 vars */) = 0
brk(NULL) = 0x197d000
arch_prctl(0x3001 /* ARCH_??? */, 0x7ffcd3fa2600) = -1 EINVAL (Invalid argument)
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=44263, ...}) = 0
mmap(NULL, 44263, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f3bf0ce3000
close(3) = 0
openat(AT_FDCWD, "/lib64/libxml2.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220\364\2\0\0\0\0\0"..., 832) = 832
lseek(3, 1427968, SEEK_SET) = 1427968
read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
fstat(3, {st_mode=S_IFREG|0755, st_size=1503528, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3bf0ce1000
lseek(3, 1427968, SEEK_SET) = 1427968
read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
mmap(NULL, 3568024, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3bf075b000
mprotect(0x7f3bf08b8000, 2093056, PROT_NONE) = 0
mmap(0x7f3bf0ab7000, 40960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x15c000) = 0x7f3bf0ab7000
mmap(0x7f3bf0ac1000, 4504, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f3bf0ac1000
close(3) = 0
openat(AT_FDCWD, "/lib64/libz.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@'\0\0\0\0\0\0"..., 832) = 832
lseek(3, 88664, SEEK_SET) = 88664
read(3, "\4\0\0\0 \0\0\0\5\0\0\0GNU\0\1\0\0\300\4\0\0\0\30\0\0\0\0\0\0\0"..., 48) = 48
fstat(3, {st_mode=S_IFREG|0755, st_size=95416, ...}) = 0
lseek(3, 88664, SEEK_SET) = 88664
read(3, "\4\0\0\0 \0\0\0\5\0\0\0GNU\0\1\0\0\300\4\0\0\0\30\0\0\0\0\0\0\0"..., 48) = 48
mmap(NULL, 2187272, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3bf0544000
mprotect(0x7f3bf055a000, 2093056, PROT_NONE) = 0
mmap(0x7f3bf0759000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x15000) = 0x7f3bf0759000
mmap(0x7f3bf075a000, 8, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f3bf075a000
close(3) = 0
openat(AT_FDCWD, "/lib64/liblzma.so.5", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\2405\0\0\0\0\0\0"..., 832) = 832
lseek(3, 150808, SEEK_SET) = 150808
read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
fstat(3, {st_mode=S_IFREG|0755, st_size=192016, ...}) = 0
lseek(3, 150808, SEEK_SET) = 150808
read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
mmap(NULL, 2252808, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3bf031d000
mprotect(0x7f3bf0342000, 2097152, PROT_NONE) = 0
mmap(0x7f3bf0542000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7f3bf0542000
mmap(0x7f3bf0543000, 8, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f3bf0543000
close(3) = 0
openat(AT_FDCWD, "/lib64/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \305\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2191520, ...}) = 0
mmap(NULL, 3674432, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3beff9b000
mprotect(0x7f3bf011c000, 2093056, PROT_NONE) = 0
mmap(0x7f3bf031b000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x180000) = 0x7f3bf031b000
close(3) = 0
openat(AT_FDCWD, "/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300\20\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=28840, ...}) = 0
mmap(NULL, 2109744, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3befd97000
mprotect(0x7f3befd9a000, 2093056, PROT_NONE) = 0
mmap(0x7f3beff99000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7f3beff99000
close(3) = 0
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\255\3\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=3167176, ...}) = 0
lseek(3, 808, SEEK_SET) = 808
read(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32) = 32
mmap(NULL, 3950400, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3bef9d2000
mprotect(0x7f3befb8e000, 2093056, PROT_NONE) = 0
mmap(0x7f3befd8d000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1bb000) = 0x7f3befd8d000
mmap(0x7f3befd93000, 14144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f3befd93000
close(3) = 0
openat(AT_FDCWD, "/lib64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240n\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=321536, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3bf0cdf000
mmap(NULL, 2225344, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3bef7b2000
mprotect(0x7f3bef7cd000, 2093056, PROT_NONE) = 0
mmap(0x7f3bef9cc000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a000) = 0x7f3bef9cc000
mmap(0x7f3bef9ce000, 13504, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f3bef9ce000
close(3) = 0
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3bf0cdc000
arch_prctl(ARCH_SET_FS, 0x7f3bf0cdc740) = 0
mprotect(0x7f3befd8d000, 16384, PROT_READ) = 0
mprotect(0x7f3bef9cc000, 4096, PROT_READ) = 0
mprotect(0x7f3beff99000, 4096, PROT_READ) = 0
mprotect(0x7f3bf031b000, 4096, PROT_READ) = 0
mprotect(0x7f3bf0542000, 4096, PROT_READ) = 0
mprotect(0x7f3bf0759000, 4096, PROT_READ) = 0
mprotect(0x7f3bf0ab7000, 36864, PROT_READ) = 0
mprotect(0x60b000, 4096, PROT_READ) = 0
mprotect(0x7f3bf0cee000, 4096, PROT_READ) = 0
munmap(0x7f3bf0ce3000, 44263) = 0
set_tid_address(0x7f3bf0cdca10) = 849654
set_robust_list(0x7f3bf0cdca20, 24) = 0
rt_sigaction(SIGRTMIN, {sa_handler=0x7f3bef7b8920, sa_mask=[], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x7f3bef7c4c20}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {sa_handler=0x7f3bef7b89b0, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f3bef7c4c20}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
getpid() = 849654
brk(NULL) = 0x197d000
brk(0x199e000) = 0x199e000
brk(NULL) = 0x199e000
openat(AT_FDCWD, "/dev/writecache_sanity/block_io_origin", O_RDONLY) = 3
ioctl(3, BLKGETSIZE64, [16106127360]) = 0
close(3) = 0
write(2, "b_iogen starting up with the fol"..., 467b_iogen starting up with the following:
Iterations: Infinite
Seed: 849654
Offset-mode: random
Single Pass: off
Overlap Flag: on
Mintrans: 512000
Maxtrans: 5120000
Syscalls: write writev
Flags: direct
Test Devices:
Path Size (bytes)
---------------------------------------------------------------
) = 467
write(2, "/dev/writecache_sanity/block_io_"..., 90/dev/writecache_sanity/block_io_origin 16106127360
) = 90
futex(0x7f3bf0ac1e88, FUTEX_WAKE_PRIVATE, 2147483647) = 0
fstat(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
write(1, "<xior magic=\"0xfeed10\"><write sy"..., 4096) = 4096
write(1, "O_DIRECT</oflags><offset>1647623"..., 4096) = 4096
write(1, "<xior magic=\"0xfeed10\"><write sy"..., 4096) = 4096
write(1, "RECT</oflags><offset>967073792</"..., 4096) = 4096
Can not writev() 2548736 bytes to 2144523264 on /dev/writecache_sanity/block_io_origin: Invalid argument
write(1, "or magic=\"0xfeed10\"><write sysca"..., 4096) = 4096
write(1, "IRECT</oflags><offset>420069376<"..., 4096) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=849654, si_uid=0} ---
+++ killed by SIGPIPE +++

The problem occurs in the io path, and since lvm is not used in the io path it's hard to imagine how the lvm version changes the behavior.
lvm of course impacts it by setting dm tables, so it's probably worthwhile to compare the dm tables in the working/non-working instances. Unfortunately the test program is doing too many things to make it clear which io system call is the problem, or even if the io error was captured.

The b_doio writev fails when the logical block size of the LV is 4k. I don't know if it's specific to b_doio or if any program using writev() would fail the same way (I don't know of a basic io test offhand that uses writev directly to a block device; most io testing is done on file systems.)

Previous versions of lvm defaulted to 512 LBS when no fs block size was detected on the LV, but this recently changed to default to 4k LBS when no fs block size is found, which explains why updating the lvm version would change the behavior. Previous lvm versions have the same problem with 4k LBS.

I'm guessing that writev() has some io size or alignment requirements that are violated given the combination of io parameters and block size. Or, it could be a bug in the io layer, perhaps related to checking io requests against device properties.

I haven't been able to strace the b_doio process given the awkward way that these test programs run:
/root/sts/src/b_doio/b_iogen -o -m random -f direct -s write -t1000b -T10000b -d /dev/cd/main | /root/sts/src/b_doio/b_doio -i 500 -v

I also haven't found any documentation about writev() requirements related to device properties or io size/alignment requirements.

The same error occurs with a raid+integrity LV that has a 4k logical block size, where using 512 works fine. dm-writecache and dm-integrity both have target-specific options for setting the logical block size of their dm device.

As for testing other LV types (e.g. linear), I don't think that lvm/dm have a way of forcing a specific logical block size of the dm device (overriding the LBS from the underlying devices). Something like dm-ebs would be needed.

There's no code change to make here.
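
As a rough check of the alignment guess above, the numbers from the failing writev() can be tested against both logical block sizes. This is only a sketch: it assumes that direct io requires the offset and length to be multiples of the device's logical block size, which nothing in this bug confirms as the cause. The helper names are made up for illustration and are not part of b_iogen or b_doio.

```python
# Sketch: check an io request against a device's logical block size (LBS).
# Assumption (not confirmed in this bug): a direct io request is rejected
# when its offset or length is not a multiple of the LBS.

def misalignment(offset, length, lbs):
    """Return (offset % lbs, length % lbs); (0, 0) means aligned."""
    return offset % lbs, length % lbs


def is_aligned(offset, length, lbs):
    """True when both offset and length are multiples of lbs."""
    return misalignment(offset, length, lbs) == (0, 0)


# Values copied from the b_doio error message in the description.
FAILED_OFFSET = 2144523264
FAILED_LENGTH = 2548736

# b_iogen reported -t1000b as Mintrans 512000, so one "b" is 512 bytes.
B_UNIT = 512000 // 1000

if __name__ == "__main__":
    print("b unit in bytes:", B_UNIT)
    for lbs in (512, 4096):
        rem = misalignment(FAILED_OFFSET, FAILED_LENGTH, lbs)
        print("LBS", lbs, "aligned:",
              is_aligned(FAILED_OFFSET, FAILED_LENGTH, lbs),
              "remainders:", rem)
```

Under that assumption the failing request is aligned for a 512 LBS but leaves a remainder of 1024 bytes in both offset and length for a 4k LBS. That fits with b_iogen generating transfers in 512-byte units and with the failure only showing up on 4k LBS devices.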