Bug 844348
Summary: | Some (older) SSDs are slower with rotational=0 flag set | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Milan Broz <mbroz> |
Component: | kernel | Assignee: | Jeff Moyer <jmoyer> |
Status: | CLOSED NOTABUG | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 6.4 | CC: | jmoyer, pvrabec, rwheeler |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2012-08-06 17:00:08 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Milan Broz
2012-07-30 11:53:46 UTC
Jeff Moyer:

This is functioning as designed. What happens is this: using the CFQ I/O scheduler, you run a sequential read workload with readahead set to 128KB (it doesn't matter whether the rotational flag is set to 0 or 1). During this run, the device queue depth is driven beyond 4, at which point the kernel marks the device with the QUEUE_FLAG_CQ flag. When this flag is set, it affects whether and how long the device queue remains plugged (see queue_should_plug). Basically, if the device is non-rotational and supports command queuing, we go ahead and send requests sooner rather than later, under the assumption that newer SSDs will have no problem driving high IOPS.

The deadline I/O scheduler doesn't drive a queue depth of more than 2 for this particular workload, so the QUEUE_FLAG_CQ flag never gets set. Because of that, only read-ahead-sized I/Os make it to disk, and you get better throughput.

In general, your test workload is a poor one. Buffered I/O to the block device is not a path that we tune for performance. Also, a single-threaded read is fairly simplistic, especially when taken as the lone data point. So, I'm closing this bugzilla as NOTABUG.

Milan Broz (comment #3):

If it is functioning as designed, why does it work better in the upstream kernel? :-)

Whatever, I really do not care. This bug was discovered as part of testing of a more complex problem upstream.

Jeff Moyer:

(In reply to comment #3)
> If it is functioning as designed, why does it work better in the upstream
> kernel? :-)

The on-stack plugging patches introduced plugging where there previously was none (a quick blktrace run on 3.5.0 confirms this). In other words, I/Os are queued up before even getting to the I/O scheduler, so by the time they get there, they are "complete." Your 256-page read-ahead I/O arrives in one chunk, instead of as a bunch of Queue/Merge events. The on-stack plugging work is pretty invasive, so I don't think it's feasible to backport it to RHEL 6.

> Whatever, I really do not care. This bug was discovered as part of testing
> of a more complex problem upstream.

If this were a more realistic workload, I'd put more time into it. I just don't think anyone cares about a single dd to the block device (and if you do care, you can tune the system for this workload).