Bug 173157
| Summary: | kernel dm-log: big endian 64-bit corruption | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Alasdair Kergon <agk> |
| Component: | kernel | Assignee: | Alasdair Kergon <agk> |
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 4.0 | CC: | jbaron |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | RHSA-2006-0132 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2006-03-07 20:43:47 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 168429 | ||
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0132.html |
The linux bitset operators (test_bit, set_bit etc) work on arrays of "unsigned long". dm-log uses such bitsets but treats them as arrays of uint32_t, only allocating and zeroing a multiple of 4 bytes (as 'clean_bits' is a uint32_t). The patch below fixes this problem. The problem is specific to 64-bit big endian machines such as s390x or ppc-64 and can prevent pvmove terminating. In the simplest case, if "region_count" were (say) 30, then bitset_size (below) would be 4 and bitset_uint32_count would be 1. Thus the memory for this butset, after allocation and zeroing would be 0 0 0 0 X X X X On a bigendian 64bit machine, bit 0 for this bitset is in the 8th byte! (and every bit that dm-log would use would be in the X area). 0 0 0 0 X X X X ^ here which hasn't been cleared properly. As the dm-raid1 code only syncs and counts regions which have a 0 in the 'sync_bits' bitset, and only finishes when it has counted high enough, a large number of 1's among those 'X's will cause the sync to not complete. It is worth noting that the code uses the same bitsets for in-memory and on-disk logs. As these bitsets are host-endian and host-sized, this means that they cannot safely be moved between computers with different architectures. Signed-off-by: Neil Brown <neilb> dm-log-bitset-fix.patch