| Summary: | stripe entry self-heal.. | ||
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Amar Tumballi <amarts> |
| Component: | stripe | Assignee: | Amar Tumballi <amarts> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | low | Docs Contact: | |
| Priority: | low | ||
| Version: | mainline | CC: | gluster-bugs, jdarcy, rabhat, vraman |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | --- | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Amar Tumballi
2010-01-12 18:47:42 UTC
I have a strong interest in this because I'm working on a stripe/dht hybrid right now (create/open/read/write already work, and it distributes data much better than stripe or dht alone). Here are some observations.

The stat structure does not come only from the first subvolume. What actually happens is that stripe_stat sends the stat call to all subvolumes, and the results are collected in stripe_stack_unwind_buf_cbk. The most important reason for this is to ensure that st_blocks and st_size get the correct values, which depend on the values from all subvolumes.

We can heal the directory if it's missing, but it's not clear whether that really does any good, since the files themselves will still be missing. If reads are attempted on blocks in the lost file, those reads will fail because they're beyond the (empty) replacement's EOF. That's slightly unfortunate, but the situation is far worse if the file is extended.

Imagine a file striped as 3x64K, and for some reason the directory on the last subvolume (index 2) is lost. If a user opens the file and then writes one byte at 320K, the replacement file on subvolume 2 will be extended with a hole up to that point. Subsequent reads from 128K to 192K-1 will fall into the hole and appear to succeed, but the data returned will be zeros. Recent experience with a similar bug on a different filesystem reinforces the point that most users would consider this a form of data corruption and would prefer that such reads fail completely instead of returning incorrect data.

PATCH: http://patches.gluster.com/patch/2655 in master (stripe entry self heal)
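
The 3x64K scenario described above can be checked with a short sketch. This is a minimal model, not GlusterFS code: the function names are hypothetical, the round-robin mapping is inferred from the example, and it assumes each subvolume's file is sparse and addressed with the same offsets as the striped file.

```python
# Minimal model of the 3x64K example. All names here are hypothetical and
# are not part of the GlusterFS stripe translator.
KIB = 1024


def subvolume_for_offset(offset, stripe_size, subvol_count):
    """Return the index of the subvolume holding the byte at `offset`.

    Assumes plain round-robin striping in units of `stripe_size`.
    """
    return (offset // stripe_size) % subvol_count


def read_from_replacement(file_size, offset, length):
    """Model a read against an initially empty replacement file.

    Assumes the subvolume file is sparse and uses the striped file's own
    offsets, so any unwritten range below EOF reads back as zeros.
    Returns None when the read starts at or beyond EOF.
    """
    if offset >= file_size:
        return None
    return bytes(min(length, file_size - offset))


stripe_size = 64 * KIB
subvol_count = 3
write_offset = 320 * KIB   # the one-byte write from the example
read_offset = 128 * KIB    # start of the range that later reads as zeros

# Both the write and the later read land on the lost subvolume (index 2).
write_subvol = subvolume_for_offset(write_offset, stripe_size, subvol_count)
read_subvol = subvolume_for_offset(read_offset, stripe_size, subvol_count)

# Before the write the replacement file is empty, so the read is past EOF.
before = read_from_replacement(0, read_offset, 4096)

# After the one-byte write, EOF moves to 320K + 1 and the read falls
# into the hole, returning zeros instead of failing.
after = read_from_replacement(write_offset + 1, read_offset, 4096)

print(write_subvol, read_subvol, before, after == bytes(4096))
```

In this model the read is refused while the replacement file is empty but returns zero-filled data once the file has been extended, which is the silent-corruption case the description warns about.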