Bug 783211

Summary: Cache inconsistency when reading from a partition vs the parent block_device
Product: [Fedora] Fedora Reporter: Joey Boggs <jboggs>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: high    
Version: 19 CC: apevec, bmr, bsarathy, jboggs, jforbes, kernel-maint, mburns, moshiro, ndevos, oschreib, ovirt-maint, peterm, the.ridikulus.rat
Target Milestone: --- Keywords: Patch, Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 782771 Environment:
Last Closed: 2013-04-23 17:25:58 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 761540    
Attachments:
Description: fs: Invalidate the cache for a parent block-device if fsync() is called for a partition Flags: none

Comment 1 Joey Boggs 2012-01-19 17:02:59 UTC
Please pull in this upstream patch, which is needed for ovirt-node:

http://thread.gmane.org/gmane.linux.kernel/1241227

Comment 2 Josh Boyer 2012-01-19 17:25:43 UTC
I've started a scratch build of F16 with that patch included.  Could you test this when it finishes building and let us know if it solves the issues you're seeing with ovirt-node?

http://koji.fedoraproject.org/koji/taskinfo?taskID=3715427

Comment 3 Josh Boyer 2012-01-20 02:25:42 UTC
This patch got at least one NAK upstream:

https://lkml.org/lkml/2012/1/19/455

I'll keep an eye on it.

Comment 4 Niels de Vos 2012-01-23 11:27:07 UTC
A 2nd version of the patch is available for review here:
- http://thread.gmane.org/gmane.linux.kernel/1241227/focus=1241689

Note that this version no longer changes the behavior of the BLKFLSBUF ioctl; it only affects fsync() on the device node of a partition.

I do not know how ovirt-node initiates a sync, so it may well be that this patch does not fix the problem for ovirt-node.

A kernel for testing is building at the moment:
- http://koji.fedoraproject.org/koji/taskinfo?taskID=3724542

Please report if that kernel fixes the problem for ovirt-node.

Comment 5 Joey Boggs 2012-01-23 16:35:04 UTC
Niels,

Our reproducer was fairly high-level:

parted -s /dev/mapper/XXXXX "mkpart efi 0M 256M"
mkfs.vfat /dev/mapper/XXXXp1 -n EFI

parted -s /dev/mapper/XXXXX "mkpart root 256M 512M"
parted -s /dev/mapper/XXXXX "mkpart rootbackup 512M 768M"
mke2fs /dev/mapper/XXXXp2 -L Root
mke2fs /dev/mapper/XXXXp3 -L RootBackup

The vfat partition's filesystem was corrupted after running the remaining parted commands.

The original patch fixed the issue well enough that we could work around it, but there seemed to be another underlying issue with the dm device.
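
A minimal sketch of one way to check for the inconsistency named in the summary; it is not part of the original reproducer. It reads the same sectors through the partition node and through the parent device with ordinary buffered I/O and compares checksums. The function name compare_views and its arguments are hypothetical, and the caller has to supply the partition's start sector (for ordinary partitions the kernel exposes it in /sys/class/block/<partition>/start, in 512-byte sectors).

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# Usage: compare_views PARENT PART START_SECTOR COUNT_SECTORS
# Returns 0 if COUNT_SECTORS sectors read from the start of PART match the
# sectors read from PARENT beginning at START_SECTOR, non-zero otherwise.
compare_views() {
    cv_parent=$1
    cv_part=$2
    cv_start=$3
    cv_count=$4

    # Buffered read through the parent device, skipping to the partition start.
    cv_parent_sum=$(dd if="$cv_parent" bs=512 skip="$cv_start" count="$cv_count" 2>/dev/null \
        | sha256sum | cut -d' ' -f1)

    # Buffered read of the same number of sectors through the partition node.
    cv_part_sum=$(dd if="$cv_part" bs=512 count="$cv_count" 2>/dev/null \
        | sha256sum | cut -d' ' -f1)

    [ "$cv_parent_sum" = "$cv_part_sum" ]
}
```

For example, with hypothetical device names and a hypothetical start sector: compare_views /dev/vdb /dev/vdb1 2048 8 || echo "views differ". A mismatch right after writing through only one of the two nodes would suggest that stale cached data is being returned.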

Comment 6 Niels de Vos 2012-01-23 17:27:02 UTC
Hi Joey,

that is very unfortunate. I expect it to work when you use normal partitions and not block-devices provided by device-mapper.

Someone else will need to check how this can be fixed in device-mapper. That was not the problem I was targeting with this patch.

Comment 7 Joey Boggs 2012-01-23 19:17:57 UTC
Niels,

FWIW your patch still works for me with the dm devices on top.

Comment 8 Niels de Vos 2012-01-24 15:46:35 UTC
Ah, that's good to hear. Evidently device-mapper sets up the parent of XXXXp1 and similar devices the same way as for traditional partitions.

The patch currently has one Acked-by and one Reviewed-by, so far all looks good.

Comment 9 Niels de Vos 2012-01-24 16:47:35 UTC
Created attachment 557271 [details]
fs: Invalidate the cache for a parent block-device if fsync() is called for a partition

Comment 10 Mike Burns 2012-01-25 15:26:29 UTC
Is there any timeframe for a released kernel for this bz?  oVirt upstream release is blocking on this issue.

Comment 11 Josh Boyer 2012-01-25 15:43:22 UTC
(In reply to comment #10)
> Is there any timeframe for a released kernel for this bz?  oVirt upstream
> release is blocking on this issue.

oVirt upstream is blocked on an F16 kernel?  Does it need to be in the stable updates repository, or just built in koji?

I'll get the patch included later today but the next update we push is going to be when 3.2.2 is released and built.

Comment 12 Mike Burns 2012-01-25 15:53:37 UTC
(In reply to comment #11)
> 
> oVirt upstream is blocked on an F16 kernel?  Does it need to be in the stable
> updates repository, or just built in koji?

ovirt-node depends heavily on Fedora.  This bug is blocking ovirt-node from working on UEFI machines at all.

Ideally, it would be in stable.  I'll check to see if we can pull a one-off build for the release though.

> 
> I'll get the patch included later today but the next update we push is going to
> be when 3.2.2 is released and built.

Is there a timeframe for that to be done?

Thanks

Comment 13 Josh Boyer 2012-01-25 16:01:49 UTC
(In reply to comment #12)
> (In reply to comment #11)
> > 
> > oVirt upstream is blocked on an F16 kernel?  Does it need to be in the stable
> > updates repository, or just built in koji?
> 
> ovirt-node depends heavily on Fedora.  This bug is blocking ovirt-node from
> working on UEFI machines at all.
> 
> > Ideally, it would be in stable.  I'll check to see if we can pull a one-off
> build for the release though.

If you need it in stable, you will be waiting a while.  Depending on Fedora is excellent, but it also means you depend on how Fedora works: you have to wait until a submitted update gets enough karma to make it into the stable updates repository.

I would seriously suggest looking at having a procedure in place for getting one-off builds from koji or something similar.

> > I'll get the patch included later today but the next update we push is going to
> > be when 3.2.2 is released and built.
> 
> Is there a timeframe for that to be done?

It is supposed to be released upstream today in about 4 hours.  I'll get it integrated and built shortly thereafter.  Then I'll submit the update.  Assuming it releases on time, it should be queued for updates-testing sometime later this evening.

Comment 14 Perry Myers 2012-01-25 16:58:58 UTC
mburns, Josh brings up a good point.  Either we're going to need to align oVirt release schedules to be dependent on Fedora dates or we're going to need to be comfortable taking stuff from updates-testing and/or rawhide.

This probably means that we will need to have processes in place for oVirt Node builds that allow us to take selected packages from updates-testing/rawhide on an as needed basis.  Can we make that happen?

Comment 15 Mike Burns 2012-01-25 17:05:37 UTC
Dynamically saying "Use the kernel from updates-testing" is not easy.  It would lead to all of updates-testing getting pulled in, which we probably don't want.  We do have the ability to use a custom repo, so downloading the build and creating a local repo is possible (and what I would probably do for this particular build).  Automation is a lot more difficult though.

My main concern with taking a one-off build like this is that it's not an official build that has gone through full testing, so we open ourselves to other issues.

Comment 16 Josh Boyer 2012-01-25 23:06:54 UTC
I've added Niels' patch to the F16 kernel.  It will be in the next build I do.  Hopefully that is 3.2.2, but it seems a bit slow in coming, so if it doesn't show up in the next hour or two I'll just do another 3.2.1 build.

Comment 17 Niels de Vos 2012-01-26 13:42:02 UTC
/side note/
The phrasing of the commit message has been changed in a 3rd revision of the patch. There is no change in functionality, hence no need to replace the patch in the Fedora kernel package.

Comment 18 Fedora Update System 2012-01-26 13:52:49 UTC
kernel-3.2.2-1.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/kernel-3.2.2-1.fc16

Comment 19 Fedora Update System 2012-01-26 22:54:39 UTC
Package kernel-3.2.2-1.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.2.2-1.fc16'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-0949/kernel-3.2.2-1.fc16
then log in and leave karma (feedback).

Comment 20 Fedora Update System 2012-01-28 03:33:06 UTC
kernel-3.2.2-1.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 21 Josh Boyer 2012-01-29 13:59:29 UTC
Upstream still isn't liking this patch.  I'll leave it in for a while, but if upstream decides to leave this unfixed I think oVirt will need to investigate other measures.  If it's fixed some other way, we'll bring that fix back instead.

Comment 22 Niels de Vos 2012-01-31 16:13:30 UTC
Upstream would prefer a better fix that addresses the design issue that caused this problem in the first place.

It makes sense to change the design, but it will take (more) time to get an acceptable patch for that.

Comment 23 Niels de Vos 2012-02-03 18:31:05 UTC
It is highly unlikely that upstream will include the proposed patch any time soon. It will take considerably more effort to develop a solution that upstream can agree upon and accept.

It is recommended to use a workaround like
    # blockdev --flushbufs /dev/vdb
or
    # echo 3 > /proc/sys/vm/drop_caches

Alternatively, the tools affected by this issue may need to open devices with O_DIRECT. This is known to work for some use-cases.
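
The workarounds above could be wrapped roughly as follows. This is a sketch under stated assumptions, not a tested fix: the function names and the RUN dry-run variable are hypothetical, blockdev --flushbufs is the command quoted above, and iflag=direct is GNU dd's way of opening its input with O_DIRECT. The 4096-byte block size is an assumption chosen to satisfy O_DIRECT alignment on common devices.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.
# Set RUN=echo to print the commands instead of executing them (dry run).

# flush_parent PARENT
# Flush the buffer cache of the parent block device (needs root when executed).
flush_parent() {
    ${RUN:-} blockdev --flushbufs "$1"
}

# read_direct DEV SKIP_BLOCKS COUNT_BLOCKS
# Read COUNT_BLOCKS blocks of 4096 bytes from DEV, starting SKIP_BLOCKS blocks
# in, with O_DIRECT so the read bypasses the page cache. Data goes to stdout.
read_direct() {
    ${RUN:-} dd if="$1" bs=4096 skip="$2" count="$3" iflag=direct
}
```

With RUN=echo set, flush_parent /dev/vdb only prints the blockdev command, which makes it possible to review what would run before touching a real device.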

Josh, you probably want to pull this patch back from the package for the next release.

Comment 24 Josh Boyer 2012-02-03 19:11:25 UTC
(In reply to comment #23)
> Josh, you probably want to pull this patch back from the package for the next
> release.

Done.  Thank you for confirming.

I'm going to move this bug to rawhide until an upstream solution is settled on.

Comment 25 Fedora End Of Life 2013-04-03 18:09:34 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle.
Changing version to '19'.

(As we did not run this process for some time, it could also affect bugs from development
cycles before Fedora 19. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.)

More information and the reason for this action are here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19

Comment 26 Justin M. Forbes 2013-04-05 16:59:39 UTC
Is this still a problem with 3.9 based F19 kernels?

Comment 27 Justin M. Forbes 2013-04-23 17:25:58 UTC
This bug is being closed with INSUFFICIENT_DATA as there has not been a
response in 2 weeks.  If you are still experiencing this issue,
please reopen and attach the relevant data from the latest kernel you are
running and any data that might have been requested previously.