Bug 986035

Summary: Latest updates run 07/16/2013: Glibc causes partitions to mount read only
Product: Red Hat Enterprise Linux 6
Reporter: steve_ovens <steve_ovens>
Component: glibc
Assignee: Carlos O'Donell <codonell>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: qe-baseos-tools-bugs
Severity: unspecified
Priority: unspecified
Version: 6.4
CC: fweimer, mfranc, pfrankli, spoyarek
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2015-01-14 22:08:02 UTC
Type: Bug

Description steve_ovens@linux.com 2013-07-18 20:13:03 UTC
Description of problem:

After running yum update, multiple machines boot with all partitions mounted read-only.

Version-Release number of selected component (if applicable):


How reproducible:

I was able to reproduce this on multiple machines by running the updates, fixing the problem, and then re-running the updates.


Steps to Reproduce:
1. Run yum update
2. reboot

Actual results:

All file systems on LVM volumes mount read-only.

Expected results:

File systems mount read-write (RW).

Additional info:

I ran a number of disk checks, including fsck, smartctl, e2fsck, and badblocks. I also tried various options in /etc/fstab, toggled /etc/sysconfig/readonly to yes and then back to no, and disabled SELinux, among other things.

I eventually found a post on the Arch forums from 2012 which noted a similar problem. The fix was to downgrade glibc.

Running "yum downgrade glibc*" and rebooting fixed the problem.

I was able to reproduce this by re-running the updates.

Affects two machines, both on commercial/commodity hardware.

RHEL 6.4 x86_64
2.6.32-358.14.1.el6.x86_64

Comment 2 Carlos O'Donell 2013-07-18 22:29:19 UTC
Which glibc version are you running and which one did you upgrade to?

Is your system virtual? If yes, what kernel is the host running?

The Arch Linux problem was due to an incompatibility between the OpenVZ kernel and the VPS glibc shipped by the OpenVZ team. It involved the statfs64 f_flags field passed from the kernel to glibc; i.e., the kernel was too old for the glibc, which assumed f_flags were valid.

You should not have that kind of incompatibility between a RHEL kernel and a RHEL glibc.

Comment 3 steve_ovens@linux.com 2013-07-19 01:25:34 UTC
Neither system is virtual. Both are on physical hardware. My apologies: when I said commercial/commodity hardware, I was trying to say they are on physical hardware.

I do not recall which version of glibc I was running, but tomorrow I can provide the version that was upgraded to and the version that was downgraded to (which the system is currently at).

I did read the Arch report and realized it had to do with a hosted machine, but I was grasping at straws by the point at which I found it.

Comment 4 Carlos O'Donell 2013-07-19 03:23:48 UTC
(In reply to steve_ovens from comment #3)
> Neither system is virtual. Both are on physical hardware, my apologies when
> I said commercial commodity hardware I was trying to say they are on
> physical hardware.
> 
> I do not recall what version of glibc I was running but I can provide the
> version that was upgraded to and the version that was downgraded to (and is
> at currently) tomorrow.
> 
> I did read the Arch report and realized it had to do with a hosted machine,
> but I was grasping at straws at the point which I found that.

The problem reported for Arch could also happen to you, but it would require that you rebuild the kernel without f_flags support, or that glibc mis-detects that support. Therefore I'd like to know exactly which versions of glibc you moved between.

Also, if it is an f_flags issue, the filesystem will *appear* to be mounted read-only, but it will actually be writable: the kernel mounted it correctly, but userspace misreports it as read-only.

So in summary:
* Please provide the versions of glibc.
* Please try to write to the read only filesystem.
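
One way to separate the two cases described above (a filesystem that is really read-only versus one that userspace merely misreports) is to compare three views of the same mount point. The following is a minimal sketch, not something taken from this report: the function names are hypothetical, it assumes a Python 3 interpreter is available on the affected machine, and it assumes the kernel's view can be read from /proc/mounts. The write probe must be run as root to be meaningful.

```python
import os
import tempfile


def mount_options(mounts_text, mount_point):
    """Return the option list the kernel reports for mount_point, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep the last match: later mounts shadow earlier ones (e.g. rootfs).
            options = fields[3].split(",")
    return options


def statvfs_says_readonly(path):
    """What userspace sees through statvfs(), which goes through the C library."""
    return bool(os.statvfs(path).f_flag & os.ST_RDONLY)


def write_probe(directory):
    """Try to create (and automatically delete) a temporary file."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers a read-only filesystem, but also missing permissions.
        return False


def report(mount_points, mounts_path="/proc/mounts"):
    """Return (mount point, kernel says ro, statvfs says ro, write ok) tuples."""
    with open(mounts_path) as handle:
        mounts_text = handle.read()
    rows = []
    for mount_point in mount_points:
        options = mount_options(mounts_text, mount_point)
        kernel_ro = None if options is None else ("ro" in options)
        rows.append((mount_point, kernel_ro,
                     statvfs_says_readonly(mount_point),
                     write_probe(mount_point)))
    return rows
```

Calling `report(["/", "/var", "/tmp", "/home"])` as root gives one row per mount point. A row where the kernel reports rw and the write succeeds, but statvfs reports read-only, would be consistent with the f_flags misreporting scenario. A row where all three agree on read-only suggests the filesystem really is mounted read-only.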

Comment 5 steve_ovens@linux.com 2013-07-19 11:00:22 UTC
Based on yum.log, I moved from glibc

2.12-1.107.el6

To

2.12-1.107.el6_4.2


Additionally I moved between these kernel versions:

2.6.32-358.11.1.el6.x86_64

to

2.6.32-358.14.1.el6.x86_64
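
Since the exact glibc and kernel transitions matter here, it may help to pull them out of yum.log mechanically rather than by eye. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the line shape (a timestamp, then `Installed:`, `Updated:`, or `Erased:`, then the package) is assumed to match the usual /var/log/yum.log layout. The timestamp in the code comment is made up for illustration.

```python
import re

# Assumed line shape, e.g.:
#   "Jul 16 10:15:02 Updated: glibc-2.12-1.107.el6_4.2.x86_64"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]+)\s+"
    r"(?P<action>Installed|Updated|Erased):\s+"
    r"(?P<package>\S+)$"
)


def package_events(log_text, names=("glibc", "kernel")):
    """Return (timestamp, action, package) for packages matching names.

    A package matches if it equals a name or starts with "name-", so
    subpackages such as glibc-common are included as well.
    """
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        package = match.group("package")
        # Drop an optional "epoch:" prefix before comparing names.
        bare = package.split(":", 1)[-1]
        if any(bare == name or bare.startswith(name + "-") for name in names):
            events.append((match.group("stamp"),
                           match.group("action"),
                           package))
    return events
```

Reading /var/log/yum.log into a string and passing it to `package_events` would list every glibc and kernel change in order, which makes it easier to state both the "from" and the "to" version in the report.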

As for the read-only file system, it was definitely read-only. What prompted the investigation was that, during boot, there were a bunch of errors which said things like

"Cannot write pid to /var/lock/subsys/xxx read only file system"

The partition setup is as follows:

Regular partitions:

/boot

LVM:

/
/tmp
/home
/var


Given this, I thought there was a problem with the /var logical volume. I disabled /var in /etc/fstab and then attempted to copy the data from the /var logical volume directly onto the root partition. I had to run

mount -o remount,rw /

before I was able to. Again thinking this was a side effect of a problem with /var, I rebooted and had the same result: services failed to write PIDs into /var.

If I boot into single-user mode, remount the file system as rw, and then run

vgchange -ay
mount -a
init 3

the system works as expected. However, a regular boot produced the same problem: all file systems were read-only.


After downgrading glibc and rebooting, the file systems mount normally again. I did not compile these kernels myself; they came from the official repo.

Comment 6 Carlos O'Donell 2013-07-19 16:08:00 UTC
If the filesystem is actually mounted RO then something is going on in the startup process.

You will need to debug your startup to determine why and when everything gets flipped into RO mode.

The reason for being in RO mode will give us a hint to what might be wrong.

What does your system bootup (dmesg) say? What do your logs say about why the fs was mounted RO? Or not remounted RW?

This ticket does not yet conclusively point at a glibc issue.
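
As a starting point for the log review requested above, the boot messages can be filtered for lines that mention read-only mounts, remounts, or filesystem errors. This is only a rough sketch: the function name is hypothetical, and the patterns are assumptions about what such messages typically look like, not strings taken from this system's logs. They should be adjusted to whatever the affected machine actually prints.

```python
import re

# Assumed patterns; extend or replace them to match the real log output.
PATTERNS = re.compile(
    r"read[- ]?only|re-?mount|EXT[234]-fs (?:error|warning)|I/O error",
    re.IGNORECASE,
)


def suspicious_lines(log_text):
    """Return (line number, line) pairs for lines matching PATTERNS."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), 1)
        if PATTERNS.search(line)
    ]
```

Feeding it the saved output of dmesg or the contents of /var/log/messages narrows the search to the point in the boot sequence where the filesystems were left, or flipped, read-only.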