Description of problem:
On a two-node RHEL 4 U3 GFS cluster, I expand the logical volume a GFS filesystem is on, then execute gfs_grow /mnt/point on one node. Everything works fine on that node, but when you do a "df" on the second node, there is a short hang, then this message:

df: `/mnt/samGFS': Input/output error

That filesystem is then unusable until you umount and mount again. This happened on 2 of 3 tests; the third test had the expected results.

Version-Release number of selected component (if applicable):
GFS-6.1.5 (RHEL 4 U3)

How reproducible:
Intermittent

Steps to Reproduce:
On node 1:

# df -h
/dev/mapper/sfwMPtest-lvol0   2.8G  316M  2.5G  12% /mnt/samGFS

# lvresize -L +1024M /dev/sfwMPtest/lvol0
<success>

# gfs_grow /mnt/samGFS
FS: Mount Point: /mnt/samGFS
FS: Device: /dev/sfwMPtest/lvol0
FS: Options: rw,noatime,nodiratime
FS: Size: 786432
DEV: Size: 1048576
Preparing to write new FS information...
Done.

# df -h
/dev/mapper/sfwMPtest-lvol0   3.8G  316M  3.5G   9% /mnt/samGFS

So far, so good. But on node 2 at this point, a df -h results in:

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3              32G  1.3G   29G   5% /
/dev/sda1             190M   34M  147M  19% /boot
none                  506M     0  506M   0% /dev/shm
<hang for about 30 seconds>
df: `/mnt/samGFS': Input/output error

The mount point is not usable on node 2 at this point. After unmounting, then mounting again, it is fine. What strikes me is that I ran this procedure three times: twice I got this behavior, and once it worked perfectly on both nodes.

Actual results:
<hang for about 30 seconds>
df: `/mnt/samGFS': Input/output error

Expected results:
df should show the expanded size on both nodes without having to remount.

Additional info:
I could not recreate the failure in our lab. I used lvresize and gfs_grow to expand GFS filesystems in several ways on my two-node cluster, but it was always successful and showed no signs of I/O errors from either node when using df.

I'd like to get more specifics regarding how the original filesystem was created. If possible, I'd like the exact gfs_mkfs command used to create the failing filesystem. I'd also like to know the volume conditions, to determine whether the new extent could be crossing physical volume boundaries (sizes of the physical volumes, etc.).

Please check that both nodes in the cluster are at RHEL4 U3 and running the same kernel (i.e. with uname -a). Also, please check dmesg on both nodes to see if any kernel errors are logged after the failure occurs. Anything else you can tell me that would help me recreate the problem would be appreciated.
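To be concrete, a transcript along these lines from both nodes would cover most of the above (commands are suggestions only; adjust package names to what is actually installed):

# uname -a                 <- verify both nodes run the same U3 kernel
# rpm -q GFS GFS-kernel    <- verify matching GFS package versions
# dmesg | tail -50         <- capture any kernel errors logged after the failure
# pvs; vgs; lvs            <- physical/volume/logical volume sizes and layout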
Hi,

The command used to make the filesystem was:

gfs_mkfs -p lock_dlm -t rh4cluster:sfwMP2 -j 2 /dev/sfwMPtest/lvol0

This was done on the lab machines rh4cluster1 and rh4cluster2. The LVM has since been destroyed; I will try to recreate and reproduce later today. There were no errors whatsoever in creating or resizing the volume groups or logical volumes, however. Note that on node 1 the volumes were on a dm-multipathed device; I'm not sure if that would have an impact.

-Sam
Created attachment 126916 [details] output of all relevant commands
Hi,

I reproduced this today. Attached is the output of all the vg info commands. The command used to create the new filesystem was:

gfs_mkfs -p lock_dlm -t rh4cluster:sfwMP2 -j 2 /dev/sfwMPtest/lvol2

I remade the cluster config from scratch for this test. Again, node 1 is multipathed.

Sam
After a lot of tries, I recreated the problem. Dmesg showed:

attempt to access beyond end of device
dm-4: rw=0, want=4340112, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340120, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340128, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340136, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340144, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340152, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340160, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340168, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340176, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340184, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340192, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340200, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340208, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340216, limit=4194304
attempt to access beyond end of device
dm-4: rw=0, want=4340224, limit=4194304
GFS: fsid=bob_cluster2:bobs_gfs.0: fatal: I/O error
GFS: fsid=bob_cluster2:bobs_gfs.0: block = 542527
GFS: fsid=bob_cluster2:bobs_gfs.0: function = gfs_dreread
GFS: fsid=bob_cluster2:bobs_gfs.0: file = /usr/src/build/714649-i686/BUILD/gfs-kernel-2.6.9-49/up/src/gfs/dio.c, line = 576
GFS: fsid=bob_cluster2:bobs_gfs.0: time = 1143646309
GFS: fsid=bob_cluster2:bobs_gfs.0: about to withdraw from the cluster
GFS: fsid=bob_cluster2:bobs_gfs.0: waiting for outstanding I/O
GFS: fsid=bob_cluster2:bobs_gfs.0: telling LM to withdraw
lock_dlm: withdraw abandoned memory
GFS: fsid=bob_cluster2:bobs_gfs.0: withdrawn

For whatever reason, it seemed to recreate better after a fresh boot. It wouldn't recreate yesterday after many attempts without a reboot.
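One observation on the numbers above (my own arithmetic, not from the logs): gfs_mkfs defaults to 4 KiB blocks, i.e. 8 512-byte sectors per block, so the failing GFS block maps to sector:

# echo $((542527 * 8))
4340216

That sector falls inside the rejected want= range, while dm-4's table on the failing node still ends at sector 4194304 (4194304 * 512 = 2 GiB). In other words, GFS is reading from the grown region, but this node's device-mapper table was never updated to the new size.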
This exact behavior can be seen if the volume group was created before the logical volume manager (lvm2) was put into clustering mode. This can happen if vgcreate is done before the lvm configuration file (/etc/lvm/lvm.conf) is changed to set locking_type = 2. If the volume group was created before cluster locking for volumes was set up, the clustering flag for the vg will not be set, and changes to the underlying logical volumes (in this case, lvresize) will not be propagated throughout the cluster.

If the volume is linear (i.e. not striped or mirrored, etc.), all nodes in the cluster can still access the data successfully. That's because of the way the cluster suite is designed to work independently: (1) GFS can act as a stand-alone filesystem in a non-clustered environment, (2) cluster suite can manage resources without GFS, and (3) volume groups can be used in either clustered or non-clustered environments. However, this also means you may not notice a problem until you do an operation like lvresize.

The solution is to check the clustering flag for the volume group and turn it on if necessary. To check whether the clustering flag is on for a volume group, use the "vgs" command and see if the "Attr" column shows a "c". If the Attr column shows something like "wz--n-", the clustering flag is off for the volume group; if it shows something like "wz--nc", the clustering flag is on. To set the clustering flag on, use this command:

vgchange -cy <volume group name>

Once I set the clustering flag on for my volume group, I was unable to recreate the failure using the same steps as before. Please verify this with the customer and update bugzilla with the results.
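For example, on a vg that was created before clustered locking was enabled (output shown is illustrative; column formatting varies with the lvm2 version):

# vgs
  VG         #PV #LV #SN Attr   VSize  VFree
  sfwMPtest    1   3   0 wz--n- 12.34G 344.00M    <- no "c": clustering off

# vgchange -cy sfwMPtest
  Volume group "sfwMPtest" successfully changed

# vgs
  VG         #PV #LV #SN Attr   VSize  VFree
  sfwMPtest    1   3   0 wz--nc 12.34G 344.00M    <- trailing "c": clustering on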
Interesting:

[root@rh4cluster2 ~]# vgs
  VG         #PV #LV #SN Attr   VSize  VFree
  sfwMPtest    1   3   0 wz--n- 12.34G 344.00M

I'm not sure how the vg could have been created before lvm.conf was changed, as this is a test system that is constantly used for this kind of thing. However, I just created a new vg and the cluster flag was set on it, so this is clearly what happened with the original vg.

Thanks,
Sam