Bug 503941 - After a node is fenced, got messages about unlink ckpt error
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.3
Hardware: x86_64 Linux
Priority: low
Severity: medium
Assigned To: David Teigland
QA Contact: Cluster QE
Reported: 2009-06-03 09:42 EDT by Flávio do Carmo Júnior
Modified: 2010-04-16 16:27 EDT
Doc Type: Bug Fix
Last Closed: 2010-04-16 16:27:23 EDT
Description Flávio do Carmo Júnior 2009-06-03 09:42:09 EDT
Description of problem:

Whenever a node is fenced, for any reason, I see gfs_controld messages reporting "unlink ckpt error".

See /var/log/messages below:

Jun  3 09:33:24 aramis fenced[6819]: fence "porthos-priv" success
Jun  3 09:33:24 aramis gfs_controld[6831]: unlink ckpt error 12 ctdb
Jun  3 09:33:24 aramis gfs_controld[6831]: unlink ckpt status error 9 ctdb
Jun  3 09:33:24 aramis gfs_controld[6831]: unlink ckpt error 12 arquivodigital
Jun  3 09:33:24 aramis gfs_controld[6831]: unlink ckpt status error 9 arquivodigital
Jun  3 09:33:24 aramis gfs_controld[6831]: unlink ckpt error 12 geral
Jun  3 09:33:24 aramis gfs_controld[6831]: unlink ckpt status error 9 geral
Jun  3 09:33:24 aramis gfs_controld[6831]: unlink ckpt error 12 imagens
Jun  3 09:33:24 aramis gfs_controld[6831]: unlink ckpt status error 9 imagens
Jun  3 09:33:24 aramis gfs_controld[6831]: unlink ckpt error 12 plotagem
Jun  3 09:33:24 aramis gfs_controld[6831]: unlink ckpt status error 9 plotagem
Jun  3 09:33:24 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Trying to acquire journal lock...
Jun  3 09:33:25 aramis gfs_controld[6831]: unlink ckpt error 12 projetoslv
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:imagens.1: jid=0: Trying to acquire journal lock...
Jun  3 09:33:25 aramis gfs_controld[6831]: unlink ckpt status error 9 projetoslv
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:arquivodigital.1: jid=0: Trying to acquire journal lock...
Jun  3 09:33:25 aramis gfs_controld[6831]: unlink ckpt error 12 projetosfechados
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Trying to acquire journal lock...
Jun  3 09:33:25 aramis gfs_controld[6831]: unlink ckpt status error 9 projetosfechados
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Looking at journal...
Jun  3 09:33:25 aramis gfs_controld[6831]: unlink ckpt error 12 scripts
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:imagens.1: jid=0: Looking at journal...
Jun  3 09:33:25 aramis gfs_controld[6831]: unlink ckpt status error 9 scripts
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:plotagem.1: jid=0: Trying to acquire journal lock...
Jun  3 09:33:25 aramis gfs_controld[6831]: unlink ckpt error 12 sharedadm
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:arquivodigital.1: jid=0: Looking at journal...
Jun  3 09:33:25 aramis gfs_controld[6831]: unlink ckpt status error 9 sharedadm
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Looking at journal...
Jun  3 09:33:25 aramis gfs_controld[6831]: unlink ckpt error 12 sharedprod
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:plotagem.1: jid=0: Looking at journal...
Jun  3 09:33:25 aramis gfs_controld[6831]: unlink ckpt status error 9 sharedprod
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:projetoslv.1: jid=0: Trying to acquire journal lock...
Jun  3 09:33:25 aramis gfs_controld[6831]: unlink ckpt error 12 util
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:projetoslv.1: jid=0: Looking at journal...
Jun  3 09:33:25 aramis gfs_controld[6831]: unlink ckpt status error 9 util
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:projetosfechados.1: jid=0: Trying to acquire journal lock...
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Acquiring the transaction lock...
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Replaying journal...
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Replayed 0 of 0 blocks
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Found 0 revoke tags
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Journal replayed in 1s
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Done
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:scripts.1: jid=0: Trying to acquire journal lock...
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:scripts.1: jid=0: Looking at journal...
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Trying to acquire journal lock...
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:imagens.1: jid=0: Done
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Acquiring the transaction lock...
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Replaying journal...
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Replayed 1 of 1 blocks
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Found 0 revoke tags
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Journal replayed in 1s
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Done
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Trying to acquire journal lock...
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:util.1: jid=0: Trying to acquire journal lock...
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:util.1: jid=0: Looking at journal...
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:plotagem.1: jid=0: Done
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:arquivodigital.1: jid=0: Done
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:scripts.1: jid=0: Done
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:projetoslv.1: jid=0: Done
Jun  3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:util.1: jid=0: Done
Jun  3 09:33:32 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Looking at journal...
Jun  3 09:33:32 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Looking at journal...
Jun  3 09:33:32 aramis kernel: GFS2: fsid=MUSKETEER:projetosfechados.1: jid=0: Looking at journal...
Jun  3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Acquiring the transaction lock...
Jun  3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Replaying journal...
Jun  3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Replayed 13 of 19 blocks
Jun  3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Found 4 revoke tags
Jun  3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Journal replayed in 1s
Jun  3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Done
Jun  3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Acquiring the transaction lock...
Jun  3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Replaying journal...
Jun  3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Replayed 0 of 0 blocks
Jun  3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Found 0 revoke tags
Jun  3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Journal replayed in 1s
Jun  3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Done
Jun  3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:projetosfechados.1: jid=0: Done


This doesn't really seem to be a problem: after these messages the filesystems mount and, from a user's point of view, work normally.

Version-Release number of selected component (if applicable):
[root@athos ~]# rpm -qa| grep -iE 'gfs2|cman|openais|clust|ipmi'
system-config-cluster-1.0.55-1.0
cluster-snmp-0.12.1-2.el5
gfs2-utils-0.1.53-1.el5_3.3
cman-2.0.98-1.el5_3.1
Cluster_Administration-en-US-5.2-1
lvm2-cluster-2.02.40-7.el5
modcluster-0.12.1-2.el5
cluster-cim-0.12.1-2.el5
openais-0.80.3-22.el5_3.4
OpenIPMI-tools-2.0.6-11.el5
OpenIPMI-libs-2.0.6-11.el5
OpenIPMI-2.0.6-11.el5
[root@athos ~]# uname -a
Linux athos.intranet.prosul 2.6.18-128.1.10.el5 #1 SMP Wed Apr 29 13:53:08 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@athos ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.3 (Tikanga)

Additional info:
I'm fencing via IPMILAN, and GFS2 is backing a Samba+CTDB file server.
Comment 1 Robert Peterson 2009-06-03 10:29:22 EDT
Reassigning to Dave Teigland as per our discussion this morning.
I'm also changing the product to RHEL5, since this is clearly not
RHEL4.
Comment 2 Robert Peterson 2009-06-03 10:30:09 EDT
Adding Steve Dake to the cc list in case this is an openais issue.
Comment 3 David Teigland 2009-06-03 11:40:16 EDT
We've always seen ckpt unlink errors and have never known quite why they appear; they're generally not a problem.  In this case they seem likely to be a result of the node failure.  The critical parts of gfs_controld's checkpoint handling are creating and reading the checkpoints.
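
For anyone triaging similar logs, here is a minimal sketch that groups the gfs_controld "unlink ckpt" messages by filesystem, so it is easy to confirm that each mounted filesystem logged the same pair of errors after the fence. The function name and the code-to-name table are illustrative and not from this report. In particular, reading 9 and 12 as SaAisErrorT values from the openais AIS headers is an assumption; nothing in this thread confirms what the numbers mean.

```python
import re
from collections import defaultdict

# ASSUMPTION: the logged numbers are SaAisErrorT values from the AIS headers
# shipped with openais. The bug thread does not state what 9 and 12 mean.
ASSUMED_AIS_ERRORS = {
    9: "SA_AIS_ERR_BAD_HANDLE",
    12: "SA_AIS_ERR_NOT_EXIST",
}

# Matches "gfs_controld[6831]: unlink ckpt error 12 ctdb" and
# "gfs_controld[6831]: unlink ckpt status error 9 ctdb".
CKPT_RE = re.compile(
    r"gfs_controld\[\d+\]: unlink ckpt (?P<status>status )?"
    r"error (?P<code>\d+) (?P<fs>\S+)"
)


def summarize_ckpt_errors(lines):
    """Return {filesystem: [(operation, code), ...]} in log order."""
    summary = defaultdict(list)
    for line in lines:
        match = CKPT_RE.search(line)
        if match:
            operation = "status" if match.group("status") else "unlink"
            summary[match.group("fs")].append((operation, int(match.group("code"))))
    return dict(summary)


if __name__ == "__main__":
    sample = [
        "Jun  3 09:33:24 aramis gfs_controld[6831]: unlink ckpt error 12 ctdb",
        "Jun  3 09:33:24 aramis gfs_controld[6831]: unlink ckpt status error 9 ctdb",
        "Jun  3 09:33:24 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Trying to acquire journal lock...",
    ]
    for fs, events in summarize_ckpt_errors(sample).items():
        for operation, code in events:
            name = ASSUMED_AIS_ERRORS.get(code, "unknown")
            print(f"{fs}: {operation} error {code} ({name}, assumed)")
```

To run it against a real log, pass an open file object for /var/log/messages instead of the sample list. Kernel GFS2 journal lines are ignored by the pattern.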
Comment 4 RHEL Product and Program Management 2010-04-16 16:27:23 EDT
Development Management has reviewed and declined this request.  You may appeal
this decision by reopening this request.
