Red Hat Bugzilla – Bug 503941
After a node is fenced, got messages about unlink ckpt error
Last modified: 2010-04-16 16:27:23 EDT
Description of problem:
Whenever a node is fenced, for whatever reason, I see gfs_controld messages about "unlink ckpt error". See /var/log/messages below:

Jun 3 09:33:24 aramis fenced[6819]: fence "porthos-priv" success
Jun 3 09:33:24 aramis gfs_controld[6831]: unlink ckpt error 12 ctdb
Jun 3 09:33:24 aramis gfs_controld[6831]: unlink ckpt status error 9 ctdb
Jun 3 09:33:24 aramis gfs_controld[6831]: unlink ckpt error 12 arquivodigital
Jun 3 09:33:24 aramis gfs_controld[6831]: unlink ckpt status error 9 arquivodigital
Jun 3 09:33:24 aramis gfs_controld[6831]: unlink ckpt error 12 geral
Jun 3 09:33:24 aramis gfs_controld[6831]: unlink ckpt status error 9 geral
Jun 3 09:33:24 aramis gfs_controld[6831]: unlink ckpt error 12 imagens
Jun 3 09:33:24 aramis gfs_controld[6831]: unlink ckpt status error 9 imagens
Jun 3 09:33:24 aramis gfs_controld[6831]: unlink ckpt error 12 plotagem
Jun 3 09:33:24 aramis gfs_controld[6831]: unlink ckpt status error 9 plotagem
Jun 3 09:33:24 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Trying to acquire journal lock...
Jun 3 09:33:25 aramis gfs_controld[6831]: unlink ckpt error 12 projetoslv
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:imagens.1: jid=0: Trying to acquire journal lock...
Jun 3 09:33:25 aramis gfs_controld[6831]: unlink ckpt status error 9 projetoslv
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:arquivodigital.1: jid=0: Trying to acquire journal lock...
Jun 3 09:33:25 aramis gfs_controld[6831]: unlink ckpt error 12 projetosfechados
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Trying to acquire journal lock...
Jun 3 09:33:25 aramis gfs_controld[6831]: unlink ckpt status error 9 projetosfechados
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Looking at journal...
Jun 3 09:33:25 aramis gfs_controld[6831]: unlink ckpt error 12 scripts
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:imagens.1: jid=0: Looking at journal...
Jun 3 09:33:25 aramis gfs_controld[6831]: unlink ckpt status error 9 scripts
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:plotagem.1: jid=0: Trying to acquire journal lock...
Jun 3 09:33:25 aramis gfs_controld[6831]: unlink ckpt error 12 sharedadm
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:arquivodigital.1: jid=0: Looking at journal...
Jun 3 09:33:25 aramis gfs_controld[6831]: unlink ckpt status error 9 sharedadm
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Looking at journal...
Jun 3 09:33:25 aramis gfs_controld[6831]: unlink ckpt error 12 sharedprod
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:plotagem.1: jid=0: Looking at journal...
Jun 3 09:33:25 aramis gfs_controld[6831]: unlink ckpt status error 9 sharedprod
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:projetoslv.1: jid=0: Trying to acquire journal lock...
Jun 3 09:33:25 aramis gfs_controld[6831]: unlink ckpt error 12 util
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:projetoslv.1: jid=0: Looking at journal...
Jun 3 09:33:25 aramis gfs_controld[6831]: unlink ckpt status error 9 util
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:projetosfechados.1: jid=0: Trying to acquire journal lock...
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Acquiring the transaction lock...
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Replaying journal...
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Replayed 0 of 0 blocks
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Found 0 revoke tags
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Journal replayed in 1s
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:geral.1: jid=0: Done
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:scripts.1: jid=0: Trying to acquire journal lock...
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:scripts.1: jid=0: Looking at journal...
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Trying to acquire journal lock...
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:imagens.1: jid=0: Done
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Acquiring the transaction lock...
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Replaying journal...
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Replayed 1 of 1 blocks
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Found 0 revoke tags
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Journal replayed in 1s
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:ctdb.1: jid=0: Done
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Trying to acquire journal lock...
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:util.1: jid=0: Trying to acquire journal lock...
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:util.1: jid=0: Looking at journal...
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:plotagem.1: jid=0: Done
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:arquivodigital.1: jid=0: Done
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:scripts.1: jid=0: Done
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:projetoslv.1: jid=0: Done
Jun 3 09:33:25 aramis kernel: GFS2: fsid=MUSKETEER:util.1: jid=0: Done
Jun 3 09:33:32 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Looking at journal...
Jun 3 09:33:32 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Looking at journal...
Jun 3 09:33:32 aramis kernel: GFS2: fsid=MUSKETEER:projetosfechados.1: jid=0: Looking at journal...
Jun 3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Acquiring the transaction lock...
Jun 3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Replaying journal...
Jun 3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Replayed 13 of 19 blocks
Jun 3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Found 4 revoke tags
Jun 3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Journal replayed in 1s
Jun 3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedprod.1: jid=0: Done
Jun 3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Acquiring the transaction lock...
Jun 3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Replaying journal...
Jun 3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Replayed 0 of 0 blocks
Jun 3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Found 0 revoke tags
Jun 3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Journal replayed in 1s
Jun 3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:sharedadm.1: jid=0: Done
Jun 3 09:33:33 aramis kernel: GFS2: fsid=MUSKETEER:projetosfechados.1: jid=0: Done

This doesn't seem to be a real problem: after these messages the filesystems mount and, from a user's point of view, work normally.

Version-Release number of selected component (if applicable):
[root@athos ~]# rpm -qa| grep -iE 'gfs2|cman|openais|clust|ipmi'
system-config-cluster-1.0.55-1.0
cluster-snmp-0.12.1-2.el5
gfs2-utils-0.1.53-1.el5_3.3
cman-2.0.98-1.el5_3.1
Cluster_Administration-en-US-5.2-1
lvm2-cluster-2.02.40-7.el5
modcluster-0.12.1-2.el5
cluster-cim-0.12.1-2.el5
openais-0.80.3-22.el5_3.4
OpenIPMI-tools-2.0.6-11.el5
OpenIPMI-libs-2.0.6-11.el5
OpenIPMI-2.0.6-11.el5
[root@athos ~]# uname -a
Linux athos.intranet.prosul 2.6.18-128.1.10.el5 #1 SMP Wed Apr 29 13:53:08 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@athos ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.3 (Tikanga)

Additional info:
I'm using IPMILAN fencing, and GFS2 is serving a Samba+CTDB file server.
Reassigning to Dave Teigland as per our discussion this morning. I'm also changing the product to RHEL5, since this is clearly not RHEL4.
Adding Steve Dake to the cc list in case this is an openais issue.
We've always seen ckpt unlink errors and have never known quite why they appear; they are generally not a problem. In this case they seem likely to be a result of the node failure. The critical parts of gfs_controld's checkpoint handling are creating and reading the checkpoints.
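For anyone triaging similar logs, here is a minimal sketch that tallies the gfs_controld "unlink ckpt" messages and attaches a symbolic name to each code. It assumes the logged numbers are SaAisErrorT return values from the openais checkpoint API, in which case 12 would be SA_AIS_ERR_NOT_EXIST and 9 would be SA_AIS_ERR_BAD_HANDLE; that mapping is not confirmed anywhere in this report. The helper names (`summarize`, `describe`) are made up for illustration.

```python
import re
from collections import Counter

# Assumption: gfs_controld logs SaAisErrorT values (SA Forum AIS numbering).
# Only the two codes that appear in this report are listed here.
SA_AIS_ERRORS = {
    9: "SA_AIS_ERR_BAD_HANDLE",
    12: "SA_AIS_ERR_NOT_EXIST",
}

# Matches "unlink ckpt error N name" and "unlink ckpt status error N name".
CKPT_RE = re.compile(
    r"gfs_controld\[\d+\]: unlink ckpt (status )?error (\d+) (\S+)"
)


def summarize(lines):
    """Count unlink ckpt messages, keyed by (kind, error code)."""
    counts = Counter()
    for line in lines:
        match = CKPT_RE.search(line)
        if match is None:
            continue  # fenced, kernel and other messages are ignored
        kind = "status" if match.group(1) else "unlink"
        counts[(kind, int(match.group(2)))] += 1
    return counts


def describe(code):
    """Return the assumed symbolic name for a logged error code."""
    return SA_AIS_ERRORS.get(code, "unknown (%d)" % code)


if __name__ == "__main__":
    sample = [
        'Jun 3 09:33:24 aramis fenced[6819]: fence "porthos-priv" success',
        "Jun 3 09:33:24 aramis gfs_controld[6831]: unlink ckpt error 12 ctdb",
        "Jun 3 09:33:24 aramis gfs_controld[6831]: unlink ckpt status error 9 ctdb",
    ]
    for (kind, code), n in sorted(summarize(sample).items()):
        print(kind, code, describe(code), n)
```

If the assumption holds, an unlink that fails because the checkpoint no longer exists would be consistent with the errors being a side effect of the node failure rather than a fault in its own right.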
Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.