Bug 1566326
Summary: | PV not getting mounted even though only 1 out of 3 bricks in replica 3 volume is offline | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Shekhar Berry <shberry>
Component: | rhgs-server-container | Assignee: | Saravanakumar <sarumuga>
Status: | CLOSED WORKSFORME | QA Contact: | Prasanth <pprakash>
Severity: | medium | Docs Contact: |
Priority: | unspecified | |
Version: | rhgs-3.0 | CC: | aos-bugs, aos-storage-staff, ekuric, madam, mpillai, psuriset, rcyriac, rgeorge, rhs-bugs, rsussman, rtalur, sankarshan, shberry
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2018-07-10 18:47:04 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
Shekhar Berry
2018-04-12 05:06:19 UTC
Gluster vol status output:

Status of volume: vol_e5a06ef5e3ded904f341092d22589a3f
Gluster process                                                                                  TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.16.29.212:/var/lib/heketi/mounts/vg_bfc15fe9069ee29bc752b293fdaec6eb/brick_69b77b5800b28f9e6bf4539f0a6ac70a/brick  49153  0  Y  469
Brick 10.16.29.216:/var/lib/heketi/mounts/vg_eb93eb5db356e087f0db1deba22918cc/brick_955d9313c2a653ac009838af375edab6/brick  49153  0  Y  465
Brick 10.16.29.214:/var/lib/heketi/mounts/vg_f1704c4e203e3d0fc86bbc95ee330cc6/brick_6a2541e6138d7a54622f1d12cf1dc1a9/brick  N/A    N/A  N  N/A
Self-heal Daemon on localhost                                                                    N/A  N/A  Y  408
Self-heal Daemon on perf148f.perf.lab.eng.bos.redhat.com                                         N/A  N/A  Y  451
Self-heal Daemon on 10.16.29.214                                                                 N/A  N/A  Y  449

Task Status of Volume vol_e5a06ef5e3ded904f341092d22589a3f
------------------------------------------------------------------------------
There are no active volume tasks

Referring to the logs, I can see the following messages:

connection to 10.16.29.212:24007 failed (Connection refused); disconnecting socket
connection to 10.16.29.214:24007 failed (Connection refused); disconnecting socket

So port 24007 needs to be open in the firewall, and it does not appear to be.

How was the CNS deployment carried out?

I suspect an issue with the firewall setup during CNS deployment.

(In reply to Saravanakumar from comment #6)
> Referring to the logs, I can see the following messages:
> connection to 10.16.29.212:24007 failed (Connection refused); disconnecting socket
> connection to 10.16.29.214:24007 failed (Connection refused); disconnecting socket
>
> So port 24007 needs to be open in the firewall, and it does not appear to be.
>
> How was the CNS deployment carried out?
>
> I suspect an issue with the firewall setup during CNS deployment.

The CNS deployment was done through the ansible deployment.

I have verified I/O using pgbench after bringing down a node (and the corresponding gluster pod). I do not observe any issue with the gluster volumes and can carry out I/O.
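To narrow down the firewall suspicion above, a minimal reachability check plus the corresponding firewalld rules could look like the sketch below. This is only a sketch: it assumes firewalld is the firewall in use on the gluster nodes, the IPs are the ones from the status output above, and the brick port range should be adjusted to whatever the deployment actually uses.

```bash
# From the node that fails to mount the PV, verify glusterd's
# management port (24007) is reachable on each gluster node.
for ip in 10.16.29.212 10.16.29.214 10.16.29.216; do
    if timeout 3 bash -c "cat < /dev/null > /dev/tcp/$ip/24007"; then
        echo "$ip:24007 reachable"
    else
        echo "$ip:24007 NOT reachable"
    fi
done

# If the connections are refused and firewalld is active on the
# gluster nodes, open the glusterd management ports and the brick
# port range, then reload the rules (run on each gluster node).
firewall-cmd --permanent --add-port=24007-24008/tcp
firewall-cmd --permanent --add-port=49152-49664/tcp   # brick ports; adjust to your deployment
firewall-cmd --reload
```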
Complete logs below:

==========================================

# oc version
oc v3.9.0-0.53.0
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://dhcp43-231.lab.eng.blr.redhat.com:8443
openshift v3.9.0-0.53.0
kubernetes v1.9.1+a0ce1bc657

==========================================

Gluster node and corresponding pod down:

[root@dhcp43-231 ~]# oc get nodes
NAME                                STATUS     ROLES     AGE   VERSION
dhcp41-180.lab.eng.blr.redhat.com   Ready      <none>    92d   v1.9.1+a0ce1bc657
dhcp42-122.lab.eng.blr.redhat.com   NotReady   <none>    92d   v1.9.1+a0ce1bc657
dhcp43-135.lab.eng.blr.redhat.com   Ready      <none>    92d   v1.9.1+a0ce1bc657
dhcp43-195.lab.eng.blr.redhat.com   Ready      <none>    92d   v1.9.1+a0ce1bc657
dhcp43-231.lab.eng.blr.redhat.com   Ready      master    92d   v1.9.1+a0ce1bc657

[root@dhcp43-231 ~]# oc get pods
NAME                                          READY   STATUS    RESTARTS   AGE
glusterblock-storage-provisioner-dc-1-j4bwp   1/1     Running   1          89d
glusterfs-storage-bzbqk                       1/1     Running   1          89d
glusterfs-storage-cqh26                       1/1     Unknown   1          89d
glusterfs-storage-m9jjh                       1/1     Running   1          89d
heketi-storage-1-m9tdl                        1/1     Running   1          89d
postgresql-1-ppkvc                            1/1     Running   1          2h

==========================================

Gluster volume status:

[root@dhcp43-231 ~]# oc rsh po/glusterfs-storage-bzbqk
sh-4.2# gluster vol status
Status of volume: heketidbstorage
Gluster process                                                                                  TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.41.180:/var/lib/heketi/mounts/vg_20815059c81012a55f4c4fa14725695f/brick_be8168fc1b53eee4a0d6d5c6ed26a098/brick  49152  0  Y  444
Brick 10.70.43.135:/var/lib/heketi/mounts/vg_14e52f050c99f33a77834d55e248b8b7/brick_5be313c9d73d824819733dc3b7682bcf/brick  49152  0  Y  433
Self-heal Daemon on localhost                                                                    N/A  N/A  Y  14217
Self-heal Daemon on 10.70.43.135                                                                 N/A  N/A  Y  14208

Task Status of Volume heketidbstorage
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: vol_2aaf562cfd128cc6a174022f18b683b0
Gluster process                                                                                  TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.43.135:/var/lib/heketi/mounts/vg_c15ed928d624beddf3578195805cbdb5/brick_9771e962d1b1858eb7bfe8741c38358f/brick  49152  0  Y  433
Brick 10.70.41.180:/var/lib/heketi/mounts/vg_3aa9dfed04dfbc76fba92a6256a0d294/brick_f1ed81a1533f0e92e25d94006ec8a8ad/brick  49152  0  Y  444
Self-heal Daemon on localhost                                                                    N/A  N/A  Y  14217
Self-heal Daemon on 10.70.43.135                                                                 N/A  N/A  Y  14208

Task Status of Volume vol_2aaf562cfd128cc6a174022f18b683b0
------------------------------------------------------------------------------
There are no active volume tasks

==========================================

Carrying out I/O on the gluster volume:

sh-4.2$ pgbench -i -s 10
creating tables...
100000 of 1000000 tuples (10%) done (elapsed 0.73 s, remaining 6.57 s)
200000 of 1000000 tuples (20%) done (elapsed 1.65 s, remaining 6.59 s)
300000 of 1000000 tuples (30%) done (elapsed 2.69 s, remaining 6.27 s)
400000 of 1000000 tuples (40%) done (elapsed 3.69 s, remaining 5.54 s)
500000 of 1000000 tuples (50%) done (elapsed 4.69 s, remaining 4.69 s)
600000 of 1000000 tuples (60%) done (elapsed 5.70 s, remaining 3.80 s)
700000 of 1000000 tuples (70%) done (elapsed 6.73 s, remaining 2.88 s)
800000 of 1000000 tuples (80%) done (elapsed 7.75 s, remaining 1.94 s)
900000 of 1000000 tuples (90%) done (elapsed 8.85 s, remaining 0.98 s)
1000000 of 1000000 tuples (100%) done (elapsed 10.09 s, remaining 0.00 s)
vacuum...
set primary keys...
done.

sh-4.2$ df -h
Filesystem                                                                                         Size   Used  Avail  Use%  Mounted on
/dev/mapper/docker-253:3-389005-674752ba6e44c9317ccddba442c488971f53263e3773f970848aa4731b0eab64   10G    349M  9.7G   4%    /
tmpfs                                                                                              1.9G   0     1.9G   0%    /dev
tmpfs                                                                                              1.9G   0     1.9G   0%    /sys/fs/cgroup
/dev/vda3                                                                                          15G    3.1G  12G    21%   /etc/hosts
shm                                                                                                64M    4.0K  64M    1%    /dev/shm
10.70.41.180:vol_2aaf562cfd128cc6a174022f18b683b0                                                  1016M  230M  787M   23%   /var/lib/pgsql/data
tmpfs                                                                                              1.9G   16K   1.9G   1%    /run/secrets/kubernetes.io/serviceaccount
tmpfs                                                                                              1.9G   0     1.9G   0%    /proc/scsi

sh-4.2$ pwd
/var/lib/pgsql/data/userdata

sh-4.2$ date
Tue May 29 10:40:24 UTC 2018

sh-4.2$ ls -ltr
total 103
drwx------. 2 1000080000 root  4096 May 29 08:19 pg_commit_ts
drwx------. 2 1000080000 root  4096 May 29 08:19 pg_dynshmem
drwx------. 2 1000080000 root  4096 May 29 08:19 pg_serial
drwx------. 2 1000080000 root  4096 May 29 08:19 pg_snapshots
drwx------. 2 1000080000 root  4096 May 29 08:19 pg_twophase
drwx------. 4 1000080000 root  4096 May 29 08:19 pg_multixact
drwx------. 2 1000080000 root  4096 May 29 08:19 pg_replslot
drwx------. 2 1000080000 root  4096 May 29 08:19 pg_tblspc
drwx------. 4 1000080000 root  4096 May 29 08:19 pg_logical
-rw-------. 1 1000080000 root     4 May 29 08:19 PG_VERSION
-rw-------. 1 1000080000 root    88 May 29 08:19 postgresql.auto.conf
-rw-------. 1 1000080000 root  1636 May 29 08:19 pg_ident.conf
drwx------. 3 1000080000 root  4096 May 29 08:19 pg_xlog
drwx------. 2 1000080000 root  4096 May 29 08:19 pg_clog
drwx------. 2 1000080000 root  4096 May 29 08:19 pg_subtrans
-rw-------. 1 1000080000 root  4666 May 29 08:20 pg_hba.conf
drwx------. 2 1000080000 root  4096 May 29 08:20 pg_log
-rw-------. 1 1000080000 root 22424 May 29 08:20 postgresql.conf
drwx------. 6 1000080000 root  4096 May 29 08:20 base
-rw-------. 1 1000080000 root    89 May 29 08:20 postmaster.pid
drwx------. 2 1000080000 root  4096 May 29 08:20 pg_notify
-rw-------. 1 1000080000 root    46 May 29 08:20 postmaster.opts
drwx------. 2 1000080000 root  4096 May 29 08:20 pg_stat
drwx------. 2 1000080000 root  4096 May 29 10:39 global
drwx------. 2 1000080000 root  4096 May 29 10:40 pg_stat_tmp

==========================================

As mentioned in comment #6, this looks like a setup issue.
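For anyone trying to reproduce the original mount failure outside of OpenShift, one way to separate a glusterd reachability problem from the single offline brick is to attempt the same FUSE mount by hand from the node where the pod was scheduled. This is a rough sketch, not the exact command the volume plugin runs: the volume name and server IPs are taken from the first status output in this bug, the mount point is arbitrary, and the client log path assumes the default glusterfs naming of the mount path.

```bash
# Mount the replica 3 volume directly, listing the other two nodes as
# backup volfile servers so a single offline node/brick should not
# block the mount.
mkdir -p /mnt/pvtest
mount -t glusterfs \
      -o backup-volfile-servers=10.16.29.216:10.16.29.214,log-level=DEBUG \
      10.16.29.212:/vol_e5a06ef5e3ded904f341092d22589a3f /mnt/pvtest

# If the mount fails, the client log should show whether the failure is
# "Connection refused" on port 24007 (glusterd unreachable, i.e. a
# setup/firewall issue) rather than anything related to the offline brick.
tail -n 50 /var/log/glusterfs/mnt-pvtest.log
```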