Description of problem:
The heketi-cli command returns an error after CNS deployment, even though the heketi endpoint itself responds:

# curl http://heketi-storage-project.cloudapps.mystorage.com/hello
Hello from Heketi

Version-Release number of selected component (if applicable):
# oc version
oc v3.5.0.55
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://dhcp46-145.lab.eng.blr.redhat.com:8443
openshift v3.5.0.55
kubernetes v1.5.2+43a9be4

heketi-client-4.0.0-3.el7rhgs.x86_64
cns-deploy-4.0.0-6.el7rhgs.x86_64

How reproducible:
Seen on two freshly installed setups

Steps to Reproduce:
1. Deploy CNS using the latest build
2. Export the heketi server address (HEKETI_CLI_SERVER)
3. Run any heketi-cli command

Actual results:
heketi-cli returns an error:

[root@dhcp46-145 ~]# heketi-cli topology info
Error:
[root@dhcp46-145 ~]# heketi-cli cluster list
Error:
[root@dhcp46-145 ~]# heketi-cli volume list
Error:

The heketi server panics while serving the request:

#########
[negroni] Started GET /volumes
[negroni] PANIC: invalid page type: 0: 0
goroutine 1726 [running]:
github.com/urfave/negroni.(*Recovery).ServeHTTP.func1(0x7f05f059b730, 0xc82037c088, 0xc8203b8b70)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/urfave/negroni/recovery.go:41 +0x19c
panic(0x15bd960, 0xc82047c710)
	/usr/lib/golang/src/runtime/panic.go:443 +0x4e9
github.com/boltdb/bolt.(*Cursor).search(0xc82039b2d8, 0xc82039b360, 0x6, 0x6, 0xa)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/boltdb/bolt/cursor.go:256 +0x215
github.com/boltdb/bolt.(*Cursor).seek(0xc82039b2d8, 0xc82039b360, 0x6, 0x6, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/boltdb/bolt/cursor.go:159 +0xda
github.com/boltdb/bolt.(*Bucket).Bucket(0xc82011ca98, 0xc82039b360, 0x6, 0x6, 0x6)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/boltdb/bolt/bucket.go:112 +0x180
github.com/boltdb/bolt.(*Tx).Bucket(0xc82011ca80, 0xc82039b360, 0x6, 0x6, 0x6)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/boltdb/bolt/tx.go:101 +0x4b
github.com/heketi/heketi/apps/glusterfs.EntryKeys(0xc82011ca80, 0x1bb4b40, 0x6, 0x0, 0x0, 0x0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/heketi/heketi/apps/glusterfs/dbentry.go:57 +0xe5
github.com/heketi/heketi/apps/glusterfs.VolumeList(0xc82011ca80, 0x0, 0x0, 0x0, 0x0, 0x0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/heketi/heketi/apps/glusterfs/volume_entry.go:49 +0x57
github.com/heketi/heketi/apps/glusterfs.(*App).VolumeList.func1(0xc82011ca80, 0x0, 0x0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/heketi/heketi/apps/glusterfs/app_volume.go:174 +0x46
github.com/boltdb/bolt.(*DB).View(0xc8203e3e00, 0xc82039b540, 0x0, 0x0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/boltdb/bolt/db.go:626 +0xb9
github.com/heketi/heketi/apps/glusterfs.(*App).VolumeList(0xc820013040, 0x7f05f059b730, 0xc82037c088, 0xc82011c9a0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/heketi/heketi/apps/glusterfs/app_volume.go:171 +0x7a
github.com/heketi/heketi/apps/glusterfs.(*App).VolumeList-fm(0x7f05f059b730, 0xc82037c088, 0xc82011c9a0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/heketi/heketi/apps/glusterfs/app.go:362 +0x3e
net/http.HandlerFunc.ServeHTTP(0xc82046f8a0, 0x7f05f059b730, 0xc82037c088, 0xc82011c9a0)
	/usr/lib/golang/src/net/http/server.go:1618 +0x3a
github.com/gorilla/mux.(*Router).ServeHTTP(0xc82028ca50, 0x7f05f059b730, 0xc82037c088, 0xc82011c9a0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/gorilla/mux/mux.go:98 +0x29e
github.com/urfave/negroni.Wrap.func1(0x7f05f059b730, 0xc82037c088, 0xc82011c9a0, 0xc820339180)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/urfave/negroni/negroni.go:41 +0x50
github.com/urfave/negroni.HandlerFunc.ServeHTTP(0xc8201841a0, 0x7f05f059b730, 0xc82037c088, 0xc82011c9a0, 0xc820339180)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/urfave/negroni/negroni.go:24 +0x44
github.com/urfave/negroni.middleware.ServeHTTP(0x7f05f05e0558, 0xc8201841a0, 0xc820184220, 0x7f05f059b730, 0xc82037c088, 0xc82011c9a0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/urfave/negroni/negroni.go:33 +0xaa
github.com/urfave/negroni.(middleware).ServeHTTP-fm(0x7f05f059b730, 0xc82037c088, 0xc82011c9a0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/urfave/negroni/negroni.go:33 +0x53
github.com/urfave/negroni.(*Logger).ServeHTTP(0xc82047d0b0, 0x7f05f059b730, 0xc82037c088, 0xc82011c9a0, 0xc820338ce0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/urfave/negroni/logger.go:31 +0x1f5
github.com/urfave/negroni.middleware.ServeHTTP(0x7f05f05e0530, 0xc82047d0b0, 0xc820184200, 0x7f05f059b730, 0xc82037c088, 0xc82011c9a0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/urfave/negroni/negroni.go:33 +0xaa
github.com/urfave/negroni.(middleware).ServeHTTP-fm(0x7f05f059b730, 0xc82037c088, 0xc82011c9a0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/urfave/negroni/negroni.go:33 +0x53
github.com/urfave/negroni.(*Recovery).ServeHTTP(0xc8203b8b70, 0x7f05f059b730, 0xc82037c088, 0xc82011c9a0, 0xc820338c20)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/urfave/negroni/recovery.go:64 +0x75
github.com/urfave/negroni.middleware.ServeHTTP(0x7f05f05e0508, 0xc8203b8b70, 0xc8201841e0, 0x7f05f059b730, 0xc82037c088, 0xc82011c9a0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/urfave/negroni/negroni.go:33 +0xaa
github.com/urfave/negroni.(*Negroni).ServeHTTP(0xc8203b8ba0, 0x7f05f05e2110, 0xc8203f29c0, 0xc82011c9a0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/urfave/negroni/negroni.go:73 +0x1a6
github.com/gorilla/mux.(*Router).ServeHTTP(0xc82028c9b0, 0x7f05f05e2110, 0xc8203f29c0, 0xc82011c9a0)
	/builddir/build/BUILD/heketi-4.0.0/src/github.com/gorilla/mux/mux.go:98 +0x29e
net/http.serverHandler.ServeHTTP(0xc82048b200, 0x7f05f05e2110, 0xc8203f29c0, 0xc82011c9a0)
	/usr/lib/golang/src/net/http/server.go:2081 +0x19e
net/http.(*conn).serve(0xc820080e80)
	/usr/lib/golang/src/net/http/server.go:1472 +0xf2e
created by net/http.(*Server).Serve
	/usr/lib/golang/src/net/http/server.go:2137 +0x44e
#########

Expected results:
No error should be seen

Additional info:
I'll attach the logs and db soon.
Thanks for sharing the machine. The DB was clean; somehow it was not loaded properly. I will try to reproduce the issue from my end. We had a short debugging session in our scrum but could not find the root cause; however, restarting the heketi pod helped. I will try again so that we can either be sure it was something spurious that can be neglected, or handle it from the heketi volmanager script if we can find the exact case in which this issue is hit. Trying more setups; will update if hit again.
I have built a volmanager scratch build:
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-volmanager-rhel7:rhgs-3.2-rhel-7-docker-candidate-20170323042518

This image will log the presence of the db, and the log can be verified with:

# oc rsh <heketi-pod-name> cat /var/lib/heketi/dbstat.log

1) Please confirm whether the log says the DB is not present.
2) Check whether the db is present at /var/lib/heketi in the heketi pod.
3) If present, check that the size of the DB is above 32K (the empty db size).

If all of the above is confirmed, then we just need a delay.
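The three checks above can be sketched as a small shell function. This is an illustrative helper (the function name `check_heketi_db` and the local-path interface are mine, not part of the scratch build); in the pod you would run it via `oc rsh <heketi-pod-name>` against /var/lib/heketi/heketi.db. The 32768-byte threshold is the empty-db size quoted above.

```shell
# Report whether a heketi db file is missing, still the empty 32K db,
# or present and populated. Hypothetical helper for the checks above.
check_heketi_db() {
    db="$1"
    if [ ! -f "$db" ]; then
        echo "MISSING"                 # check 2: db not present at all
        return 1
    fi
    size=$(wc -c < "$db")
    if [ "$size" -le 32768 ]; then
        echo "EMPTY ($size bytes)"     # check 3: present but no larger than an empty db
        return 2
    fi
    echo "OK ($size bytes)"            # db present and populated
    return 0
}
```

For example, `check_heketi_db /var/lib/heketi/heketi.db` inside the pod would print OK with the size when the db is populated.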
(In reply to Mohamed Ashiq from comment #8)
> I have built a volmanager scratch build :
> brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-volmanager-
> rhel7:rhgs-3.2-rhel-7-docker-candidate-20170323042518
>
> This image will log the presence of the db and the log can be verified with :
>
> # oc rsh <heketi-pod-name> cat /var/lib/heketi/dbstat.log
>
> 1) Please confirm that log says the DB is not present
> 2) check if the db is present in location /var/lib/heketi in the heketi pod.
> 3) if present, size of the DB is above 32K(empty db size).
>
> If all the above is confirmed Then we just need a delay.

Ashiq, I've used the above volmanager scratch build and did a manual deployment to collect the requested details. Please find the results below:

#########
[root@dhcp46-145 templates]# heketi-cli cluster list
Error:
[root@dhcp46-145 ]# oc get pods
NAME                             READY     STATUS    RESTARTS   AGE
glusterfs-44nlc                  1/1       Running   0          4m
glusterfs-jr0f0                  1/1       Running   0          4m
glusterfs-xp20r                  1/1       Running   0          4m
heketi-1-lhqfh                   1/1       Running   0          23s
storage-project-router-1-r7bwz   1/1       Running   1          58m
[root@dhcp46-145 ]# oc rsh heketi-1-lhqfh cat /var/lib/heketi/dbstat.log
  File: '/var/lib/heketi/heketi.db'
  Size: 53248      Blocks: 104        IO Block: 131072 regular file
Device: 59h/89d    Inode: 10471067870456742553  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-03-24 11:57:25.027425828 +0000
Modify: 2017-03-24 11:57:25.032428014 +0000
Change: 2017-03-24 11:57:25.034428889 +0000
 Birth: -
[root@dhcp46-145 ]# oc rsh heketi-1-lhqfh ls -al /var/lib/heketi/heketi.db
-rw-r--r--. 1 root root 53248 Mar 24 11:57 /var/lib/heketi/heketi.db
[root@dhcp46-145 ]# oc rsh heketi-1-lhqfh du -sch /var/lib/heketi/heketi.db
52K	/var/lib/heketi/heketi.db
52K	total
#########

Let me know if you want me to collect and attach any additional details from the setup for your further debugging. If required, I can also share the setup, which is still in the same state, for some time for your debugging.
Tested this with the latest image rhgs-volmanager-docker-3.2.0-4 and I still see this issue. See below:

##############
[root@dhcp46-195 opt]# export HEKETI_CLI_SERVER=http://heketi-storage-project.cloudapps.mystorage.com
[root@dhcp46-195 opt]# heketi-cli cluster list
Clusters:
[root@dhcp46-195 opt]# heketi-cli volume list
[root@dhcp46-195 opt]# date
Tue Mar 28 15:56:18 IST 2017
[root@dhcp46-195 opt]# curl http://heketi-storage-project.cloudapps.mystorage.com/hello
Hello from Heketi[root@dhcp46-195 opt]#
[root@dhcp46-195 opt]#
[root@dhcp46-195 opt]# heketi-cli volume list
Error:
##############

Moving it back to Assigned for further investigation and fix.
Prasanth, please give this scratch build a try:
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-volmanager-rhel7:rhgs-3.2-rhel-7-docker-candidate-20170328150404

This will not let the pod start if the db is not found while the volume is mounted.
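A minimal sketch of the kind of startup guard described above, assuming the check lives in the image's entrypoint script (the function name `guard_heketi_start` and the paths in the usage comment are illustrative, not the actual change in the scratch build):

```shell
# Refuse to start heketi if the db volume is mounted but the db file is
# absent or empty, instead of letting heketi create a fresh empty db and
# later panic with "invalid page type". Illustrative sketch only.
guard_heketi_start() {
    mount="$1"; db="$2"
    # Only enforce the check when the volume is actually mounted;
    # a bare directory means we are running without persistence.
    if mountpoint -q "$mount" && [ ! -s "$db" ]; then
        echo "refusing to start: $db missing or empty on mounted volume $mount" >&2
        return 1
    fi
    return 0
}

# In the pod's entrypoint this would gate the real daemon, e.g.:
#   guard_heketi_start /var/lib/heketi /var/lib/heketi/heketi.db \
#       && exec heketi --config=/etc/heketi/heketi.json
```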
Verified
Patches posted upstream at:
https://github.com/gluster/gluster-kubernetes/pull/241
https://github.com/gluster/gluster-kubernetes/pull/245
Verified as fixed in the latest build: cns-deploy-4.0.0-13.el7rhgs. Marking it as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1112