Bug 1686611 - Heketi Pod fails to deploy after storage node restart
Summary: Heketi Pod fails to deploy after storage node restart
Keywords:
Status: CLOSED DUPLICATE of bug 1740884
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: heketi
Version: ocs-3.11
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: John Mulligan
QA Contact: Prasanth
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-03-07 20:17 UTC by Bledi Agolli
Modified: 2019-09-23 18:15 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-23 18:15:15 UTC
Embargoed:


Attachments
heketi.db files from 3 nodes (12.48 KB, application/zip)
2019-03-07 20:17 UTC, Bledi Agolli

Description Bledi Agolli 2019-03-07 20:17:37 UTC
Created attachment 1541969
heketi.db files from 3 nodes

Description of problem:
The heketi-storage pod fails to deploy after two of the four app-storage nodes were restarted. The nodes were rebooted without enough time between reboots for the Gluster volumes to become ready. Further investigation suggests that the heketi database (heketi.db) is corrupted.

Version-Release number of selected component (if applicable):
v3.11

How reproducible:
Error reproduced every time.

Steps to Reproduce:
1. Redeploy the heketi-storage deployment. The heketi-storage pod fails to start.

Actual results:
Pod fails with the following error.
```
Heketi 8.0.0
[heketi] INFO 2019/03/07 20:04:12 Loaded kubernetes executor
ERROR: Unable to start application
```

Expected results:
Pod deploys without error

Additional info:

# Volume Info

sh-4.2# gluster volume info heketidbstorage

Volume Name: heketidbstorage
Type: Replicate
Volume ID: 8c931c3d-9032-4b42-a19d-5f0179c70743
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 128.160.65.213:/var/lib/heketi/mounts/vg_c15e6a953b5e3d000df2af594afd571e/brick_44ac83de22ea3441b549973120fc2c6c/brick
Brick2: 128.160.65.216:/var/lib/heketi/mounts/vg_02d7d1e8c058d7ebab61d63beb56e44b/brick_fbe52cb58fdf0ed2de53f638c77a3fc6/brick
Brick3: 128.160.65.215:/var/lib/heketi/mounts/vg_ecf17f15c801c0ef8ce32086f0445551/brick_fdbb2bcdf393995f54822ac154581a8d/brick
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
server.tcp-user-timeout: 42
cluster.brick-multiplex: on

# Crash Trace
sh-4.2# ./heketi db export --dbfile=heketi.db --jsonfile=heketi.json
panic: invalid page type: 12: 10

goroutine 1 [running]:
github.com/boltdb/bolt.(*Cursor).search(0xc42066b2c8, 0xc42066b360, 0x6, 0x20, 0xc)
        /builddir/build/BUILD/heketi-8.0.0/src/github.com/boltdb/bolt/cursor.go:256 +0x40c
github.com/boltdb/bolt.(*Cursor).seek(0xc42066b2c8, 0xc42066b360, 0x6, 0x20, 0x0, 0x0, 0x9, 0xbf, 0x195554f, 0x2, ...)
        /builddir/build/BUILD/heketi-8.0.0/src/github.com/boltdb/bolt/cursor.go:159 +0xb1
github.com/boltdb/bolt.(*Bucket).Bucket(0xc4204442b8, 0xc42066b360, 0x6, 0x20, 0xc42066b360)
        /builddir/build/BUILD/heketi-8.0.0/src/github.com/boltdb/bolt/bucket.go:112 +0xfc
github.com/boltdb/bolt.(*Tx).Bucket(0xc4204442a0, 0xc42066b360, 0x6, 0x20, 0x6)
        /builddir/build/BUILD/heketi-8.0.0/src/github.com/boltdb/bolt/tx.go:101 +0x4f
github.com/heketi/heketi/apps/glusterfs.EntryKeys(0xc4204442a0, 0x18efa13, 0x6, 0x110, 0xc420194000, 0xc42066b470)
        /builddir/build/BUILD/heketi-8.0.0/src/github.com/heketi/heketi/apps/glusterfs/dbentry.go:57 +0xbb
github.com/heketi/heketi/apps/glusterfs.VolumeList(0xc4204442a0, 0x18f7c82, 0xd, 0x0, 0x0, 0x0)
        /builddir/build/BUILD/heketi-8.0.0/src/github.com/heketi/heketi/apps/glusterfs/volume_entry.go:73 +0x44
github.com/heketi/heketi/apps/glusterfs.dbDumpInternal.func1(0xc4204442a0, 0x19697a0, 0xc4204442a0)
        /builddir/build/BUILD/heketi-8.0.0/src/github.com/heketi/heketi/apps/glusterfs/db_operations.go:39 +0xe3
github.com/boltdb/bolt.(*DB).View(0xc4204d81e0, 0xc420010be0, 0x0, 0x0)
        /builddir/build/BUILD/heketi-8.0.0/src/github.com/boltdb/bolt/db.go:626 +0x9a
github.com/heketi/heketi/apps/glusterfs.dbDumpInternal(0x24d3740, 0xc4204d81e0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /builddir/build/BUILD/heketi-8.0.0/src/github.com/heketi/heketi/apps/glusterfs/db_operations.go:34 +0x2c9
github.com/heketi/heketi/apps/glusterfs.DbDump(0x7ffc38aaa856, 0xb, 0x7ffc38aaa841, 0x9, 0x0, 0x0)
        /builddir/build/BUILD/heketi-8.0.0/src/github.com/heketi/heketi/apps/glusterfs/db_operations.go:232 +0x18c
main.glob..func3(0x24ba740, 0xc4201227e0, 0x0, 0x2)
        /builddir/build/BUILD/heketi-8.0.0/main.go:126 +0x92
github.com/spf13/cobra.(*Command).execute(0x24ba740, 0xc4201227c0, 0x2, 0x2, 0x24ba740, 0xc4201227c0)
        /builddir/build/BUILD/heketi-8.0.0/src/github.com/spf13/cobra/command.go:651 +0x23d
github.com/spf13/cobra.(*Command).ExecuteC(0x24b9ec0, 0x160000000024, 0x98, 0x98)
        /builddir/build/BUILD/heketi-8.0.0/src/github.com/spf13/cobra/command.go:726 +0x2fe
github.com/spf13/cobra.(*Command).Execute(0x24b9ec0, 0x18dff40, 0x254afc0)
        /builddir/build/BUILD/heketi-8.0.0/src/github.com/spf13/cobra/command.go:685 +0x2b
main.main()
        /builddir/build/BUILD/heketi-8.0.0/main.go:447 +0x42

heketi.db files are attached.
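
A note on reading the panic line in the trace above: as far as I recall from the boltdb/bolt source, `Cursor.search` formats this message as the page id in decimal followed by the page flags in hex, so `12: 10` would mean page 12 carries flags 0x10. The sketch below decodes the message under that assumption. The flag constants are written from memory of bolt's page.go and should be verified against the copy vendored in the heketi build; `parseInvalidPageType` and `describeFlags` are hypothetical helper names, not part of bolt or heketi.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Page flag values as assumed from boltdb/bolt's page.go; verify before relying on them.
const (
	branchPageFlag   = 0x01
	leafPageFlag     = 0x02
	metaPageFlag     = 0x04
	freelistPageFlag = 0x10
)

// describeFlags names the first known page type bit set in flags.
func describeFlags(flags uint16) string {
	switch {
	case flags&branchPageFlag != 0:
		return "branch"
	case flags&leafPageFlag != 0:
		return "leaf"
	case flags&metaPageFlag != 0:
		return "meta"
	case flags&freelistPageFlag != 0:
		return "freelist"
	default:
		return "unknown"
	}
}

// parseInvalidPageType extracts the page id (decimal) and flags (hex) from a
// message of the form "panic: invalid page type: <id>: <flags>".
func parseInvalidPageType(msg string) (uint64, uint16, error) {
	msg = strings.TrimSpace(strings.TrimPrefix(strings.TrimSpace(msg), "panic:"))
	const prefix = "invalid page type:"
	if !strings.HasPrefix(msg, prefix) {
		return 0, 0, fmt.Errorf("not an invalid-page-type message: %q", msg)
	}
	parts := strings.Split(strings.TrimPrefix(msg, prefix), ":")
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("unexpected field count in %q", msg)
	}
	id, err := strconv.ParseUint(strings.TrimSpace(parts[0]), 10, 64)
	if err != nil {
		return 0, 0, err
	}
	flags, err := strconv.ParseUint(strings.TrimSpace(parts[1]), 16, 16)
	if err != nil {
		return 0, 0, err
	}
	return id, uint16(flags), nil
}

func main() {
	id, flags, err := parseInvalidPageType("panic: invalid page type: 12: 10")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("page %d has flags 0x%x (%s)\n", id, flags, describeFlags(flags))
}
```

If those assumptions hold, the cursor followed a bucket reference to a page marked as a freelist page where it expected a branch or leaf page, which would fit the reporter's suspicion that the database file is corrupted.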

Comment 4 Yaniv Kaul 2019-04-14 14:29:01 UTC
Status?

Comment 6 Levy Sant'Anna 2019-06-24 19:10:46 UTC
Any status update?

Comment 7 Levy Sant'Anna 2019-06-26 11:34:51 UTC
Status?

