1327701 – Documentation: Controller replacement guide, Delete the failed node from MongoDB section: need to add a note about finding the IP to connect to.


Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	documentation
Sub Component:
Version:	unspecified
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	ga
Target Release:	8.0 (Liberty)
Assignee:	Dan Macpherson
QA Contact:	Alexander Chuzhoy
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2016-04-15 16:11 UTC by Alexander Chuzhoy
Modified:	2016-06-16 04:41 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2016-06-16 04:41:08 UTC
Target Upstream Version:
Embargoed:


Description Alexander Chuzhoy 2016-04-15 16:11:10 UTC

Documentation: Controller replacement guide, Delete the failed node from MongoDB section: need to add a note about finding the IP to connect to.


Currently it says:
 Delete the failed node from MongoDB. First, connect to MongoDB on any of remaining nodes:

[heat-admin@overcloud-controller-0 ~]$ mongo --host 192.168.0.23
MongoDB shell version: 2.6.9
connecting to: 192.168.0.23:27017/test
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
	http://docs.mongodb.org/
Questions? Try the support group
	http://groups.google.com/group/mongodb-user
tripleo:SECONDARY>



So the first thing a user would try is to connect to the IP shown in the output of 'nova list', which would fail.

On my setup, the IP is on the InternalApi network. But it may be best to look for the IP on which port 27017 is listening.
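
A minimal sketch of how that lookup could be done on a remaining controller, assuming the usual `netstat -tulnp` column layout and that mongod uses the default port 27017. The function name `mongo_listen_addr` is hypothetical, made up for illustration:

```shell
# Hypothetical helper: reads `netstat -tulnp` output on stdin and prints
# the local address(es) with a TCP listener on port 27017.
mongo_listen_addr() {
    awk '$1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ /:27017$/ {
        sub(/:27017$/, "", $4)   # strip the port, keep the address
        print $4
    }'
}

# Intended use on a remaining controller node:
#   sudo netstat -tulnp | mongo_listen_addr
```

The printed address would then be the value to pass to `mongo --host`.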

Comment 3 Alexander Chuzhoy 2016-05-07 04:01:57 UTC

Two things in "Delete the failed node from MongoDB" section:
1) The IPs (shown in the output and used for connection) in the guide differ:
sudo netstat -tulnp | grep 27017
tcp        0      0 192.168.201.47:27017    0.0.0.0:*               LISTEN      13415/mongod
[heat-admin@overcloud-controller-0 ~]$ mongo --host 192.168.0.47


2) I was unable to delete the node when connected to a secondary:


             {
                        "_id" : 1,
                        "name" : "10.19.94.13:27017",
                        "health" : 0,
                        "state" : 8,
                        "stateStr" : "(not reachable/healthy)",
                        "uptime" : 0,
                        "optime" : Timestamp(1462580916, 5),
                        "optimeDate" : ISODate("2016-05-07T00:28:36Z"),
                        "lastHeartbeat" : ISODate("2016-05-07T03:55:16Z"),
                        "lastHeartbeatRecv" : ISODate("2016-05-07T00:33:26Z"),
                        "pingMs" : 0,
                        "syncingTo" : "10.19.94.16:27017"
                },
                {
                        "_id" : 2,
                        "name" : "10.19.94.14:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 18603,
                        "optime" : Timestamp(1462593277, 6),
                        "optimeDate" : ISODate("2016-05-07T03:54:37Z"),
                        "self" : true
                }
        ],
        "ok" : 1
}
tripleo:SECONDARY> rs.remove('10.19.94.13:27017')
{
        "ok" : 0,
        "errmsg" : "replSetReconfig command must be sent to the current replica set primary."
}
tripleo:SECONDARY>



Note:
I was able to delete it (although errors were shown) when I connected to the primary:
                {
                        "_id" : 1,
                        "name" : "10.19.94.13:27017",
                        "health" : 0,
                        "state" : 8,
                        "stateStr" : "(not reachable/healthy)",
                        "uptime" : 0,
                        "optime" : Timestamp(1462580916, 5),
                        "optimeDate" : ISODate("2016-05-07T00:28:36Z"),
                        "lastHeartbeat" : ISODate("2016-05-07T03:57:01Z"),
                        "lastHeartbeatRecv" : ISODate("2016-05-07T00:33:26Z"),
                        "pingMs" : 0,
                        "syncingTo" : "10.19.94.16:27017"
                },
                {
                        "_id" : 2,
                        "name" : "10.19.94.14:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 18687,
                        "optime" : Timestamp(1462593277, 6),
                        "optimeDate" : ISODate("2016-05-07T03:54:37Z"),
                        "lastHeartbeat" : ISODate("2016-05-07T03:57:06Z"),
                        "lastHeartbeatRecv" : ISODate("2016-05-07T03:57:06Z"),
                        "pingMs" : 0,
                        "syncingTo" : "10.19.94.16:27017"
                }
        ],
        "ok" : 1
}
tripleo:PRIMARY> rs.remove('10.19.94.13:27017')
2016-05-07T03:57:19.541+0000 DBClientCursor::init call() failed
2016-05-07T03:57:19.543+0000 Error: error doing query: failed at src/mongo/shell/query.js:81
2016-05-07T03:57:19.545+0000 trying reconnect to 10.19.94.16:27017 (10.19.94.16) failed
2016-05-07T03:57:19.547+0000 reconnect 10.19.94.16:27017 (10.19.94.16) ok

Comment 4 Dan Macpherson 2016-05-09 02:42:37 UTC

(In reply to Alexander Chuzhoy from comment #3)
> Two things in "Delete the failed node from MongoDB" section:
> 1) The IPs (shown in the output and used for connection) in the guide differ:
> sudo netstat -tulnp | grep 27017
> tcp        0      0 192.168.201.47:27017    0.0.0.0:*               LISTEN      13415/mongod
> [heat-admin@overcloud-controller-0 ~]$ mongo --host 192.168.0.47
> 
> 

Added fix for this.

> tripleo:PRIMARY> rs.remove('10.19.94.13:27017')
> 2016-05-07T03:57:19.541+0000 DBClientCursor::init call() failed
> 2016-05-07T03:57:19.543+0000 Error: error doing query: failed at src/mongo/shell/query.js:81
> 2016-05-07T03:57:19.545+0000 trying reconnect to 10.19.94.16:27017 (10.19.94.16) failed
> 2016-05-07T03:57:19.547+0000 reconnect 10.19.94.16:27017 (10.19.94.16) ok

This error is normal and expected: the shell is briefly disconnected while the replica set is reconfigured, and then it reconnects. It might be a good idea to add a note about this.
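
Because the shell reports an error even when the removal succeeds, a follow-up check could confirm the result. This is a rough sketch, assuming `rs.status()` output is piped in with members printed as `"name" : "host:port"`, the format shown in comment 3. The helper name `member_listed` is hypothetical:

```shell
# Hypothetical helper: exits 0 if the given host:port still appears as a
# member name in rs.status() output read from stdin, non-zero otherwise.
member_listed() {
    grep -qF "\"name\" : \"$1\""
}

# Intended use after rs.remove(), with <ip> being a reachable member:
#   mongo --host <ip> --quiet --eval 'printjson(rs.status())' \
#       | member_listed '10.19.94.13:27017'
# Exit status 0 means the member is still listed.
```

Note that the exit status reflects only the search, so a failed connection would also look like "not listed".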

As for the PRIMARY vs SECONDARY issue, I'll add a note for that too. Something along the lines of:

===IMPORTANT===
You must run the command against the PRIMARY member of the replica set. If you see the following message:

"replSetReconfig command must be sent to the current replica set primary."

Log back in to MongoDB on the node designated as PRIMARY.
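
To find which node is PRIMARY without reading the whole status listing, something like the following could work. It is only a sketch: it assumes `rs.status()` prints each member's `name` line before its `stateStr` line, as in the output in comment 3, and `find_primary` is a hypothetical helper name.

```shell
# Hypothetical helper: reads rs.status() output on stdin and prints the
# "name" (host:port) of every member whose stateStr is PRIMARY.
find_primary() {
    awk -F'"' '
        $2 == "name"                        { name = $4 }   # remember the latest member name
        $2 == "stateStr" && $4 == "PRIMARY" { print name }  # report it if that member is PRIMARY
    '
}

# Intended use, with <ip> being any reachable member:
#   mongo --host <ip> --quiet --eval 'printjson(rs.status())' | find_primary
```

The host part of the result is the address to use with `mongo --host` before running `rs.remove()`.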

Comment 6 Alexander Chuzhoy 2016-05-12 13:11:54 UTC

Verified.
This section of the doc looks good.

Comment 7 Dan Macpherson 2016-06-16 04:41:08 UTC

Changes now live on the customer portal.
