Bug 1021011 - mongo seeds are not reconnecting to new PRIMARY of a replica set
Status: CLOSED CURRENTRELEASE
Product: Pulp
Classification: Community
Component: z_other
Version: 2.2 Beta
Hardware: x86_64 Linux
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 2.3.0
Assigned To: Jay Dobies
QA Contact: pulp-qe-list
Keywords: Triaged
Depends On: 1019909
Blocks:
Reported: 2013-10-18 14:46 EDT by Michael Hrivnak
Modified: 2013-12-09 09:31 EST
CC List: 5 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1019909
Environment:
Last Closed: 2013-12-09 09:31:03 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---


Attachments: None
Description Michael Hrivnak 2013-10-18 14:46:22 EDT
+++ This bug was initially created as a clone of Bug #1019909 +++

Description of problem:
If pulp seeds from a mongodb replica set, and the mongo PRIMARY is re-elected, pulp fails to reconnect to the new PRIMARY.

Version-Release number of selected component (if applicable):
pulp-server-2.2.0-0.20.beta.git.0.d54a854.el6eng.cdn.1.noarch
mongo server buildinfo - 2.4.6 (EPEL)
pymongo-2.1.1-1.el6.x86_64

How reproducible:
very

Steps to Reproduce:
1. have a mongo replica set, where mongodb01.web.stage.our.domain.com is PRIMARY
2. set up pulp to seed from a host in the mongodb replica set
 /etc/pulp/server.conf [database] seeds: mongodb01.web.stage.our.domain.com:27017
3. start the pulp server (and pulp-manage-db) and ensure it's functioning correctly
4. on the mongodb rs, have mongodb01 step down from PRIMARY ( rs.stepDown() )
5. make calls to the pulp server ( `pulp-admin login -u admin -p S3krit` )

Actual results:
   An internal error occurred on the Pulp server. More information can be found in the client log file ~/.pulp/admin.log.
=== START ~/.pulp/admin.log =====
2013-10-16 10:22:45,466 - ERROR - Client-side exception occurred
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/pulp/client/extensions/core.py", line 478, in run
    exit_code = Cli.run(self, args)
  File "/usr/lib/python2.6/site-packages/okaara/cli.py", line 974, in run
    exit_code = command_or_section.execute(self.prompt, remaining_args)
  File "/usr/lib/python2.6/site-packages/pulp/client/extensions/extensions.py", line 224, in execute
    return self.method(*arg_list, **clean_kwargs)
  File "/usr/lib/pulp/admin/extensions/pulp_server_info/pulp_cli.py", line 35, in types
    all_types = self.context.server.server_info.get_types()
  File "/usr/lib/python2.6/site-packages/pulp/bindings/server_info.py", line 33, in get_types
    return self.server.GET(path)
  File "/usr/lib/python2.6/site-packages/pulp/bindings/server.py", line 84, in GET
    return self._request('GET', path, queries)
  File "/usr/lib/python2.6/site-packages/pulp/bindings/server.py", line 142, in _request
    self._handle_exceptions(response_code, response_body)
  File "/usr/lib/python2.6/site-packages/pulp/bindings/server.py", line 183, in _handle_exceptions
    raise code_class_mappings[response_code](response_body)
PermissionsException: RequestException: GET request on /pulp/api/v2/plugins/types/ failed with 401 - Pulp exception occurred: AuthenticationFailed
2013-10-16 10:23:05,446 - ERROR - Exception occurred:
        href:      /pulp/api/v2/actions/login/
        method:    POST
        status:    500
        error:     create_index operation failed on pulp2_database.users: database connection still down after 3 tries
        traceback: None
        data:      {u'args': [u'create_index operation failed on pulp2_database.users: database connection still down after 3 tries']}
=== END ~/.pulp/admin.log =====
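
The "still down after 3 tries" wording in the log suggests that the server retries a failed database operation a fixed number of times before giving up. The following is only a minimal sketch of that general pattern, not Pulp's actual code: `AutoReconnect` here is a local stand-in for `pymongo.errors.AutoReconnect`, and `CollectionFailure` and `retry_on_reconnect` are hypothetical names invented for illustration.

```python
import time


class AutoReconnect(Exception):
    """Local stand-in for pymongo.errors.AutoReconnect (assumption)."""


class CollectionFailure(Exception):
    """Hypothetical error type, modelled on the message in the log."""


def retry_on_reconnect(operation, name, tries=3, delay=0.0):
    """Call operation() up to `tries` times, retrying on AutoReconnect.

    Raises CollectionFailure if every attempt fails.
    """
    for attempt in range(tries):
        try:
            return operation()
        except AutoReconnect:
            # Pause between attempts, but not after the final one.
            if attempt < tries - 1:
                time.sleep(delay)
    raise CollectionFailure(
        "%s operation failed: database connection still down after %d tries"
        % (name, tries)
    )
```

A wrapper like this only helps if the underlying connection can discover the new PRIMARY between attempts. If the connection keeps pointing at the member that stepped down, every retry fails the same way, which matches the behaviour reported here.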

Expected results:
   Successfully logged in. Session certificate will expire at Oct 23 14:24:33 2013 GMT.

Additional info:
If I can get the original PRIMARY node elected back as PRIMARY, then everything on pulp begins working again.

This is enough of an issue to block us from promoting pulp v2 to production.

--- Additional comment from Michael Hrivnak on 2013-10-18 14:45:48 EDT ---

This is likely a regression. Unless it's particularly inconvenient, I think it makes sense to fix this in 2.2 and get it into our 2.2.1 release.
Comment 1 Jeff Ortel 2013-10-29 11:29:33 EDT
build: 2.3.0-0.26.beta
Comment 2 Preethi Thomas 2013-10-31 21:28:55 EDT
I followed the steps from 

https://bugzilla.redhat.com/show_bug.cgi?id=1019909#c3

So the first part of the test gave the following result in the log

PulpCollectionFailure: find_one operation failed on pulp_database.users: database connection still down after 3 tries


Now I configured server.conf


[root@hp-dl120g5-01 ~]# cat /etc/pulp/server.conf |grep seed
# seeds: comma-separated list of hostname:port of database replica seed hosts
seeds: localhost:27018,localhost:27019,localhost:27020
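
For reference, a rough sketch of how a comma-separated `seeds` value like the one above could be split into host/port pairs. This is written for Python 3's `configparser` and is not taken from Pulp's code; `parse_seeds`, `SAMPLE_CONF`, and the default port fallback are assumptions made for illustration.

```python
import configparser

# Sample text mirroring the seeds line shown above.
SAMPLE_CONF = """\
[database]
seeds: localhost:27018,localhost:27019,localhost:27020
"""


def parse_seeds(conf_text, default_port=27017):
    """Return a list of (host, port) tuples from the [database] seeds value."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.get("database", "seeds")
    seeds = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split "host:port"; fall back to the default port if none is given.
        host, _, port = entry.partition(":")
        seeds.append((host, int(port) if port else default_port))
    return seeds


print(parse_seeds(SAMPLE_CONF))
```

Listing several members, as in this test setup, gives a client more than one host to try when the member it first contacted is no longer PRIMARY.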


First I killed 27018

So 27019 became primary and it continued to work.

Then I killed 27019, so the only one running was localhost:27020, but it did not work after that. localhost:27020 stayed as SECONDARY
Comment 3 Preethi Thomas 2013-11-01 09:25:20 EDT
moving to verified.
[root@hp-sl2x170zg6-01 ~]# rpm -qa pulp-server
pulp-server-2.3.0-0.26.beta.el6.noarch
[root@hp-sl2x170zg6-01 ~]# 

The last statement in the above comment is mongo behavior.
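
The observation that localhost:27020 stayed SECONDARY is consistent with how replica set elections work: a member can only become PRIMARY if a majority of the voting members are reachable. A small sketch of that arithmetic, assuming the test set had exactly three voting members and no arbiter (the comments list three seeds but do not state the full configuration); the function names are illustrative only.

```python
def majority(voting_members):
    """Smallest number of votes that is more than half of the set."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members, reachable):
    """True if enough members are reachable to elect a PRIMARY."""
    return reachable >= majority(voting_members)


# Three-member set: all up, one killed, two killed.
for alive in (3, 2, 1):
    print(alive, can_elect_primary(3, alive))
# prints:
# 3 True
# 2 True
# 1 False
```

With two of three members killed, the survivor cannot reach a majority, so no PRIMARY exists for pulp to connect to.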
Comment 4 Preethi Thomas 2013-12-09 09:31:03 EST
Pulp 2.3 released.
