Bug 755941 - luci_admin restore is not consistent
Summary: luci_admin restore is not consistent
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: conga
Version: 5.8
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: rc
Assignee: Jan Pokorný [poki]
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On: 755935
Blocks: 769266
 
Reported: 2011-11-22 13:28 UTC by Radek Steiger
Modified: 2012-02-21 03:17 UTC
CC List: 2 users

Fixed In Version: conga-0.12.2-42.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Cloned To: 757902
Environment:
Last Closed: 2012-02-21 03:17:12 UTC
Target Upstream Version:


Attachments
Luci backup containing no clusters. (7.09 KB, application/xml)
2011-11-22 13:30 UTC, Radek Steiger
Proposed change to luci_admin script. (2.81 KB, patch)
2011-11-22 16:58 UTC, Jan Pokorný [poki]
slightly modified patch from attachment 535118 (4.94 KB, patch)
2011-11-29 20:00 UTC, Jan Pokorný [poki]
patch containing mentioned cleanup (8.58 KB, patch)
2011-11-29 20:57 UTC, Jan Pokorný [poki]
reiterated patch (should reflect all Radek's suggestions) (9.18 KB, patch)
2011-12-09 19:50 UTC, Jan Pokorný [poki]


Links
Red Hat Product Errata RHSA-2012:0151 (SHIPPED_LIVE): Moderate: conga security, bug fix, and enhancement update. Last updated 2012-02-21 07:24:48 UTC.

Description Radek Steiger 2011-11-22 13:28:30 UTC
Description of problem:

luci_admin restore does not fully restore the database to its original state. For example, when a backup is created on a clean DB and a cluster is then added into luci, the created cluster is still present in the database after the backup is restored.


Version-Release number of selected component (if applicable):

luci-0.12.2-38.el5


How reproducible:

Always


Steps to Reproduce:
1. Install and initialize luci: "luci_admin init"
2. Do a backup: "luci_admin backup"
3. Start luci and create a new cluster
4. Stop luci and do a db restore: "luci_admin restore"
5. Start luci again and check the cluster list
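
For reference, the whole sequence might look like this on the command line (a sketch assuming the default backup path; the cluster in step 3 is created through the web UI):

# luci_admin init                  (initialize the luci database)
# luci_admin backup                (writes /var/lib/luci/var/luci_backup.xml)
# service luci start               (now create a new cluster via the web UI)
# service luci stop
# luci_admin restore               (restores from the default backup file)
# service luci start               (the new cluster is still listed)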

  
Actual results:

Cluster created in step 3 is still present.


Expected results:

The database should be restored to its state from step 2, i.e. no cluster should be present.

Comment 1 Radek Steiger 2011-11-22 13:30:25 UTC
Created attachment 535023 [details]
Luci backup containing no clusters.

Comment 3 Jan Pokorný [poki] 2011-11-22 16:58:06 UTC
Created attachment 535118 [details]
Proposed change to luci_admin script.

In fact, the luci_admin script was not doing what could properly be called a
"restore" operation.  It simply added the items contained in a previously
generated backup XML file.

This behaviour is kept as the default for backward compatibility, but a new
switch '-r' ('--replace') is added to the "restore" subcommand that activates
the commonly desired semantics: remove the old data items first and then
import the new ones.  Currently, the data items handled this way are storage
and cluster entries (the latter resolves the original issue raised by Radek).
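
For illustration, the intended usage would be along these lines (a sketch based on the description above; the backup-file argument defaults to /var/lib/luci/var/luci_backup.xml):

# luci_admin restore my_backup.xml       (default: merge backed-up items into the DB)
# luci_admin restore -r my_backup.xml    (new: remove existing cluster/storage items first)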

Comment 4 Jan Pokorný [poki] 2011-11-22 17:07:51 UTC
The patch (attachment 535118 [details]) in the previous comment
is meant to be applied after the patch for bug 755935
(attachment 535040 [details]) -- not only so that it applies cleanly, but
also because this one modifies the built-in help message to reflect
both changes [*].

For this reason, I am setting the (weak) dependency between the bugs.

[*] The man page should also be modified accordingly.

Comment 5 Jan Pokorný [poki] 2011-11-22 18:23:47 UTC
That "luci_admin restore -r <XML-file>" works properly can be checked, e.g.,
with the case from the bug description.  Instead of checking the set of
clusters currently in the DB via luci, the output of another "luci_admin
backup" run afterwards provides the same information (under the
<clusterList> tag).
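
For instance (a sketch; assumes the default backup path and that clusters are listed under a <clusterList> element in the backup XML):

# luci_admin restore -r /var/lib/luci/var/luci_backup.xml
# luci_admin backup /tmp/after-restore.xml
# grep -A 2 '<clusterList' /tmp/after-restore.xml    (should show no cluster entries)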

Comment 6 Radek Steiger 2011-11-29 17:05:12 UTC
Acking QA.

Comment 7 Jan Pokorný [poki] 2011-11-29 20:00:07 UTC
Created attachment 538214 [details]
slightly modified patch from attachment 535118 [details]

Comment 8 Jan Pokorný [poki] 2011-11-29 20:27:26 UTC
This applies to both this bug 755941 and bug 755935:

Both should be fixed as of conga-0.12.2-41.el5.

However, to ensure the fix for bug 755941 works properly, my first step was
to do a little cleanup in luci's code to ensure that special values
(temporary flags for nodes and for the whole cluster) are handled
consistently within luci.  This fix was introduced as of conga-0.12.2-40.el5.

These temporary flags could presumably have made it into the backup XML if,
e.g., the cluster creation process was interrupted midway (luci shutdown)
and the database was then backed up using "luci_admin backup" (not verified).

Comment 9 Jan Pokorný [poki] 2011-11-29 20:57:05 UTC
Created attachment 538238 [details]
patch containing mentioned cleanup

In addition, it should also solve (theoretically only; this is hard to test
in isolation) a problem where "busy" nodes (those handling another task at
the moment) were not excluded properly in luci internals when desired.
Thus, e.g., for the purpose of listing cluster nodes, a busy node could have
been chosen as the data source.

It should therefore also be tested that common cluster actions work as they
should in build conga-0.12.2-41.el5 (or -40, but -41 is probably more
convenient, as it covers both bugs) and that no obviously wrong information
is displayed (such as a node shown as offline when it is not), mainly in
combination with user-initiated tasks being performed on particular nodes.

Comment 14 Jan Pokorný [poki] 2011-12-09 19:50:36 UTC
Created attachment 544668 [details]
reiterated patch (should reflect all Radek's suggestions)

In fact, the luci_admin script was not doing what could properly be called a
"restore" operation.  It simply added the items contained in a previously
generated backup XML file (and in some cases even generated tracebacks when
it did not succeed in doing so).

This behaviour is kept as the default for backward compatibility, but a new
switch '-r' ('--replace') is added to the "restore" subcommand that activates
the commonly desired semantics: remove the old configuration first and then
import the new one.  Currently, the data items handled this way are storage
and cluster entries.

Complementarily, a '-u' ('--update') switch has been added, which explicitly
requests the "update" semantics (currently the default).

The restore is now also permissive if a smooth merge cannot be done due to a
collision between an existing and a new item -- the existing one is kept and
the restore continues (applies to clusters, storage systems, and
certificates).

The patch also adds some comments and restricts the behaviour a bit.

Comment 15 Radek Steiger 2011-12-12 17:11:59 UTC
Marking as VERIFIED in luci-0.12.2-42.el5.


ad 1) Trying to restore existing cluster records into the database now gives a reasonable message instead of a traceback:

# luci_admin restore -u test.xml
Restoring the luci server...
An error occurred while restoring storage system "west-13.lab.bos.redhat.com": The id "west-13.lab.bos.redhat.com" is invalid - it is already in use.
An error occurred while restoring storage system "west-16.lab.bos.redhat.com": The id "west-16.lab.bos.redhat.com" is invalid - it is already in use.
An error occurred while restoring storage system "west-14.lab.bos.redhat.com": The id "west-14.lab.bos.redhat.com" is invalid - it is already in use.
An error occurred while restoring storage system "west-15.lab.bos.redhat.com": The id "west-15.lab.bos.redhat.com" is invalid - it is already in use.
An error occurred while restoring the cluster "mycluster": The id "mycluster" is invalid - it is already in use.
Certificate "/var/lib/luci/var/certs/privkey.pem" already exists, skipping...
Certificate "/var/lib/luci/var/certs/https.key.pem" already exists, skipping...
Certificate "/var/lib/luci/var/certs/cacert.pem" already exists, skipping...
Certificate "/var/lib/luci/var/certs/https.pem" already exists, skipping...
Certificate "/var/lib/luci/var/certs/cacert.config" already exists, skipping...
Certificate "/var/lib/luci/var/certs/peers/west-16.lab.bos.redhat.com_cert_pub" already exists, skipping...
Certificate "/var/lib/luci/var/certs/peers/west-14.lab.bos.redhat.com_cert_pub" already exists, skipping...
Certificate "/var/lib/luci/var/certs/peers/west-15.lab.bos.redhat.com_cert_pub" already exists, skipping...
Certificate "/var/lib/luci/var/certs/peers/west-13.lab.bos.redhat.com_cert_pub" already exists, skipping...
The luci restore was successful.
You must restart the luci server for changes to take effect.
Run "service luci restart" to do so


Replacing an existing configuration is indeed okay:

# luci_admin restore -r test.xml
Restoring the luci server...
Certificate "/var/lib/luci/var/certs/privkey.pem" already exists, skipping...
Certificate "/var/lib/luci/var/certs/https.key.pem" already exists, skipping...
Certificate "/var/lib/luci/var/certs/cacert.pem" already exists, skipping...
Certificate "/var/lib/luci/var/certs/https.pem" already exists, skipping...
Certificate "/var/lib/luci/var/certs/cacert.config" already exists, skipping...
The luci restore was successful.
You must restart the luci server for changes to take effect.
Run "service luci restart" to do so


Omitting the -u parameter still performs an update restore, as that is the default behavior.



ad 3) Replacing a valid configuration with an empty one (using a clean backup taken right after "luci_admin init") cleans up the database and certificates as well:

# ls -ls /var/lib/luci/var/certs/peers/
total 0


ad 2) New man page content is sufficient:

<...>
backup [backup-file]
        Create a backup of luci’s configuration to backup-file (default is
/var/lib/luci/var/luci_backup.xml).  Managed clusters, storage systems, local
users, user permissions, and SSL certificates will be stored as part of the
backup.

restore [-u|--update|-r|--replace] [backup-file]
        Restore luci’s configuration using data from backup-file previously
created using backup (default is /var/lib/luci/var/luci_backup.xml).
Switches -u (--update) and -r (--replace) control the semantics of the
command: keep (default) or remove the existing configuration, respectively.
The latter probably has the expected effect, but has to be specified
explicitly (for backward compatibility).
</...>

Comment 16 errata-xmlrpc 2012-02-21 03:17:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0151.html

