Bug 1217288 - [enh] Configuration snapshots and rollbacks
Summary: [enh] Configuration snapshots and rollbacks
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: NetworkManager
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Assignee: Beniamino Galvani
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On:
Blocks: 1301628 1313485
 
Reported: 2015-04-30 01:19 UTC by Dan Williams
Modified: 2016-11-03 19:13 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-03 19:13:44 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
GNOME Bugzilla | 749363 | 0 | None | None | None | Never
GNOME Bugzilla | 757858 | 0 | None | None | None | 2016-01-15 13:42:31 UTC
Red Hat Product Errata | RHSA-2016:2581 | 0 | normal | SHIPPED_LIVE | Low: NetworkManager security, bug fix, and enhancement update | 2016-11-03 12:08:07 UTC

Description Dan Williams 2015-04-30 01:19:16 UTC
Various consumers of NetworkManager's API (like Cockpit) have requested snapshot/checkpoint capability for NM configuration before, to enable rollback if the machine fails post configuration tests.

My initial thoughts on this are:

- add an org.fdo.NM.Settings.Snapshot() -> (id: s) method, which takes a snapshot of all connections and serializes them somewhere in keyfile format in a tar.gz, along with some metadata about which plugin they came from (if any; no plugin means a temporary connection, of course).

- add a Commit(id: s) method; calling this method makes the changes since the given Snapshot() permanent, and deletes the .tar.gz of the snapshot origin

- add a Rollback(id: s) method that resets that configuration to the backed up copy from Snapshot() by deleting any new connections, overwriting existing ones, and adding back missing ones.

- if NM gets told to quit while there's an outstanding snapshot, it should probably roll everything back.  Not 100% sure about this one though, but it's the safest bet.

- NM should not be determining what a "successful" system boot or configuration actually is; that should be up to the thing that calls Snapshot() and Rollback().  We had discussions with the libvirt guys about this and they indicated that there are many ways to determine whether the config should be rolled back that are higher level than just "did we get an IP".  Pings to certain machines, "can I talk to a known etcd", that kind of thing...
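
The Snapshot()/Commit()/Rollback() semantics proposed above can be sketched as a toy in-memory model. This is only an illustration of the described behaviour, not NetworkManager code: the class name SettingsModel, the dict-based connection store, and the choice to discard a snapshot once it has been rolled back are all assumptions made for the example.

```python
import copy
import uuid


class SettingsModel:
    """Toy in-memory stand-in for NM's connection store (hypothetical)."""

    def __init__(self):
        self.connections = {}   # connection UUID -> settings dict
        self._snapshots = {}    # snapshot id -> deep copy of connections

    def snapshot(self):
        # Back up every connection and hand an opaque id to the caller.
        snap_id = str(uuid.uuid4())
        self._snapshots[snap_id] = copy.deepcopy(self.connections)
        return snap_id

    def commit(self, snap_id):
        # Changes since the snapshot become permanent: drop the backup.
        del self._snapshots[snap_id]

    def rollback(self, snap_id):
        # Assumption: a snapshot is consumed by rolling back to it.
        saved = self._snapshots.pop(snap_id)
        # Delete connections that were created after the snapshot.
        for conn_uuid in list(self.connections):
            if conn_uuid not in saved:
                del self.connections[conn_uuid]
        # Overwrite modified connections and add back missing ones.
        for conn_uuid, settings in saved.items():
            self.connections[conn_uuid] = copy.deepcopy(settings)


model = SettingsModel()
model.connections["a"] = {"id": "em1", "method": "auto"}
model.connections["b"] = {"id": "em2", "method": "auto"}
snap = model.snapshot()

model.connections["a"]["method"] = "manual"   # modified after snapshot
del model.connections["b"]                    # removed after snapshot
model.connections["c"] = {"id": "new"}        # added after snapshot

model.rollback(snap)
```

The caller, not the model, decides when to call rollback(), which matches the point that the success criteria belong to whoever took the snapshot.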

Comment 2 Thomas Haller 2015-04-30 11:13:37 UTC
(In reply to Dan Williams from comment #0)
> Various consumers of NetworkManager's API (like Cockpit) have requested
> snapshot/checkpoint capability for NM configuration before, to enable
> rollback if the machine fails post configuration tests.
> 
> My initial thoughts on this are:
> 
> - add a org.fdo.NM.Settings.Snapshot() -> (id: s), where we take a snapshot
> of all connections and serialize those somewhere in keyfile format in a
> tar.gz along with some metadata about what plugin they came from (if any, no
> plugin means temporary connection of course).

We could embed additional metadata inside keyfile such as

[keyfile]
setting-plugin=ifcfg-rh
ifcfg-rh-filename=/etc/sysconfig/network-scripts/ifcfg-em1


Then nm_keyfile_read() could return an additional hash of metadata, and nm_keyfile_writer() would likewise accept one.
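
As a rough illustration of the idea, the sketch below separates a [keyfile] metadata group from the regular settings groups when reading a keyfile. It uses Python's configparser as a stand-in parser; real keyfiles follow the GKeyFile format, which is not strictly identical to INI. The function name split_metadata and the [connection] group contents are made up for the example; only the [keyfile] group is taken from the comment above.

```python
import configparser

# Example keyfile text. The [connection] group is illustrative only.
KEYFILE_TEXT = """\
[connection]
id=em1
type=ethernet

[keyfile]
setting-plugin=ifcfg-rh
ifcfg-rh-filename=/etc/sysconfig/network-scripts/ifcfg-em1
"""


def split_metadata(text, group="keyfile"):
    """Return (settings, metadata), where metadata is the given group."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(text)
    metadata = dict(parser[group]) if parser.has_section(group) else {}
    settings = {
        name: dict(parser[name])
        for name in parser.sections()
        if name != group
    }
    return settings, metadata


settings, metadata = split_metadata(KEYFILE_TEXT)
```

A writer would do the reverse: take the settings plus a metadata mapping and emit the mapping as the extra group.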


> - add a Commit(id: s) method; calling this method makes the changes since
> the given Snapshot() permanent, and deletes the .tar.gz of the snapshot
> origin
> 
> - add a Rollback(id: s) method that resets that configuration to the backed
> up copy from Snapshot() by deleting any new connections, overwriting
> existing ones, and adding back missing ones.
> 
> - if NM gets told to quit while there's an outstanding snapshot, it should
> probably roll everything back.  Not 100% sure about this one though, but
> it's the safest bet.
> 
> - NM should not be determining what "successful" system boot or configuration
> actually is; that should be up to the thing that calls Snapshot() and
> Rollback().  We had discussions with the libvirt guys about this and they
> indicated that there are many ways to determine whether the config should be
> rolled back, that are higher level than just "did we get an IP".  Pings to
> certain machines, "can I talk to known etcd", that kind of thing...

Do you want only one snapshot at a time? We could have a list of snapshots with the following operations:
  SnapshotWrite()
  SnapshotLoad()
  SnapshotDelete()
  SnapshotList()


Maybe instead of a tar file, concatenate them into one structured text file? That would still be human readable. Then they could be in one file like:
  /etc/NetworkManager/snapshots/<UUID>.snapshot

Or even better: put separate keyfiles in a snapshot directory:

  /etc/NetworkManager/snapshots/<snapshot-UUID>/<connection-UUID>.keyfile

The advantage of that is that keyfiles are very easy to inspect and reuse. A user could do:

  cp /etc/NetworkManager/snapshots/<snapshot-UUID>/<connection-UUID>.keyfile \
     /etc/NetworkManager/system-connections/my
  nmcli connection load /etc/NetworkManager/system-connections/my

Comment 3 Thomas Haller 2015-04-30 11:18:31 UTC
/etc/NetworkManager/snapshots/<snapshot-UUID>/, or instead of using UUIDs, just use a timestamp such as "%Y%m%d-%H%M%S"?
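
A timestamp-based directory name could be generated as below, assuming the intended format is year, month, day, then time ("%Y%m%d-%H%M%S"). The helper name snapshot_dir_name is hypothetical.

```python
from datetime import datetime, timezone


def snapshot_dir_name(when):
    """Format a datetime as a snapshot directory name."""
    return when.strftime("%Y%m%d-%H%M%S")


name = snapshot_dir_name(
    datetime(2015, 4, 30, 11, 18, 31, tzinfo=timezone.utc)
)
# name == "20150430-111831"
```

Names in this format sort chronologically when sorted as plain strings, which UUIDs do not.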

After all, we would not keep a cached version of this in memory.

  SnapshotList() would just iterate the filesystem.
  SnapshotDelete() would just delete a directory/file.
  SnapshotWrite() would create a new directory, and write the keyfiles there.
  SnapshotLoad() would load all keyfiles from a directory into memory first,
      before proceeding with the rollback.


Anyway, the file name of the snapshot directory shouldn't matter.
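
The four operations, backed only by the filesystem, might look roughly like the sketch below. It assumes the per-snapshot directory layout from comment 2, with one <connection-UUID>.keyfile per connection, and uses a temporary directory rather than /etc/NetworkManager/snapshots. The Python function names and the use of UUIDs for snapshot ids are choices made for the example, not an existing NetworkManager API.

```python
import shutil
import tempfile
import uuid
from pathlib import Path


def snapshot_write(root, connections):
    """Create a new snapshot directory with one keyfile per connection."""
    snap_id = str(uuid.uuid4())
    snap_dir = Path(root) / snap_id
    snap_dir.mkdir(parents=True)
    for conn_uuid, keyfile_text in connections.items():
        (snap_dir / f"{conn_uuid}.keyfile").write_text(keyfile_text)
    return snap_id


def snapshot_list(root):
    """Iterate the filesystem; nothing is cached in memory."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def snapshot_load(root, snap_id):
    """Read all keyfiles into memory before any rollback would begin."""
    snap_dir = Path(root) / snap_id
    return {p.stem: p.read_text() for p in snap_dir.glob("*.keyfile")}


def snapshot_delete(root, snap_id):
    """Remove the snapshot directory and everything in it."""
    shutil.rmtree(Path(root) / snap_id)


# Demo against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    conns = {
        "1111": "[connection]\nid=em1\n",
        "2222": "[connection]\nid=em2\n",
    }
    sid = snapshot_write(tmp, conns)
    listed = snapshot_list(tmp)
    loaded = snapshot_load(tmp, sid)
    snapshot_delete(tmp, sid)
    remaining = snapshot_list(tmp)
```

Loading everything into memory first means a malformed keyfile can be detected before any existing connection is touched.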

Comment 4 Thomas Haller 2015-10-01 16:01:09 UTC
This feature will not make it into rhel-7.2.

Comment 6 Thomas Haller 2016-01-15 13:42:32 UTC
A completely different solution would be RFE https://bugzilla.gnome.org/show_bug.cgi?id=757858

I think that would be more what cockpit requires.

Comment 7 Beniamino Galvani 2016-07-17 10:39:34 UTC
Branch is under review in upstream bug https://bugzilla.gnome.org/show_bug.cgi?id=757858

Comment 8 Beniamino Galvani 2016-08-24 07:16:36 UTC
Branch merged, awaits next snapshot.

Comment 10 Vladimir Benes 2016-09-19 14:06:55 UTC
Reverting a changed configuration to the saved state works well.

Comment 12 errata-xmlrpc 2016-11-03 19:13:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2581.html

