Bug 1296701 - Swift Ringbuilder errors when using IPv6 address
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: y3
Target Release: 7.0 (Kilo)
Assigned To: marios
QA Contact: yeylon@redhat.com
Duplicates: 1296702
Depends On:
Blocks: 1299227
Reported: 2016-01-07 17:45 EST by Dan Sneddon
Modified: 2016-04-26 09:48 EDT

See Also:
Fixed In Version: openstack-tripleo-heat-templates-0.8.6-99.el7ost
Doc Type: Bug Fix
Doc Text:
Swift caused deployment errors for an IPv6-based Overcloud due to problems with processing Swift's IPv6 addresses. This fix corrects how the IPv6 addresses are processed. Swift now deploys successfully.
Story Points: ---
Clone Of:
: 1299227
Environment:
Last Closed: 2016-02-18 11:48:50 EST
Type: Bug


External Trackers
Tracker           ID       Priority  Status  Summary  Last Updated
Launchpad         1534135  None      None    None     2016-01-14 07:59 EST
OpenStack gerrit  267523   None      None    None     2016-01-14 07:59 EST

Description Dan Sneddon 2016-01-07 17:45:57 EST
Description of problem:
When using the IPv6 versions of the isolated network TripleO Heat templates, the deployment fails with a Swift error (shown under "Actual results" below).

Version-Release number of selected component (if applicable):
OSP 8 beta
openstack-swift.noarch                2.5.1-dev134.el7.centos
openstack-swift-account.noarch        2.5.1-dev134.el7.centos
openstack-swift-container.noarch      2.5.1-dev134.el7.centos
openstack-swift-object.noarch         2.5.1-dev134.el7.centos
openstack-swift-plugin-swift3.noarch  1.7-4.el7
openstack-swift-proxy.noarch          2.5.1-dev134.el7.centos

How reproducible:
100%

Steps to Reproduce:
1. Deploy using IPv6 templates (https://etherpad.openstack.org/p/tripleo-ipv6-support)

Actual results:
Error: Parameter name failed on Ring_object_device[fd00:fd00:fd00:4000:f816:3eff:fe60:33a0:6000/d1]: Validate method failed for class name: the scheme http does not accept registry part: fd00:fd00:fd00:4000:f816:3eff:fe60:33a0:6000 (or bad hostname?) at /var/lib/heat-config/heat-config-puppet/ab0769c5-06fe-4ee9-acc9-c29dccbec687.pp:42\nWrapped exception:\nValidate method failed for class name: the scheme http does not accept registry part: fd00:fd00:fd00:4000:f816:3eff:fe60:33a0:6000 (or bad hostname?)

Expected results:
The Swift ringbuilder should build the rings.

Additional info:
I can't figure out where the URL for the Swift ringbuilder is stored. Maybe it isn't stored at all, and Puppet just reads the hieradata when it runs, but I couldn't find anything on the controller that contains a URL with fd00:fd00:fd00:4000:f816:3eff:fe60:33a0:6000.

I do see this in /etc/puppet/hieradata/swift_devices_and_proxy.yaml:

swift::proxy::cache::memcache_servers: ['fd00:fd00:fd00:2000:f816:3eff:fec4:ed44:11211']
tripleo::ringbuilder::devices: r1z1-fd00:fd00:fd00:4000:f816:3eff:fe60:33a0:%PORT%/d1, 

If the above is part of a URL, then the IP address should be enclosed in square brackets to separate it from the port number. If it is part of a URL, though, I can't figure out what the r1z1- prefix is for.
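
To illustrate why the brackets matter, here is a minimal sketch using Python's standard urllib.parse. This is an assumption made for illustration only: the failing check in the deployment is Puppet's (Ruby) URI validation, not Python, and the http:// prefix is added here just so the string can be parsed as a URL.

```python
from urllib.parse import urlsplit

# Address taken from the error message in this report.
ADDR = "fd00:fd00:fd00:4000:f816:3eff:fe60:33a0"


def host_and_port(url):
    """Return (hostname, port); raises ValueError if the port cannot be parsed."""
    parts = urlsplit(url)
    return parts.hostname, parts.port


# Bracketed IPv6 literal: host and port are cleanly separated.
bracketed = host_and_port(f"http://[{ADDR}]:6000/d1")

# Unbracketed: the parser cannot tell which colon introduces the port.
try:
    host_and_port(f"http://{ADDR}:6000/d1")
    unbracketed_ok = True
except ValueError:
    unbracketed_ok = False
```

With the brackets the parser returns the full address and port 6000; without them it cannot decide where the host ends, which is the same kind of ambiguity the Puppet validation is complaining about.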

Please help me figure out how to make Swift ringbuilder work with IPv6 IP addresses.
Comment 2 Dan Sneddon 2016-01-08 04:22:34 EST
*** Bug 1296702 has been marked as a duplicate of this bug. ***
Comment 3 Pete Zaitcev 2016-01-12 02:08:29 EST
The r1z1-host:port/device syntax is how fully qualified devices are
described in Swift, and yes, the host should have brackets if it's
an IPv6 address.

I have tested that this works:

 swift-ring-builder test.builder add r1z1-[fd00:fd00:fd00:4000:f816:3eff:fe60:33a0]:6020/d1 100
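
As a rough illustration of the r<region>z<zone>-host:port/device layout described above, the following Python sketch splits such a string into its parts. The regular expression and the parse_device helper are hypothetical and deliberately simplified (IP literals only); this is not Swift's own parser, which accepts more forms than this.

```python
import re
from ipaddress import ip_address

# Hypothetical, simplified pattern: bracketed IPv6 or dotted IPv4 only.
DEVICE_RE = re.compile(
    r"^r(?P<region>\d+)z(?P<zone>\d+)-"
    r"(?:\[(?P<ip6>[0-9A-Fa-f:]+)\]|(?P<ip4>[0-9.]+))"
    r":(?P<port>\d+)/(?P<device>\w+)$"
)


def parse_device(spec):
    """Split a device spec into region, zone, ip, port and device name."""
    match = DEVICE_RE.match(spec)
    if match is None:
        raise ValueError(f"unrecognised device spec: {spec!r}")
    # ip_address() validates the literal and raises ValueError if it is malformed.
    ip = ip_address(match.group("ip6") or match.group("ip4"))
    return {
        "region": int(match.group("region")),
        "zone": int(match.group("zone")),
        "ip": str(ip),
        "port": int(match.group("port")),
        "device": match.group("device"),
    }


parsed = parse_device("r1z1-[fd00:fd00:fd00:4000:f816:3eff:fe60:33a0]:6020/d1")
```

The same helper rejects the unbracketed form from the original report, because without brackets there is no reliable way to tell the last address group from the port.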

But anyway, I do not understand why this bug is assigned to Swift.
The message "Validate method failed for class name" does not occur
anywhere in Swift. Clearly it's something in TripleO or Director.
I suggest changing the component this bug is assigned to.
Comment 5 Sergey Gotliv 2016-01-12 07:53:51 EST
I agree with comment #3; it also works in my environment.
Comment 6 Pete Zaitcev 2016-01-12 11:46:36 EST
This is not related to this specific problem and error message, but
we might want to clone this bug or open a new bug in order to pick up
the fix for memcached over IPv6 (fresh out of the oven on December 16):

 https://github.com/openstack/swift/commit/167bb5eeb82886d67c1b382417fb22b8ea85f0d3
Comment 7 marios 2016-01-14 06:12:34 EST
hey, we are setting this in the controller (for example) templates like

  swift_device:
    description: Swift device formatted for swift-ring-builder
    value:
      str_replace:
        template: 'r1z1-IP:%PORT%/d1'
        params:
          IP: {get_attr: [NetIpMap, net_ip_map, {get_param: [ServiceNetMap, SwiftMgmtNetwork]}]}


We should be able to add the '[]' around the IP. One open question is whether that would then break IPv4, so we may need two outputs: swift_device and also swift_devicev6.
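
To make the IPv4 concern concrete, here is a small Python sketch of the substitution, bracketing the address only when it is IPv6. It is an illustration, not the Heat implementation: the swift_device function is hypothetical, plain str.replace stands in for Heat's str_replace, and 192.0.2.10 is just a documentation address.

```python
from ipaddress import ip_address

# Template string as it appears in the Heat output quoted above.
TEMPLATE = "r1z1-IP:%PORT%/d1"


def swift_device(ip):
    """Fill in the IP placeholder, adding brackets only for IPv6 addresses."""
    addr = ip_address(ip)
    host = f"[{addr}]" if addr.version == 6 else str(addr)
    # %PORT% is left in place for a later substitution step.
    return TEMPLATE.replace("IP", host)


v6_device = swift_device("fd00:fd00:fd00:4000:f816:3eff:fe60:33a0")
v4_device = swift_device("192.0.2.10")
```

A Heat template cannot branch on the address family this way, which is why a separate IPv6 variant of the output is being considered.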

poking some more.
Comment 8 marios 2016-01-14 07:58:16 EST
fixup for this proposed:

downstream:  https://code.engineering.redhat.com/gerrit/#/c/65558/
upstream: https://review.openstack.org/#/c/267523/

(both are rebased on top of the ipv6 patch up/downstream respectively)

and also upstream bug https://bugs.launchpad.net/tripleo/+bug/1534135 for tracking
Comment 9 Dan Sneddon 2016-01-16 21:30:21 EST
After the downstream patch was merged, I am getting this error when I try to deploy:

CommandError: Could not fetch contents for file:///usr/share/openstack-tripleo-heat-templates/environments/puppet/swift-devices-and-proxy-config-v6.yaml

I think we missed something in the merge.
Comment 10 marios 2016-01-17 02:01:00 EST
(In reply to Dan Sneddon from comment #9)
> After the downstream patch was merged, I am getting this error when I try to
> deploy:
> 
> CommandError: Could not fetch contents for
> file:///usr/share/openstack-tripleo-heat-templates/environments/puppet/swift-devices-and-proxy-config-v6.yaml
> 
> I think we missed something in the merge.

thanks Dan, yeah, it is a typo in the environment file (the link points to upstream, but it is the same downstream): https://review.openstack.org/#/c/267523/3/environments/network-isolation-v6.yaml

the 'puppet' should be '../puppet'; will fix up now
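
A quick Python sketch of how the two relative references resolve, assuming they are resolved against the directory containing the environment file. The installed location of network-isolation-v6.yaml (ENV_FILE) is inferred from the path in the error message and is an assumption, as is the resolve helper itself.

```python
import posixpath

# Assumed installed location of the environment file (inferred, not confirmed).
ENV_FILE = (
    "/usr/share/openstack-tripleo-heat-templates/"
    "environments/network-isolation-v6.yaml"
)


def resolve(reference, env_file=ENV_FILE):
    """Resolve a relative reference against the environment file's directory."""
    base_dir = posixpath.dirname(env_file)
    return posixpath.normpath(posixpath.join(base_dir, reference))


wrong = resolve("puppet/swift-devices-and-proxy-config-v6.yaml")
right = resolve("../puppet/swift-devices-and-proxy-config-v6.yaml")
```

The first form lands under environments/puppet/, which matches the path in the CommandError; the second climbs out of environments/ into the top-level puppet/ directory.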
Comment 13 Marius Cornea 2016-01-19 08:12:29 EST
I did a deployment with 1 ctrl and 1 compute with the SwiftMgmtNetwork set to internal_api and it completed successfully:

[2016-01-19 07:55:11,409] (heat-config) [INFO] {"deploy_stdout": "\u001b[mNotice: Compiled catalog for overcloud-controller-0.localdomain in environment production in 0.50 seconds\u001b[0m\n\u001b[mNotice: /Stage[main]/Main/Package_manifest[/var/lib/tripleo/installed-packages/ringbuilder]/ensure: created\u001b[0m\n\u001b[mNotice: /Stage[main]/Tripleo::Ringbuilder/Swift::Ringbuilder::Create[container]/Exec[create_container]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Tripleo::Ringbuilder/Swift::Ringbuilder::Create[object]/Exec[create_object]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Tripleo::Ringbuilder/Swift::Ringbuilder::Create[account]/Exec[create_account]/returns: executed successfully\u001b[0m\n\u001b[mNotice: /Stage[main]/Tripleo::Ringbuilder/Add_devices[r1z1-[fd00:fd00:fd00:2000:f816:3eff:feba:f3fe]:%PORT%/d1]/Ring_account_device[[fd00:fd00:fd00:2000:f816:3eff:feba:f3fe]:6002/d1]/ensure: created\u001b[0m\n\u001b[mNotice: /Stage[main]/Tripleo::Ringbuilder/Add_devices[r1z1-[fd00:fd00:fd00:2000:f816:3eff:feba:f3fe]:%PORT%/d1]/Ring_container_device[[fd00:fd00:fd00:2000:f816:3eff:feba:f3fe]:6001/d1]/ensure: created\u001b[0m\n\u001b[mNotice: /Stage[main]/Tripleo::Ringbuilder/Add_devices[r1z1-[fd00:fd00:fd00:2000:f816:3eff:feba:f3fe]:%PORT%/d1]/Ring_object_device[[fd00:fd00:fd00:2000:f816:3eff:feba:f3fe]:6000/d1]/ensure: created\u001b[0m\n\u001b[mNotice: /Stage[main]/Tripleo::Ringbuilder/Swift::Ringbuilder::Rebalance[account]/Exec[rebalance_account]: Triggered 'refresh' from 1 events\u001b[0m\n\u001b[mNotice: /Stage[main]/Tripleo::Ringbuilder/Swift::Ringbuilder::Rebalance[container]/Exec[rebalance_container]: Triggered 'refresh' from 1 events\u001b[0m\n\u001b[mNotice: /Stage[main]/Tripleo::Ringbuilder/Swift::Ringbuilder::Rebalance[object]/Exec[rebalance_object]: Triggered 'refresh' from 1 events\u001b[0m\n\u001b[mNotice: Finished catalog run in 3.08 seconds\u001b[0m\n", "deploy_stderr": "Device \"br_ex\" does not 
exist.\nDevice \"ovs_system\" does not exist.\n"
Notice: /Stage[main]/Tripleo::Ringbuilder/Add_devices[r1z1-[fd00:fd00:fd00:2000:f816:3eff:feba:f3fe]:%PORT%/d1]/Ring_object_device[[fd00:fd00:fd00:2000:f816:3eff:feba:f3fe]:6000/d1]/ensure: created

[root@overcloud-controller-0 ~]# swift-ring-builder /etc/swift/object.ring.gz 
/etc/swift/object.ring.gz, build version 1
1024 partitions, 3.000000 replicas, 1 regions, 1 zones, 1 devices, 0.00 balance, 0.00 dispersion
The minimum number of hours before a partition can be reassigned is 1
The overload factor is 0.00% (0.000000)
Devices:    id  region  zone                               ip address  port                           replication ip  replication port      name weight partitions balance meta
             0       1     1  fd00:fd00:fd00:2000:f816:3eff:feba:f3fe  6000  fd00:fd00:fd00:2000:f816:3eff:feba:f3fe              6000        d1 100.00       3072    0.00
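
For reference, the Puppet log above shows the single %PORT% placeholder turning into one device per ring, on ports 6002 (account), 6001 (container) and 6000 (object). A minimal Python sketch of that expansion follows; the RING_PORTS mapping is copied from the log, while the expand_device helper is hypothetical and is not the Puppet code that performs this step.

```python
# Ports as they appear in the Puppet log above (account/container/object).
RING_PORTS = {"account": 6002, "container": 6001, "object": 6000}


def expand_device(device_spec):
    """Return one device string per ring with %PORT% filled in."""
    return {
        ring: device_spec.replace("%PORT%", str(port))
        for ring, port in RING_PORTS.items()
    }


devices = expand_device("r1z1-[fd00:fd00:fd00:2000:f816:3eff:feba:f3fe]:%PORT%/d1")
```

Because the address is already bracketed when the port is substituted, each resulting string keeps an unambiguous host and port boundary.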
Comment 15 errata-xmlrpc 2016-02-18 11:48:50 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0264.html