Bug 1392728 - osp-director-10: Upgrade 9 ->10 with IPv6 fails on CONVERGENCE_STEP with ERROR: Unsupported file format, bad range specification after pos 9 (error came from ceph node).
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 10.0 (Newton)
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: 10.0 (Newton)
Assignee: Giulio Fidente
QA Contact: Arik Chernetsky
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-11-08 05:42 UTC by Omri Hochman
Modified: 2016-12-14 16:30 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-5.0.0-1.5.el7ost
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-14 16:30:14 UTC


Attachments
messages from the ceph (1.99 MB, application/x-tar)
2016-11-08 05:57 UTC, Omri Hochman
heat-engine.log from undercloud (4.02 MB, application/x-bzip)
2016-11-08 05:58 UTC, Omri Hochman


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2016:2948 normal SHIPPED_LIVE Red Hat OpenStack Platform 10 enhancement update 2016-12-14 19:55:27 UTC
OpenStack gerrit 394987 None None None 2016-11-08 14:27:51 UTC
Launchpad 1640148 None None None 2016-11-08 12:16:15 UTC

Description Omri Hochman 2016-11-08 05:42:59 UTC
osp-director-10:  Upgrade 9 ->10 with IPv6 fails on CONVERGENCE_STEP with ERROR: Unsupported file format, bad range specification after pos 9. 

Environment: 
-------------
instack-undercloud-5.0.0-2.el7ost.noarch
instack-5.0.0-1.el7ost.noarch
heat-cfntools-1.3.0-2.el7ost.noarch
openstack-heat-api-cfn-7.0.0-4.el7ost.noarch
openstack-tripleo-heat-templates-5.0.0-1.2.el7ost.noarch
openstack-tripleo-heat-templates-compat-2.0.0-34.3.el7ost.noarch
python-heat-agent-0-0.5.1e6015dgit.el7ost.noarch
openstack-heat-api-7.0.0-4.el7ost.noarch
puppet-heat-9.4.1-1.el7ost.noarch
openstack-heat-templates-0-0.5.1e6015dgit.el7ost.noarch
python-heatclient-1.5.0-1.el7ost.noarch
openstack-heat-common-7.0.0-4.el7ost.noarch
openstack-heat-engine-7.0.0-4.el7ost.noarch
python-heat-tests-7.0.0-4.el7ost.noarch


Steps: 
--------
(1) Deploy OSP9 with IPv6
(2) Attempt to upgrade from OSP9 to OSP10 


Results: 
---------
Upgrade fails on CONVERGENCE_STEP with: ERROR: Unsupported file format, bad range specification after pos 9.


WARNING (shell) "heat resource-list" is deprecated, please use "openstack stack resource list" instead


+-------------------------------------------+---------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+-----------------+----------------------+----------------------------------------------------------------------------------------------------------------------+
| resource_name                             | physical_resource_id                                                            | resource_type                                                                                                       | resource_status | updated_time         | stack_name                                                                                                           |
+-------------------------------------------+---------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+-----------------+----------------------+----------------------------------------------------------------------------------------------------------------------+
| AllNodesDeploySteps                       | ccc457ab-32da-41b4-9f62-7e0f161c9d43                                            | OS::TripleO::PostDeploySteps                                                                                        | CREATE_FAILED   | 2016-11-08T05:20:52Z | overcloud                                                                                                            |
| ControllerArtifactsDeploy                 | 99aade54-5b2c-4d38-8078-6507bbf92227                                            | OS::Heat::StructuredDeployments                                                                                     | CREATE_FAILED   | 2016-11-08T05:20:55Z | overcloud-AllNodesDeploySteps-dfasepqo6plf                                                                           |
| CephStorageArtifactsDeploy                | f8574300-2d3e-4ee4-b3d2-65a9e2d379c1                                            | OS::Heat::StructuredDeployments                                                                                     | CREATE_FAILED   | 2016-11-08T05:20:56Z | overcloud-AllNodesDeploySteps-dfasepqo6plf                                                                           |
| ComputeArtifactsDeploy                    | 3746256e-6128-41f3-b330-dead7b5ca952                                            | OS::Heat::StructuredDeployments                                                                                     | CREATE_FAILED   | 2016-11-08T05:20:56Z | overcloud-AllNodesDeploySteps-dfasepqo6plf                                                                           |
| 0                                         | 5410b77f-4975-42a8-b8d3-1bea28079806                                            | OS::Heat::StructuredDeployment                                                                                      | CREATE_FAILED   | 2016-11-08T05:21:02Z | overcloud-AllNodesDeploySteps-dfasepqo6plf-ControllerArtifactsDeploy-eitz2bimxfgr                                    |
| 0                                         | 87ca5d42-54c9-41e6-8df5-ddcf604d52cc                                            | OS::Heat::StructuredDeployment                                                                                      | CREATE_FAILED   | 2016-11-08T05:21:02Z | overcloud-AllNodesDeploySteps-dfasepqo6plf-ComputeArtifactsDeploy-666tg5dvn4fg                                       |
| 1                                         | 9807da45-1f4d-479c-9d92-945832190c77                                            | OS::Heat::StructuredDeployment                                                                                      | CREATE_FAILED   | 2016-11-08T05:21:02Z | overcloud-AllNodesDeploySteps-dfasepqo6plf-ControllerArtifactsDeploy-eitz2bimxfgr                                    |
| 2                                         | 516389a1-ba64-4aeb-a3c1-f44785e6cc61                                            | OS::Heat::StructuredDeployment                                                                                      | CREATE_FAILED   | 2016-11-08T05:21:02Z | overcloud-AllNodesDeploySteps-dfasepqo6plf-ControllerArtifactsDeploy-eitz2bimxfgr                                    |
| 0                                         | 001bbb95-cbc8-4779-92ae-bbe19b148b49                                            | OS::Heat::StructuredDeployment                                                                                      | CREATE_FAILED   | 2016-11-08T05:21:03Z | overcloud-AllNodesDeploySteps-dfasepqo6plf-CephStorageArtifactsDeploy-ygv3uarg7dxc                                   |
| ControllerAllNodesValidationDeployment    | 66745562-2623-4d14-845c-ade1b55cf3f6                                            | OS::Heat::StructuredDeployments                                                                                     | UPDATE_FAILED   | 2016-11-08T05:21:52Z | overcloud                                                                                                            |
+-------------------------------------------+---------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+-----------------+----------------------+----------------------------------------------------------------------------------------------------------------------+
[stack@undercloud-0 ~]$ heat deployment-show 001bbb95-cbc8-4779-92ae-bbe19b148b49
WARNING (shell) "heat deployment-show" is deprecated, please use "openstack software deployment show" instead
{
  "status": "FAILED", 
  "server_id": "ab8d95ab-dd30-4ad0-ba45-1a4d4df3d5f0", 
  "config_id": "e4d47391-283b-4b1c-aea4-f7e4192fd2a3", 
  "output_values": {
    "deploy_stdout": "ERROR: Unsupported file format.\n", 
    "deploy_stderr": "curl: (3) [globbing] error: bad range specification after pos 9\n", 
    "deploy_status_code": 1
  }, 
  "creation_time": "2016-11-08T05:21:05Z", 
  "updated_time": "2016-11-08T05:21:50Z", 
  "input_values": {}, 
  "action": "CREATE", 
  "status_reason": "deploy_status_code : Deployment exited with non-zero status code: 1", 
  "id": "001bbb95-cbc8-4779-92ae-bbe19b148b49"
}

Comment 1 Omri Hochman 2016-11-08 05:44:45 UTC
heat-engine.log :
------------------
2016-11-08 00:21:51.064 20367 INFO heat.engine.resource [req-dbcbcf41-4a7a-4ce2-9e18-b461102cd142 743a458154ce4748a6c24dc3633d5031 4fd10e82222c4db281ccecde5b483ade - - -] CREATE: StructuredDeployment "0" [001bbb95-cbc8-4779-92ae-bbe19b148b49] Stack "overcloud-AllNodesDeploySteps-dfasepqo6plf-CephStorageArtifactsDeploy-ygv3uarg7dxc" [f8574300-2d3e-4ee4-b3d2-65a9e2d379c1]
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource Traceback (most recent call last):
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 753, in _action_recorder
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource     yield
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 855, in _do_action
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource     yield self.action_handler_task(action, args=handler_args)
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/scheduler.py", line 353, in wrapper
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource     step = next(subtask)
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 806, in action_handler_task
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource     done = check(handler_data)
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/heat/software_deployment.py", line 435, in check_create_complete
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource     return self._check_complete()
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource   File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/heat/software_deployment.py", line 301, in _check_complete
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource     raise exception.Error(message)
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource Error: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
2016-11-08 00:21:51.064 20367 ERROR heat.engine.resource 
2016-11-08 00:21:51.076 20366 DEBUG heat.engine.scheduler [req-dbcbcf41-4a7a-4ce2-9e18-b461102cd142 743a458154ce4748a6c24dc3633d5031 4fd10e82222c4db281ccecde5b483ade - - -] Task create running step /usr/lib/python2.7/site-packages/heat/engine/scheduler.py:216
2016-11-08 00:21:51.094 20366 DEBUG oslo_messaging._drivers.amqpdriver [req-dbcbcf41-4a7a-4ce2-9e18-b461102cd142 743a458154ce4748a6c24dc3633d5031 4fd10e82222c4db281ccecde5b483ade - - -] sending reply msg_id: d41eda63a7194f108a6436626c5baa60 reply queue: reply_4004575557444aaa9a68cea1f5c8529c time elapsed: 0.0732205460008s _send_reply /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py:73

Comment 4 Omri Hochman 2016-11-08 05:50:35 UTC
Found the errors in /var/log/messages on the ceph node:
----------------------------------------------------
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,742] (heat-config) [INFO] {"deploy_stdout": "ERROR: Unsupported file format.\n", "deploy_stderr": "curl: (3) [globbing] error: bad range specification after pos 9\n", "deploy_status_code": 1}
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,742] (heat-config) [DEBUG] [2016-11-08 05:21:46,653] (heat-config) [INFO] artifact_urls=http://[fd00:fd00:fd00:3000::10]:8080/v1/AUTH_a2d960971f4d4357807d5c9edf942019/overcloud-artifacts/puppet-modules.tar.gz?temp_url_sig=f5399cab787fc6fc090dc50aa21634dab34d6b8f&temp_url_expires=1510111549
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,654] (heat-config) [INFO] deploy_server_id=ab8d95ab-dd30-4ad0-ba45-1a4d4df3d5f0
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,654] (heat-config) [INFO] deploy_action=CREATE
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,654] (heat-config) [INFO] deploy_stack_id=overcloud-AllNodesDeploySteps-dfasepqo6plf-CephStorageArtifactsDeploy-ygv3uarg7dxc/f8574300-2d3e-4ee4-b3d2-65a9e2d379c1
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,654] (heat-config) [INFO] deploy_resource_name=0
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,654] (heat-config) [INFO] deploy_signal_transport=CFN_SIGNAL
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,655] (heat-config) [INFO] deploy_signal_id=http://192.0.2.1:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3A4fd10e82222c4db281ccecde5b483ade%3Astacks%2Fovercloud-AllNodesDeploySteps-dfasepqo6plf-CephStorageArtifactsDeploy-ygv3uarg7dxc%2Ff8574300-2d3e-4ee4-b3d2-65a9e2d379c1%2Fresources%2F0?Timestamp=2016-11-08T05%3A21%3A03Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=90a9b6f500df4437b5d844f762911c80&SignatureVersion=2&Signature=GKR9YRsnpLN0l4eHSILxBDq2UvxCPP%2Bn6gxU2bs0Z7E%3D
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,655] (heat-config) [INFO] deploy_signal_verb=POST
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,655] (heat-config) [DEBUG] Running /var/lib/heat-config/heat-config-script/e4d47391-283b-4b1c-aea4-f7e4192fd2a3
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,737] (heat-config) [INFO] ERROR: Unsupported file format.
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,737] (heat-config) [DEBUG] curl: (3) [globbing] error: bad range specification after pos 9
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,737] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-script/e4d47391-283b-4b1c-aea4-f7e4192fd2a3. [1]
Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,743] (heat-config) [INFO] Completed /var/lib/heat-config/hooks/script

Comment 5 Omri Hochman 2016-11-08 05:57:33 UTC
Created attachment 1218407 [details]
messages from the ceph

Comment 6 Omri Hochman 2016-11-08 05:58:08 UTC
Created attachment 1218408 [details]
heat-engine.log from undercloud

Comment 7 Marios Andreou 2016-11-08 07:41:12 UTC
(@gfidente: the needinfo is just to ping you about this; please check when you get a moment.)

As far as I can see, the first occurrence of the error you mention above in the ceph log you attached ( https://bugzilla.redhat.com/attachment.cgi?id=1218407 ) is:

    Nov  8 00:21:46 localhost os-collect-config: [2016-11-08 05:21:46,742] (heat-config) [INFO] {"deploy_stdout": "ERROR: Unsupported file format.\n", "deploy_stderr": "curl: (3) [globbing] error: bad range specification after pos 9\n", "deploy_status_code": 1}


However, I also see some earlier ceph OSD error messages which may be related:

    Nov  7 22:03:20 localhost os-collect-config: #033[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/vdb]/Exec[ceph-osd-prepare-/dev/vdb]/returns: partx: /dev/vdb: error adding partition 2#033[0m
    ...
    Nov  7 22:03:20 localhost os-collect-config: #033[mNotice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/vdb]/Exec[ceph-osd-activate-/dev/vdb]/returns: 2016-11-07 22:03:18.268988 7f75197a27c0 -1 auth: error reading file: /var/lib/ceph/tmp/mnt.dMRvpE/keyring: can't open /var/lib/ceph/tmp/mnt.dMRvpE/keyring: (2) No such file or directory#033[0m

though these seem to be non-fatal (Notice, not Error). I still wonder whether this is the root cause of the error on converge. I assume that all previous steps completed OK?
Adding needinfo on gfidente for visibility (sorry faidentee... it has the word 'ceph' in the title :/). Could you please have a look at this if you get a chance today? I know you have other things too. We can talk about it this evening on scrum.

thanks, marios

Comment 8 Giulio Fidente 2016-11-08 11:30:35 UTC
This seems unrelated to Ceph itself; rather, curl is failing to download the URL from artifact_urls:

curl -o /tmp/file 'http://[fd00:fd00:fd00:3000::10]:8080/v1/AUTH_a2d960971f4d4357807d5c9edf942019/overcloud-artifacts/puppet-modules.tar.gz?temp_url_sig=f5399cab787fc6fc090dc50aa21634dab34d6b8f&temp_url_expires=1510111549'
curl: (3) [globbing] error: bad range specification after pos 9

Comment 9 Giulio Fidente 2016-11-08 11:37:27 UTC
curl is trying to glob the URL because of the square brackets around the IPv6 address; one option is to escape the brackets with \

Omri, can you try passing the url as http://\[ip6\]:port/ ?

Another option is to add --globoff to the curl command, I will submit a change for this
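
For illustration, here is a minimal sketch of the two options above. The function names and the example URL path are hypothetical and are not taken from tripleo-heat-templates; the curl options used (--globoff, --fail, --silent, --show-error, --output) are standard curl flags. This is only a sketch of the idea, not the actual change submitted for this bug.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not from tripleo-heat-templates.

# Option 1: backslash-escape the characters curl treats as glob syntax
# ([ ] { }) so that an IPv6 literal in the URL is not parsed as a range.
escape_url_brackets() {
    printf '%s\n' "$1" | sed 's/[][{}]/\\&/g'
}

# Option 2: turn off URL globbing entirely, so the URL is used as given.
# $1 is the URL, $2 is the destination file.
fetch_artifact() {
    curl --globoff --fail --silent --show-error --output "$2" "$1"
}

# Print the escaped form of an IPv6 URL (no network access needed).
escape_url_brackets 'http://[fd00:fd00:fd00:3000::10]:8080/v1/example.tar.gz'
```

With option 2, the command from comment 8 would become something like fetch_artifact "$url" /tmp/file, leaving the URL itself untouched. Disabling globbing is probably the simpler route when the URL is generated by Heat rather than typed by hand, since nothing upstream has to know about curl's glob syntax.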

Comment 10 Giulio Fidente 2016-11-08 11:43:46 UTC
Also, we can use this BZ to track the IPv6 compatibility issue, but this isn't related to the upgrade process.

Comment 11 Marios Andreou 2016-11-08 12:16:16 UTC
So, thanks to gfidente, it seems this happens because you are using DeployArtifacts in your environment (is your automation passing this in?). Are you installing some specific tarballs/packages on top of the normal deployment for some reason?

I don't think this is Lifecycle related, I expect the same would happen if you were to deploy IPv6 and use DeployArtifacts with a fresh OSP10 setup.

Having said that, gfidente has a quick fixup at https://review.openstack.org/#/c/394914/ and I just filed the lp bug https://bugs.launchpad.net/tripleo/+bug/1640148 so it can go to stable/newton. Adding to trackers above.

slagle and jcoufal, I think this is a wider Deployment DFG issue, wdyt? It's just that omri hits it as part of his upgrade tests. Just wondering if it is targeted correctly at DFG:DF-Lifecycle above.

Comment 12 Marios Andreou 2016-11-08 12:18:33 UTC
Please see comment 11; I don't think this should be targeted at Lifecycle.

Comment 13 James Slagle 2016-11-08 14:00:13 UTC
Agreed. Will update to DFG:DF.

jarda, need input on whether this is a blocker. DeployArtifacts is empty by default and was undocumented for OSPd 9; it is, however, an interface in the templates. But this never worked with IPv6, so it is not a regression for this particular case.

Comment 14 Jaromir Coufal 2016-11-08 14:12:43 UTC
Thanks, James, I agree - removing blocker flag.

If we can land the patch for OSP10, all good. If not, we need to improve our docs to be clear about this use case.

Comment 15 Omri Hochman 2016-11-08 15:03:51 UTC
(In reply to Jaromir Coufal from comment #14)
> Thanks, James, I agree - removing blocker flag.
> 
> If we can land the patch for OSP10, all good. If not, we need to improve our
> docs to be clear about this use case.

We would need to validate the given workaround and make sure that 'upgrades with IPv6' are not blocked; then we can remove the blocker flag, or move the bug to docs, etc.

Comment 21 Omri Hochman 2016-11-15 14:16:41 UTC
Verified with : openstack-tripleo-heat-templates-5.0.0-1.7.el7ost.noarch

Comment 23 errata-xmlrpc 2016-12-14 16:30:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html

