Bug 1731575 - EDP error while running Ambari/HDP 2.6
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-sahara
Version: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z2
Target Release: 15.0 (Stein)
Assignee: Telles Nobrega
QA Contact: Luigi Toscano
Docs Contact: Chuck Copello
URL:
Whiteboard:
Depends On:
Blocks: 1612113
Reported: 2019-07-19 20:10 UTC by Luigi Toscano
Modified: 2020-02-14 19:36 UTC
CC List: 4 users

Fixed In Version: openstack-sahara-10.0.1-0.20191022230426.1fa76ce.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:


Links
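+----------------------+---------+----------+--------+----------------+-------------------------+
| System               | ID      | Priority | Status | Summary        | Last Updated            |
+----------------------+---------+----------+--------+----------------+-------------------------+
| OpenStack Storyboard | 2006258 | None     | None   | None           | 2019-07-19 20:10:50 UTC |
| OpenStack gerrit     | 672818  | None     | MERGED | Python 3 fixes | 2020-02-14 19:35:00 UTC |
| OpenStack gerrit     | 687163  | None     | MERGED | Python 3 fixes | 2020-02-14 19:35:00 UTC |
+----------------------+---------+----------+--------+----------------+-------------------------+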

Description Luigi Toscano 2019-07-19 20:10:51 UTC
Description of problem:

While running the sahara-tests scenario tests for Ambari/HDP 2.6 against an up-to-date Stein on RHEL 8 (Python 3.6), I hit the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/sahara_tests/scenario/base.py", line 63, in wrapper
    return fct(self, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/sahara_tests/scenario/base.py", line 236, in check_run_jobs
    self._job_batching(pre_exec)
  File "/usr/lib/python3.6/site-packages/sahara_tests/scenario/base.py", line 246, in _job_batching
    self._poll_jobs_status(job_exec_ids)
  File "/usr/lib/python3.6/site-packages/sahara_tests/scenario/base.py", line 378, in _poll_jobs_status
    self.fail("\n".join(report))
  File "/usr/lib/python3.6/site-packages/unittest2/case.py", line 693, in fail
    raise self.failureException(msg)

AssertionError: Job with id=92c1f7bd-0d25-408f-9db7-a46fb901dc85, name=test-2a7add96, type=Java has status FAILED
Job with id=d6696715-a88b-4788-9469-7a2e10b564d9, name=test-20a136cf, type=Spark has status FAILED

The log file sahara-engine.log contains two instances of the following error:

[instance: f8a4e63c-d70a-4e27-89df-c8441e969a20, job_execution: be2b1349-e3e8-4c04-801c-63aef840bed8] Executing "sudo su - -c "hdfs dfs -mkdir -p /user/oozie/test-bb8c72ec/77f23ff4-0803-4e3f-ae54-0836ddcd0b4c/lib" oozie" _log_command /usr/lib/python3.6/site-packages/sahara/utils/ssh_remote.py:997
[instance: f8a4e63c-d70a-4e27-89df-c8441e969a20, job_execution: 3292d3fa-5cdd-4af4-8e28-30cbb940f4be] Executing "sudo su - -c "hdfs dfs -mkdir -p /user/oozie/test-18a47741/8cef1a3f-71d4-476a-8e56-55f19b702222" oozie" _log_command /usr/lib/python3.6/site-packages/sahara/utils/ssh_remote.py:997
[instance: f8a4e63c-d70a-4e27-89df-c8441e969a20, job_execution: be2b1349-e3e8-4c04-801c-63aef840bed8] "Executing "sudo su - -c "hdfs dfs -mkdir -p /user/oozie/test-bb8c72ec/77f23ff4-0803-4e3f-ae54-0836ddcd0b4c/lib" oozie"" took 2.9 seconds to complete _log_command /usr/lib/python3.6/site-packages/sahara/utils/ssh_remote.py:997
[instance: f8a4e63c-d70a-4e27-89df-c8441e969a20, job_execution: be2b1349-e3e8-4c04-801c-63aef840bed8] Writing file "/tmp/test-78187623-hadoop-mapreduce-examples-2.6.0.jar" _log_command /usr/lib/python3.6/site-packages/sahara/utils/ssh_remote.py:997
[instance: f8a4e63c-d70a-4e27-89df-c8441e969a20, job_execution: be2b1349-e3e8-4c04-801c-63aef840bed8] "Writing file "/tmp/test-78187623-hadoop-mapreduce-examples-2.6.0.jar"" took 0.1 seconds to complete _log_command /usr/lib/python3.6/site-packages/sahara/utils/ssh_remote.py:997
[instance: none, job_execution: be2b1349-e3e8-4c04-801c-63aef840bed8] Can't run job execution (reason: TypeError: initial_value must be str or None, not bytes

Error ID: d4e5b623-2c15-438c-8654-709a57172a75): sahara.exceptions.SubprocessException: TypeError: initial_value must be str or None, not bytes
Error ID: d4e5b623-2c15-438c-8654-709a57172a75

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/sahara/service/edp/job_manager.py", line 134, in run_job
    _run_job(job_execution_id)
  File "/usr/lib/python3.6/site-packages/sahara/service/edp/job_manager.py", line 113, in _run_job
    jid, status, extra = eng.run_job(job_execution)
  File "/usr/lib/python3.6/site-packages/sahara/service/edp/oozie/engine.py", line 215, in run_job
    prepared_job_params = self._prepare_run_job(job_execution)
  File "/usr/lib/python3.6/site-packages/sahara/service/edp/oozie/engine.py", line 193, in _prepare_run_job
    proxy_configs)
  File "/usr/lib/python3.6/site-packages/sahara/service/edp/oozie/engine.py", line 369, in _upload_job_files_to_hdfs
    lib_dir))
  File "/usr/lib/python3.6/site-packages/sahara/service/edp/oozie/engine.py", line 384, in _upload_job_binaries
    remote=r, context=context.ctx())
  File "/usr/lib/python3.6/site-packages/sahara/service/edp/job_binaries/internal_db/implementation.py", line 38, in copy_binary_to_cluster
    r.write_file_to(dst, raw)
  File "/usr/lib/python3.6/site-packages/sahara/utils/ssh_remote.py", line 900, in write_file_to
    remote_file, data, run_as_root)
  File "/usr/lib/python3.6/site-packages/sahara/utils/ssh_remote.py", line 1019, in _run_s
    return self._run_with_log(func, timeout, description, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/sahara/utils/ssh_remote.py", line 807, in _run_with_log
    return self._run(func, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/sahara/utils/ssh_remote.py", line 1015, in _run
    return procutils.run_in_subprocess(self.proc, func, args, kwargs)
  File "/usr/lib/python3.6/site-packages/sahara/utils/procutils.py", line 56, in run_in_subprocess
    raise exceptions.SubprocessException(result['exception'])
sahara.exceptions.SubprocessException: TypeError: initial_value must be str or None, not bytes
Error ID: d4e5b623-2c15-438c-8654-709a57172a75


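I suspect a Python 3 porting error: the TypeError text matches what io.StringIO raises when it is given bytes, and it surfaces while write_file_to is uploading the job binary to the cluster.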
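
A minimal sketch of how this class of error arises on Python 3, assuming the message comes from bytes being passed to io.StringIO. It is not the merged Sahara fix; make_file_object and reproduce_error are hypothetical names used only for illustration, and picking the buffer type from the data type is one common way to handle it, not necessarily the approach taken upstream.

```python
import io


def reproduce_error(data):
    """Return the TypeError message raised when `data` is given to io.StringIO.

    Hypothetical helper: returns None if no TypeError is raised.
    """
    try:
        io.StringIO(data)
    except TypeError as exc:
        return str(exc)
    return None


def make_file_object(data):
    """Wrap `data` in an in-memory file object that matches its type.

    Hypothetical helper, not part of Sahara: bytes (for example the raw
    contents of a job binary such as a .jar) go into BytesIO, text goes
    into StringIO.
    """
    if isinstance(data, bytes):
        return io.BytesIO(data)
    return io.StringIO(data)


if __name__ == "__main__":
    raw = b"PK\x03\x04 pretend jar contents"
    # On Python 3 this prints the same TypeError text seen in sahara-engine.log.
    print(reproduce_error(raw))
    # The type-aware wrapper accepts the same bytes without error.
    print(make_file_object(raw).read() == raw)
```

On Python 2 the same data would have been a plain str, which is why code of this shape only starts failing after a move to Python 3.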


See also the upstream bug (OpenStack Storyboard 2006258, listed under Links).


Version-Release number of selected component (if applicable):
openstack-sahara-10.0.1-0.20190503161819.994ff8c.el8ost
python3-sahara-tests-0.9.0-0.20190704160411.994b21a.el8ost

Comment 6 Luigi Toscano 2019-12-04 16:30:26 UTC
Changing the component - it was an issue in the core, not in the plugin.

Comment 8 Luigi Toscano 2020-02-14 19:36:17 UTC
The error is no longer visible, and the HDP 2.6 scenario tests with Ambari 2.6 pass:

Results of testing plugin ambari 2.6
+-----------------------------+--------+-------------+----------------------------+
| Check                       | Status | Duration, s |         Start time         |
+-----------------------------+--------+-------------+----------------------------+
| Create node group templates |   OK   |      9      | 2020-02-14 18:28:08.219203 |
| Set flavor                  |   OK   |      0      | 2020-02-14 18:28:09.416057 |
| Set flavor                  |   OK   |      0      | 2020-02-14 18:28:11.585519 |
| Set flavor                  |   OK   |      0      | 2020-02-14 18:28:14.370760 |
| Create cluster template     |   OK   |      5      | 2020-02-14 18:28:17.293615 |
| Create cluster              |   OK   |      13     | 2020-02-14 18:28:22.922648 |
| Check cluster state         |   OK   |     2517    | 2020-02-14 18:28:36.689039 |
| Check cinder volumes        |   OK   |      0      | 2020-02-14 19:10:34.111096 |
| Check event logs            |   OK   |      0      | 2020-02-14 19:10:35.403726 |
| Check EDP jobs              |   OK   |     168     | 2020-02-14 19:10:35.405871 |
| Check cluster verification  |   OK   |      0      | 2020-02-14 19:13:22.804417 |
| Cluster scaling             |   OK   |     851     | 2020-02-14 19:13:23.583025 |
| Check EDP jobs              |   OK   |     143     | 2020-02-14 19:27:34.783401 |
| Check cluster verification  |   OK   |      0      | 2020-02-14 19:29:57.429343 |
+-----------------------------+--------+-------------+----------------------------+


Verified on:
openstack-sahara-common-10.0.1-0.20191022230426.1fa76ce.el8ost.noarch
openstack-sahara-engine-10.0.1-0.20191022230426.1fa76ce.el8ost.noarch
python3-sahara-10.0.1-0.20191022230426.1fa76ce.el8ost.noarch
python3-sahara-plugin-ambari-1.0.1-0.20191212120442.9f0c190.el8ost.noarch
python3-sahara-plugin-cdh-1.0.2-0.20191212130444.09ce917.el8ost.noarch
python3-sahara-plugin-mapr-1.0.2-0.20191212130444.f58087b.el8ost.noarch
python3-saharaclient-2.2.1-0.20190701100404.564380a.el8ost.noarch

