Bug 1640086 - [ceph-ansible]: 3.2 Installation is failing with ValueError("No JSON object could be decoded")
Summary: [ceph-ansible]: 3.2 Installation is failing with ValueError("No JSON object could be decoded")
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Volume
Version: 3.2
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 3.2
Assignee: Alfredo Deza
QA Contact: Parikshith
URL:
Whiteboard:
Duplicates: 1640029
Depends On:
Blocks:
 
Reported: 2018-10-17 11:01 UTC by Parikshith
Modified: 2019-01-03 19:02 UTC
CC: 16 users

Fixed In Version: RHEL: ceph-12.2.8-21.el7cp Ubuntu: ceph_12.2.8-19redhat1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-03 19:02:09 UTC
Embargoed:


Attachments (Terms of Use)
install-log (1.74 MB, text/plain)
2018-10-17 11:01 UTC, Parikshith
no flags
File contains osds.yml, all.yml and rolling update log snippet (12.71 KB, text/plain)
2018-10-24 08:35 UTC, Vasishta
no flags
ansible log for comment 26 run (2.43 MB, text/plain)
2018-10-25 08:50 UTC, Ramakrishnan Periyasamy
no flags


Links
System ID Private Priority Status Summary Last Updated
Github ceph ceph pull 24738 0 None closed ceph-volume: do not send (lvm) stderr/stdout to the terminal, use the logfile 2020-11-22 17:17:32 UTC
Red Hat Product Errata RHBA-2019:0020 0 None None None 2019-01-03 19:02:17 UTC

Description Parikshith 2018-10-17 11:01:26 UTC
Created attachment 1494783 [details]
install-log

Description of problem:
ceph-volume OSDs fail to be configured on the latest 3.2 build (RHEL/container).

Version-Release number of selected component (if applicable):
ansible-2.6.5-1.el7ae.noarch
ceph-ansible-3.2.0-0.1.beta5.el7cp.noarch
ceph version 12.2.8-16.el7cp
Container image-tag: ceph-3.2-rhel-7-containers-candidate-16927-20181016222443

How reproducible:
always

Steps to Reproduce:
1. Install RHCS 3.2 with ceph-volume OSDs (osd_scenario: lvm)


Actual results:
The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_ggKjF8/ansible_module_ceph_volume.py", line 625, in <module>
    main()
  File "/tmp/ansible_ggKjF8/ansible_module_ceph_volume.py", line 621, in main
    run_module()
  File "/tmp/ansible_ggKjF8/ansible_module_ceph_volume.py", line 521, in run_module
    out_dict = json.loads(out)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

failed: [magna005] (item={'wal_vg': 'vg1', 'data_vg': 'vg1', 'wal': 'wal-lv1', 'data': 'data-lv3'}) => {
    "changed": false, 
    "item": {
        "data": "data-lv3", 
        "data_vg": "vg1", 
        "wal": "wal-lv1", 
        "wal_vg": "vg1"
    }, 
    "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_ggKjF8/ansible_module_ceph_volume.py\", line 625, in <module>\n    main()\n  File \"/tmp/ansible_ggKjF8/ansible_module_ceph_volume.py\", line 621, in main\n    run_module()\n  File \"/tmp/ansible_ggKjF8/ansible_module_ceph_volume.py\", line 521, in run_module\n    out_dict = json.loads(out)\n  File \"/usr/lib64/python2.7/json/__init__.py\", line 338, in loads\n    return _default_decoder.decode(s)\n  File \"/usr/lib64/python2.7/json/decoder.py\", line 366, in decode\n    obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n  File \"/usr/lib64/python2.7/json/decoder.py\", line 384, in raw_decode\n    raise ValueError(\"No JSON object could be decoded\")\nValueError: No JSON object could be decoded\n", 
    "module_stdout": "", 
    "msg": "MODULE FAILURE", 
    "rc": 1
}
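
The linked fix (ceph pull 24738: "do not send (lvm) stderr/stdout to the terminal, use the logfile") points at the likely cause: the ceph_volume module parses the command's stdout with json.loads, and any lvm messages mixed into that stream (or an empty stream) leave nothing decodable. A minimal Python sketch of the failure mode, assuming output of roughly this shape; the strings are illustrative, not captured from the actual run:

import json

# Expected case: ceph-volume reports pure JSON on stdout.
clean = '{"0": [{"type": "data", "path": "/dev/vg1/data-lv3"}]}'
json.loads(clean)  # decodes fine

# Failure case: lvm chatter lands on the same stream, so the
# decoder finds no JSON object at position 0 and raises.
polluted = "WARNING: lvmetad is running but disabled.\n" + clean
try:
    json.loads(polluted)
except ValueError as err:
    print(err)  # Python 2.7: "No JSON object could be decoded"
# json.loads("") raises the same ValueError when output is empty.

Routing lvm's stderr/stdout to the ceph-volume logfile, as the linked pull request does, keeps stdout machine-readable for the module.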


Expected results:
ceph-volume OSDs should be configured successfully.

Additional info:

OSD scenario (see the command-mapping sketch after the inventory below):
[osds]
magna005 lvm_volumes="[{'data':'data-lv1','data_vg':'vg1'},{'data':'data-lv2','data_vg':'vg1','db':'db-lv1','db_vg':'vg1'},{'data':'data-lv3','data_vg':'vg1','wal':'wal-lv1','wal_vg':'vg1'},{'data':'data-lv4','data_vg':'vg1','db':'db-lv2','db_vg':'vg1','wal':'wal-lv2','wal_vg':'vg1'}]" osd_scenario="lvm" osd_objectstore="bluestore"

magna006 lvm_volumes="[{'data':'data-lv1','data_vg':'vg1'},{'data':'data-lv2','data_vg':'vg1','db':'db-lv1','db_vg':'vg1'},{'data':'data-lv3','data_vg':'vg1','wal':'wal-lv1','wal_vg':'vg1'},{'data':'data-lv4','data_vg':'vg1','db':'db-lv2','db_vg':'vg1','wal':'wal-lv2','wal_vg':'vg1'}]" osd_scenario="lvm" osd_objectstore="bluestore" dmcrypt="True"
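
Each lvm_volumes entry above corresponds to one ceph-volume lvm create invocation made by the ceph_volume module. A rough sketch of that mapping, assuming a hypothetical build_cmd helper (the --data, --block.db, --block.wal, and --dmcrypt flags are real ceph-volume options; the helper itself is illustrative, not the module's actual code):

# Illustrative mapping from an lvm_volumes item to a command line.
def build_cmd(item, objectstore="bluestore", dmcrypt=False):
    cmd = ["ceph-volume", "lvm", "create", "--%s" % objectstore,
           "--data", "%s/%s" % (item["data_vg"], item["data"])]
    if "db" in item:   # optional bluestore metadata DB LV
        cmd += ["--block.db", "%s/%s" % (item["db_vg"], item["db"])]
    if "wal" in item:  # optional bluestore write-ahead log LV
        cmd += ["--block.wal", "%s/%s" % (item["wal_vg"], item["wal"])]
    if dmcrypt:        # matches dmcrypt="True" on magna006
        cmd.append("--dmcrypt")
    return cmd

# The item that fails in the traceback above:
print(build_cmd({'data': 'data-lv3', 'data_vg': 'vg1',
                 'wal': 'wal-lv1', 'wal_vg': 'vg1'}))
# ['ceph-volume', 'lvm', 'create', '--bluestore',
#  '--data', 'vg1/data-lv3', '--block.wal', 'vg1/wal-lv1']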

all.yml: 
fetch_directory: ~/ceph-key
ceph_origin: distro
ceph_repository: rhcs
monitor_interface: eno1  
public_network: 10.8.128.0/21
ceph_docker_registry: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888
containerized_deployment: true
ceph_docker_image_tag: ceph-3.2-rhel-7-containers-candidate-16927-20181016222443  
ceph_docker_image: "rhceph"

I have attached the install log (-vvv).

Comment 3 Sébastien Han 2018-10-17 11:54:36 UTC
ceph-volume related, assigning to Andrew.

Comment 4 Andrew Schoen 2018-10-17 12:16:59 UTC
(In reply to leseb from comment #3)
> ceph-volume related, assigning to Andrew.

This is related to the container implementation of ceph-volume, which I'm unsure how to debug. The traceback I see looks like we're missing downstream patches for ceph-volume, though. I believe we'll need this PR, which was merged into upstream luminous just yesterday, brought downstream: https://github.com/ceph/ceph/pull/24589

Ken, do we plan to wait for 12.2.9 upstream and then use that for 3.2 downstream? This would save us carrying lots of patches downstream for ceph-volume.

Comment 5 Sébastien Han 2018-10-17 12:58:27 UTC
*** Bug 1640029 has been marked as a duplicate of this bug. ***

Comment 6 Sébastien Han 2018-10-17 13:32:00 UTC
Can I get into the env?
Thanks

Comment 9 Sébastien Han 2018-10-17 16:03:38 UTC
Thanks, I reported the bug on ceph-volume here: http://tracker.ceph.com/issues/36492

Comment 10 Harish NV Rao 2018-10-22 07:26:15 UTC
(In reply to leseb from comment #9)
> Thanks, I reported the bug on ceph-volume here:
> http://tracker.ceph.com/issues/36492

@Sebastien, when can we expect the fix for this? This is currently a test blocker.

Comment 11 Sébastien Han 2018-10-22 07:30:35 UTC
It's a ceph-volume bug, so I'll defer that question to Alfredo.

Comment 15 Vasishta 2018-10-24 08:35:26 UTC
Created attachment 1496950 [details]
File contains osds.yml, all.yml and rolling update log snippet

Tried a rolling update using ceph-ansible-3.2.0-0.1.beta8.el7cp.noarch to upgrade the cluster from 12.2.8-19 to 12.2.8-20.

The playbook failed in the task "ceph-osd : use ceph-volume to create bluestore osds" with the same issue as mentioned in BZ 1640029 [1].

Can you please check whether the issue still persists or whether we are missing something?

Regards,
Vasishta shatsry
QE, Ceph

[1] Was closed as duplicate of this bug.

Comment 16 Ramakrishnan Periyasamy 2018-10-24 10:57:21 UTC
RHEL ceph-volume lvm installation fails with the error below; the same error was observed for both bluestore and filestore.

The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_2Exp4u/ansible_module_ceph_volume.py", line 625, in <module>
    main()
  File "/tmp/ansible_2Exp4u/ansible_module_ceph_volume.py", line 621, in main
    run_module()
  File "/tmp/ansible_2Exp4u/ansible_module_ceph_volume.py", line 521, in run_module
    out_dict = json.loads(out)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

failed: [magna060] (item={'journal': 'journal_lv2', 'data': '/dev/sdc', 'journal_vg': 'journal_vg'}) => {
    "changed": false, 
    "item": {
        "data": "/dev/sdc", 
        "journal": "journal_lv2", 
        "journal_vg": "journal_vg"
    }, 
    "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_2Exp4u/ansible_module_ceph_volume.py\", line 625, in <module>\n    main()\n  File \"/tmp/ansible_2Exp4u/ansible_module_ceph_volume.py\", line 621, in main\n    run_module()\n  File \"/tmp/ansible_2Exp4u/ansible_module_ceph_volume.py\", line 521, in run_module\n    out_dict = json.loads(out)\n  File \"/usr/lib64/python2.7/json/__init__.py\", line 338, in loads\n    return _default_decoder.decode(s)\n  File \"/usr/lib64/python2.7/json/decoder.py\", line 366, in decode\n    obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n  File \"/usr/lib64/python2.7/json/decoder.py\", line 384, in raw_decode\n    raise ValueError(\"No JSON object could be decoded\")\nValueError: No JSON object could be decoded\n", 
    "module_stdout": "", 
    "msg": "MODULE FAILURE", 
    "rc": 1
}

Comment 20 Ramakrishnan Periyasamy 2018-10-24 12:30:25 UTC
Hi Alfredo,

The cluster is containerized, so it will not create logs in /var/log/ceph; the logs will be in journalctl.

Comment 27 Ramakrishnan Periyasamy 2018-10-25 08:50:42 UTC
Created attachment 1497367 [details]
ansible log for comment 26 run

Comment 28 Ramakrishnan Periyasamy 2018-10-25 08:51:32 UTC
Moving the bug to the ASSIGNED state.

Comment 29 Alfredo Deza 2018-10-25 10:47:29 UTC
Assigning back to Sebastien, as those are container errors.

Comment 30 Sébastien Han 2018-10-25 10:49:39 UTC
Ramakrishnan Periyasamy, I just resynced the container image downstream, which will include the fix for your current issue. This shouldn't have happened in the first place. Please move this to VERIFIED, since the original issue has been fixed. Thanks.

In the meantime, please wait for the new container build, and let's take the discussion to bug 1630977.

Thanks.

Comment 32 Ramakrishnan Periyasamy 2018-10-26 11:35:13 UTC
Hi Ken, I request that this bug be moved to the ON_QA state.

Installation of ceph-volume lvm-based OSD scenarios in containers is working for both Filestore and Bluestore.

Verified in versions:
ansible-2.6.6-1.el7ae.noarch
ceph-ansible-3.2.0-0.1.beta9.el7cp.noarch
ceph version 12.2.8-22.el7cp (400dc4070c1d0a82f9afc8e574d780136dd28b0b) luminous (stable)

I will move the bug to the VERIFIED state once it is in ON_QA.

Comment 33 Ramakrishnan Periyasamy 2018-10-26 11:47:04 UTC
Thanks Alfredo :)

Moving this bug to the VERIFIED state.

Comment 36 errata-xmlrpc 2019-01-03 19:02:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0020

