Bug 1640086

Summary: [ceph-ansible]: 3.2 installation is failing with ValueError("No JSON object could be decoded")
Product: Red Hat Ceph Storage
Reporter: Parikshith <pbyregow>
Component: Ceph-Volume
Assignee: Alfredo Deza <adeza>
Status: CLOSED ERRATA
QA Contact: Parikshith <pbyregow>
Severity: urgent
Priority: urgent
Version: 3.2
CC: adeza, anharris, aschoen, ceph-eng-bugs, ceph-qe-bugs, gmeno, hgurav, hnallurv, kdreyer, nthomas, pbyregow, rperiyas, sankarshan, shan, tserlin, vashastr
Target Milestone: rc
Target Release: 3.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: RHEL: ceph-12.2.8-21.el7cp Ubuntu: ceph_12.2.8-19redhat1
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-01-03 19:02:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
  install-log (flags: none)
  File contains osds.yml, all.yml and rolling update log snippet (flags: none)
  ansible log for comment 26 run (flags: none)

Description Parikshith 2018-10-17 11:01:26 UTC
Created attachment 1494783 [details]
install-log

Description of problem:
ceph-volume OSDs fail to get configured on the latest 3.2 build (RHEL/container).

Version-Release number of selected component (if applicable):
ansible-2.6.5-1.el7ae.noarch
ceph-ansible-3.2.0-0.1.beta5.el7cp.noarch
ceph version 12.2.8-16.el7cp
Container image-tag: ceph-3.2-rhel-7-containers-candidate-16927-20181016222443

How reproducible:
always

Steps to Reproduce:
1. Install RHCS 3.2 with ceph-volume osds


Actual results:
The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_ggKjF8/ansible_module_ceph_volume.py", line 625, in <module>
    main()
  File "/tmp/ansible_ggKjF8/ansible_module_ceph_volume.py", line 621, in main
    run_module()
  File "/tmp/ansible_ggKjF8/ansible_module_ceph_volume.py", line 521, in run_module
    out_dict = json.loads(out)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

failed: [magna005] (item={'wal_vg': 'vg1', 'data_vg': 'vg1', 'wal': 'wal-lv1', 'data': 'data-lv3'}) => {
    "changed": false, 
    "item": {
        "data": "data-lv3", 
        "data_vg": "vg1", 
        "wal": "wal-lv1", 
        "wal_vg": "vg1"
    }, 
    "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_ggKjF8/ansible_module_ceph_volume.py\", line 625, in <module>\n    main()\n  File \"/tmp/ansible_ggKjF8/ansible_module_ceph_volume.py\", line 621, in main\n    run_module()\n  File \"/tmp/ansible_ggKjF8/ansible_module_ceph_volume.py\", line 521, in run_module\n    out_dict = json.loads(out)\n  File \"/usr/lib64/python2.7/json/__init__.py\", line 338, in loads\n    return _default_decoder.decode(s)\n  File \"/usr/lib64/python2.7/json/decoder.py\", line 366, in decode\n    obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n  File \"/usr/lib64/python2.7/json/decoder.py\", line 384, in raw_decode\n    raise ValueError(\"No JSON object could be decoded\")\nValueError: No JSON object could be decoded\n", 
    "module_stdout": "", 
    "msg": "MODULE FAILURE", 
    "rc": 1
}
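For context, the failure mode is straightforward to reproduce: per the traceback, the module (ceph_volume.py line 521) calls json.loads() directly on the captured stdout of the ceph-volume command, so an empty or non-JSON stdout (warnings, container startup noise, or no output at all) surfaces as this opaque MODULE FAILURE. A minimal defensive sketch, not the actual module code:

```python
import json

def parse_ceph_volume_json(out):
    # The module does json.loads(out) directly on ceph-volume's stdout.
    # If ceph-volume prints nothing, or prints non-JSON text, Python 2's
    # json module raises ValueError("No JSON object could be decoded").
    # This sketch surfaces the raw output instead of crashing opaquely.
    try:
        return json.loads(out)
    except ValueError:  # JSONDecodeError subclasses ValueError on Python 3
        raise RuntimeError(
            "ceph-volume did not return JSON; stdout was: %r" % out)
```

Running this against an empty string (as in the failed task, where module_stdout is "") produces a readable error naming the bad output instead of a bare traceback.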


Expected results:
ceph-volume OSDs should get configured successfully.

Additional info:

Osd scenario:
[osds]
magna005 lvm_volumes="[{'data':'data-lv1','data_vg':'vg1'},{'data':'data-lv2','data_vg':'vg1','db':'db-lv1','db_vg':'vg1'},{'data':'data-lv3','data_vg':'vg1','wal':'wal-lv1','wal_vg':'vg1'},{'data':'data-lv4','data_vg':'vg1','db':'db-lv2','db_vg':'vg1','wal':'wal-lv2','wal_vg':'vg1'}]" osd_scenario="lvm" osd_objectstore="bluestore"

magna006 lvm_volumes="[{'data':'data-lv1','data_vg':'vg1'},{'data':'data-lv2','data_vg':'vg1','db':'db-lv1','db_vg':'vg1'},{'data':'data-lv3','data_vg':'vg1','wal':'wal-lv1','wal_vg':'vg1'},{'data':'data-lv4','data_vg':'vg1','db':'db-lv2','db_vg':'vg1','wal':'wal-lv2','wal_vg':'vg1'}]" osd_scenario="lvm" osd_objectstore="bluestore" dmcrypt="True"
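For each lvm_volumes entry above, the lvm scenario is expected to build a ceph-volume lvm create invocation with --data and optional --block.db/--block.wal arguments. A hypothetical sketch of that mapping (not the actual ceph_volume module code, which also handles filestore, dmcrypt, and other options):

```python
def lvm_volume_to_args(item):
    # Map one lvm_volumes inventory entry to the ceph-volume CLI
    # arguments a bluestore OSD creation is expected to use.
    # LVs are referenced as "vg/lv".
    args = ['ceph-volume', '--cluster', 'ceph', 'lvm', 'create',
            '--bluestore',
            '--data', '%s/%s' % (item['data_vg'], item['data'])]
    if 'db' in item:
        args += ['--block.db', '%s/%s' % (item['db_vg'], item['db'])]
    if 'wal' in item:
        args += ['--block.wal', '%s/%s' % (item['wal_vg'], item['wal'])]
    return args
```

For the failing item in the log ({'data': 'data-lv3', 'data_vg': 'vg1', 'wal': 'wal-lv1', 'wal_vg': 'vg1'}) this yields a create call with --data vg1/data-lv3 and --block.wal vg1/wal-lv1.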

all.yml: 
fetch_directory: ~/ceph-key
ceph_origin: distro
ceph_repository: rhcs
monitor_interface: eno1  
public_network: 10.8.128.0/21
ceph_docker_registry: brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888
containerized_deployment: true
ceph_docker_image_tag: ceph-3.2-rhel-7-containers-candidate-16927-20181016222443  
ceph_docker_image: "rhceph"

I have attached the install log (ansible run with -vvv).

Comment 3 leseb 2018-10-17 11:54:36 UTC
ceph-volume related, assigning to Andrew.

Comment 4 Andrew Schoen 2018-10-17 12:16:59 UTC
(In reply to leseb from comment #3)
> ceph-volume related, assigning to Andrew.

This is related to the container implementation of ceph-volume, which I'm unsure how to debug. The traceback suggests we're missing downstream patches for ceph-volume, though. I believe we'll need the PR that was merged into upstream luminous just yesterday brought downstream: https://github.com/ceph/ceph/pull/24589

Ken, do we plan to wait for 12.2.9 upstream and then use that for 3.2 downstream? This would save us carrying lots of patches downstream for ceph-volume.

Comment 5 leseb 2018-10-17 12:58:27 UTC
*** Bug 1640029 has been marked as a duplicate of this bug. ***

Comment 6 leseb 2018-10-17 13:32:00 UTC
Can I get into the env?
Thanks

Comment 9 leseb 2018-10-17 16:03:38 UTC
Thanks, I reported the bug on ceph-volume here: http://tracker.ceph.com/issues/36492

Comment 10 Harish NV Rao 2018-10-22 07:26:15 UTC
(In reply to leseb from comment #9)
> Thanks, I reported the bug on ceph-volume here:
> http://tracker.ceph.com/issues/36492

@Sebastien, when can we expect the fix for this? This is currently a test blocker.

Comment 11 leseb 2018-10-22 07:30:35 UTC
It's a ceph-volume bug so I'll defer that question to Alfredo

Comment 15 Vasishta 2018-10-24 08:35:26 UTC
Created attachment 1496950 [details]
File contains osds.yml, all.yml and rolling update log snippet

Tried a rolling update using ceph-ansible-3.2.0-0.1.beta8.el7cp.noarch to upgrade the cluster from 12.2.8-19 to 12.2.8-20.

The playbook failed in the task "ceph-osd : use ceph-volume to create bluestore osds" with the same issue as mentioned in BZ 1640029 [1].

Can you please check whether the issue still persists, or whether we are missing something?

Regards,
Vasishta shatsry
QE, Ceph

[1] Was closed as duplicate of this bug.

Comment 16 Ramakrishnan Periyasamy 2018-10-24 10:57:21 UTC
The RHEL ceph-volume LVM installation fails with the error below; the same error was observed for both bluestore and filestore.

The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_2Exp4u/ansible_module_ceph_volume.py", line 625, in <module>
    main()
  File "/tmp/ansible_2Exp4u/ansible_module_ceph_volume.py", line 621, in main
    run_module()
  File "/tmp/ansible_2Exp4u/ansible_module_ceph_volume.py", line 521, in run_module
    out_dict = json.loads(out)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

failed: [magna060] (item={'journal': 'journal_lv2', 'data': '/dev/sdc', 'journal_vg': 'journal_vg'}) => {
    "changed": false, 
    "item": {
        "data": "/dev/sdc", 
        "journal": "journal_lv2", 
        "journal_vg": "journal_vg"
    }, 
    "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_2Exp4u/ansible_module_ceph_volume.py\", line 625, in <module>\n    main()\n  File \"/tmp/ansible_2Exp4u/ansible_module_ceph_volume.py\", line 621, in main\n    run_module()\n  File \"/tmp/ansible_2Exp4u/ansible_module_ceph_volume.py\", line 521, in run_module\n    out_dict = json.loads(out)\n  File \"/usr/lib64/python2.7/json/__init__.py\", line 338, in loads\n    return _default_decoder.decode(s)\n  File \"/usr/lib64/python2.7/json/decoder.py\", line 366, in decode\n    obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n  File \"/usr/lib64/python2.7/json/decoder.py\", line 384, in raw_decode\n    raise ValueError(\"No JSON object could be decoded\")\nValueError: No JSON object could be decoded\n", 
    "module_stdout": "", 
    "msg": "MODULE FAILURE", 
    "rc": 1
}

Comment 20 Ramakrishnan Periyasamy 2018-10-24 12:30:25 UTC
Hi Alfredo,

The cluster is containerized, so it will not create logs in /var/log/ceph; the logs will be in journalctl.

Comment 27 Ramakrishnan Periyasamy 2018-10-25 08:50:42 UTC
Created attachment 1497367 [details]
ansible log for comment 26 run

Comment 28 Ramakrishnan Periyasamy 2018-10-25 08:51:32 UTC
Moving the bug to assigned state.

Comment 29 Alfredo Deza 2018-10-25 10:47:29 UTC
Assigning back to Sebastien, as those are container errors.

Comment 30 leseb 2018-10-25 10:49:39 UTC
Ramakrishnan Periyasamy, I just resynced the container image downstream, which will include the fix for your current issue. This shouldn't have happened in the first place. Please move this to VERIFIED, since the original issue has been fixed. Thanks.

In the meantime, please wait for the new container build and let's take the discussion in: 1630977

Thanks.

Comment 32 Ramakrishnan Periyasamy 2018-10-26 11:35:13 UTC
Hi Ken, I would request moving this bug to the ON_QA state.

Installation of the ceph-volume LVM-based OSD scenarios in containers is working for both filestore and bluestore.

Verified in versions:
ansible-2.6.6-1.el7ae.noarch
ceph-ansible-3.2.0-0.1.beta9.el7cp.noarch
ceph version 12.2.8-22.el7cp (400dc4070c1d0a82f9afc8e574d780136dd28b0b) luminous (stable)

I will move the bug to the VERIFIED state once it is in ON_QA.

Comment 33 Ramakrishnan Periyasamy 2018-10-26 11:47:04 UTC
Thanks Alfredo :)

Moving this bug to verified state.

Comment 36 errata-xmlrpc 2019-01-03 19:02:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0020