Bug 2122174

Summary: engine-setup on separate DWH machine fails: Failed to execute stage 'Closing up': 'NoneType' object has no attribute 'open_sftp_client'
Product: [oVirt] ovirt-engine-dwh
Reporter: Pavel Novotny <pnovotny>
Component: Setup
Assignee: Martin Perina <mperina>
Status: CLOSED CURRENTRELEASE
QA Contact: Pavel Novotny <pnovotny>
Severity: high
Docs Contact:
Priority: high
Version: 4.5.3
CC: bugs, dfodor, didi, emarcus, lleistne, lsvaty, mperina, sradco
Target Milestone: ovirt-4.5.3
Keywords: Regression
Target Release: ---
Flags: mperina: ovirt-4.5+, lsvaty: blocker-
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ovirt-engine-4.5.3.1, ovirt-engine-dwh-4.5.7
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-12-05 12:45:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Metrics
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2113980

Description Pavel Novotny 2022-08-29 12:38:12 UTC
Created attachment 1908269: DWH setup logs

Description of problem:
Running engine-setup on a separate DWH machine fails with:

[ ERROR ] Failed to execute stage 'Closing up': 'NoneType' object has no attribute 'open_sftp_client'


Version-Release number of selected component (if applicable):
ovirt-engine-dwh-4.5.4-1.el8ev
ovirt-engine-dwh-setup-4.5.4-1.el8ev
ovirt-engine-4.5.2.4-0.1.el8ev

How reproducible:
always

Steps to Reproduce:
1. Install engine on machine A (ovirt-engine-4.5.2.4-0.1.el8ev)
2. Run engine-setup on machine B (ovirt-engine-dwh-setup-4.5.4-1.el8ev). Answer file is attached.

Actual results:
engine-setup fails:

          --== SUMMARY ==--
         
[ INFO  ] Restarting httpd
[ INFO  ] Starting dwh service
[ INFO  ] Starting Grafana service
[ ERROR ] Failed to execute stage 'Closing up': 'NoneType' object has no attribute 'open_sftp_client'


Expected results:
engine-setup finishes successfully


Additional info:
Attached log file and answer file.

Manual SSH access from DWH machine B to engine machine A worked properly.

Stacktrace:

2022-08-29 13:47:33,859+0200 DEBUG otopi.plugins.ovirt_engine_common.base.remote_engine.remote_engine_root_ssh remote_engine_root_ssh.copy_to_engine:276 Copying data to remote engine engine-ip.redhat.com:/etc/ovirt-engine/engine.conf.d/10-setup-dwh-database.conf
2022-08-29 13:47:33,860+0200 DEBUG otopi.context context._executeMethod:145 method exception
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine-dwh/core/remote_engine.py", line 125, in _closeupEngineAccess
    mode=0o600,
  File "/usr/share/ovirt-engine/setup/ovirt_engine_setup/remote_engine.py", line 94, in copy_to_engine
    mode=mode,
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-common/base/remote_engine/remote_engine_root_ssh.py", line 279, in copy_to_engine
    sf = self._client.open_sftp()
  File "/usr/lib/python3.6/site-packages/paramiko/client.py", line 541, in open_sftp
    return self._transport.open_sftp_client()
AttributeError: 'NoneType' object has no attribute 'open_sftp_client'
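The last two frames show that open_sftp() dereferences self._transport, so the error means the SSH client had no transport at that point (for example, it was never connected or had already been closed). A minimal sketch of that failure mode follows. StubSSHClient, StubTransport and open_sftp_checked are hypothetical stand-ins written for illustration; they are not paramiko or ovirt-engine code, and the sketch does not claim to show why the transport was missing in this setup run.

```python
class StubTransport:
    """Hypothetical stand-in for a connected transport."""

    def open_sftp_client(self):
        return "sftp-session"


class StubSSHClient:
    """Hypothetical stand-in that mirrors only what the traceback shows."""

    def __init__(self):
        # No transport exists until connect() is called.
        self._transport = None

    def connect(self):
        self._transport = StubTransport()

    def close(self):
        self._transport = None

    def get_transport(self):
        return self._transport

    def open_sftp(self):
        # Same shape as the failing line in the traceback.
        return self._transport.open_sftp_client()


def open_sftp_checked(client):
    """Fail with a descriptive error instead of AttributeError."""
    if client.get_transport() is None:
        raise RuntimeError("SSH client is not connected; connect first")
    return client.open_sftp()


# Calling open_sftp() without a transport reproduces the reported message.
client = StubSSHClient()
try:
    client.open_sftp()
except AttributeError as exc:
    error_message = str(exc)
```

A check like open_sftp_checked only makes the failure easier to read; the caller still has to make sure a connection exists before copying files.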

Comment 1 RHEL Program Management 2022-08-30 13:22:18 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 2 Pavel Novotny 2022-09-07 15:14:07 UTC
FailedQA in 
ovirt-engine-dwh-4.5.5-1.el8ev
ovirt-engine-dwh-setup-4.5.5-1.el8ev
ovirt-engine-4.5.2.5-0.1.el8ev


Now another error is thrown during the setup:
```
...
          The engine should be restarted for Single-Sign-On (SSO) to work. Do this as part of Setup? If not, you will have to do this later by yourself (Yes, No) [Yes]:
[ ERROR ] Failed to execute stage 'Environment customization': write() argument must be str, not bytes
...
```

Stacktrace from the setup log:
```
2022-09-07 16:33:41,054+0200 DEBUG otopi.context context._executeMethod:145 method exception
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine-grafana-dwh/core/config.py", line 224, in _customization_sso
    f.write(res)
TypeError: write() argument must be str, not bytes
```

Probably caused by this change:
https://github.com/oVirt/ovirt-dwh/pull/49/files#diff-8e84da21999dc564e3890014b4a064b88c98b4ffe6c397492a3e66e6d0e9a082R223
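
For illustration only, the TypeError above is what Python raises when bytes are written to a file opened in text mode. The sketch below reproduces it and shows one generic way to accept both str and bytes. The payload, file name and write_text helper are hypothetical; what `res` actually contains in config.py is not shown in the log, and this is not the fix that was merged.

```python
import os
import tempfile

# Hypothetical payload; stands in for whatever bytes value reached f.write().
payload = b"EXAMPLE_KEY=example-value\n"


def write_text(path, data, encoding="utf-8"):
    """Write data to a text-mode file, decoding bytes first if needed."""
    if isinstance(data, bytes):
        data = data.decode(encoding)
    with open(path, "w", encoding=encoding) as f:
        f.write(data)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.conf")

    # Writing bytes to a text-mode file raises the TypeError from the log.
    try:
        with open(path, "w") as f:
            f.write(payload)
    except TypeError as exc:
        text_mode_error = str(exc)

    # The helper accepts either type, so both kinds of caller keep working.
    write_text(path, payload)
    write_text(path, payload.decode("utf-8"))
    with open(path) as f:
        written = f.read()
```

The other generic option is to open the file in binary mode ("wb") and always pass bytes; which side to change depends on what the callers return.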

Comment 4 Martin Perina 2022-09-08 07:17:02 UTC
(In reply to Pavel Novotny from comment #2)

That's very strange: the error above is exactly why I needed to create the fix below for setup to pass successfully:

https://github.com/oVirt/ovirt-dwh/commit/885d65f0594dee08a48fbe48bdf697de7bcceb6b

Do you still have the env for further investigation?

Comment 5 Pavel Novotny 2022-09-08 11:19:20 UTC
(In reply to Martin Perina from comment #4)
...
...
> That's super strange, above error was exactly the reason why I need to
> create below fix to pass successfully:
> 
> https://github.com/oVirt/ovirt-dwh/commit/885d65f0594dee08a48fbe48bdf697de7bcceb6b
> 
> Do you still have the env for further investigation?

Environment provided (via chat).
If you need to reproduce the issue live, I can set up clean machines in the state just before engine-setup is executed.

Comment 6 Yedidyah Bar David 2022-09-08 11:52:59 UTC
Logged into the env and reproduced.

I'm still not sure about the root cause, but this workaround gets setup past this point. It reverts one of the two patches in PR 49:

# dnf install -y patch

# curl https://github.com/oVirt/ovirt-dwh/commit/885d65f0594dee08a48fbe48bdf697de7bcceb6b.diff | patch -R /usr/share/ovirt-engine/setup/plugins/ovirt-engine-setup/ovirt-engine-grafana-dwh/core/config.py

Comment 7 Yedidyah Bar David 2022-09-08 11:54:45 UTC
To undo the workaround from comment 6 above, drop the '-R':

# curl https://github.com/oVirt/ovirt-dwh/commit/885d65f0594dee08a48fbe48bdf697de7bcceb6b.diff | patch /usr/share/ovirt-engine/setup/plugins/ovirt-engine-setup/ovirt-engine-grafana-dwh/core/config.py

Comment 8 Yedidyah Bar David 2022-09-08 14:22:20 UTC
After spending some more time, summarizing what I currently know:

1. I think, but am not sure, that [1], which is part of [2], was needed for a specific flow: running setup on a separate machine and answering 'manual files' when asked how to access the remote engine. Martin is still looking at this. That is, it was needed to fix some bug X that is not really related to the current bug.

2. Martin pushed [1] to fix X, but this broke the flow of using 'ssh as root', which is what's in comment 2.

3. I verified that reverting [1] fixes the breakage of root-ssh, but reintroduces a breakage in manual-files. I pushed a revert patch [3].

4. To fix X for both flows, I pushed another patch [4]. I verified both flows with both [3] and [4] successfully.

5. I think, but did not test, that merging only [4] without [3] will break both flows. So once we decide what to do, we should probably update [3] and/or [4] to require/conflict with fixed/broken versions as applicable.

If we do nothing, BTW, another workaround is to use manual-files. That's usually less convenient (but more secure, as it does not require allowing ssh as root to the engine machine from dwh).

[1] https://github.com/oVirt/ovirt-dwh/pull/49/commits/a5a8d7d86a798c39d20fa2145f1f4bd66cfa1143
[2] https://github.com/oVirt/ovirt-dwh/pull/49
[3] https://github.com/oVirt/ovirt-dwh/pull/56
[4] https://github.com/oVirt/ovirt-engine/pull/645

Comment 10 Pavel Novotny 2022-10-30 22:59:32 UTC
Verified in
ovirt-engine-4.5.3.2-1.el8ev.noarch
ovirt-engine-dwh-4.5.7-1.el8ev.noarch

Running engine-setup on a separate DWH machine and choosing the option "1 - Access remote engine server using ssh as root" finishes successfully.