Bug 1787914 - stream does not work in non-block mode with event loop implementation on Fibers
Summary: stream does not work in non-block mode with event loop implementation on Fibers
Keywords:
Status: NEW
Alias: None
Product: Virtualization Tools
Classification: Community
Component: ruby-libvirt
Version: unspecified
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-01-05 13:14 UTC by Denis
Modified: 2020-02-02 17:49 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed:
Embargoed:


Attachments
script that reproduces an error (2.51 KB, text/plain)
2020-01-05 13:14 UTC, Denis
patch that fixes issue (4.28 KB, patch)
2020-01-14 13:15 UTC, Denis
senid231: review+

Description Denis 2020-01-05 13:14:30 UTC
Created attachment 1649958 [details]
script that reproduces an error

Description of problem:

libvirt_stream_event_add_callback creates the passthrough variable using rb_ary_new2 and passes it to virStreamEventAddCallback at ext/libvirt/stream.c:272.
After that, the Ruby garbage collector frees this object (I think because there are no references to passthrough in Ruby code).

Version-Release number of selected component (if applicable): ruby-libvirt-0.7.1


How reproducible:

Can be reproduced using a libvirt event loop implemented on Fibers.
A script is provided in the attachments.

Steps to Reproduce:
1. Use Ruby >= 2.4.4
2. gem install ruby-libvirt -v 0.7.1
3. gem install libvirt_async -v 0.2.1
4. ruby test_ruby_libvirt_stream.rb HV_URI DOMAIN_UUID

Replace HV_URI with your hypervisor URI.
Replace DOMAIN_UUID with your domain UUID.

Actual results:

TypeError: wrong domain event lifecycle callback argument type (expected Array)

The script does not exit because it waits for the callback to be called.

Expected results:

screenshot saved /current/path/screenshot.pnm
exit 0

Additional info:

If you disable GC while running the script, like this:

GC_DISABLE=1 ruby test_ruby_libvirt_stream.rb HV_URI DOMAIN_UUID

it works correctly.
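
Ruby itself does not read a GC_DISABLE variable, so the attached script presumably checks it on startup. The script is not shown here; the following is only a minimal sketch of how such a check could look. The helper name `disable_gc_if_requested` is hypothetical, while `GC.disable` is the standard Ruby call.

```ruby
# Hypothetical helper: turn off Ruby's garbage collector when GC_DISABLE=1.
# The variable name comes from the report; how the attached script actually
# reads it is an assumption.
def disable_gc_if_requested(env = ENV)
  return false unless env['GC_DISABLE'] == '1'

  GC.disable # standard Ruby API: stops GC runs until GC.enable is called
  true
end

disable_gc_if_requested
```

With the collector off, the passthrough object can never be freed, which is consistent with the script succeeding in that mode.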

Comment 1 Denis 2020-01-06 09:14:52 UTC
Sorry for bad English.

If you call the script with the GC_DISABLE=1 environment variable, the screenshot is saved correctly.
This means that the GC frees the passthrough object while it is still in use.

I have another hotfix for that.
I just return passthrough from the stream_event_add_callback function and store it in an instance variable of a singleton until streaming is finished, failed, or cancelled.
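
The hotfix code itself is not included in this report. As a rough illustration of the idea (keep a strong Ruby reference to passthrough for as long as the stream is active), a singleton holder might look like the sketch below. The class `PassthroughHolder` and its methods are hypothetical names, not part of ruby-libvirt.

```ruby
require 'singleton'

# Hypothetical holder that keeps passthrough objects reachable from Ruby,
# so the garbage collector cannot reclaim them while a stream is active.
class PassthroughHolder
  include Singleton

  def initialize
    @held = {}
  end

  # Keep a strong reference, keyed by the stream it belongs to.
  def hold(stream, passthrough)
    @held[stream.object_id] = passthrough
  end

  # Drop the reference once streaming has finished, failed, or been cancelled.
  # Returns the released object, or nil if nothing was held.
  def release(stream)
    @held.delete(stream.object_id)
  end

  def holding?(stream)
    @held.key?(stream.object_id)
  end
end
```

In use, the value returned by the patched callback registration would be passed to `PassthroughHolder.instance.hold(stream, passthrough)`, and `release(stream)` would be called from the code path that ends the stream.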

Comment 2 Michal Privoznik 2020-01-06 10:10:27 UTC
Adding Chris, who has the most knowledge in these bindings.

Comment 3 Denis 2020-01-13 11:53:20 UTC
The same bug exists at ext/libvirt/connect.c:812, in libvirt_connect_domain_event_register_any.

passthrough is created as a Ruby object and is later destroyed by the GC.

My temporary solution is similar to the one for the previous problem: I just return passthrough to Ruby code and store it until the event is deregistered.

But it looks very ugly to me.
I think the proper way to handle such a case would be to tell the GC to ignore this object and clean it up manually (maybe bind it to a C object and free it on destruction).

Comment 4 Denis 2020-01-14 13:15:32 UTC
Created attachment 1652211 [details]
patch that fixes issue

I've found a better way to handle this. I store *passthrough* as an instance variable until it is no longer needed.

Take a look at the attached patch.
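
The patch works inside the C extension and is not reproduced here. The Ruby sketch below only illustrates the lifetime rule it describes, under the assumption that the instance variable lives on the object that registered the callback. `FakeStream` and its methods are hypothetical stand-ins, not the ruby-libvirt API.

```ruby
# Hypothetical stand-in for a stream object; the real patch is written in C.
class FakeStream
  # Remember passthrough on the stream itself, so it stays reachable
  # for as long as the callback is registered.
  def add_callback(passthrough)
    @passthrough = passthrough
  end

  # Forget it once the callback is removed, letting the GC reclaim it.
  def remove_callback
    @passthrough = nil
  end

  def passthrough_held?
    !@passthrough.nil?
  end
end
```

Compared with a global holder, this ties the passthrough lifetime to the owning object, so nothing has to be returned to the caller and stored separately.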

Comment 5 Denis 2020-02-02 17:49:04 UTC
Update

Looks like ruby-libvirt is not maintained anymore.

I found another GC bug (it looks like the GC frees a stream that still has references) and a memory leak (each screenshot taken leaks 1-3 MB).

I've decided to rewrite the Ruby bindings using FFI.
I've implemented the main functions from the host, domain, and screenshot parts, and so far no memory leaks or GC issues have been found.

If you encounter similar problems, you can try my implementation on GitHub (senid231/libvirt_ffi).
https://github.com/senid231/libvirt_ffi

PS
Also, I've found another bug related to restarting libvirtd.
When the server restarts, the client application hangs because the libvirt C code tries to acquire the same lock twice in the same thread.
Currently, this issue reproduces in my implementation too.
I'm planning to fix it soon.

