Bug 1099917 - Optimize loading libosinfo
Summary: Optimize loading libosinfo
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libosinfo
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Matthias Clasen
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-05-21 13:50 UTC by Cole Robinson
Modified: 2018-09-04 18:35 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-09-04 18:35:53 UTC
Embargoed:



Description Cole Robinson 2014-05-21 13:50:24 UTC
# time python -c 'from gi.repository import Libosinfo; l = Libosinfo.Loader(); l.process_default_path()'

real	0m0.510s
user	0m0.466s
sys	0m0.038s

This is with a warm cache. It stems from a discussion here:

http://www.redhat.com/archives/virt-tools-list/2014-May/msg00048.html

And Dan's comment over at bug 500320#c14:

(In reply to Daniel Berrange from comment #14)
> Can you file a bug against libosinfo to optimize this. I'd like to think we
> can also reduce that time penalty, even if that means we have to cache the
> data in a more efficient format than XML, and only reload XML files when
> they change.
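
The `time python -c ...` figure above also includes interpreter startup and the `gi` import. A minimal sketch for separating those from the load itself is below; `time_phases`, `import_libosinfo` and `load_default_path` are hypothetical helper names, and only `Libosinfo.Loader()` and `process_default_path()` come from the command in this report.

```python
import time


def time_phases(import_fn, load_fn, repeats=5):
    """Time the import once and the load several times.

    Returns (import_seconds, best_load_seconds).
    """
    t0 = time.perf_counter()
    module = import_fn()
    import_s = time.perf_counter() - t0

    load_times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        load_fn(module)
        load_times.append(time.perf_counter() - t0)
    # The minimum is the least noisy estimate of the warm-cache cost.
    return import_s, min(load_times)


def import_libosinfo():
    # Same import as in the command above; only runs when called.
    from gi.repository import Libosinfo
    return Libosinfo


def load_default_path(libosinfo):
    # Same two calls as in the command above, on a fresh Loader each time.
    loader = libosinfo.Loader()
    loader.process_default_path()
```

On a machine with the libosinfo introspection bindings installed, `time_phases(import_libosinfo, load_default_path)` would give the two numbers separately. Repeated loads in one process may behave differently from a fresh process, so this is a rough guide rather than a replacement for the measurement above.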

Comment 1 Zeeshan Ali 2014-05-21 16:19:57 UTC
Thanks for filing this. I had been thinking about this every now and then but never came up with any concrete ideas. One idea I had was:

1. Also allow data to be provided in JSON format.
2. Whenever libosinfo parses data from .xml, it writes it into a JSON file, in ~/cache/libosinfo (or some other location) with the name SHA256_OF_XML_FILE_PATH.json.
3. Before loading a .xml file, check if a corresponding JSON file exists and is not older than the .xml file. If so, load from the JSON file instead. Otherwise, do the same as #2.

What do you guys think?
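
A rough sketch of steps 2 and 3, in Python for brevity (libosinfo itself is C). The function names, the `parse_xml` callable and the cache directory argument are all hypothetical, and the freshness check assumes the cache is valid when its mtime is at least that of the XML file.

```python
import hashlib
import json
import os


def cache_path_for(xml_path, cache_dir):
    """Name the cache file after the SHA-256 of the XML file's path (step 2)."""
    digest = hashlib.sha256(os.path.abspath(xml_path).encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".json")


def load_with_cache(xml_path, cache_dir, parse_xml):
    """Return parsed data, reusing the JSON cache when it is still fresh.

    parse_xml is a hypothetical callable that turns an XML file path into
    JSON-serializable data; it stands in for the real XML loader.
    """
    cached = cache_path_for(xml_path, cache_dir)
    try:
        # Step 3: use the cache only if it is at least as new as the XML.
        if os.path.getmtime(cached) >= os.path.getmtime(xml_path):
            with open(cached, "r", encoding="utf-8") as fh:
                return json.load(fh)
    except (OSError, ValueError):
        pass  # missing, unreadable, or corrupt cache: fall back to the XML

    # Step 2: parse the XML and write the result next to other cache files.
    data = parse_xml(xml_path)
    os.makedirs(cache_dir, exist_ok=True)
    tmp = cached + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(data, fh)
    os.replace(tmp, cached)  # rename so readers never see a partial file
    return data
```

Comparing mtimes is cheap but can miss edits that preserve the timestamp; hashing the file contents instead of the path would catch those, at the cost of reading every XML file on each load.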

Comment 2 Daniel Berrangé 2014-05-21 16:38:33 UTC
It depends whether parsing JSON is actually faster than parsing XML or not :-)

Actually, what would be better is to profile libosinfo to see exactly where the slowness is. Perhaps it isn't even the XML parsing that's the problem!

Comment 3 Zeeshan Ali 2014-05-21 22:52:11 UTC
(In reply to Daniel Berrange from comment #2)
> It depends whether parsing JSON is actually faster than parsing XML or not
> :-)

Surely we need to test and measure, but based on my experience with both, I'm betting parsing JSON is a lot faster than parsing XML. I might be wrong about the difference being significant enough, though.
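
One way to get a first measurement is a micro-benchmark like the sketch below. It uses Python's standard-library parsers on synthetic data, so it is only a stand-in: libosinfo parses in C, the document shape here is made up, and the function names are hypothetical. The numbers it prints say nothing definite about libosinfo itself.

```python
import json
import timeit
import xml.etree.ElementTree as ET


def make_documents(n_entries=2000):
    """Build equivalent synthetic XML and JSON documents (not real libosinfo data)."""
    entries = [{"id": f"dev-{i}", "vendor": f"vendor-{i % 50}"} for i in range(n_entries)]
    root = ET.Element("devices")
    for entry in entries:
        # Each entry becomes one <device id="..." vendor="..."/> element.
        ET.SubElement(root, "device", entry)
    return ET.tostring(root, encoding="unicode"), json.dumps({"devices": entries})


def compare_parsers(n_entries=2000, number=20):
    """Return (xml_seconds, json_seconds) per parse, best of three runs."""
    xml_text, json_text = make_documents(n_entries)
    xml_s = min(timeit.repeat(lambda: ET.fromstring(xml_text), number=number, repeat=3)) / number
    json_s = min(timeit.repeat(lambda: json.loads(json_text), number=number, repeat=3)) / number
    return xml_s, json_s
```

Calling `compare_parsers()` returns the per-parse time for each format; a real comparison would need the actual data files and the C parsers libosinfo would use.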

> Actually, what would be better is to actually profile libosinfo to see
> exactly where the slowness is. Perhaps it isn't even the XML parsing that's
> the problem !

Yeah, even though it seems unlikely the culprit is something else, we really should start with that.

Comment 4 Giuseppe Scrivano 2014-07-31 07:53:24 UTC
Upstream master should now be around 30% faster than 0.2.10.

The functions osinfo_loader_process_file_reg_usb and osinfo_loader_process_file_reg_pci take a lot of time, and the reason seems to be the cost of creating GObjects: g_object_new is quite expensive and is called many times.

Comment 6 Cole Robinson 2018-09-04 18:35:53 UTC
Things can always get faster, but I don't think keeping this bug open is going to motivate any more change, so I'm closing it.

