# time python -c 'from gi.repository import Libosinfo; l = Libosinfo.Loader(); l.process_default_path()'

real 0m0.510s
user 0m0.466s
sys 0m0.038s

With warm cache.

Stemmed from a discussion here:

http://www.redhat.com/archives/virt-tools-list/2014-May/msg00048.html

And Dan's comment over at bug 500320#c14:

(In reply to Daniel Berrange from comment #14)
> Can you file a bug against libosinfo to optimize this. I'd like to think we
> can also reduce that time penalty, even if that means we have to cache the
> data in a more efficient format than XML, and only reload XML files when
> they change.
Thanks for filing this. I had been thinking about this every now and then but never came up with any concrete ideas. One idea I had was:

1. Also allow data to be provided in JSON format.
2. Whenever libosinfo parses data from .xml, it writes it into a JSON file in ~/cache/libosinfo (or some other location) with the name SHA256_OF_XML_FILE_PATH.json.
3. Before loading a .xml file, check if a corresponding JSON file exists and is not older than the .xml file. If so, load from the JSON file instead. Otherwise, do the same as #2.

What do you guys think?
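To make the proposal concrete, here is a rough Python sketch of steps 2 and 3. It is not libosinfo code: the function names, the dict layout, and the cache directory argument are all made up for illustration, and it assumes the cache should be used only when the JSON file is at least as new as the .xml file.

```python
import hashlib
import json
import os
import xml.etree.ElementTree as ET


def cache_path_for(xml_path, cache_dir):
    """Step 2: name the cache file after the SHA256 of the XML file *path*."""
    digest = hashlib.sha256(os.path.abspath(xml_path).encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".json")


def element_to_dict(elem):
    """Hypothetical XML -> dict conversion, just enough for this sketch."""
    return {
        "tag": elem.tag,
        "attrib": dict(elem.attrib),
        "text": (elem.text or "").strip(),
        "children": [element_to_dict(child) for child in elem],
    }


def load_with_cache(xml_path, cache_dir):
    """Return (data, from_cache) for one XML file."""
    cache_path = cache_path_for(xml_path, cache_dir)

    # Step 3: trust the cache only if it exists and is not older than the XML.
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(xml_path)):
        with open(cache_path, encoding="utf-8") as f:
            return json.load(f), True

    # Otherwise parse the XML and (re)write the JSON cache, as in step 2.
    data = element_to_dict(ET.parse(xml_path).getroot())
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return data, False
```

Comparing modification times is the simplest staleness check, but it depends on the filesystem's timestamp granularity. Storing a hash of the XML contents inside the cache file would be a stricter alternative.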
It depends whether parsing JSON is actually faster than parsing XML or not :-)

What would be better is to profile libosinfo to see exactly where the slowness is. Perhaps it isn't even the XML parsing that's the problem!
(In reply to Daniel Berrange from comment #2)
> It depends whether parsing JSON is actually faster than parsing XML or not
> :-)

Surely we need to test and measure, but based on my experience with both, I'm betting that parsing JSON is a lot faster than parsing XML. I might be wrong about the difference being significant enough though.

> Actually, what would be better is to actually profile libosinfo to see
> exactly where the slowness is. Perhap it isn't even the XML parsing that's
> the problem !

Yeah, even though it seems unlikely that the culprit is something else, we really should start with that.
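As a starting point for the "test and measure" part, a rough micro-benchmark along these lines could compare parsing the same records as XML and as JSON. It is only a sketch: the records are synthetic rather than real libosinfo data, and it times Python's standard-library parsers, not the C parsers libosinfo itself would use, so the numbers would indicate at most a trend.

```python
import json
import timeit
import xml.etree.ElementTree as ET


def make_samples(n):
    """Build the same n synthetic device records as an XML and a JSON string."""
    records = [
        {"id": "http://example.com/dev/%d" % i,
         "vendor": "vendor %d" % (i % 7),
         "name": "device %d" % i}
        for i in range(n)
    ]
    root = ET.Element("devices")
    for rec in records:
        dev = ET.SubElement(root, "device", id=rec["id"])
        ET.SubElement(dev, "vendor").text = rec["vendor"]
        ET.SubElement(dev, "name").text = rec["name"]
    return ET.tostring(root, encoding="unicode"), json.dumps(records)


def compare(n=2000, repeat=5):
    """Return (best XML parse time, best JSON parse time) in seconds."""
    xml_text, json_text = make_samples(n)
    xml_time = min(timeit.repeat(lambda: ET.fromstring(xml_text),
                                 number=1, repeat=repeat))
    json_time = min(timeit.repeat(lambda: json.loads(json_text),
                                  number=1, repeat=repeat))
    return xml_time, json_time


if __name__ == "__main__":
    xml_time, json_time = compare()
    print("xml: %.4fs  json: %.4fs" % (xml_time, json_time))
```

Even if JSON turns out to parse faster here, that says nothing about where libosinfo actually spends its time, which is why profiling the loader itself remains the first step.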
Upstream master should now be around 30% faster than 0.2.10. The functions osinfo_loader_process_file_reg_usb and osinfo_loader_process_file_reg_pci take a lot of time, and the reason seems to be the cost of creating GObjects: g_object_new is quite expensive and is called many times.
Things can always get faster, but I don't think keeping this bug open is going to motivate any more change, so I'm closing it.