Bug 981726
| Summary: | Fine grained locking in LXC driver | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Daniel Berrangé <berrange> |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | unspecified | Priority: | unspecified |
| Version: | 7.0 | CC: | acathrow, ajia, berrange, dallan, dyuan, gsun, lsu, mprivozn, weizhan, zpeng |
| Target Milestone: | rc | Target Release: | --- |
| Hardware: | Unspecified | OS: | Unspecified |
| Fixed In Version: | libvirt-1.1.1-1.el7 | Doc Type: | Bug Fix |
| Last Closed: | 2014-06-13 10:55:36 UTC | Type: | Bug |
| Bug Blocks: | 960861 | | |
Description (Daniel Berrangé, 2013-07-05 15:24:53 UTC)
I reproduced this issue with 200 running containers: `virsh -c lxc:/// list` takes more than 6 minutes to list the running containers.

```
[root@dell-op790-03 ~]# ps -ef|grep sandbox|grep -v grep|wc -l
200
[root@dell-op790-03 ~]# killall libvirt_lxc
```

Note that even if I do not kill all of the libvirt_lxc processes, `virsh -c lxc:/// list` is still very slow (more than 3 minutes) to list running containers; that is another bug, 980839.

```
[root@dell-op790-03 ~]# time virsh -c lxc:/// list
 Id    Name                           State
----------------------------------------------------

real    6m45.384s
user    0m0.004s
sys     0m0.044s
```

Patch proposed upstream: https://www.redhat.com/archives/libvir-list/2013-July/msg01085.html

To POST:

```
commit dbeb04a65cb986db951af0382495b41692d1ad9d
Author:     Michal Privoznik <mprivozn>
AuthorDate: Wed Jul 17 09:37:09 2013 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Jul 18 14:16:54 2013 +0200

    Introduce lxcDomObjFromDomain

    Similarly to the qemu driver, we can use a helper function to look up
    a domain instead of copying multiple lines around.

commit eb150c86b4c645de7aa1f002cc11a40f0adb43bf
Author:     Michal Privoznik <mprivozn>
AuthorDate: Wed Jul 17 09:20:26 2013 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Jul 18 14:16:54 2013 +0200

    Remove lxcDriverLock from almost everywhere

    With the majority of fields in the virLXCDriverPtr struct now
    immutable or self-locking, there is no need for practically any
    methods to be using the LXC driver lock. Only a handful of helper
    APIs now need it.

commit 2a82171affa266e4afe472ea7b79038e338aab52
Author:     Michal Privoznik <mprivozn>
AuthorDate: Wed Jul 17 09:14:42 2013 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Jul 18 14:16:54 2013 +0200

    lxc: Make activeUsbHostdevs use locks

    The activeUsbHostdevs item in LXCDriver is lockable, but the lock has
    to be taken explicitly. Call virObject(Un)Lock() in order to achieve
    mutual exclusion once lxcDriverLock is removed.
```
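The lxcDomObjFromDomain pattern above is the core of the fine-grained locking change: a helper looks the domain up in a self-locking list and hands it back already locked, so callers never touch a global driver mutex. Below is a minimal sketch of that idea in Python; the class and method names (`DomainObj`, `DomainList`, `dom_obj_from_name`) are illustrative stand-ins, not libvirt's actual C API.

```python
import threading

class DomainObj:
    """A domain record carrying its own lock, loosely modelled on virDomainObj."""
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()

class DomainList:
    """Self-locking list: its lock guards only the lookup table,
    never the domains themselves."""
    def __init__(self):
        self._lock = threading.Lock()
        self._doms = {}

    def add(self, dom):
        with self._lock:
            self._doms[dom.name] = dom

    def dom_obj_from_name(self, name):
        # Hold the list lock only for the lookup, then lock the
        # individual domain and return it locked; the caller must
        # release it when done (cf. lxcDomObjFromDomain).
        with self._lock:
            dom = self._doms.get(name)
        if dom is None:
            raise KeyError(name)
        dom.lock.acquire()
        return dom

doms = DomainList()
doms.add(DomainObj("myapache1"))
dom = doms.dom_obj_from_name("myapache1")
try:
    print(dom.name)   # operate on the domain under its own lock only
finally:
    dom.lock.release()
```

Because each API call only holds the lock of the one domain it operates on, calls against different domains no longer serialize against each other.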
```
commit 64ec738e58917c7a94c862fca725335a4fff3e11
Author:     Michal Privoznik <mprivozn>
AuthorDate: Mon Jul 15 11:43:10 2013 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Jul 18 14:16:54 2013 +0200

    Stop accessing driver->caps directly in LXC driver

    The 'driver->caps' pointer can be changed on the fly. Accessing it
    currently requires the global driver lock. Isolate this access in a
    single helper, so a future patch can relax the locking constraints.

commit c86950533a0457cb8d6515088c925213ea629b9f
Author:     Michal Privoznik <mprivozn>
AuthorDate: Mon Jul 15 19:08:11 2013 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Jul 18 14:16:54 2013 +0200

    lxc: switch to virCloseCallbacks API

commit 4deeb74d01adfe1d0d36914785ba24ad7f74060e
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Jul 16 19:20:24 2013 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Jul 18 14:16:54 2013 +0200

    Introduce annotations for virLXCDriverPtr fields

    Annotate the fields in virLXCDriverPtr to indicate the locking rules
    for their use.

commit 29bed27eb4434a148da68c8bdf48b52bad3c2567
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Jul 16 19:05:06 2013 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Jul 18 14:16:54 2013 +0200

    lxc: Use atomic ops for driver->nactive

commit 7fca37554c2059f13a3de32bcc29c3c1e404fbc6
Author:     Michal Privoznik <mprivozn>
AuthorDate: Tue Jul 16 17:45:05 2013 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Jul 18 14:16:53 2013 +0200

    Introduce a virLXCDriverConfigPtr object

    Currently the virLXCDriverPtr struct contains a wide variety of data
    with varying access needs. Move all the static config data into a
    dedicated virLXCDriverConfigPtr object. The only locking requirement
    is to hold the driver lock while obtaining an instance of
    virLXCDriverConfigPtr. Once a reference is held on the config object,
    it can be used completely lockless since it is immutable.
```
```
    NB, not all APIs correctly hold the driver lock while getting a
    reference to the config object in this patch. This is safe for now,
    since the config is never updated on the fly. Later patches will
    address this fully.

commit 7e94a1a4ea6581ea876c61df73173a743972a00e
Author:     Michal Privoznik <mprivozn>
AuthorDate: Thu Jul 18 13:34:55 2013 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Jul 18 14:16:53 2013 +0200

    virLXCDriver: Drop unused @cgroup

    It is not used anywhere, so it makes no sense to have it there.

commit 272769beccd7479c75e700a6cbc138d495db66fb
Author:     Michal Privoznik <mprivozn>
AuthorDate: Mon Jul 15 16:53:13 2013 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Thu Jul 18 14:16:53 2013 +0200

    qemu: Move close callbacks handling into util/virclosecallbacks.c
```

v1.1.0-251-gdbeb04a

After successfully starting 89 apache containers and running the steps from comment 2, list performance still seems poor.

Libvirt downstream (libvirt-1.1.0-2.el7.x86_64, libvirt-sandbox-0.2.1-1.el7.x86_64):

```
# for i in {1..100}; do time virt-sandbox-service start myapache$i & done
# ps -ef|grep sandbox|grep -v grep|wc -l
89
```

Note: only 89 containers started successfully.

```
# killall libvirt_lxc && time virsh -c lxc:/// list
 Id    Name                           State
----------------------------------------------------

real    3m2.865s
user    0m0.007s
sys     0m0.007s
```

Libvirt upstream (commit 7729a16):

```
# ./daemon/libvirtd -d
# ps -ef|grep sandbox|grep -v grep|wc -l
89
# killall libvirt_lxc && time ./tools/virsh -c lxc:/// list
 Id    Name                           State
----------------------------------------------------

real    2m53.398s
user    0m0.004s
sys     0m0.012s
```

Alex, do you have debug logging turned on? I've run oprofile several times and found that most of the time is spent in virLogVMessage and virAsprintfInternal (when formatting the log message).
However, I'm not able to get even close to your times:

```
libvirt.git # pgrep libvirt_lxc | wc -l
100
libvirt.git # killall libvirt_lxc && time virsh -c lxc:/// list
 Id    Name                           State
----------------------------------------------------

real    0m9.184s
user    0m0.010s
sys     0m0.000s
```

Can you share the domain XML with me? All my containers do is run bash.

(In reply to Michal Privoznik from comment #6)
> Can you share the domain XML with me? All my containers do is run bash.

I created systemd Linux containers, not generic Linux containers. You can create 100 apache containers with the following steps:

```
# for i in {1..100}; do virt-sandbox-service create -C -u httpd.service -N dhcp,source=default myapache$i; done
```

To start 100 containers:

```
# tail -2 /etc/libvirt/libvirtd.conf
max_clients = 100
max_workers = 100
# for i in {1..100}; do virt-sandbox-service start myapache$i & done
```

Note: it should be "&", not ";", after "myapache$i". In addition, if your host does not have enough memory, some containers will be killed, so fewer than 100 containers may end up running. You can then check the LXC guest XML configuration with `virsh -c lxc:/// dumpxml myapache$i`.

If you can't reproduce my case, please restart the libvirtd service and try again. As a rule, the first listing of running containers may be very slow and the second very fast if you don't restart libvirtd; it seems libvirt caches something. Michal, please also show your container XML configuration and test steps, thanks.

Well, the problem is that I can't get virt-sandbox-service running on my system. Anyway, for the XML I'm using, see the attachment. But I think the other bottleneck here is our event loop. If an LXC container dies, the event loop issues a callback to clean up cgroups, mark the corresponding domain as inactive, etc. This is all done in a single thread: the event-loop thread. So if one hundred containers die at once, the event loop must issue one hundred callbacks before executing 'virsh list'.
However, I am hesitant to say that removing this bottleneck will make a significant change to the times you're measuring. As I've said, the cleanup callback removes all container processes from cgroups, and I think this is an even bigger bottleneck than the one in the event loop, especially if the apache you're starting spawns some processes. Having said that, I'm still curious how you got to 3 minutes. Can you run 'virsh dumpxml myapache1' and share the output with me?

Created attachment 778101 [details]: lxc0.xml

Created attachment 778145 [details]: myapache1.xml
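The event-loop bottleneck Michal describes (comment 8) can be modelled in a few lines: if every container-death callback runs in one dispatcher thread, a hundred simultaneous deaths are cleaned up strictly one after another, and an API call queued behind them waits for the sum of all cleanup times. This is a toy Python model under assumed timings, not libvirt code.

```python
import queue
import threading
import time

events = queue.Queue()

def event_loop():
    """Single thread draining the queue: callbacks run strictly serially."""
    while True:
        cb = events.get()
        if cb is None:          # sentinel to stop the loop
            break
        cb()

def cleanup(name, cost=0.01):
    # Stands in for removing cgroups, marking the domain inactive, etc.
    time.sleep(cost)

t = threading.Thread(target=event_loop)
t.start()

start = time.monotonic()
for i in range(100):                       # 100 containers die at once
    events.put(lambda i=i: cleanup(f"myapache{i}"))

done = threading.Event()
events.put(done.set)                       # a 'virsh list'-like request queued behind them
done.wait()
elapsed = time.monotonic() - start
print(f"list answered after {elapsed:.2f}s")   # roughly 100 * 0.01s, fully serialized

events.put(None)
t.join()
```

Even though each cleanup is cheap, the request queued last pays for all of them, which is why the first `virsh list` after a mass die-off is so slow.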
I think one thing to be aware of when testing this bug is that its goal was *not* to solve every scalability problem in the LXC driver. It was specifically just to switch the LXC driver to fine-grained locking instead of a global mutex. This allows greater concurrency for operations. For example, if a 'virsh destroy foo' takes 3 seconds to complete, this bug allows a 'virsh dominfo bar' to happen without blocking, since it is a different VM and thus a different lock. It should also have sped up the creation of multiple containers in parallel. I fully expect that more scalability problems will exist. I'm not surprised that we still have problems in the shutdown of containers, for the reasons Michal describes wrt the event loop. This is why I left bug 960861 open independently rather than marking it a duplicate of this one.

Hi Michal, the procedure also takes a long time in my testing.

```
# rpm -q libvirt libvirt-sandbox
libvirt-1.1.1-1.el7.x86_64
libvirt-sandbox-0.5.0-1.el7.x86_64
```

1. Create and start:

```
# for i in {1..100}; do virt-sandbox-service create -C -u httpd.service -N dhcp,source=default myapache$i; done
# tail -2 /etc/libvirt/libvirtd.conf
max_clients = 100
max_workers = 100
# for i in {1..100}; do virsh -c lxc:/// start myapache$i & done
```

2. Count and kill:

```
# ps -ef|grep lxc|grep -v grep|wc -l
44
# killall libvirt_lxc && time virsh -c lxc:/// list
 Id    Name                           State
----------------------------------------------------

real    1m23.168s
user    0m0.010s
sys     0m0.008s
```

3. Lots of log messages like:

```
libvirtd[25434]: Unable to remove /sys/fs/cgroup/cpu,cpuacct/machine/myapache11.libvirt-lxc/system/libvirtd.service/system/ (16)
libvirtd[25434]: Unable to remove /sys/fs/cgroup/cpu,cpuacct/machine/myapache11.libvirt-lxc/system/libvirtd.service/system//systemd-journald.service
```

Same XML as comment 10 (ajia's). Any suggestion?

(In reply to time.su from comment #12)
> same xml as comment 10 , ajia's . Any suggestion ?
In fact, you need to verify the bug with the LXC and QEMU drivers; I think you may try the steps per DB's advice in comment 11.

Well, I don't think the test you are using is right, for the reasons I've described in comment #8. What about trying the tests Daniel describes in comment #11?

(In reply to Michal Privoznik from comment #14)
> Well, I don't think the test you are using is right for the reasons I've
> described in comment #8. What about trying tests Daniel describes in comment
> #11?

Yeah, I think I made a mistake. I will have a try based on what you mentioned and the patch.

Hi Michal, I did two things when testing the bug.

1. As your patches suggest, I started 4 groups of LXC guests, 10 guests per group, and then started and destroyed each 10 times individually. libvirt 1.1.1-2 saved more than 20 seconds over libvirt 1.0.6 on average. One group template, run as 1 group, 2 groups, and so on:

```
CONNECT="virsh -c lxc:///"
for j in {1..10}
do
  for i in {11..20}
  do
    $CONNECT start lxc$i ; $CONNECT destroy lxc$i
  done
done
```

2. Start and destroy one guest 100 times in a loop, and at the same time query another guest's dominfo:

```
for i in {1..100}
do
  {
    $CONNECT start lxc1 ; $CONNECT destroy lxc1 & getinfo  #$CONNECT dominfo lxc2
  } & getinfo
done
```

libvirt 1.1.1-2 cost:

```
real    0m16.965s
user    0m3.995s
sys     0m2.463s
```

libvirt 1.0.6 cost 30 seconds on average. Is it okay to mark this verified?

(In reply to time.su from comment #16)
> 1. ... I started 4 groups lxc which each group has 10 guests , and then
> start and destroy 10 times individual ...

You could get even better results if you wrote a Python script that shares a connection object, so you don't have to connect and disconnect each time. But that's just cosmetics.

> 2. Start and destroy one guest 100 times in a loop , and at the same time
> query another guest's dominfo ...
> Is it okay for verified ?

Yes. But maybe we should ask Dan if it's okay to move this to VERIFIED. I understand that even more locks can be broken into smaller ones, but I think this is a good starting position for that. I mean, we can do that later as rhel-7.0 life goes on (in a separate bug). Dan, what's your opinion?
> > real 0m16.965s
> > user 0m3.995s
> > sys 0m2.463s
> >
> > libvirt 1.0.6 cost 30 secs average
> >
> >
> > Is it okay for verified ?
>
> Yes. But maybe we should ask Dan if it's okay to move this to VERIFIED. I
> understand that even more locks can be broken into smaller ones, but I think
> this is a good starting position for that. I mean, we can do that later as
> rhel-7.0 life goes on (in a separate bug). Dan, what's your oppinion?
Yes, this bug was never intending to solve all problems - it is just one step in an incremental effort to improve scalability. From a coding & design POV it has achieved its aims, regardless of what performance numbers you actually see in testing.
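The behaviour this bug was meant to deliver (a slow operation on one domain no longer blocking a query on another, because each domain has its own lock) can be demonstrated with a toy model. The timings and domain names below are illustrative assumptions, not measurements of libvirt itself.

```python
import threading
import time

class Domain:
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()   # fine-grained: one lock per domain

foo, bar = Domain("foo"), Domain("bar")

def slow_destroy(dom, seconds=0.5):
    with dom.lock:          # holds only this domain's lock, not a global mutex
        time.sleep(seconds)

t = threading.Thread(target=slow_destroy, args=(foo,))
t.start()
time.sleep(0.05)            # let the destroy grab foo's lock first

start = time.monotonic()
with bar.lock:              # a 'dominfo bar'-style query takes bar's lock: no contention
    waited = time.monotonic() - start
print(f"dominfo waited {waited * 1000:.1f} ms")   # near zero, not ~500 ms
t.join()
```

Under the old global mutex, the second operation would have waited the full half second; with per-domain locks it proceeds immediately, which is exactly the concurrency gain described above.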
Okay, per comments 16, 17, and 18, I am setting this one to VERIFIED. Thanks to both of you for your kind help and explanations.

This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.