From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.6) Gecko/20060808 Fedora/1.5.0.6-3 Firefox/1.5.0.6 pango-text

Description of problem:
I have here a test case which is real real slow for me, it's pretty much what the cairo canvas backend of OpenOffice.org does, (the cairo canvas is disabled at the moment)

> gcc --version
gcc (GCC) 4.1.1 20060817 (Red Hat 4.1.1-18)
> grep "Loading" /var/log/Xorg.0.log | grep drivers
(II) Loading /usr/lib/xorg/modules/drivers/nv_drv.so
> grep Chipset /var/log/Xorg.0.log | grep found
(--) Chipset GeForce4 4200 Go found
> rpm -qf /usr/lib/xorg/modules/drivers/nv_drv.so
xorg-x11-drv-nv-1.2.0-3.fc6
> rpm -q cairo
cairo-1.2.2-3.fc6
> gcc cairotest.c `pkg-config --cflags --libs cairo-xlib`
> time ./a.out
cairo_fill() time: 0.984072 sec
real 0m1.017s
user 0m0.020s
sys 0m0.008s
> time ./a.out
cairo_fill() time: 0.952484 sec
real 0m0.985s
user 0m0.016s
sys 0m0.012s
> time ./a.out
cairo_fill() time: 0.972430 sec
real 0m1.029s
user 0m0.012s
sys 0m0.020s

Version-Release number of selected component (if applicable):
cairo-1.2.2-3.fc6

How reproducible:
Always

Steps to Reproduce:
1. Take attached png and save to /tmp
2. Take attached source and gcc cairotest.c `pkg-config --cflags --libs cairo-xlib`
3. run time ./a.out a few times

Actual Results:
pretty slow, I rather suspect the X server here, perhaps just the nv drivers, but maybe more widespread

Expected Results:

Additional info:
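In the timings above, "real" is about one second while "user" plus "sys" is a few hundredths of a second, which suggests the time is spent outside the test process. A measurement like the "cairo_fill() time" line therefore has to use wall-clock time rather than CPU time. The attached testcase is not reproduced in this report, so the following is only a sketch of how such a wall-clock measurement could be written; `wall_seconds` and `count_to_thousand` are hypothetical names, and the callback stands in for the drawing code.

```c
#define _POSIX_C_SOURCE 199309L  /* exposes clock_gettime under strict -std= flags */
#include <time.h>

/* Hypothetical helper (not from the attached testcase):
 * returns the wall-clock seconds spent inside fn(arg). */
static double wall_seconds(void (*fn)(void *), void *arg)
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    fn(arg);
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Stand-in workload for illustration only. In a real test this would be
 * the drawing code being measured; here it just counts to 1000. */
static void count_to_thousand(void *arg)
{
    int *counter = arg;
    for (int i = 0; i < 1000; i++)
        (*counter)++;
}
```

Called as `wall_seconds(count_to_thousand, &n)`, this returns the elapsed time in seconds. When the drawing goes through an X server, requests may be queued rather than completed when the call returns, so the measured callback would probably also need to wait for the server to finish; that part is an assumption about the setup and is not shown here.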
Created attachment 134718: demo png to fill with
Created attachment 134719: testcase
Created attachment 134720: i386 a.out
On my FC5 with the radeon driver (ATI Radeon 9600XT AR (AGP)) I get:

> time ./a.out
cairo_fill() time: 0.043786 sec
real 0m0.059s
user 0m0.012s
sys 0m0.000s
On a FC-5 machine with vesa_drv.so with a sad little Intel integrated graphics chip I get timings of 0.02 -> 0.09 seconds, and if I set Option "NoAccel" "true" on my afflicted machine I get 0.04 -> 0.05
caolanm->ajax: Any hints on how to track this down a bit further, given that it seems to be fairly unique to my hardware?
Run the testcase for longer and use oprofile or sysprof to see where we're spending our time.
...dribble...

          TIMER:0|
  samples|      %|
------------------
     6375 62.4327 Xorg
                  TIMER:0|
          samples|      %|
        ------------------
             6303 98.8706 libfb.so
               39  0.6118 Xorg
               15  0.2353 libc-2.4.90.so
               11  0.1725 nv_drv.so
                5  0.0784 libxaa.so
                1  0.0157 anon (tgid:2196 range:0xb7f3c000-0xb7f3d000)
                1  0.0157 libextmod.so

[root@soulcrusher caolan]# opreport --symbols --merge tgid /usr/lib/xorg/modules/libfb.so
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        image name  symbol name
5441     86.3240  libfb.so    fbFetch_x8r8g8b8
396       6.2827  libfb.so    fbCompositeSolidMask_nx8x8888mmx
131       2.0784  libfb.so    fbCopyAreammx
87        1.3803  libfb.so    fbStore_x8r8g8b8
49        0.7774  libfb.so    mmxCombineOutReverseU
48        0.7615  libfb.so    mmxCombineAddU
41        0.6505  libfb.so    mmxCombineMaskU
33        0.5236  libfb.so    fbFetch_a8
20        0.3173  libfb.so    fbRasterizeEdges
14        0.2221  libfb.so    fbStore_a8
10        0.1587  libfb.so    fbSolid
7         0.1111  libfb.so    fbCompositeSrc_8888x8888mmx
5         0.0793  libfb.so    fbCompositeSrcAdd_8000x8000mmx
4         0.0635  libfb.so    fbFetch
4         0.0635  libfb.so    fbFetchSolid
3         0.0476  libfb.so    fbComposite
3         0.0476  libfb.so    fbSolidFillmmx
1         0.0159  libfb.so    fbCompositeGeneral
1         0.0159  libfb.so    fbCopyNtoN
1         0.0159  libfb.so    fbCreatePixmap
1         0.0159  libfb.so    fbStore
1         0.0159  libfb.so    fbValidateGC
1         0.0159  libfb.so    mmxCombineOverU
1         0.0159  libfb.so    storeProcForPicture
Created attachment 135645: opreport --callgraph --merge tgid
One obvious issue is that the rectangle being filled here has non-integer coordinates:

cairo_move_to(pCairo, 480.500000, 720.500000);
cairo_line_to(pCairo, 0.500000, 720.500000);
cairo_line_to(pCairo, 0.500000, 0.500000);
cairo_line_to(pCairo, 960.500000, 0.500000);
cairo_line_to(pCairo, 960.500000, 720.500000);
cairo_line_to(pCairo, 480.500000, 720.500000);

This should result in an image with blurred edge pixels. If that's not an essential result in this case, then this could be made much faster by simply rounding those coordinates. For example, on my system, the unrounded values cause a fill time of 0.06 seconds, but rounding the values reduces that to 0.00029 seconds.

-Carl
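As an illustration of the rounding suggested above, here is a minimal sketch that snaps a path's vertices to whole pixels before they are handed to cairo. The `point_t` type and the `snap_coord`, `is_pixel_aligned`, and `snap_path` helpers are hypothetical names invented for this example; they are not part of cairo or of the attached testcase. The sketch assumes coordinates that fit comfortably in a `long`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical point type for illustration; cairo itself takes plain doubles. */
typedef struct {
    double x;
    double y;
} point_t;

/* Round to the nearest integer without needing libm.
 * Halves round away from zero, so 0.5 becomes 1.0 and 960.5 becomes 961.0. */
static double snap_coord(double v)
{
    return (v >= 0.0) ? (double)(long)(v + 0.5)
                      : (double)(long)(v - 0.5);
}

/* Returns 1 if v already lies on the integer grid, 0 otherwise. */
static int is_pixel_aligned(double v)
{
    return v == (double)(long)v;
}

/* Snap every vertex of a path in place. */
static void snap_path(point_t *pts, size_t n)
{
    assert(pts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        pts[i].x = snap_coord(pts[i].x);
        pts[i].y = snap_coord(pts[i].y);
    }
}
```

Applied to the vertices listed above, this turns 0.5, 480.5, 720.5 and 960.5 into 1, 481, 721 and 961, so the rectangle keeps its 960 by 720 size but moves by half a pixel. The snapped values would then be passed to cairo_move_to() and cairo_line_to() as before. Whether to round up like this or to leave out the 0.5 offset in the first place depends on where the application wants the edges to fall.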
That's good enough for me. I got excellent performance from avoiding adding 0.5 for the fill cases in the OOo cairo canvas, so it would be fine from my side to close this out, and I'll chase it upstream in OOo so that it does not handicap itself.
Great, I'm glad you've got better performance now, and I'm glad this fix was so easy.

Since the problem here came from doing half-integer coordinates when filling, and since using integers instead is:

1. More natural in the first place
2. Better-looking (no blurry edges on your images)
3. Much, much faster

I'll go ahead and close this as NOTABUG.

-Carl