Glamo Xrender Benchmark with Expedite

[ openmoko efl ]

Yesterday I tested the XRender engine in Evas using the current EXA acceleration found in glamo (that is, solid fills and surface blitting). Sadly the test was taking ages to finish, and even after walking away and leaving it running the whole night it didn't finish but hung on the text test.

So I wanted to test just the glue found in XRender and its implementation on top of EXA, but without painting anything: just the memory moves from system memory to VRAM and the necessary logic found in Evas' XRender engine. To do that I "removed" the drawing functions from the xf86-video-glamo driver (they just return TRUE), and here are the results:








| Benchmark | Software X11 | XRender without painting |
|---|---|---|
| Image Blend Unscaled | 2.76 | ???? |
| Image Blend Solid Unscaled | 12.69 | 13.72 |
| Image Blend Nearest Scaled | 1.56 | 18.14 |
| Image Blend Nearest Solid Scaled | 8.77 | 18.00 |
| Image Blend Smooth Scaled | 0.45 | 18.22 |
| Image Blend Smooth Solid Scaled | 5.93 | 17.59 |
| Image Blend Nearest Same Scaled | 5.02 | 21.26 |
| Image Blend Nearest Solid Same Scaled | 22.05 | 17.73 |
| Image Blend Smooth Same Scaled | 1.27 | 20.96 |
| Image Blend Smooth Solid Same Scaled | 11.84 | 17.76 |
| Image Blend Border | 0.51 | 1.83 |
| Image Blend Solid Border | 6.67 | 1.97 |
| Image Blend Border Recolor | 0.44 | 1.23 |
| Image Quality Scale | 4.29 | 1.97 |
| Image Data ARGB | 7.22 | 3.71 |
| Image Data ARGB Alpha | 4.89 | 1.70 |
| Image Data YCbCr 601 Pointer List | 6.54 | 3.16 |
| Image Data YCbCr 601 Pointer List Wide Stride | 6.04 | 5.40 |
| Image Crossfade | 6.67 | 4.61 |
| Text Basic | 9.28 | 2.25 |
| Text Styles | 1.05 | 0.17 |
| Text Styles Different Strings | 0.79 | 0.14 |
| Text Change | 5.64 | 1.86 |
| Textblock Basic | 5.67 | 1.50 |
| Textblock Intl | 4.67 | 2.46 |
| Rect Blend | 1.81 | 9.66 |
| Rect Solid | 9.57 | 18.02 |
| Rect Blend Few | 69.84 | ????? |
| Rect Solid Few | 84.22 | 61.79 |
| Image Blend Occlude 1 Few | 41.09 | 196.75 |
| Image Blend Occlude 2 Few | 24.00 | 47.37 |
| Image Blend Occlude 3 Few | 17.50 | 70.32 |
| Image Blend Occlude 1 | 43.26 | 26.20 |
| Image Blend Occlude 2 | 14.59 | 14.03 |
| Image Blend Occlude 3 | 4.87 | 21.06 |
| Image Blend Occlude 1 Many | 27.31 | 12.14 |
| Image Blend Occlude 2 Many | 6.81 | 4.61 |
| Image Blend Occlude 3 Many | 2.21 | ???? |
| Image Blend Occlude 1 Very Many | 3.79 | 1.54 |
| Image Blend Occlude 2 Very Many | 0.66 | 0.43 |
| Image Blend Occlude 3 Very Many | 0.36 | 0.58 |
| Polygon Blend | 3.51 | 1.69 |
| EVAS SPEED | 11.86 | 18.66 |

The results are very disappointing: there are several tests where drawing in software is faster than just running the XRender/EXA logic for the same result without drawing anything. And in the tests where XRender/EXA is faster, the speedup isn't worth it, as the real drawing will surely be slower. Note that the Glamo chip can only do raster operations into a destination surface of format RGB565, which means there won't be any acceleration even where the blending is possible in hardware, as Evas uses premultiplied ARGB8888.
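To make the format mismatch concrete, here is a minimal sketch of what squeezing one of Evas' premultiplied ARGB8888 pixels into an RGB565 destination involves. The function names are made up for illustration, and the rounding in `premultiply` is one common choice, not necessarily what Evas does.

```python
def premultiply(a: int, r: int, g: int, b: int) -> int:
    """Build a premultiplied ARGB8888 pixel from straight 8-bit channel values."""
    def scale(c: int) -> int:
        # Multiply the colour channel by alpha/255, rounding to nearest.
        return (c * a + 127) // 255
    return (a << 24) | (scale(r) << 16) | (scale(g) << 8) | scale(b)


def argb8888_premul_to_rgb565(pixel: int) -> int:
    """Pack a premultiplied ARGB8888 pixel into 16-bit RGB565."""
    r = (pixel >> 16) & 0xFF
    g = (pixel >> 8) & 0xFF
    b = pixel & 0xFF
    # Keep the top 5/6/5 bits of each channel. The alpha byte has nowhere
    # to go: an RGB565 surface cannot carry per-pixel alpha at all.
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


opaque_red = premultiply(255, 255, 0, 0)
half_red = premultiply(128, 255, 0, 0)
print(hex(opaque_red), hex(argb8888_premul_to_rgb565(opaque_red)))
print(hex(half_red), hex(argb8888_premul_to_rgb565(half_red)))
```

Once the pixel is packed, the alpha channel is gone, so anything that still needs blending has to be resolved before the data reaches a 16-bit destination.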

So how can we improve Evas rendering speed specifically for this chip? The path through XRender/EXA is worthless; is there any other way? Well, one possibility is to use Evas' software_16 engine (a destination surface of format RGB565) to reduce the bandwidth needed, but how do we match that with the XRender API?
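As a rough back-of-the-envelope check on the bandwidth argument, the sketch below compares the bytes moved per full-screen update in both formats. The 480x640 screen size is an assumption on my part (it is not stated above), and `frame_bytes` is just an illustrative helper.

```python
def frame_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Bytes needed to move one full frame at the given pixel size."""
    return width * height * bytes_per_pixel


# Assumed screen size, for illustration only.
WIDTH, HEIGHT = 480, 640

argb8888 = frame_bytes(WIDTH, HEIGHT, 4)  # 32-bit premultiplied ARGB
rgb565 = frame_bytes(WIDTH, HEIGHT, 2)    # 16-bit software_16 style surface

print("ARGB8888 bytes per frame:", argb8888)
print("RGB565 bytes per frame:  ", rgb565)
print("ratio:", argb8888 / rgb565)
```

Whatever the real resolution is, the ratio stays the same: a 16-bit surface halves the data that has to cross from system memory to VRAM on every update.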

Another solution could be to abandon the efforts on xf86-video-glamo acceleration and just build a specific Evas engine for glamo: mmap the whole framebuffer memory and manage it through Eina's memory pool manager, handle the surfaces ourselves, and do a mix between software_16 and this specific engine. A lot of work, yes, but it looks like the only solution (X aside) that can give us some results. But there's a problem: how do we send the changes to the displayed X window? Our engine would use a VRAM backbuffer, and from an X client we can't know the physical memory of the area where the window is being drawn. So we'd have a roundtrip here, physical memory (our glamo surface) -> virtual memory (Xshm/X memory) -> physical memory (destination framebuffer), which will surely wipe out any speedup.
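The "handle the surfaces ourselves" part could look roughly like the toy below: one mapped region, with surfaces handed out as views into it. This is only a sketch of the idea. An anonymous mapping stands in for the real framebuffer, `SurfacePool` is a hypothetical name, and a bump allocator is far simpler than what Eina's memory pools provide.

```python
import mmap


class SurfacePool:
    """Toy bump allocator that carves surfaces out of one mapped region."""

    def __init__(self, size: int):
        # Anonymous mapping used as a stand-in for the mmap'd framebuffer/VRAM.
        self.mem = mmap.mmap(-1, size)
        self.size = size
        self.offset = 0

    def alloc_surface(self, width: int, height: int, bytes_per_pixel: int = 2):
        """Reserve width*height pixels; return (offset, writable view)."""
        nbytes = width * height * bytes_per_pixel
        if self.offset + nbytes > self.size:
            raise MemoryError("surface pool exhausted")
        start = self.offset
        self.offset += nbytes
        # The memoryview slice aliases the mapping, so writes land in it directly.
        return start, memoryview(self.mem)[start:start + nbytes]


pool = SurfacePool(64 * 1024)
offset, surface = pool.alloc_surface(16, 16)  # 16x16 RGB565 surface
surface[0:2] = (0xF800).to_bytes(2, "little")  # first pixel: pure red
print(offset, len(surface), pool.mem[0:2])
```

Note that this only covers allocation inside our own mapping. It does nothing about the roundtrip described above, since getting those bytes into the X window still means copying them out through Xshm.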

Suggestions?

Written on March 3, 2009