I recently worked on a project that tried to improve the performance of DeviantArt muro.  As part of this project I did a lot of testing and benchmarking of various HTML5 operations.  I learned a lot about where one must be careful when writing web applications that use HTML5's <canvas> element.  The following is a diary of sorts that I made while working on the project.  Of course your mileage may vary depending on the setup of your application, but as you will see, DeviantArt muro realized significant performance improvements when I applied the lessons I learned.

Conclusions

For those who don't want to read this whole long and rambling article, here are the main rules to live by as suggested by my testing:

- Reduce direct pixel manipulation as much as possible.  Use the line drawing API when possible, and when you must sample pixels, get as few as possible.

- Rendering shadows, especially ones with a high blur, will greatly reduce your performance.  You can draw quite a few shadowless lines in the time it takes to draw a single one with a soft shadow.

- Bundle as many line segments as you can into a single call to stroke().


The Setup

Since I was interested in a scenario similar to what DeviantArt muro will often see, I ran all tests on a canvas that was 1200px wide and 500px tall.  This would be the approximate size of the drawing area if a user with an average sized monitor maximized their browser window.  All tests were run on my laptop (2009 Macbook Pro with 3.06GHz Intel Core 2 Duo Processor, 8GB RAM, and Intel X25-M SSD) with minimal other apps running (Terminal, vi, and standard background tasks).  The browsers that I tested on were: Firefox 3.6.13, Safari 5.0.3 (6533.19.4), and Google Chrome 8.0.552.237.  Late in the game a colleague asked me how Firefox 4 beta compared, so I re-ran some of the tests using Firefox 4.0b10.

Some people will surely ask how the Internet Explorer 9 release candidate stacks up with the rest of the browsers.  I apologize that I did not run the tests using IE9 because I did not have a Windows machine handy, and did not feel it would be accurate to run these tests on a virtual machine.

For all the tests I would time how long it took to run a small section of code many times, and then subtract the time needed to run the same code without the critical bit I was testing.  This meant that the incidental costs of preparing each test did not contribute to the time measured by the test itself.  The results shown here are averages of running the tests several hundred times each.  Though I did not calculate standard deviations for each result, I kept an eye on the time distributions to make sure that they did not change too much from test to test.
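The baseline-subtraction approach described above can be sketched roughly as follows.  This is illustrative only, not the actual dA test harness; `timeOperation` and its arguments are names I made up:

```javascript
// Illustrative timing harness (not the actual dA muro test code).
// Run the operation many times, then subtract the time taken by a
// baseline loop that does everything except the operation under test,
// so setup costs do not pollute the measurement.
function timeOperation(setup, operation, iterations) {
  var start = Date.now();
  for (var i = 0; i < iterations; i++) {
    setup(i);
    operation(i);
  }
  var withOp = Date.now() - start;

  start = Date.now();
  for (var j = 0; j < iterations; j++) {
    setup(j); // same preparation, but without the critical bit
  }
  var baseline = Date.now() - start;

  return withOp - baseline; // cost attributable to the operation alone
}
```

In a real harness one would use performance.now() where available for sub-millisecond resolution, and repeat the whole measurement several hundred times to average out noise, as described above.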

These tests were meant to give me a ballpark idea of what is important and what is not; they were not intended to be official benchmarks or the basis of an academic paper analyzing the algorithms various browsers use, so please take the results with that grain of salt.  A browser's JavaScript performance depends on a wide variety of factors that were not part of these tests.  To judge a browser based on these results would be misguided.


Drawing Lines

The first thing I tested was basic line drawing: moveTo() a random location on the canvas, lineTo() a different random location.  I did the test once using a single fully opaque color, and again using various random colors and opacities.  This was meant to give me an idea for how much penalty one pays for making a canvas calculate more intricate blending.  I also did the same tests using quadraticCurveTo() and bezierCurveTo() so I could see how much more expensive it is to draw with smooth lines.  Of course, if you are using one of those functions in your app you will also have the overhead of having to calculate the proper control points to use.
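The three drawing variants compared above look roughly like this.  This is a sketch under my own naming (`drawRandomSegment`, `rand`); `ctx` is assumed to be a 2D context obtained from canvas.getContext('2d'):

```javascript
// Sketch of the three line-drawing variants that were compared.
// rand() picks a random coordinate inside the 1200x500 test canvas.
function rand(max) { return Math.random() * max; }

function drawRandomSegment(ctx, kind) {
  ctx.beginPath();
  ctx.moveTo(rand(1200), rand(500));
  if (kind === 'line') {
    ctx.lineTo(rand(1200), rand(500));
  } else if (kind === 'quadratic') {
    // one control point, which the app must compute itself
    ctx.quadraticCurveTo(rand(1200), rand(500), rand(1200), rand(500));
  } else {
    // two control points for a cubic bezier
    ctx.bezierCurveTo(rand(1200), rand(500), rand(1200), rand(500),
                      rand(1200), rand(500));
  }
  ctx.stroke();
}
```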



There is not much that is surprising here.  The four browsers that were tested performed pretty similarly.  Using more mathematically complex curves comes at a cost.

Next I wanted to see if it makes a difference how often one calls stroke() when drawing with lines.  I ran tests where I compared calling stroke after each line segment was drawn and where I drew a number of line segments and then stroked them all at once.  As can be seen by the approximately linear graph, stroke() takes close to the same amount of time each time it is called.  If you can draw 50 line segments before calling stroke(), you can save in the neighborhood of 20% of the cost of drawing.
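The two patterns being compared look roughly like this (a sketch with my own function names; note that batching only works when all the segments share the same stroke style):

```javascript
// Stroking after every segment (the slow pattern):
function strokeEachSegment(ctx, segments) {
  for (var i = 0; i < segments.length; i++) {
    ctx.beginPath();
    ctx.moveTo(segments[i].x0, segments[i].y0);
    ctx.lineTo(segments[i].x1, segments[i].y1);
    ctx.stroke(); // one stroke() call per segment
  }
}

// Batching many segments into a single stroke() (the faster pattern):
function strokeBatched(ctx, segments) {
  ctx.beginPath();
  for (var i = 0; i < segments.length; i++) {
    ctx.moveTo(segments[i].x0, segments[i].y0);
    ctx.lineTo(segments[i].x1, segments[i].y1);
  }
  ctx.stroke(); // one stroke() call for the whole batch
}
```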




Shadows

Shadows are a really useful tool, not only for drawing actual shadows, but for anything that needs a nice soft edge.  However, that soft edge comes at a really high price.  For a while now, deviantART has realized that WebKit browsers struggle when we use a lot of shadows.  I was really curious to get a handle on just what was going on that made Gecko and WebKit browsers behave so differently.  When one times how long it takes to draw straight lines with various amounts of shadowBlur, a really interesting graph appears.  WebKit browsers can draw small shadows quickly, but as shadowBlur increases their rendering time increases slightly worse than linearly.  Firefox, on the other hand, renders shadows at near constant speed.  If the shadows do not have much blur it is slower than the WebKit browsers, but when shadowBlur gets up to 100 it is four times faster.

Interestingly, Firefox 4 beta now has performance closer to that of the WebKit browsers.  The shadows of the same blur in Firefox 4 are also a lot wider than they were previously (Firefox 3 has always had smaller shadows than WebKit).  I do not know the details, but it would seem that the canvas spec must be settling on a softer, but more computationally intensive shadow as its reference.
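For reference, the shadow state the tests varied looks like this (a minimal sketch; `drawSoftLine` is my own name, and shadowBlur is the knob whose cost is being measured):

```javascript
// Draw one line with a soft shadow. shadowBlur is the expensive
// property: rendering cost grows with it in WebKit, while Firefox 3
// renders at near-constant speed regardless of the value.
function drawSoftLine(ctx, blur) {
  ctx.save();
  ctx.shadowColor = 'rgba(0, 0, 0, 0.5)';
  ctx.shadowBlur = blur;
  ctx.shadowOffsetX = 0;
  ctx.shadowOffsetY = 0;
  ctx.beginPath();
  ctx.moveTo(100, 100);
  ctx.lineTo(1100, 400);
  ctx.stroke();
  ctx.restore(); // drop the shadow state so later drawing is cheap again
}
```

The save()/restore() pair matters in practice: leaving a shadow configured on the context makes every subsequent stroke pay the shadow price.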




Buffer Copying

Earlier profiling that I did showed that the worst performance issues in DeviantArt muro came from having to move buffers around at an inopportune time.  Any complex graphics app is going to have to store and/or move image data around, and I was really curious about what the best way to do this is.

There are a number of different ways to get at the data that is on a canvas.  One can use drawImage() to copy the contents of one canvas to another canvas.  You can get the contents of a canvas as a base64 encoded PNG by using toDataURL().  You can also get essentially an array of pixels using getImageData().  As can be seen below, the toDataURL() method is the clear loser; apparently the cost of encoding the data is pretty high.  Which of the other two methods to use is a little less clear until Firefox 4 usage is widespread.  As can be seen, Firefox 3 has some problems getting and putting image data quickly, but WebKit browsers are much faster at that than at using another canvas element as a buffer.
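The three approaches compared above, written out as a sketch (function names are mine; the contexts and canvases are assumed to come from ordinary canvas.getContext('2d') calls):

```javascript
// Canvas-to-canvas copy via drawImage():
function copyViaDrawImage(srcCanvas, dstCtx) {
  dstCtx.drawImage(srcCanvas, 0, 0);
}

// Base64-encoded PNG via toDataURL() (the clear loser in the tests;
// the string can be loaded back through an Image element):
function copyViaDataURL(srcCanvas) {
  return srcCanvas.toDataURL('image/png');
}

// Raw pixel round-trip via getImageData()/putImageData():
function copyViaImageData(srcCtx, dstCtx, w, h) {
  var pixels = srcCtx.getImageData(0, 0, w, h);
  dstCtx.putImageData(pixels, 0, 0);
}
```

One behavioral difference worth knowing (raised in the comments below as well): putImageData() replaces destination pixels outright, while drawImage() composites with whatever is already there.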



An advantage of using getImageData() is that it can sample a portion of the canvas.  I did the same getImageData() test, but sampled squares of increasingly larger size, and for all four browsers tested, getImageData() had close to constant speed per pixel sampled.  Before I had thought that the overhead of getting any pixels would be large, so I would sometimes sample more than I needed if I thought there was information that I would be needing at a later point.  As this graph shows though, it is better to grab only what you need, because you do not pay a noticeable penalty for sampling a second time down the road.
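Sampling only the region you need can be sketched as below.  This is an illustrative helper of my own (`clampRect`, `sampleDirtyRect` are not dA muro code); clamping matters because getImageData() throws if asked for pixels outside the canvas in some browsers:

```javascript
// Clamp a requested sample rectangle to the canvas bounds so that
// getImageData() is only ever asked for pixels that actually exist.
function clampRect(x, y, w, h, canvasW, canvasH) {
  var x0 = Math.max(0, Math.floor(x));
  var y0 = Math.max(0, Math.floor(y));
  var x1 = Math.min(canvasW, Math.ceil(x + w));
  var y1 = Math.min(canvasH, Math.ceil(y + h));
  return { x: x0, y: y0, w: Math.max(0, x1 - x0), h: Math.max(0, y1 - y0) };
}

// Grab only the dirty rectangle, not the whole 1200x500 canvas.
function sampleDirtyRect(ctx, rect, canvasW, canvasH) {
  var r = clampRect(rect.x, rect.y, rect.w, rect.h, canvasW, canvasH);
  return ctx.getImageData(r.x, r.y, r.w, r.h);
}
```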






Applying to dA muro

While all this data is somewhat interesting, one must wonder how much the knowledge helps in performance tuning a web application.  Of course everybody's mileage will vary on this; I am sure that there are programmers out there who have a much better intuition for speed optimizations than I do, and who would have written faster code than I did from the get-go.  However, I think that the code I started from was probably fairly average as to what one might expect from an experienced coder who was fairly new to HTML5 and used all of the available APIs in the interest of making simple and straightforward code, in preference to premature optimizations.

The main lesson I learned is that any kind of getImageData() call or canvas copy should be avoided at all costs, delayed until a “down time” if it cannot be avoided, and if all else fails, great care should be taken to sample only the pixels that you absolutely need.  It is alright to call lineTo() many more times if it means you can avoid a call to getImageData().
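One way to implement the "delay until a down time" idea is a simple deferred-work queue.  This is my own illustrative sketch, not how dA muro does it:

```javascript
// Queue expensive buffer copies instead of doing them mid-stroke, and
// only flush when the user pauses (e.g. on pen-up or an idle timer).
function DeferredCopies() {
  this.pending = [];
}

DeferredCopies.prototype.schedule = function (copyFn) {
  this.pending.push(copyFn); // cheap: just remember the work
};

DeferredCopies.prototype.flush = function () {
  while (this.pending.length) {
    this.pending.shift()(); // the expensive copies happen here
  }
};
```

A typical usage would be scheduling the copy during a mousemove handler and calling flush() from the mouseup handler, so the drawing loop itself never blocks on getImageData().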

The first place that I tried to optimize was measured by a test that simulates a user making a bunch of short strokes relatively close to one another.  An artist would typically do this if they were cross hatching, stippling, or applying a Van Gogh-esque texture to their drawing.  A lot of the changes that I made are particular to the internals of DeviantArt muro, and a description would not make sense to somebody unfamiliar with our codebase.  However, I will describe two of the optimizations.  When a user draws a line, the new line needs to be pushed into an undo buffer, and it also needs to be reflected in the zoom “navigator” panel that is in the corner of the screen.  These two tasks cannot avoid some kind of buffer copying, but smarter buffer copying led to some noticeable speed improvements as can be seen below.  Note that Firefox 4 is not shown in the first three sets of results because I did not start testing it until later (and I did not feel like re-coding all my inefficiencies just to see how much it improved).
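The actual changes are internal to dA muro, but a generic version of "smarter buffer copying" is to snapshot only the bounding box of the stroke rather than the whole canvas.  A sketch, with my own names (`strokeBounds`; `pad` would be the brush radius):

```javascript
// Compute the bounding box of the points a stroke touched, padded by
// the brush radius. An undo buffer or navigator panel then only needs
// to copy this small rectangle instead of the full 1200x500 canvas.
function strokeBounds(points, pad) {
  var minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (var i = 0; i < points.length; i++) {
    minX = Math.min(minX, points[i].x);
    minY = Math.min(minY, points[i].y);
    maxX = Math.max(maxX, points[i].x);
    maxY = Math.max(maxY, points[i].y);
  }
  return { x: minX - pad, y: minY - pad,
           w: (maxX - minX) + 2 * pad, h: (maxY - minY) + 2 * pad };
}
```

For the short hatching strokes described above, the bounding box is tiny compared to the canvas, so each per-stroke copy touches far fewer pixels.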



The next place I turned my attention was individual brushes that were taking longer than they needed to.  In most cases this came from unnecessary calls to clearRect() or fillRect() (these calls perform similarly to putImageData() calls).  The bulk of deviantART muro’s brushes are now quite a bit faster.  Once again, Firefox 4 is not in these results because I did not benchmark it at the beginning of the project.




Web Workers

Next I looked at the filters in DeviantArt muro.  To give context to why filters are slow, one must understand that an imageData object consists mostly of an array of pixel values with R, G, B, and A values stored separately.  Thus, the imageData array has width*height*4 elements.  Most filters need to look at the data surrounding a pixel in order to determine the new value for that pixel.  Let's say that the filter looks in a radius of 3 pixels (so a square of 7 pixels to a side); that means it needs to know the value of 49 pixels, or 196 array entries, in order to color a single pixel.  JavaScript array lookups are not particularly fast, so applying a filter to a large canvas can be painfully slow.
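The layout and the cost can be made concrete with a minimal box filter.  This is an illustrative sketch (not a dA muro filter); it uses a plain array so it runs anywhere, and skips edge pixels for brevity:

```javascript
// The flat imageData layout: pixel (x, y) starts at (y * width + x) * 4,
// with its R, G, B, A values in the next four entries.
function pixelIndex(x, y, width) {
  return (y * width + x) * 4;
}

// Minimal box filter over one channel. With radius 3 the inner loops
// read a 7x7 block: 49 pixels, i.e. 196 array entries per output pixel,
// which is why large-radius filters are painfully slow.
function boxFilterChannel(data, width, height, radius, channel) {
  var out = data.slice(0); // copy; edges are left unfiltered here
  for (var y = radius; y < height - radius; y++) {
    for (var x = radius; x < width - radius; x++) {
      var sum = 0, count = 0;
      for (var dy = -radius; dy <= radius; dy++) {
        for (var dx = -radius; dx <= radius; dx++) {
          sum += data[pixelIndex(x + dx, y + dy, width) + channel];
          count++;
        }
      }
      out[pixelIndex(x, y, width) + channel] = Math.round(sum / count);
    }
  }
  return out;
}
```

A real implementation would use a sliding-window sum so each step only adds one column and drops one, rather than re-reading the whole block per pixel (an optimization one of the commenters below also points out).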

I figured that the good news about the filter problem was that it is something that can be easily parallelized. Most modern computers have at least 2 processor cores, so it is a shame to leave one of those idle while a single browser UI thread is churning away.  So, I prototyped a change that split the canvas into several chunks and passed the filtering off to some web worker threads.
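The chunking step can be sketched as below (my own helper name; each band's rows would then be sliced out of the pixel array and handed to one worker via postMessage()):

```javascript
// Split the canvas into horizontal bands, one per worker. Each band
// records the row range that worker is responsible for filtering.
function splitIntoBands(height, workerCount) {
  var bands = [];
  var rows = Math.ceil(height / workerCount);
  for (var y = 0; y < height; y += rows) {
    bands.push({ startRow: y, rowCount: Math.min(rows, height - y) });
  }
  return bands;
}
```

Note that for filters with a sampling radius, each band actually needs a few extra overlap rows from its neighbors so edge pixels can be computed correctly; the sketch above ignores that detail.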

The first problem I ran into is that web workers do not have a concept of shared memory.  Data passed to and from them must go through calls to postMessage().  An article on the Mozilla blog indicated that internally these messages are passed as JSON strings.  This is a problem for a task like filtering, which operates on a very large data set.  The cost of JSON encoding is not small compared to the cost of the actual computation.  Note also that in WebKit browsers you cannot assign an array reference into an imageData's data, so you have to pay the penalty of doing a memory copy from the JSON-decoded array into an imageData object.

The results of my experiment were mixed.  Firefox 3 was quite slow before the change, and sped up by a factor of 3 when it was parallelized.  Safari, on the other hand, spent a long time churning before the threads even started executing (I cannot be sure, but I suspect that this was while the JSON encoding was happening), and then for some unknown reason the multiple threads each took a lot longer than the single UI thread.  Chrome’s threads ran very quickly at first, but then it sat for a little while before returning the data to the UI thread.

I spent a little bit of time trying to debug these issues, but eventually gave up.  From my experiences I would say that web workers are a cool technology that will be really useful someday, but at the moment some browsers are not quite ready for this particular use case.

Below you can see the CPU utilization of the two cores of my machine when the filtering code is running in a single thread vs when it is running in web worker threads.




Comments
GaySparkles Featured By Owner Sep 9, 2011
That is Amazing
VonEyEzine Featured By Owner Sep 6, 2011  Hobbyist Digital Artist
i appreciated your work in figuring this lagg issue with deviant muro. is there some way to keep it from lagging after drawing so many lines, because it chokes! is their a browser memory allowance that makes it choke after so many lines? what about the wacom plugin for chrome? how does that factor in?
GisaPizzatto Featured By Owner Mar 11, 2011  Professional Traditional Artist
Featured: [link]
mesh2325 Featured By Owner Feb 26, 2011
When using get / putImageData for buffering / caching, did you run into any issues around the lack of blending within existing pixels on the canvas?

You can see an example here:

[link]

Ive been looking at using get / putImageData for caching in EaselJS:

[link]

However, using putImageData replaces existing pixels, as opposed to drawImage which blends with existing pixels.

Indeed, the blending might explain why drawImage is slower than putImageData.

mike chambers
axemclion Featured By Owner Feb 22, 2011
While copying pixel data to use with putImageData, you could use an approach described in [link]

Though Webkit does not let assignment of CanvasPixelArray, you still can a loop. Chrome seems to JIT the loop and it runs fast. For Firefox and Opera, you can simply assign the pixel array to the data attribute of the image.
reybango Featured By Owner Feb 21, 2011
Are these tests posted somewhere? I'd like to run them locally.
parallellogic Featured By Owner Feb 18, 2011
~that means it needs to know the value of 196 array entries
Every time? Seems like there could be an optimization opportunity there if a column of data could be shaved off of one calculation and a column added to it in order to calculate the next pixel, eliminating the need for 168 of the 196 loads.

~Below you can see the CPU utilization of the two cores
Safari's 'before' case seems a tad low compared to the others and to its 'after' case, was that capturing the whole operation?

I'm not quite sure I follow your analysis of the stroke() command - like what two cases you're strictly comparing. it sounds like you're comparing one method where you draw a full line one way vs drawing part of a line one way and then finishing the line in a different way and comparing the two draw times.

:) Nicely presented
ZCochrane Featured By Owner Feb 18, 2011  Student Photographer
Very interesting! One thing I'm wondering: Have you considered WebGL? I imagine it could bring great speed-ups for a lot of your use cases, but it is obviously a completely different programming model than the normal canvas.
mudimba Featured By Owner Feb 18, 2011  Hobbyist
Perhaps I misunderstood, but doesn't WebGL just deal with 3D? At this point we do not have the need for any 3D. If/when we do, we will certainly look at it, though the last time I checked it was pretty new and a lot of browsers did not have full support for it yet.
ZCochrane Featured By Owner Feb 18, 2011  Student Photographer
It is certainly meant for 3D, but just like normal OpenGL, you can also use it in 2D only. Things like alpha blending or filters (through shaders) should be a lot faster that way, although simple things like drawing lines are less straight-forward. Of course, browser support is not exactly there yet.
medwezys Featured By Owner Feb 18, 2011
Impressive work and I am glad that you took a time to share the knowledge! However I would like to ask how did you conduct the benchmarks - what tools did you use for measuring times and how did you calculate the final results?
mudimba Featured By Owner Feb 18, 2011  Hobbyist
I wrote my own test harness in javascript. It could be run on its own, or it could be plugged into dA muro so that I could test its internals and automate its execution.
medwezys Featured By Owner Feb 19, 2011
Thanks for answering! Would you mind sharing it on github? Of course it is totally up to you, but I think many people including me would be happy to use or even contribute to it.
NAkos Featured By Owner Feb 17, 2011  Hobbyist General Artist
Now I have a better understanding of modern Javascript / HTML innards. Before this I hardly thought about CPU, memory sharing and stuff like that.

Its really strange Chrome lose on speed in these tests.
mudimba Featured By Owner Feb 17, 2011  Hobbyist
Chrome was slower in some tests, but like I said these only capture some of the data. If you want a test the best captures the overall speed of a browser, the hatching one is probably the one to look at. In that one Chrome was actually the fastest browser.
NAkos Featured By Owner Feb 18, 2011  Hobbyist General Artist
Ok, I see :) I think'll try to get more data about this. Seems interesting. Now I don't have to make any canvas related, but I really want to learn make it good. We are at the start of a new era of websites, good to learn now as everything new will build on top of this.
mudimba Featured By Owner Feb 18, 2011  Hobbyist
Yep, things are starting to get a lot more interesting. The web is going to be a much different place soon.
NAkos Featured By Owner Feb 18, 2011  Hobbyist General Artist
When I started to learn HTML by myself in elementary (lotofyears ago) I didn't think I will learn 2d graphics and soon 3D graphics to make websites... or better say: web apps. :D But I am happy with this at least it stays fun to learn and do.
recklesswaltz Featured By Owner Feb 17, 2011
Keep up the great work guys! :hug:
KingofMoebius Featured By Owner Feb 17, 2011
It really baffles me that Chrome is this much noticeably slower than, well, especially Firefox. Unless it has improved greatly, Firefox usually runs slower in most cases than Chrome on my machines. I wonder if the HTML5 has anything to do with it. . .
mudimba Featured By Owner Feb 17, 2011  Hobbyist
Each test should be taken with a grain of salt. The canvas performance is only part of the picture. If you look at the results for hatching, you will see that chrome is really fast. That is probably the best indicator of overall browser speed. You can see that from the time you pick up your pen until the time Chrome is ready for you to draw again is just over 3 milliseconds. The brush test graph is hard to read because it is trying to convey so much data, but you will see that Chrome came out ahead in most of those tests as well.

Though I do not want to use these tests to say one browser is better than another, it is safe to say that in most cases that do not involve extreme shadows, Chrome is faster than Firefox 3.
KingofMoebius Featured By Owner Feb 18, 2011
Ahhh okay. Thanks for pointing out the specific bits for me, and for the prompt response! This is interesting stuff. ;) Sometimes I wish I had the know-how to do this kind of stuff like you guys or at least fully understand everything that is said here, but I feel like I would be too impatient to learn. Hehe.
mudimba Featured By Owner Feb 18, 2011  Hobbyist
Oh, you should definitely try to learn it! Even if it just means studying a little bit every now and then. I think that using a computer gets a lot more fun when you have a firm understanding of how they do everything that they do.
KingofMoebius Featured By Owner Feb 17, 2011
At least, in some tests. I see now that other tests had FF running slower.
electricjonny Featured By Owner Feb 17, 2011  Hobbyist Photographer
I bet a lot of places would appreciate this kind of information, since HTML5 is only going to be used more and more. Although I bet you have more important things to do than to publish and keep up with a lot of tech places like that =P
poweredbyandrex Featured By Owner Feb 17, 2011
Muro and these blog posts are an inspiration for me and my canvas drawing app. Keep up the fantastic work!
VSConcepts Featured By Owner Feb 17, 2011  Professional Interface Designer
mmm. You left out a pie chart. mmm.. :pie:
Submitted on February 17, 2011