Showing posts with label Proto-Projects. Show all posts

JavaScript DOM Iteration and Date Function Optimization #

Short Version

Instead of row[k], use nextSibling to iterate between table rows in Safari and Firefox. Firefox has slow date functions; cache values whenever possible. Setting node values by using a template node subtree that you cloneNode, modify and insert is faster in Firefox and MSIE whereas setting the innerHTML property is faster in Safari.

Long Version

For my primary ongoing proto-project, I needed to do some transformations on a table's contents, to make them more human-readable. Specifically, I was iterating over all of its rows, picking a cell out of each one, extracting its contents (a date in the form of seconds from the Epoch), converting the timestamp to a nice human-readable string form, and replacing the cell's contents with that.

This all sounds very simple (and it was indeed easy to code up), but the performance I was seeing was not all that impressive. Specifically, here are the runtimes (in milliseconds) on a table with 674 rows in the three browsers that I usually test with:

Safari:1649.9 Firefox:2578.6 MSIE:618.9

Safari refers to the 1.3 beta release with the faster JavaScript implementation while Firefox is the standard version 0.9.1 and IE is of course Microsoft Internet Explorer 6.01. The first two browsers are running on 1 GHz TiBook, while IE was tested on a 1 GHz Centrino laptop.

The hardware may not match exactly, but we are (hopefully) looking for performance differences on the order of tens of percents, not the 2-4x difference that we see in the above times. I decided to investigate further, by trying to see how much time was spent just doing the iteration. This was a very simple traversal of the following manner:

for (var i=0; i < itemsTable.rows.length; i++)
    for (var j=0; j < itemsTable.rows[i].cells.length; j++)
        transformFunctions[j](itemsTable.rows[i].cells[j]);

(I am paraphrasing a bit, but performance characteristics were similar even when caching the two length values). Taking out the call to the transformation function, but leaving in a dummy assignment like var k = itemsTable.rows[i].cells[j] to make sure that all relevant table cells were accessed resulted in the following runtimes (the iteration was repeated ten times to reduce issues with timer accuracy, times presented are still per run and thus directly comparable to those above):

Safari:893.9 Firefox:667.1 MSIE:37.1

As it can be seen, Safari spends half its time just iterating over the elements, while Firefox needs a fifth of its longer runtime for the same task. Only in MSIE's case does the iteration represent a negligible portion. This prompted me to evaluate a different iteration method, one that moves from row to row and from cell to cell using a node's nextSibling property, like this:

for (var row = itemsTable.rows[1]; row != null; row = row.nextSibling)
    if (row.nodeType == 1)
        for (var cell = row.firstChild, i = 0; cell != null; cell = cell.nextSibling)
            if (cell.nodeType == 1)
                transformFunctions[i++](cell);

(The nodeType == 1 comparison is needed since whitespace between rows/cells may be included as text nodes, whose type is 3.) This code snippet (again with the call to the transform function suppressed) resulted in:

Safari:40.0 Firefox:170.0 MSIE:38.9

MSIE is barely affected, while Safari and Firefox see very significant improvements. I'm guessing that the IE JavaScript/DOM implementation is optimized for the row[k] access method as well (perhaps by pre-computing the array when the DOM tree was generated) whereas the other two browsers simply step along k times through the linked list of nodes, every time there's a index request. A bit more digging would reveal whether Firefox and Safari have an N2 behavior for various table sizes, confirming that this was the case. At any rate, this significantly sped up the overall processing time, and now the bottleneck was the actual transformation that the function was doing. To see where it was spending its time, we ran it so that it just did the timestamp to string computation, as compared to doing the DOM operations as well (getting the cell's contents and later replacing them):

Iteration and Computation

Safari:208.1 Firefox:941.6 MSIE:95.1

Iteration, Computation and DOM Operations

Safari:550.3 Firefox:2081.7 MSIE:620.7

It seems that Safari and IE spend more time with DOM operations, while in Firefox's case the split is more even. As a quick change to help with the DOM issue, I changed the way I obtained the cell's contents. Rather than getting its innerHTML property, which presumably requires the browser to reconstruct the text from the node hierarchy, we rely on the fact that we know the cell's contents are plain text, and thus we can get its value directly with cell.firstChild.nodeValue. Running with this code gets us:

Safari:522.7 Firefox:2041.0 MSIE:587.7

A small (and reproducible) improvement, but not significant perceptually. I next decided to focus on the date operations themselves. The string conversion is done in two parts, one for the date and one for the time. In my application's case, it is likely that the same date will appear several times, therefore it makes sense to cache the strings generated for that, and reuse them in later iterations instead of recomputing them. With this particular dataset, the hit rate of this cache was 86.7%. Running the entire thing, we get:

Safari:474.3 Firefox:1839.7 MSIE:571.0

Firefox is helped the most, confirming the previous observation that its date functions are slow. I then realized that once I created a date object, I kept making calls to its accessor functions (getMonth(), getFullYear(), etc.) The small tweak of making these calls only once and remembering their values for later comparisons resulted in:

Safari:463.3 Firefox:1548.3 MSIE:571.0

Firefox is significantly helped yet again, and now for it too, the DOM operations dominate the runtime. As a final attempt in tweaking them, I tried a different approach when changing the cell's contents. Normally, I don't just get the resulting date string and use it; rather the date and time portions are put into different <span>s, aligned on opposite sides of the cell. Rather than generating these spans by setting the innerHTML property of the cell and letting the browser parse out the DOM tree, I attempted to create the sub-tree directly. I first created a static "template" subtree at initialization time. Then, when it came time to set the cell's contents, I made a (deep) copy of this template subtree by using cloneNode and replaced the values of its two text nodes with the strings. Finally, I replaced the original text child of the cell with this new sub-tree. Timing this resulted in:

Safari:805.0 Firefox:1483.0 MSIE:433.7

For the first time we see an optimization that hurts one browser (Safari) while helping the two others (Firefox and MSIE), to significant degrees in both cases. In my case, I decided to simply revert back to the innerHTML method, but it may be worth it to actually support both methods and switch based on the user's browser.

Finally, here are the percentage speedups that all of the tweaks brought (using the fastest time for each browser):

Safari:256% Firefox:74% MSIE:43%

I must say, I'm quite impressed with IE's JavaScript implementation, especially considering how many (perceived) issues there are with its other components, like its CSS engine.

All Mac Developers Think Alike? #

Are these people the competition or possible beta testers?

Early Bird... #

What seems to happen when you take too long on your pet project is that someone else gets to it ahead of you. Though perhaps, in this particular case, there's room for improvement.

Mail Scripting: Part Deux #

Now that I have determined what the best (i.e. fastest) way to get Perl talking to AppleScript it is, it's time to actually get some data out of it. My first attempt was very simplistic: have a handler export all the data that I needed as a string (in this specific example, we're getting the list of mailboxes, but a similar approach can work for getting any sort of data out of the script):

on «event MailGMbx»
    set output to ""
    tell application "Mail"
            repeat with i from 1 to mailboxCount
                set output to name of item i of allMailboxes & "\n"
            end repeat
    end tell
    return output
end «event MailGMbx»

This handler can be invoked with Mac::OSA::Simple's call method, and the result can be extracted by using Perl's split command. Some benchmarking revealed that this had reasonable performance for this particular case (there never are too many mailboxes), but that it degraded severely when dealing with more data. For example, getting a list of all the subjects, senders, dates, etc. of a mailbox with ~1000 messages took around 10 seconds. I initially thought that this was due to poor string handling in AppleScript (much in the same way that the String class in Java isn't meant for repeated appends, leaving that the string building tasks instead to StringBuffer). However, even not doing any appends in the loop (simply setting output to the current value) had similar behavior, performance-wise.

Since Mac::OSA::Simple's call function can handle lists as result values, I tried returning the name of every mailbox directly instead. Much to my surprise(or perhaps not - doing the iterations at a lower level was bound to be faster), that worked much better. In the mailbox case, the string building way took an average of 0.115 seconds per handler invocation, whereas returning a list directly took 0.0472 seconds. In even more demanding situations, such as the above-mentioned mailbox headers case, it was an order of magnitude faster.

Deviating a bit from the above, I also attempted a different method altogether. In my Googling yesterday, I came across Mac::Glue, an alternative way of bridging the Perl-AppleScript (or more correctly Perl-AppleEvents) gap. The module very cleverly loads the scripting dictionaries of applications, and creates a bit of glue code that allows interaction with AppleEvent objects directly from Perl, as described in this article. Thus there is no need to have an external script that is invoked from Perl; everything can be done from The One True Language. For example, the above mailbox-extracting case would be done as follows:

my @mailboxes = ();

my @mailboxesAE = $mailApp->prop("mailboxes")->get;

my $i = 0;
for my $mailboxAE (@mailboxesAE)
{
    my $mailboxRef = {id => $i++,
                      name => $mailboxAE->prop("name")->get};
    push(@mailboxes, $mailboxRef);  
}

return @mailboxes;

However, it turns out that, despite the greater ease of use (and flexibility) that Mac::Glue allows, it's not going to be feasible to use it in my project, for two reasons. The first is that performance seems to be much worse; the above code runs at 1.97 seconds per call, an order of magnitude slower than even the string based method above. More importantly, it also has memory leak issues, with the above example losing ~2 megabytes per call (there were 52 mailboxes in the list). Since I will be using this from within a daemon process that has a long lifespan, this would not be acceptable.

The developer's journal has the answer as to why this is the case. Mac OS X changed the way data could be extracted out of AppleEvent descriptors (AEDescs). Rather than simply getting a handle to the descriptor's data portion, one must now allocate some memory and request that the data be copied there (with the original being out of reach). Mac::Glue, having started life as a Mac OS 9-era MacPerl module, does not fully take this into account yet. That is, it does the copying necessary, but it never disposes of the data once the Perl object goes out of scope or is otherwise garbage collected. Until this is fixed, this module is of limited use to me.

Perl, AppleScript and Mail.app: Prototyping and Benchmarking #

I am investigating a new product for Mscape, since working on the same thing for six years can get somewhat boring. This would involve exposing Mail.app's mailboxes and messages with an alternative GUI, thus I needed to figure out the best way to get data out of it.

The most direct way would be as a plug-in (a.k.a bundles). However, this seems to be an undocumented (and unsupported) approach that relies on private Mail.app methods. There are several plug-ins that exist in spite of this, and presumably their developers have a vested interest in keeping the reverse-engineered headers up to date, but there is no absolute guarantee that this approach will always work. Furthermore, there appear to be some complications with this approach, licensing-wise. Most of the plug-ins are open source, specifically under the GPL. I'm guessing that this prevents me from grabbing their headers and incorporating them into my (closed-source) product. httpMail appears to be under a BSD license and may thus be more permissive about such things, but it's still a messy thing that I'd rather not deal with.

An alternative approach would be to rely on Mail.app's AppleScript support, which has the advantage of being officially supported by Apple, with the trade-offs of lower performance and less functionality. Since, at least in the prototyping stage, I'll be working with Perl (I don't relish the idea of writing something entirely in AppleScript), I went off on the tangent of investigating the best way to tie these two scripting languages together. The simplest way would be to invoke the osacript program, much in the way that I did for my blogroll generator. However, this has some pretty significant overheads, specifically having to spawn another process and compile the script. Some searching turned up a post benchmarking the various Perl-AppleScript glue mechanisms. The fastest appeared to be the DoAppleScript method from the MacPerl package. However, the post mentioned that Mac::OSA::Simple supported pre-compiled scripts as well, something that was not tested. Using this functionality is very simple; it is simply a matter of using the compile_applescript function, and then later calling the execute() method on the resulting object. Incorporating this in the benchmark gives us the following test results (averaged over three runs):

runscript:       42.83 executions/sec
applescpt:       306.66 executions/sec
osascript:       312.33 executions/sec
doscript:        410.00 executions/sec
applescpt_comp: 1108.67 executions/sec

As it can be seen, the compiled version is by far the faster one (the fast osascript times are deceiving, as the post mentions, in fact its speed is closer to that of runscript). However, the AppleScript snippet that is used for benchmarking is very simple (asking the Finder for the name of the startup disk). Running a more realistic script (that iterates through Mail.app's mailboxes, picks one, and returns the subjects of the 131 messages contained within) gives the following results:

applescpt:      6.35 executions/sec
doscript:       6.63 executions/sec
applescpt_comp: 6.72 executions/sec

The spread is much tighter, and so any method should be satisfactory. However, the pre-compiled, Mac::OSA::Simple, approach is still preferable, since in addition to its slight performance advantage, it also provides some handy methods, like the ability to invoke specific handlers within a script. The only approach that may provide better performance and more flexibility would be to use AppleEvents directly, via Mac::AppleEvents::Simple, but this would be much more tedious, since I would have to build them (the events) by hand.