Archive for March 2011
JavaScript Quotations
Lately I’ve been working on ECMA3 conformance in IronJS, but last night I took a small side-tour into something completely different: JavaScript quotations. The idea is similar to the one found in F# code quotations or Lisp macros, though not as evolved as either of them – but still pretty nice. I want to say right away that this is my own extension to the JavaScript language and it’s not something you can do in any browser or other implementation (that I know of).
What it does is introduce a new symbol, @ (stolen from F#), which gives you access to the syntax tree of a function at runtime and lets you modify it as you see fit and then compile it into a regular JavaScript function. While it could be abused to no end, it allows for some pretty interesting possibilities. The example I’m going to show creates a function which reads a property out of an object and optionally compiles a console.log statement into the function body.
function makeLoggedPropertyReader(includeLog, propertyName) {
  // Note the @ symbol in front of the function keyword
  var quoted = @function (x) {
    if(x) {
      console.log(x);
    }
    return x._;
  };

  // This is what the quoted structure looks like,
  // it's basically a syntax tree that you can
  // traverse, modify as you see fit and then compile
  /*
  quoted = {
    type: 19, // function
    body: [
      {
        type: 18, // if statement
        test: {
          type: 5, // identifier
          value: "x"
        },
        trueBranch: [
          {
            type: 9, // method call
            target: {
              type: 5, // identifier
              value: "console"
            },
            member: {
              type: 5, // identifier
              value: "log"
            },
            arguments: [
              {
                type: 5, // identifier
                value: "x"
              }
            ]
          }
        ],
        elseBranch: {
          type: 0 // void node
        }
      },
      {
        type: 25, // return
        value: {
          type: 43, // property accessor
          object: {
            type: 5, // identifier
            value: "x"
          },
          name: {
            type: 5, // identifier
            value: "_" // the value we're going to replace
          }
        }
      }
    ]
  }
  */

  // The first statement in the quoted body
  // is the if statement, which we will conditionally
  // remove depending on the boolean value of includeLog
  if(!includeLog) {
    quoted.body[0] = Quotations.voidStatement();
  }

  // Pull the second statement out of the function body
  var returnStmt = quoted.body[1];

  // Pull the value node out of the return statement
  var propertyAccessor = returnStmt.value;

  // Set the value of the "name" node of the property accessor
  // to the string value of the propertyName that is passed in
  propertyAccessor.name.value = propertyName.toString();

  // We've modified our quoted expression and we can now
  // compile it so it becomes a regular JavaScript function
  return quoted.compile();
}

// And here we'll use it:
var logged = makeLoggedPropertyReader(true, "myProp");
var notLogged = makeLoggedPropertyReader(false, "myProp");

var myObj = {myProp: "hello world"};

var xValue = logged(myObj);    // will return and print the value of myProp to console.log
var yValue = notLogged(myObj); // will only return the value of myProp
Analyzer: Single-Pass vs. Multi-Pass
I recently wrote about the new lexer and parser in IronJS, which gave an 8x performance boost to parsing. Over the past two days I’ve been looking at the AST analyzer IronJS has been using, and how to improve it. The analyzer steps through the AST produced by the parser and figures out things like closures, static types, dynamic scopes, etc. Due to the immutable nature of discriminated unions in F#, it has been forced to re-build the syntax tree to resolve everything it needs, since that sometimes requires changes to the tree; it has also been doing several passes over the syntax tree.
I’m glad to announce that with some clever use of reference cells I’ve been able to eliminate the need to re-build the AST and, thanks to having access to the internals of the new TDOP-based parser, also managed to make it require only a single pass over the syntax tree. The performance difference is pretty staggering.
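To illustrate the idea, here is a minimal sketch with made-up node types and a single closure flag; it is not the actual IronJS AST or analyzer. The point is that a mutable reference cell tucked inside an otherwise immutable union case lets the analyzer record what it learns in place, so the tree never has to be copied and one walk over it is enough.

type Ast =
  // A simplified, made-up AST just for this sketch
  | Number of double
  | Identifier of string * bool ref       // name, "is closed over" flag filled in by the analyzer
  | Function of string list * Ast list    // parameter names, body statements

// One pass: walk the tree, keep track of which names are bound by the
// enclosing function, and flip the flag on any identifier that isn't.
// No new tree is ever built.
let rec analyze (bound: Set<string>) (node: Ast) =
  match node with
  | Number _ -> ()
  | Identifier (name, isClosedOver) ->
      if not (bound.Contains name) then isClosedOver := true
  | Function (parameters, body) ->
      let bound = Set.union bound (Set.ofList parameters)
      body |> List.iter (analyze bound)

Calling analyze Set.empty on the root annotates the whole tree in one walk; the real analyzer tracks much more than this (static types, dynamic scopes, and so on), but the reference-cell trick is the same.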
As you can see, with both the new parser and the new analyzer it’s a whopping ~13x faster than the old ANTLR-based parser and multi-pass analyzer. It’s also ~4x faster than the new parser combined with the old multi-pass analyzer.
New lexer and parser in IronJS
I’ve been thinking about replacing the lexer and parser IronJS has been using for a year now, which is an ANTLR-generated LL(*) parser. The main driver behind this is that I’ve been wanting to shed the two DLL dependencies the parser caused: first the runtime for ANTLR (Antlr3.Runtime.dll) and then the parser itself (Xebic.ES3.dll). Since the parser was C# it wasn’t possible to integrate the code into the IronJS F# code base, and it had to be linked as a separate assembly.
I’m glad to announce that I’ve finally gotten around to doing this and that the new F#-based lexer and parser were pushed to the master branch on github earlier today. I also decided to remove the dependency on Microsoft.Dynamic.dll, into which I only made about ten calls. This means IronJS now only requires FSKit besides itself; the plan is to merge the FSKit functionality that IronJS requires into the IronJS project itself, so it will only be one DLL.
Another great benefit of rewriting the parser in F# is a pretty nice speed boost. If I can direct your attention to the chart below, you will see that the new lexer and parser are about eight times faster on the jQuery 1.5.1 (uncompressed) source code. This of course means that IronJS is getting even faster than it was.
Also, keep your eyes open for the first 0.2 beta that will arrive shortly.
Update:
I got a question on IRC about how the profiling was done, so here’s a description of it; a minimal sketch of the timing loop follows the list.
- System.Threading.Thread.CurrentThread.Priority was set to System.Threading.ThreadPriority.Highest
- Timing was done with the System.Diagnostics.Stopwatch class
- The source code was loaded into memory before lexing and parsing so no disk penalty would occur
- The machine, which is an i7 quad-core with 8 GB of RAM, was restarted between each test, and as many processes as possible were killed when Windows was done booting
- The projects were compiled in release mode with full optimizations and no debug info
- Each test was run ten times before timing started to make sure all assemblies were loaded in memory and there would be no JIT overhead
- After the ten warm-up runs the test was run 100 times; the ten fastest were picked and averaged
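For the curious, here is roughly what that procedure looks like as code. This is only a sketch, not the actual harness; parse is a stand-in for whatever is being measured.

open System.Diagnostics
open System.Threading

// 'parse' is a placeholder for the function under test, e.g. lexing and
// parsing the jQuery source that has already been loaded into a string.
let time (parse: unit -> unit) =
  Thread.CurrentThread.Priority <- ThreadPriority.Highest

  // Ten warm-up runs so all assemblies are loaded and the JIT is done
  for _ in 1 .. 10 do parse ()

  // 100 timed runs; keep the ten fastest and average them
  let timings =
    [ for _ in 1 .. 100 do
        let sw = Stopwatch.StartNew()
        parse ()
        sw.Stop()
        yield sw.Elapsed.TotalMilliseconds ]

  timings |> List.sort |> Seq.take 10 |> Seq.average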
If there are any flaws in the above process please do point them out and I will re-do the test and post new results.