Thursday, November 15, 2007

JUEL vs. MVEL

Some people recently asked me what the advantages of MVEL are over, say, JUEL. After all, JUEL implements a JSR standard, and it is supposed to be very fast. So I thought I'd put JUEL through its paces.

I designed MVEL to be super-easy to integrate, to eliminate the need for factory caching, and to integrate cleanly with sophisticated interfaces and APIs.

Even though we provide a very simple facade to MVEL, we also provide fully extensible external resolvers, extensible coercion support, and so on. At the same time, we provide a trivial, bullet-proof integration API that hides the complexity for those who don't need it.

Certainly, when we use the "simple approach" to MVEL, we compromise some performance: the MVEL convenience methods use lightweight wrappers to inject variables from Maps and the like. When using these convenience methods, MVEL must surely be slower than an EL implementation that takes advantage of the performance-boosting power of factory caching, right? Well, let's see.
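For reference, here's what that "simple approach" looks like in code. This is a minimal sketch against the 1.2-era org.mvel API (the class name SimpleFacadeExample is mine; Foo is the POJO defined below):

import org.mvel.MVEL;

import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class SimpleFacadeExample {
    public static void main(String[] args) {
        Map vars = new HashMap();
        vars.put("foo", new Foo());

        // One-shot interpreted evaluation: no factories or contexts to manage.
        Object name = MVEL.eval("foo.name", vars);

        // Or compile once, then execute repeatedly against the Map.
        Serializable compiled = MVEL.compileExpression("foo.name");
        Object nameAgain = MVEL.executeExpression(compiled, vars);
    }
}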

Following the integration instructions, I faithfully tried to perform a "fair" test between MVEL and JUEL.

First, I defined a simple POJO class: Foo.java

public class Foo {
    private String name = "Foo";
    private Bar bar = new Bar();

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Bar getBar() {
        return bar;
    }

    public void setBar(Bar bar) {
        this.bar = bar;
    }
}
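Foo references a Bar bean that the post doesn't show; a minimal stand-in (my assumption, not the original source) would be:

public class Bar {
    private String name = "Bar";

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}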


Then I defined two simple tests to run side by side. Here are the individual tests, but you can view the full file here.

public void runJUEL() {
    ExpressionFactory factory = new de.odysseus.el.ExpressionFactoryImpl();
    de.odysseus.el.util.SimpleContext context = new de.odysseus.el.util.SimpleContext();
    context.setVariable("foo", factory.createValueExpression(foo, Foo.class));
    ValueExpression v = factory.createValueExpression(context, "${foo.name}", String.class);

    for (int i = 0; i < 100000; i++) {
        if (!"Foo".equals(v.getValue(context))) throw new RuntimeException("invalid value returned");
    }
}

public void runMVEL() {
    Serializable s = MVEL.compileExpression("foo.name");

    // inject variables into MVEL via a Map (for convenience)
    Map map = new HashMap(1);
    map.put("foo", foo);

    for (int i = 0; i < 100000; i++) {
        if (!"Foo".equals(MVEL.executeExpression(s, map)))
            throw new RuntimeException("invalid value returned");
    }
}
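The timing harness around these two methods looked roughly like this (my reconstruction; the post doesn't show the driver code):

public void benchmark(String label, Runnable test) {
    // One unmeasured run to let HotSpot warm up.
    test.run();

    // Then three measured runs.
    for (int run = 1; run <= 3; run++) {
        long start = System.currentTimeMillis();
        test.run();
        System.out.println(label + " run " + run + ": "
                + (System.currentTimeMillis() - start) + "ms");
    }
}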

I elected to run each of these tests 3 times, plus 1 unmeasured HotSpot warmup run, at 100,000 iterations each. So, what were the results?

Test #    MVEL      JUEL
1         6.0ms     34.0ms
2         5.6ms     33.6ms
3         6.0ms     34.6ms
Avg.      5.8ms     34.2ms

With no cached factory, no cached context, and a simple Map used to insert variables into MVEL's VariableFactory, MVEL comes out nearly six times faster than JUEL.

Let's make it even more interesting.  

What if we remove all caching before the iteration? What if we cold-start JUEL and MVEL for each individual iteration, setting up the JUEL factories and contexts on every execution, and just using MVEL's eval() method with no pre-compilation on every execution? Both JUEL's and MVEL's results will surely worsen, but by how much? Here is roughly what each cold-start loop looks like, followed by the results:
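(This is my reconstruction of those loops; the original post doesn't show this code.)

// JUEL cold start: rebuild the factory, context, and expression every iteration.
for (int i = 0; i < 100000; i++) {
    ExpressionFactory factory = new de.odysseus.el.ExpressionFactoryImpl();
    de.odysseus.el.util.SimpleContext context = new de.odysseus.el.util.SimpleContext();
    context.setVariable("foo", factory.createValueExpression(foo, Foo.class));
    ValueExpression v = factory.createValueExpression(context, "${foo.name}", String.class);
    if (!"Foo".equals(v.getValue(context))) throw new RuntimeException("invalid value returned");
}

// MVEL cold start: interpret the expression from scratch every iteration.
for (int i = 0; i < 100000; i++) {
    if (!"Foo".equals(MVEL.eval("foo.name", map))) throw new RuntimeException("invalid value returned");
}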

Test #    MVEL      JUEL
1         94.3ms    2426.0ms
2         94.0ms    2421.6ms
3         96.0ms    3400.6ms
Avg.      94.7ms    2749.4ms

When we take away the advantage of resource reuse from both MVEL and JUEL, MVEL slows down by a factor of 16, and JUEL slows down by a factor of 72. Boo-ya!

Tuesday, August 28, 2007

MVEL's Compiled vs. Interpreted Mode.

One of the things that makes MVEL so neat is that it functions both purely interpreted (i.e. executing while parsing) and with pre-compiled scripts.

Contrary to what you might think, calling eval() in MVEL does not simply bootstrap the compiler and execute; no AST is generated at all. Instead, the parser reduces tokens at the first opportunity and pushes the values down to an execution stack. This is very much unlike running a compiled script, which often involves little or no use of the execution stack.

Take the following example:

(4 + 5) - 2 + a
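In API terms, you could run this expression either way (a minimal sketch):

Map vars = new HashMap();
vars.put("a", 10);

// Interpreted mode: parsed, reduced, and executed in a single pass; no AST survives.
Object interpreted = MVEL.eval("(4 + 5) - 2 + a", vars);

// Compiled mode: an optimized evaluation tree (or bytecode) is built up front.
Serializable compiled = MVEL.compileExpression("(4 + 5) - 2 + a");
Object executed = MVEL.executeExpression(compiled, vars);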

When you compile this statement, MVEL generates a very efficient evaluation tree, or even bytecode, to accelerate execution. But when you run in interpreted mode, MVEL doesn't bother with any of that overhead. Instead, it takes the shortest possible path to solving the problem, emitting instructions and values onto an execution stack as it parses. Here is how it works:

1. "(" is encountered. The parser scans forward to determine where this nest ends.
2. The nest
"4 + 5" is returned.
3. MVEL pushes
4 onto the stack.
4. MVEL pushes
5 onto the stack.
5. MVEL pushes the
'+' opcode onto the stack.
6. MVEL determines that the stack can be reduced.
7. MVEL pops the opcode off the stack, and the 2 parameters (4 and 5), and passes it to the operation handling code.
8. MVEL pushes the result (
9) onto the stack.
9. MVEL pushes
2 onto the stack.
10. MVEL pushes the
'-' opcode onto the stack.
11. MVEL pops the opcode off the stack, and the 2 parameters (9 and 2), and passes it to the operation handling code.
12. MVEL pushes the result (
7) onto the stack.
13. MVEL resolves the value of
a, and pushes it onto the stack. (which for the sake of argument is: 10)
14. MVEL pushes the
'+' opcode onto the stack.
15. MVEL pops the opcode off the stack, and the 2 parameters (7 and 10), and passes it to the operation handling code.
16. MVEL pushes the result (
17) onto the stack.
17. MVEL has nothing left to parse, MVEL pops the value off the stack (
17) and returns.
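To make that concrete, here's a toy stack machine in the same spirit (my own sketch, far simpler than MVEL's actual interpreter; it reduces as soon as an opcode arrives, where MVEL pushes the opcode and then reduces):

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class ToyStackEval {
    // Evaluates a pre-tokenized stream such as: "4", "5", "+", "2", "-", "a", "+"
    // (operands and opcodes arriving in the order the steps above describe).
    public static int run(String[] tokens, Map<String, Integer> vars) {
        Deque<Integer> stack = new ArrayDeque<Integer>();
        for (String tok : tokens) {
            if (tok.equals("+") || tok.equals("-")) {
                // Reduce: pop the two parameters and apply the opcode.
                int right = stack.pop();
                int left = stack.pop();
                stack.push(tok.equals("+") ? left + right : left - right);
            } else if (vars.containsKey(tok)) {
                stack.push(vars.get(tok)); // resolve a variable like 'a'
            } else {
                stack.push(Integer.parseInt(tok)); // literal operand
            }
        }
        return stack.pop(); // nothing left to parse: pop the result and return
    }

    public static void main(String[] args) {
        // (4 + 5) - 2 + a with a = 10 yields 17, as in the walkthrough.
        Map<String, Integer> vars = new HashMap<String, Integer>();
        vars.put("a", 10);
        System.out.println(run(new String[]{"4", "5", "+", "2", "-", "a", "+"}, vars));
    }
}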

That should give you the basic idea of how MVEL is able to parse and execute on the fly in eval(), rather than building an AST. :)

Wednesday, July 18, 2007

MVEL by the Numbers. The Real Story.

Many people will have read the excessively long flame war that was set off over at The Server Side when I posted some numbers comparing MVEL's performance with that of OGNL 2.7.

Jesse Kuhnert correctly pointed out that I was not properly testing OGNL's new bytecode enhancer, due to my ignorance of the API.

Indeed, OGNL 2.7 would appear to be faster than MVEL in terms of pure bytecode generation. But this is not the entire story. If we take a look at MVEL's reflection-based performance vs. OGNL's reflection-based performance, it's no contest.

Let's take a look at some test source code (using latest OGNL and latest MVEL 1.2 beta):

--snip--snip---
// Expression we'll test.
String expression = "foo.bar.name";

// Number of iterations.
int iterations = 100000;

Base base = new Base();

// Compile expression in MVEL.
Serializable mvelCompiled = MVEL.compileExpression(expression);

// Disable MVEL's JIT by making the default optimizer the Reflective optimizer.
OptimizerFactory.setDefaultOptimizer(OptimizerFactory.SAFE_REFLECTIVE);

// Compile OGNL AST.
Object ognlCompiled = Ognl.parseExpression(expression);

// We loop twice, once to warm up HotSpot.
for (int repeat = 0; repeat < 2; repeat++) {
    long tm = System.currentTimeMillis();
    for (int i = 0; i < iterations; i++) {
        MVEL.executeExpression(mvelCompiled, base);
    }

    // Let's not report the results the first time around; HotSpot needs to warm up.
    if (repeat != 0) System.out.println("MVEL : " + (System.currentTimeMillis() - tm) + "ms");

    tm = System.currentTimeMillis();
    for (int i = 0; i < iterations; i++) {
        Ognl.getValue(ognlCompiled, base);
    }

    // See above.
    if (repeat != 0) System.out.println("OGNL : " + (System.currentTimeMillis() - tm) + "ms");
}


Full source here

In the above test we put MVEL and OGNL on equal footing. We kill MVEL's internal JIT and we let OGNL and MVEL fight it out using pure reflection. So what do the results look like?



MVEL : 56ms
OGNL : 615ms


Pretty big difference. MVEL is 10 times faster in reflection mode. And you might say: so what? I'm just going to use the JIT from now until forever.

That sounds like a great idea, until you run into the great caveat of on-the-fly code generation in Java: classes don't get garbage collected until their ClassLoader is garbage collected.

Early on in development, we realized the problem created by thousands of bytecode optimizers being generated on the fly in large systems, when we started running into JVM crashes due to an overflow of classes in the permanent generation. Work-arounds exist, such as wacky JVM options (which often come with wacky caveats of their own, like breaking singletons) and one-classloader-per-class schemes (a horrible idea). But we found it impossible to provide a consistent, safe, out-of-the-box integration experience for web frameworks and other systems that might find themselves using MVEL as a binding language.

Instead of waiting for the Java world to catch up with the world of code generation, we decided to keep our eye on the ball of reflection-based performance; as such, MVEL allows for parallel and hybrid compilation of both reflective accessors and bytecode-generated accessors.

But why is MVEL's bytecode still around 1.2 to 1.5 times slower than OGNL's generated bytecode?

MVEL, as a dynamically typed language (with optional static typing), still requires callbacks into the MVEL runtime in order to perform expression egress type narrowing (I'll explain that later) and to provide consistent type coercion. In fact, unlike OGNL's bytecode compiler, which performs static type analysis for method calls and accessors, MVEL provides inline dynamic de-optimization points that allow the same compiled expression to be used with two unrelated types. For example:


class Foo {
    private String name;

    public String getName() { return name; }
}

class Bar {
    private CharSequence name;

    public CharSequence getName() { return name; }
}


Say we instantiate both Foo and Bar, compile the expression name, and then apply that compiled expression against each object. Observe the ClassCastException in OGNL 2.7, while MVEL re-optimizes and hums along :)
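In code, the scenario looks roughly like this (a sketch assuming the classes above, with their name fields populated):

Serializable compiled = MVEL.compileExpression("name");

// The first execution builds an optimized accessor against Foo...
Object fromFoo = MVEL.executeExpression(compiled, new Foo());

// ...and when handed an unrelated type, MVEL hits an inline de-optimization
// point and transparently re-optimizes rather than throwing a ClassCastException.
Object fromBar = MVEL.executeExpression(compiled, new Bar());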