Archive | May 2013

When to Compile your Regex

In .NET, it is a common misconception that a compiled Regex always runs faster than an uncompiled one once the one-time cost of compiling it, paid at instantiation, is out of the way. This is not always true.

Recently I wrote a URL rewriter, similar to Helicon but a little more powerful and flexible. It allows you to define rules in a config file, and then rewrite a captured input string to an output string, replacing parts of the URL via Regex capture groups along the way. An example of the configuration looks like this:

<add original="^/(.*)/customer(.*)$" rewritten="/$1/cust$2" redirect="false" ignorecase="true" />

What the above says is that if my URL is “/”, then anything, then “/customer”, then optionally anything else, rewrite it to the “/cust” version of the URL, with the captured pieces of the original ($1 and $2) applied to the output.
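To make that concrete, here is a minimal sketch of how such a rule behaves when handed to Regex.Replace. The pattern and replacement are copied from the rule above; the class name, the method name, and the sample paths are made up for illustration and are not part of the actual rewriter.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RewriteExample
{
    // Pattern taken from the sample rule; ignorecase="true" maps to IgnoreCase.
    private static readonly Regex Rule =
        new Regex("^/(.*)/customer(.*)$", RegexOptions.IgnoreCase);

    // Hypothetical helper: applies the sample rule to a single path.
    public static string Rewrite(string path)
    {
        // Regex.Replace returns the input unchanged when nothing matches.
        return Rule.Replace(path, "/$1/cust$2");
    }

    public static void Main()
    {
        Console.WriteLine(Rewrite("/store/customer/42")); // /store/cust/42
        Console.WriteLine(Rewrite("/Store/CUSTOMER"));    // /Store/cust
        Console.WriteLine(Rewrite("/about"));             // /about
    }
}
```

Note that the first capture group is greedy, so a path containing “/customer” more than once is only rewritten at the last occurrence.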

Now, considering myself to be an efficient developer, I have always precompiled my Regex by passing the RegexOptions.Compiled option to the constructor. The rewriter code that parsed my XML file and created the Regex objects looked like this:

Regex rewriteRuleRegex = null;

// Determine if we ignore case
if (rewriteConfiguration.IgnoreCase)
{
    rewriteRuleRegex = new Regex(rewriteConfiguration.Original, RegexOptions.Compiled | RegexOptions.IgnoreCase);
}
else
{
    rewriteRuleRegex = new Regex(rewriteConfiguration.Original, RegexOptions.Compiled);
}

I had 144 rules in my file, and decided to performance test the URL rewriter against approximately 100 unique requests. The result was that it sucked, REALLY, REALLY badly, with 87% of the request lifecycle being spent in Regex.Replace!

[Image: Precompiled Sucks]

That’s INSANE! 87% is COMPLETELY unacceptable! After all I’d heard about the .NET Regex engine being efficient, this was freaking terrible! I spent a day pulling my hair out and reading a ton of articles, until I came across one from Jeff Atwood about when to precompile and when not to. So, I took the RegexOptions.Compiled flag out of my code:

Regex rewriteRuleRegex = null;

// Determine if we ignore case
if (rewriteConfiguration.IgnoreCase)
{
    rewriteRuleRegex = new Regex(rewriteConfiguration.Original, RegexOptions.IgnoreCase);
}
else
{
    rewriteRuleRegex = new Regex(rewriteConfiguration.Original);
}

And re-ran my tests:

[Image: Precompiled Sucks]

MUCH better.

So what I learned is that if you CONTROL the input (that is, it’s a small set of inputs that you know in advance), precompiling is good. If you don’t control the input (such as user-generated URLs that are sent to your application), DO NOT PRECOMPILE. EVER.
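If you want to see how your own rule set behaves before picking a side, a rough timing harness along these lines should be enough. This is only a sketch and not the profiler run described above: the class name, the Measure method, and the generated patterns and URLs are all hypothetical stand-ins for the 144 rules and roughly 100 requests, and the numbers it prints will depend on your machine and .NET version.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class RegexTiming
{
    // Builds one Regex per pattern with the given options, runs every rule
    // against every URL, and returns the total elapsed time. Construction
    // is timed too, since that is where RegexOptions.Compiled pays its cost.
    public static TimeSpan Measure(IList<string> patterns, IList<string> urls, RegexOptions options)
    {
        Stopwatch timer = Stopwatch.StartNew();

        List<Regex> rules = new List<Regex>();
        foreach (string pattern in patterns)
        {
            rules.Add(new Regex(pattern, options));
        }

        foreach (string url in urls)
        {
            foreach (Regex rule in rules)
            {
                rule.Replace(url, "/$1/cust$2");
            }
        }

        timer.Stop();
        return timer.Elapsed;
    }

    public static void Main()
    {
        // Made-up rules shaped like the sample rule, one per index.
        List<string> patterns = new List<string>();
        for (int i = 0; i < 144; i++)
        {
            patterns.Add("^/(.*)/customer" + i + "(.*)$");
        }

        // Made-up request paths.
        List<string> urls = new List<string>();
        for (int i = 0; i < 100; i++)
        {
            urls.Add("/store" + i + "/customer" + (i % 144) + "/orders");
        }

        Console.WriteLine("Interpreted: " + Measure(patterns, urls, RegexOptions.IgnoreCase));
        Console.WriteLine("Compiled:    " + Measure(patterns, urls, RegexOptions.IgnoreCase | RegexOptions.Compiled));
    }
}
```

A single pass like this is noisy, so run it a few times and vary the number of URLs before drawing conclusions from it.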