Tag Archive | performance

Custom ASP.NET MVC Action Result Cache Attribute

If you’re working on an application built using ASP.NET MVC, you’re hopefully aware of the OutputCacheAttribute, which can be used to statically cache your dynamic web pages. By adding this attribute to a controller or action method, the output of the method(s) will be stored in memory. For example, if your action method renders a view, then the view page will be cached in memory. This cached view page is then available to the application for all subsequent requests (or until the item expires out of the cache), which can retrieve it from memory rather than redoing the work to create the result. This is the essence of caching: trading memory for performance.
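For reference, applying the built-in attribute is as simple as decorating a controller or action method (a minimal sketch; the duration and action method shown here are purely illustrative):

[OutputCache(Duration = 3600, VaryByParam = "none")]
public ActionResult About()
{
    // The rendered HTML of this view is served from cache for an hour
    return View();
}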

The OutputCacheAttribute is a really powerful way to improve performance in your MVC application, but isn’t always the most practical. Because it caches the entire page as raw HTML, it circumvents a large part of the MVC pipeline and thus also skips the code that runs to generate the page. This means that if your view has dynamic content that comes from session or ViewData, such as displaying the currently logged in user’s name in the top bar, or the current time of day, or the resulting view of an invalid form post which tells your user to correct their input errors, you’ll quickly discover the error of your ways when you try to cache that page. When David accesses the logged in page for the first time and caches it, everybody else who logs in will be called David on the page. And if David fills out your empty form and presses submit, only to cache the resulting input validation error page, then everybody will see David’s completed form when they have errors too – maybe even including sensitive data like his username, password, or even his credit card information. I think that most of us have seen this kind of (often humorous) caching error before. It’s scary stuff, nonetheless.

A great way to balance the benefits of output caching with the dynamic content and features that the modern ASP.NET MVC web application offers is to create a custom caching attribute. This attribute can cache the ActionResult instead of the raw HTML of the page, and in doing so will allow you to cache all of the work that is done to generate the ActionResult (be it ViewResult or otherwise). Because it executes within the MVC pipeline, this custom caching attribute does not interrupt or short-circuit it. This allows for things like SessionState or ViewData to vary per cached request! It’s not quite as efficient as the true OutputCacheAttribute, but my custom ActionResultCacheAttribute is an excellent tradeoff between performance and dynamic data:

/// <summary>
/// Caches the result of an action method.
/// NOTE: you'll need refs to System.Web.Mvc and System.Runtime.Caching
/// </summary>
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class, AllowMultiple = false, Inherited = false)]
public class ActionResultCacheAttribute : ActionFilterAttribute
{
    private static readonly Dictionary<string, string[]> _varyByParamsSplitCache = new Dictionary<string, string[]>();
    private static readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private static readonly MemoryCache _cache = new MemoryCache("ActionResultCacheAttribute");

    /// <summary>
    /// The comma-separated parameters to vary the caching by.
    /// </summary>
    public string VaryByParam { get; set; }

    /// <summary>
    /// The sliding expiration, in seconds.
    /// </summary>
    public int SlidingExpiration { get; set; }

    /// <summary>
    /// The duration to cache before expiration, in seconds.
    /// </summary>
    public int Duration { get; set; }

    /// <summary>
    /// Occurs when an action is executing.
    /// </summary>
    /// <param name="filterContext">The filter context.</param>
    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        // Create the cache key
        var cacheKey = CreateCacheKey(filterContext.RouteData.Values, filterContext.ActionParameters);

        // Try and get the action method result from cache
        var result = _cache.Get(cacheKey) as ActionResult;
        if (result != null)
        {
            // Set the result
            filterContext.Result = result;
            return;
        }

        // Store to HttpContext Items
        filterContext.HttpContext.Items["__actionresultcacheattribute_cachekey"] = cacheKey;
    }

    /// <summary>
    /// Occurs when an action has executed.
    /// </summary>
    /// <param name="filterContext">The filter context.</param>
    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {
        // Don't cache errors
        if (filterContext.Exception != null)
        {
            return;
        }

        // Get the cache key from HttpContext Items
        var cacheKey = filterContext.HttpContext.Items["__actionresultcacheattribute_cachekey"] as string;
        if (string.IsNullOrWhiteSpace(cacheKey))
        {
            return;
        }

        // Cache the result of the action method
        if (SlidingExpiration != 0)
        {
            // MemoryCache.Add has no TimeSpan overload, so use a CacheItemPolicy for sliding expiration
            _cache.Add(cacheKey, filterContext.Result, new CacheItemPolicy { SlidingExpiration = TimeSpan.FromSeconds(SlidingExpiration) });
            return;
        }

        if (Duration != 0)
        {
            _cache.Add(cacheKey, filterContext.Result, DateTime.UtcNow.AddSeconds(Duration));
            return;
        }

        // Default to 1 hour
        _cache.Add(cacheKey, filterContext.Result, DateTime.UtcNow.AddSeconds(60 * 60));
    }

    /// <summary>
    /// Creates the cache key.
    /// </summary>
    /// <param name="routeValues">The route values.</param>
    /// <returns>The cache key.</returns>
    private string CreateCacheKey(RouteValueDictionary routeValues, IDictionary<string, object> actionParameters)
    {
        // Create the cache key prefix as the controller and action method
        var sb = new StringBuilder(routeValues["controller"].ToString());
        sb.Append("_").Append(routeValues["action"].ToString());

        if (string.IsNullOrWhiteSpace(VaryByParam))
        {
            return sb.ToString();
        }

        // Append the cache key from the vary by parameters
        object varyByParamObject = null;
        string[] varyByParamsSplit = null;
        bool gotValue = false;

        _lock.EnterReadLock();
        try
        {
            gotValue = _varyByParamsSplitCache.TryGetValue(VaryByParam, out varyByParamsSplit);
        }
        finally
        {
            _lock.ExitReadLock();
        }

        if (!gotValue)
        {
            _lock.EnterWriteLock();
            try
            {
                varyByParamsSplit = VaryByParam.Split(new[] { ',', ' ' }, StringSplitOptions.RemoveEmptyEntries);
                _varyByParamsSplitCache[VaryByParam] = varyByParamsSplit;
            }
            finally
            {
                _lock.ExitWriteLock();
            }
        }

        foreach (var varyByParam in varyByParamsSplit)
        {
            // Skip invalid parameters
            if (!actionParameters.TryGetValue(varyByParam, out varyByParamObject))
            {
                continue;
            }

            // Sometimes a parameter will be null
            if (varyByParamObject == null)
            {
                continue;
            }

            sb.Append("_").Append(varyByParamObject.ToString());
        }

        return sb.ToString();
    }
}

You can apply this attribute to a controller to affect all of its action methods:

[ActionResultCache(Duration = 60 * 60 * 24)]
public class HomeController : Controller
{
    public async Task<ActionResult> TermsOfService()
    {
        return View();
    }
}

Or just apply it to individual action methods:

[ActionResultCache(Duration = 60 * 60 * 24)]
public async Task<ActionResult> TermsOfService()
{
    return View();
}

You can also use it with the VaryByParam property to vary the cached result by the parameter(s) of the action method:

[ActionResultCache(Duration = 60 * 60 * 24, VaryByParam = "username")]
public async Task<ActionResult> ViewUser(string username)
{
    var model = new UserModel
    {
        Username = username,
        ...
    };

    return View(model);
}

The main benefit of this custom caching attribute is that your session state and all global action filter attributes, etc. still run in the MVC pipeline as they would normally. The only code cached and skipped over is the method body of the action method.

Please use and enjoy! Feedback welcomed in the comments.

To Node.js Or Not To Node.js

Node.js – it has rapidly become the “new hotness” in the tech start-up realm. With each passing day, the fan base of Node lovers grows larger, spreading their rhetoric like a religion. How do you spot a Node.js user? Don’t worry, they’ll let you know. 😉

One day you’re at a regular user group meeting, sipping soda and talking with some colleagues, when the subject turns to Node. “Have you guys tried Node.js?” asks one of the people in your group. “It’s all the rage. All of the cool kids in Silicon Valley are using it!” “What does it do?” you ask, only to be bombarded with a sales pitch worthy of the best of used car lots. “Oh, it’s amazing!” they reply, sipping their diet coke and shuffling their hipster fedora and backpack with MacBook Pro in it (or something like that), “It’s server side JavaScript. It runs on a single thread and it can do 100,000 web requests a second!” They glance at the group for the oohs and ahhs, but most people just stare back with amazement in their eyes. Then, your hipster Node-loving friend drops the words that start wars: “It’s way better than .NET” – and just like that, your group is hooked. They go home, download the Node.js tools, write “Hello World”, and suddenly they’re on their way to the next user group meeting to talk about how great Node is.

Okay, so I might be exaggerating the appearance and demeanour of your average Node lover a little (read: a lot, almost entirely in fact). However, I have had this exact scenario happen repeatedly over the last six months, with ever-increasing intensity and frequency. Node users love Node. They want you to love Node. They’re excited about it.

Having given it some thought, why wouldn’t Node.js developers be excited? Want to fire up a “Hello World” web server in Node? It’s trivial:

// Load the http module to create an http server.
var http = require('http');

// Configure our HTTP server to respond with Hello World to all requests.
var server = http.createServer(function (request, response) {
  response.writeHead(200, {"Content-Type": "text/plain"});
  response.end("Hello World\n");
});

// Listen on port 8000, IP defaults to 127.0.0.1
server.listen(8000);

Want to do the same thing in .NET? Be prepared to learn about IIS, the Machine.config, the Web.config, the Process Model, how Global.asax works, either ASP.NET MVC or WebForms (huge paradigms in themselves), and how Visual Studio works. Don’t forget to learn how to create a solution, at least one web project, and how to deploy the application to IIS. Oh, and one little detail: go ahead and learn C# while you’re at it. All of that’s pretty much just as easy and intuitive as Node.js, right?

.NET is incredibly complicated. Node.js is incredibly simple. On the merits of that fact alone it’s no wonder that these .NET developers and fresh-out-of-college kids who have already dabbled in JavaScript are transferring these skills to the server side and standing up web servers in literally 5 minutes and 5 lines of code. How can you deny the sexiness of that? The bragging rights it gives you? The ease and comfort of a language you’re already familiar with?

This, in my opinion, is why Node.js is becoming huge. It has simplified and streamlined the development process and made programming very accessible to almost everyone (or at least anyone who has ever played with JavaScript).

However, to those who sell Node.js as single-threaded, and to those who sell Node.js as having significantly better performance than .NET, I say this: you are wrong.

With simplicity comes misunderstanding and the concept of “Leaky Abstractions.” As my good friend and colleague Thomas B. said to me during dinner last week: Node.js is opinionated. It has an opinion on how you should do things, and it forces you to do them a certain way.

Node.js is not single-threaded, though many Node developers in my experience believe it to be. Node’s creator believes that a single-threaded listener delegating I/O-bound work to a thread pool is the key to a highly available application. As a result, Node.js forces you into this paradigm of event-based asynchronous execution of I/O operations via a thread pool.

Node.js has a single thread listening for connections. All of the code which you as the Node developer write is executed on this single thread. This single thread is all that is exposed to you. As soon as a connection is received, Node’s listening thread executes your coded event on the same listener thread. This event either does quick, non-CPU intensive work (like returning static content to a client), or long-running I/O bound operations (like reading data from a database). In the case of the former, the listener thread does in fact block for the duration of the request, but the request happens so quickly that the delay is trivial. In the case of the latter, Node uses libuv (the native library that it is built upon) to delegate the I/O work to a thread from an underlying pool of native C++ threads. The single listening thread kicks off the work to an I/O worker thread with a callback that says “tell me when you’re done” and immediately returns to listening for the next connection. It is thus plain to see that Node.js is indeed multi-threaded, though this functionality is not directly exposed to the Node developer.

An important note regarding Node.js is that any CPU-intensive code which you write will block the entire system and make your application scale poorly or become entirely unresponsive. As a result, you would not want to use Node.js when you need to write an application that will do CPU-intensive work such as performing calculations or creating reports.

This is how a single thread can handle multiple requests at once; receiving a request and either serving static/simple content or delegating it to an I/O thread from a thread pool are both very cheap and quick operations. When the thread pool thread that is doing the long-running I/O work signals to the single listener thread that the work is done, the listener thread picks up the response and sends it back to the user; this is another very cheap operation. The core idea is that the single listener thread never blocks: it only does fast, cheap processing or delegation of requests to other threads and the serving of responses to clients. The diagram below (taken from Stack Overflow) explains this visually:

Node.js Processing Model

This is a very good, scalable, highly-available way to write code; Node.js nailed its approach and developers benefit from it. However, as of .NET 4.5, you can easily create this exact paradigm/pattern in your .NET applications. The difference is that .NET does not force you to do so.

With the introduction of a very tidy wrapper around asynchronous programming in .NET 4.5 (async/await keywords), Microsoft made asynchronous, event-based programming quite a bit easier and more intuitive. And with recent conformance by Microsoft to the jointly-created OWIN specification, the web pipeline of .NET has become much simpler.
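To illustrate the pattern (a minimal sketch; the method name and URL are made up, and it assumes a reference to System.Net.Http plus the usual using directives), an I/O-bound call written with async/await releases the calling thread back to the pool while the work completes:

public static async Task<string> GetRemoteDataAsync()
{
    using (var client = new HttpClient())
    {
        // The thread is freed while the request is in flight and execution resumes when the response arrives
        return await client.GetStringAsync("http://example.com/data");
    }
}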

In fact, you can now write the “Hello World” asynchronous web server in .NET in about as few lines as Node.js! In this example (self-hosted via the OWIN hosting libraries), I host a web server in a console application which is terminated when a key is pressed:

/// <summary>
/// A simple program to show off an OWIN self-hosted web app.
/// </summary>
public class Program
{
    /// <summary>
    /// The entry point for the console application.
    /// </summary>
    /// <param name="args">The arguments to the execution of the console application. Ignored.</param>
    static void Main(string[] args)
    {
        // Start OWIN host
        using (WebApp.Start<Startup>(url: "http://localhost:8000"))
        {
            // Runs until a key is pressed
            Console.ReadKey();
        }
    }

    /// <summary>
    /// This code configures the OWIN web app. The Startup class is specified as a type parameter in the WebApp.Start method.
    /// </summary>
    private class Startup
    {
        /// <summary>
        /// Configures the web app.
        /// </summary>
        /// <param name="app">The app builder.</param>
        public void Configuration( IAppBuilder app )
        {
            // We ignore any rules here and just return the same response for any request
            app.Run( context =>
            {
                context.Response.ContentType = "text/plain";
                return context.Response.WriteAsync( "Hello World\n" );
            } );
        }
    }
}

One of the big positives of Node.js is that you opt-in to complexity. You start very simply and add on functionality as you need it. I’m a big fan of this approach and I feel that this is where Node really shines. Nothing bothers me more than starting an “Empty MVC 4 Web Application” from template in Visual Studio only to have it install about 15 NuGet packages, one of which is Entity Framework. Great, I fired up a blank application and already my ORM has been decided for me. Who said I even needed one in the first place?!

The above OWIN-based approach to hosting a web app in .NET allows you to benefit from Node’s simplistic approach. I have started out simply, and can now add Dapper if I need an ORM, Newtonsoft.Json if I need to serialize to and from JSON, Unity if I care about dependency injection, etc. It’s a nice, clean slate upon which I can build any framework that I desire.
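For example, opting in to JSON output is a one-package decision (a sketch, assuming the Newtonsoft.Json NuGet package has been added and reusing the app builder from the earlier example):

app.Run(context =>
{
    // We only serialize to JSON because we chose to pull in a serializer
    var payload = new { Message = "Hello World", ServedAtUtc = DateTime.UtcNow };
    context.Response.ContentType = "application/json";
    return context.Response.WriteAsync(JsonConvert.SerializeObject(payload));
});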

This approach in .NET is very comparable to Node.js, with a few differences:

  • Node.js uses 1 listener thread, while .NET uses N listener threads. If your Node.js application does CPU-intensive work at all, it will block the entire system and potentially cause your application to become unresponsive. .NET, on the other hand, is designed to do CPU intensive work. Tying up a thread to do some CPU work is not of concern because there are other threads available in the listener thread pool to take other requests while this is happening. However, both Node.js and .NET are limited by the server resources; in either case, if you max out the CPU or RAM, your app will perform horribly, regardless of thread delegation. This is known as resource contention.
  • Node.js delegates I/O-bound work to an I/O worker thread pool, and .NET, when written asynchronously (async methods and the async/await keywords), does the same.
  • Node.js uses an event-based paradigm, and .NET does also when implemented asynchronously.
  • Node.js offers high performance for I/O bound, low CPU operations, and .NET offers comparable performance when you skip the IIS pipeline. IIS tacks on a significant amount of performance overhead due to things like session state management, forms authentication, the process model, request lifecycle events, etc. These are not bad things to have and use, but if you don’t need IIS, session state, forms auth, request lifecycle events, or the process model, then don’t use them!
  • Node.js must parse/serialize to and from JSON, and .NET must serialize to and from JSON to interact with .NET objects. Parsing is going to be much cheaper in Node.js than serializing is in .NET, but .NET also enables you to serialize to XML, Atom RSS, and anything else that you desire. With Node, this is a bit trickier, and the serialization overhead comes back into play to even the field.

When someone compares Node.js to .NET, I find that they often actually compare Node.js to IIS-hosted frameworks such as ASP.NET MVC, ASP.NET WebForms, and ASP.NET Web API (in IIS hosted mode). These are all frameworks built on top of ASP.NET that simplify enterprise web development and are designed to handle CPU-intensive work. In these scenarios, Node.js will have an advantage, because it is designed specifically to NOT do CPU-intensive calculations. You are effectively comparing CPU-intensive Apples to low-CPU-usage Oranges. It is not a fair comparison.

When someone compares Node.js to a self-hosted .NET web app which does I/O-bound long-running operations via asynchronous thread pool delegation, they find that there is not much of a difference in performance between the two runtimes. In fact, comparing Node.js to self-hosted Web API (NOT using IIS) doing low-CPU work, the numbers are very close:

Web API vs Node.js

This image was taken from a benchmark done in 2012 with the Web API Release Candidate (not Web API 2, and NOT OWIN hosted). Given that Web API 2 exists, and can be self-hosted via OWIN, I’d love to see a more recent benchmark comparing the two. In fact, I will try and do this for my next blog post.

So, to Node.js or not to Node.js? I submit these final thoughts:

I guess the point of all of this has been that neither Node.js nor .NET is necessarily better/the best/the new hotness. They both serve a purpose, and while Node.js is much easier to use and more accessible to developers, .NET is very versatile and powerful as well. They are built to do different things: Node.js specializes in performing and scaling well for low-CPU, highly I/O-bound operations. .NET can perform well in this scenario as well, but can also perform well with high-CPU operations. In fact, I would argue that .NET excels at CPU-intensive operations, especially when compared to Node.

There are many .NET developers in the current tech scene that are capable and competent. This means that it’s not too hard to find, hire, and retain good .NET talent. They can pick up self-hosted OWIN web apps in a matter of days, and begin to write very scalable, high-performance web apps and services based on it. They can even easily host Web API in an OWIN web app via console, a Windows service, or Azure. There’s a community that has existed for over a decade that evolves the .NET framework with amazing tools and add-ons like Dapper, Unity, and Newtonsoft.Json. This community is mature and there are many prominent voices within it that offer advice and help.

Relative to .NET, there aren’t as many Node.js developers in the current tech scene that are capable and competent. This is because fortune favours the bold, and only a few have been early adopters of Node.js. In my experience, few Node developers will truly understand what is going on in the Node.js processing model and how to exploit it for maximum performance, though the opinionated paradigm of Node.js will force developers to write good asynchronous code. It can be hard to find, hire, and retain good Node.js talent. This will become less of a concern as Node’s following grows. The Node.js community is budding and has created some amazing tools and add-ons for Node.js as well such as ORMs and DI frameworks. This community is not yet mature and I am not aware of many prominent voices within it that offer advice and help. As a result, it could be difficult to find support and tools for Node.js if you encounter a problem.

In conclusion, both Node.js and .NET are great. Which one to pick for a particular solution/application, however, depends on many factors; it is not black and white but a full colour spectrum. It would be very foolish and naive for a .NET developer to use .NET to solve every single problem just because “that’s what we use.” It would be similarly foolish for a Node.js developer to propose Node.js as a solution for every project or problem that he or she encounters. One must choose the right tool for a given job, and be open to different paradigms in order to truly excel.

In general, use Node.js when you have highly I/O-bound operations that don’t use much CPU. Use .NET when you need to calculate things and use a lot of CPU.

Don’t use Node.js solely on the reasoning that it’s much faster and performs way better than .NET: it depends on how you use .NET. And don’t use .NET if all you’re doing is heavily I/O-bound operations with low CPU usage: that’s where Node.js excels.

When to Compile your Regex

In .NET, it is a common misconception that a compiled Regex always runs faster than an uncompiled one once the one-time compilation cost (incurred at instantiation) has been paid. This is not always true.

Recently I wrote a URL rewriter, similar to Helicon but a little more powerful and flexible. It allows you to define rules in a config file, and then rewrite a captured input string to an output string, replacing parts of the URL via Regex capture groups along the way. An example of the configuration looks like this:

<add original="^/(.*)/customer(.*)$" rewritten="/$1/cust$2" redirect="false" ignorecase="true" />

What the above says is that if my URL is anything, then “/customer” then optionally anything else, rewrite it to the “/cust” version of the URL with the original inputs applied to the outputs.
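To make that concrete, here is roughly what the rule does when applied (a sketch; the input URL is made up):

var rule = new Regex("^/(.*)/customer(.*)$", RegexOptions.IgnoreCase);
var rewritten = rule.Replace("/en-ca/customer/orders/123", "/$1/cust$2");
// rewritten == "/en-ca/cust/orders/123"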

Now, considering myself to be an efficient developer, I have always pre-compiled my Regex by adding the RegexOptions.Compiled option to the constructor. My rewriter code that parsed my XML file and created the Regex instances looked like this:

Regex rewriteRuleRegex = null;

// Determine if we ignore case
if (rewriteConfiguration.IgnoreCase)
{
    rewriteRuleRegex = new Regex(rewriteConfiguration.Original, RegexOptions.Compiled | RegexOptions.IgnoreCase);
}
else
{
    rewriteRuleRegex = new Regex(rewriteConfiguration.Original, RegexOptions.Compiled);
}

I had 144 rules in my file, and decided to performance test the URL rewriter against approximately 100 unique requests. The result was that it sucked, REALLY, REALLY badly, with 87% of the request lifecycle being spent on Regex.Replace!

Precompiled Sucks

That’s INSANE! 87% is COMPLETELY unacceptable! After all I’d heard about the .NET Regex engine being efficient, this was freaking terrible! I spent a day pulling my hair out and reading a ton of articles, when I came across one from Jeff Atwood about when to compile your Regex and when not to. So, I took the RegexOptions.Compiled flag off of my code:

Regex rewriteRuleRegex = null;

// Determine if we ignore case
if (rewriteConfiguration.IgnoreCase)
{
    rewriteRuleRegex = new Regex(rewriteConfiguration.Original, RegexOptions.IgnoreCase);
}
else
{
    rewriteRuleRegex = new Regex(rewriteConfiguration.Original);
}

And re-ran my tests:

Precompiled Sucks

MUCH better.

So what I learned is that if you CONTROL the input (AKA if it’s a small set of input that you know in advance), precompiling is good. If you don’t control the input (such as user generated URLs that are sent to your application), DO NOT PRECOMPILE. EVER.

One More Thing About List Binary Search

I wanted to point people to this link at DotNetPerls:

http://www.dotnetperls.com/binarysearch

They do an excellent, quick demonstration of List<T>.BinarySearch and show a graph that really drives home how much faster it is for large lists than a regular traversal!

Make Mostly Read, Seldom-Written Lists Much More Efficient

One of the many things that I do at work is run a full-blown Search Engine which I also developed from scratch. This Search Engine feeds all product related information to our websites. A search index consists of a pre-computed collection of products, their properties, a list of words that are correctly spelled, and some pre-computed faceted/guided navigation. A search index, until this week, took up approximately 10.7 gigs of memory. This was becoming too large as we added new products every single day.

As of writing this, it now takes only 4.8 gigs of memory and is only slightly (1-3%) less performant than before. How did I do it? Believe it or not, a very simple data structure and algorithm change.

In the Search Engine, a product’s properties are a key-value pairing of strings… Things like “isInStock” “1” or “color” “red” etc. We store the properties in a collection, per product. The collection was originally:

Dictionary<string, HashSet<string>> _entityProperties;

The key of the Dictionary was the property name and the HashSet of strings were the values for that property name (property names are not a “unique” key – a product could have multiple colors for example). I initially chose this data structure because we have a heavy need for DIRECT lookups to property names and values. Methods like HasProperties(string propertyName) and HasProperty(string propertyName, string propertyValue) are essential to the search engine’s function, and need to be performant. Thus, I figured that a Dictionary and HashSet would be best, since both offer O(1) lookup times and the index is read from 10000x more often than it is written to. O(1 + 1) is pretty good when it comes to complexity.

It turns out that there was a simpler, better data structure for my goals which also satisfies the performance requirements of the aforementioned methods.

As you may or may not know (and I learned the hard way), a HashSet<T> is actually not very efficient when you have only a few items in it. A List<T> is actually more performant for small collections (4 or fewer objects with simple GetHashCode() methods, such as strings, in my testing). This is true even though your average lookup/read case goes from O(1) to O(n/2), since you must traverse the List to find your desired object. The reason that List is faster is that there is no hash key computation, and the List<T> is basically an elastic array and thus takes less memory and has less overhead than a HashSet<T> with the same number of objects in it. Since my product properties typically only consist of 2 or 3 values for a given property name, I changed my data structure to this:

Dictionary<string, List<string>> _entityProperties;

This shaved approximately 10% off of the memory footprint and brought my memory usage down to 9.6 gigs. The performance was basically identical in all performance tests. This was better than my HashSet, but still not great. I wanted to do better. I was sure that somehow I could do better.

I spent the good part of this week trying – and failing – to design a more efficient data structure than the above. I tried a string Trie with nodes that pointed to another Trie, I tried SortedList<TKey, TValue> instead of the above, and everything else that I could think of. Yet no matter what I did, the memory stayed the same and the performance was the same or worse. It sucked. I was still sure that somehow I could do better.

Finally, Wednesday morning, I had a thought in the shower (where I do my best thinking): two dimensional Arrays suck. They are well documented to, in general, have worse memory usage metrics than a one dimensional array (a quick Google will fill you in). A Dictionary of Lists is certainly a two dimensional jagged Array of sorts, and it wasn’t doing what I wanted in terms of memory. So, I took another approach and changed my data structure wildly – I flattened it out and made it one dimensional:

List<KeyValuePair<string, string>> _entityProperties;

Seems insane, right? I go from a Dictionary with an O(1) key lookup to a linear List of all keys and values stored together. And yet, it did the trick for my memory: it went from 9.6 gigs to 4.8 gigs. Half of the amount of memory used. I was stoked.

I saved this memory by both interning strings and taking advantage of the KeyValuePair being a struct. Structs are a lot more efficient than reference types when the object is small, and a KeyValuePair is indeed quite small.
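For illustration, interning can be applied at the point where each pair is constructed during an add (a sketch; it assumes propertyName and propertyValue have already been null-checked, since string.Intern throws on null):

// string.Intern returns the single canonical instance for a given string value, so repeated
// property names and values across products share one allocation
var keyValuePair = new KeyValuePair<string, string>(string.Intern(propertyName), string.Intern(propertyValue));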

A new problem needed solving, however. Each product has around 60-100 properties associated with it, and I needed them to be accessed efficiently and quickly with near-direct lookups. Traversing the [now giant] List was not acceptable in terms of performance.

As it stood, I went from an O(1 + 1) data structure (key and value lookup costs for Dictionary and HashSet) to an O(1 + n/2) data structure (Dictionary and List), and finally to an O(n) data structure (List). And to top it all off, the n in the n/2 was 3 or 4, whereas the n in the flat List of KeyValuePair was between 60 and 100. Truly I was getting worse performance with each improvement – at least theoretically. Still, the allure of the memory savings was too great to ignore and I wanted to use this data structure.

It then hit me: why not use BinarySearch on the List<T> to look up items quickly and keep the List sorted while I add AND be able to check for duplicates before adding? It was certainly worth a shot, since binary search is an O(log n) algorithm which is an improvement over the List’s O(n) traversal. So, I modified my Add(string propertyName, string propertyValue) method to keep the List sorted as things were added to it. This is surprisingly easy to do.

*note that from here on out I’m simplifying my code greatly to basic concepts from what actually exists in order to avoid giving away trade secrets or violating my NDA and being fired*

public void Add(string propertyName, string propertyValue)
{   
    // TODO: null checks, etc.

    var keyValuePair = new KeyValuePair<string, string>(propertyName, propertyValue);

    // Add property name and value
    // First find the identical item if it exists
    var result = _entityProperties.BinarySearch(keyValuePair, _entityPropertiesComparer);
    // If result is >= 0, already exists, done
    if (result >= 0)
    {
        return;
    }

    // If it does not exist, a one's complement of the returned int tells us WHERE to insert to maintain the sort order
    _entityProperties.Insert(~result, keyValuePair);
}

The secret here is two-fold:

One: I created a custom KeyValuePair<string, string> comparer class that implements IComparer<KeyValuePair<string, string>> and basically does a case-insensitive string compare of first the key strings, then the value strings. This IComparer is required by the List’s BinarySearch method to determine the ordering of objects in the List.

Two: the BinarySearch method returns a very useful value: a simple integer. If the int is >= 0, it means that the item was found at that index. If it returns a negative integer, it means that the item was not found, and also that the one’s complement of the value is the proper index at which to insert the item in order to keep the List sorted. A super useful return type, indeed. This allows you to add your elements to a List while preserving an order, at the cost of your add being an O(log n) search (plus the insert) instead of a List’s usual O(1) append. However, if you don’t add things as much as you read the List (we only adjust the index once a day for example, but read it thousands of times per hour), this can be worthwhile. Additionally, you could add everything in O(1) time and then do a final List Sort for a single O(n log n) cost if the order of elements does not matter until you’re done adding everything. In my case, the order mattered as I added to the List because I did not ever want to add duplicates (same property name and value). The HashSet handled this for me – Lists do not.
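For reference, the comparer described in point one might look something like this (a minimal sketch; the class name is illustrative):

private sealed class EntityPropertyComparer : IComparer<KeyValuePair<string, string>>
{
    public int Compare(KeyValuePair<string, string> x, KeyValuePair<string, string> y)
    {
        // Compare keys first, then values, ignoring case (null values sort before non-null ones)
        int keyComparison = string.Compare(x.Key, y.Key, StringComparison.OrdinalIgnoreCase);
        if (keyComparison != 0)
        {
            return keyComparison;
        }

        return string.Compare(x.Value, y.Value, StringComparison.OrdinalIgnoreCase);
    }
}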

So, now my add involves an O(log n) search instead of an O(n) scan, but the payoff is that my lookups cost O(log n) instead of O(n) as well. I adjusted my earlier mentioned HasProperty and HasProperties methods accordingly:

public List<string> GetSpecificPropertyValues(string propertyName)
{
    // TODO: null checks, etc.

    List<string> result = new List<string>();

    // Binary search the property name - null is the smallest value of string for comparison
    var keyValuePair = new KeyValuePair<string, string>(propertyName, null);
    // One's complement the start index
    var startIndex = ~_entityProperties.BinarySearch(keyValuePair, _entityPropertiesComparer);

    for (int i = startIndex; i < _entityProperties.Count; i++)
    {
        // Leave the loop when the property name no longer matches
        if (!string.Equals(propertyName, _entityProperties[i].Key, StringComparison.OrdinalIgnoreCase))
        {
            // Leave the loop
            break;
        }
                    
        result.Add(_entityProperties[i].Value);
    }

    return result;
}

public bool HasProperty(string propertyName, string propertyValue)
{
    // TODO: null checks, etc.

    // Binary search the property name
    var keyValuePair = new KeyValuePair<string, string>(propertyName, propertyValue);
    var startIndex = _entityProperties.BinarySearch(keyValuePair, _entityPropertiesComparer);
    return startIndex >= 0;
}

public bool HasProperties(string propertyName)
{
    // TODO: null checks, etc.

    // Binary search the property name
    var keyValuePair = new KeyValuePair<string, string>(propertyName, null);
    // One's complement the start index
    var startIndex = ~_entityProperties.BinarySearch(keyValuePair, _entityPropertiesComparer);
    if (startIndex >= _entityProperties.Count)
    {
        return false;
    }

    // Check that the next element matches the property name
    return string.Equals(propertyName, _entityProperties[startIndex].Key, StringComparison.OrdinalIgnoreCase);
}

Suddenly, I have the same “direct lookup” methods available as I did with my Dictionary and HashSet/List structure, but in a flat List with O(log n) complexity.

This yielded 50% less memory usage and only a 1-3% increase in performance times. A very acceptable trade for the Search Engine.

If you have a List<T> with a lot of objects in it, and performance is key to your application, consider using BinarySearch and/or Sort to access it in a much more efficient way. As long as you can create an IComparer<T>, where T is the type of the objects in your List, you’ll have a more efficient List.

A Better MIME Mapping Stealer!

In the interest of self-improvement and sharing knowledge, I felt that I should share an update to my last post. I discovered a slightly better way to create the GetMimeMapping delegate/method via reflection that involves less casting and overhead, and is more Object Oriented in a sense. It allows the signature of the reflected method to be Func<string, string> instead of MethodInfo. Code below, note the use of Delegate.CreateDelegate(Type, MethodInfo):

/// <summary>
/// Exposes the Mime Mapping method that Microsoft hid from us.
/// </summary>
public static class MimeMappingStealer
{
    // The get mime mapping method
    private static readonly Func<string, string> _getMimeMappingMethod = null;

    /// <summary>
    /// Static constructor sets up reflection.
    /// </summary>
    static MimeMappingStealer()
    {
        // Load hidden mime mapping class and method from System.Web
        var assembly = Assembly.GetAssembly(typeof(HttpApplication));
        Type mimeMappingType = assembly.GetType("System.Web.MimeMapping");
        _getMimeMappingMethod = (Func<string, string>)Delegate.CreateDelegate(typeof(Func<string, string>), mimeMappingType.GetMethod("GetMimeMapping", 
            BindingFlags.Instance | BindingFlags.Static | BindingFlags.Public |
            BindingFlags.NonPublic | BindingFlags.FlattenHierarchy));
    }

    /// <summary>
    /// Exposes the hidden Mime mapping method.
    /// </summary>
    /// <param name="fileName">The file name.</param>
    /// <returns>The mime mapping.</returns>
    public static string GetMimeMapping(string fileName)
    {
        return _getMimeMappingMethod(fileName);
    }
}

Determine MIME Type from File Name

I recently had a need, in an ASP.NET MVC3 application, to read raw HTML, CSS, JS, and image files from disk and return them to the user… A sort of “pass-through” if you will. Normally I’d have simply routed to a custom HTTP handler per file type or just allowed MVC3 to map existing files to supply its own .NET HTTP handlers and do all of this work for me, but in this case I needed the mapped “directory” to switch behind the scenes based on Session settings… So I ultimately had to feed these files through a Controller and Action Method to gain access to the Session.

One problem that came up was being able to determine the MIME type of the content that I’m reading from disk. This is done for you by the HTTP handlers provided in the .NET framework, but when you’re serving files through MVC Controllers, the default HTTP handlers are not used and thus you’re left to figure out the MIME types for yourself.

So, I began to investigate, using ILSpy, how the native/default ASP.NET HTTP handlers determine the MIME types. I came upon a class in the System.Web namespace called System.Web.MimeMapping – this class keeps a private, sealed dictionary of type MimeMappingDictionaryClassic (which extends a private abstract class called MimeMappingDictionaryBase) which holds all known extensions and their associated MIME types… A sample of the decompiled code which populates it is below:

protected override void PopulateMappings()
{
    base.AddMapping(".323", "text/h323");
    base.AddMapping(".aaf", "application/octet-stream");
    base.AddMapping(".aca", "application/octet-stream");
    base.AddMapping(".accdb", "application/msaccess");
    base.AddMapping(".accde", "application/msaccess");
    base.AddMapping(".accdt", "application/msaccess");
    base.AddMapping(".acx", "application/internet-property-stream");
    base.AddMapping(".afm", "application/octet-stream");
    base.AddMapping(".ai", "application/postscript");
    base.AddMapping(".aif", "audio/x-aiff");
    base.AddMapping(".aifc", "audio/aiff");
    base.AddMapping(".aiff", "audio/aiff");

And so on… In total, there are 342 lines of known mappings!

Ultimately, my goal was to get a hold of this functionality in the easiest, most flexible way possible.

In .NET 4.5, MimeMapping exposes a public static method called GetMimeMapping which takes in a file name (or extension) and returns the appropriate MIME type from the aforementioned dictionary. Unfortunately my project is on .NET 4.0 and in that version of the framework this method is internal, not public (why, Microsoft, why?!) and thus was not available to me. So, I felt that I was left with 3 options:

1. Upgrade to .NET 4.5 (not possible at this time due to corporate politics and so on)

2. Copy and paste the entire list of mappings into a dictionary of my own and reference it (yuck!)

3. REFLECTION TO THE RESCUE!

So, with a short bit of code, you too can steal the functionality of the GetMimeMapping method, even if it isn’t public!

First, set up the reflection and cache the MethodInfo in an assembly that references the System.Web assembly. Below is a custom static class I built which wraps the reflective method:

/// <summary>
/// Exposes the Mime Mapping method that Microsoft hid from us.
/// </summary>
public static class MimeMappingStealer
{
    // The get mime mapping method info
    private static readonly MethodInfo _getMimeMappingMethod = null;

    /// <summary>
    /// Static constructor sets up reflection.
    /// </summary>
    static MimeMappingStealer()
    {
        // Load hidden mime mapping class and method from System.Web
        var assembly = Assembly.GetAssembly(typeof(HttpApplication));
        Type mimeMappingType = assembly.GetType("System.Web.MimeMapping");
        _getMimeMappingMethod = mimeMappingType.GetMethod("GetMimeMapping", 
            BindingFlags.Instance | BindingFlags.Static | BindingFlags.Public |
            BindingFlags.NonPublic | BindingFlags.FlattenHierarchy);
    }

    /// <summary>
    /// Exposes the hidden Mime mapping method.
    /// </summary>
    /// <param name="fileName">The file name.</param>
    /// <returns>The mime mapping.</returns>
    public static string GetMimeMapping(string fileName)
    {
        return (string)_getMimeMappingMethod.Invoke(null /*static method*/, new[] { fileName });
    }
}

Now, a quick test via a console application to ensure that it works:

static void Main(string[] args)
{
    var fileName1 = "whatever.js";
    var fileName2 = "somefile.css";
    var fileName3 = "myfile.html";

    Console.WriteLine("Output for " + fileName1 + " = " + MimeMappingStealer.GetMimeMapping(fileName1));
    Console.WriteLine("Output for " + fileName2 + " = " + MimeMappingStealer.GetMimeMapping(fileName2));
    Console.WriteLine("Output for " + fileName3 + " = " + MimeMappingStealer.GetMimeMapping(fileName3));

    Console.ReadKey();
}

And running the console application results in success!

GetMimeMapping Works

Published by Red Gate

As of today I’ve been published in an e-Book offered for free by Red Gate! It is called 50 Ways to Avoid, Find and Fix ASP.NET Performance Issues and contains many useful performance tips which have been contributed by various members of the .NET community. Many tips are ASP.NET MVC specific which is also a plus.

My tip is #3 and has to do with debugging Microsoft symbols.

Get a free copy here – it has already taught me a few things I had never thought to consider!

But it Didn’t Happen in DEV or QA!

Most of us have been there: you’ve written a fantastic application that performs perfectly in your Development and/or QA environments, but in Production something goes wrong. Your application spins out of control, utilizing 100% of your CPU. Maybe it simply stops responding as if it were deadlocked. Or maybe it simply crashes randomly. What now?

Logic tells you that you have a problem in the code somewhere that is only encountered in a Production-like environment… and if you could JUST get into the Production box, install Visual Studio (or at least the Remote Debugger), and debug the application, you’d be able to solve the problem. However, you can’t (because it’s Production!), and you can’t replicate the problem in any other environment. Maybe it’s because of stale Development or QA environment data compared to live Production data. Maybe it’s something else. You have no idea where to look to find and fix the problem in your application. For lack of eloquence: you’re screwed.

Fortunately, there are both tools designed for this very scenario and ways to “reproduce” the problem to determine the cause. I’m going to show you how to debug Production problems in applications where you cannot attach to the process for live debugging, and there are either no logs or the logs tell you nothing useful.

Let’s create a simple application that is designed to take up 100% of our CPU:

class Program
{
    static void Main(string[] args)
    {
        // Parallel to really max out the CPU
        Parallel.For(0, 100, (i) =>
        {
            while (true)
            {
                // Loop forever and ever
            }
        });
    }
}

This code uses the TPL to run up to 100 parallel loop iterations, each spinning for all eternity on a thread pool thread. The reason that we use the TPL / Parallel library is that a single-threaded application would only max out 1/N of our CPU, where N is the number of cores in the processor. Verifying our simple application, we can see that it does its job and maxes out our CPU:

100% CPU Used

Now, imagine that this application is a lot more complex and that this simple method is just one part of the entire solution. Perhaps you’ve built a really huge website or service and this method exists in just one little part of it. Pretend also that this method is not hit in your Development or QA environments during testing, and so your application appears to operate normally to you.

In fact, pretend that you’ve never seen this source code at all. All that you know is that you have an application in Production that spins out of control, and you don’t know where the problem is – or even where to begin looking.

So, what do you do?

The first thing we need to do is ensure that we have the tools we’ll need to debug and fix the issue. Things you’re going to need:

  • A tool that can create Mini-Dumps. I highly recommend Process Explorer which is available via TechNet.
  • WinDBG, via Debugging Tools for Windows, an unintuitive yet key tool from Microsoft for debugging Mini-Dump files. When you install this, you’ll have to do it through the Windows Software Development Kit installer. You can unselect everything except Debugging Tools for Windows, since you’ll need nothing else for our purposes.

Install WinDBG on the PC which you’ll use to analyze the Mini-Dump (typically your Development machine). Install Process Explorer on the Production box that hosts the application which is misbehaving.

Now that you’ve got those installed, we’ll proceed and figure out the problem.

So our scenario is this: we have an application running in Production which is spinning out of control, we have the source code on a different PC, and we can’t attach a debugger to the Production environment. We can’t reproduce this behaviour in Development or QA environments at all. So, time to get down to the details of the problem.

Step 1 is easy. You need to take a Mini-Dump of the misbehaving application on the Production machine. There’s a bit of a catch-22 here in that the rogue application is using 100% of your CPU, so this Mini-Dump could take forever. To solve that problem on a multi-core machine, simply use Task Manager to set the affinity of the misbehaving application to 1 or 2 cores to lower the total CPU used by the application, thus freeing up CPU for our Mini-Dump:

Set the Affinity to 1 or 2 CPUs

Unselect “All Processors” and pick 1 or 2:

Pick 1 or 2 CPUs

There, now it’s a lot easier to do things on the affected PC since it isn’t spending 100% of its CPU spinning on your application:

Processor Now Freed

Note that the affinity setting does not persist, meaning the next time you launch the application, it will go back to using all CPUs per usual.

So, now we’re on to Step 2. Take a Mini-Dump (haha). To do this, launch Process Explorer, find your rogue application, right click, Create Dump –> Minidump:

Create a Mini-Dump

Now save the .dmp file somewhere that your Development PC can get to it in order to debug the issue.

Step 3. Great, so now we have our .dmp Mini-Dump file… so let’s get cracking on debugging it. Get the .dmp file to your Development PC and fire up WinDBG. NOTE: run the 64 bit version for 64 bit applications and the 32 bit version for 32 bit applications. If you experience weird behaviour, try switching the WinDBG version that you launch. You’ll be greeted with a pretty bland grey window. Go to File –> Open Crash Dump (or CTRL + D for the shortcut fans out there). Select your .dmp file and you’ll end up with a screen similar to this:

Mini-Dump File Loaded

So now the “hard” part begins. We need to manually diagnose the issue. The first step in WinDBG is usually to load up the CLR runtime so that we can examine our stack. To do this, run either:

.loadby SOS clr (.NET 4) OR
.loadby SOS mscorwks (not .NET 4)

My app is .NET 4, so I ran .loadby sos clr to load the CLR. This is what it should look like if it succeeds:

Load the CLR

Next we need to load the symbols for the application that we’re debugging. To do this, run the following commands:

.symfix
.sympath+ <absolute path to your application’s compiled code and symbols>

Here’s what I did for my Symbol Path for reference:

Run .symfix and .sympath

Next, run .symopt+0x40 to enable symbols to be used which are not perfectly matched. This setting is extremely useful in cases where the code is compiled locally and doesn’t perfectly match the production code that the Mini-Dump was taken from. If this setting is omitted, you will have a very bad time determining the issue, so make sure that you run it. This is what it looks like:

Run symopt+0x40

Now that you’ve specified the path at which to look for symbols for your application, run the reload command to reload your symbols:

.reload /v /f

You might get some warnings and errors here but that’s usually fine as long as they don’t relate to your application’s DLLs. Verify that it found your application’s symbols by reading the output of the .reload command:

Ensure your application’s symbols loaded

Alright, almost there! Now we have our symbols and CLR loaded, so let’s find out what our application is spending all of its time doing. Run the command:

!runaway

And you’ll find out which threads have been using the most CPU time. Mine looks like this:

Many Threads are Long Running

So here we can see that we have 6 threads which are taking up more time than all of the others… We need to print their stack traces and see just what is going on. Execute this (super intuitive) command to see the stack trace of the first thread:

~~[9:18a8]e!clrstack

Super intuitive right? Basically it says “gimme the CLR stack trace for the thread whose ID is in the square brackets”. Here’s what mine spits out:

Aha! The Culprit!

And now we know the culprit! So, we proceed to our source code and sure enough, line 16 is our problem:

The Problem Line

And there you have it. How to debug production issues from your development environment. 🙂

Note that you can also determine what caused a crash using WinDBG… Have your system administrator enable Mini-Dumps for crashes, and then perform the same commands as we did in this post EXCEPT that at the part where we ran !runaway, instead run !analyze -v and !clrstack – these will get you on your way!

Static vs Instance string.Equals Benchmark

A friend of mine commented on my last post asking about how much faster the static string.Equals method is than the instance string.Equals method. To satiate both of our curiosities, I have created this benchmarking application:

static void Main(string[] args)
{
    var stopwatch = new Stopwatch();

    string a = "hello";
    string b = "hi";

    stopwatch.Start();
    for (int i = 0; i < 10000000; i++)
    {
        a.Equals(b);
    }
    stopwatch.Stop();

    Console.WriteLine("Instance string.Equals over 10,000,000 iterations: " + stopwatch.ElapsedMilliseconds + " ms");

    stopwatch.Reset();

    stopwatch.Start();
    for (int i = 0; i < 10000000; i++)
    {
        string.Equals(a, b);
    }
    stopwatch.Stop();

    Console.WriteLine("Static string.Equals over 10,000,000 iterations: " + stopwatch.ElapsedMilliseconds + " ms");

    Console.ReadKey();
}

The results of 5 runs, where “I” is the instance method and “S” is the static method, and the times are in milliseconds:

I: 113
S: 100

I: 144
S: 96

I: 126
S: 89

I: 126
S: 94

I: 128
S: 97

And there you have it. Static string.Equals is reliably slightly faster… But unless you’re doing millions of comparisons, it probably doesn’t really matter much. It does, however, prevent the NullReferenceException mentioned in the last post when the string instance is null.
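As a quick illustration of that last point (a sketch):

string maybeNull = null;

// maybeNull.Equals("hello") would throw a NullReferenceException here
bool areEqual = string.Equals(maybeNull, "hello"); // false, no exception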