Archive for November, 2009

An Ultimate Solution to Double Checked Lock problem

Friday, November 27th, 2009

When you see somebody laying a claim like this, your first reaction should be - this is a hoax, there is no ultimate solution to double checked locking problem. And you will be correct.

Yet I do have an ultimate solution, although it only solves this problem for load/create on demand scenarios. It also involves some cheating. Let me explain myself.

Double Checked Locking - some background

Double checked locking is usually presented as a way to reduce contention around some global object used in a multi-threaded environment. It is pretty common in such systems to have global objects which are updated rarely but accessed very often. A solution to double checked locking problem would provide a way to first check whether the lock is necessary and only apply the lock if it is. When (and if) the lock is applied the usual pattern is used which also involves checking - hence the name 'double checked locking'

There are numerous, sometimes heated, discussions on the topic (as a starting point you can check here) but the bottom line is that all suggested generic solutions either have implicit locks or are not thread safe.

So, what about my claim?

Here is the scenario: Let us say you have a list of resources to be used by multiple threads. If resource creation is expensive you would want to ensure that every resource is created on demand the first time it is requested. So when a thread needs a resource it has to check first whether this resource was requested before and if it is, take the existing one rather than create another copy. The dictionary of the resources is that very object which would benefit from the double checked lock.

The thing is that in this particular case it is possible to safely implement the first check of the double checked lock schema outside of any locks. All you need to do is in addition to keeping a global dictionary make copies of the dictionary - one per processing thread and make every thread perform the check against its own dictionary. Any copy of the dictionary will be only accessed by the thread which owns it, so there is no locks necessary while checking this dictionary.

If the requested resource is located you can take it and use it. If it is not you have to check whether it was created by a different thread or you need to create it. To do that you have to  acquire a lock on the global dictionary, check if the resource is already there, create and add it to the global dictionary if it is not and, finally create a new copy of the global dictionary to replace the copy in your worker thread, the usual stuff - all of this while holding the lock.

This second part is no different from the straightforward implementation of the global resource dictionary. The difference is that if it were the only one, every request for the resource would have to go through a global lock which in the systems with a big number of simultaneous requests can create a severe bottleneck. Adding extra layer with per thread dictionaries remedies this problem - every thread which already has a resource in its local dictionary will access it without any locking.

Let me be explicit about some assumptions here.

  • First of all it is implied that all threads share a single copy of the resource. It means that either it has to be immutable or you will have to build some synchronization over access to the resource itself - which takes you back to the square one.
  • Secondly the dictionary itself has to be immutable as well so that the same copy can be shared by multiple threads.

If it sounds very F# than that's because it is - this is exactly how NDjango rendering engine is implemented and it is written in F#

There is one more thing I have to admit to. Speaking theoretically my approach does not solve anything - it just pushes the same problem to the process responsible for thread allocation. Think about it - you have a list of available threads and when a new request comes in, a thread has to be allocated to it and removed from the list. It is the same problem. Luckily for us people creating these parts of our software/hardware (like ASP.NET core) are really smart and they solved this problem for us in the most efficient manner, so for all practical purposes it does solve the problem.

Immutable settings make settings updates easy

Wednesday, November 25th, 2009

Sounds strange - huh? Here is the deal: Imagine you have a service running. It is happily processing incoming requests returning whatever it is supposed to return. Now imagine that in doing so it has to take into account certain global settings. The usual way to implement the settings is to put every value into a dictionary keyed by the setting name. The dictionary is owned by the service and every time some part of the service needs a setting value it pulls it from the dictionary.

In my case the service is the NDjango rendering engine. And to give you an example of a setting: an error string, a string to be inserted into the rendering result every time a django variable fails to resolve.

Now, what if you need to change a setting while the engine is running? The first thing which comes to mind is just go and stick a new value into the dictionary. Well... there are a few complications which may or may not be a real issue depending on what is the nature of the setting and how it is implemented. What if the change hits right in the middle of a request processing? Depending on the setting the impact can be anywhere from inconsistent response all the way through to the service crash. There is a number of ways to prevent such problems. You can clone the settings for every request. You can suspend processing, wait till all active request complete processing and only then apply changes. Or you can write your code so that every setting is accessed only once. All these remedies are difficult to implement and easy to break. Some of them can also have negative performance impact.

So how immutability can help here? Or better yet how can you change settings if they are immutable?

Here is something I picked up from my F# coding: instead of trying to change the values owned by existing service, create a new instance of service with updated values. Then leave the old instance deal with the existing requests and make sure that the new ones are directed to the new instance. This is a design pattern which certainly can be implemented in any language. Implementing it in F# provides additional safety of ensuring on the language level that the values are read only.

This design pattern can be used for any number of things - not just settings. In NDjango this pattern gave me a way to implement system - wide template caching with no global locks for templates already in the cache. Global locks are only involved when templates are loaded and parsed.