Implementing LRU cache
In my last post I mentioned that checking whatever a user is an administrator or not using Active Directory query can be slow. That means that we can just make use of that, we have to cache that.
When caching is involved, we have to consider a few things. When do we expire the data? How much memory are we going to use? How do we handle concurrency?
The first thing that pops to mind is the usage of MemoryCache, now part of the .NET framework and easily accessible. Sadly, this is a heavy weight object, it creates its own threads to manage its state, which probably means we don’t want to use it for a fairly simple feature like this.
Instead, I implemented the following:
public class CachingAdminFinder { private class CachedResult { public int Usage; public DateTime Timestamp; public bool Value; } private const int CacheMaxSize = 25; private static readonly TimeSpan MaxDuration = TimeSpan.FromMinutes(15); private readonly ConcurrentDictionary<SecurityIdentifier, CachedResult> cache = new ConcurrentDictionary<SecurityIdentifier, CachedResult>(); public bool IsAdministrator(WindowsIdentity windowsIdentity) { if (windowsIdentity == null) throw new ArgumentNullException("windowsIdentity"); if (windowsIdentity.User == null) throw new ArgumentException("Could not find user on the windowsIdentity", "windowsIdentity"); CachedResult value; if (cache.TryGetValue(windowsIdentity.User, out value) && (DateTime.UtcNow - value.Timestamp) <= MaxDuration) { Interlocked.Increment(ref value.Usage); return value.Value; } bool isAdministratorNoCache; try { isAdministratorNoCache = IsAdministratorNoCache(windowsIdentity.Name); } catch (Exception e) { log.WarnException("Could not determine whatever user is admin or not, assuming not", e); return false; } var cachedResult = new CachedResult { Usage = value == null ? 1 : value.Usage + 1, Value = isAdministratorNoCache, Timestamp = DateTime.UtcNow }; cache.AddOrUpdate(windowsIdentity.User, cachedResult, (_, __) => cachedResult); if (cache.Count > CacheMaxSize) { foreach (var source in cache .OrderByDescending(x => x.Value.Usage) .ThenBy(x => x.Value.Timestamp) .Skip(CacheMaxSize)) { if (source.Key == windowsIdentity.User) continue; // we don't want to remove the one we just added CachedResult ignored; cache.TryRemove(source.Key, out ignored); } } return isAdministratorNoCache; } private static bool IsAdministratorNoCache(string username) { // see previous post } }
Amusingly enough, properly handling the cache takes (much) more code than it takes to actually get the value.
We use ConcurrentDictionary as the backing store for our cache, and we enhance the value with usage & timestamp information. Those come in handy when the cache grows too big and need to be trimmed.
Note that we also make sure to check the source every 15 minutes or so, because there is nothing as annoying as “you have to restart the server for it to pick the change”. We also handle the case were we can’t get this information for some reason.
In practice, I doubt that we will ever hit the cache max size limit, but I wouldn’t have been able to live with myself without adding the check .
Comments
Doesn't the foreach throw an exception if you modify the cache inside the loop?
This cache suffers from cache stampeding. At 1000k requests per seconds and 2s to produce a cache value this will overwhelm the poor AD server. It will need to handle 2k requests in 2sec suddenly when the cache item expires.
I hate caches which use the CAS pattern (get, produce, try-insert). This only goes well if there is little load on the individual cache item.
Nice implementation, but how is it possible that there was no general-purpose LRU cache in Raven? And watch out for the API: Some time ago I have been tasked with implementing similar, windows-based security and had quite serious problems with accessing user information through UserPrincipal API (there were two mutually trusted domains where users could belong to groups in both of them - in such case the MS API sometimes kept throwing undocumented errors at me). And this made me wonder if we could give all that work to Windows file system - by checking user permissions to designated files representing various application rights.
A simple fix is not to cache a T but a Lazy[T] at the highest mode of synchronization.
ConcurrentDictionary does not work with a "version" like the generic collections and doesn't throw on modified collections while enumerating, it seems to handle most cases like expected :)
Leaves you with an empty dict.
@tobi, If he doesn't expect to even reach MaxCacheSize (25) then I see no problem with this implementation.
But do you have the name of the pattern that could handle the load you are describing?
You forgot to add a log entry when it does clear the cache, so that it is possible to see when cache trashing occurs.
Steve, That is a ConcurretnDictionary there, it won't throw under this scenario.
Tobi, Excellent commentary, I've modified the actual cache to use Lazy to avoid this issue.
Rafal, We actually have several. The difference is that each have different semantics. From Admin, we take into account duration, for caching execution plans, we take into account memory and time. We use specialized versions, because we don't have the need for a uber generic one.
Davh, Can you explain a bit more about what you mean here?
The most annoying part about LRU caches is that they are prune to cache trashing. If you have more users (in this case) than the max cache size, a LRU is worse than not having a cache. So it would therefore be very useful to get a log entry every time the max size is exceed. Then it is very visible for someone complaining about slow logins if they are in the situation where the cache is not working. And btw, you forgot to prune expired entries before the Least recently used.
Dennis, I see, good point. I fixed both issues, but I don't think either would have been a problem. This is a LRU for admins, the number of expected admins is low, and the number of requests requiring them is even lower :-)
You never know when someone chose to make all their users admin and let them access the db :) e.g. some test server in a software development company where they all have admin access to it.
Davh, The only requests that require admin privileges are things like "create new db, back a db, delete a db". Those tend to be fairly rare :-)
One small nitpick: this doesn't look like a LRU (Least Recently Used) cache, but more like a (LFU) Least Frequently Used cache.
If multiple threads call in on a cache miss at the same time you'll end up with a cache entry that has a usage of 1.
[)amien
Comment preview