Rules, Delegation, and Yielders, oh my!

As I said, I've just written a rule that used pretty much all the new features that C# 2 has to offer. I'm aware that I tend to like new things for their own sake, but I like to think that I'm using them wisely. Let's take an example, since in my opinion that is the best way to judge. I present here a case similar to the one I ended up with, and I would appreciate comments about the readability and maintainability of the code.

I have a graph of objects, and I need to run various rules against it. Traversing the graph is trivial, but I usually need to do it while taking all sorts of things into account. In this case, I basically need to compare the two, and generate notifications (using yield return) based on what is going on there. Let's see an example, and then I can talk and be reasonably sure that you understand what I'm talking about.

First, let's take a relatively simple scenario that brings together this and several other concepts that I've talked about earlier. The scenario is simple: we have a context under which we operate, and we need to apply similar rules to different contexts. Most of the rules operate on time, for example: a store must sell at least 50 items a day.

It's a simple rule, isn't it? But the stores don't hold the data in a daily format; they hold a list of sales, and each sale has its associated date. The sales are already sorted by date, so it should be very easy to do, right? Well, it is, as long as there is no day without any sales. Does such a day trigger the rule? That depends. If the store was closed that day, it doesn't, but if the store was open and didn't sell a thing, there is probably something there that needs more attention. What happens if you want to run the rule with a granularity of a week? A month? A quarter? What happens if the rule has a granularity of a month but the data we have covers only a week? This gets messy very fast.

In the end I decided to create a class that would encapsulate all those decisions, and that class would use the context to get the various options it needs (for instance, when the store is closed). Here is what I ended up with. Pay attention to GetItemsOnSameDay() and GetItemsOnSameWeek(); that is where the crux is.

public class StoreWalker
{
    private readonly Context context;
    // Converter<TInput, TOutput> is the generic delegate from the System namespace.
    private Converter<ISalesContext, int> action = DefaultAction;

    public StoreWalker(Context context)
    {
        this.context = context;
    }

    // Static, so that the field initializer above is allowed to refer to it.
    private static int DefaultAction(ISalesContext salesContext)
    {
        return 1;
    }

    public Converter<ISalesContext, int> Action
    {
        set
        {
            if (value == null)
                throw new ArgumentNullException("value");
            action = value;
        }
    }

    public int Walk(Store store)
    {
        int count = 0;
        foreach (ISalesContext salesContext in GetItemsOnSameDay(store))
        {
            count += action(salesContext);
        }
        return count;
    }

    // Yields one context per day from context.Start up to context.End,
    // including days without any sales.
    public IEnumerable<ISalesContext> GetItemsOnSameDay(Store store)
    {
        List<Sale> items = new List<Sale>();
        DateTime currentDate = context.Start.Date;
        foreach (Sale item in context.GetItems(store))
        {
            // Compare calendar dates with <, so a sale that carries a time of day
            // cannot send the loop past its date forever.
            while (currentDate < item.Date.Date)
            {
                yield return new SalesContext(items, currentDate);
                items = new List<Sale>();
                currentDate = currentDate.AddDays(1);
            }
            items.Add(item);
        }
        while (currentDate < context.End)
        {
            yield return new SalesContext(items, currentDate);
            items = new List<Sale>();
            currentDate = currentDate.AddDays(1);
        }
    }

    // Builds on GetItemsOnSameDay() and closes a week on every Sunday.
    // A trailing partial week is not yielded.
    public IEnumerable<ISalesContext> GetItemsOnSameWeek(Store store)
    {
        List<Sale> items = new List<Sale>();
        DateTime start = context.Start;
        foreach (ISalesContext salesContext in GetItemsOnSameDay(store))
        {
            items.AddRange(salesContext.Sales);
            if (salesContext.Date.DayOfWeek == DayOfWeek.Sunday)
            {
                yield return new SalesContext(items, start, salesContext.Date);
                // The next week starts on the day after this Sunday.
                start = salesContext.Date.AddDays(1);
                items = new List<Sale>();
            }
        }
    }
}

What GetItemsOnSameDay() does is provide a way to iterate over a store while making sure that every day is accounted for, including days without sales. GetItemsOnSameWeek() builds on GetItemsOnSameDay(). I can very easily add methods that would make the calculations for a month or a quarter, and they would be simple, since they wouldn't need to check for missing weeks or months.
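
To make the layering concrete, here is a minimal, self-contained sketch of the same idea. It is not the code from my project: DayBucket, GapFillingDemo, ByDay and ByMonth are hypothetical names, sales are reduced to bare dates, and I assume the dates are sorted and all fall inside [start, end).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for ISalesContext: a day and its sale count.
public class DayBucket
{
    public readonly DateTime Date;
    public readonly int SaleCount;

    public DayBucket(DateTime date, int saleCount)
    {
        Date = date;
        SaleCount = saleCount;
    }
}

public static class GapFillingDemo
{
    // Yields one bucket per calendar day in [start, end), including empty days.
    // Assumes saleDates is sorted ascending and lies inside that range.
    public static IEnumerable<DayBucket> ByDay(
        IEnumerable<DateTime> saleDates, DateTime start, DateTime end)
    {
        DateTime current = start.Date;
        int count = 0;
        foreach (DateTime sale in saleDates)
        {
            // Emit every day that ended before this sale, empty or not.
            while (current < sale.Date)
            {
                yield return new DayBucket(current, count);
                count = 0;
                current = current.AddDays(1);
            }
            count++;
        }
        // Emit the remaining days up to (but not including) end.
        while (current < end)
        {
            yield return new DayBucket(current, count);
            count = 0;
            current = current.AddDays(1);
        }
    }

    // Monthly totals on top of the daily stream. No gap handling is needed
    // here, because the daily stream already contains every day.
    public static IEnumerable<KeyValuePair<DateTime, int>> ByMonth(
        IEnumerable<DayBucket> days)
    {
        bool any = false;
        DateTime monthStart = DateTime.MinValue;
        int total = 0;
        foreach (DayBucket day in days)
        {
            DateTime thisMonth = new DateTime(day.Date.Year, day.Date.Month, 1);
            if (any && thisMonth != monthStart)
            {
                yield return new KeyValuePair<DateTime, int>(monthStart, total);
                total = 0;
            }
            monthStart = thisMonth;
            any = true;
            total += day.SaleCount;
        }
        if (any)
            yield return new KeyValuePair<DateTime, int>(monthStart, total);
    }
}
```

With two sales on January 30 and one on February 2, walking from January 30 up to February 4 gives five daily buckets (two of them in the middle are empty), and ByMonth over those buckets gives a total of 2 for January and 1 for February.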

The nice part about it is that the whole thing is lazily evaluated, so if you're running it on large amounts of data, there is no need to build the graph first and then run over it; you process the graph as you go along, and the objects can then be GC'ed. However, I'm not sure what the performance implications are. There are some problems with recursive iterators that you should be concerned about if you're using them on large data sets. I don't think that this applies here, as it's not about blindly forwarding method calls (as in the case of recursive iterators); each iterator in this case does work of its own, and that makes the system conceptually much easier to understand.
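
Here is a small sketch of what the lazy evaluation buys you, under simplifying assumptions: LazyDemo and its members are hypothetical names, days are plain integers, and a week is just every seven days. The counter records how many days the inner iterator actually produced before the consumer stopped.

```csharp
using System;
using System.Collections.Generic;

public static class LazyDemo
{
    // Counts how many days the inner iterator has produced so far.
    public static int Produced = 0;

    // Inner iterator: one value per day, produced only when asked for.
    public static IEnumerable<int> Days(int total)
    {
        for (int i = 0; i < total; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // Outer iterator: does its own work (grouping by seven) on top of Days.
    // The nesting depth is fixed at two, unlike a recursive iterator whose
    // depth grows with the data.
    public static IEnumerable<int> Weeks(IEnumerable<int> days)
    {
        int inWeek = 0;
        foreach (int day in days)
        {
            inWeek++;
            if (inWeek == 7)
            {
                yield return day / 7; // zero-based week number
                inWeek = 0;
            }
        }
    }

    // Consumes only the first weeksWanted weeks and reports how many days
    // had to be produced to get there.
    public static int DaysNeededFor(int totalDays, int weeksWanted)
    {
        Produced = 0;
        int seen = 0;
        foreach (int week in Weeks(Days(totalDays)))
        {
            seen++;
            if (seen == weeksWanted)
                break;
        }
        return Produced;
    }
}
```

Calling LazyDemo.DaysNeededFor(1000000, 2) returns 14: even though a million days are available, only the fourteen needed for two weeks are ever produced.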

The more esoteric stuff comes later, when you need to run the rules over a store. If I were using .NET 1.1 or Java, I would need to use the Template Method pattern, and for each rule I would have to create a subclass of StoreWalker and put the things that I need to check there. I find it limiting because most of the time I only need a couple of lines to express the difference, and I only use it in one place. I chose to go with a Generic Template Delegate.
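
To show the difference side by side, here is a stripped-down sketch of both approaches. The names (TemplateWalker, SlowDayCounter, DelegateWalker, Comparison) are hypothetical, and a day is reduced to its number of sales; the rule is the "at least 50 items a day" one from above.

```csharp
using System;
using System.Collections.Generic;

// Template Method: the walk is fixed, and every rule is a new subclass.
public abstract class TemplateWalker
{
    public int Walk(IEnumerable<int> dailySales)
    {
        int total = 0;
        foreach (int sales in dailySales)
            total += Visit(sales);
        return total;
    }

    protected abstract int Visit(int salesForDay);
}

// One class per rule, even though the rule itself is a single line.
public class SlowDayCounter : TemplateWalker
{
    protected override int Visit(int salesForDay)
    {
        return salesForDay < 50 ? 1 : 0;
    }
}

// Delegate version: the walk is fixed, and the rule is passed in.
public class DelegateWalker
{
    private readonly Converter<int, int> action;

    public DelegateWalker(Converter<int, int> action)
    {
        if (action == null)
            throw new ArgumentNullException("action");
        this.action = action;
    }

    public int Walk(IEnumerable<int> dailySales)
    {
        int total = 0;
        foreach (int sales in dailySales)
            total += action(sales);
        return total;
    }
}

public static class Comparison
{
    public static int SlowDaysWithTemplate(IEnumerable<int> dailySales)
    {
        return new SlowDayCounter().Walk(dailySales);
    }

    public static int SlowDaysWithDelegate(IEnumerable<int> dailySales)
    {
        // The rule lives right where it is used, as an anonymous delegate.
        DelegateWalker walker = new DelegateWalker(
            delegate(int salesForDay) { return salesForDay < 50 ? 1 : 0; });
        return walker.Walk(dailySales);
    }
}
```

For daily sales of 60, 10, 50 and 49, both versions report two slow days. The only difference is where the one-line rule lives.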

How do we find out how many items were sold from the store?

public int GetNumberOfItemsSold()
{
    StoreWalker sw = new StoreWalker(Context);
    sw.Action = delegate(ISalesContext salesContext) { return salesContext.Sales.Count; };
    return sw.Walk(Store);
}

How do we find the busiest day? This one uses local variables, so it can "remember" what the current maximum is.

public DateTime GetBusiestDay()
{
    StoreWalker sw = new StoreWalker(Context);
    int max = -1;
    // Must be assigned here, since it is read after the walk.
    ISalesContext maxSaleContext = null;
    sw.Action = delegate(ISalesContext salesContext)
    {
        if (salesContext.Sales.Count > max)
        {
            max = salesContext.Sales.Count;
            maxSaleContext = salesContext;
        }
        return 1;
    };
    sw.Walk(Store);
    return maxSaleContext == null ? DateTime.MinValue : maxSaleContext.Date;
}

Total number of days in the period? The default action returns 1 for every day, so there is nothing to wire up.

public int GetNumberOfDays()
{
    return new StoreWalker(Context).Walk(Store);
}

I chatted with Oren Ellenbogen about this, and he claims that the maintainability of the code is compromised. The idea is that it would cost you five minutes to do it the old way (using Template Method), and then it's more maintainable. I argued that while this does require a shift in thinking, this type of technique pays off very quickly. I certainly don't advocate putting a large method inside an anonymous delegate, but even if it's a complex calculation that you want to do, you can still get the benefits by merely writing a method and then wiring it to the walker's action.
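
As a rough illustration of that last point, here is a sketch where the calculation lives in an ordinary named method and is wired to the action through a method group conversion. MiniWalker, Rules and the penalty weights are all hypothetical, and a day is again reduced to its number of sales.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, cut-down walker with the same settable Action idea.
public class MiniWalker
{
    private Converter<int, int> action = delegate(int salesForDay) { return 1; };

    public Converter<int, int> Action
    {
        set
        {
            if (value == null)
                throw new ArgumentNullException("value");
            action = value;
        }
    }

    public int Walk(IEnumerable<int> dailySales)
    {
        int total = 0;
        foreach (int sales in dailySales)
            total += action(sales);
        return total;
    }
}

public static class Rules
{
    // A longer calculation stays in a normal method that can be read and
    // tested on its own. The weights are made up for the example.
    public static int PenaltyForDay(int salesForDay)
    {
        if (salesForDay >= 50)
            return 0;   // the rule is satisfied
        if (salesForDay == 0)
            return 10;  // an open day with no sales at all weighs more
        return 1;       // an ordinary slow day
    }

    public static int TotalPenalty(IEnumerable<int> dailySales)
    {
        MiniWalker walker = new MiniWalker();
        // Method group conversion: no anonymous delegate, no subclass.
        walker.Action = PenaltyForDay;
        return walker.Walk(dailySales);
    }
}
```

For daily sales of 60, 0, 49 and 50, Rules.TotalPenalty returns 11 (0 + 10 + 1 + 0). The walker does not care whether it was handed an anonymous delegate or a named method.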

Thoughts?