The Slim Reader/Writer Lock in Orcas

Joe Duffy has posted about the new ReaderWriterLockSlim class in Orcas. Everything is cool, and I am very happy to see a replacement for the ReaderWriterLock. The Dynamic Proxy library had (twice!) bugs related to the way ReaderWriterLock was used (vs. the way it ought to work), which caused it to fail under high load.

So, I was very happy, until I got to the end of the post and saw this:

Lastly, I mentioned there are some caveats around where this lock’s use is appropriate. Well, there’s one, really: it’s not hardened to be reliable. This means a few things.

...

Next, the lock is not robust to asynchronous exceptions such as thread aborts and out of memory conditions. If one of these occurs while in the middle of one of the lock’s methods, the lock state can be corrupt, causing subsequent deadlocks, unhandled exceptions, and (sadly) due to the use of spin locks internally, a pegged 100% CPU.

So, I can't use this for anything that needs to be reliable. Where do I usually use a lock? In multi-threading scenarios, which often happen on servers, which have to be reliable. Hell, the way it sounds, using it in a web environment and editing the web.config is playing Russian roulette*. I tried to play with it on the January CTP, but it is not there yet.

What about cross-AppDomain stuff? Does it work across AppDomains? If so, it really does need to handle AppDomain unloads while keeping the process (and the lock) safe for further use. What happens if I am using plugins and I need to monitor rogue code and maybe kill it if it takes too long? That is a great DoS attack against my code (actually, throwing an exception from a new thread or simply causing a stack overflow will both do that as well).

* Okay, not fair, the CLR is supposed to handle AppDomain unload cleanly, but I am not sure whether this holds here as well.


Comments

08 Feb 2007
08:01 AM

Oran

I started getting disillusioned with the CLR's ability to make guarantees a while ago when we (foolishly) wrote our own server process from scratch. First was the lack of FIFO locks anywhere in the .NET Framework. I spoke with Joe Duffy and Jan Gray about this at PDC05, and neither of them was interested in enabling this. I have come to believe that Microsoft's philosophy is that if you want reliability and guarantees with your managed code, you need to leave that up to your unmanaged host, such as IIS, SQL Server, or WAS.

Another disillusionment point was the release of Windows Server 2003 SP1, which caused System.Threading.Timer in the 1.1 framework to eventually stop firing. You had to call PSS to get the fix. It was ONLY broken on Windows Server 2003. Yet another sign (perhaps unintentional) that Microsoft doesn't take .NET on servers seriously. If you have an unmanaged host like IIS recycling your code on a regular basis, why go to the extra effort of making .NET reliable?

See this post for the gory details of the timer issue which had a ripple effect on things like SqlClient.ConnectionPool and System.Web.HttpRuntime: http://groups.google.com/group/microsoft.public.dotnet.framework.clr/browse_thread/thread/772a5528aba714fa/a320226391613871

08 Feb 2007
08:12 AM

Ayende Rahien

This is very sad.

I have already had to write more than a few locks because they didn't exist (wait for consumers, for instance).

Considering that a .NET service is something that they are really going to start enabling with WCF (which will not sit on top of IIS in all cases), this is worrying.

A simple example of why I need reliability is a service that needs to have a huge cache in order to work. The first 15 minutes of loading the service are dedicated to populating that cache, which means that any shutdown usually leads to an outage of about 20 minutes, with any other application on the machine performing very badly.

The bug you describe is very scary, by the way.

08 Feb 2007
08:13 AM

Oran

People like Chris Brumme worked their tails off to make .NET reliable, so perhaps I'm being a bit harsh. Reading Chris's huge blog posts, it became clear that there were quite a few outstanding hard problems that just had to wait until later versions of the framework to be solved. Hopefully the work to enable greater parallelism in managed code will motivate better reliability.

08 Feb 2007
08:28 AM

Oran

The non-IIS unmanaged WCF hosting option is WAS - Windows Activation Service. It's basically IIS for all the other protocols, so they've got us covered if self-hosting with non-HTTP transports isn't reliable enough.

09 Feb 2007
06:08 AM

Joe Duffy

Ayende, AppDomain unloads are not a problem since RWLSlim can't be shared across them. Individual thread aborts are. Because ASP.NET doesn't use the CLR's V2 hosting interfaces, however, nothing the RWLSlim could do would help the situation; effectively any lock written in managed code would suffer from these problems (without extreme measures).

Regarding your question about isolation and plug-in tear-down, the CLR is working on a new AddIn model, described a bit in this MSDN article: http://msdn.microsoft.com/msdnmag/issues/07/02/CLRInsideOut/. Honestly, until then my personal advice is process isolation: it's the safest and least risky.

Oran, you are correct, the CLR places a lot (but not all) of the reliability responsibility in the hands of the host. A host is, after all, the only component that should be introducing individual async thread aborts without also unloading the AppDomain, so once the host decides to do this it also accepts the additional responsibility. Any use of aborts w/out using the hosting APIs to guarantee they are done safely is asking for problems.

The CLR locks are in fact weakly FIFO, but not strictly. I'm sorry if I seemed to shrug this off at the PDC, but there is actually a good reason for this design. It has to do with some pretty serious liveness problems that can result otherwise (and have in fact resulted in the past when locks in Windows were originally strictly FIFO). I wrote about this more here: http://www.bluebytesoftware.com/blog/PermaLink,guid,e40c2675-43a3-410f-8f85-616ef7b031aa.aspx. FWIW.

--joe

09 Feb 2007
06:32 AM

Ayende Rahien

@Joe,

Thanks for the response.

I am concerned about leaving reliability to the host, since I write quite a few Windows Services which have to be reliable, and those are running without a host.

My question about plug-in teardown is actually also relevant to ASP.NET timeout behavior, which will also abort a thread. Obviously separate processes are not an option here.

"Any use of aborts w/out using the hosting APIs to guarantee they are done safely is asking for problems."

Is there a way from the managed process to tell the host to kill a thread and do it safely? What happens when I am not running inside a host (windows service again)?

I am not sure that WAS is a good idea for those services at any rate; they mostly watch resources and act upon changes, rather than exposing services, etc.

09 Feb 2007
13:09 PM

Udi Dahan - The Software Simplis

When writing my own host (like a windows service) I usually wrap the thread class with one of my own - one which exposes a Stop method. In that way, I let the thread finish its current unit of work before it stops at some safe point. This keeps my locks in a consistent state when threads start and stop.

10 Feb 2007
18:28 PM

Joe Duffy

Hi Ayende,

There should be no thread aborts happening in an unhosted Windows Service, so this ought not to be an issue.

(Any rogue, trusted code can call Thread.Abort so long as it has a reference to the Thread object, but this is a highly discouraged practice.)

The right way to shut down a thread is exactly what Udi suggests: cooperative shutdown, by polling a shared flag at safe points. The flag is set by the shutdown initiator, causing the thread to voluntarily shut down. At some point, I hope the Framework gives better support for things like IO Cancellation to make this approach more responsive in blocking scenarios. You can also consider using thread interrupts to wake up blocked threads, though there are some pros/cons to this.

ASP.NET isn’t safe in the way it aborts threads, ever, regardless of whether the lock calls Begin/EndCriticalRegion or not. That’s because it doesn’t use the new V2.0 hosting interfaces and will attempt to cancel individual threads rather than the entire AppDomain in many cases.

Reliable locks can be built in managed code to tolerate this kind of thing, but only if you introduce lengthy delay-abort regions, possibly even while blocking, which is a horrible practice (since it prevents, say, ASP.NET from reclaiming a thread).

Reliability is mostly about statistics, and statistically speaking, most of this won't be a problem most of the time. It's likely there will never be a foolproof way to do any of this, which means failsafes need to be built into the system, to deal with data corruption, unresponsive threads, and other statistically infrequent undesirable events.

--joe
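
The cooperative shutdown that Udi and Joe describe above can be sketched roughly as follows. This is a minimal illustration in Python rather than .NET code, and every name in it (StoppableWorker, unit_of_work, poll_interval) is hypothetical; it is not Udi's actual wrapper. The worker checks the shared flag only at a safe point, after the lock has been released, so a stop request never cuts off a thread that is still holding the lock.

```python
import threading
import time


class StoppableWorker:
    """Hypothetical thread wrapper: runs units of work until asked to stop."""

    def __init__(self, unit_of_work, poll_interval=0.05):
        self._unit_of_work = unit_of_work
        self._poll_interval = poll_interval
        # The shared flag: set by whoever initiates the shutdown.
        self._stop_requested = threading.Event()
        # Stands in for whatever lock the real work would take.
        self._lock = threading.Lock()
        self.completed_units = 0
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        while not self._stop_requested.is_set():
            # One unit of work, done entirely under the lock.
            with self._lock:
                self._unit_of_work()
                self.completed_units += 1
            # Safe point: the lock is released here. wait() returns early
            # if a stop is requested while the worker is idle.
            self._stop_requested.wait(self._poll_interval)

    def stop(self, timeout=None):
        """Request shutdown; return True if the thread exited in time."""
        self._stop_requested.set()
        self._thread.join(timeout)
        return not self._thread.is_alive()


if __name__ == "__main__":
    worker = StoppableWorker(lambda: time.sleep(0.01))
    worker.start()
    time.sleep(0.2)
    print("stopped cleanly:", worker.stop(timeout=2.0))
```

stop() returns False when the worker has not exited within the timeout, which gives the caller a place to hang the kind of failsafe Joe mentions for unresponsive threads.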

Oren Eini

CEO of RavenDB