As Bryan has observed the past, software has a quality unique to engineering disciplines in that you can build it, but you can’t see it. DTrace changes that by opening windows into parts of the system that were previously unobservable and it does so in a way that minimally changes what you’re attempting to observe — this software “uncertainty principle” has limited the utility of previous observability tools. One of the darkest areas of debugging in user-land has been around lock contention.
In multi-threaded programs synchronization primitives — mutexes, R/W locks, semaphores, etc. — are required to coordinate each thread’s efforts and make sure shared data is accessed safely. If many threads are kept waiting while another thread owns a sychronization primitive, the program is said to suffer from lock contention. In the kernel, we’ve had lockstat(1m) for many years, but in user-land, the techniqes for observing lock behavior and sorting out the cause or even the presence have been very ad hoc
the plockstat provider
I just finished work on the plockstat provider for DTrace as well as a new plockstat(1m) command for observing user-land synchronization objects. If you’re unfamiliar with DTrace, you might want to take a quick look at the Solaris Dynamic Tracing Guide (look through it for some examples); that will help ground some of this explanation.
The plockstat provider has these probes:
mutex-acquire | fires when a mutex is acquired |
mutex-release | fires when a mutex is released |
mutex-block | fires when a thread blocks waiting for a mutex |
mutex-spin | fires when a thread spins waiting for a mutex |
rw-acquire | fires when an R/W lock is acquired |
rw-release | fires when an R/W lock is released |
rw-block | fires when a thread blocks waiting for an R/W lock |
It’s possible with other tools to observe these points, but — as anyone who’s tried it can attest — other tools can alter the effects you’re trying to observe. Traditional debuggers can effectively serialize your parallel program removing any trace of the lock contention you’d see during a normal run. DTrace and the plockstat provider avoid eliminate this problem.
With the plockstat provider you can answer questions that were previously very difficult to solve, such as “where is my program blocked on mutexes”:
bash-2.05b# dtrace -n plockstat1173:::mutex-block'{ @[ustack()] = count() }' dtrace: description 'plockstat1173:::mutex-block' matched 2 probes ^C libc.so.1`mutex_lock_queue+0xa9 libc.so.1`slow_lock+0x3d libc.so.1`mutex_lock_impl+0xec libc.so.1`mutex_lock+0x38 libnspr4.so`PR_Lock+0x1a libnspr4.so`PR_EnterMonitor+0x35 libxpcom.so`__1cGnsPipePGetWriteSegment6MrpcrI_I_+0x3e libxpcom.so`__1cSnsPipeOutputStreamNWriteSegments6MpFpnPnsIOutputStream_pvpcIIpI_I3I5_I_+0x4f c4654d3c libxpcom.so`__1cUnsThreadPoolRunnableDRun6M_I_+0xb0 libxpcom.so`__1cInsThreadEMain6Fpv_v_+0x32 c4ec1d6a libc.so.1`_thr_setup+0x50 libc.so.1`_lwp_start 1
(any guesses as to what program this might be?)
Not just a new view for DTrace, but a new view for user-land.
the plockstat(1m) command
DTrace is an incredibly powerful tool, but some tasks are so common that we want to make it as easy as possible to use DTrace’s facilities without knowing anything about DTrace. The plockstat(1m) command wraps up a bunch of knowledge about lock contention in a neat and easy to use package:
# plockstat -s 10 -A -p `pgrep locker` ^C Mutex block ------------------------------------------------------------------------------- Count nsec Lock Caller 13 22040260 locker`lock1 locker`go_lock+0x47 nsec ---- Time Distribution --- count Stack 65536 |@@@@@@@@@@@@@@ | 8 libc.so.1`mutex_lock+0x38 131072 | | 0 locker`go_lock+0x47 262144 |@@@@@ | 3 libc.so.1`_thr_setup+0x50 524288 | | 0 libc.so.1`_lwp_start 1048576 | | 0 2097152 | | 0 4194304 | | 0 8388608 | | 0 16777216 |@ | 1 33554432 | | 0 67108864 | | 0 134217728 | | 0 268435456 |@ | 1 ...
This has been a bit of a teaser. I only integrated plockstat into Solaris 10 yesterday and it will be a few weeks before you can access plockstat as part of the Solaris Express program, but keep an eye on the DTrace Solaris Express Schedule.