A note about performance
On a multi-core Mac running OS X 10.8, I ran a test comparing the speed of incrementing or decrementing a value with no locking, with atomics, and with a mutex lock.
The no-lock case is the baseline; here is how the latter two compare against it:
Atomics: 3X slower on the same thread
Mutex: 7X slower on the same thread
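
For anyone who wants to try a similar comparison, here is a minimal sketch in C. It is not the code used for the test above; the function names (count_plain, count_atomic, count_mutex, time_seconds) are made up for illustration, and it assumes a compiler with C11 atomics and POSIX threads. Everything runs on a single thread, so it only measures the uncontended cost of each approach.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Baseline: plain increment, no synchronization. volatile keeps the
   compiler from collapsing the loop into a single addition. */
long count_plain(long iterations) {
    volatile long counter = 0;
    for (long i = 0; i < iterations; i++) {
        counter++;
    }
    return counter;
}

/* Atomic increment using C11 atomics. */
long count_atomic(long iterations) {
    atomic_long counter = 0;
    for (long i = 0; i < iterations; i++) {
        atomic_fetch_add(&counter, 1);
    }
    return atomic_load(&counter);
}

/* Increment guarded by a pthread mutex (never contended here). */
long count_mutex(long iterations) {
    pthread_mutex_t lock;
    int rc = pthread_mutex_init(&lock, NULL);
    assert(rc == 0);
    volatile long counter = 0;
    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&lock);
        counter++;
        pthread_mutex_unlock(&lock);
    }
    pthread_mutex_destroy(&lock);
    return counter;
}

/* CPU time, in seconds, taken by one call to fn(iterations). */
double time_seconds(long (*fn)(long), long iterations) {
    clock_t start = clock();
    fn(iterations);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

To get ratios, call time_seconds with each of the three functions and a large iteration count (say, 100 million), then divide the atomic and mutex times by the baseline time. The numbers will vary with hardware, OS version and compiler flags, so don't expect them to match the 3X and 7X figures exactly.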