Operating Systems
Topic: Concurrency – Locks
Endadul Hoque
Acknowledgement
• Youjip Won (Hanyang University)
• OSTEP book – by Remzi and Andrea Arpaci-Dusseau
(University of Wisconsin)
Locks: The basic idea
• Ensure that any critical section executes as if it were a
single atomic instruction.
– An example: the canonical update of a shared variable
balance = balance + 1;
– Add some code around the critical section
1 lock_t mutex; // some globally-allocated lock ‘mutex’
2 ...
3 lock(&mutex);
4 balance = balance + 1;
5 unlock(&mutex);
Locks: The basic idea
• Lock variable holds the state of the lock.
– available (or unlocked or free)
• No thread holds the lock.
– acquired (or locked or held)
• Exactly one thread holds the lock and presumably is in a
critical section.
The semantics of the lock()
• lock()
– Try to acquire the lock.
– If no other thread holds the lock, the thread will
acquire the lock.
– Enter the critical section.
• This thread is said to be the owner of the lock.
– Other threads are prevented from entering the critical
section while the first thread that holds the lock is in
there.
Pthread Locks - mutex
• The name that the POSIX library uses for a lock.
– Used to provide mutual exclusion between threads.
1 pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
2
3 Pthread_mutex_lock(&lock); // wrapper for pthread_mutex_lock()
4 balance = balance + 1;
5 Pthread_mutex_unlock(&lock);
– Instead of using one big lock (a coarse-grained approach), we
may be using different locks to protect different variables (a more
fine-grained approach) → Increase concurrency.
Building A Lock
• Efficient locks must provide mutual
exclusion at low cost.
• Building a lock needs some help from the
hardware and the OS.
Evaluating locks – Basic criteria
• Correctness (aka Mutual exclusion)
– Does the lock work, preventing multiple threads from entering a
critical section?
• Fairness
– Does each thread contending for the lock get a fair shot at
acquiring it once it is free? (Addressing starvation)
• Performance
– The time overheads added by using the lock
• Cases: (a) No contention, (b) contention on single CPU, (c) contention on
multiple CPUs
Naïve solution: Controlling Interrupts
• Disable Interrupts for critical sections
– One of the earliest solutions used to provide mutual exclusion
– Invented for single-processor systems.
1 void lock() {
2 DisableInterrupts();
3 }
4 void unlock() {
5 EnableInterrupts();
6 }
Naïve solution: Controlling Interrupts
• Disable Interrupts for critical sections
– One of the earliest solutions used to provide mutual exclusion
– Invented for single-processor systems.
1 void lock() {
2 DisableInterrupts();
3 }
4 void unlock() {
5 EnableInterrupts();
• Problems:
6 }
– Require too much trust in applications
• Greedy (or malicious) program could monopolize the processor.
– Do not work on multiprocessors
– Disabling interrupts for a longer time can lose important interrupts
– Code that masks or unmasks interrupts be executed slowly by modern
CPUs
Why is hardware support needed?
• First attempt: Using a flag denoting whether the lock is
held or not.
– The code below has problems.
1 typedef struct __lock_t { int flag; } lock_t;
2
3 void init(lock_t *mutex) {
4 // 0 → lock is available, 1 → held
5 mutex->flag = 0;
6 }
7
8 void lock(lock_t *mutex) {
9 while (mutex->flag == 1) // TEST the flag
10 ; // spin-wait (do nothing)
11 mutex->flag = 1; // now SET it !
12 }
13
14 void unlock(lock_t *mutex) {
15 mutex->flag = 0;
16 }
Why is hardware support needed?
– Problem 1: No Mutual Exclusion
Thread1 Thread2
Assume
call lock()
flag=0 while (flag == 1)
interrupt: switch to Thread 2
call lock()
while (flag == 1)
flag = 1;
interrupt: switch to Thread 1
flag = 1; // set flag to 1 (too!)
– Problem 2: Spin-waiting wastes time waiting for another thread.
• So, we need an atomic instruction supported by Hardware!
– E.g., test-and-set instruction, also known as atomic exchange
Test And Set (Atomic Exchange)
• An H/W instruction to support the creation of simple
locks
• This instruction operates as follows (C pseudocode):
1 int TestAndSet(int *ptr, int new) {
2 int old = *ptr; // fetch old value at ptr
3 *ptr = new; // store ‘new’ into ptr
4 return old; // return the old value
5 }
– return old value pointed to by the ptr (testing)
– Simultaneously update said value to new (setting)
– This sequence of operations is performed atomically.
A Simple Spin Lock using test-and-set
1 typedef struct __lock_t {
2 int flag;
3 } lock_t;
4
5 void init(lock_t *lock) {
6 // 0 indicates that lock is available,
7 // 1 that it is held
8 lock->flag = 0;
9 }
10
11 void lock(lock_t *lock) {
12 while (TestAndSet(&lock->flag, 1) == 1)
13 ; // spin-wait
14 }
15
16 void unlock(lock_t *lock) {
17 lock->flag = 0;
18 }
Note: To work correctly on a single processor,
it requires a preemptive scheduler.
Evaluating Spin Locks (using Test-and-
Set)
• Correctness: yes
– The spin lock only allows a single thread to entry the critical section.
• Fairness: no
– Spin locks don’t provide any fairness guarantees.
– Indeed, a thread spinning may spin forever.
• Performance:
– In the single CPU, performance overheads can be quite substantial.
– If the number of threads roughly equals the number of CPUs, spin
locks work reasonably well.
Some other H/W instructions for Spin
Locks
• Compare-and-Swap
• Load-Linked and Store-Conditional
• Fetch-And-Add
If interested to learn more on this, read sections 28.9, 28.10, 28.11
from the book chapter
So Much Spinning
• Hardware-based spin locks are simple and they
work.
• In some cases, these solutions can be quite
inefficient.
– Any time a thread gets caught spinning, it wastes an
entire time slice doing nothing but checking a value.
How To Avoid Spinning?
We’ll need OS Support too!
A Simple Approach: Just Yield
• When you are going to spin, give up the CPU to another thread.
– OS system call moves the caller from the running state to the ready state.
– The cost of a context switch can be substantial and the starvation
problem still exists.
1 void init() {
2 flag = 0;
3 }
4
5 void lock() {
6 while (TestAndSet(&flag, 1) == 1)
7 yield(); // give up the CPU
8 }
9
10 void unlock() {
11 flag = 0;
12 }
Lock with Test-and-set and Yield
Using Queues: Sleeping Instead of
Spinning
• Queue to keep track of which threads are waiting to
acquire the lock.
• park()
– Put a calling thread to sleep
• unpark(threadID)
– Wake up a particular thread as designated by threadID.
Using Queues: Sleeping Instead of
Spinning
1 typedef struct __lock_t { int flag; int guard; queue_t *q; } lock_t;
2
3 void lock_init(lock_t *m) {
4 m->flag = 0;
5 m->guard = 0;
6 queue_init(m->q);
7 }
8
9 void lock(lock_t *m) {
10 while (TestAndSet(&m->guard, 1) == 1)
11 ; // acquire guard lock by spinning (another lock, used internally)
12 if (m->flag == 0) {
13 m->flag = 1; // acquire lock m
14 m->guard = 0; // release guard lock
15 } else {
16 queue_add(m->q, gettid()); // lock m is not available,so queue myself
17 m->guard = 0; // release guard lock
18 park(); // let me sleep
19 }
20 }
21 …
Lock With Queues, Test-and-set, Yield, And Wakeup
Using Queues: Sleeping Instead of
Spinning
22 void unlock(lock_t *m) {
23 while (TestAndSet(&m->guard, 1) == 1)
124 typedef struct __lock_t
; // acquire { int
guard flag;
lock int guard; queue_t *q; } lock_t;
by spinning
225 if (queue_empty(m->q))
326 void lock_init(lock_t *m)let
m->flag = 0; // { go of lock; no one wants it
427 m->flag
else = 0;
528 m->guard = 0;
unpark(queue_remove(m->q)); // wake up a thread, hold lock for it
629 queue_init(m->q);
m->guard = 0;
730 }}
8
9 void lock(lock_t *m) {
10 while (TestAndSet(&m->guard, 1) == 1)
11 ; // acquire guard lock by spinning (another lock, used internally)
12 if (m->flag == 0) {
13 m->flag = 1; // acquire lock m
14 m->guard = 0; // release guard lock
15 } else {
16 queue_add(m->q, gettid()); // lock m is not available,so queue myself
17 m->guard = 0; // release guard lock
18 park(); // let me sleep
19 }
20 }
21 …
Lock With Queues, Test-and-set, Yield, And Wakeup
Wakeup/waiting race
• Scenario:
1. Thread A holds the lock
2. Thread B attempts to acquire lock, hence ends up adding itself
to the queue; but before it could call park(), context switch
happens
3. Thread A releases the lock and calls unpark(), but no thread to
wake up
4. Thread B regains the CPU and calls park() now; thus Thread B
goes to sleep (and potentially sleeps forever).
Wakeup/waiting race
• Solaris solves this problem by adding a third system call:
setpark().
– By calling this routine, a thread can indicate it is about to park.
– If it happens to be interrupted and another thread calls unpark
before park is actually called, the subsequent park returns
immediately instead of sleeping.
1 queue_add(m->q, gettid());
2 setpark(); // new code
3 m->guard = 0;
4 park();
Code modification inside of lock()
Futex
• Linux provides a futex (is similar to Solaris’s park
and unpark), part of nptl library
– futex_wait(address, expected)
• Put the calling thread to sleep
• If the value at address is not equal to expected, the call
returns immediately.
– futex_wake(address)
• Wake one thread that is waiting on the queue.
Reading Material
• Chapter 28 of OSTEP book – by Remzi and Andrea
Arpaci-Dusseau (University of Wisconsin)
http://pages.cs.wisc.edu/~remzi/OSTEP/threads-
locks.pdf
Questions?