OPERATING SYSTEMS (CS F372)
Threads
Dr. Barsha Mitra
BITS Pilani CSIS Dept., BITS Pilani, Hyderabad Campus
Motivation
Most modern applications are multithreaded
Threads run within application
Multiple tasks within the application can be implemented by separate threads
Update display
Fetch data
Spell checking
Answer a network request
Process creation is heavy-weight while thread creation is light-weight
Can simplify code, increase efficiency
Multithreaded Server Architecture
What is a Thread?
Basic unit of CPU utilization
Comprises a thread ID, a program counter, a register set, and a stack
Shares with other threads belonging to the same process:
code section
data section
OS resources like open files
Benefits
Responsiveness – may allow continued execution if part of process is blocked,
especially important for user interfaces in interactive environments
Resource Sharing – threads share resources of process, easier than shared
memory or message passing
Economy – cheaper than process creation, thread switching has lower overhead
than context switching
Scalability – process can take advantage of multiprocessor architectures
Multicore Programming
Multicore or multiprocessor systems put pressure on programmers;
programming challenges include:
Identifying tasks
Balance
Data splitting
Data dependency
Testing and debugging
Parallelism implies a system can perform more than one task simultaneously
Concurrency supports more than one task making progress
Single processor / core, scheduler providing concurrency
Types of parallelism
Data parallelism – distributes subsets of the same data across multiple cores,
same operation on each subset
Task parallelism – distributing tasks/threads across cores, each thread
performing unique operation, threads may be operating on same or different
data
Concurrency vs Parallelism
Concurrent execution on single-core system:
Parallelism on a multi-core system:
User Threads and Kernel Threads
User threads - management done by user-level threads library without kernel
support
Three primary thread libraries:
POSIX Pthreads
Windows threads
Java threads
Kernel threads - Supported and managed by the Kernel
Examples – virtually all general-purpose operating systems support kernel
threads, including:
Windows
Solaris
Linux
Tru64 UNIX
Mac OS X
Multithreading Models
Many-to-One
One-to-One
Many-to-Many
Many-to-One Model
Many user-level threads mapped to
single kernel thread
Thread management done by thread
library in user space
One thread blocking causes all to block
Multiple threads may not run in parallel
on a multicore system because only one
can access the kernel at a time
Few systems currently use this model
Examples:
Solaris Green Threads
One-to-One Model
Each user-level thread maps to a kernel
thread
Creating a user-level thread creates a
kernel thread
More concurrency than many-to-one
Number of threads per process
sometimes restricted due to overhead
Examples:
Windows
Linux
Solaris 9 and later
Many-to-Many Model
Allows many user level threads to be
mapped to many kernel threads
Allows the operating system to create a
sufficient number of kernel threads
Examples:
Solaris prior to version 9
Windows with the ThreadFiber package
Two-Level Model
Similar to M:M, except that it allows a
user thread to be bound to a kernel
thread
Examples
IRIX
HP-UX
Tru64 UNIX
Solaris 8 and earlier
Thread Libraries
Thread library provides programmer with API for creating and managing threads
Two primary ways of implementing
Library entirely in user space
Kernel-level library supported by the OS
Pthreads
May be provided either as user-level or kernel-level
A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization
Specification, not implementation
API specifies behavior of the thread library; implementation is up to the developers
of the library
Common in UNIX operating systems (Solaris, Linux, Mac OS X)
Pthreads Example
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

int sum; /* this data is shared by the thread(s) */
void *runner(void *param); /* threads call this function */

int main(int argc, char *argv[])
{
    pthread_t tid;       /* the thread identifier */
    pthread_attr_t attr; /* set of thread attributes */

    if (argc != 2) {
        fprintf(stderr, "usage: a.out <integer value>\n");
        return -1;
    }
    if (atoi(argv[1]) < 0) {
        fprintf(stderr, "%d must be >= 0\n", atoi(argv[1]));
        return -1;
    }

    pthread_attr_init(&attr);                     /* get the default attributes */
    pthread_create(&tid, &attr, runner, argv[1]); /* create the thread */
    pthread_join(tid, NULL);                      /* wait for the thread to exit */
    printf("sum = %d\n", sum);
}

/* The thread will begin control in this function */
void *runner(void *param)
{
    int i, upper = atoi(param);
    sum = 0;
    for (i = 1; i <= upper; i++)
        sum += i;
    pthread_exit(0);
}

(synchronous threading: the parent waits for the thread it created with pthread_join)
#define NUM_THREADS 10

/* an array of threads to be joined upon */
pthread_t workers[NUM_THREADS];

for (int i = 0; i < NUM_THREADS; i++)
    pthread_join(workers[i], NULL);

Joining 10 threads
Threading Issues
fork() and exec() system calls
Signal handling
Thread cancellation of target thread
Thread-local storage
Scheduler Activations
fork() and exec()
If one thread in a program calls fork(), does the new process duplicate all threads, or is
the new process single-threaded?
Some UNIX systems have chosen to have two versions of fork(), one that duplicates all
threads and another that duplicates only the thread that invoked the fork() system call
The exec() system call works in the same way
if a thread invokes the exec() system call, the program specified in the parameter
to exec() will replace the entire process, including all threads
If exec() is called immediately after forking, then duplicating all threads is unnecessary,
as the program specified in the parameters to exec() will replace the process
duplicating only the calling thread is appropriate
If the separate process does not call exec() after forking, the separate process should
duplicate all threads
Signal Handling
Signals are used in UNIX systems to notify a process that a particular event has
occurred
The signal is delivered to a process
When delivered, a signal handler is used to process the signal
Synchronous and asynchronous signals
Synchronous signals
illegal memory access, div. by 0
delivered to the same process that performed the operation generating the signal
Asynchronous signals
generated by an event external to a running process
the running process receives the signal asynchronously
Ctrl + C, timer expiration
Signal is handled by one of two signal handlers:
default
user-defined
Every signal has a default handler that the kernel runs when handling the signal
A user-defined signal handler can override the default signal handler
Some signals can be ignored; others are handled by terminating the process
For a single-threaded process, the signal is delivered to the process
Where should a signal be delivered for a multi-threaded process?
Deliver the signal to the thread to which the signal applies
Deliver the signal to every thread in the process
Deliver the signal to certain threads in the process
Assign a specific thread to receive all signals for the process
Method for delivering a signal depends on the type of signal generated
synchronous signals need to be delivered to the thread causing the signal and not
to other threads in the process
some asynchronous signals, such as <Ctrl + C>, should be sent to all threads
Signals can be delivered to a
specific process – specify process id and type of signal
specific thread – specify thread id and type of signal
Most multithreaded versions of UNIX allow a thread to specify which signals it will
accept and which it will block
In some cases, an asynchronous signal may be delivered only to those threads that
are not blocking it
Signals need to be handled only once; a signal is delivered only to the first thread
found that is not blocking it
Thread Cancellation
Terminating a thread before it has finished
Thread to be canceled is target thread
Two general approaches:
Asynchronous cancellation terminates the target thread immediately
Deferred cancellation allows the target thread to periodically check if
it should be cancelled
What about freeing resources??
Pthread code to create and cancel a thread:
pthread_t tid;
/* create the thread */
pthread_create(&tid, &attr, worker, NULL);
...
/* cancel the thread */
pthread_cancel(tid);
Invoking thread cancellation requests cancellation, but actual cancellation depends on
how the target thread is set up to handle the request
If the thread has cancellation disabled, cancellation remains pending until the thread enables it
Default cancellation type is deferred
Cancellation occurs only when the thread reaches a cancellation point
Establish a cancellation point by calling pthread_testcancel()
If a cancellation request is pending, a cleanup handler is invoked to release any
acquired resources
Thread-Local Storage
Thread-local storage (TLS) allows each thread to have its own copy
of data
Different from local variables
Local variables visible only during single function invocation
TLS visible across function invocations
Scheduler Activations
Both M:M and two-level models require communication to
maintain the appropriate number of kernel threads allocated to
the application
Use an intermediate data structure between user and kernel
threads – lightweight process (LWP)
Appears to be a virtual processor on which the process can
schedule a user thread to run
Each LWP is attached to a kernel thread, which is
scheduled on a physical processor
How many LWPs to create?
Scheduler activations - scheme for communication between the user-thread library
and the kernel
Kernel provides an application with a set of virtual processors (LWPs)
Application can schedule user threads onto an available virtual processor
Kernel must inform an application about certain events via an upcall
Upcalls are handled by the thread library with an upcall handler
Upcall handlers must run on a virtual processor
Upcall triggering occurs when an application thread is about to block
Kernel makes upcall to the application informing that a thread is about to block and identifying the thread
Kernel allocates a new virtual processor to the application
Application runs an upcall handler on this new virtual processor, which saves the state of the blocking thread
and relinquishes the virtual processor on which the blocking thread is running
Upcall handler then schedules another thread that is eligible to run on an available virtual processor
When the event that the blocking thread was waiting for occurs, the kernel makes another upcall to the
thread library informing it that the previously blocked thread is now eligible to run
Upcall handler for this event requires a virtual processor, and kernel may allocate a new virtual processor or
preempt one of the user threads and run the upcall handler on its virtual processor
After marking the unblocked thread as eligible to run, the application schedules an eligible thread to run
on an available virtual processor
At time T1, the kernel
allocates the application two
processors. On each processor,
the kernel schedules a user-
level thread taken from the
ready list and starts execution.
Source: T. E. Anderson, B. N. Bershad, E. D. Lazowska, and H. M. Levy, "Scheduler Activations: Effective Kernel Support
for the User-Level Management of Parallelism", 1992
At time T2, one of the user-
level threads (thread 1) blocks
in the kernel. To notify the user
level of this event, the kernel
takes the processor that had
been running thread 1 and
performs an upcall. The user-
level thread scheduler can
then use the processor to take
another thread off the ready
list and start running it.
At time T3, the I/O completes. Again, the kernel
must notify the user-level thread system of the
event, but this notification requires a processor.
The kernel preempts one of the virtual
processors running and uses it to do the upcall.
(If there are no processors available when the
I/O completes, the upcall must wait until the
kernel allocates one). The upcall puts the thread
that had been blocked on the ready list and puts
the thread that was preempted on the ready
list.
Finally, at time T4,
the upcall takes a
thread off the
ready list and starts
running it.