Lab 12 - Concurrency
Lab goals:
- Learn about the provided tiny web server
- Modify the tiny web server to use thread concurrency
- Learn about other forms of concurrency, including process concurrency using pre-forking and thread concurrency using Pthreads and its synchronization primitives of mutexes, condition variables, and barriers
The Tiny Web Server
In your repository for this lab, you will find the source code for a simple web server, known as the tiny web server. In the first exercise in this lab, you will modify this provided tiny server source code to make use of thread concurrency, one of the new concurrency topics covered by this lab.
The provided web server source files are located in the
tiny
subdirectory of your repository. To build
tiny, use the Unix commands:
cd tiny
make
To run the tiny server, use the Unix command on CLEAR:
./tiny port &
where port is the port number at which tiny can then be reached. This must be a unique integer between 1024 and 65,535, inclusive. Note that if you want to be able to connect to tiny from outside of the CLEAR machines, you must use a port number between 18000 and 18200.
To connect to tiny, navigate your web browser to the URL
http://hostname:port
where hostname is the full name of the CLEAR server that you are logged on to, and port is the same port number that you used in starting tiny. Note that, between hostname and port there is a colon character with no spaces. If you don't know the specific name of the CLEAR server you are logged on to, it can be found by running the command
uname -n
You must use the full name of the CLEAR server you are
logged on to, including the
.clear.rice.edu
part.
IMPORTANT: As with the echo server in the previous lab, when you are done testing your tiny web server, be sure to stop the execution of the server. This can be done either with the kill command or with a combination of the fg command and typing control-C. Be sure not to leave your tiny server running in the background.
Process Concurrency Using Pre-Forking
In the last lab, we looked at the use of process
concurrency using the
fork-after-accept
technique. Process
concurrency can also be created instead using the
pre-forking
technique, which is more
efficient.
In particular, the overhead of forking a new process is relatively high. This forking of a new process after accepting a connection will increase the latency of the response to the request and the time until the next request can be accepted. A more efficient design is to fork before the accept. Specifically, in this technique, you create a pool of processes that are all attempting to accept and process new requests. Here is an example of this technique using the echo server from the previous lab:
#include "csapp.h"
#include "echo.h" // Declares echo().

void
process(int listenfd)
{
    struct sockaddr_in clientaddr;
    socklen_t clientlen = sizeof(struct sockaddr_in);
    int connfd;

    while (true) {
        clientlen = sizeof(clientaddr);
        connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);
        echo(connfd);
        Close(connfd);
    }
}

int
main(int argc, char **argv)
{
    int i, listenfd, numprocs, status;

    if (argc != 3) {
        fprintf(stderr, "usage: %s <port> <numprocs>\n", argv[0]);
        exit(1);
    }
    listenfd = Open_listenfd(argv[1]);
    numprocs = atoi(argv[2]);
    for (i = 0; i < numprocs; i++) {
        if (Fork() == 0)
            process(listenfd);
    }
    while (true) {
        wait(&status);
        if (WIFEXITED(status) || WIFSIGNALED(status)) {
            /*
             * Start a new process to replace the terminated one.
             */
            if (Fork() == 0)
                process(listenfd);
        }
    }
}
This works because accept()
is an operating
system kernel call, and internally, the Linux kernel forces
the operation of accept()
to be
atomic. Here atomic
means that, even if
there are many different processes at the same time waiting
to return from an accept()
call on the same
listening file descriptor, when a new connection is then
made from some client, the kernel will pick only a
single one of those processes to succeed and
actually accept the connection.
The parent process in this model serves as a
nanny
for all of its children. Note that this
application uses the capital letter versions of all system
functions, and recall that these functions exit if an error
is encountered. If any child terminates because of a
function encountering an error, the nanny will fork another
child, so there will always be a constant number of
children. This makes the server more robust to failure. The
nanny process should not be used for anything other than
waiting on children and restarting them, to help prevent the
nanny itself from failing.
Thread Concurrency
Process concurrency is not well suited for some applications. For example, if the application needs to share a large amount of information between the processes, the overhead of the inter-process communication can become substantial. Sharing information among threads within a single process is easier in this case because all of the threads share the process's entire address space. In these types of applications, it is better to use threads.
Each thread within a process executes independently, but
all of the threads within the process share all of the
memory and all file descriptors of the process. Support for
threads on Unix is provided by the POSIX threads standard,
or Pthreads
for short.
The Pthreads function for creating a new thread is
pthread_create
. It has the following
prototype:
int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);
The four arguments to pthread_create
are as
follows:
- thread is a pointer to the location at which the thread ID of the newly created thread should be stored by pthread_create upon successful completion.
- attr should be NULL for creating a thread with default attributes.
- start_routine is the address of the function at which the new thread will begin execution. This function should be defined to take a single void * argument and to return a void * result.
- arg is the address of the single argument to be passed to start_routine when the new thread begins execution.
As described above, the newly created thread begins execution with a call to the procedure at start_routine, passing it the single argument arg. If you want to pass more than one argument to that function, you should define a struct to contain the desired arguments; you should create and initialize that type of struct and then call pthread_create, passing the address of that struct as the single argument to start_routine.
There is no asynchronous method for a thread to be notified of the completion of another thread like there is for processes with SIGCHLD. Instead, a thread can wait for another thread to finish by calling the pthread_join function, which blocks until the specified thread terminates. See man pthread_join for information on how to use this function.
If a thread will never be joined (for example, because no other thread needs its return value), it should instead be marked as detached with the pthread_detach function, so that its resources are automatically reclaimed when it terminates. See man pthread_detach for information on how to use this function.
Here is an example of using Pthreads for an echo server:
#include "csapp.h"
#include "echo.h" // Declares echo().

void *start_routine(void *vargp);

int
main(int argc, char **argv)
{
    struct sockaddr_in clientaddr;
    socklen_t clientlen = sizeof(struct sockaddr_in);
    pthread_t tid;
    int *connfdp, listenfd;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(0);
    }
    listenfd = Open_listenfd(argv[1]);
    while (true) {
        connfdp = Malloc(sizeof(int));
        *connfdp = Accept(listenfd, (SA *)&clientaddr, &clientlen);
        Pthread_create(&tid, NULL, start_routine, connfdp);
    }
}

void *
start_routine(void *vargp)
{
    int connfd = *(int *)vargp;

    Pthread_detach(pthread_self());
    Free(vargp);
    echo(connfd);
    Close(connfd);
    return (NULL);
}
Note that even though only a single integer is being
passed to the newly created thread, space for this integer
is allocated using malloc
rather than the
integer being located on the stack. This design is needed
because there is a subtle bug that can occur if the integer
is on the stack of the calling thread. If the calling
thread modifies the argument (either deliberately or
accidentally) before the new thread is first scheduled and
begins execution, then the argument may change values
behind the new thread's back; depending on the timing, the
new thread may see either the original or the modified
value for this argument when it begins execution. This
problem can occur very easily by accident if the argument
is on the calling thread's stack, for example simply by the
calling thread returning from the procedure in which it
created the new thread. Due to this problem, the
argument to a new thread should always be in private
memory, such as that allocated with
malloc
.
Mutexes
A mutex is a type of variable that can be used
to provide mutual exclusion for safely executing
critical sections. A piece of code is said to be a
critical section
(or part of a critical section, for
related pieces of code) if more than one thread can execute
this code and in it access one or more variables that are
shared between these threads. For correctness, the idea is
to have a thread that is executing in the critical section
exclude other threads from entering the
critical section at the same time. Specifically, mutual
exclusion means that no more than one thread at a time can
be executing in the critical section, and thus no more than
one thread at a time can be accessing these shared
variables.
When using mutexes in this way, a thread may enter the critical section only after acquiring the mutex lock protecting that critical section. When the thread exits the critical section, it must release the mutex. When the mutex has been acquired and is being held by some thread, any other thread attempting to execute the critical section must wait until the mutex is released and can be acquired by this thread. Mutexes have some similarities to binary semaphores, but importantly, unlike semaphores, only the thread that acquires the mutex can release it.
The typical use of a mutex to protect a critical section looks like this:
LOCK MUTEX
    critical section begins
    ...
    ...
    ...
    critical section ends
UNLOCK MUTEX
The Pthreads library supports the following functions for using mutexes:
- int pthread_mutex_init(pthread_mutex_t *restrict mutex, const pthread_mutexattr_t *restrict attr);
  Initializes the mutex pointed to by mutex using the attributes specified by attr. The pointer mutex must point to an existing mutex variable to be initialized. If attr is NULL, the default mutex attributes are used. Refer to the man pages for more information on the attributes. Upon successful initialization, the mutex is in the initialized and unlocked state.
- int pthread_mutex_destroy(pthread_mutex_t *mutex);
  Destroys the mutex pointed to by mutex. Attempting to destroy a locked mutex results in undefined behavior.
- int pthread_mutex_lock(pthread_mutex_t *mutex);
  Locks the mutex pointed to by mutex. If the mutex is already locked, the calling thread is blocked until the mutex becomes available. When this function returns, the mutex is locked and the calling thread has acquired it.
- int pthread_mutex_unlock(pthread_mutex_t *mutex);
  Unlocks (i.e., releases) the mutex pointed to by mutex.
Pthreads Mutex Example
This example program illustrates the use of a Pthreads mutex. The shared counter variable cnt is initialized to 0, and the program then creates two threads that each iterate NITERS times, incrementing this variable that is shared between the two threads. The final value of cnt should thus be 2 × NITERS, but without proper synchronization, this might not be the result. The complete source code for this program is provided in the file echo-key+sor/goodcnt-mutex.c in your repository for this lab.
#define NITERS 10000000 /* Number of iterations. */

unsigned int cnt = 0;    /* Shared count variable. */
pthread_mutex_t mutex;   /* Shared pthread mutex. */

void *count(void *arg);

int
main(void)
{
    pthread_t tid1, tid2;

    /* Initialize pthread variables. */
    Pthread_mutex_init(&mutex, NULL); /* INITIALIZE MUTEX */

    /* Start threads. */
    Pthread_create(&tid1, NULL, count, NULL);
    Pthread_create(&tid2, NULL, count, NULL);

    /* Wait for threads to finish. */
    Pthread_join(tid1, NULL);
    Pthread_join(tid2, NULL);

    if (cnt != (unsigned)NITERS * 2)
        printf("BOOM! cnt = %u\n", cnt);
    else
        printf("OK cnt = %u\n", cnt);

    /* Clean up. */
    Pthread_mutex_destroy(&mutex); /* DESTROY MUTEX */
    return 0;
}

/* Each thread executes this function. */
void *
count(void *arg)
{
    int i;

    for (i = 0; i < NITERS; i++) {
        /* Acquire mutex lock. */
        Pthread_mutex_lock(&mutex); /* ACQUIRE LOCK */
        cnt++;
        /* Release mutex lock. */
        Pthread_mutex_unlock(&mutex); /* RELEASE LOCK */
    }
    return NULL;
}
Condition Variables
A condition variable allows threads to synchronize with each other, typically based on the value of some shared data. Imagine the scenario below, in which a thread must perform some task but cannot do so until a shared variable shared_val reaches a particular value, and must perform that task while the variable still holds that value. Checking the value of this variable and performing the task must be done inside a critical section, as otherwise the value of the variable could change while this is being done. This means that the thread must acquire a mutex before checking the value of the variable and, if the condition holds, must release the mutex only after performing the task.
Consider the following inefficient example with two threads:
Thread 1:
    while (true) {
        LOCK MUTEX
        if (shared_val != 1)
            UNLOCK MUTEX
        else
            break
    }
    perform task
    ...
    UNLOCK MUTEX

Thread 2:
    ...
    LOCK MUTEX
    ...
    shared_val = 1
    UNLOCK MUTEX
Condition variables allow for efficient handling of such scenarios. A condition variable does not hold any value but instead represents a condition or an event. A thread can wait on a condition variable until the condition is satisfied or the event occurs. It is the responsibility of other threads to signal the condition variable when this happens.
So in the above example, waiting for the value of the shared variable to be equal to 1 can be represented by a condition variable. One or more threads wait on this condition variable. When another thread sets the value to be equal to 1 (or, alternatively, detects that this is so), it also signals the condition variable.
The example below is equivalent to the previous example, but it uses a condition variable efficiently to provide the synchronization for waiting on the event (note, however, the caveat in Exercise #2 below):
Thread 1:
    LOCK MUTEX
    if (shared_val != 1)
        COND WAIT
    perform task
    ...
    UNLOCK MUTEX

Thread 2:
    ...
    LOCK MUTEX
    ...
    shared_val = 1
    COND SIGNAL
    UNLOCK MUTEX
Note that condition variables are always used in conjunction with mutexes. A condition variable is never used by itself.
The Pthreads library supports the following functions for using condition variables:
- int pthread_cond_init(pthread_cond_t *restrict cond, const pthread_condattr_t *restrict attr);
  Initializes the condition variable pointed to by cond using the attributes specified by attr. The pointer cond must point to an existing condition variable to be initialized. If attr is NULL, the default condition variable attributes are used. Refer to the man pages for more information on the attributes.
- int pthread_cond_destroy(pthread_cond_t *cond);
  Destroys the condition variable pointed to by cond. Attempting to destroy a condition variable upon which any threads are currently blocked results in undefined behavior.
- int pthread_cond_wait(pthread_cond_t *restrict cond, pthread_mutex_t *restrict mutex);
  Blocks the calling thread on the condition variable pointed to by cond. The calling thread must have first acquired the mutex pointed to by mutex. Before blocking, the mutex is automatically released, allowing other threads to acquire it and possibly to also wait on this condition variable. When this function returns, the mutex is again held by the calling thread.
- int pthread_cond_signal(pthread_cond_t *cond);
  Unblocks at least one of the threads currently blocked on the condition variable pointed to by cond. This function has no effect if no threads are currently blocked on this condition variable. The unblocked thread automatically re-acquires the associated mutex before returning from its call to pthread_cond_wait.
- int pthread_cond_broadcast(pthread_cond_t *cond);
  Unblocks all threads currently blocked on the condition variable pointed to by cond. This function has no effect if no threads are blocked on this condition variable. If more than one thread is unblocked, they all contend for the associated mutex, and each must acquire it before returning from its call to pthread_cond_wait.
Barriers
A barrier can be used to synchronize all running threads at a particular common point within the execution of the algorithm. This is most useful when several threads execute in parallel some step of the algorithm in which they modify some variables shared with other threads. A barrier at the end of this portion of the code blocks each thread at that point until all of them reach the corresponding barrier. Then all of the threads are allowed to continue and execute beyond that point. The example below shows how barriers can be used (assume here that all threads execute the same code in parallel, as shown below):
All threads 1 ... n:
    all threads execute code section 1 in parallel
    ...
    BARRIER /* each thread is blocked here until all threads reach this barrier, and then all threads continue again */
    ...
    all threads execute code section 2 in parallel
    ...
    BARRIER /* each thread is blocked here until all threads reach this barrier, and then all threads continue again */
    ...
The Pthreads library supports the following functions for using barriers:
- int pthread_barrier_init(pthread_barrier_t *restrict barrier, const pthread_barrierattr_t *restrict attr, unsigned count);
  Initializes the barrier pointed to by barrier using the attributes specified by attr. The pointer barrier must point to an existing barrier variable to be initialized. If attr is NULL, the default barrier attributes are used. Refer to the man pages for more information on the attributes. The count argument specifies the number of threads that must wait at this barrier before any of them successfully return from the call and continue.
- int pthread_barrier_destroy(pthread_barrier_t *barrier);
  Destroys the barrier pointed to by barrier. Attempting to destroy a barrier on which other threads are currently blocked results in undefined behavior.
- int pthread_barrier_wait(pthread_barrier_t *barrier);
  Blocks the calling thread until the required number of threads have called this function specifying the same barrier pointed to by barrier. When the required number of threads have reached the barrier, this function unblocks all of them and returns in all threads. It returns the constant value PTHREAD_BARRIER_SERIAL_THREAD in a single (arbitrary) thread, and returns the value 0 in all other threads.
GitHub Repository for This Lab
To obtain your private repo for this lab, please point your browser to this link for the starter code for the lab. Follow the same steps as for previous labs and assignments to create your repository on GitHub and then to clone it onto CLEAR. The directory for your repository for this lab will be
lab-12-concurrency-name
where name is your GitHub userid.
Submission
Be sure to git push the appropriate C source files for this lab before 11:55 PM tonight, to get credit for this lab.