These notes cover the second major part of “Operating Systems: Three Easy Pieces”. As the title suggests, the author splits operating systems into three big pieces, and we have now reached the second one: Concurrency. In some sense the ideas here are the most important, because concurrency shows up in so many other places, yet this part has the least content; I still remember my professor sounding a bit regretful about that when teaching it.

Chapter 26: Concurrency: An Introduction

The first concept to introduce for concurrency is the thread. The programs we discussed before were all single-threaded programs; now we turn to multi-threaded programs. Described loosely (and not quite accurately), a thread lets the same program do several things “at the same time.” Functionally that sounds a lot like using multiple processes, but there are some low-level differences, and they lead to fundamentally different features.

When we switch between threads there is also a context switch, but now it is a context switch between threads. When we managed many processes we had a process control block (PCB); in the multi-threaded world we have thread control blocks (TCBs).

[Figure: single-threaded vs. multi-threaded address spaces]

The most fundamental difference about threads is already visible in the figure above: a thread is part of a process. We said earlier that the OS virtualizes a private address space for each process; since threads live inside a process, they all use that process's address space.

So how do we make it look like we are doing different things at the same time? The implementation is that each thread gets its own stack, while the threads share everything else, such as the code section and the heap section.

Thus, any stack-allocated variables, parameters, return values, and other things that we put on the stack will be placed in what is sometimes called thread-local storage, i.e., the stack of the relevant thread.

But why do we need threads at all?

The first reason the book gives is parallelization.

The task of transforming your standard single-threaded program into a program that does this sort of work on multiple CPUs is called parallelization, and using a thread per CPU to do this work is a natural and typical way to make programs run faster on modern hardware.

The second is to avoid sitting around waiting on I/O.

Threading enables overlap of I/O with other activities within a single program, much like multiprogramming did for processes across programs.

When I read this I had a question: multiple processes can do all of the above too, so why do we still need threads? The book explains:

Of course, in either of the cases mentioned above, you could use multiple processes instead of threads. However, threads share an address space and thus make it easy to share data, and hence are a natural choice when constructing these types of programs. Processes are a more sound choice for logically separate tasks where little sharing of data structures in memory is needed.

Once you actually write multi-threaded programs, you notice that a thread is a bit like a function. The difference is that when we call a function, we execute its body right away and then return to main; a thread does not work that way. Once we create a thread, it runs independently of the original main thread, and which one runs first is not determined.

We mentioned one advantage of multi-threading above: the threads live in one process, so sharing data is built into the low-level design. But that raises another question: what happens when multiple threads access a piece of shared data at the same time?

The book illustrates the problem with a counter example. In short, two threads each add 1 to a shared counter 10,000,000 times; the counter should end up at 20,000,000, but in practice that is not guaranteed.

prompt> ./main
main: begin (counter = 0)
A: begin
B: begin
A: done
B: done
main: done with both (counter = 19221041)

To analyze this problem we need to look at lower-level code, i.e., the assembly code. Even an operation as simple as adding one to some memory location takes three assembly instructions:

mov 0x8049a1c, %eax
add $0x1, %eax
mov %eax, 0x8049a1c

The problem described above happens because thread 1 may have finished the add (the second instruction) but not yet stored the result back (the third instruction) when a context switch occurs and thread 2 runs instead; as a result, an increment can be lost.

This phenomenon is called a race condition, or more precisely a data race: the results depend on the timing of the code's execution.

A race condition (or data race) arises if multiple threads of execution enter the critical section at roughly the same time; both attempt to update the shared data structure, leading to a surprising (and perhaps undesirable) outcome.

A critical section is a piece of code that accesses a shared resource, usually a variable or data structure.

An indeterminate program consists of one or more race conditions; the output of the program varies from run to run, depending on which threads ran when. The outcome is thus not deterministic, something we usually expect from computer systems.

To avoid these problems, threads should use some kind of mutual exclusion primitives; doing so guarantees that only a single thread ever enters a critical section, thus avoiding races, and resulting in deterministic program outputs.

One way to solve the problem above is to make the code in the critical section atomic.

The idea behind making a series of actions atomic is simply expressed with the phrase “all or nothing”; it should either appear as if all of the actions you wish to group together occurred, or that none of them occurred, with no in-between state visible. Sometimes, the grouping of many actions into a single atomic action is called a transaction, an idea developed in great detail in the world of databases and transaction processing.

But to implement this atomicity, we need synchronization primitives from the hardware.

In this concurrency piece, we will learn how to build support for synchronization primitives to support atomicity. Beyond that, we will also learn mechanisms to support the sleeping/waking interaction that is common in multi-threaded programs.

Chapter 27: Interlude: Thread API

This chapter is about how to actually use threads in C, e.g., what arguments to pass in.

First, how to create a thread:

#include <pthread.h>
int pthread_create(pthread_t *thread,
                   const pthread_attr_t *attr,
                   void *(*start_routine)(void*),
                   void *arg);

Then there is thread completion: pthread_join() waits for the corresponding thread to finish before the code that follows runs.

int pthread_join(pthread_t thread, void **value_ptr);

Next comes another important topic: locks, which give us mutual exclusion.

int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);

If you have a piece of code that is a critical section, you can protect it with a lock, as in the following example:

pthread_mutex_t lock;
pthread_mutex_lock(&lock);
x = x + 1; // or whatever your critical section is
pthread_mutex_unlock(&lock);

The code above is broken (the lock is never initialized, and the return codes are never checked); it is only meant to give you some intuition.

Condition variables are another very important component. If there is an ordering between threads, we can use a condition variable to make one thread wait until another thread has completed some necessary step before it continues:

int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
int pthread_cond_signal(pthread_cond_t *cond);