Course Content#
Processes have independent address spaces, which makes inter-process communication complex, and switching between processes is expensive [a switch destroys temporal locality], so threads were introduced.
Threads#
A branch of a process [pthread], essentially a lightweight process.
- Communication is convenient because multiple threads within a process share memory.
- Switching cost is low because threads share one address space, so switching threads does not require switching page tables or flushing the TLB, and cached data stays warm.
pthread_create#
Create a new thread.
- man pthread_create
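- Prototype: int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);
- thread: receives the thread id [Note: pthread_t is an opaque type that is not guaranteed to be numeric, so do not compare it with ==; see pthread_equal]
- attr: thread attributes
- start_routine: the function the new thread executes
- arg: the parameter passed to start_routine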
- Description
- Start a new thread and execute the start_routine function.
- The start_routine function can only accept one arg parameter [multiple parameters can be encapsulated in a structure].
- There are 4 ways for a thread to terminate [for a worker, how it dies matters a lot].
- ① Suicide: call pthread_exit by itself.
- Threads in the same process can use pthread_join to receive its death status [similar to wait].
- ② Normal death: return from the start_routine function.
- Equivalent to the pthread_exit method.
- ③ Homicide: pthread_cancel.
- ④ Mutual destruction: a thread in the process calls exit, or the main thread returns from the main function.
- [PS] If a thread triggers a memory error [for example, a segmentation fault], the result is very likely mutual destruction as well, meaning all threads in the process die.
- attr can be NULL, corresponding to default attributes.
- After a successful call, the thread id will be saved in the thread variable, which can be used later [similar to a file descriptor].
- Return value
- 0 for success; on failure, an error number is returned directly and the contents of the thread variable are undefined.
——3 Methods to Terminate Threads——#
pthread_exit#
Thread suicide.
- man pthread_exit
- ❗ Thread suicide, passing retval to the joining thread [threads are joinable by default].
- Execute the functions registered with pthread_cleanup_push, then release thread-specific data.
- Shared resources in the process will not be released [because there are sibling threads].
- Functions registered with atexit will not be called [these belong to the process].
- After the last thread ends, the process ends with exit(0), releasing shared resources and executing functions registered with atexit.
- 【Note】 The relationship between threads and processes.
pthread_cancel#
Send a cancellation request to a thread [homicide].
- man pthread_cancel
- The possibility and timing of thread cancellation depend on two attributes: state, type.
- state
- Cancelable [default].
- Uncancelable: cancellation requests received remain queued until the thread enables cancellation again.
- type
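- Deferred [default]: the request is held until the thread next calls a function that is a cancellation point.
- Asynchronous: the thread can be canceled at any time [usually immediately, but the system does not guarantee it].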
exit [Process Related]#
Terminate a normal process.
- man exit
- The value passed to the parent process is: status & 0377.
- Note: 0377 is octal, corresponding to eight 1s in binary, which means only the lower 8 bits of status are retained.
- Functions registered with atexit and on_exit will be called in the reverse order of registration.
- Can be nested: registered functions can also register more functions, which will be placed at the front of the call list.
- If a registered function does not return, for example, if it calls _exit or terminates itself with a signal, the remaining functions will not be called, and the rest of the exit processing [in particular, flushing of stdio streams] is abandoned.
- Functions registered multiple times will be called multiple times.
- ⭐ After exit, all standard IO streams will be flushed and closed.
——Monitoring Thread Status Related——#
pthread_join#
Wait for a thread to terminate.
- man pthread_join
- Can be compared to the wait function in processes.
- retval receives the thread exit status.
- If the thread is a suicide, it copies the retval value from pthread_exit.
- If the thread is a homicide, it is assigned the value PTHREAD_CANCELED.
- [Consider] Here, retval is a double pointer, that is, a pointer to a pointer. Why?
- Surface reason: the retval in pthread_exit is a pointer, so by convention, a double pointer is used here [similarly, if an int is received, int * would be used].
- Further reason: to allow modification of the passed pointer.
- A blog also mentioned this: Discussing why the second parameter of pthread_join() is a double pointer——CSDN.
pthread_detach#
Detach a thread.
- man pthread_detach
- Once a thread is detached, the system will automatically reclaim its resources upon termination, without requiring other threads to block and wait for it to terminate.
- Generally used with pthread_self to detach itself.
- pthread_self: obtain the id of the calling thread [itself].
- Reference Detailed Explanation of pthread_join() and pthread_detach()——CSDN.
——Additional——#
pthread_yield#
Yield the processor.
- man pthread_yield
- [Similar to the effect of sleep].
- pthread_yield is nonstandard and only available on some systems; the standard call is sched_yield.
- For cooperative systems, calling this function actively yields the CPU.
- For preemptive systems, the kernel will schedule, and this function has little significance; sleep can be used directly.
- Cooperative vs. preemptive, see 4 Advanced Process Management——Scheduler Classification.
pthread_equal#
Compare the ids of two threads [cannot be directly compared with ==].
- man pthread_equal
- If equal, returns a non-zero value.
Thread Pool#
Let a bunch of threads stay in the pool, ready to work at any time.
Basic components👇
① Task Queue: stores tasks that need to be processed.
- [Circular queue is better].
- Basic operations: init, push, pop.
② Multiple threads: always ready, reducing the time for creation and destruction.
③ Thread function: do_work().
- while(1): wait for tasks to be added.
- Task queue pop: pop a task for the thread to execute.
- do_work(): execute the task on the CPU.
❗ Note: Both push and pop must be protected by a lock to prevent data races [many hungry threads compete for the same queue].
- See code demonstration for details.
Kernel Threads, User Threads#
Who creates threads? Thread model.
- The main difference between the two lies in scheduling: kernel threads are scheduled by the kernel; user threads are scheduled by user processes.
- Advantages of kernel threads:
- ① Each kernel thread has its own time slice. Therefore, a process with multiple threads will have more processor time; while a user process will not gain more processor time just because it has several user threads.
- ② If a kernel thread is blocked, the remaining threads in the process can continue to run. If a user thread blocks in a system call, the entire process is blocked, because the kernel sees only one schedulable entity.
- PS: If a kernel thread sends a sleep signal to its own process, that thread can still continue to run.
- Advantages of user threads:
- ① Low switching cost. Does not involve transitioning from user mode to kernel mode.
- ② Scheduling algorithms are completely controlled by the process. User processes can use their own scheduling algorithms, so they have better autonomy; while the scheduling of kernel threads is a black box for users.
- Therefore, a hybrid thread design can combine the advantages of both kernel and user threads.
Code Demonstration#
Simple Use of Multithreading#
- Focus: usage of the pthread_create function.
- If there is no usleep after creating the thread or if the usleep time is too short, the following may occur:
- The main thread returns, causing all child threads to mutually destroy.
- At this time, some outputs may appear twice, such as ①, ② [③ is normal output].
- Speculation: this is an output-buffer issue; when the thread is ended abruptly, the buffer content is written out a second time [the buffer state had not been updated].
- [PS] Using fflush does not solve the problem, possibly because the thread ends suddenly.
- Note: All threads can operate on the value at the same address.
Thread Pool#
thread_pool.h
- Definition of the task queue and basic operations.
thread_pool.c
- Note: Locking and unlocking, sending signals; checking if the queue is full/empty; checking if the pointer has reached the end of the queue.
1.test.c
- The buffer for storing data is a two-dimensional array: each task gets its own row, so the main thread does not overwrite, through the same address, data that a worker thread is still reading.
- The clever use of usleep: to avoid excessive CPU consumption from the while(1) loop.
- pthread_detach is generally used with pthread_self.
- fgets: read from a file into the buffer.
- man fgets
- Read by line.
- Output effect.
- Basic output: push [task queue output]; pop [task queue output] + do_work [thread output].
Additional Knowledge Points#
- When compiling files that use thread-related functions, be sure to link with -lpthread [or compile and link with -pthread, which GCC and Clang accept].
Tips#
- A created process also represents a main thread.
- On a single core, threads are interleaved rather than run in parallel, so multithreading does not speed up CPU-bound work on a single-core CPU and is not recommended for that purpose.
- Reference How do single-core and multi-core CPUs work for multithreaded programs——cnblogs.