Requirement Description#
- Set a concurrency level INS, representing the number of processes to start.
- Use these INS processes to calculate the sum of the numbers from start to end. start and end are obtained by parsing command line arguments with getopt.
./a.out -s 12 -e 24
- Output an integer result: sum
[Note]
- Mainly involves file and process-related operations.
- Using files for data sharing requires consideration of data races.
- Attempt to use file locks to simulate mutex locks between threads.
- Achieve synchronized access to critical data (data modified by multiple processes or threads) through file locks.
- Need to learn about flock: man 2 flock
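As a first look at flock, here is a minimal sketch of taking and releasing an exclusive lock. The helper names lock_acquire and lock_release are made up for illustration and do not appear in the program below; only open, flock, and close are real calls.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: open (or create) `path` and take an exclusive lock.
 * Returns the locked file descriptor, or -1 on failure. */
int lock_acquire(const char *path) {
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) return -1;
    if (flock(fd, LOCK_EX) < 0) {   /* blocks until the lock is free */
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: release the lock and close the descriptor.
 * Returns 0 on success, -1 on failure. */
int lock_release(int fd) {
    int rc = flock(fd, LOCK_UN);
    if (close(fd) < 0) rc = -1;     /* closing would also drop the lock */
    return rc;
}
```

The critical section goes between the two calls. Note that flock is advisory: it only protects against other processes that also call flock on the same file.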
Final Result#
- Test case: calculating the sum from 1 to 1000 with 100 processes. [Screenshots omitted.]
- The processes successfully competed to calculate the sum on the same shared data.
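To check the program's output independently, the closed-form sum of an arithmetic series can be used. This is a small sketch; expected_sum is a hypothetical helper and is not part of the original program.

```c
/* Hypothetical helper: closed-form sum of the integers start..end (inclusive).
 * Returns 0 for an empty range. */
long expected_sum(long start, long end) {
    if (start > end) return 0;
    return (start + end) * (end - start + 1) / 2;
}
```

For the test case above, expected_sum(1, 1000) gives 500500, and for ./a.out -s 12 -e 24 the expected result is 234.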
Implementation Process#
Flowchart#
- [Flowchart image omitted.]
- Understand the respective tasks of the parent and child processes.
- Key: when multiple processes access the same file, the lock makes each read-modify-write sequence behave like an "atomic operation" [the smallest indivisible unit].
- It is not truly atomic; the lock only guarantees that the read and write sequence is not disturbed by other processes.
- A process may still be preempted when its time slice runs out, but because it holds the lock, other processes that honor the lock cannot access the data in the meantime.
Getting Command Line Arguments#
Capture the -s and -e options, each of which requires an argument.
#include "head.h"
int main(int argc, char **argv) {
    int opt, start = 0, end = 0;
    while ((opt = getopt(argc, argv, "s:e:")) != -1) {
        switch (opt) {
            case 's':
                start = atoi(optarg); // atoi: string -> integer
                break;
            case 'e':
                end = atoi(optarg);
                break;
            default:
                fprintf(stderr, "Usage : %s -s start_num -e end_num\n", argv[0]);
                exit(1);
        }
    }
    printf("start = %d\nend = %d\n", start, end);
    return 0;
}
- The header file "head.h" is listed at the end of this post.
- atoi: string 👉 integer; optarg is a char * that points to the current option's argument string.
- Running it prints the parsed start and end values. 🆗
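atoi cannot report errors: it returns 0 for input such as "abc", which is indistinguishable from a real 0. If that matters, a stricter conversion based on strtol could look like the sketch below. parse_int is a hypothetical helper, not something the original code uses.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert `s` to an int with error checking.
 * Returns 0 and stores the value in *out on success, -1 on bad input. */
int parse_int(const char *s, int *out) {
    char *endp = NULL;
    errno = 0;
    long v = strtol(s, &endp, 10);
    if (endp == s || *endp != '\0') return -1;   /* no digits, or trailing junk */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX) return -1;  /* out of range */
    *out = (int)v;
    return 0;
}
```

In the getopt loop, this would replace start = atoi(optarg) with a call such as parse_int(optarg, &start), printing the usage message when it returns -1.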
Creating INS Processes#
Use fork to create INS processes, and use wait to reap them so that no zombie processes are left behind.
#define INS 100
pid_t pid;
int x = 0; // x: process number
for (int i = 1; i <= INS; i++) {
    if ((pid = fork()) < 0) {
        perror("fork");
        exit(1); // Just for convenience, not recommended in work.
    }
    if (pid == 0) {
        x = i; // Assign number to child process.
        break; // Key: without it, the child would stay in the loop and fork children of its own.
    }
}
if (pid != 0) {
    // Prevent zombie processes [wait for all child processes to finish].
    for (int i = 1; i <= INS; i++) {
        wait(NULL);
    }
    // Parent process
    printf("I'm parent!\n");
} else {
    printf("I'm %dth child!\n", x);
}
- This code segment is placed in the main function, after the command line arguments have been obtained.
- INS is defined as a macro.
- If creating a child process fails, the program simply calls exit(1) for convenience, which is not recommended in production code.
- Result: 100 child processes are created successfully. [Screenshot omitted.]
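The fork and wait pattern above can also check how each child exited. The following is a minimal sketch under that assumption; spawn_and_reap is a hypothetical name, and the children here do no work and exit immediately.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork `n` children, reap them all, and return
 * how many exited normally with status 0. */
int spawn_and_reap(int n) {
    int created = 0;
    for (int i = 1; i <= n; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            break;              /* still reap the children created so far */
        }
        if (pid == 0) {
            _exit(0);           /* child: real work would go here */
        }
        created++;
    }
    int ok = 0;
    int status = 0;
    for (int i = 0; i < created; i++) {
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            ok++;
        }
    }
    return ok;
}
```

Counting only the children that were actually created avoids calling wait more often than there are children when fork fails partway through.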
File-based Data Read and Write Interface#
Use files as carriers for shared data between processes.
- How should the data be stored in the file? As ASCII text [characters], as a raw int [low 16 bits + high 16 bits], and so on.
- Here, a structure is used to store the data for clarity.
- It stores the current addend and the running sum.
char data_file[] = "./.data";
char lock_file[] = "./.lock"; // [Optional] Set a dedicated lock.
struct Msg {
    int now; // Addend
    int sum; // Sum
};
struct Msg data; // Structure data.
// Write structure data. Returns the number of bytes written, or -1 on failure.
ssize_t set_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "w"); // "w": truncate the file, then write
    if (f == NULL) {
        perror("fopen");
        return -1; // Report the error to the caller; exiting in a small function is too rude.
    }
    size_t nwrite = fwrite(msg, 1, sizeof(struct Msg), f); // Element size is 1, so the return value is a byte count.
    if (fclose(f) != 0 || nwrite != sizeof(struct Msg)) return -1; // Flush failure or short write.
    return (ssize_t)nwrite;
}
// Read structure data. Returns the number of bytes read, or -1 on failure.
ssize_t get_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "r");
    if (f == NULL) {
        perror("fopen");
        return -1;
    }
    size_t nread = fread(msg, 1, sizeof(struct Msg), f); // Read structure data into msg.
    fclose(f);
    if (nread != sizeof(struct Msg)) return -1; // Empty or truncated file.
    return (ssize_t)nread;
}
- Create a global variable data for data manipulation in processes.
- Use standard file operations; low-level file operations are also feasible.
- Return values can be used by callers to check whether read and write were successful.
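For comparison, the same interface written with low-level file operations might look like the sketch below. The names set_data_raw and get_data_raw and the explicit path parameter are assumptions for illustration; the original code uses the stdio versions.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

struct Msg {
    int now; /* Addend */
    int sum; /* Sum */
};

/* Hypothetical low-level writer: returns bytes written, or -1 on failure. */
ssize_t set_data_raw(const char *path, const struct Msg *msg) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    ssize_t n = write(fd, msg, sizeof(*msg));
    close(fd);
    return n == (ssize_t)sizeof(*msg) ? n : -1;  /* treat a short write as an error */
}

/* Hypothetical low-level reader: returns bytes read, or -1 on failure. */
ssize_t get_data_raw(const char *path, struct Msg *msg) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    ssize_t n = read(fd, msg, sizeof(*msg));
    close(fd);
    return n == (ssize_t)sizeof(*msg) ? n : -1;  /* empty or truncated file */
}
```

Returning a signed type lets the caller's "< 0" check actually detect failure, which an unsigned size_t return value cannot do.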
Adding Locks⭐#
Allow processes to compete to maintain shared data and protect the data file from simultaneous operations.
【Two Approaches】 Use one file; use two files.
- Approach One: Directly lock the data file.
char data_file[] = "./.data";
// Perform addition [atomic operation: read + write]; end: addition stop condition; id: child number [can monitor from a god's perspective].
void do_add(int end, int id) {
    // Child keeps adding inside.
    while (1) {
        /*
         * Approach One: One file, directly lock the data file.
         */
        // Open data_file for locking.
        FILE *f = fopen(data_file, "r");
        if (f == NULL) {
            perror("fopen");
            exit(1);
        }
        // Take the exclusive lock [blocks until it is free].
        flock(fileno(f), LOCK_EX);
        // Read data from the file [get_data opens data_file again and gets a new fd, which does not hold the lock].
        if (get_data(&data) < 0) {
            fclose(f); // Release the lock before retrying; otherwise the next flock would block on our own lock.
            continue;
        }
        // Addend +1, and check if the addend exceeds the range.
        if (++data.now > end) {
            fclose(f);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(f);
            continue;
        }
        // Unlock [closing later will also automatically release the lock].
        flock(fileno(f), LOCK_UN);
        fclose(f);
    }
}
- Function parameters: end is the stop condition for the addition; id shows which child performs each addition.
- Everything between locking 👉 unlocking behaves as one atomic operation [the smallest indivisible unit].
- It covers reading the data, performing the calculation, and writing the data back; no other process that honors the lock can touch the data in between.
- Obtain the file descriptor fd from the file pointer FILE* f.
- ① f->_fileno [a glibc-internal field, not portable]
- ② fileno(f) [standard POSIX, preferred]
- [PS]
- Each fopen of the same file yields a different file descriptor, and a flock lock belongs to the open file it was taken on; it is not shared with the other descriptors.
- Closing the file automatically releases the lock.
- After each call to the read and write interface, check the return value to determine whether the operation succeeded.
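The point about independent locks can be observed directly. In this sketch (assuming Linux and a local filesystem), the same process opens one file twice; while the first descriptor holds the lock, a non-blocking lock attempt on the second is refused. second_open_is_blocked is a hypothetical helper name.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if a second open() of `path` is refused the
 * lock while the first descriptor holds it, 0 if not, -1 on setup failure. */
int second_open_is_blocked(const char *path) {
    int a = open(path, O_RDWR | O_CREAT, 0644);
    if (a < 0) return -1;
    int b = open(path, O_RDWR);         /* a separate open of the same file */
    if (b < 0) {
        close(a);
        return -1;
    }
    int blocked = -1;
    if (flock(a, LOCK_EX) == 0) {
        /* LOCK_NB makes flock fail with EWOULDBLOCK instead of waiting. */
        blocked = (flock(b, LOCK_EX | LOCK_NB) < 0 && errno == EWOULDBLOCK);
    }
    close(b);
    close(a);                           /* closing also releases the lock */
    return blocked;
}
```

This is also why a retry path must close the locked file before looping: opening and locking the file again without releasing the first lock would make the process wait on itself.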
- Approach Two: Set a dedicated file for locking.
char data_file[] = "./.data";
char lock_file[] = "./.lock"; // Set a dedicated lock.
void do_add(int end, int id) {
    while (1) {
        /*
         * Approach Two: Two files, use a separate file as a lock [easier to understand].
         */
        // Open or create the lock file.
        FILE *lock = fopen(lock_file, "w"); // "w": if the file does not exist, it will create one.
        if (lock == NULL) {
            perror("fopen");
            exit(1);
        }
        // Lock; if another process holds the lock, block until it is released.
        flock(fileno(lock), LOCK_EX);
        // Read data from the file.
        if (get_data(&data) < 0) {
            fclose(lock); // Close the lock file, release the lock.
            continue;
        }
        // Addend +1, and check if the stop condition is met.
        if (++data.now > end) {
            fclose(lock);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(lock); // Release the lock before retrying; otherwise the next flock would block on our own lock.
            continue;
        }
        // Unlock.
        flock(fileno(lock), LOCK_UN);
        fclose(lock);
    }
}
- lock_file is used solely for locking.
- Test run: 【Single-core, 5 processes, calculating 1~100】 [Screenshots omitted.]
- On a single core the output is more orderly than on multiple cores.
- A single core can only run one process at a time.
- You can call usleep() to make a process yield early, so that no single process calculates for too long and the order becomes more interleaved.
- If the output is piped to more, stdout is fully buffered instead of line-buffered, so each process's lines are flushed in blocks and appear grouped by process.
- 【Note】
- In the main function, write the initial values of data to the file first; otherwise, the file will be empty [see complete code].
- In the main function, call the do_add() function in the child process logic, and in the parent process logic, wait for all child processes to finish before retrieving and outputting the final result from the data file.
- ❗ In my tests, the final result was still correct even without locks, but this is not guaranteed: an unlocked read-modify-write is a data race.
- The addend and sum are packaged together in one structure, so a reader usually sees a consistent pair and the addition does not go wrong.
- However, each process ends up calculating the whole result itself. Is this caused by buffering? No.
- Adding fflush after every write does not change this: some steps are still calculated more than once, but each process still arrives at the correct final result.
- In effect, one process finishes a step and writes the data to the file, while another process that read an older value calculates the same step again.
- Explanation:
- When multiple processes open the same file, each process has its own file table entry (file object) with its own file offset.
- Therefore, multiple processes reading the same file work correctly, but writing to the same file may produce unexpected results; see pread and pwrite.
- Also refer to "Simultaneous File Operations by Multiple Processes in Linux" (cnblogs).
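As a sketch of the pread/pwrite idea, one locked step of the addition can be done on a single descriptor at a fixed offset, without reopening or truncating the file. open_data and add_step are hypothetical helpers, and this is an alternative to the approach used in this post, not the original implementation.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

struct Msg {
    int now; /* Addend */
    int sum; /* Sum */
};

/* Hypothetical helper: create the data file so that the first step adds `start`.
 * Returns an open descriptor, or -1 on failure. */
int open_data(const char *path, int start) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    struct Msg m = { start - 1, 0 };
    if (pwrite(fd, &m, sizeof(m), 0) != (ssize_t)sizeof(m)) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: one locked read-modify-write step.
 * Returns 1 if a number was added, 0 if `end` was already reached, -1 on error. */
int add_step(int fd, int end) {
    struct Msg m;
    int rc = -1;
    if (flock(fd, LOCK_EX) < 0) return -1;
    if (pread(fd, &m, sizeof(m), 0) == (ssize_t)sizeof(m)) {
        if (m.now >= end) {
            rc = 0;
        } else {
            m.now += 1;
            m.sum += m.now;
            if (pwrite(fd, &m, sizeof(m), 0) == (ssize_t)sizeof(m)) rc = 1;
        }
    }
    flock(fd, LOCK_UN);
    return rc;
}
```

In a multi-process version, each child should open the file itself after fork (with plain open and O_RDWR, without O_TRUNC). Descriptors inherited across fork refer to the same open file and therefore share one flock lock, which would give no mutual exclusion.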
Complete Code#
sum.c#
#include "head.h"
#define INS 100
char data_file[] = "./.data";
char lock_file[] = "./.lock"; // [Optional] Set a dedicated lock.
// Data to be passed.
struct Msg {
    int now; // Addend
    int sum; // Sum
};
struct Msg data; // Structure data.
// Write structure data. Returns the number of bytes written, or -1 on failure.
ssize_t set_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "w"); // "w": truncate the file, then write
    if (f == NULL) {
        perror("fopen");
        return -1; // Report the error to the caller; exiting in a small function is too rude.
    }
    size_t nwrite = fwrite(msg, 1, sizeof(struct Msg), f); // Element size is 1, so the return value is a byte count.
    if (fclose(f) != 0 || nwrite != sizeof(struct Msg)) return -1; // Flush failure or short write.
    return (ssize_t)nwrite;
}
// Read structure data. Returns the number of bytes read, or -1 on failure.
ssize_t get_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "r");
    if (f == NULL) {
        perror("fopen");
        return -1;
    }
    size_t nread = fread(msg, 1, sizeof(struct Msg), f); // Read structure data into msg.
    fclose(f); // Do not leak the stream.
    if (nread != sizeof(struct Msg)) return -1; // Empty or truncated file.
    return (ssize_t)nread;
}
// Perform addition [atomic operation: read + write]; end: addition stop condition; id: child number [can monitor from a god's perspective].
void do_add(int end, int id) {
    // Child keeps adding inside.
    while (1) {
        /*
         * Approach Two: Two files, use a separate file as a lock [easier to understand].
         */
        // Open or create the lock file.
        FILE *lock = fopen(lock_file, "w"); // "w": if the file does not exist, it will create one.
        if (lock == NULL) {
            perror("fopen");
            exit(1);
        }
        // Lock; if another process holds the lock, block until it is released.
        flock(fileno(lock), LOCK_EX);
        // Read data from the file.
        if (get_data(&data) < 0) {
            fclose(lock); // Close the lock file, release the lock.
            continue;
        }
        // Addend +1, and check if the stop condition is met.
        if (++data.now > end) {
            fclose(lock);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(lock); // Release the lock before retrying; otherwise the next flock would block on our own lock.
            continue;
        }
        // Unlock.
        flock(fileno(lock), LOCK_UN);
        fclose(lock);
        /*
         * Approach One: One file, directly lock the data file.
         */
        /*
        // Open data_file for locking.
        FILE *f = fopen(data_file, "r");
        if (f == NULL) {
            perror("fopen");
            exit(1);
        }
        // Take the exclusive lock.
        flock(fileno(f), LOCK_EX);
        // Read data from the file [get_data opens data_file again and gets a new fd, which does not hold the lock].
        if (get_data(&data) < 0) {
            fclose(f);
            continue;
        }
        // Addend +1, and check if the addend exceeds the range.
        if (++data.now > end) {
            fclose(f);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(f);
            continue;
        }
        // Unlock [closing later will also automatically release the lock].
        flock(fileno(f), LOCK_UN);
        fclose(f);
        */
    }
}
int main(int argc, char **argv) {
    int opt, start = 0, end = 0;
    while ((opt = getopt(argc, argv, "s:e:")) != -1) {
        switch (opt) {
            case 's':
                start = atoi(optarg); // atoi: string -> integer
                break;
            case 'e':
                end = atoi(optarg);
                break;
            default:
                fprintf(stderr, "Usage : %s -s start_num -e end_num\n", argv[0]);
                exit(1);
        }
    }
    printf("start = %d\nend = %d\n", start, end);
    fflush(stdout); // Flush before fork so the children do not inherit and re-print buffered output.
    // Write initial data to the file first; the first ++data.now must yield start.
    data.now = start - 1;
    data.sum = 0;
    if (set_data(&data) < 0) return -1;
    pid_t pid;
    int x = 0; // x: process number.
    for (int i = 1; i <= INS; i++) {
        if ((pid = fork()) < 0) {
            perror("fork");
            exit(1); // Just for convenience, not recommended in work.
        }
        if (pid == 0) {
            x = i; // Assign number to child process.
            break; // Key: without it, the child would stay in the loop and fork children of its own.
        }
    }
    if (pid != 0) {
        // Prevent zombie processes [wait for all child processes to finish].
        for (int i = 1; i <= INS; i++) {
            wait(NULL);
        }
        if (get_data(&data) < 0) return -1; // Get the final result.
        printf("sum = %d\n", data.sum);
    } else {
        do_add(end, x); // The only task of the child process.
    }
    return 0;
}
head.h#
#ifndef _HEAD_H
#define _HEAD_H
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <sys/file.h>
#endif
- There may be extra header files, which are not the focus.
References#
- Main knowledge points refer to "Network and System Programming".
- 0 Course Introduction and Command Line Parsing Functions — getopt
- 1 File, Directory Operations and Implementation of ls Ideas — fopen, fread, fwrite
- 3 Multi-process — fork, wait, flock⭐