Bo2SS

File-based Inter-Process Communication — Using 100 Processes to Compete for the Sum

Requirement Description#

  1. Set a concurrency level INS, the number of child processes to create.
  2. Use these INS processes to calculate the sum of the integers from start to end.
  3. Obtain start and end by parsing the command-line arguments with getopt, for example:
     ./a.out -s 12 -e 24
  4. Output an integer result: sum.

[Note]

  • Mainly involves file and process operations.
  • Sharing data through a file means data races must be considered.
  • File locks are used to play the role that a mutex plays between threads.
  • The file lock synchronizes access to critical data (data modified by multiple processes or threads).
  • Read up on flock first: man 2 flock
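
For a first feel of what man 2 flock describes, the following minimal sketch opens the same file twice and checks that an exclusive lock held through one descriptor is refused to the other. The helper name second_open_is_blocked is hypothetical and not part of the original program. LOCK_NB makes the second attempt fail with EWOULDBLOCK instead of blocking. The sketch assumes Linux or BSD flock semantics on a local filesystem.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper for illustration.
 * Returns 1 if a second open() of `path` is refused the lock while the
 * first descriptor holds it, 0 if the lock was granted, -1 on error. */
int second_open_is_blocked(const char *path) {
    int fd1 = open(path, O_RDWR | O_CREAT, 0644);
    if (fd1 < 0) return -1;
    int fd2 = open(path, O_RDWR);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }

    int result = -1;
    if (flock(fd1, LOCK_EX) == 0) {
        /* LOCK_NB: fail at once instead of waiting for the lock. */
        if (flock(fd2, LOCK_EX | LOCK_NB) == 0) {
            result = 0;
        } else if (errno == EWOULDBLOCK) {
            result = 1;
        }
    }
    close(fd2);   /* closing a descriptor drops any lock held through it */
    close(fd1);
    return result;
}
```

Because both descriptors are closed at the end, the lock is released and the function can be called again with the same outcome.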

Final Result#

  • Calculating the sum from 1 to 1000 with 100 processes gives the following result:
    • Image
    • Image
    • The processes successfully compete to compute the sum on the same shared data.

Implementation Process#

Flowchart#

  • Image
  • Be clear about what the parent process does and what each child process does.
  • Key: locking the file that multiple processes access makes each read-then-write of the data behave like an "atomic operation" [the smallest indivisible unit].
    • It is not truly atomic; the lock only guarantees that one read-modify-write sequence completes before another begins.
    • A process may still be preempted when its time slice runs out, but while it holds the lock, other processes that also request the lock cannot touch the data.
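
The lock-protected read-modify-write described above can be sketched as follows. This is an illustration under assumptions, not the post's code: it keeps a single int counter in the file, uses pread/pwrite instead of the stdio interface introduced later in the post, and the names locked_increment and count_with_processes are made up for the example.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Add 1 to the int stored at offset 0 of `fd`, holding an exclusive
 * flock for the whole read-modify-write. Returns 0 on success. */
static int locked_increment(int fd) {
    if (flock(fd, LOCK_EX) < 0) return -1;
    int value = 0, rc = -1;
    if (pread(fd, &value, sizeof value, 0) == (ssize_t)sizeof value) {
        value++;
        if (pwrite(fd, &value, sizeof value, 0) == (ssize_t)sizeof value) rc = 0;
    }
    flock(fd, LOCK_UN);
    return rc;
}

/* Fork `nproc` children that each increment the counter `rounds` times.
 * Returns the final counter value, or -1 on error. */
int count_with_processes(const char *path, int nproc, int rounds) {
    int zero = 0;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    if (pwrite(fd, &zero, sizeof zero, 0) != (ssize_t)sizeof zero) {
        close(fd);
        return -1;
    }
    close(fd);

    int created = 0;
    for (int i = 0; i < nproc; i++) {
        pid_t pid = fork();
        if (pid < 0) break;
        if (pid == 0) {
            /* Each child opens the file itself, so it owns a separate
             * lock instead of sharing one inherited from the parent. */
            int cfd = open(path, O_RDWR);
            if (cfd < 0) _exit(1);
            for (int r = 0; r < rounds; r++)
                if (locked_increment(cfd) < 0) _exit(1);
            _exit(0);
        }
        created++;
    }
    for (int i = 0; i < created; i++) wait(NULL);
    if (created != nproc) return -1;

    int total = -1;
    fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    if (pread(fd, &total, sizeof total, 0) != (ssize_t)sizeof total) total = -1;
    close(fd);
    return total;
}
```

With the lock in place no increment is lost, so 4 processes doing 50 rounds each should leave 200 in the file. Note that the children open the file after fork: descriptors inherited from the parent would share one lock and exclude nobody.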

Getting Command Line Arguments#

Parse the -s and -e options, each of which requires an argument.

#include "head.h"
int main(int argc, char **argv) {
    int opt, start = 0, end = 0;
    while ((opt = getopt(argc, argv, "s:e:")) != -1) {
        switch (opt) {
            case 's':
                start = atoi(optarg);  // atoi: string -> integer
                break;
            case 'e':
                end = atoi(optarg);
                break;
            default:
                fprintf(stderr, "Usage : %s -s start_num -e end_num\n", argv[0]);
                exit(1);
        }
    }
    printf("start = %d\nend = %d\n", start, end);
    return 0;
}
  • The header file "head.h" is listed at the end.
  • atoi converts a string to an integer; optarg is a char * that points to the current option's argument.
  • The effect is as follows:
    • Image
    • 🆗
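
atoi cannot report bad input: it returns 0 for "abc" just as it does for "0". If the arguments need validating, a strtol-based helper is one option. The sketch below is an assumption layered on the post's code, and parse_int is a hypothetical name.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal int from `s`.
 * Returns 0 on success and stores the value in *out; returns -1 if `s`
 * has no digits, has trailing characters, or does not fit in an int. */
int parse_int(const char *s, int *out) {
    char *endp = NULL;
    errno = 0;
    long v = strtol(s, &endp, 10);
    if (endp == s || *endp != '\0') return -1;   /* no digits, or trailing junk */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX) return -1;
    *out = (int)v;
    return 0;
}
```

Inside the getopt switch, a call such as parse_int(optarg, &start) could then replace atoi(optarg), printing the usage message when it returns -1.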

Creating INS Processes#

Use fork to create INS processes, and call wait once per child so that no zombie processes are left behind.

#define INS 100
pid_t pid;
int x = 0;       // x: process number
for (int i = 1; i <= INS; i++) {
    if ((pid = fork()) < 0) {
        perror("fork");
        exit(1);  // Just for convenience; not recommended in production code.
    }
    if (pid == 0) {
        x = i;   // Give the child its number.
        break;   // Key: otherwise the child stays in the loop and forks children of its own.
    }
}
if (pid != 0) {
    // Prevent zombie processes [wait for all child processes to finish].
    for (int i = 1; i <= INS; i++) {
        wait(NULL);
    }
    // Parent process
    printf("I'm parent!\n");  
} else {
    printf("I'm %dth child!\n", x);
}
  • This code segment goes in the main function, after the command-line arguments have been parsed.
  • INS is defined as a macro.
  • If creating a child process fails, the code simply calls exit(1) for convenience, which is not recommended in production code.
  • The effect is as follows:
    • Image
    • Successfully created 100 child processes.
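
wait(NULL) discards each child's exit status. When the parent should also know whether every child finished cleanly, the status can be inspected with the WIFEXITED and WEXITSTATUS macros. A minimal sketch, with the hypothetical name spawn_and_reap and children that do nothing but exit:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork `n` children that exit at once with status 0,
 * then reap them. Returns how many exited normally with status 0,
 * or -1 if a fork() failed. */
int spawn_and_reap(int n) {
    int created = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) break;
        if (pid == 0) _exit(0);   /* child: leave immediately, never re-enter the loop */
        created++;
    }

    int ok = 0;
    for (int i = 0; i < created; i++) {
        int status = 0;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0) ok++;
    }
    return created == n ? ok : -1;
}
```

The children that were created are reaped even when a later fork fails, so no zombies are left in the error case either.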

File-based Data Read and Write Interface#

Use files as carriers for shared data between processes.

  • How should the data be laid out in the file? As ASCII text [characters], as a raw int [low 16 bits + high 16 bits], and so on.
  • Here a structure is used, for clarity.
    • Image
    • It stores the addend and the sum.
char data_file[] = "./.data";
char lock_file[] = "./.lock";  // [Optional] Set a dedicated lock.
struct Msg {
    int now;                     // Addend
    int sum;                     // Sum
};
struct Msg data;               // Structure data.
// Write structure data. Returns the number of bytes written, or -1 on error.
// The return type is int rather than size_t: an unsigned value can never be < 0,
// so the callers' "< 0" error checks would otherwise never fire.
int set_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "w");                     // "w" truncates the file before writing.
    if (f == NULL) {
        perror("fopen");
        return -1;                                         // Report the error; calling exit() in a small helper is too abrupt.
    }
    size_t nwrite = fwrite(msg, 1, sizeof(struct Msg), f); // sizeof(struct Msg) items of 1 byte each.
    fclose(f);
    return (int)nwrite;       // Bytes actually written; a short count is passed up to the caller.
}
// Read structure data. Returns the number of bytes read, or -1 on error.
int get_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "r");
    if (f == NULL) {
        perror("fopen");
        return -1;
    }
    size_t nread = fread(msg, 1, sizeof(struct Msg), f);    // Read structure data into msg.
    fclose(f);
    return (int)nread;
}
  • The global variable data is the working copy that each process reads into and writes from.
  • Standard I/O (fopen, fread, fwrite) is used here; low-level file operations (open, read, write) would work as well.
  • Callers can use the return values to check whether a read or write succeeded.
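
For comparison, the ASCII alternative mentioned at the start of this section could look like the sketch below. It is not what the post uses; save_text and load_text are hypothetical names, and the one-line "now sum" layout is an assumption.

```c
#include <stdio.h>

/* Store the pair as one line of text, e.g. "3 6\n". Returns 0 on success. */
int save_text(const char *path, int now, int sum) {
    FILE *f = fopen(path, "w");
    if (f == NULL) return -1;
    int ok = fprintf(f, "%d %d\n", now, sum) > 0;
    if (fclose(f) != 0) ok = 0;   /* a failed flush on close also counts as failure */
    return ok ? 0 : -1;
}

/* Read the pair back. Returns 0 only if both numbers were parsed. */
int load_text(const char *path, int *now, int *sum) {
    FILE *f = fopen(path, "r");
    if (f == NULL) return -1;
    int fields = fscanf(f, "%d %d", now, sum);
    fclose(f);
    return fields == 2 ? 0 : -1;
}
```

A text file is easy to inspect with cat, while the binary structure used in the post has a fixed size and needs no parsing.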

Adding Locks⭐#

Let the processes compete to update the shared data while protecting the data file from simultaneous access.

【Two Approaches】 Lock the data file itself (one file), or lock a dedicated lock file (two files).

  • Approach One: Directly lock the data file.
char data_file[] = "./.data";
// Perform addition [atomic operation: read + write]; end: addition stop condition; id: child number [shows which child performed each step].
void do_add(int end, int id) {
    // The child keeps adding inside this loop.
    while (1) {
        /*
         * Approach One: One file, directly lock the data file.
         */
        // Open data_file just to obtain a descriptor to lock.
        FILE *f = fopen(data_file, "r");
        if (f == NULL) {
            perror("fopen");
            exit(1);
        }
        // Take the exclusive lock; this blocks while another process holds it.
        flock(fileno(f), LOCK_EX);
        // Read data from the file [get_data opens data_file again and gets a new fd, which does not share this lock].
        if (get_data(&data) < 0) {
            fclose(f);               // Release the lock before retrying; otherwise the next flock waits on our own lock.
            continue;
        }
        // Addend +1, and check whether the addend exceeds the range.
        if (++data.now > end) {
            fclose(f);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(f);
            continue;
        }
        // Unlock [closing the file afterwards would also release the lock].
        flock(fileno(f), LOCK_UN);
        fclose(f);
    }
}
  • Function parameters: end is the stop condition for the addition; id shows which child performs each addition.
  • Everything between locking and unlocking behaves as one atomic operation [the smallest indivisible unit].
    • It wraps reading the data, doing the calculation, and writing the data; no other process that takes the lock can cut in partway.
  • Obtain the file descriptor fd from the file pointer FILE *f.
    • ① f->_fileno [a glibc-internal field, not portable]
    • ② fileno(f) [the POSIX function, preferred]
  • [PS]
    • Opening a file repeatedly yields different file descriptors, and their locks are independent of each other.
    • Closing a file automatically releases its lock.
    • After each call to the read and write interface, check the return value to determine whether the operation succeeded.
  • Approach Two: Set a dedicated file for locking.
char data_file[] = "./.data";
char lock_file[] = "./.lock";  // Set a dedicated lock.
void do_add(int end, int id) {
    while (1) {
        /*
         * Approach Two: Two files, use a separate file as a lock [easier to understand].
         */
        // Open (or create) the lock file; the flock call below blocks while another process holds the lock.
        FILE *lock = fopen(lock_file, "w");  // "w": creates the file if it does not exist.
        if (lock == NULL) {
            perror("fopen");
            exit(1);
        }
        // Lock.
        flock(fileno(lock), LOCK_EX);
        // Read data from the file.
        if (get_data(&data) < 0) {
            fclose(lock);            // Close the lock file, which releases the lock.
            continue;
        }
        // Addend +1, and check if the stop condition is met.
        if (++data.now > end) {
            fclose(lock);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(lock);            // Release the lock before retrying.
            continue;
        }
        // Unlock.
        flock(fileno(lock), LOCK_UN);
        fclose(lock);
    }
}
  • lock_file exists solely for locking.
  • The effect is as follows: 【Single-core, 5 processes, calculating 1~100】
    • Image
    • Image
    • Output on a single core is more orderly than on multiple cores.
      • A single core can run only one process at a time.
      • Calling usleep() makes a process yield early, so no single process computes for long and the order becomes more mixed.
    • If the output is piped to more, it appears grouped by process: when stdout is not a terminal, stdio buffers it fully, so each process emits its lines in large blocks.
  • 【Note】
    • In the main function, write the initial values of data to the file first; otherwise the file is empty [see complete code].
    • In the main function, the child branch calls do_add(); the parent branch waits for all children to finish, then reads the final result from the data file and prints it.
  • ❗ Without any locks, the final result still comes out correct.
    • The addend and sum are packaged together in one record, so the addition itself does not go wrong.
    • However, the processes repeat each other's work, and each process ends up computing the whole result itself. Is stdio buffering the cause? No.
      • Adding fflush after every write changes little: in some cases a process does continue from another's progress, but each process still works its way to the correct final result.
      • In effect, one process finishes a step and writes it to the file, while another process that read data that was not yet the latest computes the same step again.
    • Explanation:
      • When multiple processes open the same file, each process gets its own file table entry (file object) with its own file offset.
      • Concurrent reads of the same file therefore work correctly, but concurrent writes may produce unexpected results; see pread and pwrite.
      • See also "Simultaneous File Operations by Multiple Processes in Linux" on cnblogs.
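
As a pointer for the pread/pwrite suggestion, here is a minimal sketch of a record interface built on them. The names msg_store and msg_load are hypothetical, and the layout (one struct Msg at offset 0) is an assumption. Both calls take an explicit offset, so they do not depend on the per-descriptor file position, and the file is never truncated before a write. They do not make read-modify-write atomic on their own, so the lock is still needed.

```c
#include <fcntl.h>
#include <unistd.h>

struct Msg {
    int now;   /* addend */
    int sum;   /* running sum */
};

/* Write the whole record at offset 0 without truncating the file first.
 * Returns 0 on success, -1 on error or short write. */
int msg_store(int fd, const struct Msg *msg) {
    ssize_t n = pwrite(fd, msg, sizeof *msg, 0);
    return n == (ssize_t)sizeof *msg ? 0 : -1;
}

/* Read the whole record from offset 0; a short read counts as failure.
 * Returns 0 on success, -1 otherwise. */
int msg_load(int fd, struct Msg *msg) {
    ssize_t n = pread(fd, msg, sizeof *msg, 0);
    return n == (ssize_t)sizeof *msg ? 0 : -1;
}
```

Treating a short read as an error also catches the empty-file case, which the fread-based get_data reports only through its byte count.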

Complete Code#

sum.c#

#include "head.h"
#define INS 100
char data_file[] = "./.data";
char lock_file[] = "./.lock";  // [Optional] Set a dedicated lock.
// Data to be passed.
struct Msg {
    int now;                     // Addend
    int sum;                     // Sum
};
struct Msg data;               // Structure data.
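// Write structure data. Returns the number of bytes written, or -1 on error.
// The return type is int rather than size_t: an unsigned value can never be < 0,
// so the callers' "< 0" error checks would otherwise never fire.
int set_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "w");                     // "w" truncates the file before writing.
    if (f == NULL) {
        perror("fopen");
        return -1;                                         // Report the error; calling exit() in a small helper is too abrupt.
    }
    size_t nwrite = fwrite(msg, 1, sizeof(struct Msg), f); // sizeof(struct Msg) items of 1 byte each.
    fclose(f);
    return (int)nwrite;       // Bytes actually written; a short count is passed up to the caller.
}
// Read structure data. Returns the number of bytes read, or -1 on error.
int get_data(struct Msg *msg) {
    FILE *f = fopen(data_file, "r");
    if (f == NULL) {
        perror("fopen");
        return -1;
    }
    size_t nread = fread(msg, 1, sizeof(struct Msg), f);    // Read structure data into msg.
    fclose(f);                                              // Close on every call; otherwise descriptors leak on each loop iteration.
    return (int)nread;
}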
// Perform addition [atomic operation: read + write]; end: addition stop condition; id: child number [shows which child performed each step].
void do_add(int end, int id) {
    // The child keeps adding inside this loop.
    while (1) {
        /*
         * Approach Two: Two files, use a separate file as a lock [easier to understand].
         */
        // Open (or create) the lock file; the flock call below blocks while another process holds the lock.
        FILE *lock = fopen(lock_file, "w");  // "w": creates the file if it does not exist.
        if (lock == NULL) {
            perror("fopen");
            exit(1);
        }
        // Lock.
        flock(fileno(lock), LOCK_EX);
        // Read data from the file.
        if (get_data(&data) < 0) {
            fclose(lock);            // Close the lock file, which releases the lock.
            continue;
        }
        // Addend +1, and check if the stop condition is met.
        if (++data.now > end) {
            fclose(lock);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(lock);            // Release the lock before retrying.
            continue;
        }
        // Unlock.
        flock(fileno(lock), LOCK_UN);
        fclose(lock);
        /*
         * Approach One: One file, directly lock the data file.
         */
        /*
        // Open data_file for locking.
        FILE *f = fopen(data_file, "r");
        if (f == NULL) {
            perror("fopen");
            exit(1);
        }
        // Add mutex lock.
        flock(fileno(f), LOCK_EX);
        // Read data from the file [get_data opens data_file again and gets a new fd, which does not share this lock].
        if (get_data(&data) < 0) {
            fclose(f);               // Release the lock before retrying.
            continue;
        }
        // Addend +1, and check if the addend exceeds the range.
        if (++data.now > end) {
            fclose(f);
            break;
        }
        // Perform addition.
        data.sum += data.now;
        printf("The <%d>th Child : now = %d, sum = %d\n", id, data.now, data.sum);
        // Write data to file.
        if (set_data(&data) < 0) {
            fclose(f);
            continue;
        }
        // Unlock [closing later will also automatically release the lock].
        flock(fileno(f), LOCK_UN);
        fclose(f);
        */
    }
}
int main(int argc, char **argv) {
    int opt, start = 0, end = 0;
    while ((opt = getopt(argc, argv, "s:e:")) != -1) {
        switch (opt) {
            case 's':
                start = atoi(optarg);      // atoi: string -> integer
                break;
            case 'e':
                end = atoi(optarg);
                break;
            default:
                fprintf(stderr, "Usage : %s -s start_num -e end_num\n", argv[0]);
                exit(1);
        }
    }
    printf("start = %d\nend = %d\n", start, end);
    // Write initial data to the file first. now starts one below start,
    // because do_add increments it before adding, so the first addend is start.
    data.now = start - 1;
    data.sum = 0;
    if (set_data(&data) < 0) return -1;
    pid_t pid;
    int x = 0;                              // x: process number.
    for (int i = 1; i <= INS; i++) {
        if ((pid = fork()) < 0) {
            perror("fork");
            exit(1);                        // Just for convenience; not recommended in production code.
        }
        if (pid == 0) {
            x = i;                          // Give the child its number.
            break;                          // Key: otherwise the child stays in the loop and forks children of its own.
        }
    }
    if (pid != 0) {
        // Prevent zombie processes [wait for all child processes to finish].
        for (int i = 1; i <= INS; i++) {
            wait(NULL);
        }
        if (get_data(&data) < 0) return -1; // Get the final result.
        printf("sum = %d\n", data.sum);
    } else {
        do_add(end, x);                     // The only task of the child process.
    }
    return 0;
}

head.h#

#ifndef _HEAD_H
#define _HEAD_H
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <sys/file.h>
#endif
  • Some of these headers may not be needed; that is not the focus here.
