Course Content#
Definition and Initialization of Arrays#
- Definition: A sequential collection of the same type of elements of a fixed size
- Initialization
- int a[100] = {2, 3}; // a[0] = 2, a[1] = 3
- int a[100] = {0}; // Array clearing operation: initializes every element to 0
- Actually only defines the 0th element as 0, but causes other elements to be automatically initialized to 0
- Others
- Starts from index 0
- Size of array space: total size of all variables
- Storage space is contiguous
- int a[3]
Address | 0123 | 4567 | 89[10][11] |
---|---|---|---|
Element | a[0] | a[1] | a[2] |
-
- The arrays defined above are all static arrays
- Not recommended operation: int n; scanf("%d", &n); int arr(n);
- Index-value mapping
- Given the index, the time efficiency of accessing the value is O(1)
- The arrays defined above are all static arrays
Prime Sieve#
Core Idea: Use prime numbers to eliminate all composite numbers
- Learn as an algorithm framework, with many variations
- Prime numbers: only have 1 and itself as factors
- 1 is neither a prime nor a composite number
- 2 is the smallest prime number
- Prime numbers must be odd (except for 2), but odd numbers are not necessarily prime
- Utilizing array structure. An array can represent three types of information:
- Index
- Value mapped by the index
- Sign of the value
- Algorithm measurement standards and performance
- Space complexity: O(N), an array of size N is created for N numbers
- Time complexity: O(N * logN), O(N * loglogN)--optimized version, approaching O(N)
- Not O(N) due to duplicate marking
- The base of the logarithm is not important; base 2 is already small
- How to calculate: amortization idea, calculating expected value based on time complexity and probability...
- Marking: Prime, 0; Composite, 1
- Utilizing two states of the array: index → number, value → mark
- Prime sieve extension
- Find the minimum and maximum prime factors of all numbers within a range
- For example, for 12: 2, 3
Linear Sieve#
- Start from Euler's 7th problem
- Find the 10001st prime number
- Can use enumeration method O(N * sqrt(N)), prime sieve
- Estimate how large the 10001st prime number will be?
- Rule: The range of primes ranked within 200,000 will not exceed 20 times their rank
- Thoughts from the prime sieve
- How many times is 6 marked? 2 times: 2, 3
- How many times is 30 marked? 3 times: 2, 3, 5
- If a number N's prime factorization contains m different primes, how many times is N marked? m times
- Can it be marked only once? 👇
- Linear sieve
- Both space and time are O(n)
- Use an integer M to mark composite number N
- The relationship between M and N satisfies four properties
- The smallest prime factor of N is p
- N = p * M
- ❗ p must be less than or equal to the smallest prime factor in M
- Similar to the principle of capturing Meng Huo, only mark on the last occasion
- Use M * P' (the set of all primes not larger than the smallest prime of M) to mark N1, N2, ...
[i and iii are somewhat equivalent]
- Code
-
- Core idea: N = M (↑ as large as possible) * p (↓ as small as possible)
- Enumerate M, gradually increase p until p meets the termination condition: property iii
- Compared to M, p is better controlled!
- The key to avoiding duplication is property iii
- Note: Although there are two layers of for loops, the time complexity is O(N) because of the break in the second layer for loop, the entire array is traversed once
- The difference from optimizing the initial condition of the prime sieve to i * i is? The prime sieve still has duplicate marking situations
-
Binary Search Method#
-
Sequential search → binary search: O(N) → O(logN)
-
Key premise: The search sequence must be monotonic
-
Code
-
-
-
Mainly demonstrates three parts: ① Binary search in an array ② Binary search in a function ③
-
For the implementation part of binary search in a function
- Should be able to unify binary_search_f_i and binary_search_f_d into one function
- Focus on later learning, can you identify the return type of the passed function pointer?
- However, there are differences in the implementation part, so the significance should not be great
-
The commented code in the main function tests the results of binary search in arrays and functions, as follows:
-
- Using a function for binary search can be more flexible
-
Double type equality: use the difference less than EPSL to judge
- Remember to #undef EPSL
-
Details are in the comments
- Learning order: binary_search 👉 binary_search_f_i 👉 binary_search_f_d
-
-
Recursive Version Code
-
-
The parameters of the function differ from the iterative method, and the search boundaries need to be determined
-
Newton Iteration Method#
-
-
Purpose: To solve for the value of x when f(x) = 0
-
Conditions: The function must be differentiable; convergence
-
Iteration Formula: x2 = x1 - f(x1) / f'(x1)
-
Core idea: Each iteration brings x2 closer to the required value of x until the absolute value of f(x) is smaller than EPSL
-
Code
-
-
⭐ Remember to take the absolute value for the loop termination condition!
-
Is x = n / 2.0 this initial value fixed?
- No
- This value is an imagined x closest to the solution, which is fine on both sides of the solution
- However, it cannot be 0, as the derivative would be 0, leading to a division by 0
-
Wrap the Newton function to obtain the my_sqrt function
-
This is not binary search! Binary search requires monotonicity~
-
Preprocessing Commands#
- Commands starting with #
- #include <stdio.h> → During preprocessing, the stdio.h file is copied to the current location unchanged
- Macro Definitions
- Define symbolic constants: constant → symbol.
#define PI 3.1415926
#define MAX_N 1000
- Convenient for batch modification of values
- Foolish expressions
#define MAX(a, b) (a) > (b) ? (a) : (b)
#define S(a, b) a * b
- Foolish eg. #define S(a, b) a * b
- S(2 + 3, 4 + 5) → Simple replacement → 2 + 3 * 4 + 5 is equivalent to 2 + (3 * 4) + 5
- No calculations are done during the replacement!
- Define code segments (advanced)
#define P(a) { \
printf("%d\n", a); \
}
-
Macros can only accept one line of definition, and a newline requires the use of a backslash '' (continuation character)
-
Predefined macros (system encapsulated)
-
-
The two ends of the macro are two underscores
-
DATA, TIME: The date and time at pre-compilation, which do not change without preprocessing
- Can achieve functions similar to fingerprint recognition? Pre-stored relationships?
-
Non-standard macros: Some environments may not define them, and different environments may have different standards, so portability is poor
- How to handle this? Conditional compilation can be used 👇
-
-
Conditional Compilation
- #ifdef DEBUG
- Explanation: Whether the DEBUG macro is defined
- Note: DEBUG must be a macro, not a variable!
- Macros take effect after preprocessing, while variables take effect after compilation
- #ifndef DEBUG Whether not defined
- #if MAX_N == 5 Whether the macro MAX_N equals 5
- Can be used to check the user's mobile version in games
- #elif MAX_N == 4
- #else
- Must end with #endif**!!
- 【Coding Specification】
- Similar to va_start, va_end
- #ifdef DEBUG
-
Simple Compilation Process
-
-
C Source Code .c
👇 Preprocessing Stage (Pre-compilation) (gcc -E): Includes header files, macro replacements, and other preprocessing commands, replacing all # commands
- Compile Source Code .c
👇 Syntax checking, compilation (gcc –S)
- Assembly Source Code .s, .asm
👇 Assembly (gcc -c)
- Object File .o (under Linux) or .obj (under Windows)
👇 Linking (gcc)
- Executable Program a.out (under Linux) or a.exe (under Windows) (binary file)
PS: Finally, there is a loading process, mainly to establish the mapping relationship between the executable file, virtual address, and physical address
-
Strings#
- Character Arrays
- Definition: char str[size];
- Initialization
char str[] = "hello world";
char str[size] = {'h', 'e', 'l', 'l', 'o'};
-
Line1
- The size in [] can be omitted, and the system will calculate it automatically
- The system will automatically add a terminator '\0', the size of the character array: 11+1
-
Line2
- Size must be at least 6, considering the terminator '\0' occupies one position
- Extra positions are filled with '\0'
- If size is not added, '\0' must be added at the end of the array, as the system will not add it automatically
- Size must be at least 6, considering the terminator '\0' occupies one position
-
'\0'
- A termination marker
- Corresponds to 0 at the lower level: false
- '\0' can serve as a termination condition, such as: for loop reading a string
-
Is there a '\0' after the character? No, it is a characteristic of strings
-
Related Operations
-
-
Header File: <string.h>
-
strlen(str)
- Returns the index of character '\0', length does not include '\0'
- PS: sizeof() considers the actual memory occupied by the variable, including '\0', measured in bytes
-
strcmp(str1, str2);
- Compares ASCII values in dictionary order
- Starts from the first position
- Return value is the difference (0 if equal)
- Compares ASCII values in dictionary order
-
strcpy(dest, src);
-
The end markers for comparison and copying: look for character '\0', what if '\0' is lost? Hence there are
-
Safer comparisons and copies: strncmp, strncpy
- For cases where '\0' might be accidentally lost
- Compare and copy at most n bytes
-
Memory Related
-
memset(str, c, n)
-
Not limited to string operations, str corresponds to an address
-
Here is an example of operations on arrays
-
memset(arr, 0, sizeof(arr))
- Initializes, clears to 0
-
memset(arr, 1, sizeof(arr))
- This does not set every position to 1, because it operates on each byte, while int occupies 4 bytes
- 0000...01000...001000....0001...
-
memset(arr, -1, sizeof(arr))
- It is indeed -1, because -1 in 1 byte is 11111111
-
-
-
-
-
Header File: <stdio.h>
-
The combination of both can perform format concatenation: one for input, one for output
-
The str1 in scanf is analogous to terminal input, which is replaced with string input
In-Class Exercises#
Implement a BUG-free MAX macro#
-
-
Idea: ① Calculate correct output manually 👉 ② First write the simplest implementation with BUG and have friendly output display 👉 ③ Find and solve problems sequentially for erroneous outputs
-
① Correct output as shown
-
② First write with BUG, and implement friendly output display
#define MAX(a, b) a > b ? a : b
#define P(func) { \
printf("%s = %d\n", #func, func); \
}
- Line2-4: Friendly output display
- #func directly outputs its string representation
- Not adding # means calling its value
- ③ Find and solve problems sequentially for erroneous outputs
-
(The order of solving BUGs is also important, but here the order of BUGs is good)
-
At this point, check the erroneous situation
-
-
Errors caused by macros can be seen in the preprocessed file
- gcc -E *.c outputs the code after macro replacement
-
First error correction
-
- Solution: You can use parentheses to increase the precedence in the red box
-
-
#define MAX(a, b) (a > b ? a : b)
-
- Second error correction
-
-
The fourth result is 2, as shown in the above image
-
Note the conditional operator '?:'
-
Its precedence is very low: lower than '>'
-
When there are multiple '?:', its associativity is from right to left
-
-
-
Solution: Use parentheses for each variable to increase precedence
-
- Second error correction
#define MAX(a, b) ((a) > (b) ? (a) : (b))
-
- Third error correction
-
-
a++ ran twice
- It should first take a for comparison, then perform a+1
-
Solution: Assign the pre-increment value to a variable; define it as an expression generated by a code segment
-
- Third error correction
#define MAX(a, b) ({\
__typeof(a) _a = (a);\
__typeof(b) _b = (b);\
_a > _b ? _a : _b;\
})
-
Does __typeof(a++) execute?
- When used in an expression, it does not execute, it just retrieves its type
-
({}) expression
-
The value is the last expression's value in parentheses, and the data type is that of the last expression
-
Refer to C language's use of parentheses and braces in conjunction with && compound statements-CSDN
-
-
Removing {} or () will not work!
- () can only contain comma expressions
- The comma expression takes the value of the last expression
- {} is a code segment, which has no return value
- If directly defined as a macro function
- Directly printf output would lose the essence of MAX returning the maximum value
- ❓ Use return to return value
- Needs declaration
- But using an expression to represent the return value is smarter
- () can only contain comma expressions
-
-
Final Version Code
-
Result is all correct!
Implement LOG macro printing#
-
-
-
Code
-
-
When delivering work, to omit log information, conditional compilation can be used
- Use "gcc -DDEBUG" during compilation to pass in the DEBUG macro definition
- Define an empty macro replacement after #else
-
Variadic Macros
- Variadic '...' just needs a name. eg. args...
-
Usage of "#" and "##" in macros
- "#" : Take string
- "##" : Concatenate
-
Solve the case where the log macro input is an empty character: If empty, the comma after frm will be automatically removed, refer to the precompiled result 👇
-
-
Refer to Text Replacement Macros-cppreference
-
-
The difference between ab and a##b
- ab will not substitute a and b before concatenation
- The parameters corresponding to a and b that need to be concatenated can be undefined
- The preprocessor will concatenate these undefined variable names, and the defined variable name after concatenation is sufficient
-
-
Highlights Notes#
Code Demonstration#
Prime Sieve#
Code
-
-
Dual logic, control indentation
- if continue
-
The front part of the prime array can simultaneously count and store (used to mark overall)
- Initially, prime[0] and prime[1] are not used, left idle
- There will be no overtaking issue: even numbers are definitely composite, and prime numbers appear at least every other time
-
The initial condition of the second for loop can be optimized
- Initial Condition: i * 2 → i * i
- Time efficiency: O(Nlogn)→O(Nlognlogn)
- Because composite numbers smaller than i * i must have been marked by numbers smaller than i
- This reduces the number of duplicate markings in the earlier part, but not completely
- To completely avoid duplicate marking, a linear sieve must be used
- Danger point: i * i grows exponentially, more likely to overflow the maximum value that int can store
- Can a judgment be added in advance? But it may lead to greater time consumption
- Change to a data type with a higher upper limit to store i * i
Arrays#
Code
-
-
Output
-
- Details 👇
-
Declaration and Initialization#
- Programming Tip: Open 5 more positions to avoid overflow, which can reduce the bug rate
#define max_n 100
int arr[max_n + 5];
- The variable space in functions is all defined in the stack area
- The stack area can be understood as a badminton tube, last in first out
- Default stack size: 8MB≈8000000Byte (bytes)
- Arrays declared inside functions may be uninitialized and need manual initialization
- = {0};
- scanf input
for (int i = 0; i < max_n; i++) {
scanf("%d",arr + i);
}
-
- arr + i is equivalent to &arr[i]& cannot be omitted!
- Arrays declared outside functions
- The system will automatically clear them to 0, similar to the = {0} operation
- The size that can be allocated is limited by the computer's memory, essentially unlimited
- Arrays defined by malloc, calloc, realloc are in the heap area, even if defined inside functions
Space Occupation and Address#
- The control character corresponding to sizeof() is %lu, unsigned long
- The array name arr represents the address of the first element of the array, which is &arr[0]
printf("%p %p\n", arr, &arr[0]); // Address values are the same
- Address + 1 offsets how many bytes depend on the element byte size represented by the address
Parameter Passing#
- Condition: The external and function parameter representations must be consistent
- One-dimensional array: int arr[100];
- void func(int *num)
- Two-dimensional array: int arr2[100][100];
- void func2(int **num) will warn
- num + 1 and arr + 1 have inconsistent representations
- num is a pointer to an int pointer
- The address difference between num and num + 1 is the size of one pointer, which is 8 bytes in a 64-bit operating system
- The address difference between arr and arr + 1 is the size of one one-dimensional array, which is 4 * 100 bytes
- void func2(int (*num)[100]) is possible
- void func2(int **num) will warn
- Three-dimensional array: int arr3[100][100][32]
- void func3(int (*num)[100][32]) is possible
- (*num)[100][32] can be written as num[][100][32]
- (*num) and num[] have the same representation, but they are essentially different
- One-dimensional array: int arr[100];
- ❓ Is q pointing to nil? That is 0x0
Additional Knowledge Points#
- An array can represent three types of information
- int arr[100]; // When arr is passed as a parameter to a function, it is the address of the first element of the array
- Is 1 negated to -2? Derive it
- 1: 0000...001
- Negation: 1111...110
- Sign bit remains unchanged, negation plus one gives: 1000...010
- That is -2
- Conclusion: The bitwise negation of all positive integers is their own +1's negative number
- For source files containing <math.h>, add -lm during compilation, using gcc *.c -lm
- Coding habit: No more than 80 characters per line
- Control character "%X": Uppercase hexadecimal
Points to Ponder#
- Who is more powerful, functions or macros?
- Macros can generate functions; aren't macros the father of functions?
Tips#
- Modify the format for automatic completion of header files in vim: add spaces
- Enter the ~/.vimrc file and modify the corresponding format in SetTitle()
- Explore macro expansion and nesting yourself
- Refer to the reference book chapters 8.1, 8.4, and chapter 15