Bo2SS

Bo2SS

6 Forests and Disjoint Sets

Course Content#

  • Union-Find: Merge sets [Establish connectivity], Find connectivity in sets [Determine if two points are connected]

Connectivity Problem#

  • Image
  • Rule: Points that already have a connectivity relationship cannot be connected again

  • All points are divided into 2 sets: ①, ②

  • [PS] The relationship between points (merge, check connectivity) → can also be → the relationship between sets

Quick-Find Algorithm#

  • Core Idea: Coloring
    • One color corresponds to one category
    • Initialization: Individuals are independent, all written as their own index, belonging to an independent set
    • ⭐ Change the color of all points that are connected to themselves to the color to be dyed
  • Time Complexity
    • Connectivity check: O(1) — Fast lookup, hence called Quick-Find
    • Merge operation: O(N) — Needs to traverse all points to determine if they should be merged
  • 🆒 Provokes thought: Why is it also called a forest?
    • The key lies in the merge operation, merging several points with several points, merging sets with sets
    • Similar to merging between subtrees, or treating one tree as a subtree of another tree
    • All sets together, similar to all trees together, is a forest
  • So, besides coloring, are there other methods?
    • Thought: The color it is dyed does not matter, only need to know which points have the same color, i.e., are connected

Quick-Union Algorithm#

  • Life Inspiration: Find the big brother, find the leader
  • Core Idea: Find representative elements [Root node]
    • The stored value is its representative element
    • Initialization: Individuals are independent, all written as their own index, belonging to an independent set
    • ⭐ Find the representative elements of two points, modify the value in one representative element to the value in the other representative element
    • Representative element: The value inside is itself
    • Association: If the big brother or his little brother betrays, they all start betraying from the big brother, betraying to the big brother of another little brother
      • The result is the same as Quick-Find, both merge a subtree into another subtree, but achieved through representative elements
      • Only root nodes can point to root nodes
  • Time Complexity
    • Must find the root node, related to tree height
    • Connectivity operation: O(tree height)
    • Merge operation: O(tree height)
    • ❗ 2 cases
      • Normal state → both O(logN)
      • Degenerate into a linked list → both O(N)
  • For degenerate cases, how to solve it? 👇

Weighted Quick-Union Algorithm#

【Union by Rank】

  • How to avoid degeneration? → Ensure lush branches and leaves
    • Merge basis 1: Tree height, short trees hang under tall trees [pairwise combination]
      • A tree with height h requires at least N nodes of 2 ^ (h - 1)
      • That is, tree height h = log[2]N + 1 ≈ log[2]N
      • [PS] Only when two trees of the same height merge will the height increase
    • Merge basis 2: Number of nodes, trees with fewer nodes hang under trees with more nodes
    • Both optimization methods can achieve O(logN), but merge basis 2 [number of nodes] is slightly better
  • ⭐ Why is merge basis 2 better
    • 【Example】What is the average search time
      • As shown in the figure below, the average search times for trees A and B are calculated
      • Image
      • Node depth is the number of searches for the node, average search time = total search time / total number of nodes
      • In this example, tree B's search operation is faster
    • 【Derivation】Merge basis 2 directly determines the average search time
      • For trees A and B with SA and SB nodes, their total search times LA and LB are:
        • Image
        • Where li represents the depth of the i-th node
      • At this time, perform the merge operation, calculating the average search times for ① A→B and ② B→A
        • ① When tree A is merged as a subtree into tree B, it is
          • Image
          • All nodes in tree A need to search once more
        • ② When tree B is merged as a subtree into tree A, it is
          • Image
          • All nodes in tree B need to search once more
      • ❗【Comparing the average search times of the two methods】
        • Not directly related to tree height [LA, LB], but the number of nodes in the numerator [SA, SB] directly determines the number of searches, the smaller the number, the better
        • 👉 Whoever has fewer nodes will be merged as a subtree
        • ❓ Thought: Does the above derivation prove that height cannot be used as a merge basis?
          • ❌ No, height indirectly affects the number of nodes, generally speaking, the lower the height, the fewer nodes
          • However, for special cases, if tree A is taller than tree B, but tree A has fewer nodes than tree B, still use [number of nodes] as the merge basis, merging tree A as a subtree into tree B
    • Therefore, using the number of nodes as the merge basis is better! 👇 The merging idea is as follows
  • When merging two subtrees
    • If the number of nodes is the same, follow the ordinary Quick-Union approach
    • If not the same, the root node of the subtree with fewer nodes connects under the 👉 root node of the subtree with more nodes
  • [PS] In other words
    • When changing big brothers
    • If the number of little brothers is the same, follow the ordinary Quick-Union approach
    • If not the same, the big brother with fewer little brothers must mix with the 👉 big brother with more little brothers

+ Path Compression#

  • Refer to the visualization results of weighted Quick-Union in the class exercise 2
    • Image
    • The position of 0 is a bit awkward
    • ❗ Regardless of whether the representative element of 0 is 1 or 3, it makes no difference; directly attaching 0 under 3 can also reduce tree height
    • 【Method】Make each non-root node point directly to the root node, flattening the structure

Efficiency Comparison of the Above Algorithms

  • Image

Class Exercise#

Quick-Find vs. Quick-Union#

  • Image
  • 【Key】Understanding Quick-Union

    • 0->1->2->4->5, 3->4->5; 8->9->7->6
    • Search, merge boundaries: Stop when its representative element is itself

Quick-Union vs. Weighted Quick-Union#

  • Image
  • 【Key】Understanding the meaning of weighted

    • When the number of elements in two sets is different
    • The representative element of the set with fewer elements 👉 the representative element of the set with more elements
    • The big brother with fewer little brothers must follow the big brother with more little brothers
  • Result Visualization

    • Image
    • It is clear that the tree obtained by the weighted method is shorter, with higher efficiency in merging and searching

Code Demonstration#

HZOJ-71: Friend Circle#

Image

Sample Input

6 5
1 1 2
2 1 3
1 2 4
1 4 3
2 1 3

Sample Output

No
Yes
  • Idea
    • Use data structure — Union-Find
    • 1 is the merge operation, 2 is the find operation
    • Test the algorithm efficiency of Quick-Find, Quick-Union [+weighted, +path compression, -weighted]
  • Code

Quick-Find#

  • Image
  • Image
  • Search, merge strategy: Based on coloring

Quick-Union#

  • Image
  • Based on Quick-Find

    • Modify the structure definition's color to the representative element's father
    • Modify the search operation: Need to find the bottom
    • Modify the merge operation: Must find the bottoms of both elements before merging
  • Easy to degenerate into a linked list, optimization follows

+ Weighted#

  • Image
  • Based on Quick-Union

  • Add size attribute to record the number of nodes in the set where the node is located, using this as the merge strategy

+ Path Compression#

  • Image
  • Each search will perform a path compression, making all elements in the range from the 【searched element】 to the 【root representative element】 point to the root representative element [root node]

- Weighted#

  • Image
  • Remove all operations related to size
  • With path compression, not using rank for merging [removing size attribute] can still achieve good results

Time Taken for Testing Problems on HZOJ Platform

MethodQuick-FindQuick-Union+Weighted+Path CompressionWeighted
Time (ms)7442020164172188
  • Quick-Union encountered degeneration issues
  • After adding path compression, not using rank for merging can still achieve good results
  • 🆒The last three methods all have good effects!

Points of Reflection#

  • In the weighted Quick-Union algorithm section, does the derivation that the number of nodes is better prove that height cannot be used as a merge basis?
    • ❌ No, height indirectly affects the number of nodes; generally, the lower the height, the fewer nodes
    • However, for special cases, if tree A is taller than tree B, but tree A has fewer nodes than tree B
      • Must use [number of nodes] as the merge basis ❗ Merge tree A as a subtree into tree B

Tips#

  • "C++ Primer" is recommended as a reference book
  • Image

Course Summary#

Loading...
Ownership of this post data is guaranteed by blockchain and smart contracts to the creator alone.