Bo2SS

Bo2SS

1 Binary Search Tree, AVL Tree

Data structure is to define a property and maintain that property.

BS Tree 👉 AVL Tree

Course Content#

Binary Search Tree#

[Also known as binary search tree, binary lookup tree]

Basic Knowledge#

  • Property: For any triplet, left subtree < root node < right subtree
  • Purpose: To solve retrieval needs related to searching and ranking
  • 【Why is the entry point of a tree structure always the root node?】
    • First, understand the meaning of points and edges: sets and relations
    • The root node represents the whole set
  • Structural operations: insert, delete, search
    • Insert
      • Continuously compare with the root node of the subtree, recurse, until the subtree has no child nodes
      • ⭐ The new node will definitely become a leaf node
    • Delete [node]
      • Degree 0: delete directly
      • Degree 1: attach the child subtree [orphan subtree] directly to its parent node
      • Degree 2: [a bit troublesome]
        • Find predecessor or successor to replace the original node
          • Predecessor node: the node corresponding to the maximum value in its left subtree [this node definitely has no right subtree, degree at most 1]
          • Successor node: the node corresponding to the minimum value in its right subtree [this node definitely has no left subtree, degree at most 1]
          • 【Only for nodes of degree 2, otherwise need to consider finding towards the parent node, to what extent?】
        • ⭐ Thus transforming into the problem of deleting nodes of degree 0 or 1 [predecessor or successor]
        • [Can also be understood as the deletion operation of a one-dimensional array]
    • Search: Recursively search based on properties
  • The order of insertion will affect the tree structure, corresponding to different average search efficiencies
    • The same sequence, but different insertion orders
    • Average search efficiency: the expected value of the number of node searches. That is, average search times: total search times / number of nodes
      • Assuming each node is searched with equal probability
  • In-order traversal → an ordered sequence from smallest to largest
  • Refer to 《Basic Data Structures》——3 Trees and Binary Trees——Binary Search Trees

Extended Content#

[See specific code in code demonstration — Binary Search Tree]

  • Delete operation optimization [code demonstration — Binary Search Tree]
    • The delete operation for nodes of degree 0 and degree 1 can share the delete operation code for degree 1
    • Code for degree 0
    • Image
    • Code for degree 1
    • Image
    • When the degree is 0, temp points to NULL, also destroy root, then return NULL
  • — Find the element ranked k
    • ⭐ Add a size field to record the number of nodes in each subtree [including the root node]
      • The essence of data structure, modifying structure definition for specific problems
      • Update size during insertion and deletion
    • Search conditions
      • ① k = LS + 1, the root node is the value we are looking for
      • ② k < LS + 1, the element ranked k is in the left subtree, continue searching in the left subtree
      • ③ k > LS + 1, the element ranked k is in the right subtree, search for the element ranked k - LS - 1 in the right subtree
      • [PS] LS is the number of nodes in the left subtree
      • Image
      • Output at boundary conditions [-1 or key]
  • — Find the elements ranked top k [all elements ranked <= k]
    • 【Based on the ranking problem】
    • Search conditions
      • ① k = LS + 1, output all values in the left subtree
      • ② k < LS + 1, all top k elements are in the left subtree
      • ③ k > LS + 1, both the root node and the elements in the left subtree belong to the top k elements
      • Image
      • Equivalent to continuously reducing the size of the tree, thus transforming into the first case
      • There are actually many solutions [such as heaps], there is no standard answer
  • The relationship between binary search trees and quicksort
    • 【Binary search trees are the data structure used by quicksort at the conceptual level】
    • The Partition operation in quicksort
      • Treat the pivot value as the root node of the binary search tree
      • After each Partition operation, the relationship between the pivot value and the left and right sides is actually 👇
      • The relationship between the root node and the left and right subtrees
    • ⭐ Understanding the following two thoughts can clarify the relationship between the two
      • Thought 1: The relationship between the time complexity of quicksort and the tree-building time complexity of binary search trees
        • They are essentially the same, the tree-building process is similar to multiple Partition processes
      • Thought 2: The relationship between the quickselect algorithm and binary search trees
        • The essence of both is the same, the former manifests as an algorithm, the latter as a data structure
      • [Personal understanding]
        • Quicksort is a bunch of numbers doing a Partition once, then continuing to split
        • The tree-building process is one number at a time, determining the position of this number before moving to the next

Balanced Binary Search Tree — AVL Tree#

Solves the degeneration problem in binary search trees

Basic Knowledge#

  • Background
    • Inventors: AV, L, with AV making a greater contribution
    • Year: 1962
    • [Neural networks also emerged in this era]
  • Properties
    • Essentially also a binary search tree, possessing all properties of binary search trees
    • Additional property: The height difference between the left and right subtrees does not exceed 1 [left and right subtrees are roughly equal]
  • ⭐ Balance condition, balance adjustment

[Further] Thoughts#

What is the range of the number of nodes contained in a BS tree and AVL tree with height H?

  • [General upper limit for binary trees: 2 ^ H - 1]
  • <BS> H ~ 2 ^ H - 1
  • <AVL> low(H - 2) + low(H - 1) + 1 ~ 2 ^ H - 1
    • That is, 1.5 ^ H ~ 2 ^ H - 1, low(H) represents the minimum number of nodes in an AVL tree of height H
    • The left boundary is a Fibonacci sequence
      • Its growth rate is 1.618 ^ n ≈ 1.5 ^ n 【can be used for estimation】
    • 👉 The relationship between the number of nodes and height is logarithmic, so search efficiency is O(logN) level
  • ❓ [Further] Is the search efficiency of a regular binary search tree necessarily worse than that of an AVL tree?
    • Not necessarily, it depends on the order of the insertion sequence, but the lower limit of the AVL tree is high
    • 【Extension】
      • Tree height = lifespan, number of nodes = wealth of life, different algorithms yield different results
      • Education raises the baseline, not the upper limit; the upper limit depends on ability and luck
        • The upper limit largely depends on an uncontrollable insertion sequence, left to fate

Adjustment ⭐#

Insertion/deletion + adjustment: The insertion operation is consistent with that of binary search trees, the key lies in adjustment [⭐ which node to rotate]

Rotation#

[Basic operation used when unbalanced]

  • Left Rotation
    • Image
    • Grasp the root node and rotate left around the center point
      • K3 becomes the root node
      • K1 becomes the left subtree of K3
      • K3's original left subtree A becomes the right subtree of K1 [because K3 > A > K1]
    • "Leaf node" depth changes: A remains unchanged, B - 1, K2 + 1
      • 👉 Subtree height changes: left subtree + 1, right subtree - 1
  • Right Rotation
    • Image
    • Grasp the root node and rotate right around the center point
      • K2 becomes the root node
      • K1 becomes the right subtree of K2
      • K2's original right subtree B becomes the left subtree of K1 [because K2 < B < K1]
    • "Leaf node" depth changes: A - 1, B remains unchanged, K3 + 1
      • 👉 Subtree height changes: left subtree - 1, right subtree + 1
  • Left and right rotations are symmetric operations, essentially affecting the height of the subtrees

Types of Imbalance && Adjustment Strategies#

  • Image
  • Judging scenario: During the backtracking process, the first time an imbalance is detected, adjust as soon as an imbalance is found
  • Symmetrical situations: LL ——RR , LR——RL

① LL

  • Image
  • ⭐ Key point ⭐: Analyze the relationship between h1, h2, h3, h4 [write equations]
    • Analysis
      • Insert one by one, the newly inserted node must cause an LL-type imbalance in subtree A [deleting also happens one by one]
      • Before insertion, it must be balanced
      • After insertion, K1's left subtree is 2 higher than its right subtree
        • That is, h(K2) = h(K3) + 2, while
          • h(K2) = h1 + 1
          • h(K3) = max(h3, h4) + 1
    • Equation relationship=>
      • h1 = max(h3, h4) + 2 = h2 + 1
      • [PS] |h3 - h4| <= 1
  • <Adjustment Strategy> Big Right Rotation. Result shown in the above image, balance analysis as follows:
    • ① K1: left subtree height h2 = right subtree height max(h3, h4) + 1 = h1 -1
    • ② K2: left subtree height h1 = right subtree height h2 + 1
    • Already balanced, and can be said to be frighteningly balanced

② LR

  • Image
  • ⭐ Key point ⭐: Analyze the relationship between h1, h2, h3, h4 [write equations]
    • Analysis
      • Insert one by one, the newly inserted node must cause an LR-type imbalance in subtree B / C
      • Before insertion, it must be balanced
      • After insertion, K1's left subtree is 2 higher than its right subtree
        • That is, h(K2) = h(D) + 2, while
          • h(K2) = max(h2, h3) + 2
          • h(D) = h4
    • Equation relationship=>
      • max(h2, h3) = h4 = h1
      • [PS] |h2 - h3| <= 1

❗ Remember again! The judgment always occurs at the first detected imbalance during the backtracking process, so all subtrees before this are balanced

  • <Adjustment Strategy> Small Left Rotation + Big Right Rotation. Result shown in the below image:
    • Image
    • First perform a small left rotation to convert to LL type, then a big right rotation [LL adjustment strategy]
    • Balance analysis
      • It is clear that the heights of the left and right subtrees depend on h1, h2, h3, h4
      • And the height difference between them does not exceed 1 [one of h2 or h3 may be smaller by 1]

Summary of Adjustment Strategies#

  • Occurs at the first imbalance node during the backtracking phase
  • Key to adjustment: the height relationship of the four subtrees ABCD in the four cases
  • Four cases
    • LL, LR: both ultimately require a big right rotation
      • LL, big right rotation
      • LR, small left rotation first, then big right rotation
    • RL, RR: both ultimately require a big left rotation
      • RL, small right rotation first, then big left rotation
      • RR, big left rotation

[Balanced Binary Search Tree — SBTree]#

  • Image
  • AVL balances through tree height, while SBTree balances through the number of nodes
  • Refer to the learning sequence of AVL: balance condition + balance adjustment [insertion, deletion]

In-Class Exercises#

  • Image
  • The binary search tree formed by the second sequence degenerates into a linked list, and the insertion order will affect the structure
  • Image
  • Image
  • When judging imbalance, backtrack step by step from the inserted node, adjust immediately upon finding an imbalance

Code Demonstration#

Binary Search Tree#

  • Image
  • Image
  • Image
  • Image
  • Image
  • 5 basic operations: initialization, destruction, search, insertion, deletion
  • Add sorting problem [add size field], Top-k problem solution [based on sorting problem]
  • [PS]
    • Finding predecessor has a bug: the predecessor of nodes of degree 1 or 0 may not exist in the subtree, but this scenario does not need to be considered
    • Recursion must consider boundary conditions
    • Learn to use macros to make the code more elegant
    • Analyzing binary trees generally discusses three situations: left, root, right
  • Partial results
  • Image

【Appendix】 Code for randomly generating input operations

  • Image
  • Generate non-proportional operation types

AVL Tree#

Image

Image

Image

Image

  • After insertion and deletion, remember to recalculate the tree height [always remember to maintain the structure definition]
    • During insertion / deletion of nodes + during left rotation / right rotation adjustments
  • ⭐ Introduced NIL nodes to replace NULL🆒[Red-black trees also need to use this]
    • NULL is inaccessible
    • NIL is an actual node that can be accessed
    • 【Facilitates code implementation】
  • LL, LR, and RL, RR the last operations are big right rotations and big left rotations respectively

Additional Knowledge Points#

  • When analyzing binary tree situations, it is generally divided into three cases: left, root, right

Points for Thought#

  • Is the search efficiency of a regular binary search tree necessarily worse than that of an AVL tree?
    • Not necessarily, it depends on the order of the insertion sequence, but the lower limit of the AVL tree is high
    • 【Extension】
      • Tree height = lifespan, number of nodes = wealth of life, different algorithms yield different results
      • Education raises the baseline, not the upper limit; the upper limit depends on ability and luck
        • The upper limit largely depends on an uncontrollable insertion sequence, left to fate

Tips#

  • Prerequisites for learning C++: system programming, network knowledge, algorithm structure, C language
  • The so-called ability to design and analyze algorithms: the ability to classify discussions [divide into several situations] and summarize [combine multiple situations into one]
    • Program = Algorithm + Data Structure
    • Those who improve this ability by practicing problems are gifted individuals
  • Recommended Geek Column — Programming Introductory Course That Everyone Can Learn — a challenge to traditional learning methods
  • Image

Loading...
Ownership of this post data is guaranteed by blockchain and smart contracts to the creator alone.