Skip to content

第25章 算法设计范式

学习目标

  • 掌握分治算法的设计模式与主定理
  • 理解动态规划的最优子结构与状态转移
  • 掌握贪心算法的正确性证明方法
  • 理解各范式的适用条件与选择策略

25.1 分治算法

25.1.1 形式化定义

定义 25.1(分治策略) 分治策略将问题分解为若干规模较小的子问题,递归求解子问题,然后合并子问题的解。

算法 25.1(分治框架)

Divide-and-Conquer(P):
    if |P| ≤ base_case:
        return Solve-Directly(P)
    subproblems = Divide(P)
    solutions = [Divide-and-Conquer(sub) for sub in subproblems]
    return Combine(solutions)

25.1.2 主定理

定理 25.1(主定理) 设递推关系为:

$$T(n) = aT\left(\frac{n}{b}\right) + f(n)$$

其中 $a \geq 1$,$b > 1$。令 $c_{\text{crit}} = \log_b a$。

  1. 若 $f(n) = O(n^c)$ 且 $c < c_{\text{crit}}$,则 $T(n) = \Theta(n^{c_{\text{crit}}})$
  2. 若 $f(n) = \Theta(n^{c_{\text{crit}}} \log^k n)$,则 $T(n) = \Theta(n^{c_{\text{crit}}} \log^{k+1} n)$
  3. 若 $f(n) = \Omega(n^c)$ 且 $c > c_{\text{crit}}$,且满足正则条件,则 $T(n) = \Theta(f(n))$

证明概要:使用递归树方法。递归树有 $\log_b n$ 层,第 $i$ 层有 $a^i$ 个节点,每个节点代价为 $f(n/b^i)$。总代价为:

$$T(n) = \sum_{i=0}^{\log_b n} a^i f\left(\frac{n}{b^i}\right)$$

分析 $f(n)$ 与 $n^{c_{\text{crit}}}$ 的相对大小,得到三种情况。 ∎

25.1.3 应用实例

归并排序:$T(n) = 2T(n/2) + \Theta(n)$

由主定理,$a=2$,$b=2$,$c_{\text{crit}}=1$,$f(n)=\Theta(n)=\Theta(n^{c_{\text{crit}}})$。

属于情况2,$T(n) = \Theta(n \log n)$。

二分查找:$T(n) = T(n/2) + \Theta(1)$

$a=1$,$b=2$,$c_{\text{crit}}=0$,$f(n)=\Theta(1)=\Theta(n^0)$。

属于情况2,$T(n) = \Theta(\log n)$。

25.1.4 Python实现

python
from typing import List, TypeVar, Callable, Tuple
import math

T = TypeVar('T')

def merge_sort(arr: List[T]) -> List[T]:
    if len(arr) <= 1:
        return arr[:]
    
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    
    return _merge(left, right)

def _merge(left: List[T], right: List[T]) -> List[T]:
    result = []
    i = j = 0
    
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    
    result.extend(left[i:])
    result.extend(right[j:])
    return result

def quick_select(arr: List[int], k: int) -> int:
    def select(left: int, right: int, k: int) -> int:
        if left == right:
            return arr[left]
        
        pivot_idx = partition(left, right)
        
        if k == pivot_idx:
            return arr[pivot_idx]
        elif k < pivot_idx:
            return select(left, pivot_idx - 1, k)
        else:
            return select(pivot_idx + 1, right, k)
    
    def partition(left: int, right: int) -> int:
        pivot = arr[right]
        i = left
        for j in range(left, right):
            if arr[j] <= pivot:
                arr[i], arr[j] = arr[j], arr[i]
                i += 1
        arr[i], arr[right] = arr[right], arr[i]
        return i
    
    return select(0, len(arr) - 1, k)

def closest_pair(points: List[Tuple[float, float]]) -> float:
    def distance(p1: Tuple[float, float], p2: Tuple[float, float]) -> float:
        return math.sqrt((p1[0] - p2[0])**2 + (p1[1] - p2[1])**2)
    
    def brute_force(pts: List[Tuple[float, float]]) -> float:
        min_dist = float('inf')
        for i in range(len(pts)):
            for j in range(i + 1, len(pts)):
                min_dist = min(min_dist, distance(pts[i], pts[j]))
        return min_dist
    
    def closest_recursive(px: List[Tuple[float, float]], py: List[Tuple[float, float]]) -> float:
        n = len(px)
        
        if n <= 3:
            return brute_force(px)
        
        mid = n // 2
        mid_point = px[mid]
        
        left_x = px[:mid]
        right_x = px[mid:]
        
        left_y = [p for p in py if p[0] <= mid_point[0]]
        right_y = [p for p in py if p[0] > mid_point[0]]
        
        dl = closest_recursive(left_x, left_y)
        dr = closest_recursive(right_x, right_y)
        d = min(dl, dr)
        
        strip = [p for p in py if abs(p[0] - mid_point[0]) < d]
        
        for i in range(len(strip)):
            for j in range(i + 1, min(i + 7, len(strip))):
                d = min(d, distance(strip[i], strip[j]))
        
        return d
    
    px = sorted(points, key=lambda p: p[0])
    py = sorted(points, key=lambda p: p[1])
    
    return closest_recursive(px, py)

def matrix_multiply(A: List[List[int]], B: List[List[int]]) -> List[List[int]]:
    n = len(A)
    
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    
    mid = n // 2
    
    def add(X, Y):
        return [[X[i][j] + Y[i][j] for j in range(len(X[0]))] for i in range(len(X))]
    
    def subtract(X, Y):
        return [[X[i][j] - Y[i][j] for j in range(len(X[0]))] for i in range(len(X))]
    
    A11 = [row[:mid] for row in A[:mid]]
    A12 = [row[mid:] for row in A[:mid]]
    A21 = [row[:mid] for row in A[mid:]]
    A22 = [row[mid:] for row in A[mid:]]
    
    B11 = [row[:mid] for row in B[:mid]]
    B12 = [row[mid:] for row in B[:mid]]
    B21 = [row[:mid] for row in B[mid:]]
    B22 = [row[mid:] for row in B[mid:]]
    
    M1 = matrix_multiply(add(A11, A22), add(B11, B22))
    M2 = matrix_multiply(add(A21, A22), B11)
    M3 = matrix_multiply(A11, subtract(B12, B22))
    M4 = matrix_multiply(A22, subtract(B21, B11))
    M5 = matrix_multiply(add(A11, A12), B22)
    M6 = matrix_multiply(subtract(A11, A21), add(B11, B12))
    M7 = matrix_multiply(subtract(A12, A22), add(B21, B22))
    
    C11 = add(subtract(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = subtract(add(add(M1, M3), M6), M2)
    
    C = [[0] * n for _ in range(n)]
    for i in range(mid):
        for j in range(mid):
            C[i][j] = C11[i][j]
            C[i][j + mid] = C12[i][j]
            C[i + mid][j] = C21[i][j]
            C[i + mid][j + mid] = C22[i][j]
    
    return C

25.2 动态规划

25.2.1 形式化定义

定义 25.2(最优子结构) 问题具有最优子结构,若其最优解包含子问题的最优解。

定义 25.3(重叠子问题) 问题具有重叠子问题,若递归求解时相同子问题被多次计算。

定义 25.4(动态规划) 动态规划是一种通过存储子问题的解来避免重复计算的算法设计技术。

25.2.2 设计步骤

  1. 刻画最优解结构:描述最优解的形式
  2. 递归定义最优解:建立状态转移方程
  3. 计算最优解:自底向上或带备忘录的自顶向下
  4. 构造最优解:根据需要回溯构造解

25.2.3 经典问题

定理 25.2(背包问题最优子结构) 0-1背包问题的最优解具有最优子结构。

证明:设最优解包含物品 $i$,则剩余容量为 $W-w_i$ 时,剩余物品的选择必须是最优的。否则,可用更优的选择替换,得到更好的解,矛盾。 ∎

状态转移方程

$$dp[i][w] = \max(dp[i-1][w], dp[i-1][w-w_i] + v_i)$$

25.2.4 Python实现

python
from typing import List, Tuple, Optional
import math

def fibonacci_dp(n: int) -> int:
    if n <= 1:
        return n
    
    prev, curr = 0, 1
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    
    return curr

def fibonacci_matrix(n: int) -> int:
    def matrix_mult(A: List[List[int]], B: List[List[int]]) -> List[List[int]]:
        return [
            [A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]
        ]
    
    def matrix_pow(M: List[List[int]], n: int) -> List[List[int]]:
        result = [[1, 0], [0, 1]]
        while n > 0:
            if n % 2 == 1:
                result = matrix_mult(result, M)
            M = matrix_mult(M, M)
            n //= 2
        return result
    
    if n <= 1:
        return n
    
    F = [[1, 1], [1, 0]]
    result = matrix_pow(F, n - 1)
    return result[0][0]

def knapsack_01(weights: List[int], values: List[int], capacity: int) -> int:
    n = len(weights)
    dp = [0] * (capacity + 1)
    
    for i in range(n):
        for w in range(capacity, weights[i] - 1, -1):
            dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
    
    return dp[capacity]

def knapsack_complete(weights: List[int], values: List[int], capacity: int) -> int:
    n = len(weights)
    dp = [0] * (capacity + 1)
    
    for i in range(n):
        for w in range(weights[i], capacity + 1):
            dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
    
    return dp[capacity]

def longest_common_subsequence(text1: str, text2: str) -> int:
    m, n = len(text1), len(text2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if text1[i-1] == text2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    
    return dp[m][n]

def edit_distance(word1: str, word2: str) -> int:
    m, n = len(word1), len(word2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1])
    
    return dp[m][n]

def longest_increasing_subsequence(nums: List[int]) -> int:
    if not nums:
        return 0
    
    tails = []
    
    for num in nums:
        left, right = 0, len(tails)
        while left < right:
            mid = (left + right) // 2
            if tails[mid] < num:
                left = mid + 1
            else:
                right = mid
        
        if left == len(tails):
            tails.append(num)
        else:
            tails[left] = num
    
    return len(tails)

def coin_change(coins: List[int], amount: int) -> int:
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    
    for i in range(1, amount + 1):
        for coin in coins:
            if coin <= i:
                dp[i] = min(dp[i], dp[i - coin] + 1)
    
    return dp[amount] if dp[amount] != float('inf') else -1

def matrix_chain_order(dims: List[int]) -> Tuple[int, List[List[int]]]:
    n = len(dims) - 1
    m = [[0] * n for _ in range(n)]
    s = [[0] * n for _ in range(n)]
    
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            m[i][j] = float('inf')
            for k in range(i, j):
                cost = m[i][k] + m[k+1][j] + dims[i] * dims[k+1] * dims[j+1]
                if cost < m[i][j]:
                    m[i][j] = cost
                    s[i][j] = k
    
    return m[0][n-1], s

def optimal_bst(keys: List[str], freq: List[float]) -> float:
    n = len(keys)
    dp = [[0.0] * n for _ in range(n)]
    prefix_sum = [0.0] * (n + 1)
    
    for i in range(n):
        prefix_sum[i+1] = prefix_sum[i] + freq[i]
    
    for i in range(n):
        dp[i][i] = freq[i]
    
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            dp[i][j] = float('inf')
            
            for r in range(i, j + 1):
                left = dp[i][r-1] if r > i else 0
                right = dp[r+1][j] if r < j else 0
                cost = left + right + prefix_sum[j+1] - prefix_sum[i]
                dp[i][j] = min(dp[i][j], cost)
    
    return dp[0][n-1]

25.3 贪心算法

25.3.1 形式化定义

定义 25.5(贪心选择性质) 问题具有贪心选择性质,若可以通过局部最优选择构造全局最优解。

定义 25.6(贪心算法) 贪心算法在每一步选择当前最优的决策,期望通过局部最优达到全局最优。

25.3.2 正确性证明方法

定理 25.3(贪心正确性证明框架)

  1. 贪心选择性质:证明存在最优解包含贪心选择
  2. 最优子结构:证明做出贪心选择后,剩余子问题的最优解与贪心选择组合得到原问题的最优解

证明技巧

  • 交换论证:将任意最优解转换为贪心解,证明代价不增
  • 割差论证:分析贪心解与最优解的差异

25.3.3 经典问题

定理 25.4(活动选择贪心正确性) 按结束时间排序后,贪心选择最早结束的活动可得到最优解。

证明:设贪心选择的活动为 $a_1$,最优解的第一个活动为 $a'_1$。由于贪心选择最早结束的活动,$a_1$ 的结束时间不晚于 $a'_1$。将最优解中的 $a'_1$ 替换为 $a_1$,仍保持相容性,且活动数不变。 ∎

25.3.4 Python实现

python
from typing import List, Tuple

def activity_selection(start: List[int], end: List[int]) -> List[int]:
    activities = sorted(range(len(start)), key=lambda i: end[i])
    
    result = [activities[0]]
    last_end = end[activities[0]]
    
    for i in activities[1:]:
        if start[i] >= last_end:
            result.append(i)
            last_end = end[i]
    
    return result

def fractional_knapsack(weights: List[int], values: List[int], capacity: int) -> float:
    items = sorted(range(len(weights)), key=lambda i: values[i] / weights[i], reverse=True)
    
    total_value = 0.0
    remaining = capacity
    
    for i in items:
        if remaining <= 0:
            break
        
        take = min(weights[i], remaining)
        total_value += take * (values[i] / weights[i])
        remaining -= take
    
    return total_value

def huffman_coding(freq: List[Tuple[str, int]]) -> dict:
    import heapq
    
    heap = [(f, i, char) for i, (char, f) in enumerate(freq)]
    heapq.heapify(heap)
    
    codes = {char: '' for char, _ in freq}
    
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        
        if isinstance(c1, str):
            for char in c1:
                codes[char] = '0' + codes[char]
        else:
            for char in c1:
                codes[char] = '0' + codes[char]
        
        if isinstance(c2, str):
            for char in c2:
                codes[char] = '1' + codes[char]
        else:
            for char in c2:
                codes[char] = '1' + codes[char]
        
        merged = c1 if isinstance(c1, set) else {c1}
        merged |= c2 if isinstance(c2, set) else {c2}
        
        heapq.heappush(heap, (f1 + f2, len(heap), merged))
    
    return codes

def job_scheduling_deadline(jobs: List[Tuple[int, int]]) -> List[int]:
    jobs = sorted(enumerate(jobs), key=lambda x: -x[1][1])
    
    max_deadline = max(d for _, (p, d) in jobs)
    slots = [False] * (max_deadline + 1)
    result = []
    
    for idx, (profit, deadline) in jobs:
        for slot in range(deadline, 0, -1):
            if not slots[slot]:
                slots[slot] = True
                result.append(idx)
                break
    
    return result

def minimum_spanning_tree_kruskal(n: int, edges: List[Tuple[int, int, int]]) -> List[Tuple[int, int]]:
    edges = sorted(edges, key=lambda e: e[2])
    
    parent = list(range(n))
    rank = [0] * n
    
    def find(x):
        if parent[x] != x:
            parent[x] = find(parent[x])
        return parent[x]
    
    def union(x, y):
        px, py = find(x), find(y)
        if px == py:
            return False
        if rank[px] < rank[py]:
            px, py = py, px
        parent[py] = px
        if rank[px] == rank[py]:
            rank[px] += 1
        return True
    
    mst = []
    for u, v, w in edges:
        if union(u, v):
            mst.append((u, v))
            if len(mst) == n - 1:
                break
    
    return mst

def dijkstra_shortest_path(n: int, edges: List[Tuple[int, int, int]], start: int) -> List[int]:
    import heapq
    
    graph = [[] for _ in range(n)]
    for u, v, w in edges:
        graph[u].append((v, w))
    
    dist = [float('inf')] * n
    dist[start] = 0
    
    pq = [(0, start)]
    
    while pq:
        d, u = heapq.heappop(pq)
        
        if d > dist[u]:
            continue
        
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                heapq.heappush(pq, (dist[v], v))
    
    return dist

25.4 算法范式比较

25.4.1 特性对比

范式核心思想适用条件复杂度
分治分解、求解、合并子问题独立主定理
动态规划存储子问题解最优子结构、重叠子问题状态空间
贪心局部最优选择贪心选择性质通常线性或对数

25.4.2 选择策略

问题特征推荐范式
子问题独立分治
子问题重叠 + 最优子结构动态规划
贪心选择性质贪心
需要枚举所有解回溯

25.5 本章小结

本章介绍了算法设计范式:

  1. 分治:分解问题,递归求解,合并结果
  2. 动态规划:存储子问题解,避免重复计算
  3. 贪心:局部最优选择,需证明正确性
  4. 选择策略:根据问题特征选择合适范式

参考文献

  1. Cormen, T. H., et al. (2009). Introduction to Algorithms, 3rd ed. MIT Press.
  2. Kleinberg, J., & Tardos, E. (2005). Algorithm Design. Pearson.
  3. Bentley, J. L., Haken, D., & Saxe, J. B. (1980). A general method for solving divide-and-conquer recurrences. SIGACT News, 12(3), 36-44.
  4. Bellman, R. (1957). Dynamic Programming. Princeton University Press.

下一章:第26章 复杂度分析与优化

Python技术丛书 - 江苏省宿城中等专业学校