INT202 Complexity of Algorithms: Algorithms

1. Algorithms and Data Structures

2. Algorithm Analysis

2.1 Experimental Analysis of Algorithms

2.2 Theoretical Analysis

3. Pseudo-code

4. Primitive Operations

5. Recursive Algorithms

6. Asymptotic Notation

7. “Big-Oh” Notation

8. Asymptotic Algorithm Analysis

9. Space Complexity

1. Algorithms and Data Structures

- A data structure is a systematic way of organizing and accessing data.

- An algorithm is a sequence of steps for performing a task in a finite amount of time.

- Examples

· Navigate to a place --> Route-finding algorithm

· Convert audio/video formats --> Compression algorithm

· Control solar panels --> Optimization algorithm

· Generate an image from a 2D/3D model --> Rendering algorithm

2. Algorithm Analysis

- The analysis of algorithms is the theoretical study of computer program performance and resource usage.

· Primary interest: the running time (time complexity) of the algorithm and of the operations on data structures.

· Secondary interest: space (or “memory”) usage (space complexity).

2.1 Experimental Analysis of Algorithms

--> We are interested in how the running time or memory requirement depends on the size of the input.

- Background

· To analyze algorithms, we sometimes perform experiments to observe empirically, for example, the running time.

· Doing so requires a good choice of sample inputs and an appropriate number of tests (so that we can have statistical certainty about our analysis).

· The size could mean the number of vertices and edges if we are operating on a graph, the length of a message we are encoding or decoding, and/or the length of the numbers we are processing, in terms of the bits needed to store them in memory.

· Running time depends on both the size and the instance of the input and on the algorithm used, as well as on the software and hardware environment in which it is run.

> In general, the running time of an algorithm or data structure method increases with the input size.

- Limitations

· Experiments are performed on a limited set of test inputs.

· All tests must be performed using the same hardware and software.

· The algorithm must be implemented and executed.
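An experimental analysis of this kind can be sketched in a few lines. The example below is only an illustration, under the assumption that the algorithm under test is Python's built-in `sorted` and that random floats are acceptable sample inputs; the helper name `time_sorting` and the chosen sizes are hypothetical. It repeats each measurement and keeps the best time, which reduces (but does not remove) noise from the environment.

```python
import random
import time


def time_sorting(sizes, repeats=5):
    """Return (n, best_seconds) pairs for sorting random lists of each size n."""
    results = []
    for n in sizes:
        best = float("inf")
        for _ in range(repeats):
            # A fresh random instance of size n for every repetition.
            data = [random.random() for _ in range(n)]
            start = time.perf_counter()
            sorted(data)  # the algorithm under test (an assumption of this sketch)
            elapsed = time.perf_counter() - start
            best = min(best, elapsed)
        results.append((n, best))
    return results


if __name__ == "__main__":
    for n, seconds in time_sorting([1_000, 10_000, 100_000]):
        print(f"n = {n:>7}: {seconds:.6f} s")
```

The measured times depend on the machine and the Python version, which is exactly the limitation listed above.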

2.2 Theoretical Analysis

- Advantages (over experimental analysis):

· Can take all possible inputs into account.

· Can compare the efficiency of two (or more) algorithms, independent of the hardware/software environment.

· Involves studying high-level descriptions of algorithms (pseudo-code).

- Goal: associate a function f(n) with each algorithm, where f(n) characterizes the running time in terms of some measure of the input size, n.

· Typical functions include: n, log n, n², n log n, 2ⁿ, ...

> e.g., “Algorithm A runs in time proportional to n”.

- Requirements:

· A language for describing algorithms.

· A computational model in which algorithms are executed.

· A metric for measuring performance.

· A way of characterizing performance.

3. Pseudo-code

- Description:

· Pseudo-code is a high-level description language for algorithms.

· Pseudo-code provides a more structured description of algorithms.

· It allows high-level analysis of algorithms to determine their running time (and memory requirements).

· Pseudo-code is a mixture of natural language and a high-level programming language (e.g., Java, C, etc.).

· It describes a generic implementation of data structures or algorithms.

· Pseudo-code includes: expressions, declarations, variable initialization and assignments, conditionals, loops, arrays, method calls, etc.

- Task: give pseudo-code for the function `Minimum-Element`, which finds the minimum element of a set of numbers.

· Instead of the arrow sign (←), an equals sign (=) can also be used.

· Instead of if-then-end if, curly brackets {} can also be used.

· Here, we do not formally define a strict method for giving pseudo-code, but when using it we aim to describe an algorithm in a manner that would allow a competent programmer to translate the pseudo-code into program code without misinterpreting the algorithm.
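One possible translation of the `Minimum-Element` task into program code is sketched below in Python. The function name `minimum_element`, the variable names, and the choice of a list as the input are assumptions of this sketch, and the comments show pseudo-code lines in the arrow style mentioned above; it is not necessarily the pseudo-code given in the original notes.

```python
def minimum_element(numbers):
    """Return the smallest value in a non-empty list of numbers."""
    if not numbers:
        raise ValueError("minimum_element() needs at least one number")
    current_min = numbers[0]          # currentMin <- A[0]
    for i in range(1, len(numbers)):  # for i <- 1 to n - 1 do
        if numbers[i] < current_min:  #   if A[i] < currentMin then
            current_min = numbers[i]  #     currentMin <- A[i]
    return current_min                # return currentMin
```

For example, `minimum_element([7, 3, 9, 3, 12])` returns 3.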

4. Primitive Operations

- We define a set of high-level primitive operations that are largely independent of the programming language used. Primitive operations include the following:

· Assigning a value to a variable

· Calling a method

· Performing an arithmetic operation (for example, adding two numbers)

· Comparing two numbers

· Indexing into an array

· Following an object reference

· Returning from a method

- To analyze the running time of an algorithm, we count the number of primitive operations executed during the course of the algorithm.

4.1 Random Access Machine

- This approach of simply counting primitive operations gives rise to a computational model called the Random Access Machine.

· A CPU is connected to a bank of memory cells.

· Each memory cell can store a number, character, or address.

- Assumption: primitive operations (like a single addition, multiplication, or comparison) require constant time to perform.

· This is not necessarily true; e.g., multiplication takes about four times as long to perform on a computer as addition.

4.2 Average- vs. Worst-Case Complexity

- An algorithm may run faster on some inputs than on others.

· Average-case complexity refers to the running time as an average taken over all inputs of the same size.

· Worst-case complexity refers to the running time as the maximum taken over all inputs of the same size.

- Usually, we are most interested in worst-case complexity.

- An average-case analysis also typically requires that we calculate expected running times based on a given input distribution.
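The difference between the two measures can be illustrated with linear search, which is not discussed in the notes and is used here only as a hypothetical example. The sketch assumes that the target is always present and is equally likely to be at any of the n positions (the "given input distribution"); under that assumption the worst case is n comparisons and the average is (n + 1)/2.

```python
def comparisons_in_linear_search(items, target):
    """Count element comparisons made by a left-to-right linear search."""
    count = 0
    for value in items:
        count += 1
        if value == target:
            break
    return count


def worst_and_average(n):
    """Worst-case and average comparison counts over all n target positions."""
    items = list(range(n))
    # One search per possible target position, each assumed equally likely.
    counts = [comparisons_in_linear_search(items, target) for target in items]
    return max(counts), sum(counts) / n
```

With n = 10, `worst_and_average(10)` gives a worst case of 10 comparisons and an average of 5.5.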

4.3 Counting Primitive Operations

5. Recursive Algorithms

- Recursion involves a procedure calling itself to solve subproblems of a smaller size. These smaller subproblems can then be combined in some way to get a solution to a larger problem.

- Recursive procedures require a base case that can be solved directly without using recursion.

5.1 Recurrence Relations

- Recurrence relations sometimes allow us to define the running time of an algorithm in the form of an equation.

- Suppose that T(n) denotes the running time of an algorithm on an input of size n. Then we might be able to characterize T(n) in terms of, say, T(n − 1). For example, we might be able to show that:

- Ideally, given such a relationship, we would then want to express this recurrence relation in closed form.
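As an illustration of going from a recurrence to a closed form, consider a hypothetical algorithm that does a constant amount of work c and then calls itself on an input that is one element smaller. The recurrence and the constant c below are assumptions chosen for illustration and are not necessarily the example used in the original notes. Unrolling the recurrence step by step gives the closed form:

```latex
T(n) =
\begin{cases}
  c          & \text{if } n = 1,\\
  T(n-1) + c & \text{if } n \ge 2,
\end{cases}
\qquad\Longrightarrow\qquad
T(n) = T(n-1) + c = T(n-2) + 2c = \dots = T(1) + (n-1)\,c = c\,n .
```

So the running time of such an algorithm would be proportional to n.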

- Recurrence relations may appear in many forms. Some examples include:

· C(n) = 3 · C(n − 1) + 2 · C(n − 2) + C(n − 3), where C(1) = 1, C(2) = 3, C(3) = 5

· The Fibonacci numbers
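A recurrence such as the one for C(n) above can be evaluated directly by working upward from the initial values. The following is a minimal sketch; the function name `c` and the sliding-window approach are choices made for this example.

```python
def c(n):
    """Evaluate C(n) = 3*C(n-1) + 2*C(n-2) + C(n-3) bottom-up for n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    initial = [1, 3, 5]  # C(1), C(2), C(3)
    if n <= 3:
        return initial[n - 1]
    c1, c2, c3 = initial  # sliding window holding C(k-3), C(k-2), C(k-1)
    for _ in range(4, n + 1):
        c1, c2, c3 = c2, c3, 3 * c3 + 2 * c2 + c1
    return c3
```

For instance, C(4) = 3 · 5 + 2 · 3 + 1 = 22 and C(5) = 3 · 22 + 2 · 5 + 3 = 79.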

5.2 Recursion Example: Fibonacci Numbers

- The Fibonacci numbers are defined as the sequence f₁ = f₂ = 1 and, for n ≥ 3, fₙ = fₙ₋₁ + fₙ₋₂. They can be computed recursively, directly from this definition.

- Problem: write a piece of pseudo-code to compute Fibonacci numbers, e.g. for n = 50.

- The first terms of the Fibonacci sequence are: 1, 1, 2, 3, 5, 8, 13, 21, 34
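A direct transcription of the recursive definition could look like the sketch below. It is written in Python instead of pseudo-code, and the function name `fibonacci` is an assumption of this example.

```python
def fibonacci(n):
    """Return the n-th Fibonacci number (f_1 = f_2 = 1) by direct recursion."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n <= 2:  # base cases, solved without recursion
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)
```

This version is only practical for small n: evaluating `fibonacci(50)` this way would take tens of billions of recursive calls.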

5.3 Limitations of Recursive Algorithms

- While recursive algorithms are often “simpler” to write than nonrecursive versions, there are often reasons to avoid them.

- In many situations, the smaller subproblems might be solved repeatedly during the execution of the recursive algorithm.

· For example, to compute Fibonacci(n), we must compute Fibonacci(n − 1) and Fibonacci(n − 2). Computing Fibonacci(n − 1) in turn requires Fibonacci(n − 2) and Fibonacci(n − 3), while computing Fibonacci(n − 2) requires Fibonacci(n − 3) and Fibonacci(n − 4), and so on. This repetition of work can massively increase the overall running time of the algorithm.

· A recursive algorithm could also be impossible to run on a computer for large input values, because the nested function calls might exhaust the memory of the machine.
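The amount of repeated work can be made visible by counting calls. The sketch below (function names are hypothetical) returns both the Fibonacci number and the number of calls the direct recursion makes, and contrasts it with a simple loop that computes each value once.

```python
def naive_fibonacci_calls(n):
    """Return (f_n, number of calls made) for the direct recursive definition."""
    if n <= 2:
        return 1, 1
    a, calls_a = naive_fibonacci_calls(n - 1)
    b, calls_b = naive_fibonacci_calls(n - 2)
    return a + b, calls_a + calls_b + 1  # +1 for the current call


def iterative_fibonacci(n):
    """Return f_n (n >= 1) with a loop, keeping only the last two values."""
    previous, current = 1, 1  # f_1, f_2
    for _ in range(3, n + 1):
        previous, current = current, previous + current
    return current
```

For n = 20 the recursive version already makes 13529 calls to produce 6765, whereas the loop needs 18 additions. The loop also handles n = 50 immediately, giving 12586269025.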

6. Asymptotic Notation

- Description:

· Asymptotic notation allows us to characterize the main factors affecting running time.

· It is used in a simplified analysis that estimates the number of primitive operations executed, up to a constant factor.

· Such notation lets us compare the running times of two algorithms.

- Importance of asymptotics

· Maximum size allowed for an input instance, for various running times, to be solved in 1 second, 1 minute, and 1 hour, assuming a 1 MHz machine:

· An algorithm with an asymptotically slower running time is beaten in the long run by an algorithm with an asymptotically faster running time.
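Figures of this kind can be recomputed with a short script. The sketch below assumes that a 1 MHz machine executes 10⁶ primitive operations per second and uses three illustrative running times (n, n², and 2ⁿ) that are not necessarily the ones tabulated in the original notes; `max_input_size` is a hypothetical helper.

```python
def max_input_size(cost, budget):
    """Largest integer n >= 1 with cost(n) <= budget, for a non-decreasing cost."""
    if cost(1) > budget:
        return 0
    low, high = 1, 2
    while cost(high) <= budget:  # double until the budget is exceeded
        low, high = high, high * 2
    while high - low > 1:        # binary search: low fits, high does not
        mid = (low + high) // 2
        if cost(mid) <= budget:
            low = mid
        else:
            high = mid
    return low


OPS_PER_SECOND = 10**6  # assumption: one primitive operation per clock cycle
RUNNING_TIMES = [("n", lambda n: n), ("n^2", lambda n: n * n), ("2^n", lambda n: 2**n)]

for label, cost in RUNNING_TIMES:
    sizes = [max_input_size(cost, OPS_PER_SECOND * s) for s in (1, 60, 3600)]
    print(label, sizes)
```

Under these assumptions, going from one second to one hour multiplies the solvable input size by 3600 for the linear algorithm and by 60 for the quadratic one, but only raises it from 19 to 31 for the exponential one.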

6.1 O(n), Ω(n), and Θ(n) Notation

- We say that f(n) is O(g(n)) (big-Oh) if there are real constants c > 0 and n₀ such that:

· f(n) ≤ c · g(n) for all n ≥ n₀.

- We say that f(n) is Ω(g(n)) (big-Omega) if there are real constants c > 0 and n₀ such that:

· f(n) ≥ c · g(n) for all n ≥ n₀.

- We say that f(n) is Θ(g(n)) (Theta) if f(n) is Ω(g(n)) and f(n) is also O(g(n)).
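Candidate constants c and n₀ for the big-Oh definition can be sanity-checked numerically. The helper below is a hypothetical sketch: it tests the inequality only on a finite range, so it can refute a bad choice of constants but cannot prove that a bound holds for all n.

```python
def holds_on_range(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c * g(n) for every integer n with n0 <= n <= n_max."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))


# 7n - 2 <= 7n for all n >= 1, so c = 7 and n0 = 1 witness that 7n - 2 is O(n).
print(holds_on_range(lambda n: 7 * n - 2, lambda n: n, c=7, n0=1))

# n^2 <= 100n fails as soon as n > 100, so these constants do not work.
print(holds_on_range(lambda n: n * n, lambda n: n, c=100, n0=1))
```

The first check succeeds and the second fails, consistent with n² not being O(n): whatever constant c is chosen, n² exceeds c · n once n > c.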

6.2 Intuition for Asymptotic Notation

- Big-Oh

· f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n).

- Big-Omega

· f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n).

- Big-Theta

· f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n).

7. “Big-Oh” Notation

7.1 Definition

- “Big-Oh” notation is probably the most commonly used form of asymptotic notation.

- Given two positive functions f(n) and g(n) (defined on the nonnegative integers), we say f(n) is O(g(n)), written f(n) ∈ O(g(n)), if there are constants c > 0 and n₀ such that f(n) ≤ c · g(n) for all n ≥ n₀.

7.2 Big-Oh and Growth Rate

- Functions ordered by growth rate:

- The big-Oh notation gives an upper bound on the growth rate of a function.

- The statement “f(n) is O(g(n))” means that the growth rate of f(n) is no more than the growth rate of g(n).

- We can use the big-Oh notation to rank functions according to their growth rate.

7.3 Common Functions

- Here is a list of classes of functions that are commonly encountered when analyzing algorithms:

· Constant O(1)

· Logarithmic O(log n)

· Linear O(n)

· Log-linear O(n log n)

· Quadratic O(n²)

· Cubic O(n³)

· Polynomial O(nᵏ), where k is a constant

· Exponential O(aⁿ), a > 1

· Factorial O(n!)
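To get a feel for how differently these classes grow, the sketch below evaluates one representative function per class at a few input sizes. The base-2 logarithm, the choice a = 2 for the exponential class, and the sample sizes are assumptions of this example.

```python
import math

GROWTH_FUNCTIONS = [
    ("1", lambda n: 1),
    ("log n", lambda n: math.log2(n)),
    ("n", lambda n: n),
    ("n log n", lambda n: n * math.log2(n)),
    ("n^2", lambda n: n**2),
    ("n^3", lambda n: n**3),
    ("2^n", lambda n: 2**n),
    ("n!", lambda n: math.factorial(n)),
]


def growth_row(n):
    """Return the value of each listed function at input size n."""
    return [f(n) for _, f in GROWTH_FUNCTIONS]


for n in (4, 16, 64):
    print(n, [f"{value:.3g}" for value in growth_row(n)])
```

For small n the order can differ (at n = 4, 2ⁿ = 16 is still below n³ = 64); the ranking in the list describes behaviour as n grows.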

7.4 Big-Oh Rules

- If f(n) is a polynomial of degree d, then f(n) is O(nᵈ), i.e.:

· Drop lower-order terms

· Drop constant factors

- Use the smallest possible class of functions

· Say “2n is O(n)” instead of “2n is O(n²)”

- Use the simplest expression of the class

· Say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)”
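The polynomial rule can be justified with a short calculation. As a worked example (the polynomial is chosen here for illustration), each lower-order term is bounded by the same multiple of the leading power once n ≥ 1:

```latex
5n^4 + 3n^3 + 2n^2 + 4n + 1
  \;\le\; 5n^4 + 3n^4 + 2n^4 + 4n^4 + n^4
  \;=\; 15\,n^4
  \qquad \text{for all } n \ge 1 .
```

Taking c = 15 and n₀ = 1 in the definition therefore shows that this polynomial is O(n⁴).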

8. Asymptotic Algorithm Analysis

- The asymptotic analysis of an algorithm determines its running time in big-Oh notation.

- To perform the asymptotic analysis:

· We find the worst-case number of primitive operations executed as a function of the input size.

· We express this function with big-Oh notation.

- Example:

· We determine that the algorithm “Maximum-Element(A)” executes at most 7n − 2 primitive operations.

· We say that the algorithm “runs in O(n) time”.
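The 7n − 2 figure depends on how primitive operations are counted. The sketch below instruments a maximum-finding loop under one common textbook convention, which is an assumption here because the original count is not shown: an array index, an assignment, a comparison, an addition, and the return each cost one operation. The function name is hypothetical.

```python
def maximum_element_with_count(a):
    """Return (max of a, primitive operations counted) for a non-empty list a."""
    ops = 0
    current_max = a[0]
    ops += 2  # index a[0], assign to current_max
    i = 1
    ops += 1  # assign to i
    n = len(a)
    while True:
        ops += 1  # compare i < n (performed n times in total)
        if not i < n:
            break
        ops += 2  # index a[i], compare with current_max
        if a[i] > current_max:
            current_max = a[i]
            ops += 2  # index a[i], assign to current_max
        i += 1
        ops += 2  # add 1 to i, assign to i
    ops += 1  # return
    return current_max, ops
```

With this convention an increasing input such as `[1, 2, 3, 4, 5]` is the worst case and costs 2 + 1 + n + 6(n − 1) + 1 = 7n − 2 = 33 operations, while a decreasing input costs 5n = 25. Both counts are linear in n, so the running time is O(n) either way.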

9. Space Complexity

- Space complexity is a measure of the amount of working storage an algorithm needs, that is, how much memory is needed, in the worst case, at any point in the algorithm.

- As with time complexity, we are mostly concerned with how the space needs grow, in big-Oh terms, as the size N of the input problem grows.

- Example 1:

· The code in this example requires 3 units of space for the parameters and 1 for the local variable, and this never changes, so its space complexity is O(1).
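A function with this space profile might look like the first sketch below: three parameters and one local variable, whatever the values passed in. Both functions are hypothetical illustrations and are not the code from the original notes; the second is included only as a contrast, since it builds a list whose length grows with the input.

```python
def weighted_sum(x, y, z):
    """Three parameters plus one local variable: 4 units of space, so O(1)."""
    result = 2 * x + 3 * y + z
    return result


def prefix_sums(values):
    """Builds a new list as long as the input, so extra space grows as O(N)."""
    sums = []
    running_total = 0
    for value in values:
        running_total += value
        sums.append(running_total)
    return sums
```

Here `weighted_sum(1, 2, 3)` uses the same four storage cells for any arguments, while `prefix_sums` needs one extra cell per input element.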

- Example 2: