Analysis of Python's built-in standard data structures

imoyao · imoyao · commit 755cf391b6fb · 2019-05-15T18:00:36.000+08:00
diff --git a/Python_stdin_data_structures/dict/code_test.py b/Python_stdin_data_structures/dict/code_test.py
@@ -0,0 +1,15 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+# Created by imoyao at 2019/5/15 11:04
+"""
+Dictionary operations and performance tests
+"""
+import pathlib
+import sys
+import timeit
+
+util_p = pathlib.Path('../..').resolve()
+sys.path.append(str(util_p))
+from util import utils
+
+
diff --git a/Python_stdin_data_structures/dict/readme.md b/Python_stdin_data_structures/dict/readme.md
@@ -0,0 +1,20 @@
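+The second major data structure in `Python` is `dict`. Unlike a `list`, a `dict` is accessed
+by key rather than by index.
+The key point here is that reading and assigning a `dict` entry both take `O(1)` time on average. Another important
+`dict` operation is `in`: checking whether a key exists in a `dict` also takes only `O(1)` time on average.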
+
+## Time complexity of built-in `dict` operations
+
+| Operation          | Description       | Time complexity (average) |
+| ------------------ | ----------------- | ------------------------- |
+| copy               | Copy              | O(n)                      |
+| d[key]             | Get item          | O(1)                      |
+| d[key] = value     | Set item          | O(1)                      |
+| del d[key]         | Delete item       | O(1)                      |
+| key `in` d         | `in` keyword      | O(1)                      |
+| iteration          | Iterate           | O(n)                      |
+
+
+## Further reading
+
+[TimeComplexity](https://wiki.python.org/moin/TimeComplexity)
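
The `O(1)` claim for `in` on a `dict` can be checked with a small timing sketch. It is not part of the commit: the helper name `time_membership` and the chosen sizes are illustrative assumptions, and absolute timings vary by machine. It looks up a key that is absent, so the `list` has to scan every element while the `dict` performs a single hash lookup.

```python
import timeit


def time_membership(n, number=50):
    """Return (list_seconds, dict_seconds) for `number` failed `in` lookups."""
    a_list = list(range(n))
    a_dict = dict.fromkeys(range(n))
    missing = -1  # absent from both, so the list scan visits every element
    list_s = timeit.timeit(lambda: missing in a_list, number=number)
    dict_s = timeit.timeit(lambda: missing in a_dict, number=number)
    return list_s, dict_s


if __name__ == '__main__':
    for n in (10_000, 100_000, 1_000_000):
        list_s, dict_s = time_membership(n)
        print(f'n={n:>9}: list {list_s:.6f} s, dict {dict_s:.6f} s')
```

On a typical run the `list` column should grow roughly in proportion to `n`, while the `dict` column stays about the same.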
diff --git a/Python_stdin_data_structures/list/code_test.py b/Python_stdin_data_structures/list/code_test.py
@@ -0,0 +1,87 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+# Created by imoyao at 2019/5/15 11:04
+"""
+List operations and performance tests
+"""
+import pathlib
+import sys
+import timeit
+
+util_p = pathlib.Path('../..').resolve()
+sys.path.append(str(util_p))
+from util import utils
+
+
+@utils.show_time
+def list_plus(n):
+    a_list = []
+    for i in range(n):
+        a_list += [i]
+    return a_list
+
+
+@utils.show_time
+def list_append(n):
+    a_list = []
+    for i in range(n):
+        a_list.append(i)
+    return a_list
+
+
+@utils.show_time
+def list_expression(n):
+    return [_ for _ in range(n)]
+
+
+# @utils.show_time
+def list_range(n):
+    return list(range(n))
+
+
+if __name__ == '__main__':
+    # t1 = timeit.Timer('list_plus(1000)', 'from __main__ import list_plus')
+    # print(f'{list_plus.__name__} takes {t1.timeit(number=1000)} s.')
+    #
+    # t2 = timeit.Timer('list_append(1000)', 'from __main__ import list_append')
+    # print(f'{list_append.__name__} takes {t2.timeit(number=1000)} s.')
+    #
+    # t3 = timeit.Timer('list_expression(1000)', 'from __main__ import list_expression')
+    # print(f'{list_expression.__name__} takes {t3.timeit(number=1000)} s.')
+    #
+    # t4 = timeit.Timer('list_range(1000)', 'from __main__ import list_range')
+    # print(f'{list_range.__name__} takes {t4.timeit(number=1000)} s.')
+
+    # Above: tests using the timeit module; below: tests using the custom decorator
+
+    n = 1000000
+    list_plus(n)
+    list_append(n)
+    list_expression(n)
+    list_range(n)
+    '''
+    # Output
+    The function **list_plus** takes 0.47864699363708496 time.
+    The function **list_append** takes 0.41912221908569336 time.
+    The function **list_expression** takes 0.14060020446777344 time.
+    The function **list_range** takes 0.06196928024291992 time.
+    '''
+
+    # Compare the time taken by pop() and pop(0)
+
+    n = 10000
+    x = list_range(n)
+    p1 = timeit.Timer('x.pop()', 'from __main__ import x')
+    print(f'list_pop_normal takes {p1.timeit(number=1000)} s.')
+
+    p2 = timeit.Timer('x.pop(0)', 'from __main__ import x')
+    print(f'list_pop_index takes {p2.timeit(number=1000)} s.')
+    '''
+    Comparing the two runs: pop(0) takes longer as the list grows (times are total seconds for 1000 calls)
+    # n = 1000000
+    list_pop_normal takes 0.0006615779711864889 s.
+    list_pop_index takes 1.150215208006557 s.
+    # n = 10000
+    list_pop_normal takes 0.00041822699131444097 s.
+    list_pop_index takes 0.0079622509656474 s.
+    '''
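
The `pop()` comparison above depends on the repository's `util.utils` module and on a shared module-level `x`. A standalone variant might look like the sketch below. The helper name `time_pop` is hypothetical and not part of the commit; it builds a fresh list for every measurement through the `setup` argument of `timeit.timeit`, so the two measurements do not shrink each other's list.

```python
import timeit


def time_pop(n, index=None, number=1000):
    """Return total seconds for `number` pops from a fresh list of length n."""
    if n < number:
        raise ValueError('n must be at least `number`, or the list runs empty')
    stmt = 'x.pop()' if index is None else f'x.pop({index})'
    # setup runs once per call, so every measurement starts from a full list
    return timeit.timeit(stmt, setup=f'x = list(range({n}))', number=number)


if __name__ == '__main__':
    for n in (10_000, 1_000_000):
        print(f'n={n}: pop() {time_pop(n):.6f} s, pop(0) {time_pop(n, 0):.6f} s')
```

`timeit.timeit` returns the total time in seconds for all `number` executions, so the printed values are comparable with the figures recorded above.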
diff --git a/Python_stdin_data_structures/list/readme.md b/Python_stdin_data_structures/list/readme.md
@@ -0,0 +1,35 @@
+`list` is probably one of the most frequently used data structures in everyday `Python` development.
+
+## Time complexity of built-in `list` operations
+
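+| Operation             | Description                                                                | Time complexity |
+| --------------------- | -------------------------------------------------------------------------- | --------------- |
+| x = a_list[index]     | Read an element by index                                                   | O(1)            |
+| a_list[index] = value | Assign to an index                                                         | O(1)            |
+| append(value)         | Add to the end                                                             | O(1)            |
+| pop()                 | Remove from the end                                                        | O(1)            |
+| pop(index)            | Remove the element at an index                                             | O(n)            |
+| insert(index, value)  | Insert an element at an index                                              | O(n)            |
+| iteration             | Iterate over the list                                                      | O(n)            |
+| item `in` a_list      | Membership test (`in` keyword)                                             | O(n)            |
+| slice [x:y]           | Slice; locating x and y is O(1), copying the k items between them is O(k)  | O(k)            |
+| del slice [x:y]       | Delete a slice; the remaining items must then be shifted                   | O(n)            |
+| reverse               | Reverse the list                                                           | O(n)            |
+| sort                  | Sort                                                                       | O(n log n)      |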
+
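+Index access and `append` are two common operations that take the same time no matter how large the list is
+(for `append`, amortized over many calls). When an operation's speed does not depend on the list size, its complexity is `O(1)`.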
+
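+As the list grows, `pop()`, which removes the last element, takes a stable amount of time, while `pop(0)`,
+which removes the first element, takes longer and longer. See the `pop()` comparison at the end of [code_test.py](./code_test.py).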
+
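+Note
+When `pop` removes the last element of the list, the cost is `O(1)`; removing the first element or any element in the
+middle costs `O(n)`. This stark difference comes from the way `Python` implements lists.
+When an element is removed from the front of the list, every element after it is shifted one position toward the front.
+This may look wasteful, but the table above shows the reason: this layout is what lets index access
+run in `O(1)`. That trade-off in running time is a deliberate choice by the designers of `Python`.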
+
+## Further reading
+
+[Python memory analysis: list and array](https://www.cnblogs.com/hellcat/p/8795841.html)
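
The shifting described in the note can be made concrete with a toy model. This is only a sketch: `pop_with_shift_count` is a hypothetical helper, not how CPython implements `list.pop` (which moves the underlying pointer array in C). It removes an element the way a contiguous array has to and counts how many elements move.

```python
def pop_with_shift_count(items, index=-1):
    """Remove items[index] by shifting later elements left; return (value, moves)."""
    if index < 0:
        index += len(items)
    value = items[index]
    moves = 0
    # every element after `index` moves one slot toward the front
    for i in range(index, len(items) - 1):
        items[i] = items[i + 1]
        moves += 1
    del items[-1]  # drop the now-duplicated last slot
    return value, moves


if __name__ == '__main__':
    data = list(range(10))
    print(pop_with_shift_count(data))     # last element: nothing moves
    print(pop_with_shift_count(data, 0))  # first element: all the others move
```

Removing the last element moves nothing, while removing the first moves every remaining element, which matches the `O(1)` versus `O(n)` rows in the table.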