Article link: https://blog.csdn.net/Coll_Jack/article/details/106265102

The secret of Python's `in` keyword

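To check whether a value exists in a collection, you can use `in`. In everyday use most of us don't give it a second thought; `in` just feels convenient. For example: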

in_list = [1, 2, 3, 4]
if 2 in in_list:
    print('2 is in the list')
else:
    print('2 is not in the list')

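Now consider how this statement works when it checks whether 2 is in the list. Does Python locate 2 immediately, or does it walk the list element by element: look at the first element, 1, then the second, 2, and only then conclude that 2 is present? For a list it is the latter. If the list is long and the value being tested happens to sit at the very end, the search has to run all the way to the last element before it can stop, which is slow.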
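
The element-by-element search described above can be sketched in plain Python. This is only an illustration of the behavior, not how CPython implements it internally (the real loop is written in C); the helper name `linear_contains` and the "elements examined" counter are made up for this example.

```python
def linear_contains(items, target):
    """Roughly how `target in items` behaves for a list or tuple."""
    for position, element in enumerate(items, start=1):
        # Python checks identity first, then equality.
        if element is target or element == target:
            return True, position  # found, plus how many elements were examined
    return False, len(items)  # not found: every element was examined


in_list = [1, 2, 3, 4]
found, examined = linear_contains(in_list, 2)
print(found, examined)  # True 2
```

The second return value makes the cost visible: a value at the end of the list, or one that is missing, forces a scan of every element.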

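What matters here is the time complexity of an `in` lookup, which depends on the data structure being searched. If you use `in` without paying attention to the container, the cost is negligible when the collection is small, but when it is large, say 100,000 items, the wasted time adds up. If you know the `in` lookup complexity of each data structure, you can store large data sets in a suitable container and cut lookup time substantially.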
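
As a rough illustration of choosing a suitable container, the sketch below builds a set once and then runs every membership test against it. The function name `count_hits` and the sample data are hypothetical, and the approach assumes the elements are hashable.

```python
def count_hits(queries, data):
    """Count how many queries occur in data, using a set for the lookups."""
    lookup = set(data)  # one O(n) pass to build; elements must be hashable
    # Each `q in lookup` is a hash lookup, O(1) on average.
    return sum(1 for q in queries if q in lookup)


records = list(range(100_000))
queries = [5, 99_999, 100_000, -1]
print(count_hits(queries, records))  # 2
```

Building the set costs one pass over the data, so this pays off when the same collection is queried many times; for a single lookup, `in` on the original list is just as good.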

The data structures we use most often are tuples, lists, sets, and dicts. Let's look at how `in` lookups differ between them:

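""" Test the lookup efficiency of in across different data structures

    Time complexity in a list:  O(n)
    Time complexity in a tuple: O(n)
    Time complexity in a set:   O(1) on average
    Time complexity in a dict:  O(1) on average (key lookup)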

    the statistics of this file:
    lines(count)    understand_level(h/m/l)    classes(count)    functions(count)    fields(count)
    000000000057    ----------------------l    00000000000000    0000000000000004    ~~~~~~~~~~~~~
"""

import time

import line_profiler  # third-party package: pip install line_profiler


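def list_in_test():
    list_test = list(range(1000000))
    # 1000 sits near the front, so each scan stops after 1001 elements.
    for _ in range(100):
        if 1000 in list_test:
            ...


def set_in_test():
    set_test = set(range(1000000))
    # Set membership is a hash lookup, independent of the element's position.
    for _ in range(100):
        if 1000 in set_test:
            ...


def tuple_in_test():
    tuple_test = tuple(range(1000000))
    # Tuples are scanned element by element, like lists.
    for _ in range(100):
        if 1000 in tuple_test:
            ...


def dict_in_test():
    dict_test = dict.fromkeys(range(1000000), 1)
    # `in` on a dict tests the keys, again via a hash lookup.
    for _ in range(100):
        if 1000 in dict_test:
            ...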


if __name__ == '__main__':
    print(f'Current time: {time.ctime()}')
    lp = line_profiler.LineProfiler()
    # Calling the profiler on a function returns a wrapped, profiled version of it.
    lp_list_in_test = lp(list_in_test)
    lp_list_in_test()
    lp_set_in_test = lp(set_in_test)
    lp_set_in_test()
    lp_tuple_in_test = lp(tuple_in_test)
    lp_tuple_in_test()
    lp_dict_in_test = lp(dict_in_test)
    lp_dict_in_test()
    lp.print_stats()
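
If line_profiler is not installed, a similar comparison can be sketched with the standard library's timeit module. This is not the script above, just a minimal alternative under a few assumptions: the helper names `time_membership` and `compare` are made up, and the target is deliberately the last element so that the list and tuple hit their worst case (the script above looks up 1000, which is near the front).

```python
import timeit


def time_membership(container, target, repeat):
    """Return the total seconds spent on `repeat` membership tests."""
    return timeit.timeit(lambda: target in container, number=repeat)


def compare(n, repeat):
    """Time `in` on a list, tuple, set and dict holding range(n)."""
    target = n - 1  # last element: worst case for list and tuple
    containers = {
        'list': list(range(n)),
        'tuple': tuple(range(n)),
        'set': set(range(n)),
        'dict': dict.fromkeys(range(n), 1),
    }
    return {name: time_membership(c, target, repeat)
            for name, c in containers.items()}


if __name__ == '__main__':
    for name, seconds in compare(n=100_000, repeat=100).items():
        print(f'{name}: {seconds:.6f} s')
```

The absolute numbers depend on the machine, so none are shown here; what to look for is the gap between the list/tuple rows and the set/dict rows, which widens as n grows.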