I have long known that mixing threads and processes in Python can deadlock, because a forked child process copies the parent's lock state. Below is the reproduction code quoted from https://stackoverflow.com/questions/24509650/deadlock-with-logging-multiprocess-multithread-python-script:
import multiprocessing
import threading
import logging
import time

SLEEP_TIME = 0.000001

logger = logging.getLogger()
ch = logging.StreamHandler()
ch.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(funcName)s(): %(message)s'))
ch.setLevel(logging.DEBUG)
logger.setLevel(logging.DEBUG)
logger.addHandler(ch)


class LoggingThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        while True:
            logger.debug('LoggingThread: {}'.format(self))
            time.sleep(SLEEP_TIME)


def action(i):
    logger.debug('action: {}'.format(i))


def do_parallel_job():
    processes = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=processes)
    for i in range(20):
        pool.apply_async(action, args=(i,))
    pool.close()
    pool.join()


if __name__ == '__main__':
    logger.debug('START')

    #
    # multithread part
    #
    for _ in range(10):
        lt = LoggingThread()
        lt.setDaemon(True)
        lt.start()

    #
    # multiprocess part
    #
    do_parallel_job()

    logger.debug('FINISH')
The explanation given there is:

This is probably bug 6721.

The problem is common in any situation where you have locks, threads and forks. If thread 1 had a lock while thread 2 calls fork, in the forked process, there will only be thread 2 and the lock will be held forever. In your case, that is logging.StreamHandler.lock. A fix can be found here (permalink) for the logging module. Note that you need to take care of any other locks, too.
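Beyond patching the logging module itself, one common mitigation is to avoid fork() for the worker pool altogether. A minimal sketch, assuming Python 3.4+ (the 'spawn' start method launches each worker as a fresh interpreter, so no fork-copied lock state is inherited):

import multiprocessing


def action(i):
    print('action: {}'.format(i))


if __name__ == '__main__':
    # 'spawn' starts workers as new interpreters instead of fork()ing,
    # so handler locks held by other threads are never copied into children.
    ctx = multiprocessing.get_context('spawn')
    with ctx.Pool(processes=4) as pool:
        pool.map(action, range(20))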
Today I also ran into a case where logging hung for no obvious reason. The setup was one script launching another script via subprocess:
import subprocess


def main():
    # stderr is redirected into a pipe, but the parent never reads from it
    sb_p = subprocess.Popen(
        ['python', 'test_load.py'],
        shell=False,
        stdin=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    sb_p.wait()
The launched script hung while writing its logs.

After trying all sorts of things, I eventually stumbled onto a pattern: the launched script test_load.py always hung once its total log output reached a fixed size, 4096 bytes.

Tracing it down step by step, I found that if no stream is specified when a StreamHandler is created, it writes the log output to sys.stderr by default:
class StreamHandler(Handler):
    """
    A handler class which writes logging records, appropriately formatted,
    to a stream. Note that this class does not close the stream, as
    sys.stdout or sys.stderr may be used.
    """

    terminator = '\n'

    def __init__(self, stream=None):
        """
        Initialize the handler.

        If stream is not specified, sys.stderr is used.
        """
        Handler.__init__(self)
        if stream is None:
            stream = sys.stderr
        self.stream = stream
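For reference, here is a minimal pair of scripts that reproduces the same symptom (the contents of the child are an assumption, since the original test_load.py is not shown): the child logs to the default sys.stderr, the parent captures stderr into a pipe but only calls wait(), and the child stalls once roughly one pipe buffer of output has accumulated.

# child.py -- hypothetical stand-in for test_load.py; log output goes to the
# default StreamHandler target, sys.stderr.
import logging

logging.basicConfig(level=logging.DEBUG)
for i in range(1000):
    logging.debug('line %d: %s', i, 'x' * 80)

# parent.py -- captures the child's stderr into a pipe but never reads it,
# so the child blocks in write() once the pipe buffer is full and wait()
# never returns.
import subprocess

p = subprocess.Popen(['python', 'child.py'], stderr=subprocess.PIPE)
p.wait()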
When the script is launched via subprocess, the stderr=subprocess.PIPE argument sends all of the log output into a pipe buffer. On Linux, the default capacity of an anonymous pipe is 4 KB or 64 KB; from pipe(7):

In Linux versions before 2.6.11, the capacity of a pipe was the same as the system page size (e.g., 4096 bytes on i386). Since Linux 2.6.11, the pipe capacity is 16 pages (i.e., 65,536 bytes in a system with a page size of 4096 bytes). Since Linux 2.6.35, the default pipe capacity is 16 pages, but the capacity can be queried and set using the fcntl(2) F_GETPIPE_SZ and F_SETPIPE_SZ operations.

I could not find a documented figure for Windows 10; judging from the test result, it appears to be 4 KB.
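On Linux the capacity can also be checked directly from Python; a minimal sketch (F_GETPIPE_SZ is only exported by the fcntl module on newer Python versions, so the Linux-specific raw command value 1032 is used as a fallback):

import fcntl
import os

# F_GETPIPE_SZ is a Linux-specific fcntl command (value 1032); the fcntl
# module only exposes the named constant on Python 3.10+.
F_GETPIPE_SZ = getattr(fcntl, 'F_GETPIPE_SZ', 1032)

r, w = os.pipe()
print(fcntl.fcntl(r, F_GETPIPE_SZ))  # typically 65536 on a modern Linux kernel
os.close(r)
os.close(w)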
Once the pipe buffer is full, every subsequent write blocks, which is exactly the hang observed above.
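Two straightforward ways to avoid the hang in the launcher script, as a sketch (keeping the same test_load.py child as above):

import subprocess

# Option 1: actually drain the pipe. communicate() reads stderr until EOF,
# so the child can never fill the buffer.
p = subprocess.Popen(
    ['python', 'test_load.py'],
    stdin=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
_, err = p.communicate()

# Option 2: if the output is not needed, do not capture it into a pipe at all.
subprocess.run(['python', 'test_load.py'], stderr=subprocess.DEVNULL)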
It is worth mentioning that the write-up below on GitHub led me down the wrong path while locating this problem, but following its line of reasoning did deepen my understanding of the logging module:

https://github.com/evanj/app-engine-log-deadlock
Thread/Request A:

- Calls logging.info, which acquires the logging.Handler lock.
- logging.info eventually calls the App Engine log hook: AppLogsHandler.emit.
- AppLogsHandler.emit calls logservice.write_record, which decides to flush the log.
- logservice.py:382 (in _LogsDequeBuffer._flush) is: bytes_left -= 1 + group.lengthString(line.ByteSize()). The call to group.lengthString is a Cython function on App Engine Standard. For some reason, it requires the Python module import lock so it blocks.

Thread/Request B:

- Calls import x: acquiring the Python import lock and calling the module code.
- Module x (or something transitively executed by it) calls logging.info().
- This tries to acquire the lock on the logging.Handler installed by App Engine and blocks.
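The underlying mechanism is a classic lock-ordering cycle between the handler lock and the import lock. A stripped-down sketch with two plain threading.Lock objects standing in for them (names and timing purely illustrative) shows the shape of the deadlock:

import threading
import time

handler_lock = threading.Lock()  # stands in for logging.Handler.lock
import_lock = threading.Lock()   # stands in for the module import lock


def thread_a():
    with handler_lock:           # logging.info acquires the handler lock...
        time.sleep(0.1)
        with import_lock:        # ...and then ends up needing the import lock
            pass


def thread_b():
    with import_lock:            # import x acquires the import lock...
        time.sleep(0.1)
        with handler_lock:       # ...and module-level code calls logging.info
            pass


a = threading.Thread(target=thread_a)
b = threading.Thread(target=thread_b)
a.start(); b.start()
a.join(); b.join()               # never returns: each thread waits on the other's lock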