How do I troubleshoot a "too many open files" error in a Python web server on CentOS?

  •   zeroday · 2016-07-07 16:44:29 +08:00 · 1619 views
    I run a website written with Python, Ansible, and Django on Linux. Every so often it fails with a "too many open files" exception.

    How should I troubleshoot this?
    12 replies    2016-08-18 14:00:11 +08:00
    shyling · #1 · 2016-07-07 16:59:13 +08:00 via iPad
    Raise the limits.
    wzxjohn · #2 · 2016-07-07 16:59:24 +08:00
    ulimit -a
    zeroday (OP) · #3 · 2016-07-07 17:26:44 +08:00
    @wzxjohn

    core file size (blocks, -c) 0
    data seg size (kbytes, -d) unlimited
    scheduling priority (-e) 0
    file size (blocks, -f) unlimited
    pending signals (-i) 126875
    max locked memory (kbytes, -l) 64
    max memory size (kbytes, -m) unlimited
    open files (-n) 200000
    pipe size (512 bytes, -p) 8
    POSIX message queues (bytes, -q) 819200
    real-time priority (-r) 0
    stack size (kbytes, -s) 8192
    cpu time (seconds, -t) unlimited
    max user processes (-u) 126875
    virtual memory (kbytes, -v) unlimited
    file locks (-x) unlimited
    zeroday (OP) · #4 · 2016-07-07 17:26:59 +08:00
    @shyling It is already set to 200000.
    zeroday (OP) · #5 · 2016-07-07 17:27:50 +08:00
    @shyling It is already set to 200000, and it still fills up.
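Since raising the limit to 200000 only postpones the failure, it may help to watch how the descriptor count of the process grows over time. The following is a minimal sketch, assuming a Linux system where /proc/self/fd is available (with /dev/fd as a fallback). The helper names `count_open_fds` and `fd_usage` are made up for illustration and are not part of Django or Ansible.

```python
import os
import resource


def count_open_fds():
    """Return the number of file descriptors currently open in this process."""
    # /proc/self/fd exists on Linux; /dev/fd is a common fallback elsewhere.
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    # On Linux the listing includes the descriptor used to read the
    # directory itself, so subtract it.
    return len(os.listdir(fd_dir)) - 1


def fd_usage():
    """Return (open_count, soft_limit) for the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return count_open_fds(), soft


if __name__ == "__main__":
    used, limit = fd_usage()
    print("open descriptors: %d / soft limit: %d" % (used, limit))
```

Logging the result of `fd_usage()` periodically, for example once per request batch, should show whether the count climbs steadily, which would point to a leak rather than a limit that is simply too low.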
    BOYPT · #6 · 2016-07-07 17:29:50 +08:00
    You have to analyze your own program: find out what is opening the descriptors and what is keeping them from being released.
    ryd994 · #7 · 2016-07-07 18:10:52 +08:00
    Run lsof -p <PID of the python process> to see which files it has open.
    zeroday (OP) · #8 · 2016-08-14 14:45:25 +08:00
    @BOYPT @ryd994

    I checked and found file descriptors for files that have been deleted but never released.

    [root@85-13-112]# lsof | grep deleted
    python 19327 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19332 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19332 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19334 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19334 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19335 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19335 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19336 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19336 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19337 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19337 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19338 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19338 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19339 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19339 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19340 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19340 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19341 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19341 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19342 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19342 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19343 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19343 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19344 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19344 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19345 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19345 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19346 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19346 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19347 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19347 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19348 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19348 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19349 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19349 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19350 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19350 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19351 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19351 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19352 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19352 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19353 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19353 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19354 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19354 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19355 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19355 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19356 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19356 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19357 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19357 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19358 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19358 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19359 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19359 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19360 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19360 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19361 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19361 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19362 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19362 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)
    python 19327 19363 root 8u REG 252,2 0 534767 /tmp/tmp4Hlr9O (deleted)
    python 19327 19363 root 12u REG 252,2 0 534768 /tmp/tmp2TSn03 (deleted)

    [root@85-13-112]# lsof | grep deleted | wc -l
    65
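The same check can be done from Python without lsof. This is a rough sketch, assuming Linux, where each entry in /proc/<pid>/fd is a symlink whose target ends in " (deleted)" once the file has been unlinked. `find_deleted_open_files` is a hypothetical helper name.

```python
import os


def find_deleted_open_files(pid="self"):
    """Map fd number -> link target for open files that were already unlinked."""
    fd_dir = "/proc/%s/fd" % pid
    leaked = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.endswith(" (deleted)"):
            leaked[int(name)] = target
    return leaked


if __name__ == "__main__":
    for fd, target in sorted(find_deleted_open_files().items()):
        print(fd, target)
```

Inspecting another process, for example `find_deleted_open_files(19327)`, needs the same user or root. Unlike the lsof output above, this lists each descriptor once per process rather than once per thread.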
    ryd994 · #9 · 2016-08-14 14:59:57 +08:00 via Android
    @zeroday So close your files as soon as you are done with them.
    BOYPT · #10 · 2016-08-14 22:18:12 +08:00
    @zeroday For example, if the file object returned by tempfile.TemporaryFile is never closed, this is exactly what you end up with.
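A small sketch of the safer pattern, assuming the temporary file is only needed inside one function. `roundtrip` is a hypothetical helper. Using the object as a context manager closes the descriptor even if an exception is raised in the block.

```python
import tempfile


def roundtrip(content):
    """Write bytes to a temporary file, read them back, and close the file."""
    with tempfile.TemporaryFile() as tmp:
        tmp.write(content)
        tmp.seek(0)
        data = tmp.read()
    # Leaving the with-block closed tmp, so its descriptor is released here.
    return data


if __name__ == "__main__":
    print(roundtrip(b"hello"))
```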
    zeroday (OP) · #11 · 2016-08-14 22:32:30 +08:00
    @BOYPT Thanks. I later found this piece of code in ansible as well.

    ```
    def write_file(module, url, dest, content):
        # create a tempfile with some test content
        fd, tmpsrc = tempfile.mkstemp()  # fd is opened here and never closed
        f = open(tmpsrc, 'wb')
        try:
            f.write(content)
        except Exception, err:
            os.remove(tmpsrc)
            module.fail_json(msg="failed to create temporary content file: %s" % str(err))
        f.close()
    ```

    The fd returned by fd, tmpsrc = tempfile.mkstemp() is never closed.
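One possible way to avoid this leak, shown as a sketch rather than the fix Ansible actually shipped, is to wrap the descriptor returned by mkstemp in a file object with os.fdopen, so that closing the file object closes the descriptor too. `write_temp_content` is a hypothetical stand-alone helper; the `module`, `url`, and `dest` parameters of the original function are left out.

```python
import os
import tempfile


def write_temp_content(content):
    """Write bytes to a new temporary file and return its path."""
    fd, tmpsrc = tempfile.mkstemp()
    try:
        # os.fdopen takes ownership of fd; leaving the with-block closes it.
        with os.fdopen(fd, "wb") as f:
            f.write(content)
    except Exception:
        # Clean up the partial file, then let the caller handle the error.
        os.remove(tmpsrc)
        raise
    return tmpsrc
```

The alternative of keeping the separate `open(tmpsrc, 'wb')` call and adding `os.close(fd)` would also work; reusing the descriptor just avoids opening the same file twice.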

    One more question: on the production server I also see descriptors pointing at /dev/null. What might have been left open in that case?
    BOYPT · #12 · 2016-08-18 14:00:11 +08:00
    @zeroday A service process's three special descriptors, stdin/stdout/stderr, often point to /dev/null. They do not need to be closed.
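To confirm that the /dev/null entries are just the standard streams, a sketch like the following could be run inside the process. `points_to_devnull` is a hypothetical helper; it compares the descriptor against /dev/null using os.path.samestat.

```python
import os


def points_to_devnull(fd):
    """Return True if the open descriptor fd refers to /dev/null."""
    try:
        return os.path.samestat(os.fstat(fd), os.stat(os.devnull))
    except OSError:
        # fd is not open in this process.
        return False


if __name__ == "__main__":
    for fd, name in enumerate(("stdin", "stdout", "stderr")):
        print(name, points_to_devnull(fd))
```

If only descriptors 0, 1, and 2 report True, nothing is leaking there. Many additional descriptors on /dev/null would suggest that some code opens it repeatedly without closing it.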