Roughly what is the latency for a CPU to access cache and main memory?

  doraos · 2019-01-02 13:00:19 +08:00 · 2892 views

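    Measured with software such as AIDA64, I get L1 1-2 ns (3-5 clocks), L2 2-3.5 ns (6-10 clocks), L3 10-15 ns (20-50 clocks), and main memory 30-50 ns (60-150 clocks). But some books, for example Modern Operating Systems, say that L1 is accessed with no delay and L2 takes two or three clock cycles, while other books are closer to the figures above.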

    5 replies · 2019-01-02 14:00:07 +08:00
    #1 · xenme · 2019-01-02 13:33:19 +08:00 · ❤️ 1
    https://stackoverflow.com/questions/4087280/approximate-cost-to-access-various-caches-and-main-memory

    Core i7 Xeon 5500 Series Data Source Latency (approximate) [Pg. 22]

    local L1 CACHE hit, ~4 cycles ( 2.1 - 1.2 ns )
    local L2 CACHE hit, ~10 cycles ( 5.3 - 3.0 ns )
    local L3 CACHE hit, line unshared ~40 cycles ( 21.4 - 12.0 ns )
    local L3 CACHE hit, shared line in another core ~65 cycles ( 34.8 - 19.5 ns )
    local L3 CACHE hit, modified in another core ~75 cycles ( 40.2 - 22.5 ns )

    remote L3 CACHE (Ref: Fig.1 [Pg. 5]) ~100-300 cycles ( 160.7 - 30.0 ns )

    local DRAM ~60 ns
    remote DRAM ~100 ns
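
The nanosecond ranges quoted above follow from dividing each cycle count by the clock frequency. A minimal sketch in Python, assuming the two ends of each range correspond to clocks of roughly 1.87 GHz and 3.33 GHz; those frequencies are inferred from the quoted numbers and are not stated in the reply, and `cycles_to_ns` is only an illustrative helper name:

```python
def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a latency in core clock cycles to nanoseconds."""
    # One cycle at f GHz lasts 1/f ns, so N cycles last N/f ns.
    return cycles / clock_ghz


# ASSUMPTION: clock range inferred from the quoted ns ranges, not from the source.
SLOW_GHZ = 1.8667
FAST_GHZ = 3.3333

# Cycle counts copied from the reply above.
latencies_cycles = {
    "L1 hit": 4,
    "L2 hit": 10,
    "L3 hit, line unshared": 40,
    "L3 hit, shared in another core": 65,
    "L3 hit, modified in another core": 75,
}

for name, cycles in latencies_cycles.items():
    slow_ns = cycles_to_ns(cycles, SLOW_GHZ)
    fast_ns = cycles_to_ns(cycles, FAST_GHZ)
    print(f"{name:<34} ~{cycles:>3} cycles  ({slow_ns:.1f} - {fast_ns:.1f} ns)")
```

The same cycle count therefore maps to quite different wall-clock times depending on the clock, which is one reason figures from different books and tools are hard to compare directly.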
    #2 · yanaraika · 2019-01-02 13:41:55 +08:00 · ❤️ 2
    http://instlatx64.atw.hu/ MemLatX64 has more precise data.
    #3 · ryd994 · 2019-01-02 13:45:18 +08:00 via Android · ❤️ 1
    Only registers are accessed directly by instructions.
    #4 · shengyu · 2019-01-02 13:57:59 +08:00 · ❤️ 1
    https://www.7-cpu.com/cpu/Cortex-A57.html

    AMD Opteron A1170 (ARM Cortex-A57), 2.0 GHz, 28 nm. RAM: 16 GB. (Probably it's SoftIron Overdrive 3000 server, DDR3 RDIMM).

    L1 Data cache = 32 KB, 64 B/line, 2-WAY.
    L1 Instruction cache = 48 KB, 64 B/line, 3-WAY.
    L2 Cache = 1 MB (per 2 cores), 64 B/line, 16-WAY.
    L3 Cache = 8 MB (per 8 cores), 64 B/line, ?-WAY.

    L1 Data Cache Latency = 4 cycles for simple access via pointer
    L1 Data Cache Latency = 5 cycles for access with complex address calculation (size_t n, *p; n = p[n]).
    L2 Cache Latency = 18 cycles
    L3 Cache Latency = 60 cycles
    RAM Latency = 60 cycles + 124 ns
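
The "cycles + ns" notation can be turned into a single figure by converting the cycle part at the core clock and adding the fixed part. A rough sketch in Python, assuming the cycle term runs at the 2.0 GHz clock listed above; `ram_latency_ns` is a hypothetical helper, not something from the quoted page:

```python
def ram_latency_ns(cycles: float, clock_ghz: float, fixed_ns: float) -> float:
    """Total latency for a figure written as 'N cycles + M ns'."""
    # The cycle part scales with the core clock; the ns part does not.
    return cycles / clock_ghz + fixed_ns


# Values copied from the Cortex-A57 listing above.
# ASSUMPTION: the 60 cycles are counted at the 2.0 GHz core clock.
total_ns = ram_latency_ns(cycles=60, clock_ghz=2.0, fixed_ns=124.0)
print(f"Cortex-A57 RAM latency: about {total_ns:.0f} ns")
```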
    #5 · shengyu · 2019-01-02 14:00:07 +08:00 · ❤️ 1
    https://www.7-cpu.com/cpu/Skylake_X.html

    Intel i7-7820X (Skylake X), 8 cores, 4.3 GHz (Turbo Boost), Mesh 2.4 GHz, 14 nm. RAM: 4x 8 GB DDR4-3400 16-18-18-36.

    L1 Data cache = 32 KB, 64 B/line, 8-WAY
    L1 Instruction cache = 32 KB, 64 B/line, 8-WAY.
    L2 cache = 1024 KB, 64 B/line, 16-WAY
    L3 cache = 11 MB, 64 B/line, 11-WAY

    L1 Data Cache Latency = 4 cycles for simple access via pointer
    L1 Data Cache Latency = 5 cycles for access with complex address calculation (size_t n, *p; n = p[n]).
    L2 Cache Latency = 14 cycles
    L3 Cache Latency = 68 cycles (3.6 GHz)
    L3 Cache Latency = 79 cycles (4.3 GHz) (77-81 cycles for different cores)
    RAM Latency = 79 cycles + 50 ns
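
The two L3 figures above are given at different core clocks, so comparing them in nanoseconds is more telling than comparing cycle counts. A small sketch in Python, assuming each cycle count is measured at the clock shown next to it; the variable names are illustrative only:

```python
def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert core clock cycles to nanoseconds."""
    return cycles / clock_ghz


# Values copied from the Skylake X listing above.
l3_ns_at_3_6 = cycles_to_ns(68, 3.6)  # L3 latency at 3.6 GHz
l3_ns_at_4_3 = cycles_to_ns(79, 4.3)  # L3 latency at 4.3 GHz

# ASSUMPTION: the 79-cycle part of the RAM figure is counted at 4.3 GHz.
ram_ns_at_4_3 = cycles_to_ns(79, 4.3) + 50.0

print(f"L3 at 3.6 GHz:  {l3_ns_at_3_6:.1f} ns")
print(f"L3 at 4.3 GHz:  {l3_ns_at_4_3:.1f} ns")
print(f"RAM at 4.3 GHz: {ram_ns_at_4_3:.1f} ns")
```

Both L3 values come out between 18 and 19 ns, which suggests that the higher cycle count at 4.3 GHz mostly reflects a faster core clock counting a similar wall-clock delay.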

    Whether on ARM or x86, an L1 data cache access takes 4 to 5 clock cycles.