A question about Elasticsearch aggregation results

    rqxiao · 2022-01-30 10:42:32 +08:00 · 2156 views
    While reading up on paginating ES aggregations recently, I came across the problem of ES aggregation results being inaccurate.

    First, create an index (the inaccuracy only shows up when the number of shards is greater than 1):
    PUT /my_aggs
    { "settings": { "number_of_shards": 3}}


    POST /my_aggs/_bulk
    { "index": {}}
    { "money": 50, "bid":"11" }
    { "index": {}}
    { "money": 40, "bid":"11" }
    { "index": {}}
    { "money": 20, "bid":"11" }
    { "index": {}}
    { "money": 10, "bid":"10" }
    { "index": {}}
    { "money": 10, "bid":"10" }
    { "index": {}}
    { "money": 10, "bid":"10" }
    { "index": {}}
    { "money": 10, "bid":"10" }
    { "index": {}}
    { "money": 10, "bid":"10" }
    { "index": {}}
    { "money": 10, "bid":"10" }
    { "index": {}}
    { "money": 10, "bid":"10" }
    { "index": {}}
    { "money": 10, "bid":"10" }
    { "index": {}}
    { "money": 10, "bid":"10" }
    { "index": {}}
    { "money": 10, "bid":"10" }
    { "index": {}}
    { "money": 10, "bid":"9" }
    { "index": {}}
    { "money": 20, "bid":"9" }
    { "index": {}}
    { "money": 20, "bid":"9" }
    { "index": {}}
    { "money": 20, "bid":"9" }
    { "index": {}}
    { "money": 20, "bid":"9" }
    { "index": {}}
    { "money": 60, "bid":"8" }
    { "index": {}}
    { "money": 10, "bid":"8" }
    { "index": {}}
    { "money": 10, "bid":"8" }
    { "index": {}}
    { "money": 60, "bid":"7" }
    { "index": {}}
    { "money": 10, "bid":"7" }
    { "index": {}}
    { "money": 20, "bid":"6" }
    { "index": {}}
    { "money": 40, "bid":"6" }
    { "index": {}}
    { "money": 10, "bid":"5" }
    { "index": {}}
    { "money": 20, "bid":"5" }
    { "index": {}}
    { "money": 20, "bid":"5" }
    { "index": {}}
    { "money": 40, "bid":"4" }
    { "index": {}}
    { "money": 30, "bid":"3" }
    { "index": {}}
    { "money": 10, "bid":"2" }
    { "index": {}}
    { "money": 10, "bid":"2" }
    { "index": {}}
    { "money": 10, "bid":"1" }
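
As a sanity check on the data above, the exact per-bid totals can be worked out outside Elasticsearch. This is only a minimal Python sketch: `DOCS` re-types the bulk body by hand, and the helper name `exact_top` is made up for illustration.

```python
from collections import defaultdict

# (money, bid) pairs, re-typed from the bulk request in the same order.
DOCS = (
    [(50, "11"), (40, "11"), (20, "11")]
    + [(10, "10")] * 10
    + [(10, "9")] + [(20, "9")] * 4
    + [(60, "8"), (10, "8"), (10, "8")]
    + [(60, "7"), (10, "7")]
    + [(20, "6"), (40, "6")]
    + [(10, "5"), (20, "5"), (20, "5")]
    + [(40, "4"), (30, "3"), (10, "2"), (10, "2"), (10, "1")]
)


def exact_top(docs, size):
    """Sum money per bid over all docs and return the `size` largest sums."""
    sums = defaultdict(int)
    for money, bid in docs:
        sums[bid] += money
    # Sort by sum descending; break ties by key so the order is deterministic.
    return sorted(sums.items(), key=lambda kv: (-kv[1], kv[0]))[:size]


print(len(DOCS))           # 33
print(exact_top(DOCS, 3))  # [('11', 110), ('10', 100), ('9', 90)]
```

So a correct top 3 ordered by summed money would be bid 11, 10 and 9.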

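    At first I could not reproduce an incorrect result no matter how I tested. It only appeared after I lowered shard_size (the official docs say the default is size * 1.5 + 10).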

    size is the number of top buckets you want returned.
    shard_size is the number of candidate buckets ES collects from each shard before merging them.
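
A likely reason the default did not reproduce the problem: with size 3, the documented default works out to roughly 14, which is more than the 11 distinct bid values in the test data, so every shard can return all of its buckets. The arithmetic as a small Python sketch; the function name and the integer truncation are assumptions made for illustration, not taken from the docs.

```python
def default_shard_size(size: int) -> int:
    """Documented default for the terms aggregation: size * 1.5 + 10.

    Truncating to an integer is an assumption made for this sketch.
    """
    return int(size * 1.5 + 10)


DISTINCT_BIDS = 11  # bid values "1" through "11" in the test data

shard_size = default_shard_size(3)
print(shard_size)                   # 14
print(shard_size >= DISTINCT_BIDS)  # True: no shard has to drop a bucket
```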

    GET my_aggs/_search
    {
      "from": 0,
      "size": 0,
      "aggs": {
        "aggs_bid": {
          "terms": {
            "field": "bid.keyword",
            "size": 3,
            "shard_size": 3,
            "order": {
              "aggs_money": "desc"
            }
          },
          "aggs": {
            "aggs_money": {
              "sum": {
                "field": "money"
              }
            }
          }
        }
      }
    }

    ---------- Result ----------

    {
      "took" : 0,
      "timed_out" : false,
      "_shards" : {
        "total" : 3,
        "successful" : 3,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 33,
          "relation" : "eq"
        },
        "max_score" : null,
        "hits" : [ ]
      },
      "aggregations" : {
        "aggs_bid" : {
          "doc_count_error_upper_bound" : -1,
          "sum_other_doc_count" : 20,
          "buckets" : [
            {
              "key" : "11",
              "doc_count" : 3,
              "aggs_money" : {
                "value" : 110.0
              }
            },
            {
              "key" : "10",
              "doc_count" : 8,
              "aggs_money" : {
                "value" : 80.0
              }
            },
            {
              "key" : "8",
              "doc_count" : 2,
              "aggs_money" : {
                "value" : 70.0
              }
            }
          ]
        }
      }
    }
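
Compared with the indexed data, this response is off in three ways: bid 10 is reported with 8 docs and a sum of 80 although 10 docs totalling 100 were indexed, bid 8 shows 2 docs and 70 instead of 3 docs and 80, and bid 9 (true sum 90) is missing altogether. As far as I understand the docs, doc_count_error_upper_bound is -1 because the error cannot be computed when ordering by a sub-aggregation.

The mechanism can be imitated in a few lines of Python. This is a rough sketch under loud assumptions: documents are spread round-robin over 3 shards, which is NOT how Elasticsearch routes documents (it hashes the routing value), so the numbers will not match the response above. Function names are made up for illustration.

```python
from collections import defaultdict

# (money, bid) pairs, re-typed from the bulk request in the same order.
DOCS = (
    [(50, "11"), (40, "11"), (20, "11")]
    + [(10, "10")] * 10
    + [(10, "9")] + [(20, "9")] * 4
    + [(60, "8"), (10, "8"), (10, "8")]
    + [(60, "7"), (10, "7")]
    + [(20, "6"), (40, "6")]
    + [(10, "5"), (20, "5"), (20, "5")]
    + [(40, "4"), (30, "3"), (10, "2"), (10, "2"), (10, "1")]
)


def top_buckets(sums, n):
    """Return the n (bid, sum) pairs with the largest sums, ties broken by key."""
    return sorted(sums.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def simulate_terms_agg(docs, num_shards, size, shard_size):
    """Each shard keeps only its own top `shard_size` buckets, then results merge."""
    # HYPOTHETICAL routing: round-robin by position, purely for illustration.
    shards = [defaultdict(int) for _ in range(num_shards)]
    for i, (money, bid) in enumerate(docs):
        shards[i % num_shards][bid] += money

    merged = defaultdict(int)
    for shard_sums in shards:
        # Buckets outside a shard's local top list never reach the merge step,
        # so their contribution from that shard is silently lost.
        for bid, total in top_buckets(shard_sums, shard_size):
            merged[bid] += total
    return top_buckets(merged, size)


approx = simulate_terms_agg(DOCS, num_shards=3, size=3, shard_size=3)
exact = simulate_terms_agg(DOCS, num_shards=3, size=3, shard_size=11)
print(approx)  # [('11', 90), ('10', 60), ('7', 60)]
print(exact)   # [('11', 110), ('10', 100), ('9', 90)]
```

With shard_size at least as large as the number of distinct bid values, no shard drops anything and the merged result is exact. With shard_size 3, sums are undercounted and the wrong buckets can make the top list.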

    #1 · rqxiao (OP) · 2022-01-30 10:46:20 +08:00
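    So I would like to ask how ES aggregations are usually handled. For now I simply set size to Integer.MAX_VALUE. The other approach I know of is changing the shard count, although since the inaccuracy only shows up with more than one shard, it is fewer shards (or a larger shard_size), not more, that should help. How do people deal with this in real production systems?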