49 Hands-on: scrolling through large amounts of data with scroll search

Dongguo丶 Published: 2021-11-13 08:26:13, views: 2

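Fetching something like 100,000 documents in a single request performs very poorly. The usual approach is a scroll search, which retrieves the data batch by batch until everything has been fetched and processed.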

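With scroll, you fetch one batch, then the next, and so on until every matching document has been returned.

The first scroll request saves a snapshot of the data as it was at that moment. All later batches are served from that old snapshot, so changes made to the data in the meantime are not visible to the scroll.

Sorting by _doc gives the best performance.

Every scroll request also carries a scroll parameter that sets a time window. The window is how long Elasticsearch keeps the search context alive, and each request renews it, so it only needs to be long enough to handle one batch, not the whole result set.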
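
To make the snapshot behaviour concrete, here is a toy in-memory model in Python. It is only a sketch: ToyIndex, open_scroll and next_batch are made-up names, and the list copy merely stands in for whatever Elasticsearch does internally to keep a point-in-time view.

```python
# Toy model of scroll semantics. Hypothetical names; not Elasticsearch's implementation.
class ToyIndex:
    def __init__(self, docs):
        self.docs = dict(docs)   # live data: id -> source
        self._contexts = {}      # scroll_id -> frozen snapshot plus read position
        self._counter = 0

    def open_scroll(self, size):
        """Freeze the current documents and return (scroll_id, first batch)."""
        snapshot = list(self.docs.items())  # copied once, at the first search
        scroll_id = f"ctx-{self._counter}"
        self._counter += 1
        self._contexts[scroll_id] = {"snapshot": snapshot, "pos": 0, "size": size}
        return scroll_id, self.next_batch(scroll_id)

    def next_batch(self, scroll_id):
        """Return the next batch from the frozen snapshot (empty list when done)."""
        ctx = self._contexts[scroll_id]
        start = ctx["pos"]
        batch = ctx["snapshot"][start:start + ctx["size"]]
        ctx["pos"] = start + len(batch)
        return batch


index = ToyIndex({"1": "a", "2": "b", "3": "c", "4": "d"})
scroll_id, first = index.open_scroll(size=3)

# Writes that happen after the snapshot was taken...
index.docs["5"] = "e"
del index.docs["4"]

# ...do not affect what the scroll returns: it still sees "4" and never sees "5".
rest = index.next_batch(scroll_id)
print(first, rest)
```

The point of the model is that the batch size and read position live in the server-side context, while the client only holds the identifier.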

GET /test_index/test_type/_search

Response

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 9,
    "max_score": 1,
    "hits": [
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "8",
        "_score": 1,
        "_source": {
          "test_field": "test client 2"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "10",
        "_score": 1,
        "_source": {
          "test_field1": "test1",
          "test_field2": "updated test2"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "12",
        "_score": 1,
        "_source": {
          "test_field": "test12"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "4",
        "_score": 1,
        "_source": {
          "test_field1": "test field111111"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "6",
        "_score": 1,
        "_source": {
          "test_field": "test test"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "2",
        "_score": 1,
        "_source": {
          "test_field": "replaced test2"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "7",
        "_score": 1,
        "_source": {
          "test_field": "test client 2"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "1",
        "_score": 1,
        "_source": {
          "test_field1": "test field1",
          "test_field2": "bulk test1"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "11",
        "_score": 1,
        "_source": {
          "num": 1,
          "tags": []
        }
      }
    ]
  }
}

There are 9 results in total. Use scroll to fetch them 3 at a time.

GET /test_index/test_type/_search?scroll=1m
{
  "query": {
    "match_all": {}
  },
  "sort": [ "_doc" ],
  "size": 3
}

Response

{
  "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAAivFngwUVIxRDAyUmttVXlRdzZ1R19heVEAAAAAAAAIsBZ4MFFSMUQwMlJrbVV5UXc2dUdfYXlRAAAAAAAACLEWeDBRUjFEMDJSa21VeVF3NnVHX2F5UQAAAAAAAAiyFngwUVIxRDAyUmttVXlRdzZ1R19heVEAAAAAAAAIsxZ4MFFSMUQwMlJrbVV5UXc2dUdfYXlR",
  "took": 93,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 9,
    "max_score": null,
    "hits": [
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "8",
        "_score": null,
        "_source": {
          "test_field": "test client 2"
        },
        "sort": [
          0
        ]
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "4",
        "_score": null,
        "_source": {
          "test_field1": "test field111111"
        },
        "sort": [
          0
        ]
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "7",
        "_score": null,
        "_source": {
          "test_field": "test client 2"
        },
        "sort": [
          0
        ]
      }
    ]
  }
}

The response contains a _scroll_id, and the next scroll request must include it.

The _scroll_id refers to the context of the previous search, which holds settings such as sort and size, so the follow-up request only needs scroll and scroll_id:

GET /_search/scroll
{
    "scroll": "1m", 
    "scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAAivFngwUVIxRDAyUmttVXlRdzZ1R19heVEAAAAAAAAIsBZ4MFFSMUQwMlJrbVV5UXc2dUdfYXlRAAAAAAAACLEWeDBRUjFEMDJSa21VeVF3NnVHX2F5UQAAAAAAAAiyFngwUVIxRDAyUmttVXlRdzZ1R19heVEAAAAAAAAIsxZ4MFFSMUQwMlJrbVV5UXc2dUdfYXlR"
}

Response

{
  "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAAAivFngwUVIxRDAyUmttVXlRdzZ1R19heVEAAAAAAAAIsBZ4MFFSMUQwMlJrbVV5UXc2dUdfYXlRAAAAAAAACLEWeDBRUjFEMDJSa21VeVF3NnVHX2F5UQAAAAAAAAiyFngwUVIxRDAyUmttVXlRdzZ1R19heVEAAAAAAAAIsxZ4MFFSMUQwMlJrbVV5UXc2dUdfYXlR",
  "took": 341,
  "timed_out": false,
  "terminated_early": true,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 9,
    "max_score": null,
    "hits": [
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "11",
        "_score": null,
        "_source": {
          "num": 1,
          "tags": []
        },
        "sort": [
          0
        ]
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "10",
        "_score": null,
        "_source": {
          "test_field1": "test1",
          "test_field2": "updated test2"
        },
        "sort": [
          1
        ]
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "6",
        "_score": null,
        "_source": {
          "test_field": "test test"
        },
        "sort": [
          1
        ]
      }
    ]
  }
}
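
The two requests above can be chained into a loop that keeps scrolling until a batch comes back empty. The Python sketch below assumes a caller-supplied send(method, path, body) function that performs the HTTP call and returns the parsed JSON; send and scroll_all are hypothetical names, not part of any Elasticsearch client. The sketch also clears the scroll at the end with DELETE /_search/scroll, which the walkthrough above does not show.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical transport: send(method, path, body) -> parsed JSON response.
Send = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def scroll_all(send: Send, search_path: str, size: int = 3,
               keep_alive: str = "1m") -> Iterator[List[Dict[str, Any]]]:
    """Yield batches of hits until the scroll returns an empty batch."""
    # First request: opens the scroll and returns the first batch.
    response = send("GET", f"{search_path}?scroll={keep_alive}", {
        "query": {"match_all": {}},
        "sort": ["_doc"],
        "size": size,
    })
    scroll_id = response.get("_scroll_id")
    try:
        while True:
            hits = response["hits"]["hits"]
            if not hits:
                break  # an empty batch means everything has been returned
            yield hits
            # Follow-up request: only the keep-alive window and the scroll id.
            response = send("GET", "/_search/scroll",
                            {"scroll": keep_alive, "scroll_id": scroll_id})
            # Always use the most recent id returned by the server.
            scroll_id = response.get("_scroll_id", scroll_id)
    finally:
        # Release the search context instead of waiting for it to expire.
        if scroll_id is not None:
            send("DELETE", "/_search/scroll", {"scroll_id": scroll_id})
```

A caller would iterate over it, for example `for batch in scroll_all(send, "/test_index/test_type/_search"): ...`, and process each batch before asking for the next one. With the 9 documents above and size 3, that is three batches followed by one empty response that ends the loop.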

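Scroll looks a lot like pagination, but the use cases differ. Pagination is for showing results to a user one page at a time; scroll is for retrieving data batch by batch so that a system can process it.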
