
Learning Elasticsearch: Compound Multi-Condition Queries, Query Validation, and Worked Examples

2023-05-31 15:12 · 程序员皮卡秋 · Java tutorials

This article introduces compound multi-condition queries and query validation in Elasticsearch, with worked examples for reference.

Compound multi-condition queries

bool

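Elasticsearch uses the bool query to combine multiple conditions. It supports the following clauses:

  • must: a document must satisfy the condition to be returned.
  • must_not: a document must not satisfy the condition.
  • should: a document should satisfy the condition. should clauses adjust the relevance score of the results. Note that if the bool query has no must (or filter) clause, a document has to match at least one should clause. If a must clause is present, no should clause needs to match, and should only influences the score.
  • filter: a document must satisfy the condition, but the clause does not contribute to the score. It is used purely for filtering.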

The following example shows how they are used together:

GET class_1/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {
          "name": "apple"
        }}
      ],
      "must_not": [
        {"term": {
          "num": {
            "value": "5"
          }
        }}
      ],
      "should": [
        {"match": {
          "name": "k"
        }}
      ],
      "filter": [
        {"range": {
          "num": {
            "gte": 0,
            "lte": 10
          }
        }}
      ]
    }
  }
}

Response:

{
  "took" : 9,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.752627,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "b8fcCoYB090miyjed7YE",
        "_score" : 0.752627,
        "_source" : {
          "name" : "I eat apple so haochi1~",
          "num" : 1
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "ccfcCoYB090miyjed7YE",
        "_score" : 0.752627,
        "_source" : {
          "name" : "I eat apple so haochi3~",
          "num" : 1
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "cMfcCoYB090miyjed7YE",
        "_score" : 0.7389809,
        "_source" : {
          "name" : "I eat apple so zhen haochi2~",
          "num" : 1
        }
      }
    ]
  }
}
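
The same request body can also be assembled programmatically. The Python sketch below is not from the original article: `bool_query` is a hypothetical helper that builds the dictionary a client would send as the request body and leaves out clause types that were not supplied. It only constructs the body and does not contact a cluster.

```python
import json


def bool_query(must=None, must_not=None, should=None, filter_=None):
    """Build a bool query body, leaving out clauses that were not supplied.

    Hypothetical helper for illustration; not part of any client library.
    """
    clauses = {
        "must": must,
        "must_not": must_not,
        "should": should,
        "filter": filter_,
    }
    # Keep only the clause types that actually contain conditions.
    return {"query": {"bool": {k: v for k, v in clauses.items() if v}}}


# Same conditions as the request shown above.
body = bool_query(
    must=[{"match": {"name": "apple"}}],
    must_not=[{"term": {"num": {"value": "5"}}}],
    should=[{"match": {"name": "k"}}],
    filter_=[{"range": {"num": {"gte": 0, "lte": 10}}}],
)
print(json.dumps(body, indent=2))
```

Dropping empty clauses keeps the generated body close to what one would write by hand in the console.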

constant_score

A constant_score query assigns a fixed score, set through boost, to every matching document. It is typically used in place of a bool query that contains only a filter clause.

Here is how it is used:

GET class_1/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "num": 6
        }
      },
      "boost": 1.2
    }
  }
}

Response:

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.2,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "h2Fg-4UBECmbBdQA6VLg",
        "_score" : 1.2,
        "_source" : {
          "name" : "b",
          "num" : 6
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.2,
        "_source" : {
          "name" : "l",
          "num" : 6
        }
      }
    ]
  }
}
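
To make the relationship between a filter-only bool query and constant_score concrete, here is a small Python sketch that rewrites one into the other. `to_constant_score` is a hypothetical helper written for this illustration. The rewritten query selects the same documents, and the only difference is that each hit receives the score given by `boost`.

```python
def to_constant_score(body, boost=1.0):
    """Rewrite a filter-only bool query as a constant_score query.

    Hypothetical helper for illustration only.
    """
    bool_part = body["query"]["bool"]
    if set(bool_part) != {"filter"}:
        raise ValueError("expected a bool query that contains only a filter clause")
    filters = bool_part["filter"]
    if isinstance(filters, dict):
        filters = [filters]
    # A single condition can be used directly; several conditions are kept
    # together inside a nested filter-only bool query.
    inner = filters[0] if len(filters) == 1 else {"bool": {"filter": filters}}
    return {"query": {"constant_score": {"filter": inner, "boost": boost}}}


filter_only = {"query": {"bool": {"filter": [{"term": {"num": 6}}]}}}
rewritten = to_constant_score(filter_only, boost=1.2)
print(rewritten)
```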

Query validation and analysis

Validation

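Elasticsearch provides the /_validate/query endpoint to check whether a query is valid. Note that it only checks the query itself; the query is not executed.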

Example:

GET class_1/_validate/query?explain
{
  "query": {
    "bool": {
      "must": [
        {"match": {
          "name": "apple"
        }}
      ]
    }
  }
}

Response for a valid query:

{
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "valid" : true,
  "explanations" : [
    {
      "index" : "class_1",
      "valid" : true,
      "explanation" : "+name:apple"
    }
  ]
}

Changing the name field to name1 and validating again returns:

{
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "valid" : true,
  "explanations" : [
    {
      "index" : "class_1",
      "valid" : true,
      "explanation" : """+MatchNoDocsQuery("unmapped fields [name1]")"""
    }
  ]
}

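No error is raised and valid is still true, but the explanation shows that the query was rewritten to MatchNoDocsQuery because the field name1 is not mapped, so the query will match no documents.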
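
Because `valid` is `true` in the response above, a script that checks only that flag would not notice the unmapped field. The following Python sketch scans the `explanations` array for a `MatchNoDocsQuery` instead. `unmapped_field_warnings` is a hypothetical helper, and the response dictionary is copied from the example above rather than fetched from a cluster.

```python
def unmapped_field_warnings(validate_response):
    """Return (index, explanation) pairs whose explanation is a MatchNoDocsQuery."""
    warnings = []
    for item in validate_response.get("explanations", []):
        explanation = item.get("explanation", "")
        if "MatchNoDocsQuery" in explanation:
            warnings.append((item.get("index"), explanation))
    return warnings


# Response copied from the example above (name1 is not mapped).
response = {
    "_shards": {"total": 1, "successful": 1, "failed": 0},
    "valid": True,
    "explanations": [
        {
            "index": "class_1",
            "valid": True,
            "explanation": '+MatchNoDocsQuery("unmapped fields [name1]")',
        }
    ],
}

for index, explanation in unmapped_field_warnings(response):
    print(f"{index}: query matches nothing -> {explanation}")
```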

Analysis

Adding the explain parameter, /_validate/query?explain, shows how Elasticsearch interprets the query.

Example:

GET class_1/_validate/query?explain
{
  "query": {
    "bool": {
      "must": [
        {"match": {
          "name": "apple so"
        }}
      ]
    }
  }
}

Response:

{
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "valid" : true,
  "explanations" : [
    {
      "index" : "class_1",
      "valid" : true,
      "explanation" : "+(name:apple name:so)"
    }
  ]
}

The line "explanation" : "+(name:apple name:so)" shows that the query text apple so was analyzed into two terms, name:apple and name:so.
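
When checking many queries, it can help to pull the individual terms out of the explanation string. The Python sketch below does this with a regular expression. `terms_from_explanation` is a hypothetical helper, and it assumes the simple `field:term` form seen in the responses above. The explanation is a textual rendering of the rewritten query, so more complex queries may not fit this pattern.

```python
import re


def terms_from_explanation(explanation):
    """Extract (field, term) pairs from a simple explanation string."""
    # Matches sequences such as name:apple; anything else is ignored.
    return re.findall(r"(\w+):(\w+)", explanation)


print(terms_from_explanation("+(name:apple name:so)"))
print(terms_from_explanation("+name:apple"))
```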

Sorting

Default sorting

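In the earlier examples the results are sorted by _score in descending order by default, so better matches come first. Computing _score is not free, though: it adds work to every query.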

When _score is not needed, build the query with filter or constant_score instead.

filter example:

GET class_1/_search
{
  "query": {
    "bool": {
      "filter": [
        {"term": {
          "num": 1
        }}
      ]
    }
  }
}

Response:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.0,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "b8fcCoYB090miyjed7YE",
        "_score" : 0.0,
        "_source" : {
          "name" : "I eat apple so haochi1~",
          "num" : 1
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "ccfcCoYB090miyjed7YE",
        "_score" : 0.0,
        "_source" : {
          "name" : "I eat apple so haochi3~",
          "num" : 1
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "cMfcCoYB090miyjed7YE",
        "_score" : 0.0,
        "_source" : {
          "name" : "I eat apple so zhen haochi2~",
          "num" : 1
        }
      }
    ]
  }
}

Every _score in the response is 0.0, which shows that no score was computed.

constant_score example:

GET class_1/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "num": 1
        }
      },
      "boost": 1.2
    }
  }
}

Response:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.2,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "b8fcCoYB090miyjed7YE",
        "_score" : 1.2,
        "_source" : {
          "name" : "I eat apple so haochi1~",
          "num" : 1
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "ccfcCoYB090miyjed7YE",
        "_score" : 1.2,
        "_source" : {
          "name" : "I eat apple so haochi3~",
          "num" : 1
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "cMfcCoYB090miyjed7YE",
        "_score" : 1.2,
        "_source" : {
          "name" : "I eat apple so zhen haochi2~",
          "num" : 1
        }
      }
    ]
  }
}

Here every hit carries the score specified by the boost parameter.

Custom sorting

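Custom sorting covers most use cases. Elasticsearch uses the sort parameter to define the sort order. Fields are sorted in ascending order by default, and descending order has to be requested explicitly: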

  • Ascending
{"sort":["num"]}
  • Descending (desc means descending order)
{"sort":[{"num":{"order":"desc"}}]}

Tips

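  • Elasticsearch sorts on fields using doc values, a column-oriented store.
  • text fields have no doc values, so they cannot be sorted on by default.
  • Setting fielddata=true on a text field enables sorting on it, but this is not recommended: sorting on a text field is extremely expensive and rarely gives the intended result.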

Sorting on a single field

GET class_1/_search
{
    "sort": [
        "num"
    ]
}

Response:

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 11,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "b8fcCoYB090miyjed7YE",
        "_score" : null,
        "_source" : {
          "name" : "I eat apple so haochi1~",
          "num" : 1
        },
        "sort" : [
          1
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "ccfcCoYB090miyjed7YE",
        "_score" : null,
        "_source" : {
          "name" : "I eat apple so haochi3~",
          "num" : 1
        },
        "sort" : [
          1
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "cMfcCoYB090miyjed7YE",
        "_score" : null,
        "_source" : {
          "name" : "I eat apple so zhen haochi2~",
          "num" : 1
        },
        "sort" : [
          1
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "h2Fg-4UBECmbBdQA6VLg",
        "_score" : null,
        "_source" : {
          "name" : "b",
          "num" : 6
        },
        "sort" : [
          6
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "name" : "l",
          "num" : 6
        },
        "sort" : [
          6
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "num" : 9,
          "name" : "e",
          "age" : 9,
          "desc" : [
            "hhhh"
          ]
        },
        "sort" : [
          9
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : null,
        "_source" : {
          "name" : "f",
          "age" : 10,
          "num" : 10
        },
        "sort" : [
          10
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "RWlfBIUBDuA8yW5cu9wu",
        "_score" : null,
        "_source" : {
          "name" : "一年级",
          "num" : 20
        },
        "sort" : [
          20
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "iGFt-4UBECmbBdQAnVJe",
        "_score" : null,
        "_source" : {
          "name" : "g",
          "age" : 8
        },
        "sort" : [
          9223372036854775807
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "iWFt-4UBECmbBdQAnVJg",
        "_score" : null,
        "_source" : {
          "name" : "h",
          "age" : 9
        },
        "sort" : [
          9223372036854775807
        ]
      }
    ]
  }
}

The hits are sorted by num in ascending order, the default.

Now in descending order:

GET class_1/_search
{
    "sort": [
        {"num": {"order":"desc"}}
    ]
}

Response:

{
  "took" : 15,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 11,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "RWlfBIUBDuA8yW5cu9wu",
        "_score" : null,
        "_source" : {
          "name" : "一年级",
          "num" : 20
        },
        "sort" : [
          20
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : null,
        "_source" : {
          "name" : "f",
          "age" : 10,
          "num" : 10
        },
        "sort" : [
          10
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "num" : 9,
          "name" : "e",
          "age" : 9,
          "desc" : [
            "hhhh"
          ]
        },
        "sort" : [
          9
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "h2Fg-4UBECmbBdQA6VLg",
        "_score" : null,
        "_source" : {
          "name" : "b",
          "num" : 6
        },
        "sort" : [
          6
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "name" : "l",
          "num" : 6
        },
        "sort" : [
          6
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "b8fcCoYB090miyjed7YE",
        "_score" : null,
        "_source" : {
          "name" : "I eat apple so haochi1~",
          "num" : 1
        },
        "sort" : [
          1
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "ccfcCoYB090miyjed7YE",
        "_score" : null,
        "_source" : {
          "name" : "I eat apple so haochi3~",
          "num" : 1
        },
        "sort" : [
          1
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "cMfcCoYB090miyjed7YE",
        "_score" : null,
        "_source" : {
          "name" : "I eat apple so zhen haochi2~",
          "num" : 1
        },
        "sort" : [
          1
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "iGFt-4UBECmbBdQAnVJe",
        "_score" : null,
        "_source" : {
          "name" : "g",
          "age" : 8
        },
        "sort" : [
          -9223372036854775808
        ]
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "iWFt-4UBECmbBdQAnVJg",
        "_score" : null,
        "_source" : {
          "name" : "h",
          "age" : 9
        },
        "sort" : [
          -9223372036854775808
        ]
      }
    ]
  }
}

This time the hits are sorted by num in descending order.
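
In both responses the documents without a num field come last: their sort value is 9223372036854775807 when sorting ascending and -9223372036854775808 when sorting descending. The Python sketch below reproduces that ordering on a few of the sample documents. `sort_key` and `sort_docs` are hypothetical names, and the code is only a model of the behavior observed in the responses, not how Elasticsearch implements sorting.

```python
LONG_MAX = 2**63 - 1  # 9223372036854775807, seen in the ascending response
LONG_MIN = -(2**63)   # -9223372036854775808, seen in the descending response


def sort_key(doc, field, order):
    """Sort value for one document; a missing field is pushed to the end."""
    if field in doc:
        return doc[field]
    return LONG_MAX if order == "asc" else LONG_MIN


def sort_docs(docs, field, order="asc"):
    """Return docs ordered by field, with documents lacking the field last."""
    return sorted(
        docs,
        key=lambda d: sort_key(d, field, order),
        reverse=(order == "desc"),
    )


# A few documents taken from the responses above.
docs = [
    {"name": "g", "age": 8},
    {"name": "f", "age": 10, "num": 10},
    {"name": "b", "num": 6},
    {"name": "一年级", "num": 20},
]
print([d["name"] for d in sort_docs(docs, "num", "asc")])
print([d["name"] for d in sort_docs(docs, "num", "desc")])
```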

Sorting on multiple fields

GET class_1/_search
{
    "sort": [
        "num", "age"
    ]
}
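
A multi-field sort can also mix directions by combining the plain and the object forms of a sort entry shown earlier. The Python sketch below builds such a sort list. `sort_clause` is a hypothetical helper that accepts either a bare field name (ascending) or a `(field, order)` pair.

```python
def sort_clause(*fields):
    """Build a sort body from field names or (field, order) pairs."""
    clause = []
    for item in fields:
        if isinstance(item, str):
            clause.append(item)  # bare field name: ascending order
        else:
            field, order = item
            if order not in ("asc", "desc"):
                raise ValueError(f"unknown sort order: {order!r}")
            clause.append({field: {"order": order}})
    return {"sort": clause}


print(sort_clause("num", "age"))
print(sort_clause(("num", "desc"), "age"))
```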

Scroll pagination

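Recall the from + size pagination covered earlier. By default Elasticsearch allows from + size to reach at most 10,000 documents, and fetching larger result sets in bulk with from + size is very expensive.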

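In many applications, however, the data volume is huge, for example when querying system logs. For these cases Elasticsearch offers scroll queries through the /_search/scroll endpoint. A scroll works like a snapshot of the cluster's data (covering every shard) taken at the time of the initial query and kept for a while. Once the configured expiry time has passed, Elasticsearch deletes the snapshot when it is idle.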

Because the query runs against a snapshot, changes made to the data after the snapshot was created are not visible to that scroll.
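
The 10,000-document limit on from + size mentioned above can be checked before a request is sent. The Python sketch below picks a pagination approach from that check. `pagination_strategy` is a hypothetical helper, and the default of 10,000 is the value cited in the text (the `index.max_result_window` setting).

```python
DEFAULT_MAX_RESULT_WINDOW = 10_000  # default limit cited in the text


def pagination_strategy(from_, size, max_result_window=DEFAULT_MAX_RESULT_WINDOW):
    """Return "from+size" when the page fits in the window, otherwise "scroll"."""
    if from_ < 0 or size < 0:
        raise ValueError("from_ and size must be non-negative")
    return "from+size" if from_ + size <= max_result_window else "scroll"


print(pagination_strategy(0, 10))
print(pagination_strategy(9991, 10))
```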

Initialize the snapshot and keep it for 10 minutes

Example query:

GET class_1/_search?scroll=10m
{
  "query": {
    "match_phrase": {
      "name": "apple"
    }
  },
  "size": 2
}

Response:

{
  "_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==",
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.752627,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "b8fcCoYB090miyjed7YE",
        "_score" : 0.752627,
        "_source" : {
          "name" : "I eat apple so haochi1~",
          "num" : 1
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "ccfcCoYB090miyjed7YE",
        "_score" : 0.752627,
        "_source" : {
          "name" : "I eat apple so haochi3~",
          "num" : 1
        }
      }
    ]
  }
}

As shown, this request returns 2 documents together with a snapshot ID (_scroll_id), which later requests use to continue scrolling:

Scrolling with the snapshot ID

GET /_search/scroll
{
 "scroll": "10m",
 "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw=="
}

Response:

{
  "_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==",
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.752627,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "cMfcCoYB090miyjed7YE",
        "_score" : 0.7389809,
        "_source" : {
          "name" : "I eat apple so zhen haochi2~",
          "num" : 1
        }
      }
    ]
  }
}

Scrolling once more returns:

{
  "_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==",
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.752627,
    "hits" : [ ]
  }
}

It may not be obvious how the scroll advances, since every request uses the same scroll_id. The responses make it clear:

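  • The first request created a snapshot that lives for 10 minutes, set the page size to 2, and returned the first 2 documents.
  • The first scroll request with the scroll_id returned 1 document, because the snapshot holds only 3 documents and 2 had already been returned.
  • The next scroll request returned an empty hits array, because all documents had already been fetched.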
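
The three steps above amount to a loop that keeps requesting pages until an empty one comes back. The Python sketch below expresses that loop. `scroll_all` takes two caller-supplied callables (a hypothetical interface, not a client library API), and `FakeScroll` is an in-memory stand-in for a cluster so the loop can run without Elasticsearch.

```python
def scroll_all(start_search, next_page):
    """Yield every hit by following a scroll until an empty page is returned.

    start_search() returns the first response; next_page(scroll_id) returns
    each following response. Both are supplied by the caller.
    """
    response = start_search()
    while response["hits"]["hits"]:
        yield from response["hits"]["hits"]
        response = next_page(response["_scroll_id"])


class FakeScroll:
    """In-memory stand-in for a cluster, used only to exercise the loop."""

    def __init__(self, docs, size):
        self.docs, self.size, self.offset = docs, size, 0
        self.page_sizes = []  # number of hits returned by each request

    def _page(self):
        hits = self.docs[self.offset:self.offset + self.size]
        self.offset += len(hits)
        self.page_sizes.append(len(hits))
        return {"_scroll_id": "fake-scroll-id", "hits": {"hits": hits}}

    def start(self):
        return self._page()

    def next(self, scroll_id):
        return self._page()


# Three documents and a page size of 2, as in the example above.
fake = FakeScroll(docs=[{"_id": "a"}, {"_id": "b"}, {"_id": "c"}], size=2)
ids = [hit["_id"] for hit in scroll_all(fake.start, fake.next)]
print(ids, fake.page_sizes)
```

With three documents and a page size of 2, the fake cluster serves pages of 2, 1, and 0 hits, mirroring the three responses shown above.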


Original article: https://juejin.cn/post/7195396652088164389
