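Multi-condition compound queries
bool
Elasticsearch (ES) uses the `bool` query to combine multiple conditions. A `bool` query supports the following parameters:
- `must`: a document must satisfy the clause.
- `must_not`: a document must not satisfy the clause.
- `should`: a document should satisfy the clause. `should` clauses adjust the relevance score of the results. Note that if the compound query has no `must` (or `filter`) clause, a document has to match at least one `should` clause. If a `must` clause is present, matching `should` is optional and `should` only adjusts the score.
- `filter`: a document must satisfy the clause, but `filter` does not contribute to the score; it is used only to filter.
Here is an example of how to use it: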
```
GET class_1/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "apple" } }
      ],
      "must_not": [
        { "term": { "num": { "value": "5" } } }
      ],
      "should": [
        { "match": { "name": "k" } }
      ],
      "filter": [
        { "range": { "num": { "gte": 0, "lte": 10 } } }
      ]
    }
  }
}
```
Response:

```json
{
  "took": 9,
  "timed_out": false,
  "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 3, "relation": "eq" },
    "max_score": 0.752627,
    "hits": [
      { "_index": "class_1", "_type": "_doc", "_id": "b8fcCoYB090miyjed7YE", "_score": 0.752627, "_source": { "name": "I eat apple so haochi1~", "num": 1 } },
      { "_index": "class_1", "_type": "_doc", "_id": "ccfcCoYB090miyjed7YE", "_score": 0.752627, "_source": { "name": "I eat apple so haochi3~", "num": 1 } },
      { "_index": "class_1", "_type": "_doc", "_id": "cMfcCoYB090miyjed7YE", "_score": 0.7389809, "_source": { "name": "I eat apple so zhen haochi2~", "num": 1 } }
    ]
  }
}
```
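To make the clause rules above concrete, here is a minimal sketch that models them in plain Python. It is not how ES evaluates queries internally: the function `bool_matches`, the lambda predicates, and the sample documents are all hypothetical, and scoring is ignored entirely. It only illustrates which documents get selected.

```python
from typing import Callable, Dict, List, Optional

Doc = Dict[str, object]
Clause = Callable[[Doc], bool]


def bool_matches(
    doc: Doc,
    must: Optional[List[Clause]] = None,
    must_not: Optional[List[Clause]] = None,
    should: Optional[List[Clause]] = None,
    filter_: Optional[List[Clause]] = None,
) -> bool:
    """Return True if `doc` would be selected by the modeled bool query."""
    must, must_not = must or [], must_not or []
    should, filter_ = should or [], filter_ or []

    # must and filter clauses are all required; filter simply never affects the score.
    if not all(clause(doc) for clause in must + filter_):
        return False
    # A single matching must_not clause excludes the document.
    if any(clause(doc) for clause in must_not):
        return False
    # should is only mandatory when there is no must or filter clause.
    if should and not must and not filter_:
        return any(clause(doc) for clause in should)
    return True


# Hypothetical sample data, loosely based on the documents in the response above.
docs = [
    {"name": "I eat apple so haochi1~", "num": 1},
    {"name": "apple pie", "num": 5},
    {"name": "b", "num": 6},
]

selected = [
    d for d in docs
    if bool_matches(
        d,
        must=[lambda d: "apple" in str(d["name"]).split()],
        must_not=[lambda d: d["num"] == 5],
        should=[lambda d: "k" in str(d["name"]).split()],
        filter_=[lambda d: 0 <= d["num"] <= 10],
    )
]
print(selected)
```

Only the first document survives: the second is dropped by `must_not`, the third fails `must`, and the unmatched `should` clause excludes nobody because a `must` clause is present.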
constant_score
A `constant_score` query assigns a fixed score through `boost`. Typically, `constant_score` is used in place of a `bool` query that contains only a `filter`.
Here is how to use it:
```
GET class_1/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "num": 6 }
      },
      "boost": 1.2
    }
  }
}
```
Response:

```json
{
  "took": 7,
  "timed_out": false,
  "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 2, "relation": "eq" },
    "max_score": 1.2,
    "hits": [
      { "_index": "class_1", "_type": "_doc", "_id": "h2Fg-4UBECmbBdQA6VLg", "_score": 1.2, "_source": { "name": "b", "num": 6 } },
      { "_index": "class_1", "_type": "_doc", "_id": "1", "_score": 1.2, "_source": { "name": "l", "num": 6 } }
    ]
  }
}
```
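Since `constant_score` usually stands in for a filter-only `bool` query, the rewrite can be sketched as a small helper that builds the request body. The function name `filter_only_bool_to_constant_score` is made up for this illustration, and the helper only manipulates plain dictionaries; it does not talk to a cluster.

```python
import json
from typing import Any, Dict


def filter_only_bool_to_constant_score(query: Dict[str, Any], boost: float = 1.0) -> Dict[str, Any]:
    """Rewrite {"bool": {"filter": [...]}} as a constant_score query selecting the same documents."""
    bool_body = query.get("bool", {})
    if set(bool_body) != {"filter"}:
        raise ValueError("expected a bool query that contains only a filter clause")
    filters = bool_body["filter"]
    if isinstance(filters, dict):
        filters = [filters]
    # constant_score takes a single filter query; several filters stay wrapped in a bool.
    inner = filters[0] if len(filters) == 1 else {"bool": {"filter": filters}}
    return {"constant_score": {"filter": inner, "boost": boost}}


bool_query = {"bool": {"filter": [{"term": {"num": 6}}]}}
body = {"query": filter_only_bool_to_constant_score(bool_query, boost=1.2)}
print(json.dumps(body, indent=2))
```

The printed body matches the request shown above. Both forms select the same documents; they differ only in the score attached to each hit.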
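Query validation & analysis
Validation
ES validates a query through the `/_validate/query` route. Note that this checks whether the query conditions are valid; it does not execute the search.
Example: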
```
GET class_1/_validate/query?explain
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "apple" } }
      ]
    }
  }
}
```
Response for a valid query:

```json
{
  "_shards": { "total": 1, "successful": 1, "failed": 0 },
  "valid": true,
  "explanations": [
    { "index": "class_1", "valid": true, "explanation": "+name:apple" }
  ]
}
```
Change the `name` field to `name1` and run the request again:
```
{
  "_shards": { "total": 1, "successful": 1, "failed": 0 },
  "valid": true,
  "explanations": [
    { "index": "class_1", "valid": true, "explanation": """+MatchNoDocsQuery("unmapped fields [name1]")""" }
  ]
}
```
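The request is still reported as valid, but the explanation exposes the problem: `name1` is not a mapped field, so the query was rewritten to `MatchNoDocsQuery` and can never match a document.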
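Because `valid` stays `true` in this case, a script that checks queries automatically has to look at the explanation text as well. Below is a minimal sketch of such a check; the helper name `indices_that_match_nothing` is hypothetical, and the response dictionary is copied from the output above rather than fetched from a cluster.

```python
from typing import Any, Dict, List


def indices_that_match_nothing(validate_response: Dict[str, Any]) -> List[str]:
    """List indices whose explanation shows the query was rewritten to match no documents."""
    return [
        item["index"]
        for item in validate_response.get("explanations", [])
        if "MatchNoDocsQuery" in item.get("explanation", "")
    ]


# Response body from the name1 example, written as a Python dict.
response = {
    "_shards": {"total": 1, "successful": 1, "failed": 0},
    "valid": True,
    "explanations": [
        {
            "index": "class_1",
            "valid": True,
            "explanation": '+MatchNoDocsQuery("unmapped fields [name1]")',
        }
    ],
}
print(indices_that_match_nothing(response))
```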
Analysis
ES analyzes a query through the `/_validate/query?explain` route.
Example:
```
GET class_1/_validate/query?explain
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "apple so" } }
      ]
    }
  }
}
```
Response:

```json
{
  "_shards": { "total": 1, "successful": 1, "failed": 0 },
  "valid": true,
  "explanations": [
    { "index": "class_1", "valid": true, "explanation": "+(name:apple name:so)" }
  ]
}
```
The result contains `"explanation": "+(name:apple name:so)"`: the query text `apple so` was tokenized into the terms `name:apple` and `name:so`.
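The shape of that explanation string can be reproduced with a toy model. The sketch below assumes a very rough stand-in for the analyzer (lowercase, then split on whitespace); the real analyzer configured on the field does considerably more, and the function names `simple_analyze` and `explain_must_match` are invented for this illustration.

```python
from typing import List


def simple_analyze(text: str) -> List[str]:
    """Very rough stand-in for an analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def explain_must_match(field: str, text: str) -> str:
    """Render a must + match clause in the style of the explanation string."""
    terms = [f"{field}:{token}" for token in simple_analyze(text)]
    if len(terms) == 1:
        return f"+{terms[0]}"
    # Several terms are OR-ed together inside the required (+) clause.
    return "+(" + " ".join(terms) + ")"


print(explain_must_match("name", "apple"))
print(explain_must_match("name", "apple so"))
```

The leading `+` marks the clause as required (it came from `must`), while the terms inside the parentheses are alternatives: a document containing either `apple` or `so` in `name` can match.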
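Sorting
Default sorting
In the earlier examples, results are sorted by `_score` in descending order by default, so better matches come first. However, computing `_score` costs query performance, which is easy to understand.
When `_score` is not needed, we can build the query with `filter` or `constant_score` instead.
filter
Example: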
```
GET class_1/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "num": 1 } }
      ]
    }
  }
}
```
Response:

```json
{
  "took": 5,
  "timed_out": false,
  "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 3, "relation": "eq" },
    "max_score": 0.0,
    "hits": [
      { "_index": "class_1", "_type": "_doc", "_id": "b8fcCoYB090miyjed7YE", "_score": 0.0, "_source": { "name": "I eat apple so haochi1~", "num": 1 } },
      { "_index": "class_1", "_type": "_doc", "_id": "ccfcCoYB090miyjed7YE", "_score": 0.0, "_source": { "name": "I eat apple so haochi3~", "num": 1 } },
      { "_index": "class_1", "_type": "_doc", "_id": "cMfcCoYB090miyjed7YE", "_score": 0.0, "_source": { "name": "I eat apple so zhen haochi2~", "num": 1 } }
    ]
  }
}
```
The results show that every `_score` is 0.0, which means no score was computed.
constant_score
Example:
```
GET class_1/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "num": 1 }
      },
      "boost": 1.2
    }
  }
}
```
Response:

```json
{
  "took": 3,
  "timed_out": false,
  "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 3, "relation": "eq" },
    "max_score": 1.2,
    "hits": [
      { "_index": "class_1", "_type": "_doc", "_id": "b8fcCoYB090miyjed7YE", "_score": 1.2, "_source": { "name": "I eat apple so haochi1~", "num": 1 } },
      { "_index": "class_1", "_type": "_doc", "_id": "ccfcCoYB090miyjed7YE", "_score": 1.2, "_source": { "name": "I eat apple so haochi3~", "num": 1 } },
      { "_index": "class_1", "_type": "_doc", "_id": "cMfcCoYB090miyjed7YE", "_score": 1.2, "_source": { "name": "I eat apple so zhen haochi2~", "num": 1 } }
    ]
  }
}
```
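As you can see, every returned score equals the value given by `boost`.
Custom sorting
Custom sorting covers most use cases. So how do we sort in ES? ES uses the `sort` parameter to define a custom sort order. Field sorts default to ascending order; how do we sort in descending order?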
- Ascending:

```json
{ "sort": ["num"] }
```

- Descending (`desc` means descending order):

```json
{ "sort": [{ "num": { "order": "desc" } }] }
```
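Tips
- ES uses doc values, a columnar storage structure, to implement sorting on fields.
- `text` fields do not create doc values, so by default they cannot be sorted on.
- Setting `fielddata=true` on a `text` field enables sorting on it, but this is not recommended: sorting on a `text` field is extremely expensive at query time and rarely matches what you actually need.
Single-field sorting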
```
GET class_1/_search
{
  "sort": [
    "num"
  ]
}
```
Response:

```json
{
  "took": 6,
  "timed_out": false,
  "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 11, "relation": "eq" },
    "max_score": null,
    "hits": [
      { "_index": "class_1", "_type": "_doc", "_id": "b8fcCoYB090miyjed7YE", "_score": null, "_source": { "name": "I eat apple so haochi1~", "num": 1 }, "sort": [1] },
      { "_index": "class_1", "_type": "_doc", "_id": "ccfcCoYB090miyjed7YE", "_score": null, "_source": { "name": "I eat apple so haochi3~", "num": 1 }, "sort": [1] },
      { "_index": "class_1", "_type": "_doc", "_id": "cMfcCoYB090miyjed7YE", "_score": null, "_source": { "name": "I eat apple so zhen haochi2~", "num": 1 }, "sort": [1] },
      { "_index": "class_1", "_type": "_doc", "_id": "h2Fg-4UBECmbBdQA6VLg", "_score": null, "_source": { "name": "b", "num": 6 }, "sort": [6] },
      { "_index": "class_1", "_type": "_doc", "_id": "1", "_score": null, "_source": { "name": "l", "num": 6 }, "sort": [6] },
      { "_index": "class_1", "_type": "_doc", "_id": "3", "_score": null, "_source": { "num": 9, "name": "e", "age": 9, "desc": ["hhhh"] }, "sort": [9] },
      { "_index": "class_1", "_type": "_doc", "_id": "4", "_score": null, "_source": { "name": "f", "age": 10, "num": 10 }, "sort": [10] },
      { "_index": "class_1", "_type": "_doc", "_id": "RWlfBIUBDuA8yW5cu9wu", "_score": null, "_source": { "name": "一年级", "num": 20 }, "sort": [20] },
      { "_index": "class_1", "_type": "_doc", "_id": "iGFt-4UBECmbBdQAnVJe", "_score": null, "_source": { "name": "g", "age": 8 }, "sort": [9223372036854775807] },
      { "_index": "class_1", "_type": "_doc", "_id": "iWFt-4UBECmbBdQAnVJg", "_score": null, "_source": { "name": "h", "age": 9 }, "sort": [9223372036854775807] }
    ]
  }
}
```
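As you can see, the results are sorted by `num` in ascending order, which is the default.
Now let's look at descending order: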
```
GET class_1/_search
{
  "sort": [
    { "num": { "order": "desc" } }
  ]
}
```
Response:

```json
{
  "took": 15,
  "timed_out": false,
  "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 11, "relation": "eq" },
    "max_score": null,
    "hits": [
      { "_index": "class_1", "_type": "_doc", "_id": "RWlfBIUBDuA8yW5cu9wu", "_score": null, "_source": { "name": "一年级", "num": 20 }, "sort": [20] },
      { "_index": "class_1", "_type": "_doc", "_id": "4", "_score": null, "_source": { "name": "f", "age": 10, "num": 10 }, "sort": [10] },
      { "_index": "class_1", "_type": "_doc", "_id": "3", "_score": null, "_source": { "num": 9, "name": "e", "age": 9, "desc": ["hhhh"] }, "sort": [9] },
      { "_index": "class_1", "_type": "_doc", "_id": "h2Fg-4UBECmbBdQA6VLg", "_score": null, "_source": { "name": "b", "num": 6 }, "sort": [6] },
      { "_index": "class_1", "_type": "_doc", "_id": "1", "_score": null, "_source": { "name": "l", "num": 6 }, "sort": [6] },
      { "_index": "class_1", "_type": "_doc", "_id": "b8fcCoYB090miyjed7YE", "_score": null, "_source": { "name": "I eat apple so haochi1~", "num": 1 }, "sort": [1] },
      { "_index": "class_1", "_type": "_doc", "_id": "ccfcCoYB090miyjed7YE", "_score": null, "_source": { "name": "I eat apple so haochi3~", "num": 1 }, "sort": [1] },
      { "_index": "class_1", "_type": "_doc", "_id": "cMfcCoYB090miyjed7YE", "_score": null, "_source": { "name": "I eat apple so zhen haochi2~", "num": 1 }, "sort": [1] },
      { "_index": "class_1", "_type": "_doc", "_id": "iGFt-4UBECmbBdQAnVJe", "_score": null, "_source": { "name": "g", "age": 8 }, "sort": [-9223372036854775808] },
      { "_index": "class_1", "_type": "_doc", "_id": "iWFt-4UBECmbBdQAnVJg", "_score": null, "_source": { "name": "h", "age": 9 }, "sort": [-9223372036854775808] }
    ]
  }
}
```
Now the results are sorted in descending order.
Multi-field sorting
```
GET class_1/_search
{
  "sort": [
    "num",
    "age"
  ]
}
```
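The two responses above show that documents without a `num` field end up last in both directions, with a sort value of 9223372036854775807 when ascending and -9223372036854775808 when descending. The sketch below models that behavior and multi-field tie-breaking in plain Python. The function `sort_docs` and the sample documents are hypothetical, and it applies one order to all fields for simplicity, whereas a real request can set the order per field.

```python
from typing import Any, Dict, List

LONG_MAX = 9223372036854775807
LONG_MIN = -9223372036854775808


def sort_docs(docs: List[Dict[str, Any]], fields: List[str], order: str = "asc") -> List[Dict[str, Any]]:
    """Sort by several numeric fields; documents missing a field go last."""
    descending = order == "desc"
    # Same placeholder values that appear in the "sort" arrays of the responses.
    placeholder = LONG_MIN if descending else LONG_MAX

    def sort_key(doc: Dict[str, Any]) -> tuple:
        return tuple(doc.get(field, placeholder) for field in fields)

    return sorted(docs, key=sort_key, reverse=descending)


# Hypothetical documents; "g" has no num field.
docs = [
    {"name": "g", "age": 8},
    {"name": "f", "age": 10, "num": 10},
    {"name": "l", "age": 7, "num": 6},
    {"name": "b", "age": 3, "num": 6},
]

print([d["name"] for d in sort_docs(docs, ["num"])])
print([d["name"] for d in sort_docs(docs, ["num"], order="desc")])
print([d["name"] for d in sort_docs(docs, ["num", "age"])])
```

In the last call, `l` and `b` share the same `num`, so the second field `age` decides their relative order.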
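Scroll pagination
Remember the `from` + `size` pagination covered earlier? By default, ES allows `from` + `size` pagination over at most 10000 results. When we want to fetch larger volumes of data in batches, `from` + `size` becomes very expensive.
In many real-world scenarios the data volume is huge, for example when querying system logs. For this, ES provides scroll queries through the `/_search/scroll` route. A scroll works much like taking a snapshot of the cluster's data (covering every shard) at the moment the initial query runs and keeping it for a while. Once the configured expiry time has passed, the snapshot is deleted when ES is idle.
Note that because the query runs against a snapshot, changes made to the data after the snapshot was created are not visible in the current scroll.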
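As a quick illustration of the `from` + `size` limit mentioned above, the following sketch checks whether a page falls inside the default window of 10000 results. The helper `check_page` is hypothetical; it only mirrors the arithmetic of the limit and does not call ES.

```python
MAX_RESULT_WINDOW = 10000  # default limit mentioned in the text


def check_page(from_: int, size: int, max_result_window: int = MAX_RESULT_WINDOW) -> bool:
    """Return True if a from+size page stays within the result window."""
    if from_ < 0 or size < 0:
        raise ValueError("from and size must be non-negative")
    return from_ + size <= max_result_window


print(check_page(0, 10))      # first page
print(check_page(9990, 10))   # last page that still fits
print(check_page(9995, 10))   # reaches past the window
```

Pages that fail this check are the ones for which a scroll is the better tool.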
Initialize the snapshot and keep it for 10 minutes
Example query:
```
GET class_1/_search?scroll=10m
{
  "query": {
    "match_phrase": {
      "name": "apple"
    }
  },
  "size": 2
}
```
Response:

```json
{
  "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==",
  "took": 6,
  "timed_out": false,
  "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 3, "relation": "eq" },
    "max_score": 0.752627,
    "hits": [
      { "_index": "class_1", "_type": "_doc", "_id": "b8fcCoYB090miyjed7YE", "_score": 0.752627, "_source": { "name": "I eat apple so haochi1~", "num": 1 } },
      { "_index": "class_1", "_type": "_doc", "_id": "ccfcCoYB090miyjed7YE", "_score": 0.752627, "_source": { "name": "I eat apple so haochi3~", "num": 1 } }
    ]
  }
}
```
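As the response shows, 2 documents are returned along with a snapshot ID (`_scroll_id`), which can be used for subsequent scroll requests:
Scroll using the snapshot ID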
```
GET /_search/scroll
{
  "scroll": "10m",
  "scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw=="
}
```
Response:

```json
{
  "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==",
  "took": 6,
  "timed_out": false,
  "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 3, "relation": "eq" },
    "max_score": 0.752627,
    "hits": [
      { "_index": "class_1", "_type": "_doc", "_id": "cMfcCoYB090miyjed7YE", "_score": 0.7389809, "_source": { "name": "I eat apple so zhen haochi2~", "num": 1 } }
    ]
  }
}
```
Scroll once more:
```json
{
  "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==",
  "took": 1,
  "timed_out": false,
  "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 3, "relation": "eq" },
    "max_score": 0.752627,
    "hits": []
  }
}
```
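Some readers may wonder how the scroll advances, since every subsequent request uses the same `scroll_id`. The results make it clear:
- First, a snapshot with a 10-minute lifetime was created with a batch size of 2, and the initial request returned 2 documents.
- Scrolling with the `scroll_id` returned 1 document, because the snapshot holds only 3 documents in total and 2 had already been returned by the initial request.
- Scrolling again returned an empty result, because all the data had already been read.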
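The walkthrough above can be mimicked with a small in-memory model. This is only a sketch of the idea, not of how ES implements scrolling: the class `ScrollSimulator` and its methods are invented for illustration, the snapshot is just a deep copy of a list, and expiry is not modeled at all.

```python
import copy
import uuid
from typing import Any, Dict, List


class ScrollSimulator:
    """In-memory model of scroll paging over a snapshot of the data."""

    def __init__(self) -> None:
        self._contexts: Dict[str, Dict[str, Any]] = {}

    def search(self, docs: List[Dict[str, Any]], size: int) -> Dict[str, Any]:
        """Open a scroll: freeze a copy of the docs and return the first batch."""
        scroll_id = uuid.uuid4().hex
        self._contexts[scroll_id] = {"snapshot": copy.deepcopy(docs), "offset": 0, "size": size}
        return self.scroll(scroll_id)

    def scroll(self, scroll_id: str) -> Dict[str, Any]:
        """Return the next batch; the stored offset moves, the id stays the same."""
        ctx = self._contexts[scroll_id]
        start = ctx["offset"]
        batch = ctx["snapshot"][start:start + ctx["size"]]
        ctx["offset"] = start + len(batch)
        return {"_scroll_id": scroll_id, "hits": batch}


live_docs = [{"name": "haochi1"}, {"name": "haochi3"}, {"name": "haochi2"}]
sim = ScrollSimulator()

page = sim.search(live_docs, size=2)
live_docs.append({"name": "added later"})  # not visible to the open scroll

batch_sizes = [len(page["hits"])]
while page["hits"]:
    page = sim.scroll(page["_scroll_id"])
    batch_sizes.append(len(page["hits"]))

print(batch_sizes)
```

The position in the result set is remembered on the server side, which is why repeating the same ID keeps moving forward. The loop always passes along the ID from the latest response; that is the safer habit with a real cluster too, since the ID is not guaranteed to stay identical between requests.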
Original link: https://juejin.cn/post/7195396652088164389