Elasticsearch Full-Text Search and Filtering

1. Local or cloud cluster

# One command starts ES + Kibana (8.x and later)
curl -fsSL https://elastic.co/start-local | sh

Open http://localhost:5601 in a browser and go to Kibana → Dev Tools → Console. The JSON requests that follow can be pasted in and run directly.

Cloud / Serverless users: with a superuser or developer role, the same requests can be pasted in as-is.

2. Mappings and nested fields

2.1. Create the index cooking_blog

PUT /cooking_blog

2.2. Full mapping (with nested ingredients)

PUT /cooking_blog/_mapping
{
  "properties": {
    "title":       { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
    "description": { "type": "text", "fields": { "keyword": { "type": "keyword" } } },
    "author":      { "type": "text", "fields": { "keyword": { "type": "keyword" } } },
    "date":        { "type": "date", "format": "yyyy-MM-dd" },
    "category":    { "type": "text", "fields": { "keyword": { "type": "keyword" } } },
    "tags":        { "type": "text", "fields": { "keyword": { "type": "keyword" } } },
    "rating":      { "type": "float" },

    "ingredients": {                         // ingredient list
      "type": "nested",
      "properties": {
        "name":     { "type": "text", "fields": { "keyword": { "type": "keyword" } } },
        "quantity": { "type": "float" },
        "unit":     { "type": "keyword" }
      }
    }
  }
}
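  • text + keyword multi-fields: support both analyzed full-text search and exact filtering on the same value
  • nested: each ingredient is indexed as its own hidden document, so conditions on name, quantity, and unit are matched within the same ingredient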
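The difference between a plain object array and a nested field can be sketched outside Elasticsearch. The snippet below is a toy model in Python, not Elasticsearch code: flattened_match and nested_match are hypothetical helper names, and the recipe is a trimmed copy of the pancake document used later.

```python
# Toy illustration (not Elasticsearch code): why "nested" matters for arrays of objects.
recipe = {
    "title": "Perfect Pancakes",
    "ingredients": [
        {"name": "flour", "quantity": 200, "unit": "g"},
        {"name": "buttermilk", "quantity": 250, "unit": "ml"},
    ],
}

def flattened_match(doc, name, unit):
    """Mimics a plain object field: values of each sub-field are pooled per document."""
    names = {i["name"] for i in doc["ingredients"]}
    units = {i["unit"] for i in doc["ingredients"]}
    return name in names and unit in units

def nested_match(doc, name, unit):
    """Mimics a nested field: both conditions must hold within the same ingredient."""
    return any(i["name"] == name and i["unit"] == unit for i in doc["ingredients"])

# "flour measured in ml" does not exist in the recipe.
print(flattened_match(recipe, "flour", "ml"))  # pooled values give a false positive
print(nested_match(recipe, "flour", "ml"))     # per-ingredient matching rejects it
```

The pooled version matches because "flour" and "ml" both occur somewhere in the document. The per-object version is the behavior the nested type is meant to provide.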

3. Bulk-load the sample data

POST /cooking_blog/_bulk?refresh=wait_for
{ "index": { "_id": "1" } }
{"title":"Perfect Pancakes: A Fluffy Breakfast Delight","description":"Learn the secrets…","author":"Maria Rodriguez","date":"2023-05-01","category":"Breakfast","tags":["pancakes","breakfast","easy recipes"],"rating":4.8,"ingredients":[{"name":"flour","quantity":200,"unit":"g"},{"name":"buttermilk","quantity":250,"unit":"ml"}]}
{ "index": { "_id": "2" } }
{"title":"Spicy Thai Green Curry: A Vegetarian Adventure","description":"Dive into the flavors…","author":"Liam Chen","date":"2023-05-05","category":"Main Course","tags":["thai","vegetarian","curry","spicy"],"rating":4.6}
{ "index": { "_id": "3" } }
{"title":"Vegan Chocolate Avocado Mousse","description":"Discover the magic of avocado…","author":"Alex Green","date":"2023-05-15","category":"Dessert","tags":["vegan","chocolate","avocado"],"rating":4.5}
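Outside Kibana, the _bulk body has to be sent as newline-delimited JSON: one action line, then one source line, with a trailing newline at the end. A minimal sketch of building such a body in Python follows. build_bulk_body is a hypothetical helper, the sample documents are trimmed, and sending the body over HTTP is left out.

```python
import json

def build_bulk_body(docs):
    """Serialize (doc_id, source) pairs into the NDJSON body expected by _bulk."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))   # action line
        lines.append(json.dumps(source, ensure_ascii=False))    # document line
    return "\n".join(lines) + "\n"  # _bulk requires the body to end with a newline

# Trimmed sample documents (only two fields kept for brevity).
sample = [
    ("1", {"title": "Perfect Pancakes: A Fluffy Breakfast Delight", "rating": 4.8}),
    ("3", {"title": "Vegan Chocolate Avocado Mousse", "rating": 4.5}),
]
body = build_bulk_body(sample)
print(body)
```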

4. match / match_phrase / multi_match

4.1. match: standard full-text search

GET /cooking_blog/_search
{
  "_source": ["title","description"],
  "query": {
    "match": {
      "description": "fluffy pancakes"
    }
  }
}
  • Default is OR: a hit on any single term is enough
  • Use the expanded form with "query" and add "operator": "and" to require every term
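The OR / AND behavior of match can be modeled in a few lines of Python. This is only a rough sketch: analyze is a crude stand-in for the standard analyzer (lowercase, split on non-alphanumerics), match is a hypothetical function that ignores scoring entirely, and the field text is made up for illustration.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase and split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def match(field_text, query, operator="or"):
    """Return True if the field matches the query under the given operator."""
    doc_terms = set(analyze(field_text))
    hits = [term in doc_terms for term in analyze(query)]
    return all(hits) if operator == "and" else any(hits)

text = "Fluffy waffles for a lazy Sunday"  # hypothetical field value
print(match(text, "fluffy pancakes"))          # OR: "fluffy" alone is enough
print(match(text, "fluffy pancakes", "and"))   # AND: "pancakes" is missing
```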

4.2. match_phrase: order and distance

GET /cooking_blog/_search
{
  "query": {
    "match_phrase": {
      "description": {
        "query": "Thai green curry",
        "slop": 2            // allow terms to be up to 2 positions out of place
      }
    }
  }
}

4.3. multi_match: several fields in one query

GET /cooking_blog/_search
{
  "query": {
    "multi_match": {
      "query":  "vegetarian curry",
      "type":   "cross_fields",
      "fields": ["title^3", "description^2", "tags"]
    }
  }
}
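  • cross_fields: treats fields that share an analyzer as one large field, which suits queries whose terms are spread across several fields
  • ^ boost: title > description > tags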

4.4. simple_query_string: user-friendly syntax

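GET /cooking_blog/_search
{
  "query": {
    "simple_query_string": {
      "query": "(\"green curry\" | \"thai curry\") +-spicy",
      "fields": ["title","description"]
    }
  }
}
  • With the default OR operator, a bare -spicy would also match every document that simply lacks "spicy"; writing +-spicy makes the exclusion mandatory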

5. Exact filtering: term / range / exists

5.1. Category equality

GET /cooking_blog/_search
{
  "query": {
    "term": {
      "category.keyword": "Dessert"
    }
  }
}

5.2. Rating range

GET /cooking_blog/_search
{
  "query": {
    "range": {
      "rating": { "gte": 4.7 }
    }
  }
}

5.3. Last 30 days

GET /cooking_blog/_search
{
  "query": {
    "range": {
      "date": { "gte": "now-30d/d" }
    }
  }
}

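Clauses in filter context (for example under bool.filter or constant_score) are not scored, run faster, and can be cached. Placed directly under query, as in the three requests above, the same clauses run in query context and still produce a score.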
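One way to get the no-scoring behavior is to wrap term, range, and exists clauses in bool.filter. The sketch below assembles such a body in Python. filtered_search and its parameters are hypothetical names, and the exists clause on author is added only to illustrate the third query type named in the heading.

```python
import json

def filtered_search(category=None, min_rating=None, require_field=None):
    """Build a search body whose clauses all run in filter context (no scoring)."""
    filters = []
    if category is not None:
        filters.append({"term": {"category.keyword": category}})
    if min_rating is not None:
        filters.append({"range": {"rating": {"gte": min_rating}}})
    if require_field is not None:
        filters.append({"exists": {"field": require_field}})
    return {"query": {"bool": {"filter": filters}}}

body = filtered_search(category="Dessert", min_rating=4.5, require_field="author")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into Console as the body of GET /cooking_blog/_search.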

6. bool, boosting, function_score

6.1. bool: the most common combination

GET /cooking_blog/_search
{
  "_source": ["title","tags","rating","date"],
  "query": {
    "bool": {
      "must": [
        { "term":  { "tags.keyword": "vegetarian" } },
        { "range": { "rating": { "gte": 4.5 } } }
      ],
      "should": [
        { "term":  { "category.keyword": "Main Course" } },
        { "range": { "date": { "gte": "now-1M/d" } } }
      ],
      "must_not": [
        { "term": { "category.keyword": "Dessert" } }
      ],
      "minimum_should_match": 1
    }
  }
}

6.2. boosting: demote unwanted hits

GET /cooking_blog/_search
{
  "query": {
    "boosting": {
      "positive": { "match": { "tags": "chocolate" } },
      "negative": { "term":  { "tags.keyword": "vegan" } },
      "negative_boost": 0.3
    }
  }
}

6.3. function_score: recency decay and rating boost

GET /cooking_blog/_search
{
  "query": {
    "function_score": {
      "query": { "match": { "description": "curry" } },
      "functions": [
        {
          "gauss": {
            "date": { "origin": "now", "scale": "30d", "decay": 0.5 }
          }
        },
        {
          "field_value_factor": {
            "field": "rating",
            "modifier": "sqrt"
          }
        }
      ],
      "boost_mode": "sum"
    }
  }
}
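To get a feel for the numbers, the gauss decay and the way the two functions combine can be reproduced in plain Python. This is a sketch under assumptions: gauss_decay follows the documented Gaussian decay formula with distances in days, combined_score is a hypothetical helper, and it relies on score_mode keeping its default of multiply because the request does not set it.

```python
import math

def gauss_decay(distance, scale, decay=0.5, offset=0.0):
    """Gaussian decay: 1.0 at the origin, exactly `decay` at distance == scale + offset."""
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    d = max(0.0, abs(distance) - offset)
    return math.exp(-(d ** 2) / (2.0 * sigma_sq))

def combined_score(query_score, age_days, rating):
    """Combine the query score with both functions the way the request above is configured."""
    # score_mode defaults to "multiply": the two function values are multiplied together
    function_value = gauss_decay(age_days, scale=30) * math.sqrt(rating)
    # boost_mode "sum": the function value is added to the query score
    return query_score + function_value

for age in (0, 30, 60):
    print(age, round(gauss_decay(age, scale=30), 4))
```

A post that is exactly 30 days old gets a decay value of 0.5, which is what "scale": "30d" together with "decay": 0.5 expresses.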

7. Highlighting, sorting, and pagination

GET /cooking_blog/_search
{
  "from": 0, "size": 5,
  "sort": [
    { "rating": { "order": "desc" } },
    { "date":   { "order": "desc" } }
  ],
  "query": { "match": { "description": "chocolate" } },
  "highlight": {
    "pre_tags": ["<em>"], "post_tags": ["</em>"],
    "fields": { "description": {} }
  }
}
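  • Deep pagination: switch to search_after or scroll
  • Highlighting: uses the field's analyzer by default; if the original text must be preserved, term_vector or stored_fields can be used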
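With search_after, each follow-up request reuses the same sort and passes the sort values of the last hit from the previous page. A minimal sketch in Python is shown below. next_page_body is a hypothetical helper and last_hit is a made-up hit. Note that rating and date can tie, so a real setup would also need a unique tiebreaker in the sort, which the mapping above does not define.

```python
def next_page_body(base_body, last_hit):
    """Return a copy of base_body that continues after last_hit using search_after."""
    # search_after replaces offset-based paging, so the "from" key is dropped
    body = {k: v for k, v in base_body.items() if k != "from"}
    body["search_after"] = last_hit["sort"]
    return body

base = {
    "from": 0, "size": 5,
    "sort": [{"rating": {"order": "desc"}}, {"date": {"order": "desc"}}],
    "query": {"match": {"description": "chocolate"}},
}
# Hypothetical last hit of page 1; dates come back as epoch milliseconds in "sort".
last_hit = {"_id": "3", "sort": [4.5, 1684108800000]}

page2 = next_page_body(base, last_hit)
print(page2)
```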

8. Facets for the front end

GET /cooking_blog/_search
{
  "size": 0,
  "aggs": {
    "by_category":  { "terms": { "field": "category.keyword" } },
    "tag_top10":    { "terms": { "field": "tags.keyword", "size": 10 } },
    "rating_stats": { "stats": { "field": "rating" } },
    "monthly": {
      "date_histogram": {
        "field": "date", "calendar_interval": "month"
      }
    }
  }
}
  • Used to display category counts, popular tags, rating statistics (min / max / avg), posts per month, and similar facets
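On the front-end side, a terms aggregation comes back as a list of buckets, each with key and doc_count. The sketch below turns that into label / count pairs in Python. facet_counts is a hypothetical helper and sample_response is a hand-written, trimmed response, not real output.

```python
def facet_counts(response, agg_name):
    """Turn a terms aggregation into (label, count) pairs for a facet sidebar."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(b["key"], b["doc_count"]) for b in buckets]

# Hypothetical, trimmed response for the by_category aggregation.
sample_response = {
    "aggregations": {
        "by_category": {
            "buckets": [
                {"key": "Breakfast", "doc_count": 1},
                {"key": "Dessert", "doc_count": 1},
                {"key": "Main Course", "doc_count": 1},
            ]
        }
    }
}

for label, count in facet_counts(sample_response, "by_category"):
    print(f"{label} ({count})")
```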

9. Querying the nested ingredients list

GET /cooking_blog/_search
{
  "query": {
    "nested": {
      "path": "ingredients",
      "query": {
        "bool": {
          "must": [
            { "match": { "ingredients.name": "flour" } },
            { "term":  { "ingredients.unit": "g" } }
          ]
        }
      },
      "inner_hits": { "size": 3 }
    }
  }
}
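Because the request asks for inner_hits, each matching recipe also reports which of its ingredients satisfied the nested query. The sketch below reads them out of a response in Python. matched_ingredients is a hypothetical helper and sample_response is a hand-written, trimmed response, not real output.

```python
def matched_ingredients(response, path="ingredients"):
    """Map each parent recipe _id to the nested objects that satisfied the nested query."""
    result = {}
    for hit in response["hits"]["hits"]:
        # inner hits are keyed by the nested path unless the request names them explicitly
        inner = hit.get("inner_hits", {}).get(path, {}).get("hits", {}).get("hits", [])
        result[hit["_id"]] = [ih["_source"] for ih in inner]
    return result

# Hypothetical, trimmed response for the nested query above.
sample_response = {
    "hits": {"hits": [{
        "_id": "1",
        "inner_hits": {"ingredients": {"hits": {"hits": [
            {"_source": {"name": "flour", "quantity": 200, "unit": "g"}}
        ]}}},
    }]}
}
print(matched_ingredients(sample_response))
```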

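10. Performance and relevance tuning checklist

Scenario                      Tuning points
Read-heavy, rarely updated    Raise refresh_interval, force-merge segments, reduce write amplification
Fixed filter dimensions       Put them in filter context; the query cache is handled automatically by ES
Slow deep pagination          Use search_after, ideally together with a point in time (PIT, available since 7.10)
High-cardinality sorting      Sort on fields with doc_values (on by default for keyword, numeric, and date fields), or configure index sorting (index.sort)
Synonyms / stop words         Define a custom analyzer and keep index-time and search-time synonym strategies separate
Multilingual content          One subfield per language, such as title.en and title.fr, queried together with multi_match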