06 Hands-on: Product Management for an E-commerce Site: Indices and Documents

Dongguo丶 Published: 2021-11-04 08:22:15

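1. Document data format

A document-oriented search and analytics engine

(1) The data structures of an application are object-oriented and can be complex.
(2) To store object data in a relational database, the object has to be taken apart and flattened into several tables, and every query has to reassemble the object, which is cumbersome.
(3) ES is document-oriented. The data structure stored in a document matches the object-oriented data structure, and on top of this document structure ES provides complex indexing, full-text search, and analytical aggregations.
(4) An ES document is expressed in JSON.

Consider an entity class Employee: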

import java.util.Date;

public class Employee {
  private String email;
  private String firstName;
  private String lastName;
  private EmployeeInfo info; // nested object
  private Date joinDate;
  // getters and setters omitted
}

public class EmployeeInfo {
  private String bio; // personality
  private Integer age;
  private String[] interests; // hobbies
  // getters and setters omitted
}

Create an Employee object:

EmployeeInfo info = new EmployeeInfo();
info.setBio("curious and modest");
info.setAge(30);
info.setInterests(new String[]{"bike", "climb"});

Employee employee = new Employee();
employee.setEmail("zhangsan@sina.com");
employee.setFirstName("san");
employee.setLastName("zhang");
employee.setInfo(info);
employee.setJoinDate(new Date());

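The employee object contains the Employee class's own properties plus an EmployeeInfo object.

Storing it in a relational database

Two tables are needed, employee and employee_info, and the employee object has to be split into Employee data and EmployeeInfo data:
employee table: id, email, first_name, last_name, join_date (5 columns)
employee_info table: bio, age, interests (3 columns), plus a foreign key column such as employee_id that references the employee table

This is cumbersome.

Storing it in ES

However complex the object is, it can be represented by a single document: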

{
    "email":      "zhangsan@sina.com",
    "first_name": "san",
    "last_name": "zhang",
    "info": {
        "bio":         "curious and modest",
        "age":         30,
        "interests": [ "bike", "climb" ]
    },
    "join_date": "2017/01/01"
}
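
The contrast between the two storage models can be sketched in Python. This is only an illustration: the dataclasses mirror the Java classes above, and the helper names `to_document` and `to_rows` are made up for this example. Joining `interests` into one string is just one assumed way to fit a list into a single column.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class EmployeeInfo:
    bio: str
    age: int
    interests: List[str]


@dataclass
class Employee:
    email: str
    first_name: str
    last_name: str
    info: EmployeeInfo
    join_date: str


def to_document(employee: Employee) -> str:
    """Serialize the whole object graph into one JSON document."""
    return json.dumps(asdict(employee))


def to_rows(employee: Employee, employee_id: int):
    """Split the same object into two flat rows, as a relational schema would."""
    flat = asdict(employee)
    info = flat.pop("info")
    employee_row = {"id": employee_id, **flat}
    # A single column cannot hold a list, so the interests are joined here.
    info_row = {**info, "interests": ",".join(info["interests"]),
                "employee_id": employee_id}
    return employee_row, info_row


employee = Employee("zhangsan@sina.com", "san", "zhang",
                    EmployeeInfo("curious and modest", 30, ["bike", "climb"]),
                    "2017/01/01")
print(to_document(employee))
print(to_rows(employee, 1))
```

The document keeps the nested `info` object and the `interests` array intact, while the relational form needs a second row and a foreign key to express the same data.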

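This shows the difference between the ES document format and the relational format of a database.

Data format

Elasticsearch is a document-oriented database: one piece of data is one document. To make this easier to understand, the concepts Elasticsearch uses for storing documents can be compared with those of the relational database MySQL.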


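An Index in ES can be seen as a database, Types correspond to tables, and Documents correspond to the rows of a table. The Type concept has been phased out: in Elasticsearch 6.x an index can contain only one type, in Elasticsearch 7.x types are deprecated and the APIs are typeless by default, and in 8.x types are removed entirely.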

RESTful

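REST is a set of architectural constraints and principles; an application or design that satisfies them is RESTful. The most important REST principle for web applications is that the interaction between client and server is stateless between requests: every request from client to server must contain all the information needed to understand it. If the server restarts at any point between requests, the client is not notified. Stateless requests can also be answered by any available server, which suits environments such as cloud computing. Clients can cache data to improve performance.

On the server side, application state and functionality are divided into resources. A resource is a conceptual entity exposed to clients, for example an application object, a database record, or an algorithm. Each resource has a unique address given by a URI (Uniform Resource Identifier). All resources share a uniform interface for transferring state between client and server, namely the standard HTTP methods such as GET, PUT, POST, and DELETE.

In a REST-style web service every resource has an address. Resources are the targets of method calls, and the list of methods is the same for all resources: the standard HTTP GET, POST, PUT, and DELETE, and possibly HEAD and OPTIONS. Put simply, to access a resource on the Internet you send a request to the server that hosts it, and the request must carry the network path of the resource and the operation to perform on it (create, delete, update, or read).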
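
As a small illustration of "resource path plus operation", the sketch below builds (but does not send) HTTP requests with Python's standard library. The base URL `http://localhost:9200` and the `METHODS` mapping are assumptions made for this example; the mapping follows the verbs used for documents later in this article.

```python
from typing import Optional
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumed address of a local ES node

# CRUD operation -> HTTP method, as used for documents in this article
METHODS = {"create": "PUT", "read": "GET", "update": "POST", "delete": "DELETE"}


def build_request(operation: str, path: str, body: Optional[bytes] = None) -> Request:
    """Combine the operation (HTTP method) and the resource path into one request."""
    return Request(
        BASE_URL + path,
        data=body,
        method=METHODS[operation],
        headers={"Content-Type": "application/json"},
    )


req = build_request("read", "/ecommerce/product/1")
print(req.get_method(), req.full_url)
```

Every request is self-describing: the method says what to do and the URI says which resource to do it to, so any node that receives it can answer without earlier context.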

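2. Background of the e-commerce product management case

An e-commerce site needs a back-office system built on ES that provides the following features:

(1) CRUD (create, read, update, delete) operations on product information
(2) Simple structured queries
(3) Simple full-text search as well as more complex phrase search
(4) Highlighting of full-text search results
(5) Simple aggregation analysis of the data

3. Simple cluster management

(1) Quickly check cluster health

ES provides a set of APIs called the cat API for inspecting all kinds of data in ES.

Check the cluster health: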

GET /_cat/health?v

Response

epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1635864788 22:53:08  elasticsearch yellow          1         1      1   1    0    0        1             0                  -                 50.0%

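How do you read the cluster health at a glance?

green: the primary shards and replica shards of every index are all active
yellow: the primary shards of every index are active, but some replica shards are not active and are therefore unavailable
red: not all primary shards of every index are active, so part of the data in some indices is unavailable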
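
The three rules can be written down as a small function. This is a sketch of the rule as stated above, not of how ES computes the status internally; the input shape (per-index lists of booleans, True meaning active) is an assumption made for the example.

```python
def cluster_health(indices):
    """Classify health from shard states.

    indices: list of dicts with 'primaries' and 'replicas',
             each a list of booleans (True = shard is active).
    """
    if not all(all(ix["primaries"]) for ix in indices):
        return "red"      # at least one primary shard is not active
    if not all(all(ix["replicas"]) for ix in indices):
        return "yellow"   # all primaries active, some replica is not
    return "green"        # every primary and replica is active


# One index with 1 active primary and 1 inactive replica
print(cluster_health([{"primaries": [True], "replicas": [False]}]))
```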

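Why is the cluster yellow right now?

We have a single laptop running a single ES process, so the cluster has only one node. ES currently holds one index, the one Kibana creates for itself. By default (in this ES version) each index gets 5 primary shards with one replica each, that is 5 replica shards, and a primary shard and its replica cannot be placed on the same node (for fault tolerance). Kibana's own index has 1 primary shard and 1 replica shard. With only one node, the primary shard is allocated and started, but there is no second node on which to start the replica shard.

A small experiment: start a second ES process so that the cluster has 2 nodes. The replica shard is then allocated to the new node automatically and the cluster status turns green.

Unpack a second copy of ES: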


Start this second ES; its port is 9201.

Check the cluster health again:

GET /_cat/health?v

Response

epoch      timestamp cluster       status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1635865517 23:05:17  elasticsearch green           2         2      2   1    0    0        0             0                  -                100.0%
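
The experiment can be mimicked with a toy allocator. This is a deliberately simplified sketch, not the real ES allocation algorithm: the only rule it enforces is the one described above, that two copies of the same shard never share a node. The node names are made up.

```python
def allocate(num_primaries, replicas_per_primary, nodes):
    """Place shard copies so that no node holds two copies of the same shard.

    Returns (assignment, unassigned), where assignment maps node -> copies.
    """
    assignment = {node: [] for node in nodes}
    unassigned = []
    for shard in range(num_primaries):
        copies = ["P%d" % shard] + ["R%d" % shard] * replicas_per_primary
        free_nodes = list(nodes)  # nodes that hold no copy of this shard yet
        for copy in copies:
            if free_nodes:
                # rotate the starting node by shard number to spread primaries
                node = free_nodes.pop(shard % len(free_nodes))
                assignment[node].append(copy)
            else:
                unassigned.append(copy)
    return assignment, unassigned


print(allocate(1, 1, ["node-1"]))            # replica has nowhere to go
print(allocate(1, 1, ["node-1", "node-2"]))  # replica lands on the second node
```

With one node the replica stays unassigned, which corresponds to yellow; with two nodes nothing is unassigned, which corresponds to green.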

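For learning purposes there is no need to keep that many nodes running, so the second ES can be stopped.

(2) Quickly list the indices in the cluster

?v - show column headers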

GET /_cat/indices?v

Response

health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana 7gzEC7M5SLWUhd52buBAkw   1   1          1            0      3.2kb          3.2kb

.kibana is the index that Kibana creates by default.
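
When the cat output has to be processed by a script, the column layout is easy to split. The sketch below parses the text shown above into dictionaries. It assumes every cell is non-empty and contains no spaces, which holds for this sample but is not guaranteed in general; the cat APIs also accept `format=json`, which avoids text parsing altogether.

```python
def parse_cat(text):
    """Turn the output of a `_cat/...?v` call into a list of dicts keyed by column header."""
    lines = [line for line in text.splitlines() if line.strip()]
    headers = lines[0].split()
    return [dict(zip(headers, line.split())) for line in lines[1:]]


sample = """health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana 7gzEC7M5SLWUhd52buBAkw   1   1          1            0      3.2kb          3.2kb"""

rows = parse_cat(sample)
print(rows[0]["index"], rows[0]["health"])
```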

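Column          Meaning
health          health of the index: green (all shards active), yellow (primaries active, some replicas not), red (some primaries not active)
status          whether the index is open or closed
index           index name
uuid            unique identifier of the index
pri             number of primary shards
rep             number of replicas
docs.count      number of available documents
docs.deleted    number of deleted documents (logical deletes)
store.size      total space used by primary and replica shards
pri.store.size  space used by primary shards

(3) Simple index operations

Create an index

?pretty - pretty-print the output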

PUT /test_index?pretty

Response

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "test_index"
}

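Explanation

{
  "acknowledged": true,          # response: true means the operation succeeded
  "shards_acknowledged": true,   # shards: the required shard copies were started in time
  "index": "test_index"          # index name
}

Note: the default number of primary shards for a new index is 5 in Elasticsearch versions before 7.0.0 and 1 from 7.0.0 onward.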
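
Because the default differs between versions, it can be safer to state the shard counts explicitly when creating an index. The sketch below only builds the request; the helper name `create_index_request` is made up, while `number_of_shards` and `number_of_replicas` are the standard index settings.

```python
import json


def create_index_request(index, shards=None, replicas=None):
    """Return (method, path, body) for creating an index, optionally with explicit shard counts."""
    settings = {}
    if shards is not None:
        settings["number_of_shards"] = shards
    if replicas is not None:
        settings["number_of_replicas"] = replicas
    # An empty body means "use the defaults of this ES version".
    body = json.dumps({"settings": settings}) if settings else ""
    return "PUT", "/" + index, body


print(create_index_request("test_index"))
print(create_index_request("test_index", shards=3, replicas=1))
```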

Creating an index that already exists returns the error "index [test_index/-kQoG72UTkm5L0GlVYFm7w] already exists":

{
  "error": {
    "root_cause": [
      {
        "type": "index_already_exists_exception",
        "reason": "index [test_index/-kQoG72UTkm5L0GlVYFm7w] already exists",
        "index_uuid": "-kQoG72UTkm5L0GlVYFm7w",
        "index": "test_index"
      }
    ],
    "type": "index_already_exists_exception",
    "reason": "index [test_index/-kQoG72UTkm5L0GlVYFm7w] already exists",
    "index_uuid": "-kQoG72UTkm5L0GlVYFm7w",
    "index": "test_index"
  },
  "status": 400
}

PUT is idempotent: repeating the request never produces a second index. Creating an index does not support POST:

POST /test_index?pretty

Response

No handler found for uri [/test_index?pretty] and method [POST]

View a single index

GET /test_index?pretty

Response

{
  "test_index": {
    "aliases": {},
    "mappings": {},
    "settings": {
      "index": {
        "creation_date": "1635866355428",
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "-kQoG72UTkm5L0GlVYFm7w",
        "version": {
          "created": "5060099"
        },
        "provided_name": "test_index"
      }
    }
  }
}

Explanation

{
  "test_index": {                          # index name
    "aliases": {},                         # aliases
    "mappings": {},                        # mappings
    "settings": {                          # settings
      "index": {                           # settings - index
        "creation_date": "1635866355428",  # settings - index - creation time
        "number_of_shards": "5",           # settings - index - number of primary shards (default 5 before 7.0.0, 1 from 7.0.0)
        "number_of_replicas": "1",         # settings - index - number of replicas
        "uuid": "-kQoG72UTkm5L0GlVYFm7w",  # settings - index - unique identifier
        "version": {                       # settings - index - version
          "created": "5060099"
        },
        "provided_name": "test_index"      # settings - index - name
      }
    }
  }
}

Delete an index

DELETE /test_index?pretty

Response

{
  "acknowledged": true
}

Deleting the index again after it has already been deleted returns:

{
  "error": {
    "root_cause": [
      {
        "type": "index_not_found_exception",
        "reason": "no such index",
        "resource.type": "index_or_alias",
        "resource.id": "test_index",
        "index_uuid": "_na_",
        "index": "test_index"
      }
    ],
    "type": "index_not_found_exception",
    "reason": "no such index",
    "resource.type": "index_or_alias",
    "resource.id": "test_index",
    "index_uuid": "_na_",
    "index": "test_index"
  },
  "status": 404
}
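
4. CRUD operations on product documents

This section covers creating, reading, updating, and deleting documents. A document here is comparable to a row in a relational database table, and the data is added in JSON format.

(1) Add a product: create a document and index it

PUT /index/type/id
{
  JSON document body
}

Add a product: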

PUT /ecommerce/product/1
{
    "name" : "gaolujie yagao",
    "desc" :  "gaoxiao meibai",
    "price" :  30,
    "producer" :      "gaolujie producer",
    "tags": [ "meibai", "fangzhu" ]
}

Response

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}

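Explanation

{
  "_index": "ecommerce",     # index
  "_type": "product",        # type
  "_id": "1",                # unique identifier, comparable to a primary key in MySQL; here supplied in the request URL
  "_version": 1,             # version
  "result": "created",       # result; "created" means the document was created
  "_shards": {               # shards
    "total": 2,              # shards - total
    "successful": 1,         # shards - successful
    "failed": 0              # shards - failed
  },
  "created": true
}

If no unique identifier (ID) is specified when a document is created, the ES server generates a random one by default. Creating documents with identical content then yields a different ID each time, so the request is not idempotent and must use POST; PUT, which is idempotent, is not supported without an ID: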

POST /ecommerce/product

If the primary key is given explicitly when adding data, the request can also use PUT:

PUT /ecommerce/product/1
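
The choice between the two verbs can be captured in a few lines. This is a sketch that only builds the method and path in the typed URL style this article uses; the helper name `index_request` is made up.

```python
def index_request(index, doc_type, doc_id=None):
    """Pick the HTTP method and path for indexing a document.

    Without an ID the server generates one, so the call is not idempotent: POST.
    With an explicit ID the same call always targets the same document: PUT.
    """
    if doc_id is None:
        return "POST", "/%s/%s" % (index, doc_type)
    return "PUT", "/%s/%s/%s" % (index, doc_type, doc_id)


print(index_request("ecommerce", "product"))
print(index_request("ecommerce", "product", 1))
```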

Add a few more products:

PUT /ecommerce/product/2
{
    "name" : "jiajieshi yagao",
    "desc" :  "youxiao fangzhu",
    "price" :  25,
    "producer" :      "jiajieshi producer",
    "tags": [ "fangzhu" ]
}
PUT /ecommerce/product/3
{
    "name" : "zhonghua yagao",
    "desc" :  "caoben zhiwu",
    "price" :  40,
    "producer" :      "zhonghua producer",
    "tags": [ "qingxin" ]
}

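ES creates the index and type automatically, so they do not need to be created in advance. By default ES also builds an inverted index for every field of a document so that the field can be searched.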
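
To make "inverted index" concrete, here is a toy version built over the `name` field of the three products. It is a sketch only: terms are produced by lowercasing and splitting on whitespace, which is a stand-in for real text analysis, and the function name is made up.

```python
from collections import defaultdict


def build_inverted_index(docs, field):
    """Map each term of `field` to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for term in doc[field].lower().split():
            index[term].add(doc_id)
    return index


products = {
    1: {"name": "gaolujie yagao"},
    2: {"name": "jiajieshi yagao"},
    3: {"name": "zhonghua yagao"},
}
name_index = build_inverted_index(products, "name")
print(sorted(name_index["yagao"]))     # documents whose name contains "yagao"
print(sorted(name_index["zhonghua"]))  # documents whose name contains "zhonghua"
```

A search for a term is then a dictionary lookup rather than a scan over every document.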

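(2) Query products: retrieve documents

Query by primary key

To view a document you specify its unique identifier, much like a primary key lookup in MySQL.

GET /index/type/id

Query a product: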

GET /ecommerce/product/1

Response

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "name": "gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": [
      "meibai",
      "fangzhu"
    ]
  }
}

Explanation

{
  "_index": "ecommerce",     # index
  "_type": "product",        # document type
  "_id": "1",
  "_version": 1,
  "found": true,             # lookup result: true means found, false means not found
  "_source": {               # the original document
    "name": "gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": [
      "meibai",
      "fangzhu"
    ]
  }
}

Query all documents

_search - search all documents

GET /ecommerce/product/_search

Response

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  }
}
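
A client usually needs only the hit count and the `_source` of each hit. The sketch below pulls both out of a response shaped like the one above (abbreviated here). The helper name is made up; the branch for a dictionary-valued `total` covers ES 7.x and later, where `hits.total` is an object with a `value` field instead of a plain number.

```python
def summarize_hits(response):
    """Return (total, [(id, name), ...]) from a _search response."""
    hits = response["hits"]
    total = hits["total"]
    if isinstance(total, dict):  # ES 7.x+: {"value": 3, "relation": "eq"}
        total = total["value"]
    return total, [(h["_id"], h["_source"]["name"]) for h in hits["hits"]]


# Abbreviated copy of the response above
response = {
    "hits": {
        "total": 3,
        "max_score": 1,
        "hits": [
            {"_id": "2", "_score": 1, "_source": {"name": "jiajieshi yagao"}},
            {"_id": "1", "_score": 1, "_source": {"name": "gaolujie yagao"}},
            {"_id": "3", "_score": 1, "_source": {"name": "zhonghua yagao"}},
        ],
    }
}
print(summarize_hits(response))
```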
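
(3) Modify a product: replace the document (full update)

The drawback of replacing is that the request must carry every field in order to change any of them.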

PUT /ecommerce/product/1
{
    "name" : "jiaqiangban gaolujie yagao",
    "desc" :  "gaoxiao meibai",
    "price" :  30,
    "producer" :      "gaolujie producer",
    "tags": [ "meibai", "fangzhu" ]
}

Response

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 2,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": false
}

Data before the change

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "name": "gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": [
      "meibai",
      "fangzhu"
    ]
  }
}

Data after the change

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 2,
  "found": true,
  "_source": {
    "name": "jiaqiangban gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": [
      "meibai",
      "fangzhu"
    ]
  }
}

If the request body changes, for example it carries only one field, the existing content is overwritten:

PUT /ecommerce/product/1
{
    "name" : "jiaqiangban gaolujie yagao"
}

Response

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 3,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": false
}

View the document

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 3,
  "found": true,
  "_source": {
    "name": "jiaqiangban gaolujie yagao"
  }
}

Replace it with the original document again:

PUT /ecommerce/product/1
{
    "name" : "gaolujie yagao",
    "desc" :  "gaoxiao meibai",
    "price" :  30,
    "producer" :      "gaolujie producer",
    "tags": [ "meibai", "fangzhu" ]
}

View the document

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 4,
  "found": true,
  "_source": {
    "name": "gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": [
      "meibai",
      "fangzhu"
    ]
  }
}

(4) Modify a product: update the document (partial update)

Only the specified fields are modified:

POST /ecommerce/product/1/_update
{
  "doc": {
    "name": "jiaqiangban gaolujie yagao"
  }
}

Response

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 5,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": false
}

Data before the change

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 4,
  "found": true,
  "_source": {
    "name": "gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": [
      "meibai",
      "fangzhu"
    ]
  }
}

Data after the change

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 6,
  "found": true,
  "_source": {
    "name": "jiaqiangban gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": [
      "meibai",
      "fangzhu"
    ]
  }
}
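
The difference between replacing and partially updating can be modelled with an in-memory dictionary. This is a toy sketch, not ES code: `ToyStore` and its methods are made up, the merge is shallow, and the no-op check only imitates the idea that an update which changes nothing does not need a new version.

```python
class ToyStore:
    """In-memory stand-in for one index: doc_id -> (version, source)."""

    def __init__(self):
        self.docs = {}

    def put(self, doc_id, source):
        """Full replacement: the stored source becomes exactly `source`."""
        version = self.docs.get(doc_id, (0, None))[0] + 1
        self.docs[doc_id] = (version, dict(source))
        return version

    def update(self, doc_id, partial):
        """Partial update: merge `partial` into the stored source."""
        version, source = self.docs[doc_id]
        merged = {**source, **partial}
        if merged == source:
            return version  # nothing changed, keep the version
        self.docs[doc_id] = (version + 1, merged)
        return version + 1


store = ToyStore()
full = {"name": "gaolujie yagao", "price": 30}
store.put("1", full)
store.put("1", {"name": "jiaqiangban gaolujie yagao"})     # price is lost
store.put("1", full)                                       # restore
store.update("1", {"name": "jiaqiangban gaolujie yagao"})  # price is kept
print(store.docs["1"])
```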
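
(5) Delete a product: delete the document

Deleting a document does not remove it from disk immediately; it is only marked as deleted (a logical delete).

Delete by primary key
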
DELETE /ecommerce/product/1

Response

{
  "found": true,
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "_version": 7,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}

View the product

GET /ecommerce/product/1

Response

{
  "_index": "ecommerce",
  "_type": "product",
  "_id": "1",
  "found": false
}
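
The logical delete can be pictured with a toy store that keeps tombstones. This is a conceptual sketch with made-up names, not the actual ES or Lucene data structures: `segment` stands in for data already written to disk, and `merge` stands in for the later background step that physically drops deleted documents.

```python
class ToySegmentStore:
    def __init__(self):
        self.segment = {}     # doc_id -> source, stands in for on-disk data
        self.deleted = set()  # tombstones: IDs marked as deleted

    def index(self, doc_id, source):
        self.segment[doc_id] = source
        self.deleted.discard(doc_id)

    def delete(self, doc_id):
        """Mark the document as deleted; return whether it was found."""
        if doc_id in self.segment and doc_id not in self.deleted:
            self.deleted.add(doc_id)
            return True
        return False

    def get(self, doc_id):
        """Deleted documents are invisible to reads even though they still occupy space."""
        if doc_id in self.deleted:
            return None
        return self.segment.get(doc_id)

    def merge(self):
        """Physically remove everything that was marked as deleted."""
        for doc_id in self.deleted:
            del self.segment[doc_id]
        self.deleted.clear()


store = ToySegmentStore()
store.index("1", {"name": "gaolujie yagao"})
store.delete("1")
print(store.get("1"), "1" in store.segment)  # not readable, but still stored
store.merge()
print("1" in store.segment)                  # gone after the merge
```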
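
Delete documents by condition

Documents are usually deleted by their unique identifier, but in practice several documents can also be deleted at once by a condition.

First view all current documents: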

GET /ecommerce/product/_search

Response

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qingxin"
          ]
        }
      }
    ]
  }
}

Delete all products with price=40:

POST /ecommerce/product/_delete_by_query
{
	"query": {
		"match": {
			"price": 40
		}
	}
}

Response

{
  "took": 322,
  "timed_out": false,
  "total": 1,
  "deleted": 1,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": []
}
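
The effect of the request can be reproduced locally on a dictionary of documents. This is a sketch with a made-up function that handles only exact matches on one field; it returns just the `total` and `deleted` counters from the response above.

```python
def delete_by_query(docs, field, value):
    """Delete every document whose `field` equals `value`; return summary counters."""
    matched = [doc_id for doc_id, source in docs.items() if source.get(field) == value]
    for doc_id in matched:
        del docs[doc_id]
    return {"total": len(matched), "deleted": len(matched)}


docs = {
    "2": {"name": "jiajieshi yagao", "price": 25},
    "3": {"name": "zhonghua yagao", "price": 40},
}
print(delete_by_query(docs, "price", 40))
print(sorted(docs))
```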

Query the products again

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      }
    ]
  }
}