Viewing the status of the Elasticsearch log store


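You can view the status of the OpenShift Elasticsearch Operator and of a number of Elasticsearch components.

You can view the status of the Elasticsearch log store.

Prerequisites

The Red Hat OpenShift Logging Operator and the OpenShift Elasticsearch Operator are installed.

Procedure

Change to the `openshift-logging` project by running the following command: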
```
$ oc project openshift-logging
```

To view the status:

Get the name of the Elasticsearch log store instance by running the following command:
```
$ oc get Elasticsearch
```
Example output
```
NAME            AGE
elasticsearch   5h9m
```

Get the Elasticsearch log store status by running the following command:

$ oc get Elasticsearch <Elasticsearch-instance> -o yaml

For example:

$ oc get Elasticsearch elasticsearch -n openshift-logging -o yaml

The output includes information similar to the following:

Example output

status: (1)
  cluster: (2)
    activePrimaryShards: 30
    activeShards: 60
    initializingShards: 0
    numDataNodes: 3
    numNodes: 3
    pendingTasks: 0
    relocatingShards: 0
    status: green
    unassignedShards: 0
  clusterHealth: ""
  conditions: [] (3)
  nodes: (4)
  - deploymentName: elasticsearch-cdm-zjf34ved-1
    upgradeStatus: {}
  - deploymentName: elasticsearch-cdm-zjf34ved-2
    upgradeStatus: {}
  - deploymentName: elasticsearch-cdm-zjf34ved-3
    upgradeStatus: {}
  pods: (5)
    client:
      failed: []
      notReady: []
      ready:
      - elasticsearch-cdm-zjf34ved-1-6d7fbf844f-sn422
      - elasticsearch-cdm-zjf34ved-2-dfbd988bc-qkzjz
      - elasticsearch-cdm-zjf34ved-3-c8f566f7c-t7zkt
    data:
      failed: []
      notReady: []
      ready:
      - elasticsearch-cdm-zjf34ved-1-6d7fbf844f-sn422
      - elasticsearch-cdm-zjf34ved-2-dfbd988bc-qkzjz
      - elasticsearch-cdm-zjf34ved-3-c8f566f7c-t7zkt
    master:
      failed: []
      notReady: []
      ready:
      - elasticsearch-cdm-zjf34ved-1-6d7fbf844f-sn422
      - elasticsearch-cdm-zjf34ved-2-dfbd988bc-qkzjz
      - elasticsearch-cdm-zjf34ved-3-c8f566f7c-t7zkt
  shardAllocationEnabled: all

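(1) In the output, the cluster status fields appear in the `status` stanza.
(2) The status of the Elasticsearch log store:
- `activePrimaryShards`: the number of active primary shards.
- `activeShards`: the number of active shards.
- `initializingShards`: the number of shards that are initializing.
- `numDataNodes`: the number of Elasticsearch log store data nodes.
- `numNodes`: the total number of Elasticsearch log store nodes.
- `pendingTasks`: the number of pending tasks.
- `relocatingShards`: the number of shards that are relocating.
- `status`: the Elasticsearch log store status, which is `green`, `red`, or `yellow`.
- `unassignedShards`: the number of unassigned shards.
(3) Any status conditions, if present. The Elasticsearch log store status reports the reasons given by the scheduler when a pod could not be placed. Events related to the following conditions are shown:
- Container Waiting, for both the Elasticsearch log store and proxy containers.
- Container Terminated, for both the Elasticsearch log store and proxy containers.
- Pod unschedulable.
A condition is also shown for a number of other issues; see Example condition messages.
(4) The Elasticsearch log store nodes in the cluster, each with its `upgradeStatus`.
(5) The Elasticsearch log store client, data, and master pods in the cluster, listed under the `failed`, `notReady`, or `ready` state.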
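The fields described in the callouts above can also be checked by a script. The following is a minimal Python sketch, assuming the custom resource was saved as JSON (for example with `oc get Elasticsearch elasticsearch -n openshift-logging -o json`) instead of YAML. The function name `cluster_problems` and the choice of checks are illustrative assumptions, not part of OpenShift Logging.

```python
import json


def cluster_problems(cr):
    """Return human-readable problems found in the `status` section of an
    Elasticsearch custom resource that was parsed from JSON."""
    status = cr.get("status", {})
    cluster = status.get("cluster", {})
    problems = []

    # `status.cluster.status` is green, yellow, or red.
    health = cluster.get("status", "unknown")
    if health != "green":
        problems.append(f"cluster health is {health}")

    # Nonzero unassigned or initializing shard counts are reported
    # so that they can be investigated.
    for field in ("unassignedShards", "initializingShards"):
        count = cluster.get(field, 0)
        if count:
            problems.append(f"{field} = {count}")

    # Pods are grouped by role and listed under failed, notReady, or ready.
    for role, groups in status.get("pods", {}).items():
        for state in ("failed", "notReady"):
            for pod in groups.get(state) or []:
                problems.append(f"{role} pod {pod} is {state}")

    return problems


if __name__ == "__main__":
    # Trimmed-down version of the example output shown above.
    sample = json.loads("""
    {"status": {"cluster": {"status": "green", "unassignedShards": 0,
                            "initializingShards": 0},
                "pods": {"data": {"failed": [], "notReady": [],
                                  "ready": ["elasticsearch-cdm-zjf34ved-1-6d7fbf844f-sn422"]}}}}
    """)
    print(cluster_problems(sample))
```

An empty list means that none of the checks in this sketch found a problem; it does not replace reviewing the full status output.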

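Example condition messages

The following are examples of some condition messages from the `Status` section of the Elasticsearch instance.

The following status message indicates that a node has exceeded the configured low watermark, and no shards will be allocated to this node.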

status:
  nodes:
  - conditions:
    - lastTransitionTime: 2019-03-15T15:57:22Z
      message: Disk storage usage for node is 27.5gb (36.74%). Shards will be not
        be allocated on this node.
      reason: Disk Watermark Low
      status: "True"
      type: NodeStorage
    deploymentName: example-elasticsearch-cdm-0-1
    upgradeStatus: {}

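The following status message indicates that a node has exceeded the configured high watermark, and shards will be relocated to other nodes.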

status:
  nodes:
  - conditions:
    - lastTransitionTime: 2019-03-15T16:04:45Z
      message: Disk storage usage for node is 27.5gb (36.74%). Shards will be relocated
        from this node.
      reason: Disk Watermark High
      status: "True"
      type: NodeStorage
    deploymentName: example-elasticsearch-cdm-0-1
    upgradeStatus: {}

The following status message indicates that the Elasticsearch log store node selector in the custom resource (CR) does not match any nodes in the cluster.

status:
    nodes:
    - conditions:
      - lastTransitionTime: 2019-04-10T02:26:24Z
        message: '0/8 nodes are available: 8 node(s) didn''t match node selector.'
        reason: Unschedulable
        status: "True"
        type: Unschedulable

The following status message indicates that the Elasticsearch log store CR uses a persistent volume claim (PVC) that does not exist.

status:
   nodes:
   - conditions:
     - last Transition Time:  2019-04-10T05:55:51Z
       message:               pod has unbound immediate PersistentVolumeClaims (repeated 5 times)
       reason:                Unschedulable
       status:                True
       type:                  Unschedulable

The following status message indicates that your Elasticsearch log store cluster does not have enough nodes to support the redundancy policy.

status:
  clusterHealth: ""
  conditions:
  - lastTransitionTime: 2019-04-17T20:01:31Z
    message: Wrong RedundancyPolicy selected. Choose different RedundancyPolicy or
      add more nodes with data roles
    reason: Invalid Settings
    status: "True"
    type: InvalidRedundancy

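The following status message indicates that your cluster has too many control plane nodes, that is, Elasticsearch nodes with the master role.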

status:
  clusterHealth: green
  conditions:
    - lastTransitionTime: '2019-04-17T20:12:34Z'
      message: >-
        Invalid master nodes count. Please ensure there are no more than 3 total
        nodes with master roles
      reason: Invalid Settings
      status: 'True'
      type: InvalidMasters

The following status message indicates that Elasticsearch storage does not support the change you tried to make.

For example:

status:
  clusterHealth: green
  conditions:
    - lastTransitionTime: "2021-05-07T01:05:13Z"
      message: Changing the storage structure for a custom resource is not supported
      reason: StorageStructureChangeIgnored
      status: 'True'
      type: StorageStructureChangeIgnored

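The `reason` and `type` fields specify the type of unsupported change:

`StorageClassNameChangeIgnored`

An unsupported change to the storage class name.

`StorageSizeChangeIgnored`

An unsupported change to the storage size.

`StorageStructureChangeIgnored`

An unsupported change between ephemeral and persistent storage structures.

If you try to configure the `ClusterLogging` CR to switch from ephemeral to persistent storage, the OpenShift Elasticsearch Operator creates a persistent volume claim (PVC) but does not create a persistent volume (PV). To clear the `StorageStructureChangeIgnored` status, you must revert the change to the `ClusterLogging` CR and delete the PVC.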
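The condition messages above appear in two places: the cluster-level `status.conditions` list and the `conditions` list of each entry under `status.nodes`. The following Python sketch gathers both, assuming the CR was parsed from JSON output. The helper names `active_conditions` and `ignored_storage_changes` are hypothetical and exist only for this illustration.

```python
def active_conditions(cr):
    """Collect conditions whose status is "True" from the cluster-level
    `status.conditions` list and from each entry under `status.nodes`."""
    status = cr.get("status", {})
    found = []

    for cond in status.get("conditions") or []:
        if str(cond.get("status")) == "True":
            found.append(("cluster", cond.get("type"), cond.get("message")))

    for node in status.get("nodes") or []:
        # Some node entries in the examples have no deploymentName,
        # so fall back to a placeholder label.
        name = node.get("deploymentName", "<unnamed node>")
        for cond in node.get("conditions") or []:
            if str(cond.get("status")) == "True":
                found.append((name, cond.get("type"), cond.get("message")))

    return found


# Condition types listed above that report an unsupported storage change.
IGNORED_STORAGE_CHANGES = {
    "StorageClassNameChangeIgnored",
    "StorageSizeChangeIgnored",
    "StorageStructureChangeIgnored",
}


def ignored_storage_changes(cr):
    """Return the sorted condition types that report an unsupported storage change."""
    return sorted(t for _, t, _ in active_conditions(cr) if t in IGNORED_STORAGE_CHANGES)


if __name__ == "__main__":
    # Based on the StorageStructureChangeIgnored example shown above.
    sample = {
        "status": {
            "clusterHealth": "green",
            "conditions": [{
                "message": "Changing the storage structure for a custom resource is not supported",
                "reason": "StorageStructureChangeIgnored",
                "status": "True",
                "type": "StorageStructureChangeIgnored",
            }],
        }
    }
    for scope, cond_type, message in active_conditions(sample):
        print(f"{scope}: {cond_type}: {message}")
    print(ignored_storage_changes(sample))
```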

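Viewing the status of the log store components

You can view the status of a number of the log store components.

Elasticsearch indices

You can view the status of the Elasticsearch indices.

Get the name of an Elasticsearch pod: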

$ oc get pods --selector component=elasticsearch -o name

Example output

pod/elasticsearch-cdm-1godmszn-1-6f8495-vp4lw
pod/elasticsearch-cdm-1godmszn-2-5769cf-9ms2n
pod/elasticsearch-cdm-1godmszn-3-f66f7d-zqkz7

Get the status of the indices:

$ oc exec elasticsearch-cdm-4vjor49p-2-6d4d7db474-q2w7z -- indices

Example output

Defaulting container name to elasticsearch.
Use 'oc describe pod/elasticsearch-cdm-4vjor49p-2-6d4d7db474-q2w7z -n openshift-logging' to see all of the containers in this pod.

green  open   infra-000002                                                     S4QANnf1QP6NgCegfnrnbQ   3   1     119926            0        157             78
green  open   audit-000001                                                     8_EQx77iQCSTzFOXtxRqFw   3   1          0            0          0              0
green  open   .security                                                        iDjscH7aSUGhIdq0LheLBQ   1   1          5            0          0              0
green  open   .kibana_-377444158_kubeadmin                                     yBywZ9GfSrKebz5gWBZbjw   3   1          1            0          0              0
green  open   infra-000001                                                     z6Dpe__ORgiopEpW6Yl44A   3   1     871000            0        874            436
green  open   app-000001                                                       hIrazQCeSISewG3c2VIvsQ   3   1       2453            0          3              1
green  open   .kibana_1                                                        JCitcBMSQxKOvIq6iQW6wg   1   1          0            0          0              0
green  open   .kibana_-1595131456_user1                                        gIYFIEGRRe-ka0W3okS-mQ   3   1          1            0          0              0
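When there are many indices, it can help to filter this output by health. The sketch below parses the text in Python. It assumes the ten columns follow the default order of the Elasticsearch `_cat/indices` API (health, status, index, UUID, primary shards, replicas, document count, deleted documents, store size, primary store size); the sample output above does not label its columns, so treat that mapping as an assumption. The names `IndexRow`, `parse_indices`, and `not_green` are hypothetical.

```python
from typing import NamedTuple


class IndexRow(NamedTuple):
    health: str
    state: str
    name: str
    uuid: str
    primaries: int
    replicas: int
    docs_count: int
    docs_deleted: int
    store_size: str          # kept as text because the unit is not stated
    primary_store_size: str  # kept as text because the unit is not stated


def parse_indices(output):
    """Parse the text printed by the `indices` command into IndexRow tuples."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        # Data rows have exactly ten columns and start with a health color.
        # This skips the "Defaulting container name..." banner and blank lines.
        if len(parts) != 10 or parts[0] not in ("green", "yellow", "red"):
            continue
        try:
            counts = [int(value) for value in parts[4:8]]
        except ValueError:
            continue  # not a data row in the assumed layout
        rows.append(IndexRow(parts[0], parts[1], parts[2], parts[3],
                             *counts, parts[8], parts[9]))
    return rows


def not_green(rows):
    """Return the names of indices whose health is not green."""
    return [row.name for row in rows if row.health != "green"]


if __name__ == "__main__":
    sample = """Defaulting container name to elasticsearch.
green  open   infra-000002   S4QANnf1QP6NgCegfnrnbQ   3   1     119926            0        157             78
green  open   app-000001     hIrazQCeSISewG3c2VIvsQ   3   1       2453            0          3              1
"""
    rows = parse_indices(sample)
    print(len(rows), not_green(rows))
```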

Log store pods

You can view the status of the pods that host the log store.

Get the name of a pod:

$ oc get pods --selector component=elasticsearch -o name

Example output

pod/elasticsearch-cdm-1godmszn-1-6f8495-vp4lw
pod/elasticsearch-cdm-1godmszn-2-5769cf-9ms2n
pod/elasticsearch-cdm-1godmszn-3-f66f7d-zqkz7

Get the status of a pod:

$ oc describe pod elasticsearch-cdm-1godmszn-1-6f8495-vp4lw

The output includes the following status information:

Example output

....
Status:             Running

....

Containers:
  elasticsearch:
    Container ID:   cri-o://b7d44e0a9ea486e27f47763f5bb4c39dfd2
    State:          Running
      Started:      Mon, 08 Jun 2020 10:17:56 -0400
    Ready:          True
    Restart Count:  0
    Readiness:  exec [/usr/share/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3

....

  proxy:
    Container ID:  cri-o://3f77032abaddbb1652c116278652908dc01860320b8a4e741d06894b2f8f9aa1
    State:          Running
      Started:      Mon, 08 Jun 2020 10:18:38 -0400
    Ready:          True
    Restart Count:  0

....

Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True

....

Events:          <none>
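The same pod information is available in structured form. The following Python sketch assumes the pod was retrieved as JSON (for example with `oc get pod <pod_name> -o json`) and reads the standard Kubernetes `status.phase`, `status.conditions`, and `status.containerStatuses` fields. The function name `pod_problems` and the decision to check only the four condition types shown in the sample output are assumptions made for this illustration.

```python
# The four condition types that appear in the sample `oc describe pod` output.
EXPECTED_CONDITIONS = {"Initialized", "Ready", "ContainersReady", "PodScheduled"}


def pod_problems(pod):
    """Return a list of problems found in a pod object parsed from JSON."""
    status = pod.get("status", {})
    problems = []

    phase = status.get("phase")
    if phase != "Running":
        problems.append(f"phase is {phase}")

    # Each of the expected conditions should report the string "True".
    for cond in status.get("conditions") or []:
        if cond.get("type") in EXPECTED_CONDITIONS and cond.get("status") != "True":
            problems.append(f"condition {cond.get('type')} is {cond.get('status')}")

    # Each container (elasticsearch and proxy in the sample) reports
    # readiness and a restart count.
    for container in status.get("containerStatuses") or []:
        name = container.get("name")
        if not container.get("ready"):
            problems.append(f"container {name} is not ready")
        restarts = container.get("restartCount", 0)
        if restarts > 0:
            problems.append(f"container {name} restarted {restarts} time(s)")

    return problems


if __name__ == "__main__":
    sample = {
        "status": {
            "phase": "Running",
            "conditions": [{"type": "Ready", "status": "True"}],
            "containerStatuses": [
                {"name": "elasticsearch", "ready": True, "restartCount": 0},
                {"name": "proxy", "ready": True, "restartCount": 0},
            ],
        }
    }
    print(pod_problems(sample))
```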

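Log storage pod deployment configuration

You can view the status of the log store deployment configuration.

Get the name of a deployment configuration: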

$ oc get deployment --selector component=elasticsearch -o name

Example output

deployment.extensions/elasticsearch-cdm-1gon-1
deployment.extensions/elasticsearch-cdm-1gon-2
deployment.extensions/elasticsearch-cdm-1gon-3

Get the status of the deployment configuration:

$ oc describe deployment elasticsearch-cdm-1gon-1

The output includes the following status information:

Example output

....
  Containers:
   elasticsearch:
    Image:      registry.redhat.io/openshift-logging/elasticsearch6-rhel8
    Readiness:  exec [/usr/share/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3

....

Conditions:
  Type           Status   Reason
  ----           ------   ------
  Progressing    Unknown  DeploymentPaused
  Available      True     MinimumReplicasAvailable

....

Events:          <none>
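The Conditions table can also be read from the JSON form of the deployment (for example `oc get deployment <deployment_name> -o json`). In the sample output above, `Progressing` is `Unknown` with reason `DeploymentPaused` while `Available` is `True`, so this Python sketch treats only `Available` as the health signal; that choice, and the function names `condition_table` and `deployment_available`, are assumptions for illustration.

```python
def condition_table(deployment):
    """Map each condition type to a (status, reason) pair, mirroring the
    Conditions table printed by `oc describe deployment`."""
    conditions = deployment.get("status", {}).get("conditions") or []
    return {c.get("type"): (c.get("status"), c.get("reason")) for c in conditions}


def deployment_available(deployment):
    """Return True when the Available condition reports the string "True"."""
    status, _reason = condition_table(deployment).get("Available", (None, None))
    return status == "True"


if __name__ == "__main__":
    # Values taken from the sample output shown above.
    sample = {
        "status": {
            "conditions": [
                {"type": "Progressing", "status": "Unknown", "reason": "DeploymentPaused"},
                {"type": "Available", "status": "True", "reason": "MinimumReplicasAvailable"},
            ]
        }
    }
    print(condition_table(sample))
    print(deployment_available(sample))
```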

Log store replica set

You can view the status of the log store replica set.

Get the name of a replica set:

$ oc get replicaSet --selector component=elasticsearch -o name

Example output

replicaset.extensions/elasticsearch-cdm-1gon-1-6f8495
replicaset.extensions/elasticsearch-cdm-1gon-2-5769cf
replicaset.extensions/elasticsearch-cdm-1gon-3-f66f7d

Get the status of the replica set:

$ oc describe replicaSet elasticsearch-cdm-1gon-1-6f8495

The output includes the following status information:

Example output

....
  Containers:
   elasticsearch:
    Image:      registry.redhat.io/openshift-logging/elasticsearch6-rhel8@sha256:4265742c7cdd85359140e2d7d703e4311b6497eec7676957f455d6908e7b1c25
    Readiness:  exec [/usr/share/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3

....

Events:          <none>

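Elasticsearch cluster status

A dashboard in the **Observe** section of the OpenShift Cluster Manager displays the status of the Elasticsearch cluster.

To get the status of the OpenShift Elasticsearch cluster, visit the dashboard in the **Observe** section of the OpenShift Cluster Manager at <cluster_url>/monitoring/dashboards/grafana-dashboard-cluster-logging.

Elasticsearch status fields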

eo_elasticsearch_cr_cluster_management_state

Shows whether the Elasticsearch cluster is in a managed or unmanaged state. For example:

eo_elasticsearch_cr_cluster_management_state{state="managed"} 1
eo_elasticsearch_cr_cluster_management_state{state="unmanaged"} 0

eo_elasticsearch_cr_restart_total

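Shows the number of times the Elasticsearch nodes have restarted because of certificate restarts, rolling restarts, or scheduled restarts. For example: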

eo_elasticsearch_cr_restart_total{reason="cert_restart"} 1
eo_elasticsearch_cr_restart_total{reason="rolling_restart"} 1
eo_elasticsearch_cr_restart_total{reason="scheduled_restart"} 3

es_index_namespaces_total

Shows the total number of Elasticsearch index namespaces. For example:

Total number of Namespaces.
es_index_namespaces_total 5

es_index_document_count

Shows the number of records for each namespace. For example:

es_index_document_count{namespace="namespace_1"} 25
es_index_document_count{namespace="namespace_2"} 10
es_index_document_count{namespace="namespace_3"} 5
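The metric samples above share one layout: a metric name, an optional set of labels in braces, and a numeric value. The following Python sketch parses lines of that shape and totals a metric across its labels. It is a simplified parser written for the examples on this page, not a full implementation of the Prometheus text format, and the names `parse_samples` and `total` are hypothetical.

```python
import re

# name, optional {label="value",...} block, then a single value token.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Parse metric lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            # Skips blank lines and free-text lines such as
            # "Total number of Namespaces."
            continue
        try:
            value = float(match.group("value"))
        except ValueError:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, value))
    return samples


def total(samples, name):
    """Sum the values of every sample with the given metric name."""
    return sum(value for sample_name, _labels, value in samples if sample_name == name)


if __name__ == "__main__":
    text = """
eo_elasticsearch_cr_restart_total{reason="cert_restart"} 1
eo_elasticsearch_cr_restart_total{reason="rolling_restart"} 1
eo_elasticsearch_cr_restart_total{reason="scheduled_restart"} 3
Total number of Namespaces.
es_index_namespaces_total 5
"""
    samples = parse_samples(text)
    print(total(samples, "eo_elasticsearch_cr_restart_total"))
```

Summing `eo_elasticsearch_cr_restart_total` across its `reason` labels gives the overall restart count for the sample values shown.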

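The "Secret Elasticsearch fields are either missing or empty" message

If Elasticsearch is missing the `admin-cert`, `admin-key`, `logging-es.crt`, or `logging-es.key` files, the dashboard shows a status message similar to the following example: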

"message": "Secret \"elasticsearch\" fields are either missing or empty: [admin-cert, admin-key, logging-es.crt, logging-es.key]",
"reason": "Missing Required Secrets",