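You can view the status of the OpenShift Elasticsearch Operator and of several Elasticsearch components.

Viewing the Elasticsearch log store status

You can view the status of the Elasticsearch log store.

Prerequisites
  • The Red Hat OpenShift Logging Operator and the OpenShift Elasticsearch Operator are installed.

Procedure
  1. Change to the openshift-logging project by running the following command: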

    $ oc project openshift-logging
  2. To view the status:

    1. Get the name of the Elasticsearch log store instance by running the following command:

      $ oc get Elasticsearch
      Example output
      NAME            AGE
      elasticsearch   5h9m
    2. Get the Elasticsearch log store status by running the following command:

      $ oc get Elasticsearch <Elasticsearch-instance> -o yaml

      For example:

      $ oc get Elasticsearch elasticsearch -n openshift-logging -o yaml

      The output includes information similar to the following:

      Example output
      status: (1)
        cluster: (2)
          activePrimaryShards: 30
          activeShards: 60
          initializingShards: 0
          numDataNodes: 3
          numNodes: 3
          pendingTasks: 0
          relocatingShards: 0
          status: green
          unassignedShards: 0
        clusterHealth: ""
        conditions: [] (3)
        nodes: (4)
        - deploymentName: elasticsearch-cdm-zjf34ved-1
          upgradeStatus: {}
        - deploymentName: elasticsearch-cdm-zjf34ved-2
          upgradeStatus: {}
        - deploymentName: elasticsearch-cdm-zjf34ved-3
          upgradeStatus: {}
        pods: (5)
          client:
            failed: []
            notReady: []
            ready:
            - elasticsearch-cdm-zjf34ved-1-6d7fbf844f-sn422
            - elasticsearch-cdm-zjf34ved-2-dfbd988bc-qkzjz
            - elasticsearch-cdm-zjf34ved-3-c8f566f7c-t7zkt
          data:
            failed: []
            notReady: []
            ready:
            - elasticsearch-cdm-zjf34ved-1-6d7fbf844f-sn422
            - elasticsearch-cdm-zjf34ved-2-dfbd988bc-qkzjz
            - elasticsearch-cdm-zjf34ved-3-c8f566f7c-t7zkt
          master:
            failed: []
            notReady: []
            ready:
            - elasticsearch-cdm-zjf34ved-1-6d7fbf844f-sn422
            - elasticsearch-cdm-zjf34ved-2-dfbd988bc-qkzjz
            - elasticsearch-cdm-zjf34ved-3-c8f566f7c-t7zkt
        shardAllocationEnabled: all
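      1 In the output, the cluster status fields appear in the status stanza.
      2 The status of the Elasticsearch log store:
      • The number of active primary shards.

      • The number of active shards.

      • The number of shards that are initializing.

      • The number of Elasticsearch log store data nodes.

      • The total number of Elasticsearch log store nodes.

      • The number of pending tasks.

      • The Elasticsearch log store status: green, red, or yellow.

      • The number of unassigned shards.

      3 Any status conditions, if present. The Elasticsearch log store status indicates the reasons from the scheduler if a pod could not be placed. Any events related to the following conditions are shown:
      • Container Waiting for both the Elasticsearch log store and proxy containers.

      • Container Terminated for both the Elasticsearch log store and proxy containers.

      • Pod unschedulable. A condition is also shown for a number of other issues; see **Example condition messages**.

      4 The Elasticsearch log store nodes in the cluster, with upgradeStatus.
      5 The Elasticsearch log store client, data, and master pods in the cluster, listed under the failed, notReady, or ready state.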
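
The fields described in the callouts can also be checked by a script. The following is a minimal sketch, assuming the resource was exported as JSON (`-o json` instead of `-o yaml`). The function name `summarize_log_store` and the `ok` rule (green status and no failed or not-ready pods) are illustrative assumptions, not part of the product.

```python
import json


def summarize_log_store(resource: dict) -> dict:
    """Condense the status stanza of an Elasticsearch custom resource."""
    status = resource.get("status") or {}
    cluster = status.get("cluster") or {}
    pods = status.get("pods") or {}
    # Gather every pod listed as failed or notReady under any role
    # (client, data, master); a set removes duplicates across roles.
    unhealthy = sorted({
        name
        for role in pods.values()
        for state in ("failed", "notReady")
        for name in (role.get(state) or [])
    })
    health = cluster.get("status", "unknown")
    return {
        "health": health,
        "unassigned_shards": cluster.get("unassignedShards", 0),
        "pending_tasks": cluster.get("pendingTasks", 0),
        "unhealthy_pods": unhealthy,
        # Assumed rule of thumb, not an official health definition.
        "ok": health == "green" and not unhealthy,
    }


# Trimmed sample shaped like the example output above, expressed as JSON.
SAMPLE = json.loads("""
{
  "status": {
    "cluster": {"status": "green", "unassignedShards": 0, "pendingTasks": 0},
    "pods": {
      "client": {"failed": [], "notReady": [],
                 "ready": ["elasticsearch-cdm-zjf34ved-1-6d7fbf844f-sn422"]},
      "data": {"failed": [], "notReady": [],
               "ready": ["elasticsearch-cdm-zjf34ved-1-6d7fbf844f-sn422"]},
      "master": {"failed": [], "notReady": [],
                 "ready": ["elasticsearch-cdm-zjf34ved-1-6d7fbf844f-sn422"]}
    }
  }
}
""")

print(summarize_log_store(SAMPLE))
```

To use it against a live cluster, the output of `oc get Elasticsearch elasticsearch -n openshift-logging -o json` could be passed through `json.loads` and handed to the function.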

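Example condition messages

The following are examples of some condition messages from the Status section of the Elasticsearch instance.

The following status message indicates that a node has exceeded the configured low watermark, and no shard will be allocated to this node.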

status:
  nodes:
  - conditions:
    - lastTransitionTime: 2019-03-15T15:57:22Z
      message: Disk storage usage for node is 27.5gb (36.74%). Shards will be not
        be allocated on this node.
      reason: Disk Watermark Low
      status: "True"
      type: NodeStorage
    deploymentName: example-elasticsearch-cdm-0-1
    upgradeStatus: {}

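The following status message indicates that a node has exceeded the configured high watermark, and shards will be relocated to other nodes.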

status:
  nodes:
  - conditions:
    - lastTransitionTime: 2019-03-15T16:04:45Z
      message: Disk storage usage for node is 27.5gb (36.74%). Shards will be relocated
        from this node.
      reason: Disk Watermark High
      status: "True"
      type: NodeStorage
    deploymentName: example-elasticsearch-cdm-0-1
    upgradeStatus: {}

The following status message indicates that the Elasticsearch log store node selector in the custom resource (CR) does not match any nodes in the cluster.

status:
    nodes:
    - conditions:
      - lastTransitionTime: 2019-04-10T02:26:24Z
        message: '0/8 nodes are available: 8 node(s) didn''t match node selector.'
        reason: Unschedulable
        status: "True"
        type: Unschedulable

The following status message indicates that the Elasticsearch log store CR uses a non-existent persistent volume claim (PVC).

status:
   nodes:
   - conditions:
     - lastTransitionTime: 2019-04-10T05:55:51Z
       message: pod has unbound immediate PersistentVolumeClaims (repeated 5 times)
       reason: Unschedulable
       status: "True"
       type: Unschedulable

The following status message indicates that your Elasticsearch log store cluster does not have enough nodes to support the redundancy policy.

status:
  clusterHealth: ""
  conditions:
  - lastTransitionTime: 2019-04-17T20:01:31Z
    message: Wrong RedundancyPolicy selected. Choose different RedundancyPolicy or
      add more nodes with data roles
    reason: Invalid Settings
    status: "True"
    type: InvalidRedundancy

The following status message indicates that your Elasticsearch cluster has too many control plane (master) nodes.

status:
  clusterHealth: green
  conditions:
    - lastTransitionTime: '2019-04-17T20:12:34Z'
      message: >-
        Invalid master nodes count. Please ensure there are no more than 3 total
        nodes with master roles
      reason: Invalid Settings
      status: 'True'
      type: InvalidMasters

The following status message indicates that Elasticsearch storage does not support the change you tried to make.

For example:

status:
  clusterHealth: green
  conditions:
    - lastTransitionTime: "2021-05-07T01:05:13Z"
      message: Changing the storage structure for a custom resource is not supported
      reason: StorageStructureChangeIgnored
      status: 'True'
      type: StorageStructureChangeIgnored

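The reason and type fields specify the type of unsupported change:

StorageClassNameChangeIgnored

Unsupported change to the storage class name.

StorageSizeChangeIgnored

Unsupported change to the storage size.

StorageStructureChangeIgnored

Unsupported change between ephemeral and persistent storage structures.

If you try to configure the ClusterLogging CR to switch from ephemeral to persistent storage, the OpenShift Elasticsearch Operator creates a persistent volume claim (PVC) but does not create a persistent volume (PV). To clear the StorageStructureChangeIgnored status, you must revert the change to the ClusterLogging CR and delete the PVC.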
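
The condition examples in this section appear in two places: under `status.conditions` for the cluster as a whole and under `status.nodes[].conditions` for individual nodes. The following is a minimal sketch, assuming the status stanza has already been parsed into a Python dictionary, that gathers the active conditions from both places. The helper name `collect_conditions` and the `"cluster"` scope label are hypothetical.

```python
def collect_conditions(status: dict) -> list:
    """Return (scope, type, reason, message) for every active condition."""
    found = []

    def scan(scope, conditions):
        for cond in conditions or []:
            # The examples show the status both quoted and unquoted,
            # so compare on the string form.
            if str(cond.get("status")) == "True":
                found.append((scope, cond.get("type"),
                              cond.get("reason"), cond.get("message")))

    # Cluster-wide conditions, such as InvalidRedundancy.
    scan("cluster", status.get("conditions"))
    # Per-node conditions, such as NodeStorage or Unschedulable.
    for node in status.get("nodes") or []:
        scan(node.get("deploymentName", "<unnamed node>"), node.get("conditions"))
    return found


# Sample assembled from two of the example messages in this section.
SAMPLE_STATUS = {
    "conditions": [{
        "type": "InvalidRedundancy",
        "reason": "Invalid Settings",
        "status": "True",
        "message": "Wrong RedundancyPolicy selected. Choose different "
                   "RedundancyPolicy or add more nodes with data roles",
    }],
    "nodes": [{
        "deploymentName": "example-elasticsearch-cdm-0-1",
        "conditions": [{
            "type": "NodeStorage",
            "reason": "Disk Watermark Low",
            "status": "True",
            "message": "Disk storage usage for node is 27.5gb (36.74%). "
                       "Shards will be not be allocated on this node.",
        }],
    }],
}

for scope, ctype, reason, _message in collect_conditions(SAMPLE_STATUS):
    print(f"{scope}: {ctype} ({reason})")
```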

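Viewing the status of the log store components

You can view the status of a number of the log store components.

Elasticsearch indices

You can view the status of the Elasticsearch indices.

  1. Get the name of an Elasticsearch pod: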

    $ oc get pods --selector component=elasticsearch -o name
    Example output
    pod/elasticsearch-cdm-1godmszn-1-6f8495-vp4lw
    pod/elasticsearch-cdm-1godmszn-2-5769cf-9ms2n
    pod/elasticsearch-cdm-1godmszn-3-f66f7d-zqkz7
  2. Get the status of the indices:

    $ oc exec elasticsearch-cdm-4vjor49p-2-6d4d7db474-q2w7z -- indices
    Example output
    Defaulting container name to elasticsearch.
    Use 'oc describe pod/elasticsearch-cdm-4vjor49p-2-6d4d7db474-q2w7z -n openshift-logging' to see all of the containers in this pod.
    
    green  open   infra-000002                                                     S4QANnf1QP6NgCegfnrnbQ   3   1     119926            0        157             78
    green  open   audit-000001                                                     8_EQx77iQCSTzFOXtxRqFw   3   1          0            0          0              0
    green  open   .security                                                        iDjscH7aSUGhIdq0LheLBQ   1   1          5            0          0              0
    green  open   .kibana_-377444158_kubeadmin                                     yBywZ9GfSrKebz5gWBZbjw   3   1          1            0          0              0
    green  open   infra-000001                                                     z6Dpe__ORgiopEpW6Yl44A   3   1     871000            0        874            436
    green  open   app-000001                                                       hIrazQCeSISewG3c2VIvsQ   3   1       2453            0          3              1
    green  open   .kibana_1                                                        JCitcBMSQxKOvIq6iQW6wg   1   1          0            0          0              0
    green  open   .kibana_-1595131456_user1                                        gIYFIEGRRe-ka0W3okS-mQ   3   1          1            0          0              0
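
The index listing above has no header row, so scanning it by eye for a yellow or red index is error-prone. The following is a rough sketch of a parser for that text. It assumes the columns follow the default order of the Elasticsearch `_cat/indices` API (health, status, index, uuid, primaries, replicas, document count, and so on); that mapping is an assumption, because the example output does not label its columns. The names `IndexRow` and `parse_indices` are hypothetical.

```python
from typing import NamedTuple, Optional


class IndexRow(NamedTuple):
    health: str
    state: str
    name: str
    uuid: str
    primaries: int
    replicas: int
    docs: Optional[int]  # may be absent for an unhealthy index


def parse_indices(text: str) -> list:
    """Parse the whitespace-separated rows printed by the indices command."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows start with a health color; this skips blank lines and
        # banner lines such as "Defaulting container name to ...".
        if len(parts) < 6 or parts[0] not in ("green", "yellow", "red"):
            continue
        docs = int(parts[6]) if len(parts) > 6 and parts[6].isdigit() else None
        rows.append(IndexRow(parts[0], parts[1], parts[2], parts[3],
                             int(parts[4]), int(parts[5]), docs))
    return rows


SAMPLE_OUTPUT = """\
Defaulting container name to elasticsearch.

green  open   infra-000002      S4QANnf1QP6NgCegfnrnbQ   3   1     119926            0        157             78
green  open   audit-000001      8_EQx77iQCSTzFOXtxRqFw   3   1          0            0          0              0
green  open   app-000001        hIrazQCeSISewG3c2VIvsQ   3   1       2453            0          3              1
"""

rows = parse_indices(SAMPLE_OUTPUT)
not_green = [row.name for row in rows if row.health != "green"]
print(f"{len(rows)} indices parsed, not green: {not_green}")
```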
Log store pods

You can view the status of the pods that host the log store.

  1. Get the name of a pod:

    $ oc get pods --selector component=elasticsearch -o name
    Example output
    pod/elasticsearch-cdm-1godmszn-1-6f8495-vp4lw
    pod/elasticsearch-cdm-1godmszn-2-5769cf-9ms2n
    pod/elasticsearch-cdm-1godmszn-3-f66f7d-zqkz7
  2. Get the status of a pod:

    $ oc describe pod elasticsearch-cdm-1godmszn-1-6f8495-vp4lw

    The output includes the following status information:

    Example output
    ....
    Status:             Running
    
    ....
    
    Containers:
      elasticsearch:
        Container ID:   cri-o://b7d44e0a9ea486e27f47763f5bb4c39dfd2
        State:          Running
          Started:      Mon, 08 Jun 2020 10:17:56 -0400
        Ready:          True
        Restart Count:  0
        Readiness:  exec [/usr/share/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
    
    ....
    
      proxy:
        Container ID:  cri-o://3f77032abaddbb1652c116278652908dc01860320b8a4e741d06894b2f8f9aa1
        State:          Running
          Started:      Mon, 08 Jun 2020 10:18:38 -0400
        Ready:          True
        Restart Count:  0
    
    ....
    
    Conditions:
      Type              Status
      Initialized       True
      Ready             True
      ContainersReady   True
      PodScheduled      True
    
    ....
    
    Events:          <none>
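
The same fields shown by `oc describe pod` (pod status, per-container readiness, and the conditions table) are available in structured form from `oc get pod <pod_name> -o json`. The following is a minimal sketch, assuming that JSON has been parsed into a dictionary, that lists anything that deviates from the healthy example above. The function name `pod_problems` is hypothetical; the field names (`status.phase`, `status.containerStatuses`, `status.conditions`) are those of the standard Kubernetes Pod object.

```python
def pod_problems(pod: dict) -> list:
    """List human-readable problems found in a Pod object's status."""
    status = pod.get("status") or {}
    problems = []
    phase = status.get("phase")
    if phase != "Running":
        problems.append(f"phase is {phase}")
    # One entry per container, for example elasticsearch and proxy.
    for container in status.get("containerStatuses") or []:
        if not container.get("ready", False):
            problems.append(f"container {container.get('name')} is not ready")
    # Initialized, Ready, ContainersReady, PodScheduled.
    for cond in status.get("conditions") or []:
        if cond.get("status") != "True":
            problems.append(f"condition {cond.get('type')} is {cond.get('status')}")
    return problems


# Sample shaped like the healthy pod in the example output.
SAMPLE_POD = {
    "status": {
        "phase": "Running",
        "containerStatuses": [
            {"name": "elasticsearch", "ready": True, "restartCount": 0},
            {"name": "proxy", "ready": True, "restartCount": 0},
        ],
        "conditions": [
            {"type": "Initialized", "status": "True"},
            {"type": "Ready", "status": "True"},
            {"type": "ContainersReady", "status": "True"},
            {"type": "PodScheduled", "status": "True"},
        ],
    }
}

print(pod_problems(SAMPLE_POD) or "no problems found")
```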
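Log storage pod deployment configuration

You can view the status of the log store deployment configuration.

  1. Get the name of a deployment configuration: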

    $ oc get deployment --selector component=elasticsearch -o name
    Example output
    deployment.extensions/elasticsearch-cdm-1gon-1
    deployment.extensions/elasticsearch-cdm-1gon-2
    deployment.extensions/elasticsearch-cdm-1gon-3
  2. Get the deployment configuration status:

    $ oc describe deployment elasticsearch-cdm-1gon-1

    The output includes the following status information:

    Example output
    ....
      Containers:
       elasticsearch:
        Image:      registry.redhat.io/openshift-logging/elasticsearch6-rhel8
        Readiness:  exec [/usr/share/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
    
    ....
    
    Conditions:
      Type           Status   Reason
      ----           ------   ------
      Progressing    Unknown  DeploymentPaused
      Available      True     MinimumReplicasAvailable
    
    ....
    
    Events:          <none>
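
The Conditions table in the example output corresponds to `status.conditions` in the Deployment object. The following is a small sketch, assuming the object was fetched with `oc get deployment <name> -o json` and parsed into a dictionary, that indexes the conditions by type and checks the Available condition. The function names are hypothetical.

```python
def deployment_conditions(deployment: dict) -> dict:
    """Map each condition type to its (status, reason) pair."""
    conditions = (deployment.get("status") or {}).get("conditions") or []
    return {c.get("type"): (c.get("status"), c.get("reason")) for c in conditions}


def is_available(deployment: dict) -> bool:
    """True when the Available condition is present and has status "True"."""
    status, _reason = deployment_conditions(deployment).get(
        "Available", ("Unknown", None))
    return status == "True"


# Sample mirroring the Conditions table in the example output.
SAMPLE_DEPLOYMENT = {
    "status": {
        "conditions": [
            {"type": "Progressing", "status": "Unknown", "reason": "DeploymentPaused"},
            {"type": "Available", "status": "True", "reason": "MinimumReplicasAvailable"},
        ]
    }
}

print(deployment_conditions(SAMPLE_DEPLOYMENT))
print("available:", is_available(SAMPLE_DEPLOYMENT))
```

In the example output, Progressing is Unknown with the reason DeploymentPaused while Available is True, which is why this sketch keys on Available alone.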
Log store replica set

You can view the status of the log store replica set.

  1. Get the name of a replica set:

    $ oc get replicaSet --selector component=elasticsearch -o name
    Example output
    replicaset.extensions/elasticsearch-cdm-1gon-1-6f8495
    replicaset.extensions/elasticsearch-cdm-1gon-2-5769cf
    replicaset.extensions/elasticsearch-cdm-1gon-3-f66f7d
  2. Get the status of the replica set:

    $ oc describe replicaSet elasticsearch-cdm-1gon-1-6f8495

    The output includes the following status information:

    Example output
    ....
      Containers:
       elasticsearch:
        Image:      registry.redhat.io/openshift-logging/elasticsearch6-rhel8@sha256:4265742c7cdd85359140e2d7d703e4311b6497eec7676957f455d6908e7b1c25
        Readiness:  exec [/usr/share/elasticsearch/probe/readiness.sh] delay=10s timeout=30s period=5s #success=1 #failure=3
    
    ....
    
    Events:          <none>
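
The `oc describe replicaSet` output above is trimmed and does not show replica counts. As a complementary check, the following sketch compares desired and ready replicas, assuming the object was fetched with `oc get replicaSet <name> -o json` and parsed into a dictionary. The function name `replica_set_ready` is hypothetical; `spec.replicas` and `status.readyReplicas` are standard ReplicaSet fields.

```python
def replica_set_ready(replica_set: dict) -> bool:
    """True when at least as many replicas are ready as are desired."""
    # Kubernetes defaults spec.replicas to 1 when it is not set.
    desired = (replica_set.get("spec") or {}).get("replicas", 1)
    # status.readyReplicas is omitted from the object when it is zero.
    ready = (replica_set.get("status") or {}).get("readyReplicas", 0)
    return ready >= desired


# Hypothetical objects for illustration only.
healthy = {"spec": {"replicas": 1}, "status": {"replicas": 1, "readyReplicas": 1}}
starting = {"spec": {"replicas": 1}, "status": {"replicas": 1}}

print("healthy:", replica_set_ready(healthy))
print("starting:", replica_set_ready(starting))
```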

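Elasticsearch cluster status

A dashboard in the **Observe** section of the OpenShift Cluster Manager displays the status of the Elasticsearch cluster.

To get the status of the OpenShift Elasticsearch cluster, visit the dashboard in the **Observe** section of the OpenShift Cluster Manager at <cluster_url>/monitoring/dashboards/grafana-dashboard-cluster-logging

Elasticsearch status fields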
eo_elasticsearch_cr_cluster_management_state

Shows whether the Elasticsearch cluster is in a managed or unmanaged state. For example:

eo_elasticsearch_cr_cluster_management_state{state="managed"} 1
eo_elasticsearch_cr_cluster_management_state{state="unmanaged"} 0
eo_elasticsearch_cr_restart_total

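Shows the number of times the Elasticsearch nodes have restarted because of certificate restarts, rolling restarts, or scheduled restarts. For example: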

eo_elasticsearch_cr_restart_total{reason="cert_restart"} 1
eo_elasticsearch_cr_restart_total{reason="rolling_restart"} 1
eo_elasticsearch_cr_restart_total{reason="scheduled_restart"} 3
es_index_namespaces_total

Shows the total number of Elasticsearch index namespaces. For example:

Total number of Namespaces.
es_index_namespaces_total 5
es_index_document_count

Shows the number of records for each namespace. For example:

es_index_document_count{namespace="namespace_1"} 25
es_index_document_count{namespace="namespace_2"} 10
es_index_document_count{namespace="namespace_3"} 5
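
The example values above use the common `name{label="value"} number` sample format. The following is a minimal sketch of a parser for lines in that shape, assuming the samples are available as plain text. It is not a full Prometheus exposition-format parser, and the names `parse_samples`, `SAMPLE_RE`, and `LABEL_RE` are hypothetical.

```python
import re

# name, optional {labels}, whitespace, value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text: str) -> list:
    """Return (metric_name, labels, value) tuples for each sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if not match:
            continue  # free text such as "Total number of Namespaces."
        try:
            value = float(match.group("value"))
        except ValueError:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, value))
    return samples


SAMPLE_TEXT = """\
eo_elasticsearch_cr_cluster_management_state{state="managed"} 1
eo_elasticsearch_cr_cluster_management_state{state="unmanaged"} 0
Total number of Namespaces.
es_index_namespaces_total 5
es_index_document_count{namespace="namespace_1"} 25
es_index_document_count{namespace="namespace_2"} 10
es_index_document_count{namespace="namespace_3"} 5
"""

samples = parse_samples(SAMPLE_TEXT)
total_docs = sum(value for name, _labels, value in samples
                 if name == "es_index_document_count")
print("documents across namespaces:", total_docs)
```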
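"Secret Elasticsearch fields are either missing or empty" message

If Elasticsearch is missing the admin-cert, admin-key, logging-es.crt, or logging-es.key files, the dashboard shows a status message similar to the following example: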

"message": "Secret \"elasticsearch\" fields are either missing or empty: [admin-cert, admin-key, logging-es.crt, logging-es.key]",
"reason": "Missing Required Secrets",