Accessing third-party monitoring APIs | Red Hat OpenShift Service on AWS

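In Red Hat OpenShift Service on AWS, you can access the web service APIs of some monitoring components from the command-line interface (CLI).

In certain situations, accessing API endpoints can degrade the performance and scalability of your cluster, especially if you use the endpoints to retrieve, send, or query large amounts of metrics data.

To avoid these issues, follow these recommendations:

- Avoid querying endpoints frequently. Limit queries to a maximum of one every 30 seconds.
- Do not try to retrieve all metrics data through the `/federate` endpoint for Prometheus. Query it only when you want to retrieve a limited, aggregated data set. For example, retrieving fewer than 1,000 samples for each request helps minimize the risk of performance degradation.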
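
As a rough illustration of the 30-second recommendation, the following shell sketch computes how long a polling script should wait before it sends its next query. The `MIN_INTERVAL` variable and the `seconds_to_wait` function are hypothetical names introduced here; they are not part of any OpenShift tooling.

```shell
# Hypothetical helper: keep at least MIN_INTERVAL seconds between two queries.
MIN_INTERVAL=30

# seconds_to_wait NOW LAST
# Prints how many seconds remain before the next query is allowed, given the
# current time and the time of the last query (both in epoch seconds).
seconds_to_wait() {
  elapsed=$(( $1 - $2 ))
  if [ "$elapsed" -ge "$MIN_INTERVAL" ]; then
    echo 0
  else
    echo $(( MIN_INTERVAL - elapsed ))
  fi
}
```

In a polling loop, a script could run `sleep "$(seconds_to_wait "$(date +%s)" "$last_query")"` before each request, where `last_query` is an assumed variable holding the time of the previous request.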

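About accessing monitoring web service APIs

You can access the web service API endpoints of the following monitoring stack components directly from the command line:

- Prometheus
- Alertmanager
- Thanos Ruler
- Thanos Querier

To access the Thanos Ruler and Thanos Querier service APIs, the requesting account must have `get` permission on the namespaces resource. You can grant this permission by binding the `cluster-monitoring-view` cluster role to the account.

When you access web service API endpoints for monitoring components, be aware of the following limitations:

- You can only use bearer token authentication to access API endpoints.
- You can only access endpoints in the `/api` path of a route. If you try to access an API endpoint in a web browser, an `Application is not available` error occurs. To access monitoring features in a web browser, use the Red Hat OpenShift Service on AWS web console to review monitoring dashboards.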

Additional resources

- Reviewing monitoring dashboards

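Accessing a monitoring web service API

The following example shows how to query the service API receivers of the Alertmanager service used in core platform monitoring. You can use a similar method to access the `prometheus-k8s` service for core platform Prometheus and the `thanos-ruler` service for Thanos Ruler.

Prerequisites

- You are logged in to an account that is bound to the `monitoring-alertmanager-edit` role in the `openshift-monitoring` namespace.
- You are logged in to an account that has permission to get the Alertmanager API route.

If your account does not have permission to get the Alertmanager API route, a cluster administrator can provide the URL for the route.

Procedure

Extract an authentication token by running the following command: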
```
$ TOKEN=$(oc whoami -t)
```

Extract the `alertmanager-main` API route URL by running the following command:

$ HOST=$(oc -n openshift-monitoring get route alertmanager-main -o jsonpath='{.status.ingress[].host}')

Query the service API receivers for Alertmanager by running the following command:

$ curl -H "Authorization: Bearer $TOKEN" -k "https://$HOST/api/v2/receivers"
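
The three steps above can be combined into one function that stops early when a value is missing. This is a minimal sketch rather than a supported tool: the function name `query_alertmanager_receivers` is hypothetical, and it assumes that `oc` and `curl` are installed and that you are already logged in to the cluster.

```shell
# Hypothetical wrapper around the token, route, and query steps.
query_alertmanager_receivers() {
  # Bearer token of the current oc session.
  am_token=$(oc whoami -t) || return 1
  # Host name of the alertmanager-main route.
  am_host=$(oc -n openshift-monitoring get route alertmanager-main \
    -o jsonpath='{.status.ingress[].host}') || return 1
  if [ -z "$am_token" ] || [ -z "$am_host" ]; then
    echo "missing token or route host" >&2
    return 1
  fi
  # -k skips TLS certificate verification, as in the procedure above.
  curl -sS -k -H "Authorization: Bearer $am_token" \
    "https://$am_host/api/v2/receivers"
}
```

Checking for an empty token or host before calling `curl` avoids sending a request to a malformed URL when the account lacks permission to read the route.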

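Querying metrics by using the federation endpoint for Prometheus

You can use the federation endpoint for Prometheus to scrape platform and user-defined metrics from a network location outside the cluster. To do so, access the Prometheus `/federate` endpoint for the cluster through a Red Hat OpenShift Service on AWS route.

A delay in retrieving metrics data occurs when you use federation. This delay can affect the accuracy and timeliness of the scraped metrics.

Using the federation endpoint can also degrade the performance and scalability of your cluster, especially if you use it to retrieve large amounts of metrics data. To avoid these issues, follow these recommendations:

- Do not try to retrieve all metrics data through the federation endpoint for Prometheus. Query it only when you want to retrieve a limited, aggregated data set. For example, retrieving fewer than 1,000 samples for each request helps minimize the risk of performance degradation.
- Avoid querying the federation endpoint for Prometheus frequently. Limit queries to a maximum of one every 30 seconds.

If you need to forward large amounts of data outside the cluster, use remote write instead. For more information, see the "Configuring remote write storage" section.

Prerequisites

- You have installed the OpenShift CLI (oc).
- You have access to the cluster as a user with the `cluster-monitoring-view` cluster role, or you have obtained a bearer token with `get` permission on the `namespaces` resource. You can only use bearer token authentication to access the Prometheus federation endpoint.
- You are logged in to an account that has permission to get the Prometheus federation route.

If your account does not have permission to get the Prometheus federation route, a cluster administrator can provide the URL for the route.

Procedure

Retrieve the bearer token by running the following command: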
```
$ TOKEN=$(oc whoami -t)
```

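Get the Prometheus federation route URL by running the following command:

$ HOST=$(oc -n openshift-monitoring get route prometheus-k8s-federate -o jsonpath='{.status.ingress[].host}')

Query metrics from the `/federate` route. The following example command queries the `up` metric: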

$ curl -G -k -H "Authorization: Bearer $TOKEN" https://$HOST/federate --data-urlencode 'match[]=up'

Example output

# TYPE up untyped
up{apiserver="kube-apiserver",endpoint="https",instance="10.0.143.148:6443",job="apiserver",namespace="default",service="kubernetes",prometheus="openshift-monitoring/k8s",prometheus_replica="prometheus-k8s-0"} 1 1657035322214
up{apiserver="kube-apiserver",endpoint="https",instance="10.0.148.166:6443",job="apiserver",namespace="default",service="kubernetes",prometheus="openshift-monitoring/k8s",prometheus_replica="prometheus-k8s-0"} 1 1657035338597
up{apiserver="kube-apiserver",endpoint="https",instance="10.0.173.16:6443",job="apiserver",namespace="default",service="kubernetes",prometheus="openshift-monitoring/k8s",prometheus_replica="prometheus-k8s-0"} 1 1657035343834
...
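
To check a response against the guideline of fewer than 1,000 samples for each request, you can count the sample lines in the output. The following sketch assumes the text format shown in the example output, where lines that start with `#` carry metadata rather than samples. The `count_samples` function name is hypothetical.

```shell
# Hypothetical helper: count sample lines read from standard input,
# skipping blank lines and lines that start with '#'.
count_samples() {
  awk '!/^#/ && NF { n++ } END { print n + 0 }'
}
```

For example, piping the output of the `curl` command for the `/federate` route into `count_samples` prints the number of samples that the request returned.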

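Accessing metrics from outside the cluster for custom applications

When you monitor your own services with user-defined projects, you can query Prometheus metrics from outside the cluster. Access this data from outside the cluster by using the `thanos-querier` route.

This access only supports authentication with a bearer token.

Prerequisites

- You have deployed your own service, following the "Enabling monitoring for user-defined projects" procedure.
- You are logged in to an account with the `cluster-monitoring-view` cluster role, which provides permission to access the Thanos Querier API.
- You are logged in to an account that has permission to get the Thanos Querier API route.

If your account does not have permission to get the Thanos Querier API route, a cluster administrator can provide the URL for the route.

Procedure

Extract an authentication token to connect to Prometheus by running the following command: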
```
$ TOKEN=$(oc whoami -t)
```

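Extract the `thanos-querier` API route URL by running the following command:

$ HOST=$(oc -n openshift-monitoring get route thanos-querier -o jsonpath='{.status.ingress[].host}')

Set the namespace to the namespace in which your service is running by using the following command: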
```
$ NAMESPACE=ns1
```

Query the metrics of your own services from the command line by running the following command:

$ curl -H "Authorization: Bearer $TOKEN" -k "https://$HOST/api/v1/query?" --data-urlencode "query=up{namespace='$NAMESPACE'}"

The output shows the status of each application pod that Prometheus is scraping.

Formatted example output

{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "up",
          "endpoint": "web",
          "instance": "10.129.0.46:8080",
          "job": "prometheus-example-app",
          "namespace": "ns1",
          "pod": "prometheus-example-app-68d47c4fb6-jztp2",
          "service": "prometheus-example-app"
        },
        "value": [
          1591881154.748,
          "1"
        ]
      }
    ]
  }
}

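The formatted example output uses a filtering tool, such as jq, to produce indented JSON. For more information about using jq, see the jq Manual (jq documentation).
The command requests the instant query endpoint of the Thanos Querier service, which evaluates selectors at a single point in time.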
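
Beyond pretty-printing, jq can reduce the response to the fields of interest. The following sketch assumes that jq is installed and that the response has the instant-query shape shown in the example output. The `pod_status` function name is hypothetical.

```shell
# Hypothetical helper: print "<pod> <value>" for each series in an
# instant-query response read from standard input.
pod_status() {
  jq -r '.data.result[] | "\(.metric.pod) \(.value[1])"'
}
```

Piping the output of the query command into `pod_status` prints one line for each scraped pod. For the example output, that line is `prometheus-example-app-68d47c4fb6-jztp2 1`.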

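Resources reference for the Cluster Monitoring Operator

This document describes the following resources that the Cluster Monitoring Operator (CMO) deploys and manages:

- Routes
- Services

Use this information when you want to configure API endpoint connections to retrieve, send, or query metrics data.

In certain situations, accessing API endpoints can degrade the performance and scalability of your cluster, especially if you use the endpoints to retrieve, send, or query large amounts of metrics data.

To avoid these issues, follow these recommendations:

- Avoid querying endpoints frequently. Limit queries to a maximum of one every 30 seconds.
- Do not try to retrieve all metrics data through the `/federate` endpoint for Prometheus. Query it only when you want to retrieve a limited, aggregated data set. For example, retrieving fewer than 1,000 samples for each request helps minimize the risk of performance degradation.

CMO routes resources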

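openshift-monitoring/alertmanager-main

Exposes the `/api` endpoints of the `alertmanager-main` service through a router.

openshift-monitoring/prometheus-k8s

Exposes the `/api` endpoints of the `prometheus-k8s` service through a router.

openshift-monitoring/prometheus-k8s-federate

Exposes the `/federate` endpoint of the `prometheus-k8s` service through a router.

openshift-user-workload-monitoring/federate

Exposes the `/federate` endpoint of the `prometheus-user-workload` service through a router.

openshift-monitoring/thanos-querier

Exposes the `/api` endpoints of the `thanos-querier` service through a router.

openshift-user-workload-monitoring/thanos-ruler

Exposes the `/api` endpoints of the `thanos-ruler` service through a router.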

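CMO services resources

openshift-monitoring/prometheus-operator-admission-webhook

Exposes, on port 8443, the admission webhook service that validates `PrometheusRules` and `AlertmanagerConfig` custom resources.

openshift-user-workload-monitoring/alertmanager-user-workload

Exposes the user-defined Alertmanager web server within the cluster on the following ports:

- Port 9095 provides access to the Alertmanager endpoints. Granting access requires binding a user to the `monitoring-alertmanager-api-reader` role (for read-only operations) or the `monitoring-alertmanager-api-writer` role in the `openshift-user-workload-monitoring` project.
- Port 9092 provides access to the Alertmanager endpoints restricted to a given project. Granting access requires binding a user to the `monitoring-rules-edit` cluster role or the `monitoring-edit` cluster role in the project.
- Port 9097 provides access to the `/metrics` endpoint only. This port is for internal use, and no other usage is guaranteed.

openshift-monitoring/alertmanager-main

Exposes the Alertmanager web server within the cluster on the following ports:

- Port 9094 provides access to all the Alertmanager endpoints. Granting access requires binding a user to the `monitoring-alertmanager-view` role (for read-only operations) or the `monitoring-alertmanager-edit` role in the `openshift-monitoring` project.
- Port 9092 provides access to the Alertmanager endpoints restricted to a given project. Granting access requires binding a user to the `monitoring-rules-edit` cluster role or the `monitoring-edit` cluster role in the project.
- Port 9097 provides access to the `/metrics` endpoint only. This port is for internal use, and no other usage is guaranteed.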

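openshift-monitoring/kube-state-metrics

Exposes the kube-state-metrics `/metrics` endpoints within the cluster on the following ports:

- Port 8443 provides access to the Kubernetes resource metrics. This port is for internal use, and no other usage is guaranteed.
- Port 9443 provides access to the internal kube-state-metrics metrics. This port is for internal use, and no other usage is guaranteed.

openshift-monitoring/metrics-server

Exposes the metrics-server web server on port 443. This port is for internal use, and no other usage is guaranteed.

openshift-monitoring/monitoring-plugin

Exposes the monitoring plugin service on port 9443. This port is for internal use, and no other usage is guaranteed.

openshift-monitoring/node-exporter

Exposes the `/metrics` endpoint on port 9100. This port is for internal use, and no other usage is guaranteed.

openshift-monitoring/openshift-state-metrics

Exposes the openshift-state-metrics `/metrics` endpoints within the cluster on the following ports:

- Port 8443 provides access to the OpenShift resource metrics. This port is for internal use, and no other usage is guaranteed.
- Port 9443 provides access to the internal `openshift-state-metrics` metrics. This port is for internal use, and no other usage is guaranteed.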

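openshift-monitoring/prometheus-k8s

Exposes the Prometheus web server within the cluster on the following ports:

- Port 9091 provides access to all the Prometheus endpoints. Granting access requires binding a user to the `cluster-monitoring-view` cluster role.
- Port 9092 provides access to the `/metrics` and `/federate` endpoints only. This port is for internal use, and no other usage is guaranteed.

openshift-user-workload-monitoring/prometheus-operator

Exposes the `/metrics` endpoint on port 8443. This port is for internal use, and no other usage is guaranteed.

openshift-monitoring/prometheus-operator

Exposes the `/metrics` endpoint on port 8443. This port is for internal use, and no other usage is guaranteed.

openshift-user-workload-monitoring/prometheus-user-workload

Exposes the Prometheus web server within the cluster on the following ports:

- Port 9091 provides access to the `/metrics` endpoint only. This port is for internal use, and no other usage is guaranteed.
- Port 9092 provides access to the `/federate` endpoint only. Granting access requires binding a user to the `cluster-monitoring-view` cluster role.

This service also exposes the `/metrics` endpoint of the Thanos sidecar web server on port 10902. This port is for internal use, and no other usage is guaranteed.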

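openshift-monitoring/telemeter-client

Exposes the `/metrics` endpoint on port 8443. This port is for internal use, and no other usage is guaranteed.

openshift-monitoring/thanos-querier

Exposes the Thanos Querier web server within the cluster on the following ports:

- Port 9091 provides access to all the Thanos Querier endpoints. Granting access requires binding a user to the `cluster-monitoring-view` cluster role.
- Port 9092 provides access to the `/api/v1/query`, `/api/v1/query_range/`, `/api/v1/labels`, `/api/v1/label/*/values`, and `/api/v1/series` endpoints restricted to a given project. Granting access requires binding a user to the `view` cluster role in the project.
- Port 9093 provides access to the `/api/v1/alerts` and `/api/v1/rules` endpoints restricted to a given project. Granting access requires binding a user to the `monitoring-rules-edit` cluster role, the `monitoring-edit` cluster role, or the `monitoring-rules-view` cluster role in the project.
- Port 9094 provides access to the `/metrics` endpoint only. This port is for internal use, and no other usage is guaranteed.

openshift-user-workload-monitoring/thanos-ruler

Exposes the Thanos Ruler web server within the cluster on the following ports:

- Port 9091 provides access to all the Thanos Ruler endpoints. Granting access requires binding a user to the `cluster-monitoring-view` cluster role.
- Port 9092 provides access to the `/metrics` endpoint only. This port is for internal use, and no other usage is guaranteed.

This service also exposes the gRPC endpoints on port 10901. This port is for internal use, and no other usage is guaranteed.

openshift-monitoring/cluster-monitoring-operator

Exposes the `/metrics` endpoint on port 8443. This port is for internal use, and no other usage is guaranteed.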
