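Analyzing cluster resource levels - Working with clusters | Nodes | OpenShift Container Platform 4.17

Understanding the OpenShift Cluster Capacity Tool
Running the OpenShift Cluster Capacity Tool on the command line
Running the OpenShift Cluster Capacity Tool as a job inside a pod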

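As a cluster administrator, you can use the OpenShift Cluster Capacity Tool to view the number of pods that can be scheduled, so that you can increase the current resources before they are exhausted and ensure that any future pods can be scheduled. This capacity comes from the individual node hosts in a cluster and includes CPU, memory, disk space, and other resources.

Understanding the OpenShift Cluster Capacity Tool

The OpenShift Cluster Capacity Tool simulates a sequence of scheduling decisions to determine how many instances of an input pod can be scheduled on the cluster before its resources are exhausted. This simulation gives a more accurate estimate than a simple calculation of remaining resources.

The remaining allocatable capacity is only a rough estimate, because it does not count all of the resources distributed among the nodes. The tool analyzes only the remaining resources and expresses the capacity that is still available as the number of instances of a pod with given requirements that can be scheduled in the cluster.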
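The idea of simulating scheduling decisions until resources run out can be sketched as follows. This is a simplified illustration, not the tool's actual algorithm: the `Node` class, the `estimate_capacity` function, the node names, and all resource figures are hypothetical, and choosing the node with the most free CPU is only a stand-in for real scheduler scoring.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu_m: int   # remaining allocatable CPU, in millicores
    free_mem_mi: int  # remaining allocatable memory, in MiB


def estimate_capacity(nodes, pod_cpu_m, pod_mem_mi):
    """Place pod instances one at a time until no node can fit another.

    Returns a dict that maps each node name to the number of instances placed.
    """
    if pod_cpu_m <= 0 and pod_mem_mi <= 0:
        raise ValueError("the pod must request some CPU or memory")
    placed = {node.name: 0 for node in nodes}
    while True:
        # Candidate nodes are those with enough CPU and memory left.
        fits = [n for n in nodes
                if n.free_cpu_m >= pod_cpu_m and n.free_mem_mi >= pod_mem_mi]
        if not fits:
            return placed
        # Pick the node with the most free CPU (a stand-in for scheduler scoring).
        target = max(fits, key=lambda n: n.free_cpu_m)
        target.free_cpu_m -= pod_cpu_m
        target.free_mem_mi -= pod_mem_mi
        placed[target.name] += 1


if __name__ == "__main__":
    # Hypothetical nodes; the numbers are illustrative only.
    nodes = [Node("node-a", 1500, 4096), Node("node-b", 1000, 512)]
    result = estimate_capacity(nodes, pod_cpu_m=150, pod_mem_mi=100)
    print(result, "total:", sum(result.values()))
```

In this sketch, `node-a` is limited by CPU and `node-b` by memory, which is why a single cluster-wide sum of free resources would overstate the capacity.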

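In addition, a pod might be schedulable only on a particular set of nodes, depending on its selection and affinity criteria. As a result, estimating which remaining pods a cluster can schedule can be difficult.

You can run the OpenShift Cluster Capacity Tool as a stand-alone utility from the command line, or as a job in a pod inside an OpenShift Container Platform cluster. Running the tool as a job inside a pod lets you run it multiple times without intervention.

Running the OpenShift Cluster Capacity Tool on the command line

You can run the OpenShift Cluster Capacity Tool from the command line to estimate the number of pods that can be scheduled onto your cluster.

You create a sample pod spec file, which the tool uses to estimate resource usage. The pod spec specifies its resource requirements as `limits` or `requests`. The cluster capacity tool takes the pod's resource requirements into account for its estimation analysis.

Prerequisites

Run the OpenShift Cluster Capacity Tool, which is available as a container image from the Red Hat Ecosystem Catalog.

Create a sample pod spec file:

Create a YAML file similar to the following: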

```
apiVersion: v1
kind: Pod
metadata:
  name: small-pod
  labels:
    app: guestbook
    tier: frontend
spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: php-redis
    image: gcr.io/google-samples/gb-frontend:v4
    imagePullPolicy: Always
    resources:
      limits:
        cpu: 150m
        memory: 100Mi
      requests:
        cpu: 150m
        memory: 100Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: [ALL]
```
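The resource values in the spec use Kubernetes quantity notation: `150m` is 150 millicores (0.15 of a CPU) and `100Mi` is 100 mebibytes. The following sketch converts such strings to plain numbers. The helper names are hypothetical, and the parser deliberately handles only a subset of the quantity syntax (the `m` suffix for CPU and the binary suffixes `Ki`, `Mi`, `Gi`, `Ti` for memory).

```python
# Binary suffixes used for memory quantities, mapped to their byte multipliers.
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "150m" or "2" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # A bare number is a count of whole (or fractional) CPUs.
    return round(float(quantity) * 1000)


def parse_memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "100Mi" to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    # No recognized suffix: treat the value as plain bytes.
    return int(quantity)


if __name__ == "__main__":
    print(parse_cpu_millicores("150m"))  # 150
    print(parse_memory_bytes("100Mi"))   # 104857600
```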

Create the pod from the spec file:

```
$ oc create -f <file_name>.yaml
```

For example:

```
$ oc create -f pod-spec.yaml
```

Procedure

To use the cluster capacity tool on the command line:

From the terminal, log in to the Red Hat Registry:
```
$ podman login registry.redhat.io
```

Pull the cluster capacity tool image:

```
$ podman pull registry.redhat.io/openshift4/ose-cluster-capacity
```

Run the cluster capacity tool:

```
$ podman run -v $HOME/.kube:/kube:Z -v $(pwd):/cc:Z ose-cluster-capacity \
/bin/cluster-capacity --kubeconfig /kube/config --podspec /cc/<pod_spec>.yaml \
--verbose
```

Where:

<pod_spec>.yaml: Specifies the pod spec to use.
--verbose: Outputs a detailed description of how many pods can be scheduled on each node in the cluster.

Example output

```
small-pod pod requirements:
	- CPU: 150m
	- Memory: 100Mi

The cluster can schedule 88 instance(s) of the pod small-pod.

Termination reason: Unschedulable: 0/5 nodes are available: 2 Insufficient cpu,
3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't
tolerate.

Pod distribution among nodes:
small-pod
	- 192.168.124.214: 45 instance(s)
	- 192.168.124.120: 43 instance(s)
```

In the preceding example, the estimated number of pods that can be scheduled onto the cluster is 88.
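If you want to consume this report in a script, the total and the per-node distribution can be extracted with regular expressions. The sketch below assumes the report keeps the layout shown in the example output (abridged here); the function name `parse_capacity_report` is hypothetical, and the patterns might need adjusting if the tool's output format differs between versions.

```python
import re

# Abridged copy of the example report; the termination reason is omitted.
SAMPLE_OUTPUT = """\
small-pod pod requirements:
\t- CPU: 150m
\t- Memory: 100Mi

The cluster can schedule 88 instance(s) of the pod small-pod.

Pod distribution among nodes:
small-pod
\t- 192.168.124.214: 45 instance(s)
\t- 192.168.124.120: 43 instance(s)
"""


def parse_capacity_report(text):
    """Return (pod_name, total_instances, {node: instances}) from a report."""
    total = re.search(
        r"can schedule (\d+) instance\(s\) of the pod (\S+)\.", text)
    if total is None:
        raise ValueError("no scheduling summary found in report")
    # Per-node lines look like "- <node>: <count> instance(s)".
    per_node = {
        node: int(count)
        for node, count in re.findall(
            r"^[ \t]*- (\S+): (\d+) instance\(s\)[ \t]*$", text, re.MULTILINE)
    }
    return total.group(2), int(total.group(1)), per_node


if __name__ == "__main__":
    pod, total, per_node = parse_capacity_report(SAMPLE_OUTPUT)
    print(pod, total, per_node)
```

For the example report, the per-node counts (45 and 43) add up to the reported total of 88.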

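Running the OpenShift Cluster Capacity Tool as a job inside a pod

Running the OpenShift Cluster Capacity Tool as a job inside a pod lets you run the tool multiple times without user intervention. You run the tool as a job by using a `ConfigMap` object.

Prerequisites

Download and install the OpenShift Cluster Capacity Tool.

Procedure

To run the cluster capacity tool:

Create the cluster role:

Create a YAML file similar to the following: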

```
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: cluster-capacity-role
rules:
- apiGroups: [""]
  resources: ["pods", "nodes", "persistentvolumeclaims", "persistentvolumes", "services", "replicationcontrollers"]
  verbs: ["get", "watch", "list"]
- apiGroups: ["apps"]
  resources: ["replicasets", "statefulsets"]
  verbs: ["get", "watch", "list"]
- apiGroups: ["policy"]
  resources: ["poddisruptionbudgets"]
  verbs: ["get", "watch", "list"]
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: ["get", "watch", "list"]
```

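Create the cluster role by running the following command:

```
$ oc create -f <file_name>.yaml
```

For example, assuming the file is saved as cluster-capacity-role.yaml (an illustrative name):

```
$ oc create -f cluster-capacity-role.yaml
```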

Create the service account:

```
$ oc create sa cluster-capacity-sa -n default
```

Add the role to the service account:

```
$ oc adm policy add-cluster-role-to-user cluster-capacity-role \
    system:serviceaccount:<namespace>:cluster-capacity-sa
```

Where:

<namespace>: Specifies the namespace where the pod is located.
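For readers who prefer a declarative manifest, granting the role amounts to binding the cluster role to the service account. The sketch below builds what is assumed to be a roughly equivalent `ClusterRoleBinding` and prints it as JSON, which `oc create -f` accepts. The binding name `cluster-capacity-binding` and the function name are hypothetical; the role and service account names come from the steps above.

```python
import json


def build_cluster_role_binding(namespace, binding_name="cluster-capacity-binding"):
    """Build a ClusterRoleBinding manifest for the capacity service account.

    binding_name is a hypothetical name chosen for this illustration.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": binding_name},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": "cluster-capacity-role",
        },
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": "cluster-capacity-sa",
                "namespace": namespace,
            }
        ],
    }


if __name__ == "__main__":
    # "default" matches the namespace used when creating the service account.
    print(json.dumps(build_cluster_role_binding("default"), indent=2))
```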

Define and create the pod spec:

Create a YAML file similar to the following:

```
apiVersion: v1
kind: Pod
metadata:
  name: small-pod
  labels:
    app: guestbook
    tier: frontend
spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: php-redis
    image: gcr.io/google-samples/gb-frontend:v4
    imagePullPolicy: Always
    resources:
      limits:
        cpu: 150m
        memory: 100Mi
      requests:
        cpu: 150m
        memory: 100Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: [ALL]
```

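Create the pod by running the following command:

```
$ oc create -f <file_name>.yaml
```

For example:

```
$ oc create -f pod.yaml
```

Create the ConfigMap object by running the following command: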
```
$ oc create configmap cluster-capacity-configmap \
    --from-file=pod.yaml=pod.yaml
```
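The cluster capacity analysis uses a ConfigMap object named `cluster-capacity-configmap` to mount the input pod spec file `pod.yaml` into the volume `test-volume` at the path `/test-pod`.

Create the job by using the following example job specification file:

Create a YAML file similar to the following: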

```
apiVersion: batch/v1
kind: Job
metadata:
  name: cluster-capacity-job
spec:
  parallelism: 1
  completions: 1
  template:
    metadata:
      name: cluster-capacity-pod
    spec:
        containers:
        - name: cluster-capacity
          image: openshift/origin-cluster-capacity
          imagePullPolicy: "Always"
          volumeMounts:
          - mountPath: /test-pod
            name: test-volume
          env:
          - name: CC_INCLUSTER # (1)
            value: "true"
          command:
          - "/bin/sh"
          - "-ec"
          - |
            /bin/cluster-capacity --podspec=/test-pod/pod.yaml --verbose
        restartPolicy: "Never"
        serviceAccountName: cluster-capacity-sa
        volumes:
        - name: test-volume
          configMap:
            name: cluster-capacity-configmap
```

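1	A required environment variable that lets the cluster capacity tool know that it is running inside a cluster as a pod.

The `pod.yaml` key of the `ConfigMap` object is the same as the `Pod` spec file name, although this is not required. With this naming, the input pod spec file can be accessed inside the pod at the path `/test-pod/pod.yaml`.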
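The relationship between ConfigMap keys and file paths can be shown with a small sketch: each key of a ConfigMap mounted as a volume appears as a file with that name under the mount path. The function name `mounted_files` is hypothetical; the mount path and key are taken from the job specification.

```python
import posixpath


def mounted_files(mount_path, configmap_keys):
    """Map each ConfigMap key to the file path it has inside the container."""
    return {key: posixpath.join(mount_path, key) for key in configmap_keys}


if __name__ == "__main__":
    paths = mounted_files("/test-pod", ["pod.yaml"])
    print(paths["pod.yaml"])  # /test-pod/pod.yaml
```

If you chose a different key in `--from-file=<key>=pod.yaml`, the `--podspec` argument in the job command would need to point at `/test-pod/<key>` instead.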

Run the cluster capacity image as a job in a pod by running the following command:
```
$ oc create -f cluster-capacity-job.yaml
```

Verification

Check the job logs to find the number of pods that can be scheduled in the cluster:

```
$ oc logs jobs/cluster-capacity-job
```

Example output

```
small-pod pod requirements:
        - CPU: 150m
        - Memory: 100Mi

The cluster can schedule 52 instance(s) of the pod small-pod.

Termination reason: Unschedulable: No nodes are available that match all of the
following predicates:: Insufficient cpu (2).

Pod distribution among nodes:
small-pod
        - 192.168.124.214: 26 instance(s)
        - 192.168.124.120: 26 instance(s)
```
