If you need to troubleshoot MetalLB configuration problems, see the following sections for commonly used commands.

MetalLB uses FRRouting (FRR) in a container and, with the default setting of `info`, generates a lot of logging. You can control the verbosity of the generated logs by setting the `logLevel` field, as illustrated in the following example.

Gain a deeper insight into MetalLB by setting `logLevel` to `debug` as follows:
You have access to the cluster as a user with the `cluster-admin` role.
You have installed the OpenShift CLI (`oc`).
Create a file, such as `setdebugloglevel.yaml`, that contains content like the following example:
apiVersion: metallb.io/v1beta1
kind: MetalLB
metadata:
name: metallb
namespace: metallb-system
spec:
logLevel: debug
nodeSelector:
node-role.kubernetes.io/worker: ""
Apply the configuration:
$ oc replace -f setdebugloglevel.yaml
Note: Use `oc replace` because the `metallb` custom resource is already created and this command changes only the log level.
Display the names of the `speaker` pods:
$ oc get -n metallb-system pods -l component=speaker
NAME READY STATUS RESTARTS AGE
speaker-2m9pm 4/4 Running 0 9m19s
speaker-7m4qw 3/4 Running 0 19s
speaker-szlmx 4/4 Running 0 9m19s
The `speaker` and `controller` pods are recreated to ensure the updated logging level is applied. The logging level is modified for all the components of MetalLB.
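To confirm that the change has been rolled out, you can check the `logLevel` field on the custom resource and list the recreated pods. A minimal sketch, assuming the MetalLB resource is named `metallb` and the controller pod carries a `component=controller` label:

$ oc get metallb metallb -n metallb-system -o jsonpath='{.spec.logLevel}'

$ oc get -n metallb-system pods -l component=controller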
Display the `speaker` logs:
$ oc logs -n metallb-system speaker-7m4qw -c speaker
{"branch":"main","caller":"main.go:92","commit":"3d052535","goversion":"gc / go1.17.1 / amd64","level":"info","msg":"MetalLB speaker starting (commit 3d052535, branch main)","ts":"2022-05-17T09:55:05Z","version":""} {"caller":"announcer.go:110","event":"createARPResponder","interface":"ens4","level":"info","msg":"created ARP responder for interface","ts":"2022-05-17T09:55:05Z"} {"caller":"announcer.go:119","event":"createNDPResponder","interface":"ens4","level":"info","msg":"created NDP responder for interface","ts":"2022-05-17T09:55:05Z"} {"caller":"announcer.go:110","event":"createARPResponder","interface":"tun0","level":"info","msg":"created ARP responder for interface","ts":"2022-05-17T09:55:05Z"} {"caller":"announcer.go:119","event":"createNDPResponder","interface":"tun0","level":"info","msg":"created NDP responder for interface","ts":"2022-05-17T09:55:05Z"} I0517 09:55:06.515686 95 request.go:665] Waited for 1.026500832s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/operators.coreos.com/v1alpha1?timeout=32s {"Starting Manager":"(MISSING)","caller":"k8s.go:389","level":"info","ts":"2022-05-17T09:55:08Z"} {"caller":"speakerlist.go:310","level":"info","msg":"node event - forcing sync","node addr":"10.0.128.4","node event":"NodeJoin","node name":"ci-ln-qb8t3mb-72292-7s7rh-worker-a-vvznj","ts":"2022-05-17T09:55:08Z"} {"caller":"service_controller.go:113","controller":"ServiceReconciler","enqueueing":"openshift-kube-controller-manager-operator/metrics","epslice":"{\"metadata\":{\"name\":\"metrics-xtsxr\",\"generateName\":\"metrics-\",\"namespace\":\"openshift-kube-controller-manager-operator\",\"uid\":\"ac6766d7-8504-492c-9d1e-4ae8897990ad\",\"resourceVersion\":\"9041\",\"generation\":4,\"creationTimestamp\":\"2022-05-17T07:16:53Z\",\"labels\":{\"app\":\"kube-controller-manager-operator\",\"endpointslice.kubernetes.io/managed-by\":\"endpointslice-controller.k8s.io\",\"kubernetes.io/service-name\":\"metrics\"},\"annotations\":{\"endpoints.kubernetes.io/last-change-trigger-time\":\"2022-05-17T07:21:34Z\"},\"ownerReferences\":[{\"apiVersion\":\"v1\",\"kind\":\"Service\",\"name\":\"metrics\",\"uid\":\"0518eed3-6152-42be-b566-0bd00a60faf8\",\"controller\":true,\"blockOwnerDeletion\":true}],\"managedFields\":[{\"manager\":\"kube-controller-manager\",\"operation\":\"Update\",\"apiVersion\":\"discovery.k8s.io/v1\",\"time\":\"2022-05-17T07:20:02Z\",\"fieldsType\":\"FieldsV1\",\"fieldsV1\":{\"f:addressType\":{},\"f:endpoints\":{},\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:endpoints.kubernetes.io/last-change-trigger-time\":{}},\"f:generateName\":{},\"f:labels\":{\".\":{},\"f:app\":{},\"f:endpointslice.kubernetes.io/managed-by\":{},\"f:kubernetes.io/service-name\":{}},\"f:ownerReferences\":{\".\":{},\"k:{\\\"uid\\\":\\\"0518eed3-6152-42be-b566-0bd00a60faf8\\\"}\":{}}},\"f:ports\":{}}}]},\"addressType\":\"IPv4\",\"endpoints\":[{\"addresses\":[\"10.129.0.7\"],\"conditions\":{\"ready\":true,\"serving\":true,\"terminating\":false},\"targetRef\":{\"kind\":\"Pod\",\"namespace\":\"openshift-kube-controller-manager-operator\",\"name\":\"kube-controller-manager-operator-6b98b89ddd-8d4nf\",\"uid\":\"dd5139b8-e41c-4946-a31b-1a629314e844\",\"resourceVersion\":\"9038\"},\"nodeName\":\"ci-ln-qb8t3mb-72292-7s7rh-master-0\",\"zone\":\"us-central1-a\"}],\"ports\":[{\"name\":\"https\",\"protocol\":\"TCP\",\"port\":8443}]}","level":"debug","ts":"2022-05-17T09:55:08Z"}
Display the FRR logs:
$ oc logs -n metallb-system speaker-7m4qw -c frr
Started watchfrr
2022/05/17 09:55:05 ZEBRA: client 16 says hello and bids fair to announce only bgp routes vrf=0
2022/05/17 09:55:05 ZEBRA: client 31 says hello and bids fair to announce only vnc routes vrf=0
2022/05/17 09:55:05 ZEBRA: client 38 says hello and bids fair to announce only static routes vrf=0
2022/05/17 09:55:05 ZEBRA: client 43 says hello and bids fair to announce only bfd routes vrf=0
2022/05/17 09:57:25.089 BGP: Creating Default VRF, AS 64500
2022/05/17 09:57:25.090 BGP: dup addr detect enable max_moves 5 time 180 freeze disable freeze_time 0
2022/05/17 09:57:25.090 BGP: bgp_get: Registering BGP instance (null) to zebra
2022/05/17 09:57:25.090 BGP: Registering VRF 0
2022/05/17 09:57:25.091 BGP: Rx Router Id update VRF 0 Id 10.131.0.1/32
2022/05/17 09:57:25.091 BGP: RID change : vrf VRF default(0), RTR ID 10.131.0.1
2022/05/17 09:57:25.091 BGP: Rx Intf add VRF 0 IF br0
2022/05/17 09:57:25.091 BGP: Rx Intf add VRF 0 IF ens4
2022/05/17 09:57:25.091 BGP: Rx Intf address add VRF 0 IF ens4 addr 10.0.128.4/32
2022/05/17 09:57:25.091 BGP: Rx Intf address add VRF 0 IF ens4 addr fe80::c9d:84da:4d86:5618/64
2022/05/17 09:57:25.091 BGP: Rx Intf add VRF 0 IF lo
2022/05/17 09:57:25.091 BGP: Rx Intf add VRF 0 IF ovs-system
2022/05/17 09:57:25.091 BGP: Rx Intf add VRF 0 IF tun0
2022/05/17 09:57:25.091 BGP: Rx Intf address add VRF 0 IF tun0 addr 10.131.0.1/23
2022/05/17 09:57:25.091 BGP: Rx Intf address add VRF 0 IF tun0 addr fe80::40f1:d1ff:feb6:5322/64
2022/05/17 09:57:25.091 BGP: Rx Intf add VRF 0 IF veth2da49fed
2022/05/17 09:57:25.091 BGP: Rx Intf address add VRF 0 IF veth2da49fed addr fe80::24bd:d1ff:fec1:d88/64
2022/05/17 09:57:25.091 BGP: Rx Intf add VRF 0 IF veth2fa08c8c
2022/05/17 09:57:25.091 BGP: Rx Intf address add VRF 0 IF veth2fa08c8c addr fe80::6870:ff:fe96:efc8/64
2022/05/17 09:57:25.091 BGP: Rx Intf add VRF 0 IF veth41e356b7
2022/05/17 09:57:25.091 BGP: Rx Intf address add VRF 0 IF veth41e356b7 addr fe80::48ff:37ff:fede:eb4b/64
2022/05/17 09:57:25.092 BGP: Rx Intf add VRF 0 IF veth1295c6e2
2022/05/17 09:57:25.092 BGP: Rx Intf address add VRF 0 IF veth1295c6e2 addr fe80::b827:a2ff:feed:637/64
2022/05/17 09:57:25.092 BGP: Rx Intf add VRF 0 IF veth9733c6dc
2022/05/17 09:57:25.092 BGP: Rx Intf address add VRF 0 IF veth9733c6dc addr fe80::3cf4:15ff:fe11:e541/64
2022/05/17 09:57:25.092 BGP: Rx Intf add VRF 0 IF veth336680ea
2022/05/17 09:57:25.092 BGP: Rx Intf address add VRF 0 IF veth336680ea addr fe80::94b1:8bff:fe7e:488c/64
2022/05/17 09:57:25.092 BGP: Rx Intf add VRF 0 IF vetha0a907b7
2022/05/17 09:57:25.092 BGP: Rx Intf address add VRF 0 IF vetha0a907b7 addr fe80::3855:a6ff:fe73:46c3/64
2022/05/17 09:57:25.092 BGP: Rx Intf add VRF 0 IF vethf35a4398
2022/05/17 09:57:25.092 BGP: Rx Intf address add VRF 0 IF vethf35a4398 addr fe80::40ef:2fff:fe57:4c4d/64
2022/05/17 09:57:25.092 BGP: Rx Intf add VRF 0 IF vethf831b7f4
2022/05/17 09:57:25.092 BGP: Rx Intf address add VRF 0 IF vethf831b7f4 addr fe80::f0d9:89ff:fe7c:1d32/64
2022/05/17 09:57:25.092 BGP: Rx Intf add VRF 0 IF vxlan_sys_4789
2022/05/17 09:57:25.092 BGP: Rx Intf address add VRF 0 IF vxlan_sys_4789 addr fe80::80c1:82ff:fe4b:f078/64
2022/05/17 09:57:26.094 BGP: 10.0.0.1 [FSM] Timer (start timer expire).
2022/05/17 09:57:26.094 BGP: 10.0.0.1 [FSM] BGP_Start (Idle->Connect), fd -1
2022/05/17 09:57:26.094 BGP: Allocated bnc 10.0.0.1/32(0)(VRF default) peer 0x7f807f7631a0
2022/05/17 09:57:26.094 BGP: sendmsg_zebra_rnh: sending cmd ZEBRA_NEXTHOP_REGISTER for 10.0.0.1/32 (vrf VRF default)
2022/05/17 09:57:26.094 BGP: 10.0.0.1 [FSM] Waiting for NHT
2022/05/17 09:57:26.094 BGP: bgp_fsm_change_status : vrf default(0), Status: Connect established_peers 0
2022/05/17 09:57:26.094 BGP: 10.0.0.1 went from Idle to Connect
2022/05/17 09:57:26.094 BGP: 10.0.0.1 [FSM] TCP_connection_open_failed (Connect->Active), fd -1
2022/05/17 09:57:26.094 BGP: bgp_fsm_change_status : vrf default(0), Status: Active established_peers 0
2022/05/17 09:57:26.094 BGP: 10.0.0.1 went from Connect to Active
2022/05/17 09:57:26.094 ZEBRA: rnh_register msg from client bgp: hdr->length=8, type=nexthop vrf=0
2022/05/17 09:57:26.094 ZEBRA: 0: Add RNH 10.0.0.1/32 type Nexthop
2022/05/17 09:57:26.094 ZEBRA: 0:10.0.0.1/32: Evaluate RNH, type Nexthop (force)
2022/05/17 09:57:26.094 ZEBRA: 0:10.0.0.1/32: NH has become unresolved
2022/05/17 09:57:26.094 ZEBRA: 0: Client bgp registers for RNH 10.0.0.1/32 type Nexthop
2022/05/17 09:57:26.094 BGP: VRF default(0): Rcvd NH update 10.0.0.1/32(0) - metric 0/0 #nhops 0/0 flags 0x6
2022/05/17 09:57:26.094 BGP: NH update for 10.0.0.1/32(0)(VRF default) - flags 0x6 chgflags 0x0 - evaluate paths
2022/05/17 09:57:26.094 BGP: evaluate_paths: Updating peer (10.0.0.1(VRF default)) status with NHT
2022/05/17 09:57:30.081 ZEBRA: Event driven route-map update triggered
2022/05/17 09:57:30.081 ZEBRA: Event handler for route-map: 10.0.0.1-out
2022/05/17 09:57:30.081 ZEBRA: Event handler for route-map: 10.0.0.1-in
2022/05/17 09:57:31.104 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWNEIGH(28), len=76, seq=0, pid=0
2022/05/17 09:57:31.104 ZEBRA: Neighbor Entry received is not on a VLAN or a BRIDGE, ignoring
2022/05/17 09:57:31.105 ZEBRA: netlink_parse_info: netlink-listen (NS 0) type RTM_NEWNEIGH(28), len=76, seq=0, pid=0
2022/05/17 09:57:31.105 ZEBRA: Neighbor Entry received is not on a VLAN or a BRIDGE, ignoring
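To watch FRR activity while you reproduce a problem, you can stream the container logs instead of fetching a snapshot, reusing the pod name from the previous step:

$ oc logs -f -n metallb-system speaker-7m4qw -c frr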
The implementation of BGP that Red Hat supports uses FRRouting (FRR) in a container in the `speaker` pods. As a cluster administrator, if you need to troubleshoot BGP configuration problems, you need to run commands in the FRR container.
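In practice, you run the FRR commands through the `vtysh` shell in the `frr` container of a `speaker` pod. For example, the following sketch opens an interactive `vtysh` session; the pod name is illustrative:

$ oc exec -it -n metallb-system speaker-66bth -c frr -- vtysh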
You have access to the cluster as a user with the `cluster-admin` role.
You have installed the OpenShift CLI (`oc`).
Display the names of the `speaker` pods:
$ oc get -n metallb-system pods -l component=speaker
NAME READY STATUS RESTARTS AGE
speaker-66bth 4/4 Running 0 56m
speaker-gvfnf 4/4 Running 0 56m
...
Display the running configuration for FRR:
$ oc exec -n metallb-system speaker-66bth -c frr -- vtysh -c "show running-config"
Building configuration...

Current configuration:
!
frr version 7.5.1_git
frr defaults traditional
hostname some-hostname
log file /etc/frr/frr.log informational
log timestamp precision 3
service integrated-vtysh-config
!
router bgp 64500 (1)
 bgp router-id 10.0.1.2
 no bgp ebgp-requires-policy
 no bgp default ipv4-unicast
 no bgp network import-check
 neighbor 10.0.2.3 remote-as 64500 (2)
 neighbor 10.0.2.3 bfd profile doc-example-bfd-profile-full (3)
 neighbor 10.0.2.3 timers 5 15
 neighbor 10.0.2.4 remote-as 64500
 neighbor 10.0.2.4 bfd profile doc-example-bfd-profile-full
 neighbor 10.0.2.4 timers 5 15
 !
 address-family ipv4 unicast
  network 203.0.113.200/30 (4)
  neighbor 10.0.2.3 activate
  neighbor 10.0.2.3 route-map 10.0.2.3-in in
  neighbor 10.0.2.4 activate
  neighbor 10.0.2.4 route-map 10.0.2.4-in in
 exit-address-family
 !
 address-family ipv6 unicast
  network fc00:f853:ccd:e799::/124
  neighbor 10.0.2.3 activate
  neighbor 10.0.2.3 route-map 10.0.2.3-in in
  neighbor 10.0.2.4 activate
  neighbor 10.0.2.4 route-map 10.0.2.4-in in
 exit-address-family
!
route-map 10.0.2.3-in deny 20
!
route-map 10.0.2.4-in deny 20
!
ip nht resolve-via-default
!
ipv6 nht resolve-via-default
!
line vty
!
bfd
 profile doc-example-bfd-profile-full
  transmit-interval 35
  receive-interval 35
  passive-mode
  echo-mode
  echo-interval 35
  minimum-ttl 10
 !
!
end
(1) The `router bgp` section indicates the ASN for MetalLB.
(2) Confirm that a `neighbor <ip-address> remote-as <peer-ASN>` line exists for each BGP peer custom resource that you added.
(3) If you configured BFD, confirm that the BFD profile is associated with the correct BGP peer and that the BFD profile appears in the command output.
(4) Confirm that the `network <ip-address-range>` lines match the IP address ranges that you specified in the address pool custom resources that you added. You can cross-check these values against the custom resources with the sketch that follows these notes.
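To cross-check the FRR configuration against the MetalLB custom resources it is generated from, you can list the peer and address pool resources. This is a brief sketch; depending on your MetalLB Operator version, the address pool resource is named `AddressPool` or `IPAddressPool`:

$ oc get -n metallb-system bgppeers

$ oc get -n metallb-system ipaddresspools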
Display the BGP summary:
$ oc exec -n metallb-system speaker-66bth -c frr -- vtysh -c "show bgp summary"
IPv4 Unicast Summary:
BGP router identifier 10.0.1.2, local AS number 64500 vrf-id 0
BGP table version 1
RIB entries 1, using 192 bytes of memory
Peers 2, using 29 KiB of memory

Neighbor   V   AS     MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd  PfxSnt
10.0.2.3   4   64500  387      389      0       0    0     00:32:02  0             1 (1)
10.0.2.4   4   64500  0        0        0       0    0     never     Active        0 (2)

Total number of neighbors 2

IPv6 Unicast Summary:
BGP router identifier 10.0.1.2, local AS number 64500 vrf-id 0
BGP table version 1
RIB entries 1, using 192 bytes of memory
Peers 2, using 29 KiB of memory

Neighbor   V   AS     MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd  PfxSnt
10.0.2.3   4   64500  387      389      0       0    0     00:32:02  NoNeg
10.0.2.4   4   64500  0        0        0       0    0     never     Active        0

Total number of neighbors 2
(1) Confirm that the output includes a row for each BGP peer custom resource that you added.
(2) Output that shows 0 messages received and 0 messages sent indicates a BGP peer that does not have a BGP session. Check network connectivity and the BGP configuration of the BGP peer, for example with the neighbor query shown after these notes.
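For example, to investigate the peer at `10.0.2.4` that has not established a session, you can display the detailed neighbor information:

$ oc exec -n metallb-system speaker-66bth -c frr -- vtysh -c "show bgp neighbor 10.0.2.4"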
Display the BGP peers that received an address pool:
$ oc exec -n metallb-system speaker-66bth -c frr -- vtysh -c "show bgp ipv4 unicast 203.0.113.200/30"
Replace `ipv4` with `ipv6` to display the BGP peers that received an IPv6 address pool. Replace `203.0.113.200/30` with an IPv4 or IPv6 IP address range from an address pool.
BGP routing table entry for 203.0.113.200/30
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  10.0.2.3 (1)
  Local
    0.0.0.0 from 0.0.0.0 (10.0.1.2)
      Origin IGP, metric 0, weight 32768, valid, sourced, local, best (First path received)
      Last update: Mon Jan 10 19:49:07 2022
(1) Confirm that the output includes an IP address for a BGP peer.
The implementation of Bidirectional Forwarding Detection (BFD) that Red Hat supports uses FRRouting (FRR) in a container in the `speaker` pods. The BFD implementation relies on BFD peers that are also configured as BGP peers with an established BGP session. As a cluster administrator, if you need to troubleshoot BFD configuration problems, you need to run commands in the FRR container.
You have access to the cluster as a user with the `cluster-admin` role.
You have installed the OpenShift CLI (`oc`).
Display the names of the `speaker` pods:
$ oc get -n metallb-system pods -l component=speaker
NAME READY STATUS RESTARTS AGE
speaker-66bth 4/4 Running 0 26m
speaker-gvfnf 4/4 Running 0 26m
...
Display the BFD peers:
$ oc exec -n metallb-system speaker-66bth -c frr -- vtysh -c "show bfd peers brief"
Session count: 2
SessionId   LocalAddress  PeerAddress  Status
=========   ============  ===========  ======
3909139637  10.0.1.2      10.0.2.3     up (1)
(1) Confirm that the `PeerAddress` column includes each BFD peer. If the output does not list a BFD peer IP address that you expected, troubleshoot BGP connectivity with that peer. If the `Status` field indicates `down`, check for connectivity on the links and equipment between the node and the peer. You can determine the node name for the `speaker` pod with a command such as `oc get pods -n metallb-system speaker-66bth -o jsonpath='{.spec.nodeName}'`. For a more detailed view of a session, see the sketch after this note.
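For more detail about a session, such as negotiated timers and packet counters, you can display the full BFD peer information, reusing the pod name from the previous output:

$ oc exec -n metallb-system speaker-66bth -c frr -- vtysh -c "show bfd peers"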
OpenShift Container Platform collects the following Prometheus metrics for MetalLB that are related to BGP peers and BFD profiles.
MetalLB metrics for BFD:

| Name | Description |
| --- | --- |
| | Counts the number of BFD control packets received from each BFD peer. |
| | Counts the number of BFD control packets sent to each BFD peer. |
| | Counts the number of BFD echo packets received from each BFD peer. |
| | Counts the number of BFD echo packets sent to each BFD peer. |
| | Counts the number of times the BFD session with a peer entered the `down` state. |
| | Indicates the connection state with a BFD peer. `1` indicates the session is `up` and `0` indicates the session is `down`. |
| | Counts the number of times the BFD session with a peer entered the `up` state. |
| | Counts the number of BFD Zebra notifications for each BFD peer. |
MetalLB metrics for BGP:

| Name | Description |
| --- | --- |
| | Counts the number of load balancer IP address prefixes that are advertised to BGP peers. The terms "prefix" and "aggregated route" have the same meaning. |
| | Indicates the connection state with a BGP peer. `1` indicates the session is `up` and `0` indicates the session is `down`. |
| | Counts the number of BGP update messages that were sent to each BGP peer. |
| | Counts the number of BGP open messages that were sent to each BGP peer. |
| | Counts the number of BGP open messages that were received from each BGP peer. |
| | Counts the number of BGP notification messages that were sent to each BGP peer. |
| | Counts the number of BGP update messages that were received from each BGP peer. |
| | Counts the number of BGP keepalive messages that were sent to each BGP peer. |
| | Counts the number of BGP keepalive messages that were received from each BGP peer. |
| | Counts the number of BGP route refresh messages that were sent to each BGP peer. |
| | Counts the number of total BGP messages that were sent to each BGP peer. |
| | Counts the number of total BGP messages that were received from each BGP peer. |
For information about using the monitoring dashboard, see Querying metrics.
You can use the `oc adm must-gather` CLI command to collect information about your cluster, your MetalLB configuration, and the MetalLB Operator. The following features and objects are associated with MetalLB and the MetalLB Operator:

The namespace and child objects that the MetalLB Operator is deployed in

All MetalLB Operator custom resource definitions (CRDs)

The `oc adm must-gather` CLI command collects the following information from FRRouting (FRR) that Red Hat uses to implement BGP and BFD:
/etc/frr/frr.conf
/etc/frr/frr.log
The `/etc/frr/daemons` configuration file
/etc/frr/vtysh.conf
The log files and the configuration files in the preceding list are collected from the `frr` container in each `speaker` pod.

In addition to the log files and the configuration files, the `oc adm must-gather` CLI command collects the output from the following `vtysh` commands:
show running-config
show bgp ipv4
show bgp ipv6
show bgp neighbor
show bfd peer
No additional configuration is required when you run the `oc adm must-gather` CLI command.
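For example, the following invocation collects the data and stores it in a local directory; the `--dest-dir` flag is optional:

$ oc adm must-gather --dest-dir=/tmp/metallb-must-gather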