
Troubleshooting a node that cannot join a Kubernetes cluster as a control-plane

2022-05-21 09:02:59

Tags: control 10.0 9.171 Kubernetes plane etcd kube


The following command was used to join kube-master1 to the k8s cluster as a control-plane node

kubeadm join k8s-api:6443 \
  --token ****** \
  --discovery-token-ca-cert-hash ****** \
  --control-plane \
  --certificate-key *****

The join hung at the step where the node joins the etcd cluster

[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[kubelet-check] Initial timeout of 40s passed.

The etcd logs under /var/log/containers contained this error

{
  "level": "warn",
  "ts": "2022-05-20T23:25:34.108Z",
  "caller": "etcdserver/cluster_util.go:79",
  "msg": "failed to get cluster response",
  "address": "https://10.0.9.171:2380/members",
  "error": "Get \"https://10.0.9.171:2380/members\": x509: certificate is valid for 10.0.1.81, 127.0.0.1, ::1, not 10.0.9.171"
}
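As a rough sketch of how such an entry can be pulled out of the container logs: the helper below assumes the files under /var/log/containers use the CRI line format (timestamp, stream, tag, then the message), which is the case with containerd but not with Docker's json-file driver. The function name etcd_warnings and the etcd-*.log glob in the usage note are illustrative assumptions, not part of the original steps.

```shell
# Hypothetical helper: read CRI-format log lines on stdin and print only
# the warn-level etcd messages.
etcd_warnings() {
  # Fields 1-3 are the CRI prefix (timestamp, stream, tag); the rest is
  # the JSON message that etcd itself wrote.
  cut -d ' ' -f 4- | grep -E '"level": *"warn"'
}
```

A possible invocation on the node that is trying to join, assuming the log file name starts with etcd-: cat /var/log/containers/etcd-*.log | etcd_warnings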

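According to the log, when https://10.0.9.171:2380/members was requested, 10.0.9.171 returned a certificate that does not cover that address. 10.0.9.171 is the existing control-plane node in the cluster, with hostname kube-master0. 10.0.1.81 is the former control-plane node, with hostname k8s-master0.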

Check the certificate with the openssl command

openssl s_client -showcerts -servername 10.0.9.171 -connect 10.0.9.171:2380

It is indeed a certificate problem: the server is presenting the certificate of the old k8s-master0

---
Certificate chain
 0 s:CN = k8s-master0
   i:CN = etcd-ca
-----BEGIN CERTIFICATE-----
******
-----END CERTIFICATE-----
---
Server certificate
subject=CN = k8s-master0

issuer=CN = etcd-ca

---
Acceptable client certificate CA names
CN = etcd-ca
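The same check can be scripted by extracting the subject CN from the s_client output instead of reading it by eye. This is a minimal sketch; the function name peer_subject_cn is hypothetical, and it assumes the output contains a "subject=" line in one of the two layouts OpenSSL commonly prints.

```shell
# Hypothetical helper: read `openssl s_client` output on stdin and print
# the CN of the first "subject=" line.
peer_subject_cn() {
  # Handles both "subject=CN = name" and the older "subject=/CN=name".
  sed -n 's/^subject=.*CN *= *\([^,/]*\).*$/\1/p' | head -n 1
}
```

One way to use it against the existing control-plane node: openssl s_client -connect 10.0.9.171:2380 </dev/null 2>/dev/null | peer_subject_cn. Redirecting stdin from /dev/null makes s_client exit once the handshake output has been printed.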

On the kube-master0 server, check the certificates in /etc/kubernetes/pki/etcd

openssl x509 -in server.crt -text -noout
openssl x509 -in peer.crt -text -noout

They are indeed still the certificates that k8s-master0 used.
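What matters in that text output is the Subject Alternative Name list, since that is what the x509 error complained about. The sketch below reads the output of openssl x509 -text -noout on stdin and tests whether a given IP is listed. The function names san_list and san_has_ip are hypothetical, and the parsing assumes OpenSSL's usual layout, where the SAN entries sit on the line right after the "X509v3 Subject Alternative Name" header.

```shell
# Hypothetical helper: print the SAN line from `openssl x509 -text` output.
san_list() {
  awk '/X509v3 Subject Alternative Name/ { getline; sub(/^[ \t]+/, ""); print; exit }'
}

# Hypothetical helper: succeed if the IP given as $1 is among the SAN entries.
san_has_ip() {
  # Split the comma-separated SAN line into one entry per line, trim
  # leading spaces, then look for an exact "IP Address:<ip>" entry.
  san_list | tr ',' '\n' | sed 's/^ *//' | grep -Fxq "IP Address:$1"
}
```

For example: openssl x509 -in server.crt -text -noout | san_has_ip 10.0.9.171 || echo "10.0.9.171 is not covered". With the stale certificates described here, 10.0.1.81 would be listed and 10.0.9.171 would not.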

With the cause known, the fix is straightforward: regenerate the certificates used by etcd.

Delete every certificate file in /etc/kubernetes/pki/etcd except ca.crt and ca.key, then regenerate the certificates with the following commands

kubeadm init phase certs etcd-server
kubeadm init phase certs etcd-peer
kubeadm init phase certs etcd-healthcheck-client
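The deletion that precedes these commands can be sketched as a small function. Note that this variant moves the old files into a backup directory instead of deleting them, which is a cautious substitution and not what the original steps describe. The function name stash_etcd_leaf_certs and the backup path are hypothetical.

```shell
# Hypothetical helper: move every regular file except ca.crt and ca.key
# out of the etcd PKI directory.
#   $1: PKI directory (/etc/kubernetes/pki/etcd in this article)
#   $2: backup directory (an assumption; the article simply deletes)
stash_etcd_leaf_certs() {
  pki_dir=$1
  backup_dir=$2
  mkdir -p "$backup_dir" || return 1
  for f in "$pki_dir"/*; do
    [ -f "$f" ] || continue            # skip directories and empty globs
    case "$(basename "$f")" in
      ca.crt|ca.key) ;;                # keep the etcd CA pair in place
      *) mv "$f" "$backup_dir"/ || return 1 ;;
    esac
  done
}
```

A possible invocation before running the three kubeadm commands: stash_etcd_leaf_certs /etc/kubernetes/pki/etcd /root/etcd-pki-backup. Keeping the CA pair in place matters because the regenerated certificates are signed by it.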

On kube-master0, remove kube-master1, which failed to join, from the cluster

kubectl delete node kube-master1

On kube-master1, reset the node and join the cluster again

kubeadm reset
kubeadm join k8s-api:6443 ...

The join succeeded and the problem was finally solved.

[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
The 'update-status' phase is deprecated and will be removed in a future release. Currently it performs no operation
[mark-control-plane] Marking the node kube-master1 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node kube-master1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule node-role.kubernetes.io/control-plane:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

Source: https://www.cnblogs.com/dudu/p/16294338.html
