Troubleshooting: a Kubernetes node that could not join the master
Running kubeadm join on the worker node Node1 could not complete. Here is part of the output:
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
Check the kubelet logs (-u filters to the kubelet unit, -e jumps to the end of the journal, -f follows new output):
journalctl -xefu kubelet
The error is clear: the two cgroup drivers do not match
"Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver:cgroup...
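To confirm the mismatch directly, you can print both drivers side by side. A quick diagnostic sketch, assuming Docker is the container runtime and the kubeadm default paths are in use:

```shell
# Driver Docker is actually using (typically "cgroupfs" out of the box)
docker info --format '{{.CgroupDriver}}'
# Driver the kubelet was configured with during kubeadm join
grep cgroupDriver /var/lib/kubelet/config.yaml
```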
Check the current Docker configuration. On my machine /etc/docker/daemon.json did not exist yet, so I created it and added the following:
vim /etc/docker/daemon.json
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
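If daemon.json already exists with other settings, overwriting it as above would lose them. A safer sketch is to merge the key in; the helper name below is mine, and it assumes python3 is available for the JSON editing:

```shell
# merge_exec_opts FILE DRIVER: add or overwrite the "exec-opts" key in a
# Docker daemon.json, preserving any other settings already in the file.
merge_exec_opts() {
  python3 - "$1" "$2" <<'PYEOF'
import json, os, sys

path, driver = sys.argv[1], sys.argv[2]
cfg = {}
if os.path.exists(path):
    with open(path) as f:
        cfg = json.load(f)
cfg["exec-opts"] = ["native.cgroupdriver=" + driver]
with open(path, "w") as f:
    json.dump(cfg, f, indent=2)
PYEOF
}

# On the node you would run:
#   merge_exec_opts /etc/docker/daemon.json systemd
```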
Then reload systemd and restart the affected services:
systemctl daemon-reload
systemctl restart docker
systemctl restart kubelet
# After the steps above, check the kubelet status
systemctl status kubelet
If the same error still appears, the drivers still do not match; change the systemd above to cgroupfs:
{
"exec-opts": ["native.cgroupdriver=cgroupfs"]
}
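Going in the other direction is also possible: leave Docker on its default driver and point the kubelet at it instead. A minimal sketch, where the helper name is mine and the real file on a node is /var/lib/kubelet/config.yaml (restart the kubelet afterwards):

```shell
# set_kubelet_driver FILE DRIVER: rewrite the cgroupDriver line in a
# kubelet config file so it matches the container runtime's driver.
set_kubelet_driver() {
  sed -i "s/^cgroupDriver:.*/cgroupDriver: $2/" "$1"
}

# On the node you would run:
#   set_kubelet_driver /var/lib/kubelet/config.yaml cgroupfs
#   systemctl restart kubelet
```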
After the change, repeat the reload-and-restart steps above.
With the kubelet healthy, go back to the master node and create a new token:
kubeadm token create --print-join-command
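If you only saved the token and lost the printed hash, the --discovery-token-ca-cert-hash value can be recomputed from the cluster CA certificate. This is the standard openssl recipe from the kubeadm documentation, wrapped in a small helper (the function name is mine):

```shell
# ca_cert_hash FILE: print the sha256 hash of the CA's public key,
# in the format kubeadm join expects after "sha256:".
ca_cert_hash() {
  openssl x509 -pubkey -in "$1" |
    openssl rsa -pubin -outform der 2>/dev/null |
    openssl dgst -sha256 -hex | sed 's/^.* //'
}

# On the master you would run:
#   echo "sha256:$(ca_cert_hash /etc/kubernetes/pki/ca.crt)"
```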
This prints a full join command; run it again on the worker node:
kubeadm join 192.168.23.71:6443 --token urqxik.8zr2zc92gvk2e430 --discovery-token-ca-cert-hash sha256:50ba00474686e024817b3025604f8f2048f8c9c2fed4f2a2521d30d2b3e04a79
It failed again, this time because files from the first join attempt already exist; they can be removed manually:
[root@node1 ~]# kubeadm join 192.168.23.71:6443 --token urqxik.8zr2zc92gvk2e430 --discovery-token-ca-cert-hash sha256:50ba00474686e024817b3025604f8f2048f8c9c2fed4f2a2521d30d2b3e04a79
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR FileAvailable--etc-kubernetes-bootstrap-kubelet.conf]: /etc/kubernetes/bootstrap-kubelet.conf already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Delete them and run the join once more (kubeadm reset -f is another way to clean up the leftovers of a previous join attempt):
rm -rf /etc/kubernetes/kubelet.conf /etc/kubernetes/pki/ca.crt /etc/kubernetes/bootstrap-kubelet.conf
kubeadm join 192.168.23.71:6443 --token urqxik.8zr2zc92gvk2e430 --discovery-token-ca-cert-hash sha256:50ba00474686e024817b3025604f8f2048f8c9c2fed4f2a2521d30d2b3e04a79
Switch back to the master node and check the cluster:
kubectl get nodes
Everything is back to normal.