kubectl get cs
NAME                 STATUS      MESSAGE                                                                                     ERROR
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
etcd-0               Healthy     {"health":"true"}

On a cluster installed with kubeadm, the scheduler and controller-manager report as Unhealthy.

This happens because kube-controller-manager.yaml and kube-scheduler.yaml under /etc/kubernetes/manifests set --port=0, which disables the insecure HTTP /healthz endpoint that the apiserver's component status check probes. Commenting out that flag in both files fixes it.

You can apply the change directly with the following commands:

sed -i 's/^[^#].*--port=0*/#&/g' /etc/kubernetes/manifests/kube-scheduler.yaml
sed -i 's/^[^#].*--port=0*/#&/g' /etc/kubernetes/manifests/kube-controller-manager.yaml
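
The kubelet watches /etc/kubernetes/manifests for static Pod changes and recreates both Pods automatically after the edit, so no manual restart is needed. After a few seconds, re-check the status:

kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}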

Change to kube-controller-manager.yaml: comment out line 27, as in the example below.

 1 apiVersion: v1
 2 kind: Pod
 3 metadata:
 4   creationTimestamp: null
 5   labels:
 6     component: kube-controller-manager
 7     tier: control-plane
 8   name: kube-controller-manager
 9   namespace: kube-system
10 spec:
11   containers:
12   - command:
13     - kube-controller-manager
14     - --allocate-node-cidrs=true
15     - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
16     - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
17     - --bind-address=127.0.0.1
18     - --client-ca-file=/etc/kubernetes/pki/ca.crt
19     - --cluster-cidr=10.244.0.0/16
20     - --cluster-name=kubernetes
21     - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
22     - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
23     - --controllers=*,bootstrapsigner,tokencleaner
24     - --kubeconfig=/etc/kubernetes/controller-manager.conf
25     - --leader-elect=true
26     - --node-cidr-mask-size=24
27   #  - --port=0
28     - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
29     - --root-ca-file=/etc/kubernetes/pki/ca.crt
30     - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
31     - --service-cluster-ip-range=10.1.0.0/16
32     - --use-service-account-credentials=true
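
kube-scheduler.yaml needs the same one-line change. A minimal sketch of its command section (the exact flag set varies by kubeadm version; only the commented --port=0 line matters):

spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
  #  - --port=0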

In another case, the component status shows controller-manager and scheduler as Unhealthy even though the cluster works normally. On TKE metacluster managed clusters, for example, this is because the apiserver does not run on the same node as the controller-manager and scheduler; it does not affect functionality. If they show Healthy, the apiserver is deployed on the same node as those components. In short, the result depends on the deployment topology.

For reference, see https://github.com/opsnull/follow-me-install-kubernetes-cluster/issues/492

# Access by IP: insecure mode, local access only
curl -s http://127.0.0.1:10252/metrics | head
curl -s http://127.0.0.1:10252/healthz | head
# Access by IP: HTTPS (secure) mode
curl -s --cacert /k8s/kubernetes/ssl/k8s-ca.pem --cert /k8s/kubernetes/ssl/admin.pem --key /k8s/kubernetes/ssl/admin-key.pem https://192.168.11.10:10257/metrics | head
curl -s --cacert /k8s/kubernetes/ssl/k8s-ca.pem --cert /k8s/kubernetes/ssl/admin.pem --key /k8s/kubernetes/ssl/admin-key.pem https://192.168.11.10:10257/healthz | head

curl -s --cacert /k8s/kubernetes/ssl/k8s-ca.pem --cert /k8s/kubernetes/ssl/admin.pem --key /k8s/kubernetes/ssl/admin-key.pem https://127.0.0.1:10257/metrics | head
curl -s --cacert /k8s/kubernetes/ssl/k8s-ca.pem --cert /k8s/kubernetes/ssl/admin.pem --key /k8s/kubernetes/ssl/admin-key.pem https://127.0.0.1:10257/healthz | head
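
The scheduler exposes the same endpoints on its own ports: 10251 (insecure HTTP) and 10259 (HTTPS). Assuming the same certificate paths as above:

curl -s http://127.0.0.1:10251/metrics | head
curl -s http://127.0.0.1:10251/healthz | head
curl -s --cacert /k8s/kubernetes/ssl/k8s-ca.pem --cert /k8s/kubernetes/ssl/admin.pem --key /k8s/kubernetes/ssl/admin-key.pem https://127.0.0.1:10259/healthz | head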

The root cause in more detail:

The apiserver's probes for controller-manager and scheduler are hardcoded to connect to the local machine:

func (s componentStatusStorage) serversToValidate() map[string]*componentstatus.Server {
    serversToValidate := map[string]*componentstatus.Server{
        "controller-manager": {Addr: "127.0.0.1", Port: ports.InsecureKubeControllerManagerPort, Path: "/healthz"},
        "scheduler":          {Addr: "127.0.0.1", Port: ports.InsecureSchedulerPort, Path: "/healthz"},
    }
    // ... rest of the function (which appends the etcd members to this map) omitted

The source can be viewed at https://github.com/kubernetes/kubernetes/blob/v1.14.3/pkg/registry/core/rest/storage_core.go#L256
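
In the ports package referenced above, InsecureKubeControllerManagerPort is 10252 and InsecureSchedulerPort is 10251. A minimal stand-alone Go sketch that reproduces the apiserver's probe by hand (a plain HTTP GET against the hardcoded local ports) shows why --port=0 produces exactly the "connection refused" error seen earlier:

package main

import (
    "fmt"
    "net/http"
    "time"
)

// Probe the same hardcoded local ports the apiserver uses for its
// componentstatus checks. With --port=0 set, nothing listens on these
// ports and each request fails with "connection refused".
func main() {
    client := &http.Client{Timeout: 2 * time.Second}
    targets := map[string]string{
        "controller-manager": "http://127.0.0.1:10252/healthz",
        "scheduler":          "http://127.0.0.1:10251/healthz",
    }
    for name, url := range targets {
        resp, err := client.Get(url)
        if err != nil {
            fmt.Printf("%-18s Unhealthy  %v\n", name, err)
            continue
        }
        resp.Body.Close()
        fmt.Printf("%-18s Healthy    %s\n", name, resp.Status)
    }
}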