OS information
Bare-metal servers: 108 cores, 128 GB RAM
OS (lsb_release -a output):
Distributor ID: Ubuntu
Description: Ubuntu 20.04.6 LTS
Release: 20.04
Codename: focal
Kubernetes version
Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.12", GitCommit:"b058e1760c79f46a834ba59bd7a3486ecf28237d", GitTreeState:"clean", BuildDate:"2022-07-13T14:59:18Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.12", GitCommit:"b058e1760c79f46a834ba59bd7a3486ecf28237d", GitTreeState:"clean", BuildDate:"2022-07-13T14:53:39Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
Container runtime
Output of docker version / crictl version / nerdctl version:
Client:
 Version:           20.10.8
 API version:       1.41
 Go version:        go1.16.6
 Git commit:        3967b7d
 Built:             Fri Jul 30 19:50:40 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.8
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.6
  Git commit:       75249d8
  Built:            Fri Jul 30 19:55:09 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.4.9
  GitCommit:        e25210fe30a0a703442421b0f60afac609f950a3
 runc:
  Version:          1.0.1
  GitCommit:        v1.0.1-0-g4144b638
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
- crictl version: command not found
- nerdctl version: command not found
KubeSphere version
KubeSphere v3.3.2 (see the ks-apiserver image tag below), installed with kk. Output of kk version:
kk version: &version.Info{Major:"3", Minor:"0", GitVersion:"v3.0.7", GitCommit:"e755baf67198d565689d7207378174f429b508ba", GitTreeState:"clean", BuildDate:"2023-01-18T01:57:24Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}
Offline installation.
What is the problem
The offline installation completed successfully, but after rebooting the whole cluster, logging in on the web console fails with: "request to http://ks-apiserver/oauth/token failed, reason: getaddrinfo EAI_AGAIN ks-apiserver".
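The EAI_AGAIN in that message comes from ks-console failing to resolve the service name ks-apiserver, so this looks like an in-cluster DNS problem rather than a ks-apiserver bug. A minimal check, assuming the ks-console image ships nslookup (the pod name is a placeholder; substitute one from the list below):

kubectl -n kubesphere-system exec -it <ks-console-pod> -- nslookup ks-apiserver.kubesphere-system.svc

Full pod status after the reboot (kubectl get pods -A):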
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-846ddd49bc-6szkv 1/1 Running 5 (64s ago) 28h
kube-system calico-node-8tkmg 1/1 Running 6 (112m ago) 28h
kube-system calico-node-mbh7m 1/1 Running 2 (116m ago) 28h
kube-system calico-node-mpvtq 1/1 Running 3 (116m ago) 28h
kube-system calico-node-sz7qm 1/1 Running 2 (116m ago) 28h
kube-system coredns-558b97598-fwhww 1/1 Running 3 (116m ago) 28h
kube-system coredns-558b97598-s64mn 1/1 Running 3 (116m ago) 28h
kube-system haproxy-worker-216-01 1/1 Running 2 (116m ago) 28h
kube-system kube-apiserver-master-213-01 1/1 Running 37 (115m ago) 28h
kube-system kube-apiserver-master-214-02 1/1 Running 31 (116m ago) 28h
kube-system kube-apiserver-master-215-03 1/1 Running 31 (116m ago) 28h
kube-system kube-controller-manager-master-213-01 1/1 Running 7 (116m ago) 28h
kube-system kube-controller-manager-master-214-02 1/1 Running 4 (116m ago) 28h
kube-system kube-controller-manager-master-215-03 1/1 Running 3 (116m ago) 28h
kube-system kube-proxy-4dpjk 1/1 Running 2 (116m ago) 28h
kube-system kube-proxy-dlfjk 1/1 Running 7 (112m ago) 28h
kube-system kube-proxy-mj6fp 1/1 Running 4 (116m ago) 28h
kube-system kube-proxy-x9nwz 1/1 Running 2 (116m ago) 28h
kube-system kube-scheduler-master-213-01 1/1 Running 7 (116m ago) 28h
kube-system kube-scheduler-master-214-02 1/1 Running 3 (116m ago) 28h
kube-system kube-scheduler-master-215-03 1/1 Running 3 (116m ago) 28h
kube-system nodelocaldns-9tvcv 0/1 CrashLoopBackOff 80 (25s ago) 28h
kube-system nodelocaldns-ddvg2 0/1 CrashLoopBackOff 95 (35s ago) 28h
kube-system nodelocaldns-fmmrm 0/1 CrashLoopBackOff 61 (32s ago) 28h
kube-system nodelocaldns-g4f4x 0/1 CrashLoopBackOff 84 (48s ago) 28h
kube-system openebs-localpv-provisioner-6f54869bc7-6mn6b 0/1 Error 29 8h
kube-system snapshot-controller-0 1/1 Running 1 (116m ago) 6h45m
kubesphere-controls-system default-http-backend-59d5cf569f-4gsjb 0/1 Error 0 8h
kubesphere-controls-system kubectl-admin-7ffdf4596b-82rfv 1/1 Running 1 (116m ago) 8h
kubesphere-monitoring-system alertmanager-main-0 1/2 Running 2 (116m ago) 6h45m
kubesphere-monitoring-system alertmanager-main-1 0/2 Completed 0 6h45m
kubesphere-monitoring-system alertmanager-main-2 0/2 Completed 0 6h45m
kubesphere-monitoring-system kube-state-metrics-5474f8f7b-sfjfc 0/3 Completed 1 8h
kubesphere-monitoring-system node-exporter-cq78v 2/2 Running 4 (116m ago) 28h
kubesphere-monitoring-system node-exporter-fs6lh 2/2 Running 10 (112m ago) 28h
kubesphere-monitoring-system node-exporter-svwtp 2/2 Running 4 (116m ago) 28h
kubesphere-monitoring-system node-exporter-wtbgp 2/2 Running 8 (116m ago) 28h
kubesphere-monitoring-system notification-manager-deployment-7b586bd8fb-j4g86 0/2 Error 2 8h
kubesphere-monitoring-system notification-manager-deployment-7b586bd8fb-ljfz4 2/2 Running 4 (113m ago) 8h
kubesphere-monitoring-system notification-manager-operator-64ff97cb98-j2tzf 0/2 Completed 30 8h
kubesphere-monitoring-system prometheus-k8s-0 0/2 Error 0 6h45m
kubesphere-monitoring-system prometheus-k8s-1 0/2 Error 0 6h45m
kubesphere-monitoring-system prometheus-operator-64b7b4db85-qhhbn 0/2 Completed 1 8h
kubesphere-system ks-apiserver-848bfd75fd-4tnbz 0/1 CrashLoopBackOff 58 (23s ago) 28h
kubesphere-system ks-apiserver-848bfd75fd-cjbwv 0/1 Error 47 (57s ago) 6h49m
kubesphere-system ks-apiserver-848bfd75fd-k52lc 0/1 CrashLoopBackOff 59 (18s ago) 28h
kubesphere-system ks-console-868887c49f-9ltmd 1/1 Running 3 (116m ago) 28h
kubesphere-system ks-console-868887c49f-lql7m 1/1 Running 1 (112m ago) 6h50m
kubesphere-system ks-console-868887c49f-vsn57 1/1 Running 2 (116m ago) 28h
kubesphere-system ks-controller-manager-67b896bb6d-2rrrb 1/1 Running 5 (43s ago) 28h
kubesphere-system ks-controller-manager-67b896bb6d-9dx65 1/1 Running 1 (112m ago) 6h49m
kubesphere-system ks-controller-manager-67b896bb6d-tlj8h 1/1 Running 3 (116m ago) 28h
kubesphere-system ks-installer-5655f896fb-5k28b 0/1 Completed 1 8h
kubesphere-system redis-7cc8746478-g2p9c 1/1 Running 5 (112m ago) 28h
pic-distribute mysql-v1-0 0/1 Completed 0 6h44m
Filtering out the pods that are not Running:
kubectl get pods -A | grep -v Running
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system nodelocaldns-9tvcv 0/1 CrashLoopBackOff 86 (35s ago) 28h
kube-system nodelocaldns-ddvg2 0/1 CrashLoopBackOff 99 (3m4s ago) 28h
kube-system nodelocaldns-fmmrm 0/1 CrashLoopBackOff 66 (2m23s ago) 28h
kube-system nodelocaldns-g4f4x 0/1 CrashLoopBackOff 87 (4m43s ago) 28h
kubesphere-system ks-apiserver-848bfd75fd-4tnbz 0/1 CrashLoopBackOff 64 (61s ago) 28h
kubesphere-system ks-apiserver-848bfd75fd-cjbwv 0/1 CrashLoopBackOff 51 (2m25s ago) 7h6m
kubesphere-system ks-apiserver-848bfd75fd-k52lc 0/1 CrashLoopBackOff 65 (56s ago) 28h
Description of nodelocaldns-9tvcv:
root@master-214-02:~# kubectl describe pod nodelocaldns-9tvcv -n kube-system
Name:                 nodelocaldns-9tvcv
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 worker-216-01/192.168.50.216
Start Time:           Thu, 16 Nov 2023 06:26:21 +0000
Labels:               controller-revision-hash=5855b6bfd
                      k8s-app=nodelocaldns
                      pod-template-generation=1
Annotations:          prometheus.io/port: 9253
                      prometheus.io/scrape: true
Status:               Running
IP:                   192.168.50.216
IPs:
  IP:  192.168.50.216
Controlled By:  DaemonSet/nodelocaldns
Containers:
  node-cache:
    Container ID:  docker://bcdd43f3ea3658b7cd4aabc9afd5fc9897ab77a6835744d916abaf95275a20f8
    Image:         dockerhub.kubekey.local/kubesphereio/k8s-dns-node-cache:1.15.12
    Image ID:      docker-pullable://dockerhub.kubekey.local/kubesphereio/k8s-dns-node-cache@sha256:b6b9dc5cb4ab54aea6905ceeceb61e54791bc2acadecbea65db3641d99c7fe69
    Ports:         53/UDP, 53/TCP, 9253/TCP
    Host Ports:    53/UDP, 53/TCP, 9253/TCP
    Args:
      -localip
      169.254.25.10
      -conf
      /etc/coredns/Corefile
      -upstreamsvc
      coredns
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 17 Nov 2023 11:04:09 +0000
      Finished:     Fri, 17 Nov 2023 11:04:09 +0000
    Ready:          False
    Restart Count:  86
    Limits:
      memory:  170Mi
    Requests:
      cpu:       100m
      memory:    70Mi
    Liveness:    http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
    Readiness:   http-get http://169.254.25.10:9254/health delay=0s timeout=5s period=10s #success=1 #failure=10
    Environment: <none>
    Mounts:
      /etc/coredns from config-volume (rw)
      /run/xtables.lock from xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vlnth (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nodelocaldns
    Optional:  false
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  kube-api-access-vlnth:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 :NoSchedule op=Exists
                             :NoExecute op=Exists
                             CriticalAddonsOnly op=Exists
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason          Age                   From     Message
  ----     ------          ----                  ----     -------
  Normal   SandboxChanged  18m                   kubelet  Pod sandbox changed, it will be killed and re-created.
  Warning  Unhealthy       18m                   kubelet  Readiness probe failed: Get "http://169.254.25.10:9254/health": dial tcp 169.254.25.10:9254: connect: connection refused
  Normal   Started         17m (x3 over 18m)     kubelet  Started container node-cache
  Normal   Pulled          16m (x4 over 18m)     kubelet  Container image "dockerhub.kubekey.local/kubesphereio/k8s-dns-node-cache:1.15.12" already present on machine
  Normal   Created         16m (x4 over 18m)     kubelet  Created container node-cache
  Warning  BackOff         3m54s (x78 over 18m)  kubelet  Back-off restarting failed container
Logs of nodelocaldns-9tvcv:
kubectl logs -f nodelocaldns-9tvcv -n kube-system
2023/11/17 11:04:09 [INFO] Using Corefile /etc/coredns/Corefile
2023/11/17 11:04:09 [ERROR] Failed to read node-cache coreFile /etc/coredns/Corefile.base - open /etc/coredns/Corefile.base: no such file or directory
2023/11/17 11:04:09 [ERROR] Failed to sync kube-dns config directory /etc/kube-dns, err: lstat /etc/kube-dns: no such file or directory
cluster.local.:53 on 169.254.25.10
in-addr.arpa.:53 on 169.254.25.10
ip6.arpa.:53 on 169.254.25.10
.:53 on 169.254.25.10
[INFO] plugin/reload: Running configuration MD5 = adf97d6b4504ff12113ebb35f0c6413e
CoreDNS-1.6.7
linux/amd64, go1.11.13,
[FATAL] plugin/loop: Loop (169.254.25.10:54381 -> 169.254.25.10:53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 2309482293177081995.1716274136328598708."
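Per the linked loop troubleshooting page, this FATAL means node-cache sent a probe query to its upstream and received it back on 169.254.25.10:53, i.e. after the reboot its upstream resolves back to node-local-dns itself. The two ERROR lines about Corefile.base and /etc/kube-dns are commonly benign for this image; the loop is what exits the container. A sketch of how to locate the loop source (the ConfigMap name nodelocaldns comes from the config-volume in the describe output above; run the resolv.conf check on the affected node, worker-216-01):

kubectl -n kube-system get configmap nodelocaldns -o yaml | grep -B1 -A3 forward
cat /etc/resolv.conf   # a "nameserver 169.254.25.10" or 127.x entry here would feed queries back into node-cache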
Description of ks-apiserver-848bfd75fd-4tnbz:
kubectl describe pod -n kubesphere-system ks-apiserver-848bfd75fd-4tnbz
Name:         ks-apiserver-848bfd75fd-4tnbz
Namespace:    kubesphere-system
Priority:     0
Node:         master-214-02/192.168.50.214
Start Time:   Thu, 16 Nov 2023 06:28:52 +0000
Labels:       app=ks-apiserver
              pod-template-hash=848bfd75fd
              tier=backend
Annotations:  cni.projectcalico.org/containerID: 4e2209e4b5c0d712d66cab6b82f276cc3f5eabb2451ddd5ad37526995963239d
              cni.projectcalico.org/podIP: 10.233.109.20/32
              cni.projectcalico.org/podIPs: 10.233.109.20/32
Status:       Running
IP:           10.233.109.20
IPs:
  IP:           10.233.109.20
Controlled By:  ReplicaSet/ks-apiserver-848bfd75fd
Containers:
  ks-apiserver:
    Container ID:  docker://a29abea9b50633e849ba87955f72e71d093be684691c3f1ab279aade83dacacc
    Image:         dockerhub.kubekey.local/kubesphereio/ks-apiserver:v3.3.2
    Image ID:      docker-pullable://dockerhub.kubekey.local/kubesphereio/ks-apiserver@sha256:78d856f371d0981f9acef156da3869cf8b0a609bedf93f7d6a6d98d77d40ecd8
    Port:          9090/TCP
    Host Port:     0/TCP
    Command:
      ks-apiserver
      --logtostderr=true
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 17 Nov 2023 11:08:56 +0000
      Finished:     Fri, 17 Nov 2023 11:09:01 +0000
    Ready:          False
    Restart Count:  65
    Limits:
      cpu:     1
      memory:  1Gi
    Requests:
      cpu:     20m
      memory:  100Mi
    Liveness:  http-get http://:9090/kapis/version delay=15s timeout=15s period=10s #success=1 #failure=8
    Environment:
      KUBESPHERE_CACHE_OPTIONS_PASSWORD:  <set to the key 'auth' in secret 'redis-secret'>  Optional: false
    Mounts:
      /etc/kubesphere/ from kubesphere-config (rw)
      /etc/kubesphere/ingress-controller from ks-router-config (rw)
      /etc/localtime from host-time (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-6lhcp (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  ks-router-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ks-router-config
    Optional:  false
  kubesphere-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kubesphere-config
    Optional:  false
  host-time:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/localtime
    HostPathType:
  kube-api-access-6lhcp:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 CriticalAddonsOnly op=Exists
                             node-role.kubernetes.io/master:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 60s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 60s
Events:
  Type     Reason          Age                   From     Message
  ----     ------          ----                  ----     -------
  Normal   SandboxChanged  24m                   kubelet  Pod sandbox changed, it will be killed and re-created.
  Warning  Unhealthy       23m (x3 over 23m)     kubelet  Liveness probe failed: Get "http://10.233.109.20:9090/kapis/version": dial tcp 10.233.109.20:9090: connect: connection refused
  Normal   Pulled          21m (x4 over 24m)     kubelet  Container image "dockerhub.kubekey.local/kubesphereio/ks-apiserver:v3.3.2" already present on machine
  Normal   Created         21m (x4 over 24m)     kubelet  Created container ks-apiserver
  Normal   Started         21m (x4 over 24m)     kubelet  Started container ks-apiserver
  Warning  BackOff         4m13s (x92 over 23m)  kubelet  Back-off restarting failed container
Logs of ks-apiserver-848bfd75fd-4tnbz:
kubectl logs -f -n kubesphere-system ks-apiserver-848bfd75fd-4tnbz
W1117 11:08:56.216916 1 client_config.go:615] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
W1117 11:08:56.219836 1 client_config.go:615] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
W1117 11:08:56.236662 1 metricsserver.go:238] Metrics API not available.
E1117 11:09:01.237069 1 cache.go:76] failed to create cache, error: dial tcp: i/o timeout
Error: failed to create cache, error: dial tcp: i/o timeout
2023/11/17 11:09:01 failed to create cache, error: dial tcp: i/o timeout
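These last lines tie the two failures together: the "cache" here is the Redis-backed cache (note the redis-secret environment variable in the describe output and the running redis pod in kubesphere-system), and the dial times out because service DNS is broken, not because Redis is down. A hedged way to confirm, assuming some image with nslookup exists in the offline registry (the image path below is an assumption):

kubectl -n kubesphere-system run dns-probe --rm -it --restart=Never --image=dockerhub.kubekey.local/kubesphereio/busybox:1.31.1 -- nslookup redis.kubesphere-system.svc

If that lookup fails, fixing the nodelocaldns loop described above should let ks-apiserver start again, after which the console login error should clear.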