服务器之家:专注于VPS、云服务器配置技术及软件下载分享
分类导航

云服务器|WEB服务器|FTP服务器|邮件服务器|虚拟主机|服务器安全|DNS服务器|服务器知识|Nginx|IIS|Tomcat|

服务器之家 - 服务器技术 - Nginx - Kubernetes中Nginx服务启动失败排查流程分析(Error: ImagePullBackOff)

Kubernetes中Nginx服务启动失败排查流程分析(Error: ImagePullBackOff)

2023-03-15 17:09xybDIY Nginx

这篇文章主要介绍了Kubernetes中Nginx服务启动失败排查流程(Error: ImagePullBackOff),本文给大家介绍的非常详细,对大家的学习或工作具有一定的参考借鉴价值,需要的朋友可以参考下

pod节点启动失败,nginx服务无法正常访问,服务状态显示为ImagePullBackOff

?
1
2
3
[root@m1 ~]# kubectl get pods
NAME                    READY   STATUS             RESTARTS   AGE
nginx-f89759699-cgjgp   0/1     ImagePullBackOff   0          103m

查看nginx服务的Pod节点详细信息。

?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
[root@m1 ~]# kubectl describe pod nginx-f89759699-cgjgp
Name:             nginx-f89759699-cgjgp
Namespace:        default
Priority:         0
Service Account:  default
Node:             n1/192.168.200.84
Start Time:       Fri, 10 Mar 2023 08:40:33 +0800
Labels:           app=nginx
                  pod-template-hash=f89759699
Annotations:      <none>
Status:           Pending
IP:               10.244.3.20
IPs:
  IP:           10.244.3.20
Controlled By:  ReplicaSet/nginx-f89759699
Containers:
  nginx:
    Container ID:  
    Image:          nginx
    Image ID:      
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zk8sj (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-zk8sj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-zk8sj
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                     From     Message
  ----     ------   ----                    ----     -------
  Normal   BackOff  57m (x179 over 100m)    kubelet  Back-off pulling image "nginx"
  Normal   Pulling  7m33s (x22 over 100m)   kubelet  Pulling image "nginx"
  Warning  Failed   2m30s (x417 over 100m)  kubelet  Error: ImagePullBackOff

发现,获取nginx镜像失败。可能是由于Docker服务引起的。

于是,检查Docker是否正常启动

?
1
systemctl status docker

发现,docker服务启动失败,手动尝试重新启动。

?
1
systemctl restart docker

但是,重启docker服务失败,出现如下报错信息。

?
1
2
3
[root@m1 ~]# systemctl restart docker
Job for docker.service failed because the control process exited with error code.
See "systemctl status docker.service" and "journalctl -xe" for details.

执行systemctl restart docker命令失效。

接着,当执行docker version命令时,发现未能连接到Docker daemon

?
1
2
3
4
5
6
7
8
9
10
11
[root@m1 ~]# docker version
Client: Docker Engine - Community
 Version:           20.10.17
 API version:       1.41
 Go version:        go1.17.11
 Git commit:        100c701
 Built:             Mon Jun  6 23:03:11 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

于是,再次通过执行systemctl status docker命令,查看docker服务未能启动,阅读输出报错信息,如下所示。

?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
[root@m1 ~]# systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2023-03-10 10:28:16 CST; 4min 35s ago
     Docs: https://docs.docker.com
 Main PID: 2221 (code=exited, status=1/FAILURE)
 
Mar 10 10:28:13 m1 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Mar 10 10:28:13 m1 systemd[1]: docker.service: Failed with result 'exit-code'.
Mar 10 10:28:13 m1 systemd[1]: Failed to start Docker Application Container Engine.
Mar 10 10:28:16 m1 systemd[1]: docker.service: Service RestartSec=2s expired, scheduling restart.
Mar 10 10:28:16 m1 systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Mar 10 10:28:16 m1 systemd[1]: Stopped Docker Application Container Engine.
Mar 10 10:28:16 m1 systemd[1]: docker.service: Start request repeated too quickly.
Mar 10 10:28:16 m1 systemd[1]: docker.service: Failed with result 'exit-code'.
Mar 10 10:28:16 m1 systemd[1]: Failed to start Docker Application Container Engine.
[root@m1 ~]#

通过上述输出显示,Docker 服务进程的启动失败,状态为 1/FAILURE

接下来,尝试通过以下步骤来排查和解决问题:

查看 Docker 服务日志:使用以下命令查看 Docker 服务日志,以便更详细地了解失败原因。

?
1
sudo journalctl -u docker.service

Kubernetes中Nginx服务启动失败排查流程分析(Error: ImagePullBackOff)

 通过输出Ddocker日志分析,提取到了相关报错信息片段,发现是配置daemon中的/etc/docker/daemon.json配置文件出错导致的。

?
1
2
3
4
5
6
7
8
Mar 10 10:20:17 m1 systemd[1]: Starting Docker Application Container Engine...
Mar 10 10:20:17 m1 dockerd[1572]: unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character '"' after object key:value pair
Mar 10 10:20:17 m1 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Mar 10 10:20:17 m1 systemd[1]: docker.service: Failed with result 'exit-code'.
Mar 10 10:20:17 m1 systemd[1]: Failed to start Docker Application Container Engine.
Mar 10 10:20:19 m1 systemd[1]: docker.service: Service RestartSec=2s expired, scheduling restart.
Mar 10 10:20:19 m1 systemd[1]: docker.service: Scheduled restart job, restart counter is at 2.
Mar 10 10:20:19 m1 systemd[1]: Stopped Docker Application Container Engine.

此时,查看daemon配置文件/etc/docker/daemon.json是否配置正确。

?
1
2
3
4
5
6
7
[root@m1 ~]# cat /etc/docker/daemon.json
{  
  # 设置 Docker 镜像的注册表镜像源为阿里云镜像源。
  "registry-mirrors": ["https://w2kavmmf.mirror.aliyuncs.com"]
  # 指定 Docker 守护进程使用 systemd 作为 cgroup driver。
  "exec-opts": ["native.cgroupdriver=systemd"]
}

咋一看,配置信息没有什么问题,都是正确的,但仔细一看,就会发现应该在"registry-mirrors"选项的结尾添加逗号。犯了缺少逗号(,)导致的语法错误,终于找到了问题根源。

修改后:

?
1
2
3
4
5
6
7
8
9
10
11
[root@m1 ~]# cat /etc/docker/daemon.json
{
  "registry-mirrors": ["https://w2kavmmf.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
 
[root@m1 ~]# cat /etc/docker/daemon.json
{
  "registry-mirrors": ["https://w2kavmmf.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}

按下:wq报错退出。

 重新加载系统并重新启动Docker服务

?
1
2
3
systemctl daemon-reload
systemctl restart docker
systemctl status docker

检查docker版本信息是否输出正常

?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
[root@m1 ~]# docket version
-bash: docket: command not found
[root@m1 ~]# docker version
Client: Docker Engine - Community
 Version:           20.10.17
 API version:       1.41
 Go version:        go1.17.11
 Git commit:        100c701
 Built:             Mon Jun  6 23:03:11 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true
 
Server: Docker Engine - Community
 Engine:
  Version:          20.10.17
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.17.11
  Git commit:       a89b842
  Built:            Mon Jun  6 23:01:29 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.6
  GitCommit:        10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc:
  Version:          1.1.2
  GitCommit:        v1.1.2-0-ga916309
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
[root@m1 ~]# docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)
 
Server:
 Containers: 20
  Running: 8
  Paused: 0
  Stopped: 12
 Images: 20
 Server Version: 20.10.17
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc version: v1.1.2-0-ga916309
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.18.0-372.9.1.el8.x86_64
 Operating System: Rocky Linux 8.6 (Green Obsidian)
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 9.711GiB
 Name: m1
 ID: 4YIS:FHSB:YXRI:CED5:PJSJ:EAS2:BCR3:GJJF:FDPK:EDJH:DVKU:AIYJ
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  https://w2kavmmf.mirror.aliyuncs.com/
 Live Restore Enabled: false

至此,Docker服务重启成功,pod节点恢复正常,Nginx服务能够正常访问。

?
1
2
3
[root@m1 ~]# kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
nginx-f89759699-cgjgp   1/1     Running   0          174m

查看pod详细信息,显示正常。

?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
[root@m1 ~]# kubectl describe pod nginx-f89759699-cgjgp
Name:             nginx-f89759699-cgjgp
Namespace:        default
Priority:         0
Service Account:  default
Node:             n1/192.168.200.84
Start Time:       Fri, 10 Mar 2023 08:40:33 +0800
Labels:           app=nginx
                  pod-template-hash=f89759699
Annotations:      <none>
Status:           Running
IP:               10.244.3.20
IPs:
  IP:           10.244.3.20
Controlled By:  ReplicaSet/nginx-f89759699
Containers:
  nginx:
    Container ID:   docker://88bdc2bfa592f60bf99bac2125b0adae005118ae8f2f271225245f20b7cfb3c8
    Image:          nginx
    Image ID:       docker-pullable://nginx@sha256:aa0afebbb3cfa473099a62c4b32e9b3fb73ed23f2a75a65ce1d4b4f55a5c2ef2
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Fri, 10 Mar 2023 10:37:42 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zk8sj (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-zk8sj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-zk8sj
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason   Age                   From     Message
  ----    ------   ----                  ----     -------
  Normal  BackOff  58m (x480 over 171m)  kubelet  Back-off pulling image "nginx"
[root@m1 ~]#

Kubernetes中Nginx服务启动失败排查流程分析(Error: ImagePullBackOff)

到此这篇关于Kubernetes中Nginx服务启动失败排查流程(Error: ImagePullBackOff)的文章就介绍到这了,更多相关Nginx服务启动失败内容请搜索服务器之家以前的文章或继续浏览下面的相关文章希望大家以后多多支持服务器之家!

原文链接:https://blog.csdn.net/qq_45392321/article/details/129440238

延伸 · 阅读

精彩推荐
  • Nginxnginx开启HSTS让浏览器强制跳转HTTPS访问详解

    nginx开启HSTS让浏览器强制跳转HTTPS访问详解

    这篇文章主要介绍了nginx开启HSTS让浏览器强制跳转HTTPS访问详解,小编觉得挺不错的,现在分享给大家,也给大家做个参考。一起跟随小编过来看看吧 ...

    龙恩07074742020-01-08
  • Nginxnginx不支持apk ipa文件下载的设置方法

    nginx不支持apk ipa文件下载的设置方法

    今天在帮客户配置nginx服务器的时候,对方需要支持apk ipa文件下载,这里简单分享下,方便需要的朋友 ...

    nginx技术网8602019-10-17
  • NginxNginx 如何限制访问频率,下载速率和并发连接数的方法

    Nginx 如何限制访问频率,下载速率和并发连接数的方法

    这篇文章主要介绍了Nginx 如何限制访问频率,下载速率和并发连接数的方法,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学...

    Tom-时光5232020-01-13
  • Nginxnginx出现500 internal server error什么意思

    nginx出现500 internal server error什么意思

    我们在使用nginx的时候,有的小伙伴可能就会遇到系统提示500 internal server error的情况,又不知道具体是什么意思。那么对于这种问题小编认为可能是内部服...

    Nginx教程网12912021-06-24
  • NginxNginx中rewrite(地址重定向)的深入剖析

    Nginx中rewrite(地址重定向)的深入剖析

    Rewrite主要实现url地址重写,以及url地址跳转,下面这篇文章主要给大家介绍了关于Nginx中rewrite(地址重定向)的深入剖析,文中通过实例代码介绍的非常详细,需要...

    阁下何不踏风起9422022-10-09
  • NginxNginx 中文域名配置详解及实现

    Nginx 中文域名配置详解及实现

    这篇文章主要介绍了Nginx中 文域名配置详解及实现的相关资料,Nginx虚拟主机上绑定一个带中文域名但是不能跳转,这里给大家说下如何实现,需要的朋友可...

    lqh4002019-11-19
  • NginxNginx配置使用详解

    Nginx配置使用详解

    Nginx是一个高性能的HTTP和反向代理web服务器。本文详细讲解了Nginx配置使用的方法,对大家的学习或者工作具有一定的参考学习价值,需要的朋友们下面随...

    小旭202111402022-07-04
  • Nginxnginx rewrite 伪静态配置参数和使用例子

    nginx rewrite 伪静态配置参数和使用例子

    nginx下伪静态配置参数详细说明,使用nginx的朋友,nginx rewrite 伪静态配置参数和使用例子 附正则使用说明 ...

    服务器之家2572019-10-08