A deployed pod is stuck in CrashLoopBackOff status

Author: chenhaozjnubit · Published: 2018-8-10 9:44

1 Problem Description

After creating the pod with the command kubectl create -f myubuntu_deploy.yaml --record, the pod showed a status of CrashLoopBackOff.

CrashLoopBackOff tells us that Kubernetes is doing its best to start the Pod, but one or more of its containers has already died or is being torn down, so the Pod is repeatedly restarted.

The same problem is described in the StackOverflow question linked in the references below. Quoting the question:

This is what I keep getting:

[root@centos-master ~]# kubectl get pods
NAME               READY     STATUS             RESTARTS   AGE
nfs-server-h6nw8   1/1       Running            0          1h
nfs-web-07rxz      0/1       CrashLoopBackOff   8          16m
nfs-web-fdr9h      0/1       CrashLoopBackOff   8          16m

Below is the output of kubectl describe pods:

Events:
  FirstSeen LastSeen Count From                      SubobjectPath        Type     Reason     Message
  --------- -------- ----- ----                      -------------        ----     ------     -------
  16m       16m      1     {default-scheduler }                           Normal   Scheduled  Successfully assigned nfs-web-fdr9h to centos-minion-2
  16m       16m      1     {kubelet centos-minion-2} spec.containers{web} Normal   Created    Created container with docker id 495fcbb06836
  16m       16m      1     {kubelet centos-minion-2} spec.containers{web} Normal   Started    Started container with docker id 495fcbb06836
  16m       16m      1     {kubelet centos-minion-2} spec.containers{web} Normal   Started    Started container with docker id d56f34ae4e8f
  16m       16m      1     {kubelet centos-minion-2} spec.containers{web} Normal   Created    Created container with docker id d56f34ae4e8f
  16m       16m      2     {kubelet centos-minion-2}                      Warning  FailedSync Error syncing pod, skipping: failed to "StartContainer" for "web" with CrashLoopBackOff: "Back-off 10s restarting failed container=web pod=nfs-web-fdr9h_default(461c937d-d870-11e6-98de-005056040cc2)"

I have two pods, nfs-web-07rxz and nfs-web-fdr9h, but running kubectl logs nfs-web-07rxz (with or without the -p option) shows no log output for either pod.

[root@centos-master ~]# kubectl logs nfs-web-07rxz -p
[root@centos-master ~]# kubectl logs nfs-web-07rxz
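The original question does not mention it, but when the logs come back empty, one more place worth checking (a suggestion, not from the post) is the container's last termination state recorded in the pod object:

kubectl get pod nfs-web-07rxz -o yaml
# look under status.containerStatuses: lastState.terminated.exitCode
# and lastState.terminated.reason record why the container died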

This is my ReplicationController YAML file:

apiVersion: v1
kind: ReplicationController
metadata:
  name: nfs-web
spec:
  replicas: 2
  selector:
    role: web-frontend
  template:
    metadata:
      labels:
        role: web-frontend
    spec:
      containers:
      - name: web
        image: eso-cmbu-docker.artifactory.eng.vmware.com/demo-container:demo-version3.0
        ports:
        - name: web
          containerPort: 80
        securityContext:
          privileged: true

My Docker image was made from this simple Dockerfile:

FROM ubuntu
RUN apt-get update
RUN apt-get install -y nginx
RUN apt-get install -y nfs-common

I am running my Kubernetes cluster on CentOS 1611, kube version:

[root@centos-master ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.0", GitCommit:"86dc49aa137175378ac7fba7751c3d3e7f18e5fc", GitTreeState:"clean", BuildDate:"2016-12-15T16:57:18Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.0", GitCommit:"86dc49aa137175378ac7fba7751c3d3e7f18e5fc", GitTreeState:"clean", BuildDate:"2016-12-15T16:57:18Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

If I run the Docker image with docker run, it runs without any issue; only through Kubernetes do I get the crash.

Can someone help me out? How can I debug this without seeing any logs?

Even when the entire Dockerfile is just the single instruction FROM ubuntu, it still crashes.

2 Solution

You need to have your Dockerfile specify a command to run, or have your ReplicationController specify a command.

The pod is crashing because it starts up and then immediately exits, so Kubernetes restarts it and the cycle continues.
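The answer does not show the controller-side variant of the fix; a minimal sketch of what specifying a command in the ReplicationController's pod spec might look like (the sleep command is purely illustrative, just to keep the container's main process alive) is:

    spec:
      containers:
      - name: web
        image: eso-cmbu-docker.artifactory.eng.vmware.com/demo-container:demo-version3.0
        # command overrides the image's ENTRYPOINT; without a long-running
        # foreground process the container exits and the pod crash-loops
        command: ["sleep", "infinity"]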

Checking the Dockerfile I used to build my own image, the problem was an error in the final CMD instruction of the Dockerfile.
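The corrected Dockerfile is not shown in the post; a minimal sketch of a working version, assuming nginx is meant to be the container's foreground process, might be:

FROM ubuntu
RUN apt-get update && apt-get install -y nginx nfs-common
# Run nginx in the foreground; when the main process exits,
# the container dies and Kubernetes restarts it, producing
# exactly the CrashLoopBackOff behavior described above.
CMD ["nginx", "-g", "daemon off;"]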

After the fix, rebuild the image:

docker build -t mynginx:1.13.9 .

Then run kubectl create -f nginx_deploy.yaml --record to create the pod.

The deployment file:

root@master:~/deployment# cat nginx_deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: mynginx:1.13.9
        ports:
        - containerPort: 80
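The post does not show a verification step; assuming the deployment above was applied, one way to confirm that the new pods stay up would be:

kubectl rollout status deployment/nginx-deployment  # wait for the rollout to complete
kubectl get po                                      # the nginx pods should now show Running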

3 Common Operations

The following commands show what is currently in the cluster:

kubectl get po                        # list all current pods
kubectl get rs                        # list all current replica sets
kubectl get deployment                # list all current deployments
kubectl describe po my-nginx          # show details of the my-nginx pod
kubectl describe rs my-nginx          # show details of the my-nginx replica set
kubectl describe deployment my-nginx  # show details of the my-nginx deployment
kubectl get events                    # view related events
kubectl delete deployment my-nginx    # delete the my-nginx deployment
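As an illustration of how these commands fit together (the pod name my-nginx is taken from the examples above), a typical flow for diagnosing a crash-looping pod might be:

kubectl get po                                            # find the crashing pod
kubectl describe po my-nginx                              # read the Events section for the crash reason
kubectl logs my-nginx --previous                          # logs from the previous, failed container run
kubectl get events --sort-by=.metadata.creationTimestamp  # recent cluster events, oldest first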

References:

https://stackoverflow.com/questions/41604499/my-kubernetes-pods-keep-crashing-with-crashloopbackoff-but-i-cant-find-any-lo

This article is from 爱奕乐. When reposting, please credit the source and include a link.
