I’m attempting to deploy a Docker container to a minikube instance running locally, and getting this error when it attempts to pull(?) the image. The image exists in a self-hosted Docker registry. The image I’m testing with is built with the following Dockerfile:
FROM alpine:latest
ENTRYPOINT ["echo"]
I’m using the fabric8io kubernetes-client library to create a deployment like so:
// 'kube' is an instance of io.fabric8.kubernetes.client.KubernetesClient
final Deployment deployment = kube.extensions().deployments()
    .createOrReplaceWithNew()
    .withNewMetadata()
        .withName(name)
        .withNamespace("staging")
    .endMetadata()
    .withNewSpec()
        .withReplicas(1)
        .withNewTemplate()
            .withNewMetadata()
                .addToLabels("app", name)
            .endMetadata()
            .withNewSpec()
                .addNewImagePullSecret()
                    // "regsecret" is the kubectl-created docker secret
                    .withName("regsecret")
                .endImagePullSecret()
                .addNewContainer()
                    .withName(name)
                    .withImage(imageName + ":latest")
                .endContainer()
            .endSpec()
        .endTemplate()
    .endSpec()
    .done();
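For reference, the builder chain above should produce a Deployment roughly equivalent to the following manifest (a sketch assuming the extensions/v1beta1 API group that fabric8 2.x's extensions() client targets; the name and imageName placeholders stand in for the actual values):

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: <name>
  namespace: staging
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: <name>
    spec:
      imagePullSecrets:
        # "regsecret" is the kubectl-created docker secret
        - name: regsecret
      containers:
        - name: <name>
          image: <imageName>:latest
```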
This is all running on Arch Linux, kernel Linux 4.10.9-1-ARCH x86_64 GNU/Linux. I’m using minikube 0.18.0-1 and kubectl-bin 1.6.1-1 from the AUR, docker 1:17.04.0-1 from the community repositories, and the docker registry container at latest (2.6.1 as of writing this). fabric8io kubernetes-client is at version 2.2.13.
I have checked:
- that the self-hosted registry is running over HTTPS correctly
- that the image can even be pulled. docker pull and docker run on both the host and inside the minikube VM work exactly as expected
- that the image runs. See above
- that there aren’t any name conflicts / etc. in minikube. I delete the deployments, replica sets, and pods between attempts, and I recreate the namespace, just to be safe. However, I’ve found that it doesn’t make a difference which I do, as my code cleans up existing pods/replica sets/deployments as needed
- that DNS is not an issue, as far as I can tell
I have not:
- run kubernetes locally (as opposed to minikube), as the AUR package for kubernetes takes an unbelievably long time to build on my machine
- read through the kubernetes source code, as I don’t know golang
When checking minikube dashboard, the sections for Deployments, Replica Sets, and Pods all show the same error:
Failed to inspect image "registry_domain/XXX/YYY:latest": Id or size of image "registry_domain/XXX/YYY:latest" is not set
Error syncing pod, skipping: failed to "StartContainer" for "YYY" with ImageInspectError: "Failed to inspect image "registry_domain/XXX/YYY:latest": Id or size of image "registry_domain/XXX/YYY:latest" is not set"
and the pod logs are permanently stuck at
container "YYY" in pod "YYY" is waiting to start: ImageInspectError
Looking up the error message provided leads me to https://github.com/kubernetes/minikube/issues/947, but this is not the same issue, as kube-dns is working as expected. This is the only relevant search result, as the other results that come up are
- Slack chatroom archives that don’t even contain the relevant error message
- The kubernetes source, which isn’t helpful to me
- kubernetes/minikube #947, as above
I’m honestly not sure where to go from here. Any advice would be appreciated.
Closed · dkirrane opened this issue Mar 27, 2018 · 7 comments
Is this a BUG REPORT or FEATURE REQUEST?
Choose one: BUG REPORT
Versions
kubeadm version (use kubeadm version):
kubeadm version: &version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.6", GitCommit:"9f8ebd171479bec0ada837d7ee641dec2f8c6dd1", GitTreeState:"clean", BuildDate:"2018-03-21T15:13:31Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Environment:
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.6", GitCommit:"9f8ebd171479bec0ada837d7ee641dec2f8c6dd1", GitTreeState:"clean", BuildDate:"2018-03-21T15:21:50Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.6", GitCommit:"9f8ebd171479bec0ada837d7ee641dec2f8c6dd1", GitTreeState:"clean", BuildDate:"2018-03-21T15:13:31Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
- Cloud provider or hardware configuration:
- OS (e.g. from /etc/os-release):
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
- Kernel (e.g. uname -a):
Linux master1-dev 3.10.0-693.el7.x86_64 #1 SMP Tue Aug 22 21:09:27 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
- Others:
docker version
Client:
Version: 1.13.1
API version: 1.26
Package version: <unknown>
Go version: go1.8.3
Git commit: 774336d/1.13.1
Built: Wed Mar 7 17:06:16 2018
OS/Arch: linux/amd64
Server:
Version: 1.13.1
API version: 1.26 (minimum version 1.12)
Package version: <unknown>
Go version: go1.8.3
Git commit: 774336d/1.13.1
Built: Wed Mar 7 17:06:16 2018
OS/Arch: linux/amd64
Experimental: false
What happened?
I created a cluster with kubeadm init --pod-network-cidr=10.244.0.0/16 --kubernetes-version 1.9.6 --ignore-preflight-errors=cri
All control plane pods are in the Running state.
However I cannot start a simple postgres pod on the cluster. kubectl describe pod postgres-664cfc9966-sz9zf
shows
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 26s default-scheduler Successfully assigned postgres-6847cfbf7d-vgr47 to minion3-dev
Normal SuccessfulMountVolume 25s kubelet, minion3-dev MountVolume.SetUp succeeded for volume "default-token-kdq6z"
Warning InspectFailed 2s (x8 over 23s) kubelet, minion3-dev Failed to inspect image "postgres:9.6.5": rpc error: code = Unknown desc = Error response from daemon: readlink /var/lib/docker/overlay2: invalid argument
Warning Failed 2s (x8 over 23s) kubelet, minion3-dev Error: ImageInspectError
Normal SandboxChanged 1s (x8 over 22s) kubelet, minion3-dev Pod sandbox changed, it will be killed and re-created.
All pods
default postgres-664cfc9966-sz9zf 0/1 ImageInspectError 0 22m
kube-system etcd-master1-dev 1/1 Running 0 29m
kube-system kube-apiserver-master1-dev 1/1 Running 0 26m
kube-system kube-controller-manager-master1-dev 1/1 Running 0 26m
kube-system kube-dns-6f4fd4bdf-qd4r5 3/3 Running 0 30m
kube-system kube-flannel-ds-2g6wx 1/1 Running 0 30m
kube-system kube-flannel-ds-dw79b 1/1 Running 0 29m
kube-system kube-flannel-ds-qpp29 1/1 Running 0 30m
kube-system kube-flannel-ds-qtvbn 1/1 Running 0 30m
kube-system kube-proxy-lfcwt 1/1 Running 0 30m
kube-system kube-proxy-qc7jv 1/1 Running 0 29m
kube-system kube-proxy-rf2fz 1/1 Running 0 30m
kube-system kube-proxy-vvrpl 1/1 Running 0 30m
kube-system kube-scheduler-master1-dev 1/1 Running 0 29m
kube-system kubernetes-dashboard-5bd6f767c7-4l77b 1/1 Running 0 29m
kube-system tiller-deploy-865dd6c794-2wp8r 1/1 Running 0 29m
What you expected to happen?
Pod starts
How to reproduce it (as minimally and precisely as possible)?
- kubeadm init --pod-network-cidr=10.244.0.0/16 --kubernetes-version 1.9.6 --ignore-preflight-errors=cri
- kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml
- kubeadm token create --print-join-command
- join minions
- kubectl create -f postgres.yaml
Anything else we need to know?
postgres.yaml
---
apiVersion: v1
kind: Service
metadata:
name: postgres
namespace: default
spec:
selector:
app: postgres
ports:
- port: 5432
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: postgres
namespace: default
spec:
replicas: 1
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- name: postgres
image: postgres:9.6.5
env:
- name: POSTGRES_USER
value: admin
- name: POSTGRES_PASSWORD
value: admin
ports:
- containerPort: 5432
@dkirrane this is not a kubeadm specific error, but I’ll loop in some other folks to see if they want to comment. It seems like a docker storage or image issue.
/cc @jberkus
I moved to kubeadm and Kubernetes 1.10.0 and it works now.
I’m hitting this same issue again. Same CentOS VM. This time I’m using kubeadm 1.10.3 and docker 18.03.1-ce, and the same CentOS as above. Can this be re-opened? It seems to be an intermittent issue.
@dkirrane report this in the kubernetes/kubernetes repo
Hi @dkirrane, I got the same problem too… Have you found any solution to fix it?
I’m also seeing this intermittently.
Cluster versions:
- Kubernetes: 1.10.3
- Docker: 18.3.1
- OS: Ubuntu 16.04 LTS
- Kernel: 4.4.0-109-generic
I’m trying to deploy a function with the OpenFaas project and a Kubernetes cluster running on 2 Raspberry Pi 3B+.
Unfortunately, the pod that should handle this function goes into the ImageInspectError state…
I tried running the function directly with Docker, using what is contained in the Docker image, and everything works fine.
I opened an issue on the OpenFaas GitHub, and the maintainer told me to ask the Kubernetes community directly for some hints.
My first question is: what does ImageInspectError mean and where does it come from?
Here is all the information I have:
Expected behaviour
The pod should be running.
Current behaviour
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-masternode 1/1 Running 1 1d
kube-system kube-apiserver-masternode 1/1 Running 1 1d
kube-system kube-controller-manager-masternode 1/1 Running 1 1d
kube-system kube-dns-7f9b64f644-x42sr 3/3 Running 3 1d
kube-system kube-proxy-wrp6f 1/1 Running 1 1d
kube-system kube-proxy-x6pvq 1/1 Running 1 1d
kube-system kube-scheduler-masternode 1/1 Running 1 1d
kube-system weave-net-4995q 2/2 Running 3 1d
kube-system weave-net-5g7pd 2/2 Running 3 1d
openfaas-fn figlet-7f556fcd87-wrtf4 1/1 Running 0 4h
openfaas-fn testfaceraspi-7f6fcb5897-rs4cq 0/1 ImageInspectError 0 2h
openfaas alertmanager-66b98dd4d4-kcsq4 1/1 Running 1 1d
openfaas faas-netesd-5b5d6d5648-mqftl 1/1 Running 1 1d
openfaas gateway-846f8b5686-724q8 1/1 Running 2 1d
openfaas nats-86955fb749-7vsbm 1/1 Running 1 1d
openfaas prometheus-6ffc57bb8f-fpk6r 1/1 Running 1 1d
openfaas queue-worker-567bcf4d47-ngsgv 1/1 Running 2 1d
The testfaceraspi pod does not start.
Logs from the pod:
$ kubectl logs testfaceraspi-7f6fcb5897-rs4cq -n openfaas-fn
Error from server (BadRequest): container "testfaceraspi" in pod "testfaceraspi-7f6fcb5897-rs4cq" is waiting to start: ImageInspectError
Pod describe:
$ kubectl describe pod -n openfaas-fn testfaceraspi-7f6fcb5897-rs4cq
Name: testfaceraspi-7f6fcb5897-rs4cq
Namespace: openfaas-fn
Node: workernode/10.192.79.198
Start Time: Thu, 12 Jul 2018 11:39:05 +0200
Labels: faas_function=testfaceraspi
pod-template-hash=3929761453
Annotations: prometheus.io.scrape=false
Status: Pending
IP: 10.40.0.16
Controlled By: ReplicaSet/testfaceraspi-7f6fcb5897
Containers:
testfaceraspi:
Container ID:
Image: gallouche/testfaceraspi
Image ID:
Port: 8080/TCP
Host Port: 0/TCP
State: Waiting
Reason: ImageInspectError
Ready: False
Restart Count: 0
Liveness: exec [cat /tmp/.lock] delay=3s timeout=1s period=10s #success=1 #failure=3
Readiness: exec [cat /tmp/.lock] delay=3s timeout=1s period=10s #success=1 #failure=3
Environment:
fprocess: python3 index.py
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-5qhnn (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-5qhnn:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-5qhnn
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning DNSConfigForming 2m (x1019 over 3h) kubelet, workernode Search Line limits were exceeded, some search paths have been omitted, the applied search line is: openfaas-fn.svc.cluster.local svc.cluster.local cluster.local heig-vd.ch einet.ad.eivd.ch web.ad.eivd.ch
And the event logs:
$ kubectl get events --sort-by=.metadata.creationTimestamp -n openfaas-fn
LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
14m 1h 347 testfaceraspi-7f6fcb5897-rs4cq.1540db41e89d4c52 Pod Warning DNSConfigForming kubelet, workernode Search Line limits were exceeded, some search paths have been omitted, the applied search line is: openfaas-fn.svc.cluster.local svc.cluster.local cluster.local heig-vd.ch einet.ad.eivd.ch web.ad.eivd.ch
4m 1h 75 figlet-7f556fcd87-wrtf4.1540db421002b49e Pod Warning DNSConfigForming kubelet, workernode Search Line limits were exceeded, some search paths have been omitted, the applied search line is: openfaas-fn.svc.cluster.local svc.cluster.local cluster.local heig-vd.ch einet.ad.eivd.ch web.ad.eivd.ch
10m 10m 1 testfaceraspi-7f6fcb5897-d6z78.1540df9ed8b91865 Pod Normal Scheduled default-scheduler Successfully assigned testfaceraspi-7f6fcb5897-d6z78 to workernode
10m 10m 1 testfaceraspi-7f6fcb5897.1540df9ed6eee11f ReplicaSet Normal SuccessfulCreate replicaset-controller Created pod: testfaceraspi-7f6fcb5897-d6z78
10m 10m 1 testfaceraspi-7f6fcb5897-d6z78.1540df9eef3ef504 Pod Normal SuccessfulMountVolume kubelet, workernode MountVolume.SetUp succeeded for volume "default-token-5qhnn"
4m 10m 27 testfaceraspi-7f6fcb5897-d6z78.1540df9eef5445c0 Pod Warning DNSConfigForming kubelet, workernode Search Line limits were exceeded, some search paths have been omitted, the applied search line is: openfaas-fn.svc.cluster.local svc.cluster.local cluster.local heig-vd.ch einet.ad.eivd.ch web.ad.eivd.ch
8m 9m 8 testfaceraspi-7f6fcb5897-d6z78.1540df9f670d0dad Pod spec.containers{testfaceraspi} Warning InspectFailed kubelet, workernode Failed to inspect image "gallouche/testfaceraspi": rpc error: code = Unknown desc = Error response from daemon: readlink /var/lib/docker/overlay2/l: invalid argument
9m 9m 7 testfaceraspi-7f6fcb5897-d6z78.1540df9f670fcf3e Pod spec.containers{testfaceraspi} Warning Failed kubelet, workernode Error: ImageInspectError
Steps to reproduce (for bugs)
- Deploy OpenFaas on a 2-node k8s cluster
- Create a function with faas new testfaceraspi --lang python3-armhf
- Add the following code to handler.py:
import json

def handle(req):
    jsonl = json.loads(req)
    return ("Found " + str(jsonl["nbFaces"]) + " faces in OpenFaas Function on raspi !")
- Change the gateway and image in the .yml:

provider:
  name: faas
  gateway: http://127.0.0.1:31112

functions:
  testfaceraspi:
    lang: python3-armhf
    handler: ./testfaceraspi
    image: gallouche/testfaceraspi
- Run faas build -f testfacepi.yml
- Log in to DockerHub with docker login
- Run faas push -f testfacepi.yml
- Run faas deploy -f testfacepi.yml
Your environment
- FaaS-CLI version (full output of faas-cli version):
Commit: 3995a8197f1df1ecdf524844477cffa04e4690ea Version: 0.6.11
- Docker version (full output of docker version):
Client: Version: 18.04.0-ce API version: 1.37 Go version: go1.9.4 Git commit: 3d479c0 Built: Tue Apr 10 18:25:24 2018 OS/Arch: linux/arm Experimental: false Orchestrator: swarm Server: Engine: Version: 18.04.0-ce API version: 1.37 (minimum version 1.12) Go version: go1.9.4 Git commit: 3d479c0 Built: Tue Apr 10 18:21:25 2018 OS/Arch: linux/arm Experimental: false
- Operating system and version (e.g. Linux, Windows, MacOS):
Distributor ID: Raspbian Description: Raspbian GNU/Linux 9.4 (stretch) Release: 9.4 Codename: stretch
Thanks in advance, and let me know if you need more information.
Gallouche
Docker daemon start/stop issue
Symptom: A container cannot be stopped or removed AND trying to stop/start the docker daemon doesn’t help: the daemon appears to hang.
Cause:
The container mounted a directory that it didn’t release even after the container was stopped/removed.
Solution to fix: identify the directory/file mounted and not freed by the container, and unmount it.
– To see mounts containing the word docker: mount | grep docker
– To unmount a file/directory: umount fileOrDir
Solution to prevent:
Ensure that containers with volume mounting issues are in a stable state before stopping and removing them.
Symptom: The docker daemon cannot be stopped or started: the daemon appears to hang.
At dockerd stop we may see something like this:
dockerd[26985]: time="2020-01-07T10:19:54.506363297+02:00" level=info msg="Processing signal 'terminated'" systemd[1]: docker.service stop-sigterm timed out. Killing. systemd[1]: docker.service: main process exited, code=killed, status=9/KILL systemd[1]: Stopped Docker Application Container Engine. systemd[1]: Unit docker.service entered failed state.
At dockerd start we may see several kinds of messages (about volumes, containers…).
But they generally have the same cause:
error: no such container FOO_CONTAINER
Possible cause:
Stale state in the currently running docker-containerd-shim processes.
Solution to fix:
– stop dockerd (at least try)
– identify all docker-containerd-shim processes running on the host
– kill them
There may be a dozen or more containerd processes. Using awk can help batch their killing.
1) Output the pids separated by a blank:
ps aux | grep [d]ocker | awk '{pids=pids " " $2} END {print pids}'
2) Kill them (copy-paste them or store them in a variable):
kill -9 ….
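As a quick sanity check of the awk collection step, it can be fed fake ps-style lines (the PIDs below are made up):

```shell
# Field $2 is the PID column; the awk program concatenates all PIDs
# into one blank-separated line, printed at END.
printf 'root 101 0.0 docker-containerd-shim aaa\nroot 202 0.0 docker-containerd-shim bbb\n' \
  | awk '{pids=pids " " $2} END {print pids}'
# prints " 101 202" (note the leading blank)
```

The resulting line can then be pasted after kill -9, or captured into a variable.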
Building image issues
The image build fails at a step, and we want to inspect the image state/content just before the error
In the Dockerfile:
1) Comment out all instructions in the Dockerfile from the failing instruction onward
2) Comment out the existing ENTRYPOINT/CMD as well
3) As the last uncommented instruction, add an entrypoint that runs a shell: ENTRYPOINT ["sh"]
   or one that loops forever:
ENTRYPOINT ["tail", "-f", "/dev/null"]
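Put together, a Dockerfile being debugged this way might look like the following sketch (the base image, the package, and the commented-out instructions are placeholders, not the real build):

```dockerfile
FROM alpine:3.7
RUN apk add --no-cache curl

# Everything from the failing instruction onward is commented out,
# including the original ENTRYPOINT/CMD:
# RUN some-failing-command
# ENTRYPOINT ["./start.sh"]

# Keep the container alive so its state can be inspected with a shell:
ENTRYPOINT ["tail", "-f", "/dev/null"]
```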
In the host shell:
1) Build the image (assuming the build context is the current directory):
docker build [-f DOCKERFILE_LOCATION_IF_NEEDED] -t TAG:VERSION .
2) Run the container, which executes the shell as its command:
docker run -ti IMAGE
The image build fails at a step and I want a sandbox to quickly make multiple attempts to understand the issue
Follow the instructions in the previous point.
Once connected to the container, experiment with the failing instruction by entering it and trying any variant to understand the issue.
Running container issues
The container exits prematurely at startup and we don’t have enough information to understand the reason
1) In the host shell:
– docker way:
Run the container and pass the shell command as the entrypoint to prevent the failure:
docker run --rm -ti --entrypoint sh IMAGE
– docker-compose way (run takes a service name):
docker-compose run --service-ports --entrypoint sh SERVICE
Note that by default run uses an interactive mode.
2) In the container shell:
Re-execute the command defined in the ENTRYPOINT/CMD that exited with an error. Now that the container stays running, you can analyse the state and do any experiments.
The container fails fast at startup
Case 1) The iptables rules are stale because of recent changes.
The error message looks like:
docker: Error response from daemon: driver failed programming external connectivity on endpoint prom (6ef6b57285842e5dc9aa30ee06c3e9cdf0ae444e9027762f9c9c4982c388b85f): (iptables failed: iptables --wait -t filter -A DOCKER ! -i docker0 -o docker0 -p tcp -d 172.17.0.6 --dport 9090 -j ACCEPT: iptables: No chain/target/match by that name. (exit status 1)).
Solution:
systemctl stop docker
systemctl start docker
The container fails during startup with a permission denied error opening/writing a file
Multiple causes:
1) The owner/permissions of docker directories (/var/lib/docker…) were changed manually.
In that case, it may be complex to identify the exact issue.
Solutions (in order):
a) Set the docker folders/files to the correct owner:
chown -R root:root /var/lib/docker
b) If that doesn’t work, a docker-ce reinstall may be tried.
Docker: container and image data with the Overlay2 driver
Layout
Image and container data are stored in: /var/lib/docker/overlay2
Inside it, each directory is either a layer or a container (not sure…).
Container content is in the merged folder.
Identify unused data in containers/images
Sometimes, clearing unused containers/images with commands such as docker system prune is not enough to clear all unused layer/container data.
If we lack space, we can identify the layer/container directories that consume a lot of space and check whether they are really used.
To identify big folders:
du -sh /var/lib/docker/overlay2/* | sort -k1h
To identify the merged folders of existing containers:
docker inspect -f $'{{.Name}}\t{{.GraphDriver.Data.MergedDir}}' $(docker ps -aq)
To check that merged folders are used by existing containers:
This is not simple, because some layers are in use but do not appear in the docker inspect output.
So it should be done very cautiously.
Corrupted images or layers during docker build
In some rare but possible circumstances, images and layers may be corrupted.
Sometimes, adding the --pull and --rm flags to the docker build command fixes the issue.
Other times it does not.
In that case, a possible trick to fix the problem is specifying another version for the base image of the Dockerfile.
Often a minor version increment is enough to make docker build download completely new layers.