I’m currently getting errors when trying to exec or get logs for my pods on my GKE cluster.
$ kubectl logs <POD-NAME>
Error from server: Get "https://<NODE-PRIVATE-IP>:10250/containerLogs/default/<POD-NAME>/<DEPLOYMENT-NAME>": remote error: tls: internal error
$ kubectl exec -it <POD-NAME> -- sh
Error from server: error dialing backend: remote error: tls: internal error
One suspicious thing I found while troubleshooting is that all CSRs are getting denied…
$ kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
csr-79zkn 4m16s kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7b5sx 91m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7fzjh 103m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7gstl 19m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7hrvm 11m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7mn6h 87m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7nd7h 4m57s kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
...
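As an aside, the pattern above (every kubelet-serving CSR denied) is easy to spot with a quick filter. A sketch: the inlined sample stands in for live kubectl get csr output, with placeholder CSR and node names.

```shell
# Filter `kubectl get csr` output down to denied kubelet-serving CSRs.
# The sample is inlined so the filter can be demonstrated offline; on a
# live cluster, pipe the real command instead:
#   kubectl get csr | awk '$3 == "kubernetes.io/kubelet-serving" && $5 == "Denied"'
sample='NAME       AGE    SIGNERNAME                           REQUESTOR            CONDITION
csr-79zkn  4m16s  kubernetes.io/kubelet-serving        system:node:node-1   Denied
csr-7b5sx  91m    kubernetes.io/kubelet-serving        system:node:node-1   Denied
csr-abcde  2m     kubernetes.io/kube-apiserver-client  system:node:node-1   Approved,Issued'

printf '%s\n' "$sample" | awk '$3 == "kubernetes.io/kubelet-serving" && $5 == "Denied" { n++ } END { print n+0 }'
```

A non-zero count here, combined with the tls: internal error on the kubelet port, points at the serving-certificate signer rather than at the workloads themselves.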
Any idea why this is happening? Maybe a firewall issue?
Thanks in advance!
Update 1
Here are the same commands with verbose output (--v=8), without the goroutine stack traces.
$ kubectl logs --v=8 <POD-NAME>
I0527 09:27:59.624843 10407 loader.go:375] Config loaded from file: /home/kevin/.kube/config
I0527 09:27:59.628621 10407 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>
I0527 09:27:59.628635 10407 round_trippers.go:427] Request Headers:
I0527 09:27:59.628644 10407 round_trippers.go:431] Accept: application/json, */*
I0527 09:27:59.628649 10407 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:27:59.727411 10407 round_trippers.go:446] Response Status: 200 OK in 98 milliseconds
I0527 09:27:59.727461 10407 round_trippers.go:449] Response Headers:
I0527 09:27:59.727480 10407 round_trippers.go:452] Audit-Id: ...
I0527 09:27:59.727496 10407 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:27:59.727512 10407 round_trippers.go:452] Content-Type: application/json
I0527 09:27:59.727528 10407 round_trippers.go:452] Date: Thu, 27 May 2021 07:27:59 GMT
I0527 09:27:59.727756 10407 request.go:1097] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"<POD-NAME>","generateName":"<POD-BASE-NAME>","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/<POD-NAME>","uid":"...","resourceVersion":"6764210","creationTimestamp":"2021-05-19T10:33:28Z","labels":{"app":"<NAME>","pod-template-hash":"..."},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"<POD-BASE-NAME>","uid":"...","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2021-05-19T10:33:28Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:pod-template-hash":{}},"f:ownerReferences":{".":{},"k:{"uid":"..."}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:c [truncated 3250 chars]
I0527 09:27:59.745985 10407 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>/log
I0527 09:27:59.746035 10407 round_trippers.go:427] Request Headers:
I0527 09:27:59.746055 10407 round_trippers.go:431] Accept: application/json, */*
I0527 09:27:59.746071 10407 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:27:59.800586 10407 round_trippers.go:446] Response Status: 500 Internal Server Error in 54 milliseconds
I0527 09:27:59.800638 10407 round_trippers.go:449] Response Headers:
I0527 09:27:59.800654 10407 round_trippers.go:452] Audit-Id: ...
I0527 09:27:59.800668 10407 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:27:59.800680 10407 round_trippers.go:452] Content-Type: application/json
I0527 09:27:59.800693 10407 round_trippers.go:452] Content-Length: 217
I0527 09:27:59.800712 10407 round_trippers.go:452] Date: Thu, 27 May 2021 07:27:59 GMT
I0527 09:27:59.800772 10407 request.go:1097] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Get "https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>": remote error: tls: internal error","code":500}
I0527 09:27:59.801848 10407 helpers.go:216] server response object: [{
"metadata": {},
"status": "Failure",
"message": "Get "https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>": remote error: tls: internal error",
"code": 500
}]
F0527 09:27:59.801944 10407 helpers.go:115] Error from server: Get "https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>": remote error: tls: internal error
$ kubectl exec --v=8 -it <POD-NAME> -- sh
I0527 09:44:48.673774 11157 loader.go:375] Config loaded from file: /home/kevin/.kube/config
I0527 09:44:48.678514 11157 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>
I0527 09:44:48.678528 11157 round_trippers.go:427] Request Headers:
I0527 09:44:48.678535 11157 round_trippers.go:431] Accept: application/json, */*
I0527 09:44:48.678543 11157 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:44:48.795864 11157 round_trippers.go:446] Response Status: 200 OK in 117 milliseconds
I0527 09:44:48.795920 11157 round_trippers.go:449] Response Headers:
I0527 09:44:48.795963 11157 round_trippers.go:452] Audit-Id: ...
I0527 09:44:48.795995 11157 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:44:48.796019 11157 round_trippers.go:452] Content-Type: application/json
I0527 09:44:48.796037 11157 round_trippers.go:452] Date: Thu, 27 May 2021 07:44:48 GMT
I0527 09:44:48.796644 11157 request.go:1097] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"<POD-NAME>","generateName":"","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/<POD-NAME>","uid":"","resourceVersion":"6764210","creationTimestamp":"2021-05-19T10:33:28Z","labels":{"app":"...","pod-template-hash":"..."},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"<POD-BASE-NAME>","uid":"...","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2021-05-19T10:33:28Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:pod-template-hash":{}},"f:ownerReferences":{".":{},"k:{"uid":"..."}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:c [truncated 3250 chars]
I0527 09:44:48.814315 11157 round_trippers.go:420] POST https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>/exec?command=sh&container=<SERVICE-NAME>&stdin=true&stdout=true&tty=true
I0527 09:44:48.814372 11157 round_trippers.go:427] Request Headers:
I0527 09:44:48.814391 11157 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:44:48.814406 11157 round_trippers.go:431] X-Stream-Protocol-Version: v4.channel.k8s.io
I0527 09:44:48.814420 11157 round_trippers.go:431] X-Stream-Protocol-Version: v3.channel.k8s.io
I0527 09:44:48.814445 11157 round_trippers.go:431] X-Stream-Protocol-Version: v2.channel.k8s.io
I0527 09:44:48.814471 11157 round_trippers.go:431] X-Stream-Protocol-Version: channel.k8s.io
I0527 09:44:48.913928 11157 round_trippers.go:446] Response Status: 500 Internal Server Error in 99 milliseconds
I0527 09:44:48.913977 11157 round_trippers.go:449] Response Headers:
I0527 09:44:48.914005 11157 round_trippers.go:452] Audit-Id: ...
I0527 09:44:48.914029 11157 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:44:48.914054 11157 round_trippers.go:452] Content-Type: application/json
I0527 09:44:48.914077 11157 round_trippers.go:452] Date: Thu, 27 May 2021 07:44:48 GMT
I0527 09:44:48.914099 11157 round_trippers.go:452] Content-Length: 149
I0527 09:44:48.915741 11157 helpers.go:216] server response object: [{
"metadata": {},
"status": "Failure",
"message": "error dialing backend: remote error: tls: internal error",
"code": 500
}]
F0527 09:44:48.915837 11157 helpers.go:115] Error from server: error dialing backend: remote error: tls: internal error
Update 2
After connecting to one of the GKE worker nodes and checking the kubelet logs, I found these weird lines:
May 27 09:30:11 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:11.271022 1272 log.go:181] http: TLS handshake error from 10.156.0.9:54672: no serving certificate available for the kubelet
May 27 09:30:11 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:11.305628 1272 log.go:181] http: TLS handshake error from 10.156.0.9:54674: no serving certificate available for the kubelet
May 27 09:30:12 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:12.067998 1272 log.go:181] http: TLS handshake error from 10.156.0.11:57610: no serving certificate available for the kubelet
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.144826 1272 certificate_manager.go:412] Rotating certificates
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.154322 1272 reflector.go:207] Starting reflector *v1.CertificateSigningRequest (0s) from k8s.io/client-go/tools/watch/informerwatcher.go:146
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.448976 1272 reflector.go:213] Stopping reflector *v1.CertificateSigningRequest (0s) from k8s.io/client-go/tools/watch/informerwatcher.go:146
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: E0527 09:30:14.449045 1272 certificate_manager.go:454] certificate request was not signed: cannot watch on the certificate signing request: certificate signing request is denied, reason: AutoDenied, message:
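The AutoDenied reason in that last log line can also be read directly off the CSR object. A sketch, assuming a CSR named csr-79zkn (a placeholder): the inline JSON mimics the status shape a denied CSR returns, and the live jsonpath query is shown in the comment.

```shell
# Live query (placeholder CSR name):
#   kubectl get csr csr-79zkn -o jsonpath='{.status.conditions[0].reason}'
# Offline demo against the JSON shape such a CSR returns:
csr_json='{"status":{"conditions":[{"type":"Denied","reason":"AutoDenied","message":""}]}}'
printf '%s' "$csr_json" | grep -o '"reason":"[^"]*"' | cut -d'"' -f4
```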
Update 3
I’ve upgraded the cluster from 1.19.9-gke.1400 to 1.19.9-gke.1900. That didn’t solve the problem…
I also performed a credentials rotation on the cluster, but that didn’t solve it either…
Final
After trying a lot of changes in the cluster:
- Restarting kubelet on nodes
- Restarting nodes
- Upscaling/Downscaling node pool size
- Upgrading cluster version
- Rotating cluster certificates
Even creating a new cluster (in the same project, with the same VPC, etc.) didn’t solve the issue…
This problem might be related to changes made to firewall rules.
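On GKE, the control plane reaches kubelets on TCP 10250 for logs and exec, and that traffic is allowed by an automatically created firewall rule (typically named gke-<cluster>-<hash>-master), so a deleted or narrowed rule would produce exactly these symptoms. A sketch of verifying this; the rule names and CIDR below are illustrative, and the live gcloud command is in the comment.

```shell
# Live check (requires gcloud and access to the project):
#   gcloud compute firewall-rules list --filter="name~^gke-" \
#     --format="table(name,direction,allowed,sourceRanges.list())"
# Offline demo: scan a captured listing for the kubelet port 10250.
rules='gke-mycluster-abcd1234-master  INGRESS  tcp:443,tcp:10250  172.16.0.0/28
default-allow-ssh              INGRESS  tcp:22             0.0.0.0/0'
if printf '%s\n' "$rules" | grep -q 'tcp:10250'; then
  echo "found a rule allowing kubelet port 10250"
else
  echo "no rule allows 10250: control-plane logs/exec will fail"
fi
```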
The only solution found was to create a new GKE cluster in a new GCP project and migrate the workloads using Velero.
docker push remote error: tls: internal error #4279
I’m a bit new to Docker so I might be doing something wrong, but when I try to push my image I get an error of Get https://docker.myhost.com/v2/: remote error: tls: internal error
Actually, it might be something to do with flynn docker login, as that also gives an error.
What version of Docker are you using? (Check docker version.)
Client:
Version: 17.10.0-ce
I am unable to reproduce this. What OS and version are you running? What happens when you hit the endpoint using curl (curl -v https://docker.myhost.com/v2/)?
This is what I get:
* TCP_NODELAY set
* Connected to docker.myhost.com (0.0.0.0) port 443 (#0)
* Unknown SSL protocol error in connection to docker.myhost.com:-9838
* Closing connection 0
curl: (35) Unknown SSL protocol error in connection to docker.myhost.com:-9838
OK, I suspect there is something wrong in your cluster. What is the output of the following right after trying flynn docker login:
Flynn is unmaintained and our infrastructure will shut down on June 1, 2021. See the README for details.
AWS EKS — remote error: tls: internal error — CSR pending #610
What happened: We have an EKS cluster deployed with managed nodes. When we try to run kubectl logs or kubectl exec, it gives Error from server: error dialing backend: remote error: tls: internal error. In the admin console, all the nodes show as Ready and the workloads as ready. Then I ran kubectl get csr and it showed all requests as Pending. Then I described a CSR and the details seem correct. Please refer to the output below:
Anything else we need to know?: This issue came on suddenly. Our guess is that it started after scaling.
Environment:
- AWS Region: North Virginia
- Instance Type(s): M5.Large
- EKS Platform version (use aws eks describe-cluster --name --query cluster.platformVersion): eks.3
- Kubernetes version (use aws eks describe-cluster --name --query cluster.version): 1.18
- AMI Version:AL2_x86_64
- Kernel (e.g. uname -a): Linux ip-192-168-33-152.ec2.internal 4.14.214-160.339.amzn2.x86_64 #1 SMP Sun Jan 10 05:53:05 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
In order to debug this issue, we will need the cluster ARN. I recommend creating a support case with AWS and providing relevant details there.
To anyone with a similar issue, be aware AWS will charge you for support cases, but fail to diagnose or help in any way.
Any update on this? We have experienced this three times now, each time having to delete and recreate the cluster. AWS support couldn’t reproduce it on their side, charged us for the support case they never solved, and then asked us to reproduce it for them, giving the following response:
AWS Support:
Also, I’ve tested it in my cluster by scaling the worker nodes from the eks console but in my case the node was launched successfully.
Therefore, please check once again if you can reproduce this issue, if so please share the steps and the logs/outputs that I’ve requested in my previous correspondence and I’ll investigate this further.
In our case, AWS terminated a node (without notifying or requesting):
W0708 15:12:35.439299 1 aws.go:1730] the instance i-04c7a**** is terminated
I0708 15:12:35.439314 1 node_lifecycle_controller.go:156] deleting node since it is no longer present in cloud provider: ip-********.eu-west-1.compute.internal
The node that came back up started with TLS issue, brought down parts of our system and now the cluster is again unhealthy.
CSR’s from nodes have the following auto-approve config:
Alpha-2 (0.87.0): oc logs: remote error: tls: internal error #218
Logs cannot be viewed.
How to reproduce:
@rbaumgar this happens because there is no VirtualBox provider, so you need to approve the CSRs manually; try the command below to do it.
The workaround works! Thanks.
But I am using libvirt.
@rbaumgar we didn’t see this before on libvirt; will check what changed there.
I’m also facing the issue on libvirt (0.87.0)
If that matters, I saw tls: internal error on OSX as well.
It affects the image/cluster; it is not related to the hypervisor.
On libvirt the machine config operator should auto-approve those CSRs, which is not happening at the moment; we need to find what changed.
Question: is that workaround (oc adm certificate...) required to be run periodically? Say I create and delete a bunch of deployments/pods over time; will I need to run that workaround each time I deploy new things? Or are those certificates fixed after startup, so I only have to run that oc adm command once?
@jmazzitelli I ran this command once 10 days ago and did not face the error again
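For readers hitting the same thing: the manual workaround being discussed is, in most write-ups, the standard batch approval of pending CSRs. A sketch; the live one-liner needs cluster-admin rights, and the offline part only demonstrates selecting the Pending request names from sample oc get csr output (the CSR and node names are placeholders).

```shell
# Live workaround (requires cluster-admin):
#   oc get csr -o name | xargs oc adm certificate approve
# Offline demo of picking out only the Pending request names:
csrs='csr-aaa  2m  kubernetes.io/kubelet-serving  system:node:worker-0  Pending
csr-bbb  5m  kubernetes.io/kubelet-serving  system:node:worker-1  Approved,Issued'
printf '%s\n' "$csrs" | awk '$5 == "Pending" { print $1 }'
```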
tls: internal error when getting logs on OpenStack #1467
Platform (aws|libvirt|openstack): openstack
Running oc logs always seems to fail with a tls internal error. For example:
What you expected to happen?
How to reproduce it (as minimally and precisely as possible)?
Install OpenShift on OpenStack (you have to create the install-config.yaml manually) and once the cluster is mostly up (e.g. the worker nodes are created), pick a pod and run oc logs pod .
This should produce the logs, but instead outputs the tls error above.
We need to figure that out asap because it’s blocking us from figuring out any of the other issues.
Ah, thanks for the pointer @zeenix! I’ll see if we need to do something similar for OpenStack.
Oops, this is the OpenStack backend. The PR above is libvirt-specific. Having said that, I’d bet it’s a very similar issue.
@tomassedovic I had the same issue because the CSR was not approved. This happened because the nodelink-controller cannot link the machine due to the missing status field.
I created a PR here, openshift/cluster-api-provider-openstack#29, but still no response ;(
Thanks @FlorinPeter, I’ll have a look.
Can you check the following on your node? #1494 See if the certs in /var/lib/kubelet/pki are still valid.
This seems like a machine API and machine approver issue. Please open issues on the corresponding repos to track this.
But if you still think the installer is configuring resources incorrectly, please feel free to reopen with your reasoning.
Getting kubectl tls: internal error on worker nodes scaled by Cluster Autoscaler #1324
I created a k8s cluster using eksctl & then created Cluster Autoscaler deployment on it.
For the pods running on new worker nodes that were scaled out by Cluster Autoscaler, kubectl logs and kubectl exec are not working. The error below is thrown:
Error from server: remote error: tls: internal error
I checked the kubelet logs on these scaled worker nodes, and found these logs:
http: TLS handshake error from 172.XX.X.XX:48784: no serving certificate available for the kubelet
I observed that the CSRs generated by these nodes were in Pending state. After manually approving these CSRs, kubectl logs/exec threw this error:
Error from server: Get https://172.XXX.X.XX:10250/containerLogs/kube-system/container-name: x509: cannot validate certificate for 172.XXX.X.XX because it doesn’t contain any IP SANs
This happens only when the Cluster Autoscaler provisions a new worker node in the nodegroup.
Everything works fine if I manually change the desired count in the ASG of the nodes.
I did make the appropriate changes in the Cluster Autoscaler deployment YAML for the sslCert file path location, as per eksctl.
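The x509 "doesn’t contain any IP SANs" error above can be confirmed by reading the SANs off the kubelet’s serving certificate. A sketch: the on-node openssl command is in the comment (the pki path is the kubelet default and may differ per distro), and the inline text mimics the relevant fragment of its output.

```shell
# On a node (default kubelet cert path, may differ per distro):
#   openssl x509 -in /var/lib/kubelet/pki/kubelet-server-current.pem \
#     -noout -text | grep -A1 'Subject Alternative Name'
# Offline demo on a captured fragment of such output:
cert_text='        X509v3 extensions:
            X509v3 Subject Alternative Name:
                DNS:ip-172-16-0-10.ec2.internal, IP Address:172.16.0.10'
if printf '%s\n' "$cert_text" | grep -A1 'Subject Alternative Name' | grep -q 'IP Address:'; then
  echo "certificate carries an IP SAN"
else
  echo "no IP SAN: kubectl reaching the kubelet by node IP will fail x509 validation"
fi
```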
Issue
- We are getting "An error occurred while retrieving the requested logs." when trying to view logs for any pod in the OCP 4.1 web interface.
WebSocket connection to 'wss://console-openshift-console.apps.example.com/api/kubernetes/api/v1/namespaces/openshift-console/pods/console-79b6c7bb87-gt2ck/log?container=console&follow=true&tailLines=1000&x-csrf-token=ESx4l2bhkAyUQ8nx9f0%2FmA3qThlJEI6IOptYX2N%2FSPBDwcQuQ1K91DDjT0I3J99QYF4rogNwgleVtq6FV%2BkL7Q%3D%3D' failed: Error during WebSocket handshake: Unexpected response code: 500
- The command line logs, exec, and rsh tools give a remote error
$ oc logs console-79b6c7bb87-gt2ck
Error from server: Get https://master0.example.com:10250/containerLogs/openshift-console/console-79b6c7bb87-gt2ck/console: remote error: tls: internal error
- We have pending CSRs in an OpenShift 4 cluster after install
- The attempt to oc exec ... is failing
$ oc exec marketplace-operator-768b99959-9pftm -n openshift-marketplace -- echo foo
Error from server: error dialing backend: remote error: tls: internal error
$ oc logs marketplace-operator-768b99959-9pftm -n openshift-marketplace
Error from server: Get https://master:10250/containerLogs/openshift-marketplace/marketplace-operator-768b99959-9pftm/marketplace-operator: remote error: tls: internal error
- kube-apiserver container has errors
$ sudo crictl ps | grep kube-api
239ec13eeaf4e beaf65fce4dc16947c5bd5d1ca7e16313234c393e8ca1c4251ac9b85094972bb About an hour ago Running kube-apiserver-operator 3 bd197ceb6f882
6f2bdcab072ca beaf65fce4dc16947c5bd5d1ca7e16313234c393e8ca1c4251ac9b85094972bb About an hour ago Running kube-apiserver-cert-syncer-8 1 6938a6ebc2c3d
e6b9db2994d07 0d8dcfc307048a0f0400e644fcd1c9929018103b15d0f9b23b4841f1e71937bc About an hour ago Running kube-apiserver-8 1 6938a6ebc2c3d
$ sudo crictl logs e6b9db2994d07
...
E0725 17:38:54.707552 1 status.go:64] apiserver received an error that is not an metav1.Status: &url.Error{Op:"Get", URL:"https://master:10250/containerLogs/openshift-kube-apiserver/kube-apiserver-master/kube-apiserver-8", Err:(*net.OpError)(0xc01ec89270)}
...
Environment
- Red Hat OpenShift Container Platform
- 4.x