I’m currently getting errors when trying to exec or get logs for my pods on my GKE cluster.
$ kubectl logs <POD-NAME>
Error from server: Get "https://<NODE-PRIVATE-IP>:10250/containerLogs/default/<POD-NAME>/<DEPLOYMENT-NAME>": remote error: tls: internal error
$ kubectl exec -it <POD-NAME> -- sh
Error from server: error dialing backend: remote error: tls: internal error
One suspicious thing I found while troubleshooting is that all CSRs are getting denied…
$ kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
csr-79zkn 4m16s kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7b5sx 91m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7fzjh 103m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7gstl 19m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7hrvm 11m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7mn6h 87m kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
csr-7nd7h 4m57s kubernetes.io/kubelet-serving system:node:<NODE-NAME> Denied
...
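As an aside, the pattern above (every kubelet-serving CSR denied) is easy to spot with a quick filter. A sketch: the inlined sample stands in for live kubectl get csr output, with placeholder CSR and node names.

```shell
# Filter `kubectl get csr` output down to denied kubelet-serving CSRs.
# The sample is inlined so the filter can be demonstrated offline; on a
# live cluster, pipe the real command instead:
#   kubectl get csr | awk '$3 == "kubernetes.io/kubelet-serving" && $5 == "Denied"'
sample='NAME       AGE    SIGNERNAME                           REQUESTOR            CONDITION
csr-79zkn  4m16s  kubernetes.io/kubelet-serving        system:node:node-1   Denied
csr-7b5sx  91m    kubernetes.io/kubelet-serving        system:node:node-1   Denied
csr-abcde  2m     kubernetes.io/kube-apiserver-client  system:node:node-1   Approved,Issued'

printf '%s\n' "$sample" | awk '$3 == "kubernetes.io/kubelet-serving" && $5 == "Denied" { n++ } END { print n+0 }'
```

A non-zero count here, combined with the tls: internal error on the kubelet port, points at the serving-certificate signer rather than at the workloads themselves.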
Any idea why this is happening? Maybe a firewall issue?
Thanks in advance!
Update 1
Here are the same commands with verbose output (--v=8), without the goroutine stack traces.
$ kubectl logs --v=8 <POD-NAME>
I0527 09:27:59.624843 10407 loader.go:375] Config loaded from file: /home/kevin/.kube/config
I0527 09:27:59.628621 10407 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>
I0527 09:27:59.628635 10407 round_trippers.go:427] Request Headers:
I0527 09:27:59.628644 10407 round_trippers.go:431] Accept: application/json, */*
I0527 09:27:59.628649 10407 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:27:59.727411 10407 round_trippers.go:446] Response Status: 200 OK in 98 milliseconds
I0527 09:27:59.727461 10407 round_trippers.go:449] Response Headers:
I0527 09:27:59.727480 10407 round_trippers.go:452] Audit-Id: ...
I0527 09:27:59.727496 10407 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:27:59.727512 10407 round_trippers.go:452] Content-Type: application/json
I0527 09:27:59.727528 10407 round_trippers.go:452] Date: Thu, 27 May 2021 07:27:59 GMT
I0527 09:27:59.727756 10407 request.go:1097] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"<POD-NAME>","generateName":"<POD-BASE-NAME>","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/<POD-NAME>","uid":"...","resourceVersion":"6764210","creationTimestamp":"2021-05-19T10:33:28Z","labels":{"app":"<NAME>","pod-template-hash":"..."},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"<POD-BASE-NAME>","uid":"...","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2021-05-19T10:33:28Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:pod-template-hash":{}},"f:ownerReferences":{".":{},"k:{"uid":"..."}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:c [truncated 3250 chars]
I0527 09:27:59.745985 10407 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>/log
I0527 09:27:59.746035 10407 round_trippers.go:427] Request Headers:
I0527 09:27:59.746055 10407 round_trippers.go:431] Accept: application/json, */*
I0527 09:27:59.746071 10407 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:27:59.800586 10407 round_trippers.go:446] Response Status: 500 Internal Server Error in 54 milliseconds
I0527 09:27:59.800638 10407 round_trippers.go:449] Response Headers:
I0527 09:27:59.800654 10407 round_trippers.go:452] Audit-Id: ...
I0527 09:27:59.800668 10407 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:27:59.800680 10407 round_trippers.go:452] Content-Type: application/json
I0527 09:27:59.800693 10407 round_trippers.go:452] Content-Length: 217
I0527 09:27:59.800712 10407 round_trippers.go:452] Date: Thu, 27 May 2021 07:27:59 GMT
I0527 09:27:59.800772 10407 request.go:1097] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Get "https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>": remote error: tls: internal error","code":500}
I0527 09:27:59.801848 10407 helpers.go:216] server response object: [{
"metadata": {},
"status": "Failure",
"message": "Get "https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>": remote error: tls: internal error",
"code": 500
}]
F0527 09:27:59.801944 10407 helpers.go:115] Error from server: Get "https://10.156.0.8:10250/containerLogs/default/<POD-NAME>/<SERVICE-NAME>": remote error: tls: internal error
$ kubectl exec --v=8 -it <POD-NAME> -- sh
I0527 09:44:48.673774 11157 loader.go:375] Config loaded from file: /home/kevin/.kube/config
I0527 09:44:48.678514 11157 round_trippers.go:420] GET https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>
I0527 09:44:48.678528 11157 round_trippers.go:427] Request Headers:
I0527 09:44:48.678535 11157 round_trippers.go:431] Accept: application/json, */*
I0527 09:44:48.678543 11157 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:44:48.795864 11157 round_trippers.go:446] Response Status: 200 OK in 117 milliseconds
I0527 09:44:48.795920 11157 round_trippers.go:449] Response Headers:
I0527 09:44:48.795963 11157 round_trippers.go:452] Audit-Id: ...
I0527 09:44:48.795995 11157 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:44:48.796019 11157 round_trippers.go:452] Content-Type: application/json
I0527 09:44:48.796037 11157 round_trippers.go:452] Date: Thu, 27 May 2021 07:44:48 GMT
I0527 09:44:48.796644 11157 request.go:1097] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"<POD-NAME>","generateName":"","namespace":"default","selfLink":"/api/v1/namespaces/default/pods/<POD-NAME>","uid":"","resourceVersion":"6764210","creationTimestamp":"2021-05-19T10:33:28Z","labels":{"app":"...","pod-template-hash":"..."},"ownerReferences":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"<POD-BASE-NAME>","uid":"...","controller":true,"blockOwnerDeletion":true}],"managedFields":[{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2021-05-19T10:33:28Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:pod-template-hash":{}},"f:ownerReferences":{".":{},"k:{"uid":"..."}":{".":{},"f:apiVersion":{},"f:blockOwnerDeletion":{},"f:controller":{},"f:kind":{},"f:name":{},"f:uid":{}}}},"f:spec":{"f:c [truncated 3250 chars]
I0527 09:44:48.814315 11157 round_trippers.go:420] POST https://<PUBLIC-IP>/api/v1/namespaces/default/pods/<POD-NAME>/exec?command=sh&container=<SERVICE-NAME>&stdin=true&stdout=true&tty=true
I0527 09:44:48.814372 11157 round_trippers.go:427] Request Headers:
I0527 09:44:48.814391 11157 round_trippers.go:431] User-Agent: kubectl/v1.19.3 (linux/amd64) kubernetes/1e11e4a
I0527 09:44:48.814406 11157 round_trippers.go:431] X-Stream-Protocol-Version: v4.channel.k8s.io
I0527 09:44:48.814420 11157 round_trippers.go:431] X-Stream-Protocol-Version: v3.channel.k8s.io
I0527 09:44:48.814445 11157 round_trippers.go:431] X-Stream-Protocol-Version: v2.channel.k8s.io
I0527 09:44:48.814471 11157 round_trippers.go:431] X-Stream-Protocol-Version: channel.k8s.io
I0527 09:44:48.913928 11157 round_trippers.go:446] Response Status: 500 Internal Server Error in 99 milliseconds
I0527 09:44:48.913977 11157 round_trippers.go:449] Response Headers:
I0527 09:44:48.914005 11157 round_trippers.go:452] Audit-Id: ...
I0527 09:44:48.914029 11157 round_trippers.go:452] Cache-Control: no-cache, private
I0527 09:44:48.914054 11157 round_trippers.go:452] Content-Type: application/json
I0527 09:44:48.914077 11157 round_trippers.go:452] Date: Thu, 27 May 2021 07:44:48 GMT
I0527 09:44:48.914099 11157 round_trippers.go:452] Content-Length: 149
I0527 09:44:48.915741 11157 helpers.go:216] server response object: [{
"metadata": {},
"status": "Failure",
"message": "error dialing backend: remote error: tls: internal error",
"code": 500
}]
F0527 09:44:48.915837 11157 helpers.go:115] Error from server: error dialing backend: remote error: tls: internal error
Update 2
After connecting to one of the GKE worker nodes and checking the kubelet logs, I found these weird lines:
May 27 09:30:11 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:11.271022 1272 log.go:181] http: TLS handshake error from 10.156.0.9:54672: no serving certificate available for the kubelet
May 27 09:30:11 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:11.305628 1272 log.go:181] http: TLS handshake error from 10.156.0.9:54674: no serving certificate available for the kubelet
May 27 09:30:12 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:12.067998 1272 log.go:181] http: TLS handshake error from 10.156.0.11:57610: no serving certificate available for the kubelet
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.144826 1272 certificate_manager.go:412] Rotating certificates
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.154322 1272 reflector.go:207] Starting reflector *v1.CertificateSigningRequest (0s) from k8s.io/client-go/tools/watch/informerwatcher.go:146
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: I0527 09:30:14.448976 1272 reflector.go:213] Stopping reflector *v1.CertificateSigningRequest (0s) from k8s.io/client-go/tools/watch/informerwatcher.go:146
May 27 09:30:14 gke-<CLUSTER-NAME>-default-pool-<NODE-UID> kubelet[1272]: E0527 09:30:14.449045 1272 certificate_manager.go:454] certificate request was not signed: cannot watch on the certificate signing request: certificate signing request is denied, reason: AutoDenied, message:
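The AutoDenied reason in that last log line can also be read directly off the CSR object. A sketch, assuming a CSR named csr-79zkn (a placeholder): the inline JSON mimics the status shape a denied CSR returns, and the live jsonpath query is shown in the comment.

```shell
# Live query (placeholder CSR name):
#   kubectl get csr csr-79zkn -o jsonpath='{.status.conditions[0].reason}'
# Offline demo against the JSON shape such a CSR returns:
csr_json='{"status":{"conditions":[{"type":"Denied","reason":"AutoDenied","message":""}]}}'
printf '%s' "$csr_json" | grep -o '"reason":"[^"]*"' | cut -d'"' -f4
```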
Update 3
I’ve upgraded the cluster from 1.19.9-gke.1400 to 1.19.9-gke.1900. That didn’t solve the problem…
I also performed a credentials rotation on the cluster, but that didn’t solve it either…
Final
After trying a lot of changes in the cluster:
- Restarting kubelet on nodes
- Restarting nodes
- Upscaling/Downscaling node pool size
- Upgrading cluster version
- Rotating cluster certificates
Even creating a new cluster (in the same project, with the same VPC, etc.) didn’t solve the issue…
This problem might be related to changes made to firewall rules.
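On GKE, the control plane reaches kubelets on TCP 10250 for logs and exec, and that traffic is allowed by an automatically created firewall rule (typically named gke-<cluster>-<hash>-master), so a deleted or narrowed rule would produce exactly these symptoms. A sketch of verifying this; the rule names and CIDR below are illustrative, and the live gcloud command is in the comment.

```shell
# Live check (requires gcloud and access to the project):
#   gcloud compute firewall-rules list --filter="name~^gke-" \
#     --format="table(name,direction,allowed,sourceRanges.list())"
# Offline demo: scan a captured listing for the kubelet port 10250.
rules='gke-mycluster-abcd1234-master  INGRESS  tcp:443,tcp:10250  172.16.0.0/28
default-allow-ssh              INGRESS  tcp:22             0.0.0.0/0'
if printf '%s\n' "$rules" | grep -q 'tcp:10250'; then
  echo "found a rule allowing kubelet port 10250"
else
  echo "no rule allows 10250: control-plane logs/exec will fail"
fi
```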
The only solution found was to create a new GKE cluster in a new GCP project and migrate the workloads using Velero.
docker push remote error: tls: internal error #4279
I’m a bit new to Docker so I might be doing something wrong, but when I try to push my image I get an error of Get https://docker.myhost.com/v2/: remote error: tls: internal error
Actually, it might be something to do with flynn docker login, as that also gives an error.
What version of Docker are you using? (Check docker version.)
Client:
Version: 17.10.0-ce
I am unable to reproduce this. What OS and version are you running? What happens when you hit the endpoint using curl (curl -v https://docker.myhost.com/v2/)?
This is what I get:
* TCP_NODELAY set
* Connected to docker.myhost.com (0.0.0.0) port 443 (#0)
* Unknown SSL protocol error in connection to docker.myhost.com:-9838
* Closing connection 0
curl: (35) Unknown SSL protocol error in connection to docker.myhost.com:-9838
OK, I suspect there is something wrong in your cluster. What is the output of the following right after trying flynn docker login:
Flynn is unmaintained and our infrastructure will shut down on June 1, 2021. See the README for details.
AWS EKS — remote error: tls: internal error — CSR pending #610
What happened: We have an EKS cluster deployed with managed nodes. When we try to run kubectl logs or kubectl exec, it gives Error from server: error dialing backend: remote error: tls: internal error. In the admin console, all the nodes show as Ready and the workloads as ready. Then I ran kubectl get csr and it showed all requests as Pending. Then I described a CSR and the details seem correct. Please refer to the output below:
Anything else we need to know?: This issue came on suddenly. Our guess is that it started after scaling.
Environment:
- AWS Region: North Virginia
- Instance Type(s): M5.Large
- EKS Platform version (use aws eks describe-cluster --name --query cluster.platformVersion): eks.3
- Kubernetes version (use aws eks describe-cluster --name --query cluster.version): 1.18
- AMI Version:AL2_x86_64
- Kernel (e.g. uname -a): Linux ip-192-168-33-152.ec2.internal 4.14.214-160.339.amzn2.x86_64 #1 SMP Sun Jan 10 05:53:05 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
In order to debug this issue, we will need the cluster ARN. I recommend creating a support case with AWS and providing relevant details there.
To anyone with a similar issue, be aware AWS will charge you for support cases, but fail to diagnose or help in any way.
Any update on this? We have experienced this three times now, each time having to delete and recreate the cluster. AWS support couldn’t reproduce it on their side, charged us for the support case they never solved, and then asked us to reproduce it for them, giving the following response:
AWS Support:
Also, I’ve tested it in my cluster by scaling the worker nodes from the eks console but in my case the node was launched successfully.
Therefore, please check once again if you can reproduce this issue, if so please share the steps and the logs/outputs that I’ve requested in my previous correspondence and I’ll investigate this further.
In our case, AWS terminated a node (without notifying or requesting):
W0708 15:12:35.439299 1 aws.go:1730] the instance i-04c7a**** is terminated
I0708 15:12:35.439314 1 node_lifecycle_controller.go:156] deleting node since it is no longer present in cloud provider: ip-********.eu-west-1.compute.internal
The node that came back up started with TLS issue, brought down parts of our system and now the cluster is again unhealthy.
CSR’s from nodes have the following auto-approve config:
Alpha-2 (0.87.0): oc logs: remote error: tls: internal error #218
Logs cannot be viewed.
How to reproduce:
@rbaumgar this happens because there is no VirtualBox provider, so you need to approve the CSRs manually; try the command below to do it.
The workaround works! Thanks.
But I am using libvirt.
@rbaumgar we didn’t see this before on libvirt; will check what changed there.
I’m also facing the issue on libvirt (0.87.0)
If that matters, I saw tls: internal error on OSX as well.
It affects the image/cluster; it is not related to the hypervisor.
On libvirt the machine config operator should auto-approve those CSRs, which is not happening at the moment; we need to find what changed.
Question: is that workaround (oc adm certificate...) required to be run periodically? Say I create and delete a bunch of deployments/pods over time; will I need to run that workaround each time I deploy new things? Or are those certificates fixed after startup, so I only have to run that oc adm command once?
@jmazzitelli I ran this command once 10 days ago and did not face the error again
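For readers hitting the same thing: the manual workaround being discussed is, in most write-ups, the standard batch approval of pending CSRs. A sketch; the live one-liner needs cluster-admin rights, and the offline part only demonstrates selecting the Pending request names from sample oc get csr output (the CSR and node names are placeholders).

```shell
# Live workaround (requires cluster-admin):
#   oc get csr -o name | xargs oc adm certificate approve
# Offline demo of picking out only the Pending request names:
csrs='csr-aaa  2m  kubernetes.io/kubelet-serving  system:node:worker-0  Pending
csr-bbb  5m  kubernetes.io/kubelet-serving  system:node:worker-1  Approved,Issued'
printf '%s\n' "$csrs" | awk '$5 == "Pending" { print $1 }'
```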
tls: internal error when getting logs on OpenStack #1467
Platform (aws|libvirt|openstack): openstack
Running oc logs always seems to fail with a tls internal error. For example:
What you expected to happen?
How to reproduce it (as minimally and precisely as possible)?
Install OpenShift on OpenStack (you have to create the install-config.yaml manually) and once the cluster is mostly up (e.g. the worker nodes are created), pick a pod and run oc logs pod .
This should produce the logs, but instead outputs the tls error above.
We need to figure that out asap because it’s blocking us from figuring out any of the other issues.
Ah, thanks for the pointer @zeenix! I’ll see if we need to do something similar for OpenStack.
Oops, this is the OpenStack backend. The PR above is libvirt-specific. Having said that, I’d bet it’s a very similar issue.
@tomassedovic I had the same issue because the CSR was not approved. This happened because the nodelink-controller cannot link the machine due to the missing status field.
I created a PR here, openshift/cluster-api-provider-openstack#29, but still no response ;(
Thanks @FlorinPeter, I’ll have a look.
Can you check the following on your node? #1494 See if the certs in /var/lib/kubelet/pki are still valid.
This seems like a machine API and machine approver issue. Please open issues on the corresponding repos to track this.
But if you still think the installer is configuring resources incorrectly, please feel free to reopen with your reasoning.
Getting kubectl tls: internal error on worker nodes scaled by Cluster Autoscaler #1324
I created a k8s cluster using eksctl & then created Cluster Autoscaler deployment on it.
For the pods running on new worker nodes that were scaled out by Cluster Autoscaler, kubectl logs and kubectl exec are not working. The error below is thrown:
Error from server: remote error: tls: internal error
I checked the kubelet logs on these scaled worker nodes, and found these logs:
http: TLS handshake error from 172.XX.X.XX:48784: no serving certificate available for the kubelet
I observed that the CSRs generated by these nodes were in Pending state. After manually approving these CSRs, kubectl logs/exec threw this error:
Error from server: Get https://172.XXX.X.XX:10250/containerLogs/kube-system/container-name: x509: cannot validate certificate for 172.XXX.X.XX because it doesn’t contain any IP SANs
This happens only when the Cluster Autoscaler provisions a new worker node in the nodegroup.
Everything works fine if I manually change the desired count in the ASG of the nodes.
I did make the appropriate changes in the Cluster Autoscaler deployment YAML for the sslCert file path location, as per eksctl.
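The x509 "doesn’t contain any IP SANs" error above can be confirmed by reading the SANs off the kubelet’s serving certificate. A sketch: the on-node openssl command is in the comment (the pki path is the kubelet default and may differ per distro), and the inline text mimics the relevant fragment of its output.

```shell
# On a node (default kubelet cert path, may differ per distro):
#   openssl x509 -in /var/lib/kubelet/pki/kubelet-server-current.pem \
#     -noout -text | grep -A1 'Subject Alternative Name'
# Offline demo on a captured fragment of such output:
cert_text='        X509v3 extensions:
            X509v3 Subject Alternative Name:
                DNS:ip-172-16-0-10.ec2.internal, IP Address:172.16.0.10'
if printf '%s\n' "$cert_text" | grep -A1 'Subject Alternative Name' | grep -q 'IP Address:'; then
  echo "certificate carries an IP SAN"
else
  echo "no IP SAN: kubectl reaching the kubelet by node IP will fail x509 validation"
fi
```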
Issue
- We are getting "An error occurred while retrieving the requested logs." when trying to view logs for any pod in the OCP 4.1 web interface.
WebSocket connection to 'wss://console-openshift-console.apps.example.com/api/kubernetes/api/v1/namespaces/openshift-console/pods/console-79b6c7bb87-gt2ck/log?container=console&follow=true&tailLines=1000&x-csrf-token=ESx4l2bhkAyUQ8nx9f0%2FmA3qThlJEI6IOptYX2N%2FSPBDwcQuQ1K91DDjT0I3J99QYF4rogNwgleVtq6FV%2BkL7Q%3D%3D' failed: Error during WebSocket handshake: Unexpected response code: 500
- The command line logs, exec, and rsh tools give a remote error
$ oc logs console-79b6c7bb87-gt2ck
Error from server: Get https://master0.example.com:10250/containerLogs/openshift-console/console-79b6c7bb87-gt2ck/console: remote error: tls: internal error
- We have pending CSRs in an OpenShift 4 cluster after install
- The attempt to oc exec ... is failing
$ oc exec marketplace-operator-768b99959-9pftm -n openshift-marketplace -- echo foo
Error from server: error dialing backend: remote error: tls: internal error
$ oc logs marketplace-operator-768b99959-9pftm -n openshift-marketplace
Error from server: Get https://master:10250/containerLogs/openshift-marketplace/marketplace-operator-768b99959-9pftm/marketplace-operator: remote error: tls: internal error
- kube-apiserver container has errors
$ sudo crictl ps | grep kube-api
239ec13eeaf4e beaf65fce4dc16947c5bd5d1ca7e16313234c393e8ca1c4251ac9b85094972bb About an hour ago Running kube-apiserver-operator 3 bd197ceb6f882
6f2bdcab072ca beaf65fce4dc16947c5bd5d1ca7e16313234c393e8ca1c4251ac9b85094972bb About an hour ago Running kube-apiserver-cert-syncer-8 1 6938a6ebc2c3d
e6b9db2994d07 0d8dcfc307048a0f0400e644fcd1c9929018103b15d0f9b23b4841f1e71937bc About an hour ago Running kube-apiserver-8 1 6938a6ebc2c3d
$ sudo crictl logs e6b9db2994d07
...
E0725 17:38:54.707552 1 status.go:64] apiserver received an error that is not an metav1.Status: &url.Error{Op:"Get", URL:"https://master:10250/containerLogs/openshift-kube-apiserver/kube-apiserver-master/kube-apiserver-8", Err:(*net.OpError)(0xc01ec89270)}
...
Environment
- Red Hat OpenShift Container Platform
- 4.x