The "upstream connect error or disconnect reset before headers reset reason connection failure" error - Fixing errors and finding optimal solutions to problems

Here’s what “upstream connect error or disconnect/reset before headers connection failure” means and how to fix it:

If you are an everyday user, and you see this message while browsing the internet, then it simply means that you need to clear your cache and cookies.

If you are a developer and see this message, then you need to check your service routes, destination rules, and/or traffic management with applications.

So if you want to learn all about what this 503 error means exactly and how to fix it, then this article is for you.

Let’s delve deeper into it!

Upstream connect error or disconnect reset before headers reset reason connection failure.

That’s a very specific, yet unclear error message to see.

What is it trying to tell you?

Let’s start with an overview.

This is a 503 error message.

It’s a generic message that actually applies to a lot of different scenarios, and the fix for it will depend on the specific scenario at hand.

In general, this error is telling you that there is a connection error, and that error is linked to routing services and rules.

That leaves an absolute ton of possibilities, but I’ll take you through the most common sources.

Then, we can talk about troubleshooting and fixing the problem.

That covers the very zoomed-out picture of this error message, but if you’re getting it, then you probably want to get it to go away.

To fix the problem, we have to address the root cause.

That’s the essence of troubleshooting, and it definitely applies here.

There’s a problem when it comes to identifying the cause of this error.

There are basically two instances where you’re going to see this error, and they are completely different.

One place where you’ll run into it is when you’re coding specific functions that relate to network connection management.

I’m going to break down the three most common scenarios that lead to this error in the next few sections.

But, the other common time you see this error is when you’re browsing the internet.

That means that I’m really answering this question for two very different groups of people.

One group is developing or coding networking resources.

The other group is just browsing the internet.

As you might imagine, it’s hard to consolidate all of that into a single, concise answer.

So, I’m going to split this up.

First, I’ll tackle the developer problems.

If you’re just trying to browse the internet and don’t want to get deep into networking and how it works, then skip to the section that is clearly labeled as not for developers and programmers.

That said, if you want to take a peek behind the curtain and learn a little more about networking, I’ll try to keep these explanations as light as possible.

#1 Reconfiguring Service Routes

I mentioned before that this is a 503 error.

One common place you’ll find it is when reconfiguring service routes.

The boiled-down essence here is that it’s easy to apply service routing rules in the wrong order, so that the system receives routes that point to subsets before those subsets have been defined.

Naturally, the system doesn’t know what to do in that case, and you get a 503 error.

The key to avoiding this problem with service route reconfiguring is to follow what you might call a “make-before-break” rule.

Essentially, the steps force the system to add the new subset first and then update the virtual services.
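
To make the ordering concrete, here is a minimal sketch of a pre-flight check, assuming your manifests are already loaded as plain Python dictionaries shaped like Istio DestinationRule and VirtualService resources. The function name and the sample "reviews" manifests are illustrative, not part of any Istio tooling, and the host comparison is deliberately simplistic (it does not resolve short names to fully qualified ones).

```python
def undefined_subsets(destination_rules, virtual_services):
    """Return (host, subset) pairs that a VirtualService routes to
    but that no DestinationRule defines."""
    defined = set()
    for rule in destination_rules:
        spec = rule.get("spec", {})
        for subset in spec.get("subsets", []):
            defined.add((spec.get("host"), subset.get("name")))

    missing = []
    for service in virtual_services:
        for http_route in service.get("spec", {}).get("http", []):
            for route in http_route.get("route", []):
                dest = route.get("destination", {})
                key = (dest.get("host"), dest.get("subset"))
                # Routes without a subset do not depend on a DestinationRule.
                if key[1] is not None and key not in defined and key not in missing:
                    missing.append(key)
    return missing


# Hypothetical manifests: the VirtualService already routes to "v2",
# but the DestinationRule only defines "v1" so far.
rules = [{"kind": "DestinationRule",
          "spec": {"host": "reviews", "subsets": [{"name": "v1"}]}}]
services = [{"kind": "VirtualService",
             "spec": {"http": [{"route": [
                 {"destination": {"host": "reviews", "subset": "v1"}},
                 {"destination": {"host": "reviews", "subset": "v2"}},
             ]}]}}]

print(undefined_subsets(rules, services))  # prints [('reviews', 'v2')]
```

If the list is not empty, apply the destination rule that adds the missing subset first, and only then update the virtual service.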

#2 Setting Destination Rules

Considering the issue above, it should not come as a surprise that you can trigger 503 errors when setting destination rules.

Most commonly, destination rules are the issue if you see the 503 errors right after a request to a service.

This issue goes hand in hand with the one above.

The problem is still that the destination rule is creating the issue.

The difference is that this isn’t necessarily a problem with receiving subsets before they have been defined.

Virtually any destination rule error can lead to a 503 message.

Since there are so many ways these rules can break down and so many ways the problems can manifest, I’m going to cheat a little.

If you noticed that the problem correlates with new destination rules, then you can follow this guide.

It breaks down the most common destination rule problems and shows you how to overcome them.

#3 Traffic Management With Applications

The third primary issue is related to conflicts between applications and any proxy sidecar.

In other words, the applications that work with your traffic management rules might not know those rules, and the application can do things that don’t play well with the traffic management system.

That’s pretty vague because, once again, there are a lot of specific possibilities.

The gist is that you’re trying to offload as much error recovery to the applications as you can.

That will minimize these conflicts and resolve most instances of 503 errors.
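
As a rough illustration of application-side error recovery, the sketch below retries a call with exponential backoff when it fails with a 503. The ServiceUnavailable exception and the request callable are placeholders invented for this example; a real application would raise or map its own HTTP client's error. Keep in mind that if the proxy is also configured to retry, the two layers multiply the number of attempts.

```python
import time


class ServiceUnavailable(Exception):
    """Raised by the caller's request function on an HTTP 503."""


def call_with_retries(request, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call request(); on ServiceUnavailable, wait and try again.

    The delay doubles after every failed attempt (exponential backoff).
    The last failure is re-raised so the caller still sees the error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return request()
        except ServiceUnavailable:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))


# Usage with a fake request that fails twice and then succeeds.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ServiceUnavailable("503 from proxy")
    return "ok"


print(call_with_retries(flaky_request, sleep=lambda seconds: None))  # prints ok
```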

Considering the detailed problems we just covered, what can you do about the 503 error?

I included some solutions and linked to even more, but if you’re looking for a general guide, then here’s another way to think about the whole thing.

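This specific message is telling you that the proxy could not connect to the upstream service, or that the connection was reset before any response headers came back.

In the scenarios above, that usually happens because somewhere in your system you have conflicting rules that are trying to do things out of order.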

The best way to find the specific area is to focus on rules changes as they relate to traffic management.

Essentially, start with what you touched most recently, and work your way backward from there.

Ok, but What if I’m Not a Developer or Programmer? (3 Steps)

Alright. That was a relatively deep walk-through of connection rules development.

If you’re still with me, that’s great.

We’re going to switch gears and look at this from a simple user perspective.

You don’t need to know any coding to run into this problem, and I’m going to show you how to solve it without any coding either.

It’s actually pretty simple.

#1 The Walmart Bug

But, it still makes more sense when you know more about what went wrong.

So, I’m going to cite one of the most prominent examples of everyday 503 errors.

In 2020, Walmart’s website ran into widespread issues.

Users could browse the site just fine, but when they tried to go to a specific product page to make a purchase, they got the 503 error.

It popped up word for word as I mentioned before: Upstream connect error or disconnect reset before headers reset reason connection failure.

People were just trying to buy some stuff, and they got hit with this crazy message.

What are you supposed to do with it?

#2 An Easy Fix

Well, the message is actually giving you very specific advice, once you know how to read it.

It’s telling you that your computer and the Walmart servers had a connection failure, and when they tried to automatically fix that connection problem, things broke down.

A quick note: I’m using the famous Walmart bug as an example, but the problems and solutions discussed here will work any time you see this message while browsing the web.

What that means is that there is some piece of information that is tied to your connection to the Walmart site that is messing up the automatic reconnect protocols.

While that might sound a little vague and mysterious, it actually tells us exactly where the problem lies.

The only information that could exist in this space would have to be stored in your browser’s cache.

This is related to your cookies.

Basically, when the connection first went wrong, your computer remembered the problem, and so it just kept doing things the wrong way over and over again.

The solution requires you to make your computer forget the bad rule that it’s following.

To do that, you simply need to clear your cache and cookies.

#3 Clearing the Cache

The famous Walmart problem plagued Chrome users, so I’ll walk you through how to do this on Google Chrome.

If you use a different browser, you can just look up how to clear cache and cookies.

Before we go through the steps, let me explain what is going to happen here.

We’re not deleting anything that is particularly important.

Your internet cache is just storing information related to the websites you visit.

Then, if you go back to that website or reload it, the stored information means that your computer doesn’t actually have to download as much information, and everything can load a little faster and easier.

So, when you delete this cache, it’s going to do a few things.

It’s going to slow down your first visit to any site that no longer has cached files.

But after you visit a site, it will build new cache files, and things will work normally.

This is also going to make your computer forget your sign-in information for any sites that require it.

Sticking with Walmart as an example, if you were signed into the website with your account, then after you clear the cache, you’re going to be automatically signed out again.

Make sure you know your passwords and usernames.

Because of this last issue, some people don’t like to clear their cache.

If you’re worried about that, then you don’t have to clear everything.

Just clear the cache back through the day when the error started.

Ok. With all of that covered, let’s go through the steps:

Look for the three dots and click on them (this opens the tools menu).
Choose “history” from the list.
Choose the time frame on the right that covers the data you want to clear.
Click on “Clear browsing data.”
Look at the checkboxes. You can choose cookies, cached images and files, and browsing history.
To be sure you resolve the 503 error, clear the cookies and cached files.
Click on “Clear Data” and you’re done.

Source

I’m having a problem migrating my pure Kubernetes app to an Istio-managed one. I’m using Google Cloud Platform (GCP), Istio 1.4, Google Kubernetes Engine (GKE), Spring Boot and Java 11.

I had the containers running in a pure GKE environment without a problem. Now I started the migration of my Kubernetes cluster to use Istio. Since then I’m getting the following message when I try to access the exposed service.

upstream connect error or disconnect/reset before headers. reset reason: connection failure

This error message looks really generic. I found a lot of different problems with the same error message, but none of them was related to my problem.

Below is the Istio version:

client version: 1.4.10
control plane version: 1.4.10-gke.5
data plane version: 1.4.10-gke.5 (2 proxies)

Below are my yaml files:

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    account: tree-guest
  name: tree-guest-service-account
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: tree-guest
    service: tree-guest
  name: tree-guest
spec:
  ports:
  - name: http
    port: 8080
    targetPort: 8080
  selector:
    app: tree-guest
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: tree-guest
    version: v1
  name: tree-guest-v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tree-guest
      version: v1
  template:
    metadata:
      labels:
        app: tree-guest
        version: v1
    spec:
      containers:
      - image: registry.hub.docker.com/victorsens/tree-quest:circle_ci_build_00923285-3c44-4955-8de1-ed578e23c5cf
        imagePullPolicy: IfNotPresent
        name: tree-guest
        ports:
        - containerPort: 8080
      serviceAccount: tree-guest-service-account
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: tree-guest-gateway
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: tree-guest-virtual-service
spec:
  hosts:
    - "*"
  gateways:
    - tree-guest-gateway
  http:
    - match:
        - uri:
            prefix: /v1
      route:
        - destination:
            host: tree-guest
            port:
              number: 8080

To apply the yaml file I used the following command:

kubectl apply -f <(istioctl kube-inject -f ./tree-guest.yaml)

Below is the Istio proxy status output after deploying the application:

istio-ingressgateway-6674cc989b-vwzqg.istio-system   SYNCED   SYNCED   SYNCED   SYNCED   istio-pilot-ff4489db8-2hx5f   1.4.10-gke.5
tree-guest-v1-774bf84ddd-jkhsh.default               SYNCED   SYNCED   SYNCED   SYNCED   istio-pilot-ff4489db8-2hx5f   1.4.10-gke.5

If someone has a tip about what is going wrong, please let me know. I’ve been stuck on this problem for a couple of days.

Thanks.

Source

upstream connect error or disconnect/reset before headers #25734

Bug description
Large requests over http frequently give an error upstream connect error or disconnect/reset before headers. reset reason: connection termination . With the bookinfo application but no sidecar, sending a 3MB file fails roughly 3% of the time. With the sidecar proxy enabled, sending the same 3MB file fails roughly 10% of the time.

The detailed output from curl on a failed request is:

Affected product area (please put an X in all that apply)
[ ] Configuration Infrastructure
[ ] Docs
[ ] Installation
[ X ] Networking
[ X ] Performance and Scalability
[ ] Policies and Telemetry
[ ] Security
[ ] Test and Release
[ ] User Experience
[ ] Developer Infrastructure

Affected features (please put an X in all that apply)

[ ] Multi Cluster
[ ] Virtual Machine
[ ] Multi Control Plane

Expected behavior
The expected behavior is that I should be able to send a file to an API a thousand times in a row with zero errors. When I run the API without istio’s functionality, I can do that.

Steps to reproduce the bug
The easiest way to reproduce the bug is using the standard "bookinfo" application.

Run istioctl install --set profile=demo
Run kubectl apply -f samples/bookinfo/platform/kube/bookinfo.yaml; kubectl apply -f samples/bookinfo/networking/bookinfo-gateway.yaml
Repeatedly run curl -F 'foo=@/path/to/large/file' $/productpage , where you pass in some large file of a few MB. Some will succeed and some will fail. NOTE*

I ran the above experiment 1,000 times WITHOUT sidecar injection enabled. In that experiment, 29 of the 1,000 requests failed to complete and returned the upstream connect error .

I ran the experiment 1,000 times WITH sidecar injection enabled. Interestingly, the error rate INCREASED with the proxy enabled: 96 of 1,000 requests failed to go through; and the other 904 returned the expected response (in this case, a 405).

***NOTE: A "successful" request here should return a 405 response, as we are POSTing to a GET-only endpoint. A failure is when we get the upstream connection error. I know it’s not proper to test this way, but it’s the easiest way to replicate. Just pretend for a minute that a 405 is like a 200, and trust (or verify) that, if you want to, you can replicate the same behavior with a POST endpoint, but you’ll have to deploy a different container.
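
The rough percentages quoted in the bug description follow directly from these counts. A small sketch of the arithmetic (the function name is only illustrative):

```python
def failure_rate(failed, total):
    """Return the share of failed requests as a percentage."""
    return 100.0 * failed / total


without_sidecar = failure_rate(29, 1000)
with_sidecar = failure_rate(96, 1000)

print(f"without sidecar: {without_sidecar:.1f}%")                 # 2.9%
print(f"with sidecar:    {with_sidecar:.1f}%")                    # 9.6%
print(f"ratio:           {with_sidecar / without_sidecar:.1f}x")  # 3.3x
```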

Version (include the output of istioctl version --remote and kubectl version and helm version if you used Helm)

How was Istio installed?
Istio was installed as per documentation: https://istio.io/latest/docs/setup/getting-started/

Environment where bug was observed (cloud vendor, OS, etc)
Docker-Desktop on MacOS

Additionally, please consider attaching a cluster state archive by attaching
the dump file to this issue.

Source

"upstream connect error or disconnect/reset before headers. reset reason: connection failure" error for .NET Core apps run in docker-compose #15727

Description:
Hello, I have 2 .NET Core apps (Razor-pages web app and GRPC Service) running in docker-compose. Both are running in different localhost ports. If I access them via localhost, like:

http://localhost:5105/ or http://127.0.0.1:5105 — for the web app,
http://localhost:5104/ or http://127.0.0.1:5104 — for the GRPC
both work. But when I added the Envoy listener and cluster configuration and tried to access them via:
http://localhost:8080/imageslibs
http://localhost:8080/imagesservice

Envoy returns the exception upstream connect error or disconnect/reset before headers. reset reason: connection failure for both apps.
The docker-compose.yml:
version: ‘3.4’

Config:
Envoy’s dockerfile:

front-envoy_1 | [2021-03-28 16:47:54.444][14][debug][http] [source/common/http/conn_manager_impl.cc:255] [C6] new stream
front-envoy_1 | [2021-03-28 16:47:54.445][14][debug][http] [source/common/http/conn_manager_impl.cc:883] [C6][S14144009116599918894] request headers complete (end_stream=true):
front-envoy_1 | ‘:authority’, ‘localhost:8080’
front-envoy_1 | ‘:path’, ‘/imageslibs’
front-envoy_1 | ‘:method’, ‘GET’
front-envoy_1 | ‘connection’, ‘keep-alive’
front-envoy_1 | ‘cache-control’, ‘max-age=0’
front-envoy_1 | ‘sec-ch-ua’, ‘"Google Chrome";v="89", "Chromium";v="89", ";Not A Brand";v="99"’
front-envoy_1 | ‘sec-ch-ua-mobile’, ‘?0’
front-envoy_1 | ‘upgrade-insecure-requests’, ‘1’
front-envoy_1 | ‘user-agent’, ‘Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36’
front-envoy_1 | ‘accept’, ‘text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9’
front-envoy_1 | ‘sec-fetch-site’, ‘none’
front-envoy_1 | ‘sec-fetch-mode’, ‘navigate’
front-envoy_1 | ‘sec-fetch-user’, ‘?1’
front-envoy_1 | ‘sec-fetch-dest’, ‘document’
front-envoy_1 | ‘accept-encoding’, ‘gzip, deflate, br’
front-envoy_1 | ‘accept-language’, ‘en-US,en;q=0.9’
front-envoy_1 | ‘cookie’, ‘idsrv.session=NlW8VRtzuNJguQYDdVVpIA; .AspNetCore.Cookie=CfDJ8BR22IBZi6xAvAD2wBqZBlG2IUeWsw7hHPiNq4LrY2HBNRWyhGZ2gZuzRIbMi9MLO7IDORqkSIvDTuZDsLDz6RYtLccXi9x2CwlSzHS169Pgs3hs6biCcFKuriLkWZ4lpWHv4OCqZdO4lGgWmdzcrf2ctQbQOA-xPS7O7NSoQ0-a8VGjjthlIolqaxh5gYLtvvdjSI043UZWVOCb_ZDnFNiD4H_WKAtpKmdENFk_4NbSZmmQ3Indj2ty72kNNUUv8OLEswzxI5dBGA9AYI7i-lzMjbl8GjXNhplHR5j7XJTgG7i9dsF2antRfonV_IpL4sabtmLhdti-ZaumXhPewS702E_1BKo-8ELV3LOMfiE_jdkKJTPR15sCSWkSo0-nllUoQczL7de0F8KMolWK8KoB13z8E388w2juHXnmiDYQIAn3MWzKUvhH_bhgK_ZBCEExWvDqgGRRBroI90Nvg6IAwc_-PoJcPE1HE2i6ouzdkNXoBRg6IQWmelHAtDb8uI2CYzYeBu3zYrnJq28vOhAx_Qpr_y7A0GenqHyJO5cw; .AspNetCore.Antiforgery.9TtSrW0hzOs=CfDJ8Do6rlT2pe5IndjlZXmKm7GvuVL61tmcxXKqGH7eWnem071yNAndO5zwY5WDwxxHjY8CnoRIsalbkPMWIIq_ZFysZ-fkQJJdPm78T8dCxUe5DGeKiJqu5GjjEldMAkcnvmYjNYO9Ht13ldBWwzbBUqs’
front-envoy_1 |
front-envoy_1 | [2021-03-28 16:47:54.445][14][debug][http] [source/common/http/filter_manager.cc:774] [C6][S14144009116599918894] request end stream
front-envoy_1 | [2021-03-28 16:47:54.445][14][debug][router] [source/common/router/router.cc:426] [C6][S14144009116599918894] cluster ‘imageslibs’ match for URL ‘/imageslibs’
front-envoy_1 | [2021-03-28 16:47:54.446][14][debug][router] [source/common/router/router.cc:583] [C6][S14144009116599918894] router decoding headers:
front-envoy_1 | ‘:authority’, ‘localhost:8080’
front-envoy_1 | ‘:path’, ‘/imageslibs’
front-envoy_1 | ‘:method’, ‘GET’
front-envoy_1 | ‘:scheme’, ‘http’
front-envoy_1 | ‘cache-control’, ‘max-age=0’
front-envoy_1 | ‘sec-ch-ua’, ‘"Google Chrome";v="89", "Chromium";v="89", ";Not A Brand";v="99"’
front-envoy_1 | ‘sec-ch-ua-mobile’, ‘?0’
front-envoy_1 | ‘upgrade-insecure-requests’, ‘1’
front-envoy_1 | ‘user-agent’, ‘Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36’
front-envoy_1 | ‘accept’, ‘text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9’
front-envoy_1 | ‘sec-fetch-site’, ‘none’
front-envoy_1 | ‘sec-fetch-mode’, ‘navigate’
front-envoy_1 | ‘sec-fetch-user’, ‘?1’
front-envoy_1 | ‘sec-fetch-dest’, ‘document’
front-envoy_1 | ‘accept-encoding’, ‘gzip, deflate, br’
front-envoy_1 | ‘accept-language’, ‘en-US,en;q=0.9’
front-envoy_1 | ‘cookie’, ‘idsrv.session=NlW8VRtzuNJguQYDdVVpIA; .AspNetCore.Cookie=CfDJ8BR22IBZi6xAvAD2wBqZBlG2IUeWsw7hHPiNq4LrY2HBNRWyhGZ2gZuzRIbMi9MLO7IDORqkSIvDTuZDsLDz6RYtLccXi9x2CwlSzHS169Pgs3hs6biCcFKuriLkWZ4lpWHv4OCqZdO4lGgWmdzcrf2ctQbQOA-xPS7O7NSoQ0-a8VGjjthlIolqaxh5gYLtvvdjSI043UZWVOCb_ZDnFNiD4H_WKAtpKmdENFk_4NbSZmmQ3Indj2ty72kNNUUv8OLEswzxI5dBGA9AYI7i-lzMjbl8GjXNhplHR5j7XJTgG7i9dsF2antRfonV_IpL4sabtmLhdti-ZaumXhPewS702E_1BKo-8ELV3LOMfiE_jdkKJTPR15sCSWkSo0-nllUoQczL7de0F8KMolWK8KoB13z8E388w2juHXnmiDYQIAn3MWzKUvhH_bhgK_ZBCEExWvDqgGRRBroI90Nvg6IAwc_-PoJcPE1HE2i6ouzdkNXoBRg6IQWmelHAtDb8uI2CYzYeBu3zYrnJq28vOhAx_Qpr_y7A0GenqHyJO5cw; .AspNetCore.Antiforgery.9TtSrW0hzOs=CfDJ8Do6rlT2pe5IndjlZXmKm7GvuVL61tmcxXKqGH7eWnem071yNAndO5zwY5WDwxxHjY8CnoRIsalbkPMWIIq_ZFysZ-fkQJJdPm78T8dCxUe5DGeKiJqu5GjjEldMAkcnvmYjNYO9Ht13ldBWwzbBUqs’
front-envoy_1 | ‘x-forwarded-proto’, ‘http’
front-envoy_1 | ‘x-request-id’, ‘6def488d-7020-4a79-acee-d1bd5a9f7252’
front-envoy_1 | ‘x-envoy-expected-rq-timeout-ms’, ‘15000’
front-envoy_1 |
front-envoy_1 | [2021-03-28 16:47:54.446][14][debug][pool] [source/common/http/conn_pool_base.cc:79] queueing stream due to no available connections
front-envoy_1 | [2021-03-28 16:47:54.446][14][debug][pool] [source/common/conn_pool/conn_pool_base.cc:229] trying to create new connection
front-envoy_1 | [2021-03-28 16:47:54.446][14][debug][pool] [source/common/conn_pool/conn_pool_base.cc:132] creating a new connection
front-envoy_1 | [2021-03-28 16:47:54.446][14][debug][client] [source/common/http/codec_client.cc:41] [C8] connecting
front-envoy_1 | [2021-03-28 16:47:54.446][14][debug][connection] [source/common/network/connection_impl.cc:861] [C8] connecting to 127.0.0.1:5105
front-envoy_1 | [2021-03-28 16:47:54.446][14][debug][connection] [source/common/network/connection_impl.cc:880] [C8] connection in progress
front-envoy_1 | [2021-03-28 16:47:54.446][14][debug][connection] [source/common/network/connection_impl.cc:671] [C8] delayed connection error: 111
front-envoy_1 | [2021-03-28 16:47:54.447][14][debug][connection] [source/common/network/connection_impl.cc:243] [C8] closing socket: 0
front-envoy_1 | [2021-03-28 16:47:54.447][14][debug][client] [source/common/http/codec_client.cc:101] [C8] disconnect. resetting 0 pending requests
front-envoy_1 | [2021-03-28 16:47:54.447][14][debug][pool] [source/common/conn_pool/conn_pool_base.cc:380] [C8] client disconnected, failure reason:
front-envoy_1 | [2021-03-28 16:47:54.447][14][debug][router] [source/common/router/router.cc:1040] [C6][S14144009116599918894] upstream reset: reset reason: connection failure, transport failure reason:
front-envoy_1 | [2021-03-28 16:47:54.447][14][debug][http] [source/common/http/filter_manager.cc:858] [C6][S14144009116599918894] Sending local reply with details upstream_reset_before_response_started
front-envoy_1 | [2021-03-28 16:47:54.447][14][debug][http] [source/common/http/conn_manager_impl.cc:1454] [C6][S14144009116599918894] encoding headers via codec (end_stream=false):
front-envoy_1 | ‘:status’, ‘503’
front-envoy_1 | ‘content-length’, ’91’
front-envoy_1 | ‘content-type’, ‘text/plain’
front-envoy_1 | ‘date’, ‘Sun, 28 Mar 2021 16:47:54 GMT’
front-envoy_1 | ‘server’, ‘envoy’
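
The "delayed connection error: 111" line in the log above is an operating system error number; on Linux, errno 111 is ECONNREFUSED, meaning nothing was listening on the address Envoy tried. One possible explanation in a docker-compose setup is that 127.0.0.1 inside the Envoy container refers to the Envoy container itself rather than to the host or to the other services. The sketch below is a hypothetical helper (not part of Envoy) that makes the same kind of TCP connection attempt and reports the symbolic error name, so it can be run from wherever the proxy runs.

```python
import errno
import socket


def probe(host, port, timeout=2.0):
    """Try a TCP connection; return "ok" or the symbolic errno name."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        return "TIMEOUT"
    except OSError as exc:
        # Fall back to the message text when there is no errno name.
        return errno.errorcode.get(exc.errno, str(exc))


if __name__ == "__main__":
    # Addresses taken from the log above; run this from inside the
    # Envoy container to see what Envoy itself sees.
    for port in (5105, 5104):
        print(port, probe("127.0.0.1", port))
```

If the probe reports ECONNREFUSED from inside the proxy's container but "ok" from the host, the backend address in the cluster definition is the first thing to check.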

Here is the localhost:9999/clusters output:

imageslibs::default_priority::max_connections::1024
imageslibs::default_priority::max_pending_requests::1024
imageslibs::default_priority::max_requests::1024
imageslibs::default_priority::max_retries::3
imageslibs::high_priority::max_connections::1024
imageslibs::high_priority::max_pending_requests::1024
imageslibs::high_priority::max_requests::1024
imageslibs::high_priority::max_retries::3
imageslibs::added_via_api::false
imageslibs::127.0.0.1:5105::cx_active::0
imageslibs::127.0.0.1:5105::cx_connect_fail::2
imageslibs::127.0.0.1:5105::cx_total::2
imageslibs::127.0.0.1:5105::rq_active::0
imageslibs::127.0.0.1:5105::rq_error::2
imageslibs::127.0.0.1:5105::rq_success::0
imageslibs::127.0.0.1:5105::rq_timeout::0
imageslibs::127.0.0.1:5105::rq_total::0
imageslibs::127.0.0.1:5105::hostname::127.0.0.1
imageslibs::127.0.0.1:5105::health_flags::healthy
imageslibs::127.0.0.1:5105::weight::1
imageslibs::127.0.0.1:5105::region::
imageslibs::127.0.0.1:5105::zone::
imageslibs::127.0.0.1:5105::sub_zone::
imageslibs::127.0.0.1:5105::canary::false
imageslibs::127.0.0.1:5105::priority::0
imageslibs::127.0.0.1:5105::success_rate::-1.0
imageslibs::127.0.0.1:5105::local_origin_success_rate::-1.0
secure_imageslibs::default_priority::max_connections::1024
secure_imageslibs::default_priority::max_pending_requests::1024
secure_imageslibs::default_priority::max_requests::1024
secure_imageslibs::default_priority::max_retries::3
secure_imageslibs::high_priority::max_connections::1024
secure_imageslibs::high_priority::max_pending_requests::1024
secure_imageslibs::high_priority::max_requests::1024
secure_imageslibs::high_priority::max_retries::3
secure_imageslibs::added_via_api::false
secure_imageslibs::127.0.0.1:9105::cx_active::0
secure_imageslibs::127.0.0.1:9105::cx_connect_fail::0
secure_imageslibs::127.0.0.1:9105::cx_total::0
secure_imageslibs::127.0.0.1:9105::rq_active::0
secure_imageslibs::127.0.0.1:9105::rq_error::0
secure_imageslibs::127.0.0.1:9105::rq_success::0
secure_imageslibs::127.0.0.1:9105::rq_timeout::0
secure_imageslibs::127.0.0.1:9105::rq_total::0
secure_imageslibs::127.0.0.1:9105::hostname::127.0.0.1
secure_imageslibs::127.0.0.1:9105::health_flags::healthy
secure_imageslibs::127.0.0.1:9105::weight::1
secure_imageslibs::127.0.0.1:9105::region::
secure_imageslibs::127.0.0.1:9105::zone::
secure_imageslibs::127.0.0.1:9105::sub_zone::
secure_imageslibs::127.0.0.1:9105::canary::false
secure_imageslibs::127.0.0.1:9105::priority::0
secure_imageslibs::127.0.0.1:9105::success_rate::-1.0
secure_imageslibs::127.0.0.1:9105::local_origin_success_rate::-1.0
imagesservice::default_priority::max_connections::1024
imagesservice::default_priority::max_pending_requests::1024
imagesservice::default_priority::max_requests::1024
imagesservice::default_priority::max_retries::3
imagesservice::high_priority::max_connections::1024
imagesservice::high_priority::max_pending_requests::1024
imagesservice::high_priority::max_requests::1024
imagesservice::high_priority::max_retries::3
imagesservice::added_via_api::false
imagesservice::127.0.0.1:5104::cx_active::0
imagesservice::127.0.0.1:5104::cx_connect_fail::1
imagesservice::127.0.0.1:5104::cx_total::1
imagesservice::127.0.0.1:5104::rq_active::0
imagesservice::127.0.0.1:5104::rq_error::1
imagesservice::127.0.0.1:5104::rq_success::0
imagesservice::127.0.0.1:5104::rq_timeout::0
imagesservice::127.0.0.1:5104::rq_total::0
imagesservice::127.0.0.1:5104::hostname::127.0.0.1
imagesservice::127.0.0.1:5104::health_flags::healthy
imagesservice::127.0.0.1:5104::weight::1
imagesservice::127.0.0.1:5104::region::
imagesservice::127.0.0.1:5104::zone::
imagesservice::127.0.0.1:5104::sub_zone::
imagesservice::127.0.0.1:5104::canary::false
imagesservice::127.0.0.1:5104::priority::0
imagesservice::127.0.0.1:5104::success_rate::-1.0
imagesservice::127.0.0.1:5104::local_origin_success_rate::-1.0
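
The interesting rows in this output are the cx_connect_fail counters. As a minimal sketch, assuming the cluster::host::stat::value line layout shown above, the following hypothetical helper pulls out the hosts with at least one failed connection attempt. It splits on "::", so it would need adjusting for IPv6 host addresses.

```python
def connect_failures(clusters_text):
    """Map (cluster, host) to its cx_connect_fail count, keeping only
    hosts where the count is greater than zero."""
    failures = {}
    for line in clusters_text.splitlines():
        parts = line.strip().split("::")
        if len(parts) == 4 and parts[2] == "cx_connect_fail":
            count = int(parts[3])
            if count > 0:
                failures[(parts[0], parts[1])] = count
    return failures


# A few lines copied from the output above.
sample = """\
imageslibs::127.0.0.1:5105::cx_connect_fail::2
imageslibs::127.0.0.1:5105::cx_total::2
secure_imageslibs::127.0.0.1:9105::cx_connect_fail::0
imagesservice::127.0.0.1:5104::cx_connect_fail::1
"""

print(connect_failures(sample))
# prints {('imageslibs', '127.0.0.1:5105'): 2, ('imagesservice', '127.0.0.1:5104'): 1}
```

Here every connection attempt to 127.0.0.1:5105 and 127.0.0.1:5104 failed (cx_connect_fail equals cx_total), which matches the connection failure reported in the log.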

Source

Bug description

We are seeing an upstream connect error with a 503 on istio-envoy when trying to access a context path on a Jetty/JDK 11 based service running inside our GKE On-Prem Kubernetes cluster.
Previously SSL connections were working fine when running on both Tomcat JDK 9 and Jetty JDK 9 based container images (and were being terminated at the proxy level).

We are seeing the following error in the istio-proxy sidecar logs on the service:

{
  "original_message": "[Envoy (Epoch 0)] [2020-03-25 03:50:01.631][19][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:91] gRPC config stream closed: 13, "
}

We also have an istio-ingressgateway-external defined in our istio-system namespace and are using the Big-IP Kubernetes controller to map all of the services to our Virtual F5 Loadbalancer. We see the following when enabling tracing on the istio-ingressgateway-external pod:

[Envoy (Epoch 0)] [2020-03-25 19:23:10.048][27][debug][pool] [external/envoy/source/common/http/http1/conn_pool.cc:136] [C990786] client disconnected, failure reason: TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER
[Envoy (Epoch 0)] [2020-03-25 19:23:10.048][27][debug][pool] [external/envoy/source/common/http/http1/conn_pool.cc:167] [C990786] purge pending, failure reason: TLS error: 268435703:SSL routines:OPENSSL_internal:WRONG_VERSION_NUMBER

Response headers:

content-length: 91
content-type: text/plain
date: Wed, 25 Mar 2020 21:02:40 GMT
server: istio-envoy
status: 503

Body:

upstream connect error or disconnect/reset before headers. reset reason: connection failure

Expected behavior

HTTP 200 response.

Steps to reproduce the bug

Create a Gateway and VirtualService with a rule to route traffic based on context path to a Jetty Based Spring Boot Application running on JDK 11 (Kubernetes service and deployment). Create a corresponding Destination Rule of ISTIO_MUTUAL (mTLS). Create the following network policy at the namespace level:

- apiVersion: authentication.istio.io/v1alpha1
  kind: Policy
  metadata:
    annotations:
      generation: 1
      name: mtls
      namespace: application-namespace
  spec:
    peers:
    - mtls: {}
    targets:
    - name: jetty-services
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

Version (include the output of istioctl version --remote and kubectl version and helm version if you used Helm)

istioctl version --remote
client version: 1.4.6-asm.0
control plane version: 1.4.6-asm.0
data plane version: 1.4.6-asm.0 (22 proxies), 1.3.3 (3 proxies)

kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.8", GitCommit:"a89f8c11a5f4f132503edbc4918c98518fd504e3", GitTreeState:"clean", BuildDate:"2019-04-23T04:52:31Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.7-gke.24", GitCommit:"a9c61ae0b1b75106e960849079e243e8a054b580", GitTreeState:"clean", BuildDate:"2019-11-15T22:28:40Z", GoVersion:"go1.12.11b4", Compiler:"gc", Platform:"linux/amd64"}

Using jetty 9.4.19.v20190610 and OpenJDK 11 OpenJ9 based base image (adoptopenjdk/openjdk11-openj9:x86_64-alpine-jdk-11.0.6_10_openj9-0.18.1-slim).

How was Istio installed?

Istio 1.4.6-asm.0 was installed using istioctl manifest.

Environment where bug was observed (cloud vendor, OS, etc)
GKE On-Prem running on VMWare/vCenter, k8s nodes running Ubuntu 18.04.3 LTS

Source

This page describes how to troubleshoot errors that you receive in a response
from a request to your API.

`BAD_GATEWAY`

If you receive error code 13 and the message BAD_GATEWAY, this indicates
that the Extensible Service Proxy (ESP) can’t reach the service’s backend.
Check the following:

- Make sure the backend service is running. How you do that depends on
  the backend.
  - For the App Engine flexible environment, the error code for the
    BAD_GATEWAY message might be 502. See the
    Errors specific to App Engine flexible environment
    section for more information.
  - For Compute Engine, see
    Troubleshooting Cloud Endpoints on Compute Engine
    for details.
  - For GKE, you need to use SSH to access the pod and use
    curl. See
    Troubleshooting Endpoints in Google Kubernetes Engine
    for details.
- Make sure the correct IP address and port of the backend service are specified:
  - For GKE, check the ESP --backend flag value
    (the short option is -a) in your deployment manifest file (often
    called deployment.yaml).
  - For Compute Engine, check the ESP --backend flag value
    (the short option is -a) in the docker run command.

`reset reason: connection failure`

If you receive HTTP code 503 or gRPC code 14 and the message "upstream connect error or disconnect/reset before headers. reset reason: connection failure", this indicates
that ESPv2 can’t reach the service’s backend.

To troubleshoot, double check the items below.

Backend Address

ESPv2 should be configured with the correct backend address. Common issues include:

The scheme of the backend address should match the backend application type.
OpenAPI backends should be http:// and gRPC backends should be grpc://.
For ESPv2 deployed on Cloud Run, the scheme of the backend address should be either https:// or grpcs://.
The s tells ESPv2 to set up TLS with the backend.
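The scheme rules above can be sketched as a small pre-deployment check. This is a minimal illustration, not part of ESPv2: the function name `check_backend_address` and the `on_cloud_run` parameter are hypothetical, and the check only looks at the scheme, not at whether the address is reachable.

```python
from urllib.parse import urlsplit

# Schemes named in the text above; the TLS variants end in "s".
PLAIN_SCHEMES = {"http", "grpc"}
TLS_SCHEMES = {"https", "grpcs"}


def check_backend_address(address: str, on_cloud_run: bool = False) -> list:
    """Return a list of human-readable problems found in a backend address."""
    problems = []
    scheme = urlsplit(address).scheme.lower()
    if scheme not in PLAIN_SCHEMES | TLS_SCHEMES:
        problems.append(f"unexpected scheme {scheme!r}")
    elif on_cloud_run and scheme not in TLS_SCHEMES:
        # On Cloud Run the text above expects https:// or grpcs://.
        problems.append(f"{scheme}:// has no TLS; expected https:// or grpcs://")
    return problems
```

An empty list means the scheme looks consistent with the rules above. Whether the scheme also matches the backend application type (OpenAPI or gRPC) still has to be checked by hand.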

DNS Lookup

By default, ESPv2 attempts to resolve domain names to IPv6 addresses.
If the IPv6 resolution fails, ESPv2 falls back to IPv4 addresses.

For some networks, the fallback mechanism may not work as intended.
Instead, you can force ESPv2 to use IPv4 addresses via the
--backend_dns_lookup_family flag.

This error is common if you configure a Serverless VPC Connector
for ESPv2 deployed on Cloud Run. VPCs do not support IPv6 traffic.
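To see which address families a backend hostname actually resolves to from a given machine, a short script along these lines may help. It is a sketch that uses only Python's standard `socket` module. The helper name `resolve` is hypothetical, and the result reflects the resolver of the machine running the script, which may differ from what ESPv2 sees inside its own network.

```python
import socket


def resolve(host: str, family: int) -> list:
    """Return the addresses `host` resolves to for one address family."""
    try:
        infos = socket.getaddrinfo(host, None, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # no records for this family, or the lookup failed
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

For example, compare `resolve(host, socket.AF_INET6)` with `resolve(host, socket.AF_INET)` for your backend's hostname. If the hostname has IPv6 records but the network (such as a VPC) can't carry IPv6 traffic, forcing IPv4 with the --backend_dns_lookup_family flag is the likely fix.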

`API is not enabled for the project`

If you sent an API key in the request, an error message like "API
my-api.endpoints.example-project-12345.cloud.goog is not enabled for the
project" indicates that the API key was created in a different Google Cloud
project than the API. To fix this, you can either create the API key
in the same Google Cloud project that the API is associated with, or you
can enable the API in the Google Cloud project that
the API key was created in.

`Service control request failed with HTTP response code 403`

If you receive error code 14 and the message Service control request failed with HTTP response code 403, this indicates that the Service Control API
(servicecontrol.googleapis.com) isn’t enabled on the project.

See Checking required services
to make sure that all the services that Endpoints and
ESP require are enabled on your project.
See Checking required permissions
to make sure that all the required permissions are granted to the service account associated with the instance running ESP.

`Method doesn't allow unregistered callers`

ESP responds with the error,
Method doesn't allow unregistered callers, when you have specified an API key
in the security section in your OpenAPI document, but the request to your API
doesn’t have an API key assigned to a query parameter named key.

If you need to generate an API key to make calls to your API, see
Creating an API key.
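As an illustration of where the key goes, the following sketch appends it to a request URL as the `key` query parameter. The helper name `with_api_key` is hypothetical, and the key value in the usage note is a placeholder, not a real key.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def with_api_key(url: str, api_key: str) -> str:
    """Return `url` with the API key set as the `key` query parameter."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    query["key"] = [api_key]  # replace any existing key value
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))
```

Calling `with_api_key("https://example-project-12345.appspot.com/echo", "PLACEHOLDER_KEY")` yields a URL ending in `/echo?key=PLACEHOLDER_KEY`.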

`Method does not exist`

The response, Method does not exist, means that the HTTP method
(GET, POST, or other) on the specified URL path wasn’t found. To
troubleshoot, compare the service configuration that you have deployed with
your request to make sure that the method name and URL path that you are
sending match:

In the Google Cloud console, go to the Endpoints Services page for
your project.

Go to the Endpoints Services page
If you have more than one API, select the API that you sent the request to.
Click the Deployment history tab.
Select the latest deployment to see the service configuration.

If you don’t see the method you are calling specified in the paths section
of your OpenAPI document, either add the method, or add the x-google-allow
flag at the top level of the file:

x-google-allow: all

This flag means that you can avoid listing all methods supported in your backend
in your OpenAPI document. When all is used, all calls—with or without an
API key or user authentication—pass through ESP to your
API. See
x-google-allow
for more information.
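The comparison described above can also be done offline against a copy of your OpenAPI document once it is loaded into a dictionary. This is a rough sketch: the function name `is_method_declared` is hypothetical, and it matches paths by exact string only, so templated paths such as `/items/{id}` need to be passed in their templated form.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch"}


def is_method_declared(openapi_doc: dict, http_method: str, path: str) -> bool:
    """Return True if the document lists `http_method` under `path`, or allows all."""
    # With x-google-allow: all, methods don't need to be listed individually.
    if openapi_doc.get("x-google-allow") == "all":
        return True
    operations = openapi_doc.get("paths", {}).get(path, {})
    return http_method.lower() in HTTP_METHODS & set(operations)
```

If this returns False for the request you are sending, either the method is missing from the paths section or the URL path doesn't match the one in the document.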

Errors specific to the App Engine flexible environment

This section describes error responses from APIs deployed on the
App Engine flexible environment.

Error code `502` or `503`

App Engine may take a few minutes to respond successfully to requests.
If you send a request and get back an HTTP 502, 503, or some other server
error, wait a minute and try the request again.
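The "wait and try again" advice can be automated with a small retry loop. The sketch below is an assumption-laden illustration: `call_with_retry` and its parameters are hypothetical names, `send` stands for whatever function issues your request and returns the HTTP status code, and the one-minute default delay simply mirrors the text above.

```python
import time

RETRYABLE = {502, 503}


def call_with_retry(send, attempts=5, delay_seconds=60.0, sleep=time.sleep):
    """Call `send()` until it returns a non-retryable status or attempts run out.

    `send` is any zero-argument callable returning an HTTP status code.
    """
    status = send()
    for _ in range(attempts - 1):
        if status not in RETRYABLE:
            break
        sleep(delay_seconds)  # give App Engine time to start responding
        status = send()
    return status
```

The `sleep` parameter exists only so the delay can be replaced in tests. If the status is still 502 or 503 after all attempts, continue with the BAD_GATEWAY troubleshooting steps.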

Error message `BAD_GATEWAY`

An error code 502 with BAD_GATEWAY in the message usually indicates that
App Engine terminated the application because it ran out of memory.
The default App Engine flexible VM only has 1GB of memory, with only
600MB available for the application container.

To troubleshoot error code 502:

In the Google Cloud console, go to the Logging page:

Go to the Logs Explorer page
Select the applicable Google Cloud project at the top of the page.
Select Google App Engine Application and open vm.syslog.

Look for a log entry similar to the following:

kernel: [  133.706951] Out of memory: Kill process 4490 (java) score 878 or sacrifice child
kernel: [  133.714468] Killed process 4306 (java) total-vm:5332376kB, anon-rss:2712108kB, file-rss:0kB
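If you have exported the log to a text file, entries like the ones above can be located with a short script. This is a sketch under the assumption that the entries follow the same wording as the sample. The name `find_oom_kills` is hypothetical.

```python
import re

# Matches kernel lines such as:
# "Out of memory: Kill process 4490 (java) score 878 or sacrifice child"
OOM_PATTERN = re.compile(
    r"Out of memory: Kill process (?P<pid>\d+) \((?P<name>[^)]+)\)"
)


def find_oom_kills(log_text: str) -> list:
    """Return (pid, process name) for each 'Out of memory' kernel entry."""
    return [
        (int(m.group("pid")), m.group("name"))
        for m in OOM_PATTERN.finditer(log_text)
    ]
```

A non-empty result suggests the application was terminated for running out of memory.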

If you see an Out of memory entry in the log:

Add the following to your app.yaml file to increase the size of the
default VM:
```
resources:
  memory_gb: 4
```
Redeploy your API:
```
gcloud app deploy
```

If you have the rollout_strategy: managed option specified in the
endpoints_api_service section of the app.yaml file, use the following command
to redeploy your API:

  gcloud app deploy

See
Deploying your API and ESP
for more information.

Checking the Cloud Logging logs

To use the Cloud Logging logs to help troubleshoot response errors:

In the Google Cloud console, go to the Logging page.

Go to the Logs Explorer page
At the top of the page, select the Google Cloud project.
Using the drop-down menu on the left, select Produced API >
[YOUR_SERVICE_NAME].
Adjust the time range until you see a row that shows your response error.
Expand the JSON payload and look for error_cause.
- If the error_cause is set to application, this indicates an issue in
  your code.
- If the error_cause is anything else and you are unable to fix the issue,
  export the log
  and include it in any communication that you have with Google.
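For exported log entries, the same check can be scripted. The sketch below assumes the entry is a JSON object with the field under `jsonPayload` and falls back to the top level if that key is missing. The function name `error_cause` is hypothetical, and the exact layout of your exported entries may differ.

```python
import json


def error_cause(log_entry_json: str):
    """Return the `error_cause` value from a log entry, or None if absent."""
    entry = json.loads(log_entry_json)
    # Assumption: the field sits under "jsonPayload"; fall back to the top level.
    payload = entry.get("jsonPayload", entry)
    return payload.get("error_cause")
```

A return value of `"application"` points to an issue in your own code, as described above.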

See the following for more information:

- For details on the structure of the logs in the Logs Explorer, see the
  Endpoints logs reference.
- Get started using the Logs Explorer.
- Use Advanced log queries
  for advanced filtering, such as getting all requests with a latency greater
  than 300 milliseconds.

Issues with the example Invoke-WebRequest

In some versions of Windows PowerShell, the example Invoke-WebRequest in the
tutorials
fails. We have also received a report that the response contained a list of
unsigned bytes that had to be converted to characters. If the example
Invoke-WebRequest didn’t return the expected result, try sending the request
using another application. Following are a few suggestions:

Start Cloud Shell
and follow the Linux steps in the tutorial that you were using to send the
request.
Install a third-party application, such as the Chrome browser extension Postman
(offered by www.getpostman.com). When creating the request in Postman:
- Select POST as the HTTP verb.
- For the header, select the key content-type and the value
  application/json.
- For the body, enter: {"message":"hello world"}
- In the URL, use the actual API key rather than the environment variable.
  For example:
  - On the App Engine flexible environment: https://example-project-12345.appspot.com/echo?key=AIza...
  - On other backends: http://192.0.2.0:80/echo?key=AIza...
Download and install curl, which you
run in the command prompt. Because Windows doesn’t handle double quotation
marks nested inside single quotation marks, you have to change the --data
option in the example to:
```
--data "{\"message\":\"hello world\"}"
```
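If Python is available, the same request can also be built with the standard library, which sidesteps shell quoting altogether. This is a sketch only: the function name `build_echo_request` is hypothetical, and it mirrors the POST verb, content-type header, and body listed in the Postman steps above.

```python
import json
import urllib.request


def build_echo_request(url: str) -> urllib.request.Request:
    """Build a POST request with the JSON body used in the tutorials."""
    body = json.dumps({"message": "hello world"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen`, using a URL that contains your actual API key in the `key` query parameter.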
