We have a GitLab pipeline that rolls out releases to a Kubernetes cluster. The connection to GitLab is not always optimal, so it can happen that a deployment fails because of a timeout or network problems. When that happens, the next pipeline run also fails, because helm reports the error:
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
helm history tax-service
REVISION  UPDATED                   STATUS           CHART       APP VERSION     DESCRIPTION
1         Mon Apr 19 10:31:54 2021  superseded       helm-1.0.0  1.0.0-ddc8267b  Upgrade complete
2         Mon Apr 19 10:52:33 2021  superseded       helm-1.0.0  1.0.0-522eebc2  Upgrade complete
3         Mon Apr 19 10:54:33 2021  pending-upgrade  helm-1.0.0  1.0.0-39118b96  Preparing upgrade
Roll back:
helm rollback tax-service 1
REVISION  UPDATED                   STATUS      CHART       APP VERSION     DESCRIPTION
4         Mon Apr 19 11:31:54 2021  superseded  helm-1.0.0  1.0.0-ddc8267b  Rollback to 1
5         Mon Apr 19 11:32:12 2021  deployed    helm-1.0.0  1.0.0-fb2a5654  Upgrade complete
Check:
helm history tax-service
REVISION  UPDATED                   STATUS           CHART       APP VERSION     DESCRIPTION
1         Mon Apr 19 10:31:54 2021  superseded       helm-1.0.0  1.0.0-ddc8267b  Upgrade complete
2         Mon Apr 19 10:52:33 2021  superseded       helm-1.0.0  1.0.0-522eebc2  Upgrade complete
3         Mon Apr 19 10:54:33 2021  pending-upgrade  helm-1.0.0  1.0.0-39118b96  Preparing upgrade
4         Mon Apr 19 11:31:54 2021  superseded       helm-1.0.0  1.0.0-ddc8267b  Rollback to 1
5         Mon Apr 19 11:32:12 2021  deployed         helm-1.0.0  1.0.0-fb2a5654  Upgrade complete
After that, you can verify the release and re-run the pipeline.
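To avoid fixing this by hand every time, the same check can run at the start of the deploy job. A minimal sketch, assuming jq is available in the job image and using the tax-service release from above as a placeholder:

# roll back automatically if the previous run left the release in a pending state
STATUS=$(helm status tax-service -o json | jq -r '.info.status')
case "$STATUS" in
  pending-install|pending-upgrade|pending-rollback)
    helm rollback tax-service   # no revision given: roll back to the previous one
    ;;
esac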
After upgrading from Helm 3.3 to Helm 3.4, existing charts started failing the upgrade with the message:
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
At the same time, in helm list -n myns the chart disappeared and didn't show up in the list at all.
This is a chart that has been upgraded over 800 times successfully; the only change was the Helm version bump. The chart failed twice in an attempt to deploy with the command:
helm upgrade --install --namespace myns --timeout 1800s --atomic mychart charts/app/standalone --values values-override.yaml
Once I rolled back to 3.3 I was able to upgrade the chart successfully.
Output of helm version:
version.BuildInfo{Version:"v3.4.0", GitCommit:"7090a89efc8a18f3d8178bf47d2462450349a004", GitTreeState:"dirty", GoVersion:"go1.15.3"}
Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.13", GitCommit:"39a145ca3413079bcb9c80846488786fed5fe1cb", GitTreeState:"clean", BuildDate:"2020-07-15T16:18:19Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.13-gke.401", GitCommit:"eb94c181eea5290e9da1238db02cfef263542f5f", GitTreeState:"clean", BuildDate:"2020-09-09T00:57:35Z", GoVersion:"go1.13.9b4", Compiler:"gc", Platform:"linux/amd64"}
Cloud Provider/Platform (AKS, GKE, Minikube etc.): GKE
Hi @jlcrow.
Please provide a test case or some way for the maintainers to test and reproduce the error you are experiencing. This helps us better understand the issue and determine whether there's a fix we can apply to the next release. In most cases, a simple set of steps to reproduce the issue on our end can help expedite the process.
Thanks.
@jlcrow Yikes. While I wouldn't ask you to do this in production, could you do an upgrade with the --debug and -v 6 flags set? This will provide more detailed output that can help us find the problem.
Thanks @mattfarina, I’m pushing up a branch for deployment and will test with both versions of helm to see if I can reproduce
I have the same issues. I’ll try to get a debug output if possible.
I noticed this issue usually happens when you upgrade a helm chart with --wait and the upgrade clearly fails (like a CrashLoopBackOff or something like that): helm waits until it reaches the timeout, but the user presses CTRL+C before the timeout is reached. After that I get the same error as posted above:
STDERR:
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
I'm using helmfile, not helm directly; maybe it's a separate problem with helmfile not sending the SIGTERM correctly.
Hi @bacongobbler, the described workaround did indeed work!
$ helm history kyc-api
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
1 Mon Nov 9 14:57:36 2020 pending-install generic-base-0.2.1 0.1.0 Initial install underway
$ helm rollback kyc-api 1
Rollback was a success! Happy Helming!
$ helm history kyc-api
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
1 Mon Nov 9 14:57:36 2020 pending-install generic-base-0.2.1 0.1.0 Initial install underway
2 Mon Nov 9 15:06:15 2020 deployed generic-base-0.2.1 0.1.0 Rollback to 1
Looking at the werf/helm PR pretty much confirms that CTRL+C breaks the helm installation on 3.4.0.
Thanks for following up. Closing as a duplicate of #4558.
@bacongobbler I don’t understand how this was determined to be the same issue? There’s no ctrl+c running in my pipeline.
I still think there is a regression in 3.4.0 since I never had this issue before.
I’ve tried to reproduce this in a branch of my project and have been unable to at this point.
Okay. Have either of you been able to reproduce this issue?
I know that this particular error can occur when the previous release is in a PENDING_UPGRADE or some other transitional state — usually occurs due to a timeout or someone interrupting the upgrade midway through (hence why I linked to #4558). I have not heard of a regression in Helm 3.4.0 causing an upgrade to enter a transitional state other than through the ticket I linked earlier… Which isn’t isolated to just Helm 3.4.
@bacongobbler I haven't had luck reproducing the issue. I've just bumped to 3.4.1 and upgraded the same deployment that previously failed under 3.4.0; I will assume the issue is resolved unless I see something else. Thanks for everything.
@bacongobbler I ran into this again this morning in one of our pipelines and did confirm that it appears to be a duplicate of https://github.com/helm/helm/issues/4558. It seems to be a bad-timing issue where the pipeline kills the previous deployment while it's in the middle of a deploy; I'm surprised we haven't run into this before. I confirmed the workaround was successful: I rolled back to the last successful deployment in the history, followed up with the new deployment, and everything is back to normal.
Run 'helm history --all'; the job is probably pending, and you'll have to roll back to the last successful deployment.
On Nov 29, 2020, at 5:39 PM, Victor Login notifications@github.com wrote:
My output: helm upgrade shortlink-api ops/Helm/shortlink-api --install --wait --namespace=shortlink --set deploy.image.tag=0.7.0.16 --debug -v 6
history.go:53: [debug] getting history for release shortlink-api
upgrade.go:121: [debug] preparing upgrade for shortlink-api
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
helm.go:81: [debug] another operation (install/upgrade/rollback) is in progress
helm.sh/helm/v3/pkg/action.init
/home/circleci/helm.sh/helm/pkg/action/action.go:62
runtime.doInit
/usr/local/go/src/runtime/proc.go:5474
runtime.doInit
/usr/local/go/src/runtime/proc.go:5469
runtime.main
/usr/local/go/src/runtime/proc.go:190
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1373
UPGRADE FAILED
main.newUpgradeCmd.func2
/home/circleci/helm.sh/helm/cmd/helm/upgrade.go:156
github.com/spf13/cobra.(Command).execute
/go/pkg/mod/github.com/spf13/[email protected]/command.go:842
github.com/spf13/cobra.(Command).ExecuteC
/go/pkg/mod/github.com/spf13/[email protected]/command.go:950
github.com/spf13/cobra.(*Command).Execute
/go/pkg/mod/github.com/spf13/[email protected]/command.go:887
main.main
/home/circleci/helm.sh/helm/cmd/helm/helm.go:80
runtime.main
/usr/local/go/src/runtime/proc.go:203
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1373
helm ls --all
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
GitLab CI Pipeline Job: https://gitlab.com/shortlink-org/shortlink/-/jobs/879030237
Why is this issue still closed? Repro, on an empty cluster:
helm upgrade --install --version=3.13.0 --create-namespace --namespace ingress-nginx-2 --set controller.kind=DaemonSet --set controller.service.type=LoadBalancer --set controller.service.loadBalancerIP=127.0.0.1 ingress-nginx-2 ingress-nginx/ingress-nginx
and ^C it
I can reproduce with helm upgrade --install --atomic and interrupting it during the execution.
The second run will always return an error:
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
Can be solved by:
helm rollback
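For reference, helm rollback needs at least the release name; a hedged example (release and namespace are placeholders):

helm rollback <release> -n <namespace>               # without a revision it rolls back to the previous release
helm rollback <release> <revision> -n <namespace>    # or pick a known-good revision from helm history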
I guess this proposal could solve this issue https://github.com/helm/helm/issues/8040
I also have the same issue described in this thread with v3.4.1. helm rollback can be a workaround for development machines, but it is unacceptable in production CI/CD pipelines.
When the helm CLI receives a SIGTERM signal, it should exit gracefully, leaving the Helm release labels in a stable state and allowing further deployments without issues. The issue is not fixed yet and should be reopened for further research.
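One mitigation on the CI side (an assumption about a typical pipeline, not something confirmed in this thread) is to keep Helm's own --timeout well below the job or cancellation timeout, so a slow rollout fails inside Helm and --atomic rolls it back, instead of the runner killing the process mid-write and leaving the release in a pending state:

helm upgrade --install <release> <chart> -n <namespace> --wait --atomic --timeout 300s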
"Fixed" by deleting all helm secrets.
You don't need to delete all Helm secrets, only the last one. It sounds like a workaround, though, not a fix. On v3.3.4 this case was handled fine (see the attached picture). We use Helm in GitLab CI, and job cancellation became a problem after upgrading to 3.4.0.
v3.5.1 has the same issue too.
I am having the same issue without the chart even being initially present.
▶ helm3 ls -A | grep -i cert
▶ helm3 upgrade cert-manager --install --set installCRDs=true --namespace extra-services --version v1.1.0 jetstack/cert-manager
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
The cert-manager chart (the one I am trying to install) does not even exist on my cluster.
Using
▶ helm3 version --short
v3.4.2+g23dd3af
I just tried it with
▶ helm3 version --short
v3.5.1+g32c2223
The end result is the same.
Why was this issue closed?
The OP determined his issue was a duplicate of #4558. As #4558 describes, there are a few cases where a helm upgrade can enter the PENDING_UPGRADE state in the event of a timeout. A helm rollback && helm upgrade resolves the issue; hence it was closed as a duplicate of #4558 (the symptoms and the workaround are identical).
If you do not believe you are experiencing the same issue as the OP, please open a new ticket.
The problem is that helm rollback && helm upgrade is not a suitable solution for production deployments.
Experienced this on Helm v3.5.2, caused by CTRL+C pressed during an upgrade.
Workaround: kubectl delete secret sh.helm.release.v1.<RELEASE_NAME>.v<LATEST_REVISION>
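If you are not sure which secret that is, the release secrets carry Helm's own labels, so you can list them first. This assumes the default Secrets storage backend; <namespace> and <RELEASE_NAME> are placeholders:

kubectl get secrets -n <namespace> -l owner=helm,name=<RELEASE_NAME> --show-labels
# the revision stuck in a pending state shows status=pending-upgrade (or pending-install) in its labels;
# delete only that secret, as above:
kubectl delete secret -n <namespace> sh.helm.release.v1.<RELEASE_NAME>.v<LATEST_REVISION>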
I've seen this for helm installs too, so there is no working history revision one could roll back to. The issue was a dying k8s API: the helm-controller of flux2 was dying while installing the chart, which likely has the same effect as Ctrl+C.
Had this issue because I tried to cancel a helm deployment from the command line. The workaround suggested by @Skaronator using a rollback got me past the error. helm history looks like this now:
35 Thu Mar 18 10:09:15 2021 superseded wordpress-0.1.6 5.4.2 Upgrade complete
36 Thu Mar 18 10:32:16 2021 superseded wordpress-0.1.6 5.4.2 Rollback to 34
37 Thu Mar 18 10:42:48 2021 pending-upgrade wordpress-0.1.6 5.4.2 Preparing upgrade
38 Thu Mar 18 10:48:11 2021 superseded wordpress-0.1.6 5.4.2 Rollback to 36
39 Thu Mar 18 10:49:02 2021 deployed wordpress-0.1.6 5.4.2 Upgrade complete
I tried to run a Helm upgrade before running helm repo update, and now the release seems to be permanently stuck in "STATUS: pending-upgrade" and won't let me run the upgrade again.
Trying to run:
helm upgrade --namespace coder --install --force --atomic --wait --version 1.13.2 --values ./coder.yaml coder coder/coder
outputs:
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
People also ask
How do you abort a Helm upgrade?
There is no way of stopping it without restarting helm-controller with kubectl -n flux-system delete po helm-controller-xxxx. Please create an issue for this feature request in the https://github.com/fluxcd/helm-controller repository.
How do you fix error upgrade failed another operation install upgrade rollback is in progress?
This error can happen for a few reasons, but it most commonly occurs when there is an interruption during the upgrade/install process, as already mentioned. To fix it, you may need to first roll back to another version, then reinstall or run helm upgrade again.
How do I force delete Helm release?
If you need to uninstall the deployed release, run the delete command on the Helm command line. The command removes all the Kubernetes components that are associated with the chart and deletes the release.
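For completeness, a minimal example of that delete command (release and namespace are placeholders):

helm uninstall <release> -n <namespace>                  # 'helm delete' is an alias
helm uninstall <release> -n <namespace> --keep-history   # same, but keep the release history records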
4 Answers
TL;DR: You need to roll back to another version first and then helm upgrade again:
helm rollback <release> <revision> --namespace <namespace>
This can happen for a few reasons, but it ultimately occurs when there's an interruption during the upgrade/install process. Commonly, you interrupt the process with Ctrl+C (which sends SIGINT) while the deployment is ongoing.
You'll notice that if you run helm ls --namespace <namespace> while it's stuck in the STATUS: pending-upgrade state, you'll see the following without any other information:
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
The best workaround currently is to roll back to another version, and then helm upgrade again:
helm rollback <release> <revision> --namespace <namespace>
The revision is optional, but you should try to provide it.
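If you are unsure which revision to target, one way to pick the most recent revision that actually completed (this assumes jq is installed; release and namespace are placeholders):

helm history <release> -n <namespace> -o json | jq '[.[] | select(.status == "deployed" or .status == "superseded")] | last | .revision'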
more resources:
- https://github.com/helm/helm/issues/8987
- https://github.com/helm/helm/issues/4558
This solution worked for me:
kubectl get secrets
kubectl delete secret sh.helm.release.v1.<RELEASE_NAME>.v<LATEST_REVISION>
Following the resolution described in this issue
In case it is useful to someone, and in response to explicitsoul's comment, what fixed it for me was just:
helm delete <release> -n <namespace>
That removed the pending install (in my case the first one, so I didn't have a previous release to roll back to) and then I was able to run the install again. What caused the stuck release in my case was a Ctrl+C cancelling the install command, so don't do that.
Here is what worked for me:
- Run helm list --all. This will list all the releases with their status:
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
rel1 default 1 2021-06-04 14:15:37.652066 +0530 IST deployed rel1-3.32.0 0.46.0
rel2 default 29 2021-06-18 11:02:38.779801 +0530 IST pending-upgrade rel2-0.0.1
rel3 default 3 2021-06-17 11:27:14.608042 +0530 IST deployed rel3-0.0.1
- Notice that rel2 has status pending-upgrade. This happened because I did a Ctrl+C while the upgrade was in progress.
- All I had to do was roll back to the previous revision, in this case 28:
helm rollback rel2 28 --namespace default
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
rel1 default 1 2021-06-04 14:15:37.652066 +0530 IST deployed rel1-3.32.0 0.46.0
rel2 default 30 2021-06-18 11:26:07.555547 +0530 IST deployed rel2-0.0.1
rel3 default 3 2021-06-17 11:27:14.608042 +0530 IST deployed rel3-0.0.1