rpc error: code = Canceled desc = context canceled

Terraform Version: GitHub Action hashicorp/terraform-github-actions/plan@v0.4.4, which uses the Docker image hashicorp/terraform:0.12.10. Terraform Configuration Files: https://github.com/ironPeakServices/infr...

Can confirm this also occurs locally. Relevant log:

2019/10/16 11:05:14 [TRACE] Executing graph transform *terraform.OrphanResourceCountTransformer
2019/10/16 11:05:14 [TRACE] Completed graph transform *terraform.OrphanResourceCountTransformer (no changes)
2019/10/16 11:05:14 [TRACE] Executing graph transform *terraform.AttachStateTransformer
2019/10/16 11:05:14 [DEBUG] Resource state not found for node "module.docker_master.module.docker.null_resource.swarm_cluster", instance module.docker_master.module.docker.null_resource.swarm_cluster
2019/10/16 11:05:14 [TRACE] Completed graph transform *terraform.AttachStateTransformer (no changes)
2019/10/16 11:05:14 [TRACE] Executing graph transform *terraform.TargetsTransformer
2019/10/16 11:05:14 [TRACE] Completed graph transform *terraform.TargetsTransformer (no changes)
2019/10/16 11:05:14 [TRACE] Executing graph transform *terraform.ReferenceTransformer
2019/10/16 11:05:14 [WARN] no schema for provisioner "remote-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "local-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "file" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "file" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "file" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "remote-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "file" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "file" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "remote-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "remote-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "remote-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "remote-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "remote-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "local-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "file" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "file" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "file" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "remote-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "file" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "file" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "remote-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "remote-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "remote-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [WARN] no schema for provisioner "remote-exec" is attached to module.docker_master.module.docker.null_resource.swarm_cluster, so provisioner block references cannot be detected
2019/10/16 11:05:14 [DEBUG] ReferenceTransformer: "module.docker_master.module.docker.null_resource.swarm_cluster" references: []
2019/10/16 11:05:14 [TRACE] Completed graph transform *terraform.ReferenceTransform
Error: rpc error: code = Unavailable desc = transport is closing
er (no changes)
2019/10/16 11:05:14 [TRACE] Executing graph transform *terraform.RootTransformer
2019/10/16 11:05:14 [TRACE] Completed graph transform *terraform.RootTransformer (no changes)
2019/10/16 11:05:14 [TRACE] vertex "module.docker_master.module.docker.null_resource.swarm_cluster": entering dynamic subgraph
2019/10/16 11:05:14 [TRACE] dag/walk: updating graph
2019/10/16 11:05:14 [TRACE] dag/walk: added new vertex: "module.docker_master.module.docker.null_resource.swarm_cluster"
2019/10/16 11:05:14 [TRACE] dag/walk: visiting "module.docker_master.module.docker.null_resource.swarm_cluster"
2019/10/16 11:05:14 [TRACE] vertex "module.docker_master.module.docker.null_resource.swarm_cluster": starting visit (*terraform.NodeRefreshableManagedResourceInstance)
2019/10/16 11:05:14 [TRACE] NodeRefreshableManagedResourceInstance: module.docker_master.module.docker.null_resource.swarm_cluster has no existing state to refresh
2019/10/16 11:05:14 [TRACE] vertex "module.docker_master.module.docker.null_resource.swarm_cluster": evaluating
2019/10/16 11:05:14 [TRACE] [walkRefresh] Entering eval tree: module.docker_master.module.docker.null_resource.swarm_cluster
2019/10/16 11:05:14 [TRACE] module.docker_master.module.docker: eval: *terraform.EvalSequence
2019/10/16 11:05:14 [TRACE] module.docker_master.module.docker: eval: *terraform.EvalGetProvider

2019-10-16T11:05:14.109+0200 [DEBUG] plugin: plugin process exited: path=/infrastructure/.terraform/plugins/darwin_amd64/terraform-provider-scaleway_v1.11.0_x4 pid=59952
2019-10-16T11:05:14.109+0200 [DEBUG] plugin: plugin exited
2019/10/16 11:05:14 [TRACE] [walkRefresh] Exiting eval tree: module.docker_master.module.docker.module.node.provider.scaleway (close)
2019/10/16 11:05:14 [TRACE] vertex "module.docker_master.module.docker.module.node.provider.scaleway (close)": visit complete
2019/10/16 11:05:14 [TRACE] module.docker_master.module.docker: eval: *terraform.EvalReadState
2019/10/16 11:05:14 [TRACE] EvalReadState: reading state for module.docker_master.module.docker.null_resource.swarm_cluster
2019/10/16 11:05:14 [TRACE] EvalReadState: no state present for module.docker_master.module.docker.null_resource.swarm_cluster
2019/10/16 11:05:14 [TRACE] module.docker_master.module.docker: eval: *terraform.EvalDiff
2019/10/16 11:05:14 [TRACE] Re-validating config for "module.docker_master.module.docker.null_resource.swarm_cluster"
2019/10/16 11:05:14 [TRACE] GRPCProvider: ValidateResourceTypeConfig
2019/10/16 11:05:14 [TRACE] GRPCProvider: PlanResourceChange
2019/10/16 11:05:14 [WARN] Provider "null" produced an invalid plan for module.docker_master.module.docker.null_resource.swarm_cluster, but we are tolerating it because it is using the legacy plugin SDK.
    The following problems may be the cause of any confusing errors from downstream operations:
      - .triggers: planned value cty.UnknownVal(cty.Map(cty.String)) does not match config value cty.MapVal(map[string]cty.Value{"server_ids":cty.UnknownVal(cty.String)})
2019/10/16 11:05:14 [TRACE] module.docker_master.module.docker: eval: *terraform.EvalWriteState
2019/10/16 11:05:14 [TRACE] EvalWriteState: writing current state object for module.docker_master.module.docker.null_resource.swarm_cluster
2019/10/16 11:05:14 [TRACE] module.docker_master.module.docker: eval: *terraform.EvalWriteDiff
2019/10/16 11:05:14 [TRACE] EvalWriteDiff: recorded Create change for module.docker_master.module.docker.null_resource.swarm_cluster
2019/10/16 11:05:14 [TRACE] [walkRefresh] Exiting eval tree: module.docker_master.module.docker.null_resource.swarm_cluster
2019/10/16 11:05:14 [TRACE] vertex "module.docker_master.module.docker.null_resource.swarm_cluster": visit complete
2019/10/16 11:05:14 [TRACE] vertex "module.docker_master.module.docker.null_resource.swarm_cluster": dynamic subgraph completed successfully
2019/10/16 11:05:14 [TRACE] vertex "module.docker_master.module.docker.null_resource.swarm_cluster": visit complete
2019/10/16 11:05:14 [TRACE] dag/walk: visiting "provider.null (close)"
2019/10/16 11:05:14 [TRACE] vertex "provider.null (close)": starting visit (*terraform.graphNodeCloseProvider)
2019/10/16 11:05:14 [TRACE] vertex "provider.null (close)": evaluating
2019/10/16 11:05:14 [TRACE] [walkRefresh] Entering eval tree: provider.null (close)
2019/10/16 11:05:14 [TRACE] <root>: eval: *terraform.EvalCloseProvider
2019/10/16 11:05:14 [TRACE] GRPCProvider: Close
2019-10-16T11:05:14.115+0200 [DEBUG] plugin: plugin process exited: path=/infrastructure/.terraform/plugins/darwin_amd64/terraform-provider-null_v2.1.2_x4 pid=59943
2019-10-16T11:05:14.115+0200 [DEBUG] plugin: plugin exited
2019/10/16 11:05:14 [TRACE] [walkRefresh] Exiting eval tree: provider.null (close)
2019/10/16 11:05:14 [TRACE] vertex "provider.null (close)": visit complete
2019/10/16 11:05:14 [TRACE] dag/walk: upstream of "root" errored, so skipping
2019/10/16 11:05:14 [TRACE] statemgr.Filesystem: removing lock metadata file .terraform.tfstate.lock.info
2019/10/16 11:05:14 [TRACE] statemgr.Filesystem: unlocking terraform.tfstate using fcntl flock
2019-10-16T11:05:14.118+0200 [DEBUG] plugin: plugin exited

Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: authentication handshake failed: write unix ->/var/folders/z5/90ptswb91rsb7lb29jc3lwsm0000gn/T/plugin251160601: write: broken pipe"
Error: rpc error: code = Unavailable desc = transport is closing

Contents

  1. rpc error: code = Canceled desc = context canceled #10280
  2. Comments
  3. Terraform Version
  4. Affected Resource(s)
  5. Terraform Configuration Files
  6. Debug Output
  7. Expected Behavior
  8. Actual Behavior
  9. Steps to Reproduce
  10. Community Note
  11. Footer
  12. loki returned 500 and "rpc error: code = Canceled desc = context canceled" when handling large data query #3244
  13. Comments
  14. Error: rpc error: code = Canceled desc = context canceled #10084
  15. Comments

rpc error: code = Canceled desc = context canceled #10280

I'm seeing a bunch of err: rpc error: code = Canceled desc = context canceled. This repo plans fine with 0.11, but fails on 0.12. There are 3000 items in the terraform state.

Terraform Version

Affected Resource(s)

Terraform Configuration Files

Debug Output

Further output at:

Expected Behavior

Terraform planned correctly

Actual Behavior

Terraform errored with multiple err: rpc error: code = Canceled desc = context canceled messages and a timeout while waiting for plugin to start, which the debug output indicates is the aws plugin.

Steps to Reproduce

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

This is running with the official Docker image.

We ended up splitting this repo into 3, and this section no longer has problems.

Can we reopen this?

Is the problem the size?

I thought it was size, then eventually found that this came up again in one of the repos that had been split. The cause ended up being an extra set of [] being used in formatlist after the upgrader had run, but it required a lot of binary searching of resources to track down.
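
As a purely illustrative sketch (the resource names here are invented, not taken from the affected repository): in Terraform 0.12 a splat expression already yields a list, so wrapping it in an extra pair of brackets produces a list of lists, which formatlist rejects with a type error.

resource "null_resource" "node" {
  count = 2
}

output "ok" {
  # Correct: the splat expression is already a list of strings.
  value = formatlist("node-%s", null_resource.node.*.id)
}

# output "broken" {
#   # The extra [] turns the argument into a list of lists, which
#   # Terraform 0.12 rejects with a type error during plan.
#   value = formatlist("node-%s", [null_resource.node.*.id])
# }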

I think you should file a separate issue.

For the record, I think we had an issue with memory allocation (CI/CD tooling).

I’m going to lock this issue because it has been closed for 30 days ⏳ . This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

Source

loki returned 500 and "rpc error: code = Canceled desc = context canceled" when handling large data query #3244

Describe the bug
Loki logged "rpc error: code = Canceled desc = context canceled" when more than 1.8 GB of data was fetched and the query took more than one minute.

To Reproduce
Steps to reproduce the behavior:

  1. Started Loki 2.1
  2. Started Promtail (SHA or version) to tail '…'
  3. Query: '<>' term
    (can be referenced in the output at the bottom)

Expected behavior
Grafana can fetch and visualize a large amount of data (10+ GB?) from Loki

Environment:

Infrastructure: Loki and Grafana are deployed in Azure Kubernetes Service, with SSDs used for data storage

Deployment tool: Helm

Screenshots, Promtail config, or terminal output
If applicable, add any output to help explain your problem.

ts=2021-01-27T04:50:13.279487527Z caller=spanlogger.go:53 org_id=fake traceID=6d3a3617a9d10b65 method=query.Exec level=debug Ingester.TotalReached=2 Ingester.TotalChunksMatched=0 Ingester.TotalBatches=0 Ingester.TotalLinesSent=0 Ingester.HeadChunkBytes="0 B" Ingester.HeadChunkLines=0 Ingester.DecompressedBytes="0 B" Ingester.DecompressedLines=0 Ingester.CompressedBytes="0 B" Ingester.TotalDuplicates=0 Store.TotalChunksRef=1490 Store.TotalChunksDownloaded=800 Store.ChunksDownloadTime=137.986708ms Store.HeadChunkBytes="0 B" Store.HeadChunkLines=0 Store.DecompressedBytes="1.8 GB" Store.DecompressedLines=1349743 Store.CompressedBytes="174 MB" Store.TotalDuplicates=0
ts=2021-01-27T04:50:13.279600928Z caller=spanlogger.go:53 org_id=fake traceID=6d3a3617a9d10b65 method=query.Exec level=debug Summary.BytesProcessedPerSecond="26 MB" Summary.LinesProcessedPerSecond=19645 Summary.TotalBytesProcessed="1.8 GB" Summary.TotalLinesProcessed=1349743 Summary.ExecTime=1m8.703279889s
level=info ts=2021-01-27T04:50:13.279953831Z caller=metrics.go:83 org_id=fake traceID=6d3a3617a9d10b65 latency=slow query="(sum by (path) (sum_over_time(

Source

Error: rpc error: code = Canceled desc = context canceled #10084

Terraform version: 0.12.8

  • provider.archive v1.2.2
  • provider.aws v2.11.0
  • provider.null v2.1.2
  • provider.random v2.1.2
  • provider.template v2.1.2

This configuration generates errors during the 'terraform plan' command:

2019/09/12 11:02:20 [TRACE] Re-validating config for "module.alb_ecs_api_net.aws_alb.alb"
2019/09/12 11:02:20 [TRACE] GRPCProvider: ValidateResourceTypeConfig
2019-09-12T11:02:20.083+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: data.template_file.user_data_mon: Refreshing state.
data.template_file.user_data_rabbitmq: Refreshing state.

Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Unavailable desc = transport is closing
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled
Error: rpc error: code = Canceled desc = context canceled

panic: reflect: call of reflect.Value.Type on zero Value

2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: 2019/09/12 11:02:20 [DEBUG] [aws-sdk-go] DEBUG: Response elasticloadbalancing/DescribeLoadBalancers Details:
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: ---[ RESPONSE ]--------------------------------------
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: HTTP/1.1 400 Bad Request
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: Connection: close
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: Content-Length: 271
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: Content-Type: text/xml
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: Date: Thu, 12 Sep 2019 08:02:19 GMT
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: X-Amzn-Requestid: a50a881e-d533-11e9-8725-611784950610
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4:
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4:
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: -----------------------------------------------------
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: 2019/09/12 11:02:20 [DEBUG] [aws-sdk-go]
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4:
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: Sender
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: Throttling
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: Rate exceeded
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4:
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: a50a881e-d533-11e9-8725-611784950610
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4:
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: 2019/09/12 11:02:20 [DEBUG] [aws-sdk-go] DEBUG: Validate Response elasticloadbalancing/DescribeLoadBalancers failed, not retrying, error Throttling: Rate exceeded
2019-09-12T11:02:20.070+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: status code: 400, request id: a50a881e-d533-11e9-8725-611784950610

I've tried with -parallelism=1 but the error is the same:
Error: rpc error: code = Canceled desc = context canceled
AWS support recommends implementing exponential backoff and retries in the Terraform code.
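
For what it's worth, here is a minimal sketch of the provider-level knob for this (illustrative values, not the reporter's actual configuration): the AWS provider's max_retries argument controls how many times the underlying AWS SDK retries a failed or throttled API call, and the SDK already applies exponential backoff between attempts, so raising it is the usual first response when "Throttling: Rate exceeded" shows up in the debug log.

provider "aws" {
  region      = "us-east-1"  # illustrative value
  # How many times the AWS SDK retries a failed or throttled API call before
  # giving up; the SDK backs off exponentially between attempts.
  max_retries = 25           # illustrative value
}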

There is also a panic stack trace in the log:
panic: reflect: call of reflect.Value.Type on zero Value
2019-09-12T11:02:20.083+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4:
2019-09-12T11:02:20.083+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: goroutine 6904 [running]:
2019-09-12T11:02:20.083+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: reflect.Value.Type(0x0, 0x0, 0x0, 0x10, 0xc000b07098)
2019-09-12T11:02:20.083+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/goenv/versions/1.12.2/src/reflect/value.go:1813 +0x169
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema.schemaMap.validatePrimitive(0xc0005877a0, 0xc0004b8500, 0x11, 0x0, 0x0, 0xc0005c2000, 0xc000ea1560, 0x1, 0x4635c20, 0xc000a2ddc0, . )
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema/schema.go:1706 +0xc8
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema.schemaMap.validateType(0xc0005877a0, 0xc0004b8500, 0x11, 0x0, 0x0, 0xc0005c2000, 0xc000ea1560, 0x10085dc, 0xc00001e000, 0x4c6f800, . )
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema/schema.go:1778 +0x56f
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema.schemaMap.validateList(0xc0005877a0, 0x53ffac3, 0xf, 0x4635c20, 0xc000a2ddc0, 0xc0005c2e00, 0xc000ea1560, 0x0, 0x4a99f60, 0xc000ea1590, . )
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema/schema.go:1507 +0x86c
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema.schemaMap.validateType(0xc0005877a0, 0x53ffac3, 0xf, 0x4635c20, 0xc000a2ddc0, 0xc0005c2e00, 0xc000ea1560, 0x0, 0x0, 0x0, . )
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema/schema.go:1774 +0x9e
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema.schemaMap.validate(0xc0005877a0, 0x53ffac3, 0xf, 0xc0005c2e00, 0xc000ea1560, 0x0, 0x0, 0x0, 0x0, 0x0, . )
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema/schema.go:1416 +0x21d
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema.schemaMap.validateObject(0xc0005877a0, 0x0, 0x0, 0xc0005877a0, 0xc000ea1560, 0x53e8eb7, 0xc000707248, 0x7, 0xc000332601, 0x6b00000000000000, . )
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema/schema.go:1673 +0x1c6
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema.schemaMap.Validate(. )
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema/schema.go:705
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema.(*Resource).Validate(0xc0005b5080, 0xc000ea1560, 0xc000707248, 0x7, 0xc0005e4198, 0x1, 0x40, 0x0)
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema/resource.go:368 +0x5e
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema.(*Provider).ValidateResource(0xc0005d6480, 0xc000707248, 0x7, 0xc000ea1560, 0xc0008d9d70, 0xc000ea1560, 0xc000619ad8, 0x4a99f60, 0xc000ea1500, 0x0)
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/schema/provider.go:242 +0x1d8
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/plugin.(*GRPCProviderServer).ValidateResourceTypeConfig(0xc0000b4b18, 0x5c18dc0, 0xc000b859b0, 0xc000c78ac0, 0xc0000b4b18, 0xc000b859b0, 0xc0008a3bd0)
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/helper/plugin/grpc_provider.go:226 +0x218
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/internal/tfplugin5._Provider_ValidateResourceTypeConfig_Handler(0x5302460, 0xc0000b4b18, 0x5c18dc0, 0xc000b859b0, 0xc000927720, 0x0, 0x5c18dc0, 0xc000b859b0, 0xc000f741a0, 0x195)
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/github.com/hashicorp/terraform/internal/tfplugin5/tfplugin5.pb.go:2911 +0x23e
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/google.golang.org/grpc.(*Server).processUnaryRPC(0xc000085200, 0x5c388e0, 0xc000001b00, 0xc000cbe800, 0xc0005cec90, 0x9775bd0, 0x0, 0x0, 0x0)
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/google.golang.org/grpc/server.go:966 +0x470
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/google.golang.org/grpc.(*Server).handleStream(0xc000085200, 0x5c388e0, 0xc000001b00, 0xc000cbe800, 0x0)
2019-09-12T11:02:20.084+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/google.golang.org/grpc/server.go:1245 +0xd25
2019-09-12T11:02:20.085+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: github.com/terraform-providers/terraform-provider-aws/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc000036120, 0xc000085200, 0x5c388e0, 0xc000001b00, 0xc000cbe800)
2019-09-12T11:02:20.085+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/google.golang.org/grpc/server.go:685 +0x9f
2019-09-12T11:02:20.085+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: created by github.com/terraform-providers/terraform-provider-aws/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
2019-09-12T11:02:20.085+0300 [DEBUG] plugin.terraform-provider-aws_v2.11.0_x4: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-aws/vendor/google.golang.org/grpc/server.go:683 +0xa1

@evanphx can you help please?

Do not tag random people.

Hi @artem-tomyuk 👋 Sorry you are running into trouble here.

Source

Fault Locating

If the pod status is ImagePullBackOff, the image fails to be pulled. For details about how to view Kubernetes events, see Viewing Pod Events.

Troubleshooting Process

Determine the cause based on the event information, as listed in Table 1.

Table 1 FailedPullImage

  • Event: Failed to pull image "xxx": rpc error: code = Unknown desc = Error response from daemon: Get xxx: denied: You may not login yet
    Cause: You have not logged in to the image repository.
    Solution: Check Item 1: Whether imagePullSecret Is Specified When You Use kubectl to Create a Workload

  • Event: Failed to pull image "nginx:v1.1": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io: no such host
    Cause: The image address is incorrectly configured.
    Solution: Check Item 2: Whether the Image Address Is Correct When a Third-Party Image Is Used; Check Item 3: Whether an Incorrect Secret Is Used When a Third-Party Image Is Used

  • Event: Failed to pull image "docker.io/bitnami/nginx:1.22.0-debian-11-r3": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Cause: Failed to connect to the image repository because the network is disconnected.
    Solution: Check Item 7: Connection to the Image Repository

  • Event: Failed create pod sandbox: rpc error: code = Unknown desc = failed to create a sandbox for pod "nginx-6dc48bf8b6-l8xrw": Error response from daemon: mkdir xxxxx: no space left on device
    Cause: The disk space is insufficient.
    Solution: Check Item 4: Whether the Node Disk Space Is Insufficient

  • Event: Failed to pull image "xxx": rpc error: code = Unknown desc = error pulling image configuration: xxx x509: certificate signed by unknown authority
    Cause: The third-party image repository uses an unknown or insecure certificate.
    Solution: Check Item 5: Whether the Remote Image Repository Uses an Unknown or Insecure Certificate

  • Event: Failed to pull image "XXX": rpc error: code = Unknown desc = context canceled
    Cause: The image size is too large.
    Solution: Check Item 6: Whether the Image Size Is Too Large

Figure 1 Troubleshooting process

Check Item 1: Whether imagePullSecret Is Specified When You Use kubectl to Create a Workload

If the workload status is abnormal and a Kubernetes event is displayed indicating that the pod fails to pull the image, check whether the imagePullSecrets field exists in the YAML file.

Items to Check

  • If an image needs to be pulled from SWR, the name parameter must be set to default-secret.
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: nginx
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      strategy:
        type: RollingUpdate
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - image: nginx 
            imagePullPolicy: Always
            name: nginx
          imagePullSecrets:
          - name: default-secret
  • If an image needs to be pulled from a third-party image repository, the imagePullSecrets parameter must be set to the created secret name.

    When you use kubectl to create a workload from a third-party image, specify the imagePullSecrets field, in which name indicates the name of the secret used to pull the image. For details about how to create a secret, see Using kubectl.

Check Item 2: Whether the Image Address Is Correct When a Third-Party Image Is Used

CCE allows you to create workloads using images pulled from third-party image repositories.

Enter the third-party image address according to requirements. The format must be ip:port/path/name:version or name:version. If no tag is specified, latest is used by default.

  • For a private repository, enter an image address in the format of ip:port/path/name:version.
  • For an open-source Docker repository, enter an image address in the format of name:version, for example, nginx:latest.

    Figure 2 Using a third-party image

The following information is displayed when an image fails to be pulled because an incorrect image address was provided.

Failed to pull image "nginx:v1.1": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io: no such host

Solution

You can either edit your YAML file to modify the image address, or log in to the CCE console and replace the image on the Upgrade tab of the workload details page.

Check Item 3: Whether an Incorrect Secret Is Used When a Third-Party Image Is Used

Generally, a third-party image repository can be accessed only after authentication (using your account and password). CCE uses the secret authentication mode to pull images. Therefore, you need to create a secret for an image repository before pulling images from the repository.

Solution

If your secret is incorrect, images will fail to be pulled. In this case, create a new secret.

To create a secret, see Using kubectl.

Check Item 4: Whether the Node Disk Space Is Insufficient

A 100 GB data disk dedicated to Docker is attached to the new node. If the data disk space is insufficient, the image fails to be pulled.

Figure 3 Data disk capacity (GB)

If the Kubernetes event contains the following information, the node has no disk space left for storing images. You need to clean up images or expand the disk capacity.

Run the lvs command to check the disk space for storing images on the node.

Run the following command to clean up images:

docker rmi -f {Image ID}

To expand the disk capacity, perform the following steps:

  1. Expand the capacity of the data disk on the EVS console.
  2. Log in to the CCE console and click the cluster. In the navigation pane, choose Nodes. Click More > Sync Server Data at the row containing the target node.
  3. Log in to the target node.
  4. Run the lsblk command to check the block device information of the node.

    How the data disk is divided depends on the container storage Rootfs:

    • Overlayfs: No independent thin pool is allocated. Image data is stored in the dockersys disk.
      # lsblk
      NAME                MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
      sda                   8:0    0   50G  0 disk 
      └─sda1                8:1    0   50G  0 part /
      sdb                   8:16   0  200G  0 disk 
      ├─vgpaas-dockersys  253:0    0   90G  0 lvm  /var/lib/docker               # Space used by Docker.
      └─vgpaas-kubernetes 253:1    0   10G  0 lvm  /mnt/paas/kubernetes/kubelet  # Space used by Kubernetes.

      Run the following commands on the node to add the new disk capacity to the dockersys disk:

      pvresize /dev/sdb 
      lvextend -l+100%FREE -n vgpaas/dockersys
      resize2fs /dev/vgpaas/dockersys
    • Devicemapper: A thin pool is allocated to store image data.
      # lsblk
      NAME                                MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
      sda                                   8:0    0   50G  0 disk 
      └─sda1                                8:1    0   50G  0 part /
      sdb                                   8:16   0  200G  0 disk 
      ├─vgpaas-dockersys                  253:0    0   18G  0 lvm  /var/lib/docker    
      ├─vgpaas-thinpool_tmeta             253:1    0    3G  0 lvm                   
      │ └─vgpaas-thinpool                 253:3    0   67G  0 lvm                   # Thin pool space.
      │   ...
      ├─vgpaas-thinpool_tdata             253:2    0   67G  0 lvm  
      │ └─vgpaas-thinpool                 253:3    0   67G  0 lvm  
      │   ...
      └─vgpaas-kubernetes                 253:4    0   10G  0 lvm  /mnt/paas/kubernetes/kubelet

      Run the following commands on the node to add the new disk capacity to the thinpool disk:

      pvresize /dev/sdb 
      lvextend -l+100%FREE -n vgpaas/thinpool

Check Item 5: Whether the Remote Image Repository Uses an Unknown or Insecure Certificate

When a pod pulls an image from a third-party image repository that uses an unknown or insecure certificate, the image fails to be pulled on the node. The pod event list contains the event "Failed to pull the image" with the cause "x509: certificate signed by unknown authority".

The security of EulerOS 2.9 images is enhanced, and some insecure or expired certificates have been removed from the system. It is normal for this error to be reported for some third-party images on EulerOS 2.9 nodes but not on other types of nodes. You can perform the following operations to rectify the fault.

Solution

  1. Check the address and port of the third-party image server for which the error message "unknown authority" is displayed.

    You can find the address and port of the third-party image server in the "Failed to pull image" event information.

    Failed to pull image "bitnami/redis-cluster:latest": rpc error: code = Unknown desc = error pulling image configuration: Get https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/e8/e83853f03a2e792614e7c1e6de75d63e2d6d633b4e7c39b9d700792ee50f7b56/data?verify=1636972064-AQbl5RActnudDZV%2F3EShZwnqOe8%3D: x509: certificate signed by unknown authority

    The address of the third-party image server is production.cloudflare.docker.com, and the default HTTPS port is 443.

  2. Load the root certificate of the third-party image server to the node where the third-party image is to be downloaded.

    Run the following commands on EulerOS and CentOS nodes, with {server_url}:{server_port} replaced with the address and port obtained in Step 1, for example, production.cloudflare.docker.com:443:

    If the container engine of the node is containerd, replace systemctl restart docker with systemctl restart containerd.

    openssl s_client -showcerts -connect {server_url}:{server_port} < /dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > /etc/pki/ca-trust/source/anchors/tmp_ca.crt
    update-ca-trust
    systemctl restart docker

    Run the following commands on Ubuntu nodes:

    openssl s_client -showcerts -connect {server_url}:{server_port} < /dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > /usr/local/share/ca-certificates/tmp_ca.crt
    update-ca-certificates
    systemctl restart docker

Check Item 6: Whether the Image Size Is Too Large

The pod event list contains the event "Failed to pull image". This may be caused by a large image size.

Failed to pull image "XXX": rpc error: code = Unknown desc = context canceled

If you log in to the node and run the docker pull command to pull the image manually, the image is pulled successfully.

Root Cause

The default value of image-pull-progress-deadline is 1 minute. If the image pull progress is not updated within this period, the image pull is canceled. If the node performance is poor or the image is too large, the image may fail to be pulled and the workload may fail to start.

Solution

  • (Recommended) Method 1: Log in to the node, run the docker pull command to manually pull the image, and check that the workload's imagePullPolicy is IfNotPresent (the default policy). The image that has already been pulled to the node is then used to create the workload.
  • Method 2: Modify the kubelet configuration parameters.

    For a cluster of v1.15 or later, run the following command:

    vi /opt/cloud/cce/kubernetes/kubelet/kubelet

    For a cluster earlier than v1.15, run the following command:

    vi /var/paas/kubernetes/kubelet/kubelet

    Add --image-pull-progress-deadline=30m to the end of the DAEMON_ARGS parameter. 30m indicates 30 minutes; you can change the value as required. Separate the added configuration from the existing configuration with a space.

    Run the following command to restart kubelet:

    systemctl restart kubelet

    Wait for a while and check whether the kubelet status is running.

    systemctl status kubelet

    The workload is started properly, and the image is successfully pulled.

Check Item 7: Connection to the Image Repository

Symptom

The following error message is displayed during workload creation:

Failed to pull image "docker.io/bitnami/nginx:1.22.0-debian-11-r3": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

Cause

Failed to connect to the image repository because the network is disconnected. SWR allows you to pull images from the official Docker repository; for image pulls from other repositories, you need to connect to those repositories first.

Solution

  • Bind a public IP address (EIP) to the node that pulls the image.
  • Upload the image to SWR and then pull the image from SWR.
