This repository was archived by the owner on Mar 5, 2024. It is now read-only.

On-the-fly agent replacement #202

@max-lobur

Environment: AWS, k8s 1.10, kops, calico v2.6.7, the kiam helm chart from PR helm/charts#9262, quay.io/uswitch/kiam:v3.0-rc1.

We ran into an issue while upgrading kiam. The steps were:

  • helm upgrade
  • rotate kiam server pods one at a time
  • rotate kiam agent pods one at a time

The server part went smoothly.
After the agent part, the new agent pod became invisible to all the existing pods: their metadata calls were not intercepted by the kiam agent, and they were getting the bare node role instead of the one set in the annotation. This condition persisted for at least 30 minutes. After that I re-created all affected pods and they started to work normally, seeing the annotation role.

In other words, the new kiam agent pod works only for pods created after itself.

Right now we are replacing the whole node to rotate an agent. Am I missing something in the docs? Is this expected, and can it be avoided?

This also means that a crashed kiam agent pod effectively means a crashed node, so we have to add custom node health checks and so on to make sure we do not get into this condition.
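To illustrate the kind of custom health check meant here, below is a minimal sketch, not something kiam ships. It assumes the check runs inside a pod that carries a role annotation, that the standard EC2 metadata path for security credentials answers a plain unauthenticated GET, and that the endpoint returns the role name as plain text. The names `fetch_role`, `role_is_intercepted`, and the example role strings are hypothetical.

```python
import urllib.request

# Standard EC2 instance metadata path that lists the role name visible to the caller.
METADATA_URL = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"


def fetch_role(url=METADATA_URL, timeout=2.0):
    """Return the role name the metadata endpoint reports to this pod."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def role_is_intercepted(expected_role, fetch=fetch_role):
    """Return True if the reported role matches the role we expect.

    expected_role is assumed to be the role name from the pod annotation.
    Any network error (timeout, connection refused) counts as unhealthy.
    """
    try:
        return fetch() == expected_role
    except OSError:
        # urllib.error.URLError and socket timeouts are OSError subclasses.
        return False
```

A probe built on this would fail when the pod sees the bare node role (the condition described above) or when the metadata call does not return at all. The `fetch` parameter exists only so the comparison can be exercised without network access.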
