Environment: AWS, k8s 1.10, kops, calico v2.6.7, kiam helm chart from PR helm/charts#9262, quay.io/uswitch/kiam:v3.0-rc1.
We ran into an issue while upgrading kiam. The steps were:
- helm upgrade
- rotate kiam server pods one at a time
- rotate kiam agent pods one at a time.
The server rotation went smoothly.
After the agent rotation, the new agent pod was invisible to all existing pods on that node: their metadata calls were not intercepted by the kiam agent, so they received the bare node role instead of the role set in their annotation. This condition persisted for at least 30 minutes. I then re-created all affected pods, and they started working normally, seeing the annotated role.
In other words, a new kiam agent pod only works for pods created after it.
Right now we replace the whole node to rotate an agent. Am I missing something in the docs? Is this expected, and can it be avoided?
This also means that a crashed kiam agent pod effectively means a broken node, so we would have to add custom node health checks and so on to make sure we do not end up in this condition.
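
For illustration, the kind of custom check I have in mind could look roughly like the Python sketch below, run from inside a pod. It asks the standard EC2 metadata path for the role name and compares it with the role the pod expects. This is only a sketch under assumptions: the function names are made up for this example, and it assumes the annotation holds a plain role name rather than a full ARN.

```python
import urllib.request

# Standard EC2 instance metadata path that lists the role name for the caller.
METADATA_URL = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"


def fetch_role(url=METADATA_URL, timeout=2.0):
    """Return the role name the metadata endpoint reports to this pod."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def is_intercepted(reported_role, expected_role):
    """True if the reported role matches the role from the pod annotation.

    Assumption: both values are plain role names, not ARNs.
    """
    return reported_role.strip() == expected_role.strip()


def check(expected_role, fetch=fetch_role):
    """Return True when the pod sees its annotated role, False otherwise."""
    try:
        return is_intercepted(fetch(), expected_role)
    except OSError:
        # Endpoint unreachable or timed out: treat the pod as unhealthy.
        return False
```

A pod that gets the node role back, or no answer at all, would fail this check and could be re-created automatically instead of by hand.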