kubelet should always get latest secret/configmap resource in pod add event #124701
/sig area/kubelet

@V0idk: The label(s) `sig/area/kubelet` cannot be applied, because the repository doesn't have them. In response to this:

> /sig area/kubelet

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

/sig node

/assign @harche

/priority backlog
hrmm... when a new pod is registered, the cache manager implementation invalidates the secret / configmap. I expected the watch one to do the same, but it apparently does not. @wojtek-t, this means that updating a secret and deployment does not guarantee the newly started pods use the new secret content. That seems... bad.
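For illustration, here is a minimal toy sketch of the behavioral difference being described; the type and method names are invented, not the real kubelet managers:

```go
package main

import "fmt"

// ttlCacheManager mimics the "Cache" strategy: registering a pod invalidates
// the referenced objects, so the next read re-fetches from the apiserver.
type ttlCacheManager struct{ fresh map[string]bool }

func (m *ttlCacheManager) RegisterPod(refs ...string) {
	for _, key := range refs {
		m.fresh[key] = false // invalidate on pod add
	}
}

// watchBasedManager mimics the "Watch" strategy: registering a pod only makes
// sure a watch exists for each object; nothing already cached is invalidated,
// so a read right after the pod add can return a stale object.
type watchBasedManager struct{ watching map[string]bool }

func (m *watchBasedManager) RegisterPod(refs ...string) {
	for _, key := range refs {
		m.watching[key] = true // ensure a watch; cached content is left as-is
	}
}

func main() {
	c := &ttlCacheManager{fresh: map[string]bool{"default/app-secret": true}}
	c.RegisterPod("default/app-secret")
	fmt.Println("cache strategy entry still fresh after pod add:", c.fresh["default/app-secret"]) // false

	w := &watchBasedManager{watching: map[string]bool{}}
	w.RegisterPod("default/app-secret")
	fmt.Println("watch strategy watch exists, cache untouched:", w.watching["default/app-secret"]) // true
}
```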
This is where kubelet sets the strategy for fetching the secrets: kubernetes/pkg/kubelet/kubelet.go, line 586 at commit 5a121aa.
```go
// GetChangeDetectionStrategy is a mode in which kubelet fetches
// necessary objects directly from apiserver.
GetChangeDetectionStrategy ResourceChangeDetectionStrategy = "Get"
// TTLCacheChangeDetectionStrategy is a mode in which kubelet uses
// ttl cache for object directly fetched from apiserver.
TTLCacheChangeDetectionStrategy ResourceChangeDetectionStrategy = "Cache"
// WatchChangeDetectionStrategy is a mode in which kubelet uses
// watches to observe changes to objects that are in its interest.
WatchChangeDetectionStrategy ResourceChangeDetectionStrategy = "Watch"
```

and we default to `WatchChangeDetectionStrategy`.
It may happen that, under resource contention, the watch-based cache lags behind the apiserver. Hence, instead of changing the default from `Watch`, a more targeted fix for the pod add path seems preferable.
@liggitt @rphillips @haircommander @V0idk What do you think?
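For readers skimming this, a self-contained toy of the dispatch the snippet above feeds into; this is a loose paraphrase, not the real kubelet.go code:

```go
package main

import "fmt"

type ResourceChangeDetectionStrategy string

const (
	GetChangeDetectionStrategy      ResourceChangeDetectionStrategy = "Get"
	TTLCacheChangeDetectionStrategy ResourceChangeDetectionStrategy = "Cache"
	WatchChangeDetectionStrategy    ResourceChangeDetectionStrategy = "Watch"
)

// pickManager loosely mirrors the dispatch in kubelet.go: each strategy maps
// to a different secret/configmap manager implementation.
func pickManager(s ResourceChangeDetectionStrategy) string {
	switch s {
	case GetChangeDetectionStrategy:
		return "simple manager: direct GET from the apiserver on every read"
	case TTLCacheChangeDetectionStrategy:
		return "caching manager: TTL cache, invalidated when a pod is registered"
	case WatchChangeDetectionStrategy:
		return "watching manager: reflector-backed cache, not invalidated on pod add"
	default:
		return "unknown strategy"
	}
}

func main() {
	// The kubelet's configMapAndSecretChangeDetectionStrategy defaults to "Watch".
	fmt.Println(pickManager(WatchChangeDetectionStrategy))
}
```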
"this works well until it doesn't under load" is a really unactionable thing to document... I think there has to be a "happens before" ordering people can reason about and depend on ... if I update a secret, then create a new pod, and my updated secret doesn't get used, that's ~unusable in my opinion. I didn't know we gave up that ordering with the watch strategy. |
My idea is: on the pod add event, fetch the referenced secrets/configmaps fresh from the apiserver (or invalidate the watch cache entries for them) before the pod starts, and keep serving subsequent reads from the watch. This ensures the ordering without losing too much performance.
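If I read the idea correctly, a minimal sketch would be the following; all names here are invented, and `fetchLatest` stands in for a direct apiserver GET:

```go
package main

import "fmt"

type object struct {
	resourceVersion string
	data            string
}

type hybridManager struct {
	cache       map[string]object
	fetchLatest func(key string) object // stands in for a quorum GET from the apiserver
}

// RegisterPod refreshes every referenced object once, synchronously, before
// the pod is allowed to start; the watch keeps the cache updated afterwards.
func (m *hybridManager) RegisterPod(refs ...string) {
	for _, key := range refs {
		m.cache[key] = m.fetchLatest(key)
	}
}

func main() {
	m := &hybridManager{
		cache: map[string]object{"default/app-secret": {resourceVersion: "100", data: "old"}},
		fetchLatest: func(key string) object {
			return object{resourceVersion: "200", data: "new"} // simulated fresh GET
		},
	}
	m.RegisterPod("default/app-secret")
	fmt.Println(m.cache["default/app-secret"].data) // "new": the pod add sees the latest value
}
```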
I believe that this issue is particularly evident in kubernetes/website#42359 (comment).

If that ordering was assured, it becomes a matter of synchronization on the instrument being used to create/update resources on Kubernetes (helm hooks, Argo sync waves, ...). Otherwise, automated rollouts can get stuck: the new pod may not get up-to-date values for e.g. environment variables, and since container crashes may not trigger a container recreation, the invalid value stays and the rollout cannot advance.
to be fair, changing a resource marked immutable is not how that feature is expected to be used
Catching up.... It has been years since we did that and I clearly missed the invalidation aspect. I agree this is a bug, but at least not a regression. The question is how/when we should fix it. Given that we're using a reflector, which is "eventually consistent" by definition and starts at any point in time anyway, changing it to force a consistent list now would hurt scalability in many places; I would be afraid of breaking things. Especially since we have a solution for this around the corner: streaming lists will let us address it cleanly. With streaming lists, we can change the reflector to always start with a consistent list, because it will be served from the watch cache. And that's what would effectively address this. So what I would do is to: leave the current behavior as-is for now, and once streaming lists land, switch the reflector to start from a consistent list.
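For reference, a small client-go sketch of the two list semantics in play here; this is my illustration, not kubelet code, and it assumes a kubeconfig at the default path. A list with an empty ResourceVersion is a quorum read from etcd, while ResourceVersion "0" may be served from the apiserver's watch cache, which is what reflectors typically start from:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", filepath.Join(os.Getenv("HOME"), ".kube", "config"))
	if err != nil {
		log.Fatal(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	ctx := context.Background()

	// Consistent (quorum) list: ResourceVersion left empty. This is the
	// expensive mode that forcing consistent lists everywhere would imply.
	consistent, err := cs.CoreV1().Secrets("default").List(ctx, metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}

	// Cached list: ResourceVersion "0" lets the apiserver answer from its
	// watch cache, which may lag etcd. This is where the "starts at any point
	// in time" behavior of reflectors comes from.
	cached, err := cs.CoreV1().Secrets("default").List(ctx, metav1.ListOptions{ResourceVersion: "0"})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("quorum list RV=%s, cached list RV=%s\n", consistent.ResourceVersion, cached.ResourceVersion)
}
```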
Can you clarify the "eventually consistent" bit? We have observed a rather prolonged period (at least one day, but we suspect much longer) in which a kubelet was not updating the cached data of a secret.
It seems like it is... before the watch strategy, there was a happens-before relationship between secret/configmap writes and pod creation.
Well, technically true, but we enabled that in 1.14 (so over 5y ago) and we didn't realize until now (and unless I'm missing something we haven't heard any complaints about it). So it's not something that we found because someone upgraded their cluster recently from a version that behaved differently. |
/triage accepted |
What happened?
The logic is as follows:

1. The user updates a secret/configmap.
2. The kubelet's watch delivers the update and refreshes its local cache.
3. The user creates a pod referencing the secret/configmap; on the pod add event, the kubelet mounts the object from its cache.

Usually step 2 completes before step 3, but probabilistically, especially under heavy operating pressure, the third step is performed before the second step. As a result, the old secret is mounted when the pod is started.
https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/
What did you expect to happen?
Solution: the kubelet should always get the latest secret/configmap resource on a pod add event instead of using the cache.
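A hedged sketch of what that could look like with client-go follows; the helper and its name are hypothetical, and only secret volumes are handled for brevity. A plain GET with no ResourceVersion set is a consistent (quorum) read:

```go
package sketch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// latestSecretsForPod fetches every secret volume of the pod directly from the
// apiserver. A plain GET (no ResourceVersion) is a quorum read, so the result
// is at least as new as any update that completed before the pod was created.
func latestSecretsForPod(ctx context.Context, cs kubernetes.Interface, pod *corev1.Pod) (map[string]*corev1.Secret, error) {
	out := map[string]*corev1.Secret{}
	for _, v := range pod.Spec.Volumes {
		if v.Secret == nil {
			continue // only secret volumes in this sketch; env refs would be similar
		}
		s, err := cs.CoreV1().Secrets(pod.Namespace).Get(ctx, v.Secret.SecretName, metav1.GetOptions{})
		if err != nil {
			return nil, err
		}
		out[v.Secret.SecretName] = s
	}
	return out, nil
}
```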
How can we reproduce it (as minimally and precisely as possible)?
See "What happened?" above.
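A minimal client-go reproduction sketch, under assumptions (kubeconfig at the default path, an existing Secret "app-secret" in "default"; all names are hypothetical): update the secret, immediately create a pod consuming it, then check whether the new pod observed the new value:

```go
package main

import (
	"context"
	"log"
	"os"
	"path/filepath"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", filepath.Join(os.Getenv("HOME"), ".kube", "config"))
	if err != nil {
		log.Fatal(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	ctx := context.Background()

	// Step 1: update the secret. Update returning success means the write is
	// committed in etcd before we move on.
	sec, err := cs.CoreV1().Secrets("default").Get(ctx, "app-secret", metav1.GetOptions{})
	if err != nil {
		log.Fatal(err)
	}
	sec.StringData = map[string]string{"password": "new-value"}
	if _, err := cs.CoreV1().Secrets("default").Update(ctx, sec, metav1.UpdateOptions{}); err != nil {
		log.Fatal(err)
	}

	// Step 2: immediately create a pod consuming the secret. With the Watch
	// strategy the kubelet may still serve the old value from its reflector
	// cache; inspecting PASSWORD inside the pod (e.g. via kubectl logs or
	// exec) shows whether the race was hit. Under load, repeating this makes
	// the stale read more likely.
	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "repro-pod", Namespace: "default"},
		Spec: corev1.PodSpec{
			RestartPolicy: corev1.RestartPolicyNever,
			Containers: []corev1.Container{{
				Name:    "check",
				Image:   "busybox",
				Command: []string{"sh", "-c", "echo $PASSWORD && sleep 3600"},
				Env: []corev1.EnvVar{{
					Name: "PASSWORD",
					ValueFrom: &corev1.EnvVarSource{
						SecretKeyRef: &corev1.SecretKeySelector{
							LocalObjectReference: corev1.LocalObjectReference{Name: "app-secret"},
							Key:                  "password",
						},
					},
				}},
			}},
		},
	}
	if _, err := cs.CoreV1().Pods("default").Create(ctx, pod, metav1.CreateOptions{}); err != nil {
		log.Fatal(err)
	}
	log.Println("pod created; inspect its PASSWORD env to see whether it got new-value")
}
```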
Anything else we need to know?
No response
Kubernetes version
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)