Wrong kube-proxy bind mount propagation causes node certificates to not update and expire #16400
Labels: kind/bug, kind/office-hours, lifecycle/stale
/kind bug
1. What kops version are you running? The command `kops version` will display this information.

Last applied server version: 1.25.3
2. What Kubernetes version are you running? `kubectl version` will print the version if a cluster is running, or provide the Kubernetes version specified as a `kops` flag.

Server Version: v1.25.5
3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
5. What happened after the commands executed?
I get two different contents for the same file (`/var/lib/kube-proxy/kubeconfig`), depending on whether I read it from the node or from inside the kube-proxy pod.
6. What did you expect to happen?
I expected to get the same output since this should be the same file.
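The comparison described above can be sketched as a checksum check. Here the two inputs are stand-in strings (assumptions for illustration); on a real node they would come from `sudo md5sum /var/lib/kube-proxy/kubeconfig` (node view) and `kubectl -n kube-system exec <kube-proxy pod> -- md5sum /var/lib/kube-proxy/kubeconfig` (pod view):

```shell
# Detect an out-of-sync kubeconfig by comparing the file's checksum as seen
# from the node vs. from inside the pod. The contents below are placeholders.
node_view='certificate-data-after-rotation'
pod_view='certificate-data-before-rotation'
node_sum=$(printf '%s' "$node_view" | md5sum | cut -d' ' -f1)
pod_sum=$(printf '%s' "$pod_view" | md5sum | cut -d' ' -f1)
if [ "$node_sum" != "$pod_sum" ]; then
  echo 'kubeconfig out of sync'
fi
```

On a healthy node both checksums match and the script prints nothing.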
You can see this when you check the `volumes` and `volumeMounts` of the kube-proxy pod.

Also, if you check the container configuration using `ctr`:

sudo ctr -n k8s.io container inspect <kube-proxy container id>

you can confirm this container uses the same file.
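The relevant part of the `ctr` inspect output is the OCI mount spec. The JSON below is a hypothetical excerpt (real output differs and is much larger); the point is reading the propagation option on the kube-proxy bind mount:

```shell
# Hypothetical excerpt of `sudo ctr -n k8s.io container inspect <id>` output;
# extract the mount-propagation option from the kube-proxy bind mount entry.
mount_spec='{"destination":"/var/lib/kube-proxy","type":"bind","source":"/var/lib/kube-proxy","options":["rbind","rprivate","ro"]}'
propagation=$(printf '%s\n' "$mount_spec" | grep -oE 'r(private|shared|slave)')
echo "$propagation"   # rprivate
```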
I also believe those lines caused the issue, especially the `rprivate` option. The Docker documentation on bind propagation describes `rprivate` as the default: no mount points anywhere within the original or replica mount points propagate in either direction.

So the default `rprivate` option can cause an unsynchronized state if we have multiple replicas of this mount.

So another condition must be met for this issue to happen.
There must be multiple containers with the same mount.
This happens if the node restarted and a new kube-proxy pod was created.
I believe this is the default behaviour, so Kubernetes can get logs from the previous container (`kubectl logs -p`).
Even if only one of the two containers is running/used, this can happen.
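The failure mode described above can be modelled offline. This is a simulation of the symptom only, not of real mount-namespace mechanics (with `rprivate` the file content is shared until the host-side mount is replaced; it is new mounts that fail to propagate). All paths and contents are made up:

```shell
# Simulate two containers that each got a private view of the host file at
# start; the host file is later rotated, but the stale view never updates.
workdir=$(mktemp -d)
echo 'cert-v1' > "$workdir/host-kubeconfig"
cp "$workdir/host-kubeconfig" "$workdir/container-a"   # container A starts
cp "$workdir/host-kubeconfig" "$workdir/container-b"   # container B starts
echo 'cert-v2' > "$workdir/host-kubeconfig"            # cert rotated on host
stale=$(cat "$workdir/container-a")
echo "$stale"   # cert-v1: the container still sees the pre-rotation data
rm -rf "$workdir"
```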
I tested this hypothesis by running this command

sudo ctr -n k8s.io containers ls | grep proxy | wc -l

on all of our nodes. On all nodes where we have multiple containers (2), we can see unsynchronized certificates.
Also, I can confirm those are the only kube-proxy pods with restarts > 0.
And on nodes with only a single kube-proxy container, we can see that /var/lib/kube-proxy/kubeconfig is the same on the node and in the pod.
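The counting step above can be sketched offline against a hypothetical sample of `sudo ctr -n k8s.io containers ls` output (real IDs and images will differ):

```shell
# Count kube-proxy containers registered with containerd on a node.
# Two lines matching "proxy" means an exited container is still around.
sample='abc123    registry.k8s.io/kube-proxy:v1.25.5    io.containerd.runc.v2
def456    registry.k8s.io/kube-proxy:v1.25.5    io.containerd.runc.v2
fff999    registry.k8s.io/pause:3.6             io.containerd.runc.v2'
count=$(printf '%s\n' "$sample" | grep proxy | wc -l)
echo "$count"
```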
7. Please provide your cluster manifest. Execute `kops get --name my.example.com -o yaml` to display your cluster manifest. You may want to remove your cluster name and other sensitive information.
9. Anything else do we need to know?
So I believe this bug is caused by:
- the previous exited container being kept after a restart (for `kubectl logs -p`)
- the default `rprivate` bind propagation

Together these caused `/var/lib/kube-proxy/kubeconfig` to be unsynchronized between the containers and the node.

Logs from kube-proxy

Logs from kube-api-server
10. Possible solutions?

- Stop keeping the previous exited container around (at the cost of `kubectl logs -p`)

Also, the more I read, the less sure I am that `rprivate` causes this unsynchronized-file behavior. Can you come up with any way we can confirm that?
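Another direction sometimes suggested for mount-propagation problems (an assumption on my side, not a confirmed fix for this bug) is setting `mountPropagation: HostToContainer` on the volumeMount, which maps to `rslave`, so mounts made on the host after the container starts become visible inside it. A hypothetical fragment of what that would look like in the kube-proxy manifest (names mirror the paths in this issue):

```
volumeMounts:
  - name: kubeconfig
    mountPath: /var/lib/kube-proxy
    readOnly: true
    mountPropagation: HostToContainer   # rslave instead of the default rprivate
volumes:
  - name: kubeconfig
    hostPath:
      path: /var/lib/kube-proxy
```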