Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache node images #11202

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Payback159
Copy link
Contributor

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:

/kind api-change
/kind bug
/kind cleanup
/kind design
/kind documentation
/kind failing-test

/kind feature

/kind flake

What this PR does / why we need it:

Reason and discussion can be found in #11092

Which issue(s) this PR fixes:

Fixes #11092

Special notes for your reviewer:

I wanted to publish the first work on the issue. The original idea of caching it in the facts was discussed for good reasons. I am now trying to pull the "node_cache" out of the download loop.

My idea would be to manipulate the downloads list during runtime before processing it in the loop.
First tests look promising but it is not finalized yet, I hope to get early feedback due to the open development! :-)

Does this PR introduce a user-facing change?:

improve caching of node image information to avoid unnecessary multiple accesses to the container runtime.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 16, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Payback159
Once this PR has been reviewed and has the lgtm label, please assign floryut for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 16, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @Payback159. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 16, 2024
- name: Download | Check if the container image is already present
set_fact:
downloads: "{{ downloads | combine({item.key: item.value | combine({'enabled': false})}) }}"
loop: "{{ downloads | dict2items }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid adding yet another loop.
I agree that "patching" downloads at runtime is a good idea, but using filter in the include_tasks loop below would be better.
If the ansible filters are not up to the tasks, use json_query, it's quite powerful (but can be headache inducing if abused 😆 )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks I didn't want to jump on the loop at first (I was a little put off by the loop 😄).
I have moved the logic to the loop task. It's not a json_query but it should still save us the extra loop.

What do you think?

@Payback159
Copy link
Contributor Author

Payback159 commented May 17, 2024

I have moved the logic to the include_tasks logic.

The first run executes 10 download_container includes (cluster.yml without "image cache" with already provisioned 1 control-plane, 1 node). With a rerun on the same setup with the cache there are then only 3 download_container includes.

I have to take a closer look at these 3 includes, as they were also reloaded again and again with the previous approach (extra loop), I would actually expect 0 includes from download_container after adding the "image cache".

Another open point is the support to pull via sha256, I have to check in advance whether sha256 is defined in the item, but I wanted to concentrate on the feasibility first.

Once I have the sha256 logic, I'll start reviewing the download_container and sub-tasks to see if some variables and rules can be tidied up.

@Payback159
Copy link
Contributor Author

I think I have it. The initial cluster.yml downloads 10 containers, the 2nd run downloads 0 containers, because all images are in the cache.

If download_always_pull: true the node_image is registered but the filter logic is canceled by the "or" link and all images are pulled again.

First run

TASK [download : Show node images] *********************************************
ok: [kubespray-control-plane-1] => {
    "msg": ""
}
ok: [kubespray-worker-1] => {
    "msg": ""
--
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'calico_node', 'value': {'enabled': True, 'container': True, 'repo': 'quay.io/calico/node', 'tag': 'v3.27.3', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'calico_cni', 'value': {'enabled': True, 'container': True, 'repo': 'quay.io/calico/cni', 'tag': 'v3.27.3', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'calico_flexvol', 'value': {'enabled': True, 'container': True, 'repo': 'quay.io/calico/pod2daemon-flexvol', 'tag': 'v3.27.3', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'calico_policy', 'value': {'enabled': True, 'container': True, 'repo': 'quay.io/calico/kube-controllers', 'tag': 'v3.27.3', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_file.yml for kubespray-control-plane-1 => (item={'key': 'calico_crds', 'value': {'file': True, 'enabled': True, 'version': 'v3.27.3', 'dest': '/tmp/releases/calico-v3.27.3-kdd-crds/v3.27.3.tar.gz', 'sha256': 'd11a32919bff389f642af5df8180ad3cec586030decd35adb2a7d4a8aa3b298e', 'url': 'https://github.com/projectcalico/calico/archive/v3.27.3.tar.gz', 'unarchive': True, 'unarchive_extra_opts': ['--strip=3', '--wildcards', '*/libcalico-go/config/crd/'], 'owner': 'root', 'mode': '0755', 'groups': ['kube_control_plane']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'pod_infra', 'value': {'enabled': True, 'container': True, 'repo': 'registry.k8s.io/pause', 'tag': '3.9', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'coredns', 'value': {'enabled': True, 'container': True, 'repo': 'registry.k8s.io/coredns/coredns', 'tag': 'v1.11.1', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'nodelocaldns', 'value': {'enabled': True, 'container': True, 'repo': 'registry.k8s.io/dns/k8s-dns-node-cache', 'tag': '1.22.28', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1 => (item={'key': 'dnsautoscaler', 'value': {'enabled': True, 'container': True, 'repo': 'registry.k8s.io/cpa/cluster-proportional-autoscaler', 'tag': 'v1.8.8', 'sha256': '', 'groups': ['kube_control_plane']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1 => (item={'key': 'metrics_server', 'value': {'enabled': True, 'container': True, 'repo': 'registry.k8s.io/metrics-server/metrics-server', 'tag': 'v0.7.0', 'sha256': '', 'groups': ['kube_control_plane']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-worker-1 => (item={'key': 'nginx', 'value': {'enabled': True, 'container': True, 'repo': 'docker.io/library/nginx', 'tag': '1.25.2-alpine', 'sha256': '', 'groups': ['kube_node']}})
Friday 17 May 2024  21:08:44 +0200 (0:00:00.782)       0:02:30.673 ************ 

second run

TASK [download : Show node images] *********************************************
ok: [kubespray-control-plane-1] => {
    "msg": "quay.io/calico/cni:v3.27.3,quay.io/calico/cni:<none>,quay.io/calico/kube-controllers:v3.27.3,quay.io/calico/node:v3.27.3,quay.io/calico/node:<none>,quay.io/calico/pod2daemon-flexvol:v3.27.3,quay.io/calico/pod2daemon-flexvol:<none>,registry.k8s.io/coredns/coredns:v1.11.1,registry.k8s.io/coredns/coredns:<none>,registry.k8s.io/cpa/cluster-proportional-autoscaler:v1.8.8,registry.k8s.io/cpa/cluster-proportional-autoscaler:<none>,registry.k8s.io/dns/k8s-dns-node-cache:1.22.28,registry.k8s.io/dns/k8s-dns-node-cache:<none>,registry.k8s.io/kube-apiserver:v1.29.3,registry.k8s.io/kube-apiserver:<none>,registry.k8s.io/kube-controller-manager:v1.29.3,registry.k8s.io/kube-controller-manager:<none>,registry.k8s.io/kube-proxy:v1.29.3,registry.k8s.io/kube-proxy:<none>,registry.k8s.io/kube-scheduler:v1.29.3,registry.k8s.io/kube-scheduler:<none>,registry.k8s.io/metrics-server/metrics-server:v0.7.0,registry.k8s.io/pause:3.9,registry.k8s.io/pause:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,"
}
ok: [kubespray-worker-1] => {
    "msg": "nginx:1.25.2-alpine,nginx:<none>,quay.io/calico/cni:v3.27.3,quay.io/calico/cni:<none>,quay.io/calico/kube-controllers:v3.27.3,quay.io/calico/kube-controllers:<none>,quay.io/calico/node:v3.27.3,quay.io/calico/node:<none>,quay.io/calico/pod2daemon-flexvol:v3.27.3,quay.io/calico/pod2daemon-flexvol:<none>,registry.k8s.io/coredns/coredns:v1.11.1,registry.k8s.io/coredns/coredns:<none>,registry.k8s.io/dns/k8s-dns-node-cache:1.22.28,registry.k8s.io/dns/k8s-dns-node-cache:<none>,registry.k8s.io/kube-proxy:v1.29.3,registry.k8s.io/kube-proxy:<none>,registry.k8s.io/metrics-server/metrics-server:v0.7.0,registry.k8s.io/metrics-server/metrics-server:<none>,registry.k8s.io/pause:3.9,registry.k8s.io/pause:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,"

#no download_container includes

third run with download_always_pull: true

TASK [download : Show node images] *********************************************
ok: [kubespray-control-plane-1] => {
    "msg": "quay.io/calico/cni:v3.27.3,quay.io/calico/cni:<none>,quay.io/calico/node:v3.27.3,quay.io/calico/node:<none>,quay.io/calico/pod2daemon-flexvol:v3.27.3,quay.io/calico/pod2daemon-flexvol:<none>,registry.k8s.io/coredns/coredns:v1.11.1,registry.k8s.io/coredns/coredns:<none>,registry.k8s.io/cpa/cluster-proportional-autoscaler:v1.8.8,registry.k8s.io/cpa/cluster-proportional-autoscaler:<none>,registry.k8s.io/dns/k8s-dns-node-cache:1.22.28,registry.k8s.io/dns/k8s-dns-node-cache:<none>,registry.k8s.io/kube-apiserver:v1.29.3,registry.k8s.io/kube-apiserver:<none>,registry.k8s.io/kube-controller-manager:v1.29.3,registry.k8s.io/kube-controller-manager:<none>,registry.k8s.io/kube-proxy:v1.29.3,registry.k8s.io/kube-proxy:<none>,registry.k8s.io/kube-scheduler:v1.29.3,registry.k8s.io/kube-scheduler:<none>,registry.k8s.io/pause:3.9,registry.k8s.io/pause:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,"
}
ok: [kubespray-worker-1] => {
    "msg": "nginx:1.25.2-alpine,nginx:<none>,quay.io/calico/cni:v3.27.3,quay.io/calico/cni:<none>,quay.io/calico/kube-controllers:v3.27.3,quay.io/calico/kube-controllers:<none>,quay.io/calico/node:v3.27.3,quay.io/calico/node:<none>,quay.io/calico/pod2daemon-flexvol:v3.27.3,quay.io/calico/pod2daemon-flexvol:<none>,registry.k8s.io/coredns/coredns:v1.11.1,registry.k8s.io/coredns/coredns:<none>,registry.k8s.io/dns/k8s-dns-node-cache:1.22.28,registry.k8s.io/dns/k8s-dns-node-cache:<none>,registry.k8s.io/kube-proxy:v1.29.3,registry.k8s.io/kube-proxy:<none>,registry.k8s.io/metrics-server/metrics-server:v0.7.0,registry.k8s.io/metrics-server/metrics-server:<none>,registry.k8s.io/pause:3.9,registry.k8s.io/pause:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,<none>:<none>,"
--
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'calico_node', 'value': {'enabled': True, 'container': True, 'repo': 'quay.io/calico/node', 'tag': 'v3.27.3', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'calico_cni', 'value': {'enabled': True, 'container': True, 'repo': 'quay.io/calico/cni', 'tag': 'v3.27.3', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'calico_flexvol', 'value': {'enabled': True, 'container': True, 'repo': 'quay.io/calico/pod2daemon-flexvol', 'tag': 'v3.27.3', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'calico_policy', 'value': {'enabled': True, 'container': True, 'repo': 'quay.io/calico/kube-controllers', 'tag': 'v3.27.3', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_file.yml for kubespray-control-plane-1 => (item={'key': 'calico_crds', 'value': {'file': True, 'enabled': True, 'version': 'v3.27.3', 'dest': '/tmp/releases/calico-v3.27.3-kdd-crds/v3.27.3.tar.gz', 'sha256': 'd11a32919bff389f642af5df8180ad3cec586030decd35adb2a7d4a8aa3b298e', 'url': 'https://github.com/projectcalico/calico/archive/v3.27.3.tar.gz', 'unarchive': True, 'unarchive_extra_opts': ['--strip=3', '--wildcards', '*/libcalico-go/config/crd/'], 'owner': 'root', 'mode': '0755', 'groups': ['kube_control_plane']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'pod_infra', 'value': {'enabled': True, 'container': True, 'repo': 'registry.k8s.io/pause', 'tag': '3.9', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'coredns', 'value': {'enabled': True, 'container': True, 'repo': 'registry.k8s.io/coredns/coredns', 'tag': 'v1.11.1', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1, kubespray-worker-1 => (item={'key': 'nodelocaldns', 'value': {'enabled': True, 'container': True, 'repo': 'registry.k8s.io/dns/k8s-dns-node-cache', 'tag': '1.22.28', 'sha256': '', 'groups': ['k8s_cluster']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1 => (item={'key': 'dnsautoscaler', 'value': {'enabled': True, 'container': True, 'repo': 'registry.k8s.io/cpa/cluster-proportional-autoscaler', 'tag': 'v1.8.8', 'sha256': '', 'groups': ['kube_control_plane']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-control-plane-1 => (item={'key': 'metrics_server', 'value': {'enabled': True, 'container': True, 'repo': 'registry.k8s.io/metrics-server/metrics-server', 'tag': 'v0.7.0', 'sha256': '', 'groups': ['kube_control_plane']}})
included: /Users/jelinek/git/kubespray/roles/download/tasks/download_container.yml for kubespray-worker-1 => (item={'key': 'nginx', 'value': {'enabled': True, 'container': True, 'repo': 'docker.io/library/nginx', 'tag': '1.25.2-alpine', 'sha256': '', 'groups': ['kube_node']}})
Friday 17 May 2024  21:54:12 +0200 (0:00:00.803)       0:01:09.291 ************ 

@Payback159 Payback159 marked this pull request as ready for review May 17, 2024 20:23
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 17, 2024
@yankay
Copy link
Member

yankay commented May 20, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 20, 2024
@VannTen
Copy link
Contributor

VannTen commented May 21, 2024

I'm gonna take a look, but not right now, I'm a bit busy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cache Node Images List in facts
4 participants