
Failing test due to pv expecting a "topology.gke.io/zone" label that is not set in OSS kubernetes nodes on GCE #124721

Open
jbtk opened this issue May 7, 2024 · 18 comments
Labels
kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/storage Categorizes an issue or PR as relevant to SIG Storage.

Comments

@jbtk
Contributor

jbtk commented May 7, 2024

Which jobs are failing?

autoscaling e2e test: Kubernetes e2e suite.[It] [sig-autoscaling] Cluster size autoscaling [Slow] should increase cluster size if pod requesting volume is pending [Feature:ClusterSizeAutoscalingScaleUp]

Which tests are failing?

Kubernetes e2e suite.[It] [sig-autoscaling] Cluster size autoscaling [Slow] should increase cluster size if pod requesting volume is pending [Feature:ClusterSizeAutoscalingScaleUp]

Since when has it been failing?

Not sure; it has been failing for as long as the testgrid history shows.

Testgrid link

https://testgrid.k8s.io/sig-autoscaling-cluster-autoscaler#gci-gce-autoscaling

Reason for failure (if possible)

The scheduler rewrites the PV requirements from in-tree to CSI, requiring a "topology.gke.io/zone" label that is not set on nodes running in OSS Kubernetes (started with the kube-up script).

Starting the cluster command:
kubetest2 gce -v 2 --repo-root ~/src/k8s.io/kubernetes --gcp-project --legacy-mode --build --up --env=ENABLE_CUSTOM_METRICS=true --env=KUBE_ENABLE_CLUSTER_AUTOSCALER=true --env=KUBE_AUTOSCALER_MIN_NODES=3 --env=KUBE_AUTOSCALER_MAX_NODES=6 --env=KUBE_AUTOSCALER_ENABLE_SCALE_DOWN=true --env=KUBE_ADMISSION_CONTROL=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,Priority --env=ENABLE_POD_PRIORITY=true

The problematic code seems to be here: https://github.com/kubernetes/csi-translation-lib/blob/master/plugins/gce_pd.go#L257

I see what the problem is, but it is not clear to me what the correct behavior should be. It seems that in GKE this label is actually set on the nodes.
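To make the failure mode concrete, here is a small self-contained Go sketch of the mismatch. This is an illustration only: `translateAffinityKey` and `nodeMatches` are simplified stand-ins, not the real csi-translation-lib or scheduler APIs. It shows how rewriting the in-tree zone key to the GKE-specific CSI key makes node-affinity matching fail against a kube-up OSS node that only carries the standard topology labels:

```go
package main

import "fmt"

// Illustrative stand-ins for the label keys involved. The in-tree GCE PD
// plugin uses the standard Kubernetes zone key, while the translation to the
// pd.csi.storage.gke.io driver rewrites the PV's node-affinity key to the
// GKE-specific one.
const (
	inTreeZoneKey = "topology.kubernetes.io/zone"
	csiZoneKey    = "topology.gke.io/zone"
)

// translateAffinityKey mimics the in-tree -> CSI rewrite of a PV's
// node-affinity key (simplified sketch, not the real library function).
func translateAffinityKey(key string) string {
	if key == inTreeZoneKey {
		return csiZoneKey
	}
	return key
}

// nodeMatches reports whether a node's labels satisfy a required
// (key, value) node-affinity term.
func nodeMatches(nodeLabels map[string]string, key, value string) bool {
	return nodeLabels[key] == value
}

func main() {
	// Labels as observed on a kube-up OSS GCE node (from the list below);
	// note the GKE-specific CSI key is absent.
	node := map[string]string{
		"topology.kubernetes.io/zone": "us-central1-b",
	}
	key := translateAffinityKey(inTreeZoneKey)
	// The translated key does not exist on the node, so matching fails
	// and the pod stays pending.
	fmt.Println(key, nodeMatches(node, key, "us-central1-b")) // topology.gke.io/zone false
}
```

The pd csi driver, when installed, adds the `topology.gke.io/zone` label to nodes, which is why the same translation works on GKE clusters.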

The labels that I see on the node of my cluster:
beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=e2-standard-2
beta.kubernetes.io/os=linux
cloud.google.com/metadata-proxy-ready=true
failure-domain.beta.kubernetes.io/region=us-central1
failure-domain.beta.kubernetes.io/zone=us-central1-b
kubernetes.io/arch=amd64
kubernetes.io/hostname=kt2-1b77b5e4-87ae-minion-group-5pgr
kubernetes.io/os=linux
node.kubernetes.io/instance-type=e2-standard-2
topology.kubernetes.io/region=us-central1
topology.kubernetes.io/zone=us-central1-b

What the test is doing:

  • the test creates a PD on GCE
  • connects a PV to it
  • creates a PVC
  • tries to schedule a pod that requires this PV
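The manifests involved look roughly like the following sketch (all names and sizes here are hypothetical placeholders; the actual objects are built in code in cluster_size_autoscaling.go). The PV uses the in-tree `gcePersistentDisk` source, which is what triggers the CSI translation path in the scheduler:

```yaml
# Hypothetical sketch; the test creates the PD via the GCE API first,
# then a PV pointing at it, then a PVC bound to that PV.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-test            # hypothetical name
spec:
  capacity:
    storage: 1Gi
  accessModes: ["ReadWriteOnce"]
  gcePersistentDisk:
    pdName: e2e-test-disk  # hypothetical; the PD created in step 1
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-test           # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
  volumeName: pv-test
```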

Anything else we need to know?

No response

Relevant SIG(s)

/sig-storage

@jbtk jbtk added the kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. label May 7, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 7, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jbtk
Contributor Author

jbtk commented May 7, 2024

/sig storage

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 7, 2024
@xing-yang
Contributor

@jbtk You need to install the CSI driver as part of the K8s cluster installation. Please ping @mattcary @BenTheElder.

@xing-yang
Contributor

/remove-sig storage
/sig cloud-provider

@k8s-ci-robot k8s-ci-robot added sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. and removed sig/storage Categorizes an issue or PR as relevant to SIG Storage. labels May 22, 2024
@mattcary
Contributor

+1

How are you bringing up you cluster? IIUC, k/k cluster up will not bring up a cloud-provider specific cluster, you need to use the tooling in cloud-provider-gcp.

@BenTheElder
Member

IIUC, k/k cluster up will not bring up a cloud-provider specific cluster, you need to use the tooling in cloud-provider-gcp.

No, kube-up.sh has been using the external cloud provider by default for a while now, but it might have gaps.
... and worse, there is a fork in cloud-provider-gcp, but both are for HEAD of the respective repos currently, so ...

It seems likely that the gap is here: while @dims / @aojea had it using the out-of-tree CCM in preparation for removing the provider, the CSI driver may not have been running, unlike the in-tree volume plugin. In that case we should add CSI installation to the in-tree kube-up for now; long term, the autoscaling tests may move to the GCP repo depending on which are broadly relevant versus very GCP-specific.

We can see what is used from the logs at
https://testgrid.k8s.io/sig-autoscaling-cluster-autoscaler#gci-gce-autoscaling

@mattcary
Contributor

For background, see #102701.

It seems we decided there that the pdcsi driver should not be installed in k/k and we should only use cloud-provider-gcp --- that's why I thought there was no cloud provider support.

Anyway I'm sure there's been a lot of evolution around this. If we want to revisit that PR, or a variation of it, I'd be happy to help.

@BenTheElder
Member

BenTheElder commented May 23, 2024

Thanks, I do think we need to revisit this.

How are other PV tests running on GCE right now?

If this test depends on GCP-specific behavior then yeah, we should just move all of that over to cloud-provider-gcp and not install it here; but if we have Kubernetes tests that cover in-tree logic where we'd want a storage class / PV with this behavior, we should install a current implementation.

Even if the tests sometimes depend on GCP specific things, we need to split them between:

  1. this is testing GCP specific functionality outright and should move to cloud-provider-gcp
  2. this may be implemented against GCP specific tooling/functionality/details but covers core code paths and we should keep it functional while encouraging generalized rewrites where possible (e.g. testing node reboot impact on kubelet may be really useful for kubelet but there's no portable API for this and we should not kick out the test just because it knows how to reboot a kube-up GCE VM but not other nodes)

What I can't quite tell from the above is which category these tests fall into. If it's 1) then we should just port the tests over and make sure we're running the pdcsi driver over there.

If it's 2) or any other tests reasonably depending on the pdcsi driver would be 2) then we should revisit installing pdcsi in cluster/ here. As much as we want to deprecate cluster/ we should not regress on coverage until we have it replaced.

@mattcary
Contributor

Most PV tests use a CSI hostpath or similar cloud-provider-independent volume, or the Rancher default storage class that's now in kind. This covers most use cases, but not everything, because for example such volumes can't move between nodes.

For the second part of your comment, I think it's (2). We should revisit installing pdcsi in k/k.

My long-winded rationale for that is:

My understanding at the time of #102701 was that we were trying to move to a world where k/k e2e tests would be on kind, and anything that couldn't run on kind would have to be run in a cloud provider.

I think this was meant to be the same thing that you're saying, except instead of saying "GCP-specific" we said "can't run on kind". Those two notions may be about the same when talking about storage tests, but I think it's quite different for the autoscaler tests.

I suspect the autoscaler tests fall into a donut hole where they need more than kind, but they're not really GCP specific -- they don't care about certain GCE PD features, they just want a generic-looking cloud volume.

My thinking is that either we should draw a clean line and say "only kind for k/k e2e" (and say that the autoscaler tests have to move to cloud-provider-gcp), or, if we're going to support running k/k e2e on GCE, we need the GCE cluster-up to install the pd csi driver, even if that leaks cloud-provider-gcp a little into k/k.

At least we'd have a clear rationale for the leakage (testing cloud features) even if we don't have a well-defined generic "cloud features" framework. I think it would be too strong to force the autoscaler tests into a cloud provider repo, because the tests are probably generic in practice, and it's a good idea for this test code to live in k/k and not be sequestered in one cloud provider.

@jbtk
Contributor Author

jbtk commented May 23, 2024

Ok - some clarification:

  1. this issue was created before the removal of the GCP provider from k/k; currently it is not possible to run this test from the k/k repo
  2. the test manually creates a persistent disk in GCE and then creates a volume for the already existing PD (https://github.com/kubernetes/kubernetes/edit/master/test/e2e/autoscaling/cluster_size_autoscaling.go line 485) - it does not use a preprovisioned one (there is a question of why we test this instead of a preprovisioned disk - probably because it was originally easier to write, but my understanding is that this case should still work).
  3. While all the labels seem correctly set (on the PVC and on the node), the pods are not scheduled because the translation happens.

Note that when I wrote a test that installed a driver and used the preprovisioned PV, it worked because all the needed labels were there (I was planning to add it, but the removal of the GCE provider happened, so I will likely add it to the GCP-specific repo).

P.S. I do not think it is possible to rewrite the storage-related autoscaling tests in a provider-agnostic way, so to run them we will need a provider-specific repo. Fixing/moving the tests is something I am working on.

@BenTheElder
Member

I think this was meant to be the same thing that you're saying, except instead of saying "GCP-specific" we said "can't run on kind". Those two notions may be about the same when talking about storage tests, but I think it's quite different for the autoscaler tests.

Here I'm specifically referencing "GCP-specific" as in, "this test only covers things that are relevant to GCP" as opposed to "this test ensures coverage for code in the kubernetes/kubernetes repo that we wouldn't otherwise have".

While we continue to expand what can be tested in a completely general way and/or with kind specifically, there are plenty of cases where we're NOT going to do that with kind, such as node tests, scalability / performance testing, etc.

We should be deciding to put tests in this repo or cloud-provider-gcp based on if they're actually testing the cloud provider or if the cloud provider is a detail of testing Kubernetes core (and in the latter case within reason we'll continue to generalize / decouple these).

For autoscaler tests, I'm not actually sure they belong here because my understanding is the implementation we're testing is out of tree, if there's some in-this-repo logic we're testing then we should fix them here.

At least we'd have a clear rationale for the leakage (testing cloud features) even if we don't have a well-defined generic "cloud features" framework. I think it would be too strong to force the autoscalar tests into a cloud provider, because probably the tests are generic in practice, and it's a good idea for this test code to live in k/k and not be sequestered out to one cloud provider.

In general yes, and we should aim to make them portable, but we shouldn't let portability block coverage.

But for these specific tests, are they ever going to be tested on another provider? Do they cover logic contained in this repo? If not, we should move them.

I think these are really testing features of cluster-autoscaler without code in this repo, so the cluster-autoscaler or cloud provider GCP repos are a better target.


Separately, I think installing the GCE PD CSI driver is reasonable and not really different from installing the cloud-provider-gcp CCM, and I would approve a new PR to include it in kube-up. Tests that are testing that driver itself, versus testing e.g. CSI migration for that driver, should be written elsewhere though, and the tests should not import GCP code.


P.S. I do not think that it is possible to rewrite storage related autoscaling tests in provider agnostic way so to run it we will need to run it in provider specific repo. Fixing the tests/moving them is something that I am working on.

Thanks, it sounds like we should move this test. We can support doing this in the cloud-provider-gcp repo or possibly the cluster autoscaler repo.

@mattcary
Contributor

I think I agree with everything you say, Ben.

There is code in k/k/pkg/volume that I don't think can be tested on kind --- i.e. PVC handling and the same volume being used by different nodes in different ways.

The only way in practice we know how to exercise those tests is to run on a cloud provider (GCE, AWS, etc).

I think that means in order to continue to test k8s core features, we need some cloud provider support in k/k, and so we need to install the pd csi driver for GCE clusters.

I'll look into reviving #102701.

@BenTheElder
Member

There is code in k/k/pkg/volume that I don't think can be tested on kind --- ie PVC handling and the same volume being used by different nodes in different ways.

Yes, at least not currently ... maybe if someone invests in a sophisticated local volume driver ... but we should not block on that.

We were in a similar state for load balancers, but we're starting to cover those more with https://sigs.k8s.io/cloud-provider-kind; maybe someday we'll write a more advanced local-PV driver to go with it ...

I think that means in order to continue to test k8s core features, we need some cloud provider support in k/k, and so we need to install the pd csi driver for GCE clusters.

I'll look into reviving #102701.

Thank you!

@jbtk
Contributor Author

jbtk commented May 23, 2024

For moving tests into the cluster autoscaler repo - is there any way to run tests in other repos (support for kubetest2, using the testing libraries, etc.)? These tests use the e2e testing framework extensively. I know that cloud-provider-gcp needed some work to make its tests runnable, and it depends on the k/k repo even though k/k does not officially support such dependencies. Are there any other components outside the main k/k repo that use the e2e testing framework/tooling from k/k that I could look into as an example?

P.S. Forgive my basic questions, I am very new to kubernetes and might not have enough context here.

@jbtk
Contributor Author

jbtk commented May 23, 2024

I feel that this issue has drifted from the initial problem to the question of where the tests should live and how they are written. While I understand that the latter question is indeed important (and may require some wider discussion), I would like to reiterate the original issue:
a pod using a PV created directly from a GCE disk fails to schedule onto a node that should match it, because of the CSI translation code in https://github.com/kubernetes/csi-translation-lib/blob/master/plugins/gce_pd.go#L257 that is used in the scheduler.

The question for SIG Storage would be: should this be fixed, and how?
I see that this is a result of the translation from in-tree topology to CSI - is creating a PV directly from a GCE PD deprecated, so that we are OK with this not working? Sadly I do not know the details of the in-tree to CSI storage migration, so I am not sure who could look into this.

/sig storage

@k8s-ci-robot k8s-ci-robot added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label May 23, 2024
@mattcary
Contributor

@jbtk --- the pd csi driver ensures that the topology label is set correctly. The csi translation layer depends on a pd csi driver being installed; the behavior you see is working as intended.

The solution is to install the pd csi driver as part of the GCE cluster creation which I'll look into.

Since you're probably just concerned with getting your tests to pass, we may be able to do a quick fix. Can you point me to how the test brings up a cluster (e.g. is it kubetest or kubetest2)?

@jbtk
Contributor Author

jbtk commented May 24, 2024

  1. When you write "the pd csi driver ensures that the topology label is set correctly" - does it ensure this also for PVs that are created from an existing PD, or only for provisioned PDs? Where is it documented that using a PD requires this driver? How is a user who runs an OSS cluster supposed to know this? (I am trying to fix the problem rather than just make the test pass :D)
  2. My tests do not pass anyway (since the GCE provider was removed and I am still working on them) - we were actually considering dropping this test in favor of a test that would preprovision a disk (since this is a more standard way of using PVs, as far as I understand), but I want to make sure that if there is a bug, we catch it and fix it.
  3. Currently I am bringing up a cluster using kubetest2, which runs kube-up underneath. The tests are run using the configuration here: https://github.com/kubernetes/test-infra/blob/master/config/jobs/kubernetes/sig-autoscaling/sig-autoscaling-config.yaml#L232. If you make it possible to install the driver as part of kube-up, I would be able to write a test that depends on the driver and uses a PVC to provision a PD, which would make it much easier to maintain tests for the OSS autoscaler (tests in a single place rather than some tests in k/k and some in a provider-specific repo).

@mattcary
Contributor

  1. When you write "the pd csi driver ensures that the topology label is set correctly" - does it ensure it also for PVs that are created based on existing PD or only for provisioned PDs? Where is it documented that using a pd requires this driver? How the user who runs OSS cluster is supposed to know this? (I am trying to fix the problem rather than just make the test pass :D)

This seems to be the documentation: https://kubernetes.io/blog/2022/09/26/storage-in-tree-to-csi-migration-status-update-1.25/#what-does-it-mean-for-the-storage-driver-csi-migration-to-go-ga, for better or worse.

  2. My tests do not pass anyway (since the gce provider was removed and I am just working on them) - we were considering actually dropping this test in favor of a test that would preprovision a disk (since this is more standard way of using PVs as far as I understand), but I want to make sure that if there is a bug - we catch it and fix it.

I don't think preprovisioning is sufficient -- the CSI driver is needed to attach & mount a PV. I'm also not sure that preprovisioning is a "more standard way" --- in our experience, cloud users usually dynamically provision a PV. It's certainly much easier for testing.
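For comparison, the dynamically provisioned path with the PD CSI driver looks roughly like this (a sketch with hypothetical names; it assumes the pd.csi.storage.gke.io driver is installed in the cluster, which is exactly what kube-up currently lacks):

```yaml
# Hypothetical sketch of dynamic provisioning via the GCE PD CSI driver.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: pd-csi-standard      # hypothetical name
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-standard
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc          # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: pd-csi-standard
  resources:
    requests:
      storage: 1Gi
```

With `WaitForFirstConsumer`, the driver provisions the PD in the zone of the node the pod is scheduled to, and the driver itself labels nodes with the topology key the translation layer expects.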

  3. Currently I am bringing up a cluster using kubetest2 which runs kube up underneath. The tests are run using configuration here: https://github.com/kubernetes/test-infra/blob/master/config/jobs/kubernetes/sig-autoscaling/sig-autoscaling-config.yaml#L232. If you make it possible to install the driver as part of kube up - I would be able to write a test that depends on the driver and uses PVC to provision a PD which would make it much easier to maintain tests for the OSS autoscaler (tests in a single place rather than some tests in k/k and some in provider specific repo)

Ok, thanks. Adding it to kube-up will be the nicest way, since AFAICT you're just running straight from the e2e Ginkgo suite and not doing any setup of your own.
