Tracing permissions issues when service_account_key specified #688

Open
TheSpy opened this issue Apr 26, 2022 · 16 comments

@TheSpy

TheSpy commented Apr 26, 2022

Hello,
I am having issues in a GKE environment with Workload Identity enabled.
image: gcr.io/endpoints-release/endpoints-runtime:2.34.0

I am providing a service account through the --service_account_key parameter; the account has owner permissions in the project.

Endpoints reporting works fine, but tracing keeps throwing errors:
BatchWriteSpans failed (1 spans, 769 bytes): PERMISSION_DENIED: The caller does not have permission

I have tried querying the metadata server, and it lists the Workload Identity service accounts.


It seems that reporting goes through the specified service account, while tracing uses Workload Identity. Is that the expected behavior?

My expectation is: when a service account key is defined, both Endpoints reporting and tracing work through the same specified service account.

@TheSpy
Author

TheSpy commented Apr 26, 2022

Adding the --non_gcp flag helped.
Since I am running endpoints-runtime in a GCP environment, is that the right way to set it up?

@qiwzhang
Contributor

Currently, the --service_account_key flag doesn't apply to the OpenCensus tracing API calls. It only applies to ServiceControl API calls. That is the problem.

My suggestion is to set up GKE Workload Identity correctly and not use the --service_account_key flag.

Here are the steps on how to set it up

The --non_gcp flag is not the right approach; it still calls the GKE metadata server to get credentials.

@TheSpy
Author

TheSpy commented Apr 26, 2022

Thank you for the clarification.
For my specific case this is not possible, because the Workload Identity service account is enabled at the pod level and not all containers running inside the pod support Workload Identity service accounts yet.
As a temporary solution, is it safe to use the --non_gcp flag in a GCP environment?

@qiwzhang
Contributor

The --non_gcp flag is pretty simple: it prevents ESPv2 from calling the GCP metadata server to get the info it needs. As long as you provide this info yourself and all features work, it is OK to use it.

For example, you need to provide --tracing_project_id for CloudESF to send the trace to that project. For tracing, that is the only place where the --non_gcp flag is used.

The --non_gcp flag doesn't change how credentials are fetched to call the Stackdriver service. By default, it calls the GCP metadata server to get credentials. So I am surprised it works with the --non_gcp flag.

I think underneath it is using a gRPC client. If you set the environment variable GOOGLE_APPLICATION_CREDENTIALS to your key path, it will use it:

export GOOGLE_APPLICATION_CREDENTIALS=your-key-path
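
The lookup order described above can be illustrated with a minimal C++ sketch. It is not ESPv2 or gRPC code: `CredentialSource`, `CredentialChoice`, and `ChooseCredentials` are hypothetical names, and the sketch simplifies Application Default Credentials to two steps (the real lookup also checks gcloud's well-known user credentials file before falling back to the metadata server).

```cpp
#include <cstdlib>
#include <string>

// Hypothetical: where a Google client library would look for credentials.
enum class CredentialSource { kKeyFile, kMetadataServer };

struct CredentialChoice {
  CredentialSource source;
  std::string key_path;  // Only meaningful when source == kKeyFile.
};

// Simplified Application Default Credentials order (an assumption-level
// sketch): a key file named by GOOGLE_APPLICATION_CREDENTIALS wins;
// otherwise fall back to the metadata server.
CredentialChoice ChooseCredentials() {
  const char* path = std::getenv("GOOGLE_APPLICATION_CREDENTIALS");
  if (path != nullptr && *path != '\0') {
    return {CredentialSource::kKeyFile, path};
  }
  return {CredentialSource::kMetadataServer, ""};
}
```

With the variable exported as shown above, `ChooseCredentials()` selects the key file; with it unset, the sketch falls through to the metadata server, which on GKE would return the Workload Identity account.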

@TheSpy
Author

TheSpy commented Apr 26, 2022

Ah yes, you are correct.
I have been experimenting and forgot to mention that I have the GOOGLE_APPLICATION_CREDENTIALS environment variable set together with the --non_gcp flag.
Thanks!

@qiwzhang
Contributor

Cool, I believe it is GOOGLE_APPLICATION_CREDENTIALS that makes it work, not the --non_gcp flag.

@qiwzhang qiwzhang closed this as completed Jun 4, 2022
@TheSpy
Author

TheSpy commented Oct 4, 2022

Just an observation: when the --non_gcp flag is set together with --tracing_project_id, the trace property is not added to the log entry (the trace property is the one described in #431).
As I understand it, the property is not added because the project ID is missing. My assumption was that tracing_project_id would be used when metadata info is not available. Is that the right assumption?

@qiwzhang
Contributor

qiwzhang commented Oct 4, 2022

@nareddyt do you know?

@qiwzhang qiwzhang reopened this Oct 4, 2022
@qiwzhang
Contributor

qiwzhang commented Oct 4, 2022

I guess we could use producer_project_id, which is the project you deployed your service config to when running gcloud endpoints services deploy ...

@nareddyt
Contributor

nareddyt commented Oct 4, 2022

You are correct @TheSpy. It is because we have two separate project IDs - tracing project ID and deployment project ID.

Trace property is filled into access log here:

log_entry->set_trace(TraceResourceName(info.trace_id, info.project_id));

We make use of the deployment project ID, which we retrieve from the metadata server:

std::string project_id;

We never propagate --tracing_project_id down to this field.

@qiwzhang
Contributor

qiwzhang commented Oct 4, 2022

We should pass tracing_project_id to the service_control filter so that it can generate the trace field in the log.
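
A rough sketch of what such a fallback could look like, assuming the tracing project ID is preferred when set and the metadata-derived deployment project ID is used otherwise. This is not the actual ESPv2 change: `TraceResourceName` here is a stand-in for the function referenced above, and `TraceFieldForLog` and its parameters are hypothetical. The `projects/<project>/traces/<trace_id>` shape is the format Cloud Logging expects in a log entry's trace field.

```cpp
#include <string>

// Stand-in for the helper referenced in the thread: builds the value of the
// log entry's trace field.
std::string TraceResourceName(const std::string& trace_id,
                              const std::string& project_id) {
  return "projects/" + project_id + "/traces/" + trace_id;
}

// Hypothetical: choose the project for the trace field. Prefers the value of
// --tracing_project_id and falls back to the project ID fetched from the
// metadata server. Returns an empty string (trace field omitted) when no
// project ID or trace ID is available.
std::string TraceFieldForLog(const std::string& trace_id,
                             const std::string& tracing_project_id,
                             const std::string& metadata_project_id) {
  const std::string& project =
      tracing_project_id.empty() ? metadata_project_id : tracing_project_id;
  if (project.empty() || trace_id.empty()) {
    return "";
  }
  return TraceResourceName(trace_id, project);
}
```

Under --non_gcp the metadata project ID is empty, so in this sketch the trace field would still be populated as long as --tracing_project_id is provided.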

@lvl99

lvl99 commented Nov 18, 2022

I've got this same issue myself, but using ESPv2 with Cloud Run. Latest image I'm using is gcr.io/endpoints-release/endpoints-runtime-serverless:2.39.0.

I deploy via gcloud run deploy {...} --service-account {...}. I've granted the permission cloudtrace.traces.patch to my service account, but I still see the same error in my logs: BatchWriteSpans failed (1 spans, 769 bytes): PERMISSION_DENIED: The caller does not have permission. It doesn't tell me exactly which permission is required.

@nareddyt
Contributor

Hi @lvl99 , what flags are you passing to ESPv2 on Cloud Run?

ESPv2 should auto-detect the service account you specified in gcloud run deploy and use it to publish traces. No other ESPv2 config is required (i.e. no need to specify flags like tracing_project_id). Just making sure that is clear.

> I've granted the permission cloudtrace.traces.patch to my service account

Can you double check this? BTW you should grant a role to the service account, not a permission. Did you grant the Cloud Trace Agent role?

@nareddyt
Contributor

One more question: Is the service account above from the same project that you deploy ESPv2 to? Or is the service account from a different project?

@lvl99

lvl99 commented Nov 18, 2022

Thanks for your reply @nareddyt

I've created a custom role with the permission cloudtrace.traces.patch assigned to it. The custom role is assigned to the service account I've configured for my Cloud Run instance.

All roles and service accounts are within the same project as the Cloud Run instance as well.

The flags I use to pass to ESPv2 are:

ENV ESPv2_ARGS ^++^--cors_preset=basic

That's about it. I'll try adding --enable_debug as well to see if it gives me any further info.

@nareddyt
Contributor

Thanks for the info. That is very odd; as far as I can tell, everything is set up correctly. This is the first time I've seen permission issues for Cloud Trace.

To debug this further, can you deploy with --enable_debug, and then share the full ESPv2 application logs (from startup to PERMISSION_DENIED error)? Feel free to email the logs to nareddyt@google.com

Unfortunately other than the logs, I can't think of other ways to debug this.
