Score:0

Clarifying questions around id_token usage in service to service


I work heavily with OAuth2/OIDC in my current job. Now moving more and more to GCP. I have some clarifying questions about the use of OIDC tokens for Service to Service communication, one high level, one tactical:

Problem: I'm securing a Cloud Scheduler job that calls a Cloud Run endpoint. I have solved the problem (as best I can tell), but I'm highly confused about why Google set things up this way and am hoping to get clarification. Not challenging things, just seeking understanding. It feels so different from what I know. I've always used id_tokens for humans.

  1. Why did they choose OIDC ID Tokens for service to service communication? I've used OIDC a ton on the user side, but never on the server to server side. So getting an ID Token for server to server communication feels very odd. I would love a link that points to the docs explaining this architecture choice on the server to server side. I would have expected an OAuth2 Access Token with Client Credentials for all service to service communication, not an ID Token. I see that their docs indicate the platform uses a mix of both.

  2. Why is the Audience field arbitrary? In Cloud Scheduler, it appears that as long as I use a valid service account in the project, I can put any value in the audience field. I'm sure there is a reason for this, Google folks are smart, but this feels like a security hole. I mean, the audience could be any valid URL (best I can tell). Can I put in the audience of a Cloud Run endpoint in a different project and make that call?

  3. Obviously there is a split here between AuthN and AuthZ, so the id_token is more about AuthN, but an audience field validated when the token is presented would indicate solid AuthZ. BUT with it being arbitrary, I feel like the validation of the audience can't be trusted because anybody can put anything there. Please tell me what I'm missing.

I hope these questions make sense. I'm new to GCP and like what I see, but part of my job is to find the edges of stuff, and these just feel odd compared to what I've used in the past.

Score:0

Why did they choose OIDC ID Tokens for service to service communication? I've used OIDC a ton on the user side, but never on the server to server side. So getting an ID Token for server to server communication feels very odd. I would love a link that points to the docs explaining this architecture choice on the server to server side. I would have expected an OAuth2 Access Token with Client Credentials for all service to service communication, not an ID Token.

In Google Cloud, there are two types of authorization: role-based (OAuth Access Token) and identity-based (OIDC Identity Token). Role-based authorization provides access to all resources based upon roles, for example viewer access to all Compute Engine instances. This type of permission is managed at the Project/Folder/Organization level. Identity-based authorization provides access to an individual resource. The key difference is where the permission/role is assigned: at the project level or at the resource level. Since a role is too broad a permission at the resource level, you need another way. That way is identities. For example, I grant john access to the KMS key secret2. The identity + permissions are stored at the KMS key access management layer.

Why is the Audience field arbitrary? In Cloud Scheduler, it appears that as long as I use a valid service account in the project, I can put any value in the audience field. I'm sure there is a reason for this, Google folks are smart, but this feels like a security hole. I mean, the audience could be any valid URL (best I can tell). Can I put in the audience of a Cloud Run endpoint in a different project and make that call?

If your code creates the Identity Token, the audience is required. There are certain Google-managed OAuth Client IDs that allow the audience to be ignored. Since the identity and permissions are stored at the resource, calling a different Cloud Run endpoint will not pass the identity check. IMHO, the audience field is a fast way for an identity system to check first whether the Identity Token should be passed on to the IAM layer at all.
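As a rough illustration of the audience being required, the sketch below builds the request that code running on Google Cloud (for example on Cloud Run or Compute Engine) can send to the metadata server to obtain an Identity Token. The endpoint path and the `Metadata-Flavor: Google` header are the documented ones; the function names are hypothetical, and the audience in the usage note is a made-up URL.

```python
import urllib.parse
import urllib.request

# Metadata server endpoint that mints an identity token for the
# service account attached to the running workload.
METADATA_IDENTITY_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/identity"
)


def build_identity_token_request(audience: str) -> urllib.request.Request:
    """Build (but do not send) the metadata-server request for an ID token.

    Hypothetical helper: the caller must state which audience the token
    is intended for, so an empty value is rejected up front.
    """
    if not audience:
        raise ValueError("an audience is required to request an identity token")
    query = urllib.parse.urlencode({"audience": audience})
    return urllib.request.Request(
        f"{METADATA_IDENTITY_URL}?{query}",
        headers={"Metadata-Flavor": "Google"},
    )


def fetch_identity_token(audience: str) -> str:
    """Send the request. Only works where the metadata server is reachable,
    i.e. inside Google Cloud; elsewhere this raises a network error."""
    request = build_identity_token_request(audience)
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_identity_token("https://my-service.example.run.app")` from inside a workload would return a signed JWT whose `aud` claim is that URL. Nothing stops the caller from asking for any audience; the token only becomes useful if the receiving resource has granted access to the identity it names.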

Obviously there is a split here between AuthN and AuthZ, so the id_token is more about AuthN, but an audience field validated when the token is presented would indicate solid AuthZ. BUT with it being arbitrary, I feel like the validation of the audience can't be trusted because anybody can put anything there. Please tell me what I'm missing.

Identity Tokens are signed. Only authorized services and code can create Identity Tokens. As I mentioned in my previous point, the audience does not provide permission or grant access. It is one step in a series of checks that authorize access based upon the Identity Token. The final validation is the identity + permission assigned at the resource. The audience does not grant any of those, but can be used as a filter to quickly discard tokens.
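The layering described above can be sketched roughly as follows. This is not how Google implements it internally. It is a minimal model that assumes the token's signature has already been verified, that the token carries an `email` claim identifying the service account, and that the resource keeps a simple permission-to-identities mapping. The function names, the `"invoke"` permission string, and the policy format are all hypothetical.

```python
import base64
import json


def decode_claims(id_token: str) -> dict:
    """Decode the JWT payload WITHOUT verifying the signature.

    Illustration only: a real receiver must verify the signature
    against the issuer's public keys before trusting any claim.
    """
    payload_b64 = id_token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def authorize(claims: dict, expected_audience: str,
              resource_policy: dict, permission: str) -> bool:
    """Two-stage check on already-verified claims."""
    # Stage 1: the audience is only a fast filter. A token minted for
    # some other receiver is discarded before any policy lookup.
    if claims.get("aud") != expected_audience:
        return False
    # Stage 2: the real decision. Is this identity granted the
    # permission on THIS resource? The audience grants nothing here.
    allowed_identities = resource_policy.get(permission, set())
    return claims.get("email") in allowed_identities
```

In this model, a caller that requests a token with the audience of a service in another project passes stage 1 at that service, but is still rejected at stage 2 unless that service's own policy lists the caller's identity.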

Cade Thacker
Thank you for the thoughtful answer. I've read this through a few times and I see what you are saying. Everything that I work with now is a fully role based approach, so this is just different. Need to get my head around this. Thanks again!