Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Controller should check for orphan shadow services #563

Open
kevinpollet opened this issue May 11, 2020 · 8 comments
Open

Controller should check for orphan shadow services #563

kevinpollet opened this issue May 11, 2020 · 8 comments

Comments

@kevinpollet
Copy link
Member

Feature Request

Proposal

To make sure that shadow services exist for services created before Maesh installation or when the controller is down, at startup, the controller list the existing services and create the missing shadow services.

A nice addition could be to check that all shadow services have a corresponding service to check that there are no orphans. e.g. when a service is deleted during controller downtime.

@dtomcej
Copy link
Contributor

dtomcej commented May 11, 2020

In the past there was a poor mans "audit" that ran on a timer to verify all services were matched up, and that if there was a missing shadow service, trigger it as a new create event.

Did we want to go this route?

@kevinpollet
Copy link
Member Author

kevinpollet commented May 11, 2020

I think doing this check at startup is enough. Maybe that's not necessary if the shadow service manager is idempotent (#562). This will ensure that orphan services will not disrupt the system.

One problem with the orphan shadow services is that they hold entry points for nothing.

@kevinpollet
Copy link
Member Author

Just a thought, maybe we could use ownerReferences to make sure that shadow service are properly deleted. wdyt?

@dtomcej
Copy link
Contributor

dtomcej commented May 11, 2020

oooo That is a neat idea. Letting kubernetes handle it automatically? I like it.

@kevinpollet
Copy link
Member Author

Yep, that's the idea. If a shadow service has a user service as ownerReferences, it should be garbage collected when the user service is removed but, we still have to keep the TCP and UDP tables in sync to avoid port leaks. Maybe it will help us when the TCP and UDP tables will be computed at startup.

Btw, I think that we should try to handle orphan shadow services gracefully to avoid port leaks which will disrupt the system.

@jspdown
Copy link
Contributor

jspdown commented Jul 17, 2020

@kevinpollet I did a test with the ownerReferences property. It works great but there's a side effect. We are relying on service deletion events to update state tables but when we receive this event the shadow service can already be deleted. And so, we don't know which mapping should be removed. There are solutions though:

  • Update the PortMapper and expose a RemoveAll method which takes a name and a namespace as parameter. It will then remove all port mapping for the given service. And so, we no longer need to inspect the shadow service to know which port should be unmapped.
  • Listen for ShadowService events (create, update, delete) to maintain the state tables.
  • Stop maintaining two distinct states, build the state table from the shadow services when we need them.

The first solution is the easiest to implement and the less disruptive. The second adds one more moving piece in the system and will introduce some delay. The third solution simplify the port mapping management but has a cost. We would have to build this table before each build of the topology.

I tend to prefer the 3rd option or eventually the 1st.
@kevinpollet, @dtomcej Any opinion on this?

@kevinpollet kevinpollet changed the title Controller should check for orphan shadow services at startup Controller should check for orphan shadow services Jul 17, 2020
@dtomcej
Copy link
Contributor

dtomcej commented Jul 20, 2020

@jspdown I think that if we use references, we lose a bit of control over the deletion events. We have to just "trust" that things will be handled properly by k8s. We would have to run an "audit" loop to verify that k8s has in fact, removed the objects it was supposed to.

However, isn't the goal to remove the persistent state table entirely, and use the shadow services as the source of truth?

@jspdown
Copy link
Contributor

jspdown commented Sep 17, 2020

Using ownerReferences is not an option for us. Shadow services and user services are in different namespaces. The kubernetes documentation on garbage collection says:

Cross-namespace owner references are disallowed by design. This means:

  1. Namespace-scoped dependents can only specify owners in the same namespace, and owners that are cluster-scoped.
  2. Cluster-scoped dependents can only specify cluster-scoped owners, but not namespace-scoped owners.

https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants