-
Notifications
You must be signed in to change notification settings - Fork 694
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dynamically reload cert/keys if changed on disk #1595
base: master
Are you sure you want to change the base?
Conversation
Also on reflection I'll probably rename "monitor" to "watch" (and variants thereof), just because it's a touch more idiomatic and "monitored" has other connotations. |
df8b1e1
to
f5a95a0
Compare
f5a95a0
to
71354e3
Compare
You want feedback on this now? Or should I wait? |
If you have thoughts now, fire away. I probably won't be able to resume work for a couple days. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me, but needs to be approached differently. For example services like this usually return a channel, allowing them to be shutdown cleanly. See https://github.com/rqlite/rqlite/blob/master/queue/queue.go#L198 for an example in rqlite. I would expect to see use of the "done" channel pattern.
Also, this could could be quite racy, since I don't see any read-write synchronization being done.
if changed != nil { | ||
cb(changed) | ||
} | ||
time.Sleep(2 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not idiomatic Go. You should use the ticker pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can adapt it easily enough. Although it's not clear to me what value using a ticker and channel will provide over sleep and callback in this case. A ticker makes sense when you want to select on multiple things (most notably a context), but in this case it was literally just a "do a thing every interval T."
Also, this could could be quite racy, since I don't see any read-write synchronization being done.
It uses an atomic.Value to hold a scalar (a pointer to tls.Certificate).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect to see use of the "done" channel pattern.
What is it that will signal doneness?
In this case there's no cleanup activity for the goroutine to do, it's expected to run continuously until program exit, and it doesn't impede rqlited's ability to terminate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I'm not objecting to making the changes, just asking to be administered the cluebat a little bit more. :)
Yeah, following Go patterns is very important to me. It promotes quality, ease of testing, and makes it clearer for other programmers what your actual intention is. In this case using a Ticker makes it crystal clear what you're doing, in a way Sleep does not. Not being able to shut down the monitoring service makes unit-testing more difficult. For example, unit testing takes place within the context of a single binary. Without clean shutdown, if I force a stack trace of the test binary (perhaps I'm debugging a test deadlock), I've extra Goroutines running that are nothing to do with the test failure. |
Multiple unit tests run inside a single binary. Multiple different unit tests of the monitoring service would leave multiple goroutines running that make debug harder. Every time a given unit test finishes within the binary, I want all resources cleaned up so runtime is fully reset of the next unit test. |
More generally, the monitoring code should be a struct, one the code can instantiate. You can see this pattern in the fsnotify system -- |
Right, good point. I'm with you from a testing perspective. Outside of the testing context, I'm not sure how this should work. The usual pattern is to propagate a Alternatively, the file monitor goroutine could be spawned (and subsequently signaled to terminate) once from |
Similar to what I concluded my last comment with. The only nuance with rqlite is if the watcher's lifecycle is to be maintained from |
The use of Context within rqlite is weak, it should be better (rqlite development started before Context became a real thing, and that pattern has stuck unfortunately). If there is a better, idiomatic, solution using Context that gives the code complete control over the Monitoring object's lifecycle, that works better. |
I think there is, similar to what you're doing with auto backups in In fact, I actually started down these lines but back-pedaled to a single function when I thought I could get away with it. :) But it sounds like something more sophisticated is in order. Two approaches come to mind:
So in the first case the global state is a map (plus corresponding mutex) that tracks the files registered for watch, while in the second case the state is a shared context set by main() and a singleton instance of |
You know, maybe this is a premature/unnecessary optimization. Even if we do spawn a separate file watcher goroutine per node connection watching the same files, so what? We aren't talking about clusters with thousands or even hundreds of nodes. Ok, we stat the same files multiple times per interval, and this will look a little wonky to anyone paying attention to strace, but it'll simplify things in rtls which uses the file watcher. |
|
||
// load reads the certificate and key files from disk and caches the result for Get() | ||
func (mc *monitoredCertificate) load() error { | ||
cert, err := tls.LoadX509KeyPair(mc.certFile, mc.keyFile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question, does this safely fails if one file changed but the other still didn't?
For example, what happens if the cert file changes and there is some random slowness until the key file gets also written? Before loading them it could be wise to have a delay, unless LoadX509KeyPair
fails if there is any kind of mismatch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unless LoadX509KeyPair fails if there is any kind of mismatch
Exactly. If the key doesn't match the cert, LoadX509KeyPair()
returns an error and the active cert/key in memory won't be replaced. When the other file gets updated (or whatever operational issue is resolved), then this will get re-executed and at that time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did this similar feature in a lib
https://github.com/LuKks/simple-hosting/blob/main/index.js#L228-L255
Plus https://github.com/LuKks/simple-hosting/blob/main/lib/watch-file.js
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh great then, I should find the equivalent for LoadX509KeyPair
in Node.js, thanks!
Sketching out the ideas for #1554. Works for server certs, should work for client certs but I haven't tested it yet. CA certs aren't dynamically reloaded, but I don't think that's too important (and I don't think gotls makes that easy).
As mentioned in #1554, this implementation causes the node cert/key to be monitored multiple times (because they're passed multiple times to
CreateClientConfig()
andCreateServerConfig()
), and so changes will result in logging several instances of "reloading certificate". I plan to fix this by keeping a global cache ofmonitoredCertificate
objects keyed by cert/key and returning the same object when the same cert/key is passed, but if you have a better idea let me know.monitorFiles()
is the stat poller. Pretty simple. I left it under rtls for now and we can decide where to put it later.No tests yet, I can look at that once the design is finalized.