New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
regexp: The regexp package should support context #67278
Comments
Can you change your proposal to follow the guidelines at https://github.com/golang/proposal? Thanks. |
How often do you have a match that runs that long? |
This is an excellent question! Here's how you do it. This is from an example where I did something in a similar lib for type unification. You add the func LongTypeUnificationOperation(ctx context.Context, inputData interface{}) error {
for _, x := range inputData {
select {
case <-ctx.Done():
return ctx.Err()
default:
// pass
}
if err := Unify(x.Actual, x.Expect); err != nil {
return err
}
}
// do more stuff...
return nil
} HTH and sorry I didn't explain it clearer before. |
It's necessary in latency sensitive contexts, and sometimes it's a user-supplied regexp, so I can't guarantee in advance that it's not slow/exponential or worse/etc. (I forget the specific time complexities possible with/without backtracking and in this lib, but you get the idea...) |
Not sure what I did wrong. Feel free to edit as you see fit. |
I guess this proposal calls for inserting a non-blocking read from the context's "Done" channel somewhere in each iteration of this loop: Lines 197 to 240 in a0a6026
(and perhaps some other loops too, but this one seemed the most likely to my not-so-familiar-with-this-package eyes) However, I wouldn't want to significantly regress the performance of the non-cancelable case given that long-running regex matches seem like an unusual situation. Maybe it would be worth experimenting in a local branch to see what impact it would have to repeatedly try to read a channel in that loop. If the overhead is material then it seems like there'd either need to be two separate variations of the same function, or there'd need to be some way to avoid that overhead in the case where there's no context provided. |
@apparentlymart Your whole comment was very helpfully phrased. Thanks for commenting!
Indeed, it probably makes sense to add a new function that takes a context, and the remaining ones get ported to internally just pass context.Background(). Fully backwards compatibility, but new consumers can use the cancellable version if they want to. |
It would help to know whether this is a problem that has been observed in practice, or whether this is a theoretical problem. Go's regexp package is based on RE2, which was designed to not have exponential behavior. See https://swtch.com/~rsc/regexp/regexp3.html. |
Yes awesome, I remember reading this article, thanks for adding the link! That's what I was referring to above.
Great question. It's somewhere in the middle. I think this might be generally useful, but I can only offer my personal use-case. As part of https://github.com/purpleidea/mgmt/ (think a general automation tool like Ansible) users write code in a small DSL. If the user uses a regexp function, we implement that by wrapping the golang regexp library to run their user-supplied regexp when that function runs. The catch is that we have no control over what kind of regexp the user provides. We are also very good about shutting everything down when there is a global request to cancel. Many others tools when asked to cancel have a long shutdown process, which usually ends up with the user hitting ^\ which can cause data corruption. To avoid these cases, we want to guarantee we can shutdown very quickly. Say in under 500ms. (Arbitrary, but shorter is better.) I identified the regexp library as one situation where we wouldn't be able to guarantee that. So TL;DR: there isn't a specific user that I know of personally experiencing this pain point today, but if possible I would like to proactively plumb this in to prevent a future negative user experience or worse. Happy to answer any other questions. (Our internal function API that wraps this is a: |
Go version
latest
Output of
go env
in your module/workspace:What did you do?
N/A
What did you see happen?
Regexp did not cancel early
What did you expect to see?
It would be sensible to have a long running regexp match operation be cancellable with the context package.
I know this package was created before the
context
package was, but this might be a useful addition for a future extension of this package or golang2.xCheers
The text was updated successfully, but these errors were encountered: