Optimize memory management in Trilinos sparsity pattern accessors. #16406
    if (colnum_cache.use_count() > 1)
      colnum_cache = std::make_shared<std::vector<size_type>>(
        sparsity_pattern->row_length(this->a_row));
    else
      colnum_cache->resize(sparsity_pattern->row_length(this->a_row));
Is this thread-safe? The use count could be increased on a different thread after the check but before the resize, right?
Hm, good question. My primary motivation was that when you do

    SparsityPattern::iterator p = sp.begin();
    ++p;

and if ++p moves you to the next row of the matrix, then the current design deallocates the std::vector only to allocate another one. That's wasteful. Note that here, only one iterator is ever around, so the use count of these shared pointers is always one.
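To make the intended optimization concrete, here is a minimal sketch of the reuse-when-unique pattern from the diff. `RowCache` and `visit_row` are hypothetical stand-ins invented for illustration, not the actual deal.II accessor, and `std::size_t` replaces `size_type`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the accessor's column-number cache.
struct RowCache
{
  std::shared_ptr<std::vector<std::size_t>> colnum_cache =
    std::make_shared<std::vector<std::size_t>>();

  // Prepare the cache for a row with `row_length` entries.
  void visit_row(const std::size_t row_length)
  {
    if (colnum_cache.use_count() > 1)
      // A copy still refers to the old buffer: leave it alone and
      // allocate a fresh vector for this object.
      colnum_cache = std::make_shared<std::vector<std::size_t>>(row_length);
    else
      // Sole owner: resize the existing vector instead of replacing it.
      colnum_cache->resize(row_length);
  }
};
```

With a single `RowCache` object, repeated `visit_row()` calls keep the same vector object; only after the object has been copied does the next call allocate a new one.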
Your question pertains to the situation where another thread makes a copy of an iterator owned by the first thread (bumping the use count to two) at the very inopportune moment that the first thread has just passed the use_count() check. So perhaps something like this:
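
    SparsityPattern::iterator p = sp.begin();
    // The lambda captures p by reference and copies it on the other thread.
    auto t = std::thread([&p]() { auto x = p; ++x; });
    ++p; // may run concurrently with the copy made inside the lambda
    t.join();
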
Note that here, the lambda function must capture p by reference -- if it had captured it by value, the capture would be a second copy, resulting in a use count of two, so the optimization would not have applied.
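The capture-by-value point can be checked without any threads. The sketch below uses a hypothetical `Iter` struct that holds a `std::shared_ptr`, standing in for the real iterator, and reports the use count seen from inside each kind of lambda.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for an iterator that owns a shared cache.
struct Iter
{
  std::shared_ptr<std::vector<int>> cache =
    std::make_shared<std::vector<int>>();
};

// Capture by reference: the lambda refers to the one existing object.
long use_count_with_reference_capture()
{
  Iter p;
  auto f = [&p]() { return p.cache.use_count(); };
  return f();
}

// Capture by value: the closure holds a second copy of the object,
// and therefore a second owner of the shared vector.
long use_count_with_value_capture()
{
  Iter p;
  auto f = [p]() { return p.cache.use_count(); };
  return f();
}
```

Under these assumptions the first function returns 1 and the second returns 2, which is why only the by-reference variant reaches the in-place resize branch.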
So I think you're right that the optimization is not thread-safe. Good catch!
The question is what we want to do about the situation. I would really like to optimize the use case here because typically one will only have one iterator object sitting around, no copies being made, and it seems silly to release and re-allocate the memory all the time. I could guard access to that variable with a mutex. What would you suggest?
> The question is what we want to do about the situation. I would really like to optimize the use case here because typically one will only have one iterator object sitting around, no copies being made, and it seems silly to release and re-allocate the memory all the time. I could guard access to that variable with a mutex. What would you suggest?

Yeah, I would just use a static std::mutex and lock it for any access to colnum_cache.
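A rough sketch of what that suggestion could look like follows. `GuardedRowCache` and its member functions are hypothetical names for illustration, not the patch that was eventually written; the static mutex is a function-local static so the example does not depend on C++17 inline variables, and only the cache pointer is guarded, not any other iterator state.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical accessor-like class; only colnum_cache is protected.
class GuardedRowCache
{
public:
  GuardedRowCache()
    : colnum_cache(std::make_shared<std::vector<std::size_t>>())
  {}

  // Copying shares the buffer. Taking the lock here means the use count
  // cannot change between the check and the resize in visit_row().
  GuardedRowCache(const GuardedRowCache &other)
  {
    std::lock_guard<std::mutex> lock(cache_mutex());
    colnum_cache = other.colnum_cache;
  }

  // Assignment is left out to keep the sketch short.
  GuardedRowCache &operator=(const GuardedRowCache &) = delete;

  void visit_row(const std::size_t row_length)
  {
    std::lock_guard<std::mutex> lock(cache_mutex());
    if (colnum_cache.use_count() > 1)
      colnum_cache = std::make_shared<std::vector<std::size_t>>(row_length);
    else
      colnum_cache->resize(row_length);
  }

  std::size_t size() const
  {
    std::lock_guard<std::mutex> lock(cache_mutex());
    return colnum_cache->size();
  }

  bool shares_buffer_with(const GuardedRowCache &other) const
  {
    std::lock_guard<std::mutex> lock(cache_mutex());
    return colnum_cache == other.colnum_cache;
  }

private:
  // One mutex shared by all objects of this class.
  static std::mutex &cache_mutex()
  {
    static std::mutex m;
    return m;
  }

  std::shared_ptr<std::vector<std::size_t>> colnum_cache;
};
```

A single class-wide mutex is the simplest option, but it serializes cache access across all objects; whether that cost is acceptable compared to the saved allocations would need measuring.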