indexes: Don't wipe indexes again when continuing a prior reindex #30132

Open
wants to merge 4 commits into
base: master

TheCharlatan
Contributor

@TheCharlatan TheCharlatan commented May 17, 2024

When restarting bitcoind during an ongoing reindex without setting the -reindex flag again, the block and coins dbs are left intact, but any data from the optional indexes is discarded. While not a bug per se, wiping the data again is wasteful, both because it has to be written again and because it can lead to longer startup times. So keep the index data instead when continuing a prior reindex.

Also includes a bugfix and smaller code cleanups around the reindexing code. The bug was introduced in b47bd95: "kernel: De-globalize fReindex".

@DrahtBot
Contributor

DrahtBot commented May 17, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK furszy
Concept ACK luke-jr
Approach ACK stickies-v

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #29641 (scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint by maflcko)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@stickies-v
Contributor

Concept ACK

Reverts a bug introduced in b47bd95
"kernel: De-globalize fReindex". The change leads to a GUI user being
prompted to re-index on a chainstate loading failure more than once as
well as the node actually not reindexing if the user chooses to. Fix
this by setting the reindexing option instead of the atomic, which can
be safely re-used to indicate that a reindex should be attempted.

The bug specifically is caused by the chainman, and thus the blockman
and its m_reindexing atomic being destroyed on every iteration of
the for loop.

The reindex option for ChainstateLoadOptions is currently also set in a
confusing way. By using the reindex atomic, it is not obvious in which
scenario it is true or false.

The atomic is controlled by the user passing the -reindex option, by
the user choosing to reindex if something went wrong during chainstate
loading when running the GUI, and by reading the reindexing flag from
the block tree database in LoadBlockIndexDB. In practice this read is
done through the chainstate module's CompleteChainstateInitialization's
call to LoadBlockIndex. Since this is only done after the reindex option
is set already, it does not have an effect on it.

Make this clear by using the reindex option from the blockman opts which
is only controlled by the user.

It does not control any actual logic and the log message as well as the
comment are obsolete, since no database initialization takes place there
anymore. Log messages indicating when indexes and chainstate databases
are loaded exist in other places.

Before this change continuing a reindex without the -reindex flag set
would leave the block and coins db intact, but discard the data of the
optional indexes. While not a bug per se, wiping the data again is
wasteful, both in terms of having to write it again, and potentially
leading to longer startup times.

When initially running a reindex, both the block index and any further
activated indexes are wiped. On an index's Init(), both the best block
stored by the index and the chain's tip are null. An index's m_synced
member is therefore true. This means that it will process blocks through
validation events while the reindex is running.

Currently, if the reindex is continued without the user re-specifying
the reindex flag, the block index is preserved but further index data is
wiped. This leads to the stored best block being null, but the chain tip
existing. The m_synced member will be set to false. The index will not
process blocks through the validation interface, but instead use the
background sync once the reindex is completed.

If the index is preserved (this change) after a restart its best block
may potentially match the chain tip. The m_synced member will be set to
true and the index can process validation events during the rest of the
reindex.
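
The three cases above can be summarized in a small Python model. It is a sketch under the assumption, taken from the description above, that an index counts as synced exactly when its stored best block equals the chain tip (both being null also counts as equal). The function name and the block labels are made up for illustration and do not correspond to the actual index code.

```python
from typing import Optional


def index_starts_synced(index_best_block: Optional[str],
                        chain_tip: Optional[str]) -> bool:
    # Models m_synced after Init(): true only if the index is at the chain tip.
    return index_best_block == chain_tip


# Fresh reindex: the block index and the optional indexes are both wiped.
fresh_reindex = index_starts_synced(None, None)
# Continued reindex, index wiped again: a chain tip exists, a best block does not.
continued_wiped = index_starts_synced(None, "block_500")
# Continued reindex, index preserved and already at the tip.
continued_preserved = index_starts_synced("block_500", "block_500")
# Continued reindex, index preserved but behind the tip.
continued_behind = index_starts_synced("block_400", "block_500")
```

In this model only the wiped and the lagging index fall back to the background sync; a preserved index that matches the tip keeps processing validation events.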
@TheCharlatan
Contributor Author

Updated 991f50a -> 9de8b26 (preserveIndexOnRestart_0 -> preserveIndexOnRestart_1, compare)

@ryanofsky
Contributor

ryanofsky commented May 20, 2024

fixing a bug introduced in b47bd95: "kernel: De-globalize fReindex".

Is this true? That commit, which was part of #29817, should not have changed any previous behavior

EDIT: Never mind, I see the problem now after reading f27290c commit description. The bug happens because the BlockManager is destroyed each loop iteration in AppInitMain so the value of the chainman.m_blockman.m_reindexing variable gets reset.
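
The lifetime problem described here can be illustrated with a small Python model. This is a simplified sketch, not the actual AppInitMain code: the names `BlockManagerOptions`, `BlockManager`, and `run_init_loop`, as well as the retry limit, are assumptions made for illustration. It shows why a flag stored on an object that is rebuilt on every loop iteration is lost, while a flag stored in options that outlive the loop is not.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockManagerOptions:
    # Hypothetical stand-in for the blockman options; lives outside the loop.
    reindex: bool = False


class BlockManager:
    # Hypothetical stand-in for the block manager; rebuilt on every iteration.
    def __init__(self, opts: BlockManagerOptions) -> None:
        self.reindexing = opts.reindex  # models the m_reindexing atomic


def run_init_loop(store_choice_in: str, max_attempts: int = 3) -> Optional[int]:
    """Return the attempt on which a reindex starts, or None if it never does.

    Each attempt without a reindex is assumed to fail, and the user is
    assumed to answer "yes" to the reindex prompt every time.
    """
    opts = BlockManagerOptions()
    for attempt in range(1, max_attempts + 1):
        blockman = BlockManager(opts)  # fresh object, the old flag value is gone
        if blockman.reindexing:
            return attempt
        if store_choice_in == "atomic":
            blockman.reindexing = True  # discarded together with this blockman
        else:
            opts.reindex = True  # survives into the next iteration
    return None
```

With the choice stored on the block manager, `run_init_loop("atomic")` never starts a reindex, which matches the repeated prompts described in the commit message. With the choice stored in the options, the reindex starts on the second attempt.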

Member

@furszy furszy left a comment

Looking good at first glance. It would be nice to add some coverage for it just so it doesn't happen again. Maybe assert that certain logs are not present during init? Like the "Wiping LevelDB in <index_path>" one.
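
One way to express such a check is sketched below in Python. This is a standalone illustration, not the functional test framework's own log helper; the function name `assert_not_logged` and the sample log lines are assumptions, and only the "Wiping LevelDB in" prefix comes from the comment above.

```python
from typing import Iterable


def assert_not_logged(log_text: str, unexpected_msgs: Iterable[str]) -> None:
    """Raise AssertionError if any unexpected message occurs in the log text."""
    for msg in unexpected_msgs:
        for line in log_text.splitlines():
            if msg in line:
                raise AssertionError(f"Unexpected log line: {line!r}")


# Made-up log excerpt for a restart that continues a prior reindex.
sample_log = "\n".join([
    "example: starting node without -reindex",
    "example: continuing prior reindex",
])
assert_not_logged(sample_log, ["Wiping LevelDB in"])
```

A real test would read the node's debug log after the restart instead of a hard-coded string.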

@maflcko maflcko added this to the 28.0 milestone May 21, 2024
@stickies-v
Contributor

stickies-v commented May 21, 2024

Approach ACK. The first 2 commits (0d04433) LGTM, but I'm going to need a lot more time to wrap my head around the implications of the third one.

@TheCharlatan
Contributor Author

Updated 9de8b26 -> dd290b3 (preserveIndexOnRestart_1 -> preserveIndexOnRestart_2, compare)

  • Added a commit for testing that the indexes are still there when continuing a reindex.

@luke-jr
Member

luke-jr commented May 23, 2024

Concept ACK

@TheCharlatan
Contributor Author

Updated dd290b3 -> 891784c (preserveIndexOnRestart_2 -> preserveIndexOnRestart_3, compare)

  • Addressed @furszy's comment, removed the timeout on the initload busy loop.
  • Addressed @maflcko's comment, removed the busy loop waiting for the block filter index. I initially thought it might be useful to wait for the index to load completely, but I don't think this is strictly required for this test.
  • Addressed @maflcko's comment, moved stop_node out of the busy loop.
  • Addressed @maflcko's comment, using named args for literal arguments now.

@DrahtBot
Contributor

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/25367317004

Co-authored-by: furszy <matiasfurszyfer@protonmail.com>
@TheCharlatan
Contributor Author

Thanks for the reviews @maflcko,

891784c -> eeea081 (preserveIndexOnRestart_3 -> preserveIndexOnRestart_4, compare)

  • Addressed @maflcko's comment, removed redundant initial syncing of the blockfilterindex.
  • Addressed @maflcko's comment, enforce using named argument.

cbergqvist

This comment was marked as resolved.

Member

@furszy furszy left a comment

Code review ACK eeea081


8 participants