Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider periodically using VACUUM to reduce memory footprint #530

Open
freeekanayaka opened this issue Aug 17, 2023 · 9 comments
Open

Consider periodically using VACUUM to reduce memory footprint #530

freeekanayaka opened this issue Aug 17, 2023 · 9 comments
Labels
Feature New feature, not a bug

Comments

@freeekanayaka
Copy link
Contributor

I've experimented using VACUUM against a real-world medium-size Incus/LXD database (~700 containers) that had a size of 41 Megabytes. After running the VACUUM command, the size of the database was down to 3 Megabytes.

Depending on the workflow, if there are large databases that create issues with memory/snapshotting, this might be a relatively cheap and effective measure to explore. It might not have such a dramatic effect in all cases though.

@freeekanayaka
Copy link
Contributor Author

The database I mentioned had been running for several years, most probably performing a number of INSERTs and DELETEs over the course of time, which presumably slowly increased database fragmentation more and more. That means that running VACUUM regularly is advisable, especially for workloads with a lot of data updates.

I also tried to run .dump on that database and create it from scratch with the same data, and I got the same result: database size went from 41 MB to 3 MB, as it would be expected.

@MathieuBordere
Copy link
Contributor

Haven't looked into it yet, but weren't there issues with running VACUUM on a dqlite database, did you have to adapt some things?

@MathieuBordere
Copy link
Contributor

The VACUUM command works by copying the contents of the database into a temporary database file and then overwriting the original with the contents of the temporary file. When overwriting the original, a rollback journal or [write-ahead log](https://www.sqlite.org/wal.html) WAL file is used just as it would be for any other database transaction.

I'm just wondering how it would work in dqlite's case

@cole-miller
Copy link
Contributor

Yeah, I looked into using VACUUM to reset/empty a database in #435 and concluded that a prerequisite was proper support for attaching additional databases to a dqlite-managed connection, see #435 (comment). But I could certainly be missing something.

@freeekanayaka
Copy link
Contributor Author

I can think of several design options. But in my mind it's most probably something that's needed for long-running deployments.

@MathieuBordere
Copy link
Contributor

I can think of several design options. But in my mind it's most probably something that's needed for long-running deployments.

yes agree that it should be scheduled regularly when e.g. # free pages / # total pages passes a certain threshold (just thinking out loud)

@MathieuBordere MathieuBordere changed the title Consider using VACUUM to reduce memory footprint Consider periodically using VACUUM to reduce memory footprint Aug 18, 2023
@MathieuBordere MathieuBordere added the Feature New feature, not a bug label Aug 18, 2023
@freeekanayaka
Copy link
Contributor Author

If you are looking for something cheap, this could initially be implemented "offline", e.g. when the process restarts.

@cole-miller
Copy link
Contributor

VACUUM needs to be coordinated using a Raft log entry, since it affects page numbers in the future.

@freeekanayaka
Copy link
Contributor Author

VACUUM needs to be coordinated using a Raft log entry, since it affects page numbers in the future.

Sounds like a good point, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature New feature, not a bug
Projects
None yet
Development

No branches or pull requests

3 participants